Researcher profile

Yue Yin

Yue Yin contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
9 works
0 followers
10 topics
4 close collaborators

Published work

9 published items

preprint · 2022 · arXiv

A Model-Agnostic Causal Learning Framework for Recommendation using Search Data

Machine-learning based recommender systems (RSs) have become an effective means to help people automatically discover their interests. Existing models often represent the rich information for recommendation, such as items, users, and contexts, as embedding vectors and leverage them to predict users' feedback. In the view of causal analysis, the associations between these embedding vectors and users' feedback are a mixture of the causal part that describes why an item is preferred by a user, and the non-causal part that merely reflects the statistical dependencies between users and items, for example, the exposure mechanism, public opinions, display position, etc. However, existing RSs mostly ignore the striking differences between the causal parts and non-causal parts when using these embedding vectors. In this paper, we propose a model-agnostic framework named IV4Rec that can effectively decompose the embedding vectors into these two parts, hence enhancing recommendation results. Specifically, we jointly consider users' behaviors in search scenarios and recommendation scenarios. Adopting the concepts in causal analysis, we embed users' search behaviors as instrumental variables (IVs), to help decompose original embedding vectors in recommendation, i.e., treatments. IV4Rec then combines the two parts through deep neural networks and uses the combined results for recommendation. IV4Rec is model-agnostic and can be applied to a number of existing RSs such as DIN and NRHUB. Experimental results on both public and proprietary industrial datasets demonstrate that IV4Rec consistently enhances RSs and outperforms a framework that jointly considers search and recommendation.
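
To make the decomposition idea concrete, the following is a minimal sketch and not the IV4Rec implementation. It assumes a single instrument vector and uses a plain least-squares projection as a stand-in for regressing a treatment embedding on search-based instruments: the fitted part is what the instrument explains, and the residual is the remainder. The function names `decompose` and `recombine` and the fixed mixing weights are hypothetical; the abstract states that IV4Rec combines the two parts through deep neural networks.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decompose(treatment, instrument):
    """Split `treatment` into a part explained by `instrument` and a residual.

    Assumption: one instrument vector, so the least-squares fit reduces to
    an orthogonal projection of the treatment onto the instrument.
    """
    coef = dot(treatment, instrument) / dot(instrument, instrument)
    fitted = [coef * z for z in instrument]
    residual = [t - f for t, f in zip(treatment, fitted)]
    return fitted, residual


def recombine(fitted, residual, w_fitted=1.0, w_residual=0.5):
    """Hypothetical fixed-weight mix standing in for a learned combination."""
    return [w_fitted * f + w_residual * r for f, r in zip(fitted, residual)]


# Toy two-dimensional embeddings (illustrative values only).
item_embedding = [2.0, 1.0]
search_embedding = [1.0, 0.0]

fitted, residual = decompose(item_embedding, search_embedding)
combined = recombine(fitted, residual)
```

By construction the fitted and residual parts are orthogonal and sum back to the original embedding, so the split loses no information; only the reweighting changes what a downstream recommender sees.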

preprint · 2022 · arXiv

Integrable semi-discretisation of the Drinfel'd--Sokolov hierarchies

We propose a novel semi-discrete Kadomtsev--Petviashvili equation with two discrete and one continuous independent variables, which is integrable in the sense of having the standard and adjoint Lax pairs, from the direct linearisation framework. By performing reductions on the semi-discrete Kadomtsev--Petviashvili equation, new semi-discrete versions of the Drinfel'd--Sokolov hierarchies associated with Kac--Moody Lie algebras $A_r^{(1)}$, $A_{2r}^{(2)}$, $C_r^{(1)}$ and $D_{r+1}^{(2)}$ are successfully constructed. A Lax pair involving the fraction of $\mathbb{Z}_\mathcal{N}$ graded matrices is also found for each of the semi-discrete Drinfel'd--Sokolov equations. Furthermore, the direct linearisation construction guarantees the existence of exact solutions of all the semi-discrete equations discussed in the paper, providing another insight into their integrability in addition to the analysis of Lax pairs.

preprint · 2022 · arXiv

RecipeSnap -- a lightweight image-to-recipe model

In this paper we address the problem of automatically recognizing photographed dishes and generating the corresponding recipes. Current image-to-recipe models are computationally expensive and require powerful GPUs for model training and implementation. This high computational cost prevents existing models from being deployed on portable devices such as smartphones. To solve this issue we introduce a lightweight image-to-recipe prediction model, RecipeSnap, that reduces memory cost and computational cost by more than 90% while still achieving 2.0 MedR, which is in line with the state-of-the-art model. A pre-trained recipe encoder was used to compute recipe embeddings. Recipes from the Recipe1M dataset and the corresponding recipe embeddings are collected as a recipe library, which is used for image encoder training and later for image queries. We use MobileNet-V2 as the image encoder backbone, which makes our model suitable for portable devices. This model can be further developed into a smartphone application with little effort. A comparison of the performance of this lightweight model and other, heavier models is presented in this paper. Code, data and models are publicly accessible on GitHub.
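
The MedR figure quoted above is a median retrieval rank. The sketch below shows one common way such a metric is computed and is not taken from the RecipeSnap code. It assumes that `image_embs[i]` and `recipe_embs[i]` form a matching pair and that candidates are ranked by cosine similarity; the function names and toy vectors are hypothetical.

```python
import math
from statistics import median


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def median_rank(image_embs, recipe_embs):
    """Median rank of the true recipe when each image queries the library.

    Assumption: image_embs[i] corresponds to recipe_embs[i].
    """
    ranks = []
    for i, img in enumerate(image_embs):
        sims = [cosine(img, rec) for rec in recipe_embs]
        # Rank 1 means no other recipe scores strictly higher than the true one.
        ranks.append(1 + sum(s > sims[i] for s in sims))
    return median(ranks)


# Toy library of three recipes and three image queries (illustrative only).
recipes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
images = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.1]]
medr = median_rank(images, recipes)
```

A lower value is better: a MedR of 1 means the correct recipe is the top result for at least half of the queries.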

preprint · 2022 · arXiv

SerialTrack: ScalE and Rotation Invariant Augmented Lagrangian Particle Tracking

We present a new particle tracking algorithm to accurately resolve large deformation and rotational motion fields, which takes advantage of both local and global particle tracking algorithms. We call this method the ScalE and Rotation Invariant Augmented Lagrangian Particle Tracking (SerialTrack). This method builds an iterative scale and rotation invariant topology-based feature for each particle within a multi-scale tracking algorithm. The global kinematic compatibility condition is applied as a global augmented Lagrangian constraint to enhance the tracking accuracy. An open source software package implementing this numerical approach to track both 2D and 3D, incremental and cumulative deformation fields is provided.

preprint · 2021 · arXiv

On the origin of GeV spectral break for Fermi blazars: 3C 454.3

The GeV break in the spectra of the blazar 3C 454.3 is a distinctive observational feature discovered by Fermi-LAT. The origin of the GeV break is still under debate. To explore the possible source of the GeV spectral break in 3C 454.3, a one-zone homogeneous leptonic jet model and the McFit technique are used to fit the quasi-simultaneous multi-waveband spectral energy distribution (SED) of 3C 454.3. The outer border of the broad-line region (BLR) and the inner dust torus are chosen to contribute radiation in the model as external seed photons for the external-Compton process, considering the observed $γ$-ray radiation. The combination of two components, namely the Compton-scattered BLR and dust torus radiation, assuming a broken power-law distribution of emitted particles, provides a good fit to the multi-waveband SED of 3C 454.3 detected during 2008 Aug 3 - Sept 2 and explains the GeV spectral break. We propose that the spectral break of 3C 454.3 may originate from an inherent break in the energy distribution of the emitted particles and the Klein-Nishina effect. A comparison is performed between the energy density of the 'external' photon field for the whole BLR, $U_{\rm BLR}$, obtained via model fitting and that constrained from the BLR data. The distance from the $γ$-ray emitting region of 3C 454.3 to the central black hole could be constrained at $\sim 0.78$ pc ($\sim 4.00 R_{\rm BLR}$, where $R_{\rm BLR}$ is the size of the BLR).

preprint · 2020 · arXiv

An empirical "high-confidence" candidate zone for Fermi BL Lacertae objects

In the third catalog of active galactic nuclei detected by the Fermi Large Area Telescope Clean (3LAC) sample, there are 402 blazar candidates of uncertain type (BCUs). The proposed analysis will help to evaluate the potential optical classification of BCUs as flat-spectrum radio quasars (FSRQs) or BL Lacertae objects (BL Lacs), which can help to understand which is the most elusive class of blazar hidden in the Fermi sample. By studying the 3LAC sample, we found some critical values of the $γ$-ray photon spectral index ($Γ_{\rm ph}$), variability index (VI) and radio flux (${\rm F_R}$) that separate known FSRQs and BL Lac objects. We further use those values to define an empirical "high-confidence" candidate zone that can be used to classify the BCUs. Within such a zone ($Γ_{\rm ph}<2.187$, log${\rm F_R}<2.258$ and ${ \rm logVI <1.702}$), we found that 120 BCUs can be classified as BL Lac candidates with a higher degree of confidence (with a misjudged rate $<1\%$). Our results suggest that an empirical "high-confidence" diagnosis is possible to distinguish the BL Lacs from the Fermi observations based only on the direct observational data of $Γ_{\rm ph}$, VI and ${\rm F_R}$.
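
The candidate zone above is a conjunction of three threshold cuts, which can be sketched as a simple filter. The thresholds are the ones quoted in the abstract; the `Source` record, its field names, and the example values are hypothetical, and the units and logarithm base of the inputs are assumed to match those used in the paper.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract.
GAMMA_PH_MAX = 2.187   # photon spectral index
LOG_FR_MAX = 2.258     # log of radio flux
LOG_VI_MAX = 1.702     # log of variability index


@dataclass
class Source:
    """Hypothetical record for one blazar candidate of uncertain type."""
    name: str
    gamma_ph: float
    log_radio_flux: float
    log_variability_index: float


def in_bl_lac_zone(src: Source) -> bool:
    """True when all three strict inequalities of the zone hold."""
    return (
        src.gamma_ph < GAMMA_PH_MAX
        and src.log_radio_flux < LOG_FR_MAX
        and src.log_variability_index < LOG_VI_MAX
    )


# Made-up example values, not catalog entries.
examples = [
    Source("bcu_a", 1.95, 1.80, 1.40),
    Source("bcu_b", 2.45, 1.80, 1.40),
]
candidates = [s.name for s in examples if in_bl_lac_zone(s)]
```

Sources outside the zone are left unclassified by this rule; the abstract does not describe it as evidence that such a source is an FSRQ.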

preprint · 2020 · arXiv

Confirmed width-Eiso and width-Liso relations in GRB: comparison with the Amati and Yonetoku relations

In this paper, we select a sample including 141 BEST time-integrated F spectra and 145 BEST peak flux P spectra observed by Konus-Wind with known redshift to recheck the connection between the spectral width and $E_{iso}$ as well as $L_{iso}$. We define six types of absolute spectral widths. It is found that all of the rest-frame absolute spectral widths are strongly positively correlated with $E_{iso}$ as well as $L_{iso}$ for the long bursts, for both the F and P spectra. All of the short bursts are outliers for the width-$E_{iso}$ relation, and most of the short bursts are consistent with the long bursts for the width-$L_{iso}$ relation, for both F and P spectra. Moreover, all of the location energies, $E_{2}$ and $E_{1}$, corresponding to various spectral widths are also positively correlated with $E_{iso}$ as well as $L_{iso}$. We compare all of the relations with the Amati and Yonetoku relations and find that the width-$E_{iso}$ and width-$L_{iso}$ relations, when the widths are at about 90\% of the maximum of the $EF_{E}$ spectra, almost overlap with the Amati relation and the Yonetoku relation, respectively. The correlations of $E_{2}-E_{iso}$, $E_{1}-E_{iso}$ and $E_{2}-L_{iso}$, $E_{1}-L_{iso}$, when the location energies are at 99\% of the maximum of the $EF_{E}$ spectra, are very close to the Amati and Yonetoku relations, respectively. Therefore, we confirm the existence of tight width-$E_{iso}$ and width-$L_{iso}$ relations for long bursts. We further show that the spectral shape is indeed related to $E_{iso}$ and $L_{iso}$. The Amati and Yonetoku relations are not necessarily the best relationships to relate the energy to $E_{iso}$ and $L_{iso}$. They may be special cases of the width-$E_{iso}$ and width-$L_{iso}$ relations or the energy-$E_{iso}$ and energy-$L_{iso}$ relations.

preprint · 2020 · arXiv

Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines

We use pretrained transformer-based language models in SemEval-2020 Task 7: Assessing the Funniness of Edited News Headlines. Inspired by the incongruity theory of humor, we use a contrastive approach to capture the surprise in the edited headlines. In the official evaluation, our system gets 0.531 RMSE in Subtask 1, 11th among 49 submissions. In Subtask 2, our system gets 0.632 accuracy, 9th among 32 submissions.
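
The two scores reported above use standard metrics. The sketch below gives textbook definitions of RMSE and accuracy and is not the task's official scorer; the function names and toy inputs are hypothetical.

```python
import math


def rmse(predicted, gold):
    """Root mean squared error between predicted and gold scores."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, gold)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(predicted, gold):
    """Fraction of predicted labels that equal the gold labels."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy values only, not task data.
score_error = rmse([1.0, 2.0], [1.0, 4.0])
label_accuracy = accuracy([1, 2, 1, 0], [1, 2, 2, 0])
```

The two metrics point in opposite directions: a lower RMSE is better, while a higher accuracy is better.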

preprint · 2019 · arXiv

Evaluating the classification of Fermi BCUs from the 4FGL Catalog Using Machine Learning

The recently published fourth Fermi Large Area Telescope source catalog (4FGL) reports 5065 gamma-ray sources in terms of direct observational gamma-ray properties. Among the sources, the largest population is the Active Galactic Nuclei (AGN), which consists of 3137 blazars, 42 radio galaxies, and 28 other AGNs. The blazar sample comprises 694 flat-spectrum radio quasars (FSRQs), 1131 BL Lac-type objects (BL Lacs), and 1312 blazar candidates of an unknown type (BCUs). The classification of blazars is difficult using optical spectroscopy given the limited knowledge with respect to their intrinsic properties, and the limited availability of astronomical observations. To overcome these challenges, machine learning algorithms are being investigated as alternative approaches. Using the 4FGL catalog, a sample of 3137 Fermi blazars with 23 parameters is systematically selected. Three established supervised machine learning algorithms (random forests (RFs), support vector machines (SVMs), artificial neural networks (ANNs)) are employed to generate predictive models to classify the BCUs. We analyze the results for all of the different combinations of parameters. Interestingly, a previously reported trend, that the use of more parameters leads to higher accuracy, is not found. Considering the least number of parameters used, combinations of eight, 12 or 10 parameters in the SVM, ANN, or RF generated models achieve the highest accuracy (Accuracy $\simeq$ 91.8\%, or $\simeq$ 92.9\%). Using the combined classification results from the optimal combinations of parameters, 724 BL Lac type candidates and 332 FSRQ type candidates are predicted; however, 256 remain without a clear prediction.
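
The counts quoted in this abstract can be checked against each other with a few lines of arithmetic. All numbers below come from the abstract; the variable names are only illustrative.

```python
# Blazar sub-classes reported for the 4FGL sample.
fsrq, bl_lac, bcu = 694, 1131, 1312
blazars = fsrq + bl_lac + bcu

# AGN population: blazars plus radio galaxies and other AGNs.
radio_galaxies, other_agn = 42, 28
agn_total = blazars + radio_galaxies + other_agn

# Outcome of the combined classification of the BCUs.
predicted_bl_lac, predicted_fsrq, unresolved = 724, 332, 256
classified = predicted_bl_lac + predicted_fsrq

# Share of BCUs that received a clear prediction.
classified_fraction = classified / bcu
```

The sub-class counts add up to the 3137 blazars, and the three prediction outcomes add up to the 1312 BCUs, so roughly 80% of the BCUs receive a clear prediction.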