Source author record

Yu Lin

Yu Lin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, math.CA, Artificial Intelligence, cond-mat.mtrl-sci, Social and Information Networks, Computation and Language, Computer Vision, Data Structures and Algorithms, eess.AS, Information Retrieval, math.CV, physics.space-ph, Quantitative Methods, Sound

Catalog footprint

What is connected

16 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Active-SAOOD: Active Sparsely Annotated Oriented Object Detection in Remote Sensing Images

Reducing the annotation cost of oriented object detection in remote sensing remains a major challenge. Recently, sparse annotation has gained attention for effectively reducing annotation redundancy in dense remote sensing scenes. However, (1) the reliance of sparse data on class-dependent sampling and (2) the lack of in-depth investigation into the characteristics of sparse samples hinder its further development. This paper proposes an active learning-based sparsely annotated oriented object detection (SAOOD) method, termed Active-SAOOD. Based on a model state observation module, Active-SAOOD actively selects, at the instance level, the most valuable sparse samples for the current model state by jointly considering orientation, classification, and localization uncertainty, as well as inter- and intra-class diversity. This design enables SAOOD to operate stably under completely randomly initialized sparse annotations and extends its applicability to broader real-world scenarios. Experiments on multiple datasets demonstrate that Active-SAOOD significantly improves both the performance and stability of existing SAOOD methods under various random sparse annotations. In particular, with an annotation ratio of only 1%, it achieves a 9% performance gain over the baseline, further enhancing the practical value of SAOOD in remote sensing. The code will be made public.

preprint, 2026, arXiv

Unifying Speech Recognition, Synthesis and Conversion with Autoregressive Transformers

Traditional speech systems typically rely on separate, task-specific models for text-to-speech (TTS), automatic speech recognition (ASR), and voice conversion (VC), resulting in fragmented pipelines that limit scalability, efficiency, and cross-task generalization. In this paper, we present General-Purpose Audio (GPA), a unified audio foundation model that integrates multiple core speech tasks within a single large language model (LLM) architecture. GPA operates on a shared discrete audio token space and supports instruction-driven task induction, enabling a single autoregressive model to flexibly perform TTS, ASR, and VC without architectural modifications. This unified design combines a fully autoregressive formulation over discrete speech tokens, joint multi-task training across speech domains, and a scalable inference pipeline that achieves high concurrency and throughput. The resulting model family supports efficient multi-scale deployment, including a lightweight 0.3B-parameter variant optimized for edge and resource-constrained environments. Together, these design choices demonstrate that a unified autoregressive architecture can achieve competitive performance across diverse speech tasks while remaining viable for low-latency, practical deployment.

preprint, 2022, arXiv

Dirac nodal lines in the quasi-one-dimensional ternary telluride TaPtTe$_5$

A Dirac nodal-line phase, as a quantum state of topological materials, usually occurs in three-dimensional or at least two-dimensional materials with sufficient symmetry operations to protect the Dirac band crossings. Here, we report a combined theoretical and experimental study on the electronic structure of the quasi-one-dimensional ternary telluride TaPtTe$_5$, which is corroborated as being in a robust nodal-line phase with fourfold degeneracy. Our angle-resolved photoemission spectroscopy measurements show that two pairs of linearly dispersive Dirac-like bands exist in a very large energy window, which extends from a binding energy of $\sim$0.75 eV across the Fermi level. The crossing points are at the boundary of the Brillouin zone and form Dirac-like nodal lines. Using first-principles calculations, we demonstrate the existence of nodal surfaces on the $k_y = \pm\pi$ plane in the absence of spin-orbit coupling (SOC), which are protected by nonsymmorphic symmetry in TaPtTe$_5$. When SOC is included, the nodal surfaces are broken into several nodal lines. By theoretical analysis, we conclude that the nodal lines along $Y$-$T$ and the ones connecting the $R$ points are non-trivial and protected by nonsymmorphic symmetry against SOC.

preprint, 2022, arXiv

Improving Contextual Representation with Gloss Regularized Pre-training

Though achieving impressive results on many NLP tasks, BERT-like masked language models (MLM) encounter a discrepancy between pre-training and inference. In light of this gap, we investigate the contextual representation of pre-training and inference from the perspective of word probability distribution. We discover that BERT risks neglecting contextual word similarity in pre-training. To tackle this issue, we propose adding an auxiliary gloss regularizer module to BERT pre-training (GR-BERT) to enhance word semantic similarity. By predicting masked words and aligning contextual embeddings to corresponding glosses simultaneously, the word similarity can be explicitly modeled. We design two architectures for GR-BERT and evaluate our model in downstream tasks. Experimental results show that the gloss regularizer benefits BERT in word-level and sentence-level semantic representation. GR-BERT achieves a new state of the art on the lexical substitution task and greatly improves BERT sentence representation in both unsupervised and supervised STS tasks.

preprint, 2021, arXiv

A Highly Scalable Labelling Approach for Exact Distance Queries in Complex Networks

Answering exact shortest path distance queries is a fundamental task in graph theory. Despite a tremendous amount of research on the subject, there is still no satisfactory solution that can scale to billion-scale complex networks. Labelling-based methods are well-known for rendering fast response time to distance queries; however, existing works can only construct labelling on moderately large networks (million-scale) and cannot scale to large networks (billion-scale) due to their prohibitively large space requirements and very long preprocessing time. In this work, we present novel techniques to efficiently construct distance labelling and process exact shortest path distance queries for complex networks with billions of vertices and billions of edges. Our method is based on two ingredients: (i) a scalable labelling algorithm for constructing minimal distance labelling, and (ii) a querying framework that supports fast distance-bounded search on a sparsified graph. Thus, we first develop a novel labelling algorithm that can scale to graphs at the billion-scale. Then, we formalize a querying framework for exact distance queries, which combines our proposed highway cover distance labelling with distance-bounded searches to enable fast distance computation. To speed up the labelling construction process, we further propose a parallel labelling method that can construct labelling simultaneously for multiple landmarks. We evaluated the performance of the proposed methods on 12 real-world networks. The experiments show that the proposed methods can not only handle networks with billions of vertices, but also be up to 70 times faster in constructing labelling and save up to 90% of labelling space. In particular, our method can answer distance queries on a billion-scale network of around 8B edges in less than 1 ms, on average.
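The two ingredients named in this abstract, landmark labels plus a distance-bounded search on a graph with the landmarks removed, can be illustrated with a minimal sketch. It is not the paper's highway cover labelling: as a simplifying assumption, each landmark here stores a full BFS distance table rather than a minimal label, the graph is unweighted and undirected, and the names `bfs_distances`, `build_labels` and `query` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source, blocked=frozenset(), max_depth=None):
    """Hop distances from source, never entering blocked vertices.

    If max_depth is given, vertices at that depth are not expanded further.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if max_depth is not None and dist[node] >= max_depth:
            continue
        for nxt in adj[node]:
            if nxt not in dist and nxt not in blocked:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def build_labels(adj, landmarks):
    """Assumption: one full BFS table per landmark, not a minimal labelling."""
    return {r: bfs_distances(adj, r) for r in landmarks}


def query(adj, labels, u, v):
    """Exact distance as the minimum of two cases.

    1. Shortest path through at least one landmark (upper bound from labels).
    2. Shortest path avoiding all landmarks (bounded BFS on the sparsified graph).
    """
    if u == v:
        return 0
    inf = float("inf")
    upper = min(
        (d[u] + d[v] for d in labels.values() if u in d and v in d),
        default=inf,
    )
    if u in labels or v in labels:
        # An endpoint is a landmark, so its own table already gives the exact answer.
        return upper
    # Only paths strictly shorter than the upper bound are worth searching for.
    limit = None if upper == inf else upper - 1
    local = bfs_distances(adj, u, blocked=frozenset(labels), max_depth=limit)
    return min(upper, local.get(v, inf))
```

The bounded search is only ever asked to beat the landmark-based upper bound, which is why removing the landmarks from the graph and capping the search depth does not affect exactness in this sketch.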

preprint, 2021, arXiv

Multiplex Bipartite Network Embedding using Dual Hypergraph Convolutional Networks

A bipartite network is a graph structure where nodes are from two distinct domains and only inter-domain interactions exist as edges. A large number of network embedding methods exist to learn vectorial node representations from general graphs with both homogeneous and heterogeneous node and edge types, including some that can specifically model the distinct properties of bipartite networks. However, these methods are inadequate to model multiplex bipartite networks (e.g., in e-commerce), which have multiple types of interactions (e.g., click, inquiry, and buy) and node attributes. Most real-world multiplex bipartite networks are also sparse and have imbalanced node distributions that are challenging to model. In this paper, we develop an unsupervised Dual HyperGraph Convolutional Network (DualHGCN) model that scalably transforms the multiplex bipartite network into two sets of homogeneous hypergraphs and uses spectral hypergraph convolutional operators, along with intra- and inter-message passing strategies to promote information exchange within and across domains, to learn effective node embeddings. We benchmark DualHGCN using four real-world datasets on link prediction and node classification tasks. Our extensive experiments demonstrate that DualHGCN significantly outperforms state-of-the-art methods, and is robust to varying sparsity levels and imbalanced node distributions.

preprint, 2020, arXiv

Asymptotics of the Charlier polynomials via difference equation methods

We derive uniform and non-uniform asymptotics of the Charlier polynomials by using difference equation methods alone. The Charlier polynomials are special in that they do not fit into the framework of the turning point theory, despite the fact that they are crucial in the Askey scheme. In this paper, asymptotic approximations are obtained respectively in the outside region, an intermediate region, and near the turning points. In particular, we obtain uniform asymptotic approximation at a pair of coalescing turning points with the aid of a local transformation. We also give a uniform approximation at the origin by applying the method of dominant balance and several matching techniques.
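The difference equation underlying this kind of analysis is the three-term recurrence of the polynomials. As a small illustration only (it evaluates the polynomials and does not reproduce any of the asymptotic results), the sketch below assumes the standard hypergeometric normalization $C_n(x;a) = {}_2F_0(-n,-x;;-1/a)$, for which $a\,C_{n+1}(x) = (n + a - x)\,C_n(x) - n\,C_{n-1}(x)$ with $C_0 = 1$ and $C_1 = 1 - x/a$. The abstract does not state which normalization the paper uses, and the function names are hypothetical.

```python
from math import comb


def charlier_recurrence(n, x, a):
    """Evaluate C_n(x; a) by the three-term recurrence (assumed normalization)."""
    if a <= 0:
        raise ValueError("the parameter a must be positive")
    prev, curr = 1.0, 1.0 - x / a  # C_0 and C_1
    if n == 0:
        return prev
    for k in range(1, n):
        # a * C_{k+1} = (k + a - x) * C_k - k * C_{k-1}
        prev, curr = curr, ((k + a - x) * curr - k * prev) / a
    return curr


def charlier_sum(n, x, a):
    """Explicit finite sum of the terminating 2F0 series, used as a cross-check."""
    total = 0.0
    falling = 1.0  # falling factorial x (x - 1) ... (x - k + 1)
    for k in range(n + 1):
        total += comb(n, k) * falling * (-1.0 / a) ** k
        falling *= x - k
    return total
```

Comparing the two functions for small degrees is a quick way to confirm that the recurrence coefficients match the chosen normalization before using it for anything else.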

preprint, 2020, arXiv

Deep Learning on Knowledge Graph for Recommender System: A Survey

Recent advances in research have demonstrated the effectiveness of knowledge graphs (KG) in providing valuable external knowledge to improve recommendation systems (RS). A knowledge graph is capable of encoding high-order relations that connect two objects with one or multiple related attributes. With the help of the emerging Graph Neural Networks (GNN), it is possible to extract both object characteristics and relations from KG, which is an essential factor for successful recommendations. In this paper, we provide a comprehensive survey of the GNN-based knowledge-aware deep recommender systems. Specifically, we discuss the state-of-the-art frameworks with a focus on their core component, i.e., the graph embedding module, and how they address practical recommendation issues such as scalability, cold-start and so on. We further summarize the commonly-used benchmark datasets, evaluation metrics as well as open-source codes. Finally, we conclude the survey and propose potential research directions in this rapidly growing field.

preprint, 2020, arXiv

Magnetohydrodynamic with embedded particle-in-cell simulation of the Geospace Environment Modeling dayside kinetic processes challenge event

We use the MHD with embedded particle-in-cell model (MHD-EPIC) to study the Geospace Environment Modeling (GEM) dayside kinetic processes challenge event at 01:50-03:00 UT on 2015-11-18, when the magnetosphere was driven by a steady southward IMF. In the MHD-EPIC simulation, the dayside magnetopause is covered by a PIC code so that the dayside reconnection is properly handled. We compare the magnetic fields and the plasma profiles of the magnetopause crossing with the MMS3 spacecraft observations. Most variables match the observations well in the magnetosphere, in the magnetosheath, and also during the current sheet crossing. The MHD-EPIC simulation produces flux ropes, and we demonstrate that some magnetic field and plasma features observed by the MMS3 spacecraft can be reproduced by a flux rope crossing event. We use an algorithm to automatically identify the reconnection sites from the simulation results. It turns out that there are usually multiple X-lines at the magnetopause. By tracing the locations of the X-lines, we find the typical moving speed of the X-line endpoints is about 70 km/s, which is higher than but still comparable with the ground-based observations.

preprint, 2020, arXiv

Modeling Dynamic Heterogeneous Network for Link Prediction using Hierarchical Attention with Temporal RNN

Network embedding aims to learn low-dimensional representations of nodes while capturing the structural information of networks. It has achieved great success on many network analysis tasks such as link prediction and node classification. Most existing network embedding algorithms focus on how to learn static homogeneous networks effectively. However, networks in the real world are more complex, e.g., networks may consist of several types of nodes and edges (called heterogeneous information) and may vary over time in terms of dynamic nodes and edges (called evolutionary patterns). Limited work has been done on network embedding of dynamic heterogeneous networks, as it is challenging to learn both evolutionary and heterogeneous information simultaneously. In this paper, we propose a novel dynamic heterogeneous network embedding method, termed DyHATR, which uses hierarchical attention to learn heterogeneous information and incorporates recurrent neural networks with temporal attention to capture evolutionary patterns. We benchmark our method on four real-world datasets for the task of link prediction. Experimental results show that DyHATR significantly outperforms several state-of-the-art baselines.

preprint, 2019, arXiv

Asymptotic approximations of the continuous Hahn polynomials and their zeros

Asymptotic approximations for the continuous Hahn polynomials and their zeros as the degree grows to infinity are established via their three-term recurrence relation. The methods are based on the uniform asymptotic expansions for difference equations developed by Wang and Wong (Numer. Math., 2003) and the matching technique in the complex plane developed by Wang (J. Approx. Theory, 2014).

preprint, 2015, arXiv

Pressure induced metallization with absence of structural transition in layered MoSe2

Layered transition-metal dichalcogenides have emerged as exciting material systems with atomically thin geometries and unique electronic properties. Pressure is a powerful tool for continuously tuning their crystal and electronic structures away from the pristine states. Here, we systematically investigated the pressurized behavior of MoSe2 up to ~60 GPa using multiple experimental techniques and ab initio calculations. MoSe2 evolves from an anisotropic two-dimensional layered network to a three-dimensional structure without a structural transition, which is in complete contrast to MoS2. The role of the chalcogenide anions in stabilizing different layered patterns is underscored by our layer sliding calculations. MoSe2 possesses highly tunable transport properties under pressure, determined by the gradual narrowing of its band-gap followed by metallization. The continuous tuning of its electronic structure and band-gap in the range of visible light to infrared suggests possible energy-variable optoelectronics applications in pressurized transition-metal dichalcogenides.

preprint, 2014, arXiv

Uniform asymptotics for discrete orthogonal polynomials on infinite nodes with an accumulation point

In this paper, we develop the Riemann-Hilbert method to study the asymptotics of discrete orthogonal polynomials on infinite nodes with an accumulation point. To illustrate our method, we consider the Tricomi-Carlitz polynomials $f_n^{(\alpha)}(z)$ where $\alpha$ is a positive parameter. Uniform Plancherel-Rotach type asymptotic formulas are obtained in the entire complex plane including a neighborhood of the origin, and our results agree with the ones obtained earlier in [SIAM J. Math. Anal. 25 (1994)] and [Proc. Amer. Math. Soc. 138 (2010)].

preprint, 2014, arXiv

Weights with both absolutely continuous and discrete components: Asymptotics via the Riemann-Hilbert approach

We study the uniform asymptotics for the orthogonal polynomials with respect to weights composed of both an absolutely continuous measure and a discrete measure, by taking a special class of the sieved Pollaczek polynomials as an example. The Plancherel-Rotach type asymptotics of the sieved Pollaczek polynomials are obtained in the whole complex plane. The Riemann-Hilbert method is applied to derive the results. A main feature of the treatment is the appearance of a new band consisting of two adjacent intervals, one of which is a portion of the support of the absolutely continuous measure, while the other is the discrete band.

preprint, 2013, arXiv

Existence and Uniqueness of Tronquée Solutions of the Third and Fourth Painlevé Equations

It is well-known that the first and second Painlevé equations admit solutions characterised by divergent asymptotic expansions near infinity in specified sectors of the complex plane. Such solutions are pole-free in these sectors and called tronquée solutions by Boutroux. In this paper, we show that similar solutions exist for the third and fourth Painlevé equations as well.

preprint, 2013, arXiv

Phylogenetic Analysis of Cell Types using Histone Modifications

In cell differentiation, a cell of a less specialized type becomes one of a more specialized type, even though all cells have the same genome. Transcription factors and epigenetic marks like histone modifications can play a significant role in the differentiation process. In this paper, we present a simple analysis of cell types and differentiation paths using phylogenetic inference based on ChIP-Seq histone modification data. We propose new data representation techniques and new distance measures for ChIP-Seq data and use these together with standard phylogenetic inference methods to build biologically meaningful trees that indicate how diverse types of cells are related. We demonstrate our approach on H3K4me3 and H3K27me3 data for 37 and 13 types of cells respectively, using the dataset to explore various issues surrounding replicate data, variability between cells of the same type, and robustness. The promising results we obtain point the way to a new approach to the study of cell differentiation.
