Researcher profile

Xiao Yu

Xiao Yu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
9 works
0 followers
13 topics
4 close collaborators

Published work

9 published items

preprint · 2026 · arXiv

ConFit v3: Improving Resume-Job Matching with LLM-based Re-Ranking

A reliable resume-job matching system helps a company find suitable candidates from a pool of resumes and helps a job seeker find relevant jobs from a list of job posts. While recent advances in embedding-based methods such as ConFit and ConFit v2 can efficiently retrieve candidates at scale, the lack of controllability and explainability limits their real-world adaptations. LLM-based re-rankers can address these limitations through reasoning, but existing training recipes are developed on short-document benchmarks and do not account for noise in real-world recruiting data. In this work, we first conduct a systematic analysis over the LLM re-ranker training pipeline for person-job fit, covering inference algorithm design, RL algorithm selection, data processing, and SFT distillation. We find that using multi-pass re-ranking, training with listwise RL objectives, removing noisy samples, and distilling from a stronger LLM before RL significantly improves re-ranking performance. We then aggregate these findings to train ConFit v3 with Qwen3-8B and Qwen3-32B on real-world person-job fit datasets, and find significant improvements over existing best person-job fit systems as well as strong LLMs such as GPT-5 and Claude Opus-4.5. We hope our findings provide useful insights for future research on adapting LLM-based re-rankers to person-job fit systems.

preprint · 2022 · arXiv

A recipe of training neural network-based LDPC decoders

It is known that belief propagation decoding variants of LDPC codes can easily be unrolled as neural networks after flexibly assigning different weights to the message-passing edges. In this paper we focus on how to determine these weights, in the form of trainable parameters, within a deep learning framework. Firstly, a new method is proposed to generate high-quality training data by exploiting an approximation to the targeted mixture density. Then the strong positive correlation between training loss and decoding metrics is fully exposed after tracing the training evolution curves. Lastly, for the purpose of facilitating training convergence and reducing decoding complexity, we highlight the necessity of slashing the number of trainable parameters while emphasizing the locations of the surviving ones, which is justified by extensive simulation.

preprint · 2022 · arXiv

Bilinear integral operator on Morrey-Banach spaces and its application

In this paper, we give the definability of bilinear singular and fractional integral operators on Morrey-Banach spaces, as well as their commutators, and we prove the boundedness of such operators on Morrey-Banach spaces. Moreover, the necessary condition for BMO via the boundedness of bilinear commutators on Morrey-Banach spaces is also given. As an application of our main results, we get the necessary conditions for BMO via the boundedness of bilinear integral operators on weighted Morrey spaces and Morrey spaces with variable exponents. Finally, we obtain the boundedness of bilinear C-Z operators on Morrey spaces with variable exponents.

preprint · 2022 · arXiv

Improving Model Training via Self-learned Label Representations

Modern neural network architectures have shown remarkable success in several large-scale classification and prediction tasks. Part of the success of these architectures is their flexibility to transform the data from the raw input representations (e.g. pixels for vision tasks, or text for natural language processing tasks) to one-hot output encoding. While much of the work has focused on studying how the input gets transformed to the one-hot encoding, very little work has examined the effectiveness of these one-hot labels. In this work, we demonstrate that more sophisticated label representations are better for classification than the usual one-hot encoding. We propose the Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task. These learned labels can significantly cut down on the training time (usually by more than 50%) while often achieving better test accuracies. Our algorithm introduces negligible additional parameters and has a minimal computational overhead. Along with improved training times, our learned labels are semantically meaningful and can reveal hierarchical relationships that may be present in the data.

preprint · 2022 · arXiv

Improving Stack Overflow question title generation with copying enhanced CodeBERT model and bi-modal information

Context: Stack Overflow is very helpful for software developers who are seeking answers to programming problems. Previous studies have shown that a growing number of questions are of low quality and thus obtain less attention from potential answerers. Gao et al. proposed an LSTM-based model (i.e., BiLSTM-CC) to automatically generate question titles from the code snippets to improve the question quality. However, only using the code snippets in the question body cannot provide sufficient information for title generation, and LSTMs cannot capture the long-range dependencies between tokens. Objective: This paper proposes CCBERT, a deep learning based novel model to enhance the performance of question title generation by making full use of the bi-modal information of the entire question body. Method: CCBERT follows the encoder-decoder paradigm and uses CodeBERT to encode the question body into hidden representations, a stacked Transformer decoder to generate predicted tokens, and an additional copy attention layer to refine the output distribution. Both the encoder and decoder perform the multi-head self-attention operation to better capture the long-range dependencies. This paper builds a dataset containing around 200,000 high-quality questions filtered from the data officially published by Stack Overflow to verify the effectiveness of the CCBERT model. Results: CCBERT outperforms all the baseline models on the dataset. Experiments on both code-only and low-resource datasets show the superiority of CCBERT with less performance degradation. The human evaluation also shows the excellent performance of CCBERT concerning both readability and correlation criteria.

preprint · 2022 · arXiv

Robust single-sideband-modulated Raman light generation for atom interferometry by FBG-based optical rectangular filtration

Low-phase-noise and pure-spectrum Raman light is vital for high-precision atom interferometry by two-photon Raman transition. A preferred and prevalent solution for Raman light generation is electro-optic phase modulation. However, phase modulation inherently brings in double sidebands, resulting in residual sideband effects of multiple laser pairs beside Raman light in atom interferometry. Based on a well-designed rectangular fiber Bragg grating and an electro-optic modulator, optical single-sideband modulation has been realized at 1560 nm with a stable suppression ratio better than -25 dB despite intense temperature variations. After optical filtration and frequency doubling, a robust phase-coherent Raman light at 780 nm is generated with a stable SNR of better than -19 dB and facilitates measuring the local gravity successfully. This proposed all-fiber single-sideband-modulated Raman light source, characterized as robust, compact and low-priced, is practical and has potential for field applications of portable atom interferometry.

preprint · 2021 · arXiv

Asymptotic spreading of KPP reactive fronts in heterogeneous shifting environments

We study the asymptotic spreading of Kolmogorov-Petrovsky-Piskunov (KPP) fronts in heterogeneous shifting habitats, with any number of shifting speeds, by further developing the method based on the theory of viscosity solutions of Hamilton-Jacobi equations. Our framework addresses both reaction-diffusion equations and integro-differential equations with a distributed time-delay. The latter leads to a class of limiting equations of Hamilton-Jacobi-type depending on the variable $x/t$ and in which the time and space derivatives are coupled together. We will first establish uniqueness results for these Hamilton-Jacobi equations using elementary arguments, and then characterize the spreading speed in terms of a reduced equation on a one-dimensional domain in the variable $s=x/t$. In terms of the standard Fisher-KPP equation, our results lead to a new class of "asymptotically homogeneous" environments which share the same spreading speed with the corresponding homogeneous environments.

preprint · 2020 · arXiv

Identification of Challenging Highway-Scenarios for the Safety Validation of Automated Vehicles Based on Real Driving Data

For a successful market launch of automated vehicles (AVs), proof of their safety is essential. Due to the open parameter space, an infinite number of traffic situations can occur, which makes the proof of safety an unsolved problem. With the so-called scenario-based approach, all relevant test scenarios must be identified. This paper introduces an approach that finds particularly challenging scenarios from real driving data (RDD) and assesses their difficulty using a novel metric. Starting from the highD data, scenarios are extracted using a hierarchical clustering approach and then assigned to one of nine pre-defined functional scenarios using rule-based classification. The special feature of the subsequent evaluation of the concrete scenarios is that it is independent of the performance of the test vehicle and therefore valid for all AVs. Previous evaluation metrics are often based on the criticality of the scenario, which is, however, dependent on the behavior of the test vehicle and is therefore only conditionally suitable for finding "good" test cases in advance. The results show that with this new approach a reduced number of particularly challenging test scenarios can be derived.

preprint · 2020 · arXiv

Populations with individual variation in dispersal in heterogeneous environments: dynamics and competition with simply diffusing populations

We consider a model for a population in a heterogeneous environment, with logistic type local population dynamics, under the assumption that individuals can switch between two different nonzero rates of diffusion. Such switching behavior has been observed in some natural systems. We study how environmental heterogeneity and the rates of switching and diffusion affect the persistence of the population. The reaction diffusion systems in the models can be cooperative at some population densities and competitive at others. The results extend our previous work on similar models in homogeneous environments. We also consider competition between two populations that are ecologically identical, but where one population diffuses at a fixed rate and the other switches between two different diffusion rates. The motivation for that is to gain insight into when switching might be advantageous versus diffusing at a fixed rate. This is a variation on the classical results for ecologically identical competitors with differing fixed diffusion rates, where it is well known that the slower diffuser wins.