Researcher profile

Xingyuan Chen

Xingyuan Chen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
11 topics
4 close collaborators

Published work

7 published items

preprint | 2026 | arXiv

Echo-LoRA: Parameter-Efficient Fine-Tuning via Cross-Layer Representation Injection

Parameter-efficient fine-tuning (PEFT) has become a practical route for adapting large language models to downstream tasks, with LoRA-style methods being particularly attractive because they are inexpensive to train and easy to deploy. Most LoRA variants, however, revise the update rule within the weight space of each layer and leave the intermediate representations formed by deeper layers largely unused. We propose Echo-LoRA, a cross-layer representation injection method for parameter-efficient fine-tuning. During training, Echo-LoRA collects boundary hidden states from deeper source layers, aggregates them into a sample-level echo representation, and uses lightweight projection and gating networks to inject the resulting signal into shallow LoRA or DoRA modules. Answer-only masking, masked distillation, and stochastic routing are used to keep this auxiliary path stable and to reduce the gap between training and inference. On eight commonsense reasoning benchmarks, Echo-LoRA exceeds the reported LoRA baselines by 5.7 percentage points on average across LLaMA-7B, LLaMA2-7B, and LLaMA3-8B. Under reproduced LoRA baselines in our unified implementation, the average gain is 3.0 points; when combined with DoRA, the gain is 2.7 points. The Echo path is discarded after training, so the deployed model keeps the original low-rank LoRA/DoRA form and adds neither inference-time parameters nor inference computation.
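The claim that the deployed model "keeps the original low-rank LoRA/DoRA form" can be made concrete with the standard LoRA update, W' = W + (alpha / r) * B A. The sketch below shows only that generic merge and the resulting trainable-parameter count; it does not reproduce the Echo path, the gating networks, or DoRA. All function names are hypothetical and the matrices are toy values.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(weight, lora_b, lora_a, alpha, rank):
    """Return the dense matrix W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def lora_trainable_params(d_out, d_in, rank):
    """Count the entries of B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Toy example: a 2x3 frozen weight with a rank-1 adapter.
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lora_b = [[1.0], [2.0]]        # d_out x r
lora_a = [[0.5, 0.0, -0.5]]    # r x d_in
merged = lora_merge(weight, lora_b, lora_a, alpha=2.0, rank=1)

# For a 4096 x 4096 layer with rank 8, the adapter trains 65,536 values
# instead of the 16,777,216 in the full matrix.
adapter_params = lora_trainable_params(4096, 4096, 8)
full_params = 4096 * 4096
```

Once merged, the adapted layer is a single dense matrix of the original shape, which is why a LoRA-form adapter adds no parameters or computation at inference time.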

preprint | 2026 | arXiv

Robust Conditional Conformal Prediction via Branched Normalizing Flow

Conformal prediction (CP) constructs prediction sets with marginal coverage guarantees under the assumption that the calibration and test distributions are identical. However, under distribution shift, existing approaches primarily align marginal conformal score distributions, which is sufficient to preserve marginal coverage but does not control the conditional coverage error at individual test inputs. As a consequence, CP can remain unreliable in regions where the conditional score distributions are mismatched. In this work, we bound the conditional invalidity of CP under distribution shift in terms of the Wasserstein distance between the calibration and test distributions. This result highlights the role of invertible transport in mitigating conditional coverage degradation. Motivated by this insight, we introduce Branched Normalizing Flow (BNF), a two-branch architecture that normalizes a test input to the calibration distribution and transforms the prediction set of the normalized input back to the test distribution while preserving conditional guarantees. Empirically, BNF consistently improves conditional coverage robustness on nine datasets across a wide range of confidence levels.
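For readers unfamiliar with the marginal guarantee the abstract starts from, the following is a minimal sketch of standard split conformal prediction for regression. It is not the BNF method and does not address distribution shift. It assumes an absolute-residual score |y - f(x)|; the function names and calibration values are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the
    requested level, in which case the prediction set is the whole line.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a point prediction with half-width q."""
    return (point_prediction - q, point_prediction + q)


# Toy calibration residuals |y_i - f(x_i)| (illustrative values only).
cal_scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8]
q = conformal_quantile(cal_scores, alpha=0.25)
interval = prediction_interval(2.0, q)
```

Under exchangeability of calibration and test points, intervals built this way cover the true label with probability at least 1 - alpha on average over inputs. That marginal statement is the one the abstract argues is insufficient at individual test inputs.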

preprint | 2022 | arXiv

A flexible split-step scheme for solving McKean-Vlasov Stochastic Differential Equations

We present an implicit Split-Step explicit Euler type Method (dubbed SSM) for the simulation of McKean-Vlasov Stochastic Differential Equations (MV-SDEs) with drifts of superlinear growth in space, Lipschitz in measure and non-constant Lipschitz diffusion coefficient. The scheme is designed to leverage the structure induced by the interacting particle approximation system, including parallel implementation and the solvability of the implicit equation. The scheme attains the classical $1/2$ root mean square error (rMSE) convergence rate in stepsize and closes the gap left by [18, "Simulation of McKean-Vlasov SDEs with super-linear growth" in IMA Journal of Numerical Analysis, 01 2021. draa099] regarding efficient implicit methods and their convergence rate for this class of McKean-Vlasov SDEs. A sufficient condition for mean-square contractivity of the scheme is presented. Several numerical examples are presented, including a comparative analysis to other known algorithms for this class (Taming and Adaptive time-stepping) across parallel and non-parallel implementations.
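The split-step idea can be sketched on a toy interacting particle system. This is an illustration of the general pattern (an implicit sub-step for the superlinear drift, then an explicit Euler step), not the authors' SSM. The drift -x^3, the mean-field term (mean - x), the constant diffusion `sigma`, and all function names are assumptions chosen for simplicity; the paper itself allows a non-constant diffusion coefficient.

```python
import math
import random


def solve_implicit_cubic(x, h, tol=1e-12, max_iter=50):
    """Solve y + h * y**3 = x for y with Newton's method.

    The left-hand side is strictly increasing in y, so the real root is unique.
    """
    y = x
    for _ in range(max_iter):
        residual = y + h * y**3 - x
        if abs(residual) < tol:
            break
        y -= residual / (1.0 + 3.0 * h * y * y)
    return y


def split_step_particles(x0, h, n_steps, sigma=0.5, seed=0):
    """Simulate dX = (-X**3 + (E[X] - X)) dt + sigma dW with N particles."""
    rng = random.Random(seed)
    particles = list(x0)
    for _ in range(n_steps):
        # Implicit sub-step: one scalar equation per particle, so the
        # solves are independent and could run in parallel.
        half = [solve_implicit_cubic(x, h) for x in particles]
        # Explicit Euler sub-step: the empirical mean stands in for E[X].
        mean = sum(half) / len(half)
        particles = [
            y + h * (mean - y) + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
            for y in half
        ]
    return particles


final_state = split_step_particles([2.0, -1.0, 0.5, 3.0], h=0.01, n_steps=100)
```

Handling the cubic term implicitly keeps each sub-step contractive toward zero (|y| <= |x|), which is the usual motivation for implicit treatment of superlinear drifts.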

preprint | 2022 | arXiv

Application of Machine Learning Methods in Inferring Surface Water Groundwater Exchanges using High Temporal Resolution Temperature Measurements

We examine the ability of machine learning (ML) and deep learning (DL) algorithms to infer surface/ground exchange flux based on subsurface temperature observations. The observations and fluxes are produced from a high-resolution numerical model representing conditions in the Columbia River near the Department of Energy Hanford site located in southeastern Washington State. Random measurement error, of varying magnitude, is added to the synthetic temperature observations. The results indicate that both ML and DL methods can be used to infer the surface/ground exchange flux. DL methods, especially convolutional neural networks, outperform the ML methods when used to interpret noisy temperature data with a smoothing filter applied. However, the ML methods also performed well, and they can better identify a reduced number of important observations, which could be useful for measurement network optimization. Surprisingly, the ML and DL methods better inferred upward flux than downward flux. This is in direct contrast to previous findings using numerical models to infer flux from temperature observations, and it may suggest that combined use of ML or DL inference with numerical inference could improve flux estimation beneath river systems.

preprint | 2020 | arXiv

Adding A Filter Based on The Discriminator to Improve Unconditional Text Generation

The autoregressive language model (ALM) trained with maximum likelihood estimation (MLE) is widely used in unconditional text generation. Due to exposure bias, the generated texts still suffer from low quality and diversity. This presents statistically as a discrepancy between the real text and generated text. Some research shows that a discriminator can detect this discrepancy. Because the discriminator can encode more information than the generator, the discriminator has the potential to improve the generator. To alleviate exposure bias, generative adversarial networks (GANs) use the discriminator to update the generator's parameters directly, but they fail when evaluated precisely. A critical reason for the failure is the difference between the discriminator input and the ALM input. We propose a novel mechanism that adds a filter with the same input as the discriminator. First, the discriminator detects the discrepancy signals and passes them to the filter directly (or by learning). Then, we use the filter to reject some generated samples with a sampling-based method. Thus, the original generative distribution is revised to reduce the discrepancy. We experiment with two ALMs, one RNN-based and one Transformer-based. Evaluated precisely by three metrics, our mechanism consistently outperforms the ALMs and all kinds of GANs across two benchmark data sets.

preprint | 2020 | arXiv

Distributional Discrepancy: A Metric for Unconditional Text Generation

The purpose of unconditional text generation is to train a model with real sentences, then generate novel sentences of the same quality and diversity as the training data. However, when different metrics are used for comparing the methods of unconditional text generation, contradictory conclusions are drawn. The difficulty is that both the diversity and quality of the sample should be considered simultaneously when the models are evaluated. To solve this problem, a novel metric of distributional discrepancy (DD) is designed to evaluate generators based on the discrepancy between the generated and real training sentences. However, the DD cannot be computed directly because the distribution of real sentences is unavailable. Thus, we propose a method for estimating the DD by training a neural-network-based text classifier. For comparison, three existing metrics, bilingual evaluation understudy (BLEU) versus self-BLEU, language model score versus reverse language model score, and Fréchet embedding distance, along with the proposed DD, are used to evaluate two popular generative models of long short-term memory and generative pretrained transformer 2 on both syntactic and real data. Experimental results show that DD is significantly better than the three existing metrics for ranking these generative models.

preprint | 2020 | arXiv

Quantum computing cryptography: Finding cryptographic Boolean functions with quantum annealing by a 2000 qubit D-wave quantum computer

As the building block in symmetric cryptography, designing Boolean functions satisfying multiple properties is an important problem in sequence ciphers, block ciphers, and hash functions. However, the search for $n$-variable Boolean functions fulfilling global cryptographic constraints is computationally hard due to the super-exponential size $\mathcal{O}(2^{2^n})$ of the space. Here, we introduce a codification of the cryptographically relevant constraints in the ground state of an Ising Hamiltonian, allowing us to naturally encode it in a quantum annealer, which seems to provide a quantum speedup. Additionally, we benchmark small $n$ cases in a D-Wave machine, showing its capacity for devising bent functions, the most relevant set of cryptographic Boolean functions. We have complemented it with local search and chain repair to improve the D-Wave quantum annealer performance related to the low connectivity. This work shows how to codify super-exponential cryptographic problems into quantum annealers and paves the way for reaching quantum supremacy with an adequately designed chip.
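The bent functions mentioned in the abstract have a simple classical characterisation: for even $n$, a Boolean function is bent exactly when every Walsh-Hadamard coefficient has absolute value $2^{n/2}$. The sketch below checks that property on a truth table. It is a classical verification aid only and says nothing about the Ising encoding or the annealing procedure; the function names are hypothetical.

```python
def walsh_spectrum(truth_table):
    """Walsh-Hadamard spectrum of a Boolean function given as a 0/1 truth table."""
    size = len(truth_table)
    spectrum = [1 - 2 * bit for bit in truth_table]  # map 0 -> +1, 1 -> -1
    step = 1
    while step < size:
        # In-place butterfly of the fast Walsh-Hadamard transform.
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                a, b = spectrum[i], spectrum[i + step]
                spectrum[i], spectrum[i + step] = a + b, a - b
        step *= 2
    return spectrum


def is_bent(truth_table):
    """True if the table has length 2**n with n even and a flat spectrum."""
    size = len(truth_table)
    if size == 0:
        return False
    n = size.bit_length() - 1
    if size != 1 << n or n % 2 != 0:
        return False
    target = 1 << (n // 2)
    return all(abs(w) == target for w in walsh_spectrum(truth_table))


def make_truth_table(f, n):
    """Evaluate f on every n-bit input, encoded as an integer."""
    return [f(x) for x in range(1 << n)]


# f(x1, x2, x3, x4) = x1*x2 XOR x3*x4, a classic bent function on 4 variables.
inner_product = make_truth_table(
    lambda x: ((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)), 4
)
# A linear function is never bent: its spectrum is concentrated on one point.
linear = make_truth_table(lambda x: (x & 1) ^ ((x >> 2) & 1), 4)

bent_flags = (is_bent(inner_product), is_bent(linear))
```

Exhaustive search with such a check is feasible only for very small $n$, since there are $2^{2^n}$ candidate truth tables, which is the scaling problem the abstract describes.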