Kejia Chen

Kejia Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Machine Learning, Biological Physics, cond-mat.soft, Cryptography and Security, physics.data-an, Quantitative Methods, Social and Information Networks

Catalog footprint

What is connected

8 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

Large reasoning models often reach correct answers through flawed intermediate steps, creating a gap between final accuracy and reasoning reliability. Existing alignment strategies address this with external verifiers or massive sampling, limiting scalability. In this work, we introduce CASPO (Confidence-Aware Step-wise Preference Optimization), a framework that aligns token-level confidence with step-wise logical correctness through iterative Direct Preference Optimization, without training a separate reward model. During inference, we propose Confidence-aware Thought (CaT), which leverages this calibrated confidence to dynamically prune uncertain reasoning branches with negligible O(V) latency. Experiments across ten benchmarks and multiple model families show that CASPO consistently improves reasoning reliability and inference efficiency. CASPO scales to Qwen3-8B-Base and surpasses tree-search baselines on AIME'24 and AIME'25 without using reward-model data. We also release a step-wise dataset with confidence annotations to support fine-grained analysis of reasoning reliability. Code is available at https://github.com/Thecommonirin/CASPO.
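As a rough illustration of the pruning idea in this abstract, the sketch below scores each candidate reasoning branch by the mean of its tokens' top softmax probabilities and drops branches that fall under a threshold. The function names, the mean-of-top-probability score and the 0.8 threshold are assumptions made for illustration; the abstract does not say how CaT defines or thresholds confidence. Scoring one token takes a single pass over the V vocabulary logits.

```python
import math
from typing import Dict, List


def token_confidence(logits: List[float]) -> float:
    """Top softmax probability for one token; one O(V) pass over the logits."""
    peak = max(logits)
    # Subtract the peak before exponentiating for numerical stability.
    total = sum(math.exp(x - peak) for x in logits)
    return 1.0 / total  # the largest numerator is exp(peak - peak) == 1


def branch_confidence(token_logits: List[List[float]]) -> float:
    """Mean token confidence over a non-empty branch (an assumed aggregation)."""
    return sum(token_confidence(row) for row in token_logits) / len(token_logits)


def prune_branches(branches: Dict[str, List[List[float]]],
                   threshold: float = 0.8) -> List[str]:
    """Return the names of branches whose confidence reaches the threshold."""
    return [name for name, rows in branches.items()
            if branch_confidence(rows) >= threshold]


# Toy logits over a two-word vocabulary (hypothetical data).
branches = {
    "confident": [[10.0, 0.0], [8.0, 0.0]],
    "hesitant": [[0.0, 0.0], [0.1, 0.0]],
}
kept = prune_branches(branches)
print(kept)
```

In this toy example the "hesitant" branch averages close to 0.5, so it is pruned; a real system would obtain the logits from the model's decoding step.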

preprint · 2026 · arXiv

Evaluating the Utility of Personal Health Records in Personalized Health AI

Patient-managed Personal Health Records (PHRs) promise to empower patients to better understand their health, but the information in a record is complex, potentially hindering insights. In this study, we assess the potential of large language models (LLMs; here, Gemini 3.0 Flash) to provide helpful answers to user health queries when given clinical data from PHRs as context. A total of 2,257 user queries were drawn from 3 different distributions to represent patient questions: shorter web search queries, longer questions derived from templates of chatbot conversations, and questions patients asked their healthcare team (patient calls). Queries were matched with de-identified PHRs (from a pool of 1,945). Gemini responses were generated (1) without PHR context; (2) with a basic summary of demographics, conditions, and medications; and (3) with full, extensive clinical notes. For evaluation, we leveraged an existing rating framework (SHARP) and developed a new framework for specific error modes when interpreting PHRs. Evaluation was performed using autoraters for the full set and clinician ratings for a subset (n=95), with both sets of raters knowing the full PHR context. We see significant improvements in the helpfulness of answers to all question types with PHR data (p < 0.001, paired t-test). We also observe potential gains in safety, accuracy, relevance and personalization of answers. Our PHR evaluation framework further identifies gaps in LLM understanding of particular aspects of complex PHRs, such as temporal disorientation, and rare but meaningful confabulations. These results suggest potential for PHR data to help people with a wide range of user needs, and provide a framework for monitoring for gaps in LLM answers based on PHR context. This study motivates further work to assess and realize potential benefits to users from understanding their health records.

preprint · 2026 · arXiv

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

Many-shot jailbreaking (MSJ) causes safety-aligned language models to answer harmful queries by preceding them with many harmful question-answer demonstrations. We study why this attack becomes stronger as the number of demonstrations increases. Empirically, we find that MSJ induces a progressive activation drift: the representation of a fixed harmful query moves step by step away from the safety-aligned region as more harmful demonstrations are added. Theoretically, we show that this drift can be interpreted as implicit malicious fine-tuning: conditioning on N harmful demonstrations induces SGD-style updates equivalent to optimizing on the corresponding N harmful samples. This view turns the attack mechanism into a defense principle. We append a fixed one-shot safety demonstration at inference time, which induces a counteracting safety-oriented update and restores refusal behavior. The resulting method improves the model's robustness to MSJ without modifying its parameters or requiring white-box access at deployment. Code is available at https://github.com/Thecommonirin/SafeEnd.

preprint · 2026 · arXiv

Safety at One Shot: Patching Fine-Tuned LLMs with A Single Instance

Fine-tuning safety-aligned large language models (LLMs) can substantially compromise their safety. Previous approaches require many safety samples or calibration sets, which not only incur significant computational overhead during realignment but also lead to noticeable degradation in model utility. In contrast, we show that safety alignment can be fully recovered with only a single safety example, without sacrificing utility and at minimal cost. Remarkably, this recovery is effective regardless of the number of harmful examples used in fine-tuning or the size of the underlying model, and convergence is achieved within just a few epochs. Furthermore, we uncover the low-rank structure of the safety gradient, which explains why such efficient correction is possible. We validate our findings across five safety-aligned LLMs and multiple datasets, demonstrating the generality of our approach.

preprint · 2026 · arXiv

Understanding and Preserving Safety in Fine-Tuned LLMs

Fine-tuning is an essential and pervasive functionality for applying large language models (LLMs) to downstream tasks. However, it has the potential to substantially degrade safety alignment, e.g., by greatly increasing susceptibility to jailbreak attacks, even when the fine-tuning data is entirely harmless. Although defenses at the fine-tuning stage are attracting growing attention, existing methods struggle with a persistent safety-utility dilemma: emphasizing safety compromises task performance, whereas prioritizing utility typically requires deep fine-tuning that inevitably leads to a steep decline in safety. In this work, we address this dilemma by shedding new light on the geometric interaction between safety- and utility-oriented gradients in safety-aligned LLMs. Through systematic empirical analysis, we uncover three key insights: (I) safety gradients lie in a low-rank subspace, while utility gradients span a broader high-dimensional space; (II) these subspaces are often negatively correlated, causing directional conflicts during fine-tuning; and (III) the dominant safety direction can be efficiently estimated from a single sample. Building upon these novel insights, we propose safety-preserving fine-tuning (SPF), a lightweight approach that explicitly removes gradient components conflicting with the low-rank safety subspace. Theoretically, we show that SPF guarantees utility convergence while bounding safety drift. Empirically, SPF consistently maintains downstream task performance and recovers nearly all pre-trained safety alignment, even under adversarial fine-tuning scenarios. Furthermore, SPF exhibits robust resistance to both deep fine-tuning and dynamic jailbreak attacks. Together, our findings provide new mechanistic understanding and practical guidance toward always-aligned LLM fine-tuning.
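To make "removing gradient components that conflict with the safety subspace" concrete, here is a minimal sketch for the rank-one case. It assumes a single safety direction and treats a negative inner product between the utility gradient and that direction as a conflict; both are illustrative assumptions, and the function name `remove_conflict` is hypothetical. The abstract does not give SPF's actual update rule, so this is a generic gradient projection and should not be read as the authors' algorithm.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def remove_conflict(utility_grad: List[float],
                    safety_dir: List[float]) -> List[float]:
    """Project out the part of utility_grad that opposes safety_dir.

    Assumption: a negative inner product marks a conflict. Gradients that
    already agree with the safety direction are returned unchanged.
    """
    norm_sq = dot(safety_dir, safety_dir)
    if norm_sq == 0.0:
        return list(utility_grad)  # no usable safety direction
    coeff = dot(utility_grad, safety_dir) / norm_sq
    if coeff >= 0.0:
        return list(utility_grad)  # no conflict to remove
    # Subtract the conflicting component along the safety direction.
    return [g - coeff * s for g, s in zip(utility_grad, safety_dir)]


safety_dir = [0.0, 1.0]  # hypothetical dominant safety direction
conflicting = remove_conflict([1.0, -2.0], safety_dir)
agreeing = remove_conflict([1.0, 2.0], safety_dir)
print(conflicting, agreeing)
```

After the projection, the conflicting gradient has zero component along the safety direction, while the agreeing gradient passes through untouched. A low-rank subspace with several directions would repeat the same step against an orthonormal basis.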

preprint · 2022 · arXiv

Layer Imbalance Aware Multiplex Network Embedding

Multiplex network embedding is an effective technique to jointly learn the low-dimensional representations of nodes across network layers. However, the number of edges may vary significantly among layers. This data imbalance degrades performance, especially on the sparse layer, because of learning bias and the adverse effects of irrelevant or conflicting data in other layers. In this paper, a Layer Imbalance Aware Multiplex Network Embedding (LIAMNE) method is proposed in which the edges in auxiliary layers are under-sampled based on the node similarity in the embedding space of the target layer, to achieve a balanced edge distribution and to minimize noisy relations that are less relevant to the target layer. Real-world datasets with different degrees of layer imbalance are used for experimentation. The results demonstrate that LIAMNE significantly outperforms several state-of-the-art multiplex network embedding methods in link prediction on the target layer. Meanwhile, the comprehensive representation of the entire multiplex network is not compromised by the sampling method, as evaluated by its performance on the node classification task.
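The under-sampling step described above can be sketched as follows. This version keeps the auxiliary-layer edges whose endpoints are most similar in the target layer's embedding space, up to a fixed budget. Cosine similarity, the deterministic top-k rule and the names `undersample_edges` and `budget` are assumptions made for illustration; the abstract does not state the similarity measure or whether LIAMNE samples edges deterministically or at random.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def undersample_edges(aux_edges: List[Edge],
                      target_embeddings: Dict[str, List[float]],
                      budget: int) -> List[Edge]:
    """Keep the `budget` auxiliary edges most similar in the target space."""
    ranked = sorted(
        aux_edges,
        key=lambda e: cosine(target_embeddings[e[0]], target_embeddings[e[1]]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical target-layer embeddings and auxiliary-layer edges.
target_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
aux_edges = [("a", "c"), ("a", "b"), ("a", "d")]
kept_edges = undersample_edges(aux_edges, target_embeddings, budget=2)
print(kept_edges)
```

Setting the budget to the number of edges in the target layer is one way to balance the layers; the edge between "a" and "d", whose endpoints point in opposite directions in the target space, is the first to be dropped.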

preprint · 2013 · arXiv

Diagnosing Heterogeneous Dynamics in Single Molecule/Particle Trajectories with Multiscale Wavelets

We describe a simple automated method to extract and quantify transient heterogeneous dynamical changes from large datasets generated in single molecule/particle tracking experiments. Based on the wavelet transform, the method transforms raw data to locally match dynamics of interest. This is accomplished using statistically adaptive universal thresholding, whose advantage is to avoid a single arbitrary threshold that might conceal individual variability across populations. How to implement this multiscale method is described, focusing on local confined diffusion separated by transient transport periods or hopping events, with 3 specific examples: in cell biology, biotechnology, and glassy colloid dynamics. This computationally efficient method can routinely analyze hundreds of millions of data points within an hour on a desktop personal computer.
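A small sketch may help show what a data-adaptive threshold on wavelet coefficients looks like. It computes finest-scale Haar detail coefficients of a one-dimensional trajectory and flags those exceeding the standard universal threshold, sigma * sqrt(2 ln n), with sigma estimated from the median absolute coefficient. The Haar wavelet, the single scale and the function names are assumptions for illustration; the abstract does not specify the wavelet family or the exact thresholding variant the authors use.

```python
import math
import statistics
from typing import List


def haar_details(signal: List[float]) -> List[float]:
    """Finest-scale Haar detail coefficients of non-overlapping sample pairs."""
    return [(signal[i] - signal[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(signal) - 1, 2)]


def universal_threshold(details: List[float]) -> float:
    """sigma * sqrt(2 ln n); sigma comes from the median absolute coefficient."""
    sigma = statistics.median(abs(d) for d in details) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(details)))


def flag_events(signal: List[float]) -> List[int]:
    """Indices of sample pairs whose detail coefficient exceeds the threshold."""
    details = haar_details(signal)
    threshold = universal_threshold(details)
    return [i for i, d in enumerate(details) if abs(d) > threshold]


# Hypothetical trajectory: confined jitter near 0, a hop, then jitter near 5.
trajectory = [0.0, 0.1, 0.0, -0.1, 0.1, 0.0, 0.0, 5.0,
              5.1, 5.0, 4.9, 5.0, 5.1, 5.0, 5.0, 4.9]
events = flag_events(trajectory)
print(events)
```

Because the threshold is derived from each trajectory's own noise level, no single cutoff has to be chosen for the whole population. A hop that falls between two pairs is invisible at this one scale, which is one reason a multiscale analysis is needed in practice.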

preprint · 2013 · arXiv

Single-Molecule Observation of Long Jumps in Polymer Adsorption

Single-molecule fluorescence imaging of adsorption onto initially-bare surfaces shows that polymer chains need not localize immediately after arrival. In a system optimized to present limited adsorption sites (quartz surface to which polyethylene glycol (PEG) is exposed in aqueous solution at pH = 8.2) we find that some chains diffuse back into bulk solution and re-adsorb at some distance away, sometimes multiple times before either they localize at a stable position or else diffuse away into bulk solution. This mechanism of surface diffusion is considerably more rapid than the classical model in which adsorbed polymers crawl on surfaces while the entire molecule remains adsorbed. The trajectories with jumps follow a truncated Levy distribution of step size with limiting slope -2.5, consistent with a well-defined, rapid surface diffusion coefficient over the times we observe.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint