Source author record

Yuan Xin

Yuan Xin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: hep-th, cond-mat.str-el, Computation and Language, cond-mat.stat-mech, Cryptography and Security, hep-lat, Artificial Intelligence, quant-ph

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators

Preprint, 2026, arXiv

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

As Large Language Models (LLMs) are increasingly integrated into academic peer review, their vulnerability to adversarial prompts -- adversarial instructions embedded in submissions to manipulate outcomes -- emerges as a critical threat to scholarly integrity. To counter this, we propose a novel adversarial framework where a Generator model, trained to create sophisticated attack prompts, is jointly optimized with a Defender model tasked with their detection. This system is trained using a loss function inspired by Information Retrieval Generative Adversarial Networks, which fosters a dynamic co-evolution between the two models, forcing the Defender to develop robust capabilities against continuously improving attack strategies. The resulting framework demonstrates significantly enhanced resilience to novel and evolving threats compared to static defenses, thereby establishing a critical foundation for securing the integrity of peer review.

Preprint, 2025, arXiv

Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?

As large language models (LLMs) are increasingly deployed, ensuring their safe use is paramount. Jailbreaks, adversarial prompts that bypass model alignment to trigger harmful outputs, present significant risks, with existing studies reporting high success rates against common LLMs. However, previous evaluations have focused solely on the models, neglecting the full deployment pipeline, which typically incorporates additional safety mechanisms like content moderation filters. To address this gap, we present the first systematic evaluation of jailbreak attacks targeting LLM safety alignment, assessing their success across the full inference pipeline, including both input and output filtering stages. Our findings yield two key insights: first, nearly all evaluated jailbreak techniques can be detected by at least one safety filter, suggesting that prior assessments may have overestimated the practical success of these attacks; second, while safety filters are effective in detection, there remains room to better balance recall and precision to further optimize protection and user experience. We highlight critical gaps and call for further refinement of detection accuracy and usability in LLM safety systems.

Preprint, 2022, arXiv

Bootstrapping $N_f=4$ conformal QED$_3$

We present the results of a conformal bootstrap study of the presumed unitary IR fixed point of quantum electrodynamics in three dimensions (QED$_3$) coupled to $N_f=4$ two-component Dirac fermions. Specifically, we study the four-point correlators of the $SU(4)$ adjoint fermion bilinear $r$ and the monopole of lowest topological charge $\mathcal{M}_{1/2}$. Most notably, the scaling dimensions of the fermion bilinear $r$ and the monopole $\mathcal{M}_{1/2}$ are found to be constrained into a closed island with a combination of spectrum assumptions inspired by the $1/N_f$ perturbative results as well as a novel interval positivity constraint on the next-lowest-charge monopole $\mathcal{M}_1$. Bounds in this island on the $SU(4)$ and topological $U(1)_t$ conserved current central charges $c_J$, $c_J^t$, as well as on the stress tensor central charge $c_T$, are comfortably consistent with the perturbative results. Together with the scaling dimensions, this suggests that some of the estimates from the $1/N_f$ expansion -- even at $N_f=4$ -- provide a self-consistent solution to the bootstrap crossing relations, despite some of our assumptions not being strictly justified.

Preprint, 2022, arXiv

Giving Hamiltonian Truncation a Boost

We study Hamiltonian truncation in boosted frames. We consider the thermal and magnetic field deformations of the 2d Ising model using TCSA at finite momentum. We find that even with moderate momenta, the spectrum and time-dependent correlation functions become significantly less dependent on the volume of the system. This allows for a more reliable determination of infinite volume observables.

Preprint, 2020, arXiv

Introduction to Lightcone Conformal Truncation: QFT Dynamics from CFT Data

We both review and augment the lightcone conformal truncation (LCT) method. LCT is a Hamiltonian truncation method for calculating dynamical quantities in QFT in infinite volume. This document is a self-contained, pedagogical introduction and "how-to" manual for LCT. We focus on 2D QFTs which have UV descriptions as free CFTs containing scalars, fermions, and gauge fields, providing a rich starting arena for LCT applications. Along our way, we develop several new techniques and innovations that greatly enhance the efficiency and applicability of LCT. These include the development of CFT radial quantization methods for computing Hamiltonian matrix elements and a new SUSY-inspired way of avoiding state-dependent counterterms and maintaining chiral symmetry. We walk readers through the construction of their own basic LCT code, sufficient for small truncation cutoffs. We also provide a more sophisticated and comprehensive set of Mathematica packages and demonstrations that can be used to study a variety of 2D models. We guide the reader through these packages with several examples and illustrate how to obtain QFT observables, such as spectral densities and the Zamolodchikov $C$-function. Specific models considered are finite $N_c$ QCD, scalar $ϕ^4$ theory, and Yukawa theory.

Preprint, 2020, arXiv

Supersymmetric SYK model and random matrix theory

In this paper, we investigate the effect of supersymmetry on the symmetry classification of random matrix theory ensembles. We mainly consider the random matrix behaviors in the $\mathcal{N}=1$ supersymmetric generalization of the Sachdev-Ye-Kitaev (SYK) model, a toy model for the two-dimensional quantum black hole with supersymmetric constraint. Some analytical arguments and numerical results are given to show that the statistics of the supersymmetric SYK model could be interpreted as random matrix theory ensembles, with a different eight-fold classification from the original SYK model and some new features. The time-dependent evolution of the spectral form factor is also investigated, where predictions from random matrix theory are governing the late time behavior of the chaotic Hamiltonian with supersymmetry.
