Source author record

Kai Tang

Kai Tang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, physics.acc-ph, quant-ph, Graphics, physics.optics, Computation and Language, Computational Geometry, cond-mat.supr-con, hep-ex, Machine Learning, physics.atom-ph, physics.ins-det

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators


preprint 2026 arXiv

EXPO: Exploration-Prioritized Policy Optimization via Adaptive KL Regulation and Gaussian Curriculum Sampling

Reinforcement Learning with Verifiable Rewards (RLVR) has become the standard paradigm for LLM mathematical reasoning, where Group Relative Policy Optimization (GRPO) serves as the mainstream algorithm. We point out two understudied inefficiencies existing in GRPO. First, the fixed KL penalty coefficient overly restricts policy exploration at stages where the model requires significant deviation from the reference policy. Second, uniform sampling of training questions ignores that moderately difficult problems provide the most informative gradient signals for optimization. We propose Exploration-Prioritized Policy Optimization (EXPO) with two lightweight plug-in modules. The Accuracy-Conditioned KL Scaling (AKL) dynamically adjusts KL regularization strength through a smooth nonlinear function of batch average accuracy, relaxing the penalty when the model underperforms and strengthening it when the model achieves good results. The Gaussian Curriculum Sampling (GCS) assigns sampling weights to questions following a Gaussian distribution centered at moderate accuracy around 0.5, focusing training on the model's learning frontier. We conduct extensive experiments on DeepSeek-R1-Distill-Qwen-1.5B and Qwen3-8B-Base over six mathematical reasoning benchmarks. The results show EXPO steadily surpasses vanilla GRPO. It obtains an absolute gain of 13.34 on AIME 2025 pass@32, rising from 63.33 percent to 76.67 percent, and achieves an average pass@32 improvement of 2.66 on the 8B model. The much larger performance gains on pass@32 compared with pass@1 demonstrate that EXPO effectively enlarges the model's exploration boundary under a fixed inference cost budget.
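The Gaussian Curriculum Sampling idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract only states that weights follow a Gaussian centered at an accuracy of about 0.5, so the width `sigma`, the function name, and the example accuracies are all assumptions.

```python
import math

def gaussian_curriculum_weights(accuracies, center=0.5, sigma=0.2):
    """Return normalized sampling weights that peak at `center` accuracy.

    `sigma` is a hypothetical width; the abstract does not give one.
    """
    raw = [math.exp(-((a - center) ** 2) / (2.0 * sigma ** 2)) for a in accuracies]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical per-question accuracies estimated from recent rollouts.
accs = [0.0, 0.25, 0.5, 0.75, 1.0]
weights = gaussian_curriculum_weights(accs)
```

Questions the model always solves or always fails receive the smallest weights here, while questions near 50% accuracy are sampled most often.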

preprint 2026 arXiv

FG-ExPO: Frontier-Guided Exploration-Prioritized Policy Optimization via Adaptive KL and Gaussian Curriculum

Reinforcement Learning with Verifiable Rewards (RLVR) has become the standard paradigm for LLM mathematical reasoning, with Group Relative Policy Optimization (GRPO) serving as the dominant algorithm. We identify two overlooked inefficiencies inherent in GRPO. First, a fixed KL coefficient overly restricts policy exploration at moments when the model needs to diverge significantly from the reference policy. Second, uniform question sampling overlooks that moderately difficult problems produce the most informative gradient signals. We propose FG-ExPO, short for Frontier-Guided Exploration-Prioritized Policy Optimization, which integrates two lightweight components. Accuracy-Conditioned KL Scaling (AKL) adjusts the KL penalty strength through a smooth nonlinear function of batch average accuracy, loosening the constraint when the model performs poorly and strengthening it when the model achieves satisfactory results. Gaussian Curriculum Sampling (GCS) assigns sampling weights to questions following a Gaussian distribution centered at a moderate accuracy level around 0.5, focusing model training on its learning frontier. We conduct evaluations on DeepSeek-R1-Distill-Qwen-1.5B and Qwen3-8B-Base across six mainstream mathematical reasoning benchmarks. Experimental results demonstrate that FG-ExPO consistently outperforms vanilla GRPO. It delivers an absolute improvement of 13.34 on the AIME 2025 pass@32 metric, rising from 63.33 percent to 76.67 percent, and obtains an average pass@32 gain of 2.66 on the 8B model. The substantially larger performance gains observed on pass@32 compared to pass@1 verify that FG-ExPO enlarges the model's effective exploration space under a fixed inference budget.
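One way to picture Accuracy-Conditioned KL Scaling is a coefficient that rises smoothly with batch accuracy. The sketch below assumes a logistic gate; the abstract only says "smooth nonlinear function", so the functional form and every parameter (`beta_min`, `beta_max`, `midpoint`, `steepness`) are hypothetical.

```python
import math

def akl_coefficient(batch_accuracy, beta_min=0.0, beta_max=0.04,
                    midpoint=0.5, steepness=10.0):
    """Interpolate the KL coefficient between beta_min and beta_max.

    Assumption: a logistic gate in batch accuracy. Low accuracy relaxes the
    penalty (closer to beta_min); high accuracy strengthens it.
    """
    gate = 1.0 / (1.0 + math.exp(-steepness * (batch_accuracy - midpoint)))
    return beta_min + (beta_max - beta_min) * gate

low = akl_coefficient(0.1)   # weak penalty while the model underperforms
mid = akl_coefficient(0.5)   # halfway between the two bounds
high = akl_coefficient(0.9)  # strong penalty once accuracy is high
```

In a GRPO-style loss the returned value would multiply the KL term for the current batch; how the authors wire it in is not stated in the abstract.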

preprint 2026 arXiv

Learning from Prompt itself: the Hierarchical Attribution Prompt Optimization

Optimization is fundamental across numerous disciplines, typically following an iterative process of refining an initial solution to enhance performance. This principle is equally critical in prompt engineering, where designing effective prompts for large language models constitutes a complex optimization challenge. A structured optimization approach requires automated or semi-automated procedures to develop improved prompts, thereby reducing manual effort, improving performance, and yielding an interpretable process. However, current prompt optimization methods often induce prompt drift, where new prompts fix prior failures but impair performance on previously successful tasks. Additionally, generating prompts from scratch can compromise interpretability. To address these limitations, this study proposes the Hierarchical Attribution Prompt Optimization (HAPO) framework, which introduces three innovations: (1) a dynamic attribution mechanism targeting error patterns in training data and prompting history, (2) semantic-unit optimization for editing functional prompt segments, and (3) multimodal-friendly progression supporting both end-to-end LLM and LLM-MLLM workflows. Applied in contexts like single/multi-image QA (e.g., OCRV2) and complex task analysis (e.g., BBH), HAPO demonstrates enhanced optimization efficiency, outperforming comparable automated prompt optimization methods and establishing an extensible paradigm for scalable prompt engineering.

preprint 2022 arXiv

Entanglement-Enhanced Quantum Metrology in Colored Noise by Quantum Zeno Effect

In open quantum systems, the precision of metrology inevitably suffers from noise. In Markovian open quantum dynamics, the precision cannot be improved by using entangled probes, although the measurement time is effectively shortened. However, it was predicted over a decade ago that in non-Markovian dynamics the error can be significantly reduced by the quantum Zeno effect (QZE) [Chin, Huelga, and Plenio, Phys. Rev. Lett. 109, 233601 (2012)]. In this work, we apply a recently developed quantum simulation approach to experimentally verify that entangled probes can improve the precision of metrology by the QZE. Up to $n=7$ qubits, we demonstrate that the precision is improved by a factor of $n^{1/4}$, which is consistent with the theoretical prediction. Our quantum simulation approach may provide an intriguing platform for experimental verification of various quantum metrology schemes.
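To give a feel for the size of the $n^{1/4}$ improvement quoted above, the following short calculation tabulates the factor for 1 to 7 qubits. It only evaluates the stated scaling; the function name is illustrative and no experimental data are involved.

```python
def zeno_precision_gain(n_qubits):
    """Precision improvement factor n**(1/4) quoted in the abstract."""
    return n_qubits ** 0.25

gains = {n: zeno_precision_gain(n) for n in range(1, 8)}
for n, g in gains.items():
    print(f"n = {n}: gain = {g:.3f}")
```

For the largest case, n = 7, the factor is roughly 1.63.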

preprint 2022 arXiv

Experimental quantum simulation of non-Hermitian dynamical topological states using stochastic Schrödinger equation

Noise is ubiquitous in real quantum systems, leading to non-Hermitian quantum dynamics, and may affect the fundamental states of matter. Here we report an experimental quantum simulation of the two-dimensional non-Hermitian quantum anomalous Hall (QAH) model using a nuclear magnetic resonance processor. Unlike the usual experiments using auxiliary qubits, we develop a stochastic average approach based on the stochastic Schrödinger equation to realize the non-Hermitian dissipative quantum dynamics, which saves quantum simulation resources and simplifies the implementation of quantum gates. We demonstrate the stability of dynamical topology against weak noise, and observe two types of dynamical topological transitions driven by strong noise. Moreover, we observe a region in which the emergent topology is always robust regardless of the noise strength. Our work shows a feasible quantum simulation approach for dissipative quantum dynamics with the stochastic Schrödinger equation and opens a route to investigating non-Hermitian dynamical topological physics.
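The general idea of obtaining dissipative dynamics by averaging over noisy pure-state trajectories, rather than by adding auxiliary qubits, can be shown with a toy model. The sketch below is not the QAH simulation from the paper: it assumes a single qubit in the state (|0> + |1>)/sqrt(2) that picks up a random Gaussian phase in each trajectory, and shows that the trajectory average loses coherence as exp(-variance/2).

```python
import cmath
import math
import random

def averaged_density_matrix(phase_std, n_trajectories=20000, seed=0):
    """Average |psi><psi| over trajectories with a random relative phase.

    Toy assumption: each trajectory applies diag(e^{-i phi/2}, e^{+i phi/2})
    with phi drawn from a zero-mean Gaussian of standard deviation phase_std.
    """
    rng = random.Random(seed)
    rho = [[0j, 0j], [0j, 0j]]
    for _ in range(n_trajectories):
        phi = rng.gauss(0.0, phase_std)
        psi = [cmath.exp(-0.5j * phi) / math.sqrt(2),
               cmath.exp(0.5j * phi) / math.sqrt(2)]
        for i in range(2):
            for j in range(2):
                rho[i][j] += psi[i] * psi[j].conjugate() / n_trajectories
    return rho

rho = averaged_density_matrix(phase_std=1.0)
expected_coherence = 0.5 * math.exp(-0.5)  # analytic value for phase_std = 1
```

Every individual trajectory stays pure and unitary; only the ensemble average shows the decay of the off-diagonal element, which is the sense in which the averaging replaces an explicit environment.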

preprint 2022 arXiv

Experimental Realization of a Quantum Refrigerator Driven by Indefinite Causal Orders

Indefinite causal order (ICO) plays a key role in recent quantum technologies. Here, we experimentally study quantum thermodynamics driven by ICO on nuclear spins using the nuclear magnetic resonance system. We realize the ICO of two thermalizing channels to exhibit how the mechanism works, and show that the working substance can be cooled or heated even though it is in thermal contact with reservoirs of the same temperature. Moreover, we construct a single cycle of the ICO refrigerator based on the Maxwell's demon mechanism, and evaluate its performance by measuring the work consumption and the heat energy extracted from the low-temperature reservoir. Unlike classical refrigerators, in which the coefficient of performance (COP) grows as the temperatures of the high-temperature and low-temperature reservoirs approach each other, the ICO refrigerator's COP is always bounded to small values due to the non-unit success probability in projecting the ancillary qubit to the preferable subspace. To enhance the COP, we propose and experimentally demonstrate a general framework based on the density matrix exponentiation (DME) approach, as an extension to the ICO refrigeration. The COP is observed to be enhanced by more than a factor of three with the DME approach. Our work demonstrates a new way for non-classical heat exchange, and paves the way towards construction of quantum refrigerators on a quantum system.
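The classical behaviour that the abstract contrasts against can be made concrete with the ideal (Carnot) bound for a refrigerator, COP = T_cold / (T_hot - T_cold). The sketch below evaluates only this textbook bound; it says nothing about the ICO refrigerator's own COP, and the temperatures are arbitrary example values.

```python
def carnot_cop(t_cold, t_hot):
    """Ideal refrigerator COP for reservoir temperatures in kelvin."""
    if not 0.0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot")
    return t_cold / (t_hot - t_cold)

# The bound grows without limit as the two temperatures approach each other.
cop_wide = carnot_cop(300.0, 310.0)    # 10 K gap
cop_narrow = carnot_cop(300.0, 301.0)  # 1 K gap
```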

preprint 2020 arXiv

Geodesic Distance Field-based Curved Layer Volume Decomposition for Multi-Axis Support-free Printing

This paper presents a new curved layer volume decomposition method for multi-axis support-free printing of freeform solid parts. Given a solid model to be printed that is represented as a tetrahedral mesh, we first establish a geodesic distance field embedded on the mesh, whose value at any vertex is the geodesic distance to the base of the model. Next, the model is naturally decomposed into curved layers by interpolating a number of iso-geodesic distance surfaces (IGDSs). These IGDSs, which morph from the bottom up in an intrinsic and smooth way owing to the nature of geodesics, are used as the curved printing layers that are friendly to multi-axis printing. In addition, to cater to the collision-free requirement and to improve the printing efficiency, we also propose a printing sequence optimization algorithm for determining the printing order of the IGDSs, which helps reduce the air-move path length. Ample experiments in both computer simulation and physical printing are performed, and the experimental results confirm the advantages of our method.
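A rough sketch of the distance-field step follows. It approximates geodesic distance by shortest paths along mesh edges (multi-source Dijkstra from the base vertices) and then bins vertices into layers by distance. This is an assumption for illustration: the abstract does not say how the field is computed, and the function names, the tiny example mesh, and the `thickness` parameter are hypothetical.

```python
import heapq
import math

def geodesic_field(vertices, edges, base_ids):
    """Edge-path distance from every vertex to the nearest base vertex."""
    adjacency = {i: [] for i in range(len(vertices))}
    for a, b in edges:
        length = math.dist(vertices[a], vertices[b])
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))
    dist = [math.inf] * len(vertices)
    heap = []
    for s in base_ids:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, length in adjacency[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

def layer_index(dist, thickness):
    """Assign each vertex to the layer between consecutive iso-values."""
    return [int(d // thickness) for d in dist]

# Hypothetical 4-vertex chain that rises and then bends sideways.
verts = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (1, 0, 2)]
field = geodesic_field(verts, [(0, 1), (1, 2), (2, 3)], base_ids=[0])
layers = layer_index(field, thickness=1.5)
```

Note that the last vertex lands in a later layer than the one below it even though both share the same height, which is the property that lets the layers curve with the part.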

preprint 2020 arXiv

Multi-Axis Support-Free Printing of Freeform Parts with Lattice Infill Structures

In additive manufacturing, infill structures are commonly used to reduce the weight and cost of a solid part. Currently, most infill structure generation methods are based on the conventional 2.5-axis printing configuration, which, although able to satisfy the self-supporting condition on the infills, suffers from the well-known staircase effect on the finished surface and the need for extensive support for overhang features. In this paper, based on the emerging continuous multi-axis printing configuration, we present a new lattice infill structure generation algorithm, which is able to achieve both the self-supporting condition for the infills and the support-free requirement at the boundary surface of the part. The algorithm critically relies on the use of three mutually orthogonal geodesic distance fields that are embedded in the tetrahedral mesh of the solid model. The intersection between the iso-geodesic distance surfaces of these three geodesic distance fields naturally forms the desired lattice of infill structure, while the density of the infills can be conveniently controlled by adjusting the iso-values. The lattice infill pattern in each curved slicing layer is trimmed to conform to an Eulerian graph so as to generate a continuous printing path, which can effectively reduce the nozzle retractions during the printing process. In addition, to cater to the collision-free requirement and to improve the printing efficiency, we also propose a printing sequence optimization algorithm for determining a collision-free order of printing of the connected lattice infills, which seeks to reduce the air-move length of the nozzle. Ample experiments in both computer simulation and physical printing are performed, and the results give a preliminary confirmation of the advantages of our methodology.
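The reason an Eulerian graph yields a continuous printing path is that it admits a closed walk traversing every edge exactly once. The sketch below finds such a walk with Hierholzer's algorithm on an undirected graph. It illustrates the graph-theoretic fact only; the trimming step that makes a lattice Eulerian is not specified in the abstract, and the function name and example graphs are hypothetical.

```python
from collections import defaultdict

def euler_circuit(edges):
    """Return a closed walk using every edge once, or None if none exists."""
    adjacency = defaultdict(list)
    for idx, (a, b) in enumerate(edges):
        adjacency[a].append((b, idx))
        adjacency[b].append((a, idx))
    # An Eulerian circuit requires every vertex to have even degree.
    if not edges or any(len(nbrs) % 2 for nbrs in adjacency.values()):
        return None
    used = [False] * len(edges)
    stack, path = [edges[0][0]], []
    while stack:
        u = stack[-1]
        while adjacency[u] and used[adjacency[u][-1][1]]:
            adjacency[u].pop()  # discard edges already traversed
        if adjacency[u]:
            v, idx = adjacency[u].pop()
            used[idx] = True
            stack.append(v)
        else:
            path.append(stack.pop())
    # A disconnected graph leaves some edges unvisited.
    return path if len(path) == len(edges) + 1 else None

square = [(0, 1), (1, 2), (2, 3), (3, 0)]
tour = euler_circuit(square)             # one continuous closed path
branch = euler_circuit([(0, 1), (1, 2), (1, 3)])  # odd degrees: no circuit
```

Each vertex in `tour` corresponds to a point the nozzle passes without retracting; a graph with odd-degree vertices, like `branch`, would force at least one lift or a repeated edge.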

preprint 2016 arXiv

Superconducting nanowire single photon detector at 532 nm and demonstration in satellite laser ranging

Superconducting nanowire single-photon detectors (SNSPDs) operating at a wavelength of 532 nm were designed and fabricated for satellite laser ranging (SLR) applications. The NbN SNSPDs were fabricated on one-dimensional photonic crystals with a sensitive-area diameter of 42 μm. The devices were coupled with multimode fiber (ϕ = 50 μm) and exhibited a maximum system detection efficiency of 75% at an extremely low dark count rate of <0.1 Hz. An SLR experiment using an SNSPD at a wavelength of 532 nm was successfully demonstrated. The results showed depth ranging with a precision of ~8.0 mm for the target satellite LARES, which is ~3,000 km away from the ground ranging station at the Sheshan Observatory.
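To relate the quoted range figures to timing, the short calculation below converts a one-way distance into a two-way light travel time. It assumes simple round-trip ranging at the vacuum speed of light and ignores atmospheric delay; the function name is illustrative and the inputs are the approximate values from the abstract.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by SI definition

def two_way_time(distance_m):
    """Round-trip light travel time for a one-way distance in metres."""
    return 2.0 * distance_m / SPEED_OF_LIGHT

timing_for_precision = two_way_time(8.0e-3)  # ~8.0 mm range precision
round_trip_to_target = two_way_time(3.0e6)   # ~3,000 km to the satellite

print(f"{timing_for_precision * 1e12:.1f} ps, {round_trip_to_target * 1e3:.2f} ms")
```

Under these assumptions, an 8 mm range precision corresponds to a timing precision of roughly 53 ps on a round trip of about 20 ms.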

preprint 2016 arXiv

Transverse beam size measurement system using visible synchrotron radiation at HLS II

An interferometer system and an imaging system using visible synchrotron radiation (SR) have been installed in the HLS II storage ring. Simulations of these two systems are given using the Synchrotron Radiation Workshop (SRW) code. With these two systems, the beam energy spread and the beam emittance can be measured. A detailed description of the two systems and the measurement method is given in this paper. The measurement results for beam size, emittance and energy spread are given at the end.
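For orientation, a double-slit SR interferometer is commonly analysed with the textbook relation between fringe visibility and the rms size of a Gaussian source. The sketch below implements that relation and its inverse. It is an assumption that this form applies here, since the abstract does not state the formula used, and all numerical values are hypothetical.

```python
import math

def visibility(sigma, wavelength, slit_separation, distance):
    """Fringe visibility for a Gaussian source of rms size sigma (metres)."""
    arg = math.pi * slit_separation * sigma / (wavelength * distance)
    return math.exp(-2.0 * arg ** 2)

def beam_size_from_visibility(v, wavelength, slit_separation, distance):
    """Invert the Gaussian-source relation to recover the rms beam size."""
    scale = wavelength * distance / (math.pi * slit_separation)
    return scale * math.sqrt(0.5 * math.log(1.0 / v))

# Hypothetical setup: 500 nm light, 10 mm slit separation, 10 m to the source.
v = visibility(100e-6, 500e-9, 10e-3, 10.0)
sigma_back = beam_size_from_visibility(v, 500e-9, 10e-3, 10.0)
```

A smaller beam gives higher visibility, so the method is most sensitive when the slit separation is chosen to keep the visibility well away from both 0 and 1.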

preprint 2015 arXiv

Beam size and position measurement based on logarithm processing algorithm in HLS II

A logarithm processing algorithm to measure the transverse beam size and position is proposed, and preliminary experimental results at Hefei Light Source II (HLS II) are given. The algorithm uses only 4 successive channels of the 16 anode channels of the multianode photomultiplier tube (MAPMT) R5900U-00-L16, which has a typical rise time of 0.6 ns and an effective area of 0.8 × 16 mm for a single anode channel. In the paper, we first present the simulation results of the algorithm with and without channel inconsistency. Then we calibrate the channel inconsistency and verify the algorithm using the general current signal processor Libera Photon in the low-speed scheme. Finally, we obtain the turn-by-turn beam size and position and calculate the vertical tune in the high-speed scheme. The experimental results show that the measured values fit well with the simulation results after the channel differences are calibrated, and that the fractional part of the vertical tune is 0.3628, which is very close to the nominal value of 0.3621.
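Why logarithms of four neighbouring channel signals suffice can be seen from a Gaussian profile, whose logarithm is quadratic in position. The sketch below is a simplified derivation, not necessarily the paper's exact formulas: it assumes point-like, equally spaced channels with identical gain, and measures position from the midpoint between the two inner channels. The `pitch` value and signal amplitudes are hypothetical.

```python
import math

def gaussian_profile(x, amplitude, center, sigma):
    """Ideal Gaussian beam profile sampled at position x."""
    return amplitude * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def log_size_and_position(i1, i2, i3, i4, pitch):
    """Recover (sigma, center) from four successive channel signals.

    With channels at -1.5, -0.5, 0.5 and 1.5 pitches:
      ln(i2*i3 / (i1*i4)) = 2 * pitch**2 / sigma**2
      ln(i3 / i2)         = pitch * center / sigma**2
    """
    sigma_sq = 2.0 * pitch ** 2 / math.log((i2 * i3) / (i1 * i4))
    center = sigma_sq * math.log(i3 / i2) / pitch
    return math.sqrt(sigma_sq), center

pitch = 1.0  # hypothetical channel spacing, arbitrary units
xs = [-1.5 * pitch, -0.5 * pitch, 0.5 * pitch, 1.5 * pitch]
signals = [gaussian_profile(x, 3.0, 0.2, 0.8) for x in xs]
sigma_est, center_est = log_size_and_position(*signals, pitch)
```

The amplitude cancels in both ratios, so the estimate does not depend on beam intensity; unequal channel gains do not cancel, which is consistent with the abstract's emphasis on calibrating channel inconsistency.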

preprint 2015 arXiv

Central frequency measurement of the HLS-II storage ring

Central frequency is a key parameter of storage rings. This paper presents the measurement of the central frequency of the HLS-II storage ring using the sextupole modulation method. First, the basis of central frequency measurement for an electron storage ring is briefly introduced. Then, the error sources and the optimized measurement method for the HLS-II storage ring are discussed. The workflow of the self-compiled MATLAB script used in the central frequency measurement is also described. Finally, the results obtained by using two methods to cross-check each other are shown. The measured value of the central frequency demonstrates that the real circumference of the HLS-II storage ring agrees well with the designed value.
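The link between an RF frequency and the ring circumference follows from f_rf = h * beta * c / C, where h is the harmonic number. The sketch below applies that standard relation. The function name and the example values (204 MHz, h = 45) are illustrative inputs that do not come from the abstract, and beta is set to 1 on the assumption of an ultra-relativistic electron beam.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by SI definition

def circumference_from_rf(f_rf_hz, harmonic_number, beta=1.0):
    """Orbit length implied by an RF frequency: C = h * beta * c / f_rf."""
    return harmonic_number * beta * SPEED_OF_LIGHT / f_rf_hz

# Illustrative inputs only; not taken from the abstract.
c_example = circumference_from_rf(204.0e6, 45)

# A small frequency shift maps to the opposite relative change in length.
c_shifted = circumference_from_rf(204.0e6 * (1 + 1e-6), 45)
relative_change = (c_shifted - c_example) / c_example
```

Because circumference and frequency are inversely proportional, a measured central frequency that matches the design value to a few parts per million implies the same level of agreement for the circumference.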
