Source author record

Xin Lv

Xin Lv appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Artificial Intelligence, Computation and Language, eess.SP, Machine Learning, physics.atom-ph, quant-ph

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators


Preprint, 2026, arXiv

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Reinforcement learning (RL) has emerged as a critical technique for enhancing LLM-based deep search agents. However, existing approaches primarily rely on binary outcome rewards, which fail to capture the comprehensiveness and factuality of agents' reasoning process, and often lead to undesirable behaviors such as shortcut exploitation and hallucinations. To address these limitations, we propose Citation-aware Rubric Rewards (CaRR), a fine-grained reward framework for deep search agents that emphasizes reasoning comprehensiveness, factual grounding, and evidence connectivity. CaRR decomposes complex questions into verifiable single-hop rubrics and requires agents to satisfy these rubrics by explicitly identifying hidden entities, supporting them with correct citations, and constructing complete evidence chains that link to the predicted answer. We further introduce Citation-aware Group Relative Policy Optimization (C-GRPO), which combines CaRR and outcome rewards for training robust deep search agents. Experiments show that C-GRPO consistently outperforms standard outcome-based RL baselines across multiple deep search benchmarks. Our analysis also validates that C-GRPO effectively discourages shortcut exploitation, promotes comprehensive, evidence-grounded reasoning, and exhibits strong generalization to open-ended deep research tasks. Our code and data are available at https://github.com/THUDM/CaRR.
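The abstract describes a reward that checks single-hop rubrics and is then combined with an outcome reward inside a group-relative policy optimization scheme. The following is a minimal sketch of that idea, not the authors' implementation: the `RubricCheck` fields, the weighted-sum combination, and the `alpha` weight are all assumptions made for illustration, and only the group-wise mean and standard deviation normalization follows the standard GRPO recipe.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class RubricCheck:
    """Hypothetical result of judging one single-hop rubric for one rollout."""
    entity_identified: bool   # the hidden entity was named explicitly
    citation_supports: bool   # the cited source supports the claim
    linked_to_answer: bool    # the rubric sits on a chain reaching the answer

    def satisfied(self) -> bool:
        # Assumption: a rubric counts only if all three conditions hold.
        return (self.entity_identified
                and self.citation_supports
                and self.linked_to_answer)


def rubric_reward(checks: list[RubricCheck]) -> float:
    """Fraction of rubrics satisfied (0.0 when there are no rubrics)."""
    if not checks:
        return 0.0
    return sum(c.satisfied() for c in checks) / len(checks)


def combined_reward(outcome_correct: bool, checks: list[RubricCheck],
                    alpha: float = 0.5) -> float:
    """Assumed combination: binary outcome reward plus a weighted rubric term."""
    return float(outcome_correct) + alpha * rubric_reward(checks)


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one group of rollouts for the same question."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All rollouts scored the same, so none is preferred over another.
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]
```

Under this sketch, two rollouts that both reach the correct answer still receive different advantages when one of them backs more rubrics with supporting citations, which is the kind of signal a purely binary outcome reward cannot provide.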

Preprint, 2024, arXiv

Resolved Raman sideband cooling of a single optically trapped cesium atom

We developed a resolved Raman sideband cooling scheme that can efficiently prepare a single optically trapped cesium (Cs) atom in its motional ground states. A two-photon Raman process between two outermost Zeeman sublevels in a single hyperfine state is applied to reduce the phonon number. Our scheme is less sensitive to the variation in the magnetic field than the commonly used scheme where the two outermost Zeeman sublevels belonging to the two separate ground hyperfine states are taken. Fast optical pumping with less spontaneous emission guarantees the efficiency of the cooling process. After cooling for 50 ms, 82% of the Cs atoms populate their three-dimensional ground states. Our scheme improves the long-term stability of Raman sideband cooling in the presence of magnetic field drift and is thus suitable for cooling other trapped atoms or ions with abundant magnetic sublevels.

Preprint, 2022, arXiv

A Roadmap for Big Model

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and in their application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. For each topic, we clearly summarize the current studies and propose some future research directions. At the end of this paper, we discuss the further development of BMs from a more general view.

Preprint, 2022, arXiv

Network Coexistence Analysis of RIS-Assisted Wireless Communications

Reconfigurable intelligent surfaces (RISs) have attracted the attention of academia and industry circles because of their ability to control the electromagnetic characteristics of channel environments. However, it has been found that the introduction of an RIS may bring new and more serious network coexistence problems. It may even further deteriorate the network performance if these new network coexistence problems cannot be effectively solved. In this paper, an RIS network coexistence model is proposed and discussed in detail, and these problems are deeply analysed. Two novel RIS design mechanisms, including a novel multilayer RIS structure with an out-of-band filter and an RIS blocking mechanism, are further explored. Finally, numerical results and a discussion are given.

Preprint, 2022, arXiv

Program Transfer for Answering Complex Questions over Knowledge Bases

Program induction for answering complex questions over knowledge bases (KBs) aims to decompose a question into a multi-step program, whose execution against the KB produces the final answer. Learning to induce programs relies on a large number of parallel question-program pairs for the given KB. However, for most KBs, gold program annotations are usually lacking, making learning difficult. In this paper, we propose the approach of program transfer, which aims to leverage the valuable program annotations on rich-resourced KBs as external supervision signals to aid program induction for low-resourced KBs that lack program annotations. For program transfer, we design a novel two-stage parsing framework with an efficient ontology-guided pruning strategy. First, a sketch parser translates the question into a high-level program sketch, which is a composition of functions. Second, given the question and sketch, an argument parser searches the KB for the detailed arguments of each function. During the search, we incorporate the KB ontology to prune the search space. The experiments on ComplexWebQuestions and WebQuestionSP show that our method outperforms SOTA methods significantly, demonstrating the effectiveness of program transfer and our framework. Our code and datasets can be obtained from https://github.com/THU-KEG/ProgramTransfer.
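The two-stage framework in this abstract (a sketch parser followed by an argument parser whose search is pruned by the KB ontology) can be illustrated with a toy example. This is a sketch under stated assumptions, not the released code: the ontology, the entity types, the function names `Find` and `Relate`, the hand-written sketch, and the word-overlap scorer are all hypothetical stand-ins for learned components.

```python
# Toy ontology (hypothetical): relation -> (domain type, range type).
ONTOLOGY = {
    "capital_of": ("City", "Country"),
    "population": ("Country", "Number"),
    "born_in": ("Person", "City"),
}
# Hypothetical entity typing used to start the search.
ENTITY_TYPES = {"Paris": "City", "France": "Country", "Marie Curie": "Person"}


def prune_relations(current_type: str) -> list[str]:
    """Ontology-guided pruning: keep relations whose domain fits the type."""
    return [rel for rel, (domain, _) in ONTOLOGY.items()
            if domain == current_type]


def overlap_score(question: str, relation: str) -> int:
    """Stand-in scorer: count relation tokens that appear in the question."""
    words = set(question.lower().replace("?", "").split())
    return sum(tok in words for tok in relation.split("_"))


def argument_parser(question: str, sketch: list[str],
                    topic_entity: str) -> list[tuple[str, str]]:
    """Stage 2: fill each function in the sketch with an argument."""
    program = []
    current_type = None
    for fn in sketch:
        if fn == "Find":
            program.append((fn, topic_entity))
            current_type = ENTITY_TYPES[topic_entity]
        elif fn == "Relate":
            candidates = prune_relations(current_type)
            if not candidates:
                raise ValueError(f"no relation applies to type {current_type}")
            best = max(candidates, key=lambda r: overlap_score(question, r))
            program.append((fn, best))
            current_type = ONTOLOGY[best][1]  # move to the relation's range
        else:
            raise ValueError(f"unknown function in sketch: {fn}")
    return program


question = "What is the population of the country that Paris is the capital of?"
# Stage 1 is stubbed: a learned sketch parser would predict this sequence.
sketch = ["Find", "Relate", "Relate"]
program = argument_parser(question, sketch, "Paris")
```

Because the sketch contains no KB-specific arguments, it is the part that could plausibly transfer between KBs, while pruning restricts each `Relate` step to relations whose domain matches the type reached so far.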
