Researcher profile

Yuxuan Zhou

Yuxuan Zhou contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
9 works
0 followers
10 topics
4 close collaborators

Published work

9 published items

preprint | 2026 | arXiv

Beyond the All-in-One Agent: Benchmarking Role-Specialized Multi-Agent Collaboration in Enterprise Workflows

Large language model (LLM) agents are increasingly expected to operate in enterprise environments, where work is distributed across specialized roles, permission-controlled systems, and cross-departmental procedures. However, existing enterprise benchmarks largely evaluate single agents with broad tool access, while existing multi-agent benchmarks rarely capture realistic enterprise constraints such as role specialization, access control, stateful business systems, and policy-based approvals. We introduce EntCollabBench, a benchmark for evaluating enterprise multi-agent collaboration. EntCollabBench simulates a permission-isolated organization with 11 role-specialized agents across six departments and contains two evaluation subsets: a Workflow subset, where agents collaboratively modify enterprise system states, and an Approval subset, where agents make policy-grounded decisions. Evaluation is based on execution traces, database state verification, and deterministic policy adjudication rather than natural-language response judging. Experiments with representative LLM agents show that current models still struggle with end-to-end enterprise collaboration, especially in delegation, context transfer, parameter grounding, workflow closure, and decision commitment. EntCollabBench provides a reproducible testbed for measuring and improving agent systems intended for realistic organizational environments.

preprint | 2026 | arXiv

DemoTuner: Automatic Performance Tuning for Database Management Systems Based on Demonstration Reinforcement Learning

The performance of modern DBMSs such as MySQL and PostgreSQL heavily depends on the configuration of performance-critical knobs. Manually tuning these knobs is laborious and inefficient due to the complex and high-dimensional nature of the configuration space. Among automated tuning methods, reinforcement learning (RL)-based methods have recently sought to improve the DBMS knob tuning process from several perspectives. However, they still suffer from slow convergence during offline training. In this paper, we focus on how to leverage the valuable tuning hints contained in textual documents such as DBMS manuals and web forums to improve the offline training of RL-based methods. To this end, we propose DemoTuner, an efficient DBMS knob tuning framework built on a novel LLM-assisted demonstration reinforcement learning method. Specifically, to comprehensively and accurately mine tuning hints from documents, we design a structured chain-of-thought prompt that employs LLMs to conduct a condition-aware tuning hint extraction task. To effectively integrate the mined tuning hints into RL agent training, we propose a hint-aware demonstration reinforcement learning algorithm, HA-DDPGfD, in DemoTuner. As far as we know, DemoTuner is the first work to introduce demonstration reinforcement learning for DBMS knob tuning. Experimental evaluations conducted on MySQL and PostgreSQL across various workloads demonstrate that DemoTuner achieves performance gains of up to 44.01% for MySQL and 39.95% for PostgreSQL over default configurations. Compared with three representative baseline methods, DemoTuner further reduces the execution time by up to 10.03%, while always incurring the lowest online tuning cost. DemoTuner also exhibits superior adaptability to application scenarios with unknown workloads.

preprint | 2026 | arXiv

Omni-DeepSearch: A Benchmark for Audio-Driven Omni-Modal Deep Search

Current omni-modal benchmarks mainly evaluate models under settings where multiple modalities are provided simultaneously, while the ability to start from audio alone and actively search for cross-modal evidence remains underexplored. In this paper, we introduce Omni-DeepSearch, a benchmark for audio-driven omni-modal deep search. Given one or more audio clips and a related question, models must infer useful clues from audio, invoke text, image, and video search tools, and perform multi-hop reasoning to produce a short, objective, and verifiable answer. Omni-DeepSearch contains 640 samples across 15 fine-grained categories, covering four retrieval target modalities and four audio content types. A multi-stage filtering pipeline ensures audio dependence, retrieval necessity, visual modality necessity, and answer uniqueness. Experiments on recent closed-source and open-source omni-modal models show that this task remains highly challenging: the strongest evaluated model, Gemini-3-Pro, achieves only 43.44% average accuracy. Further analyses illustrate key bottlenecks in audio entity inference, query formulation, tool-use reliability, multi-hop retrieval, and cross-modal verification. These results highlight audio-driven omni-modal deep search as an important and underexplored direction for future multimodal agents.

preprint | 2023 | arXiv

Multi-Level Variational Spectroscopy using a Programmable Quantum Simulator

Energy spectroscopy is a powerful tool with diverse applications across various disciplines. The advent of programmable digital quantum simulators opens new possibilities for conducting spectroscopy on various models using a single device. Variational quantum-classical algorithms have emerged as a promising approach for achieving such tasks on near-term quantum simulators, despite facing significant quantum and classical resource overheads. Here, we experimentally demonstrate multi-level variational spectroscopy for fundamental many-body Hamiltonians using a superconducting programmable digital quantum simulator. By exploiting symmetries, we effectively reduce circuit depth and optimization parameters allowing us to go beyond the ground state. Combined with the subspace search method, we achieve full spectroscopy for a 4-qubit Heisenberg spin chain, yielding an average deviation of 0.13 between experimental and theoretical energies, assuming unity coupling strength. Our method, when extended to 8-qubit Heisenberg and transverse-field Ising Hamiltonians, successfully determines the three lowest energy levels. In achieving the above, we introduce a circuit-agnostic waveform compilation method that enhances the robustness of our simulator against signal crosstalk. Our study highlights symmetry-assisted resource efficiency in variational quantum algorithms and lays the foundation for practical spectroscopy on near-term quantum simulators, with potential applications in quantum chemistry and condensed matter physics.

preprint | 2022 | arXiv

Magnetic frustration in the cubic double perovskite Ba2NiIrO6

Hybrid transition metal oxides continue to attract attention due to their multiple degrees of freedom (e.g., lattice, charge, spin, and orbital) and versatile properties. Here we investigate the magnetic and electronic properties of the newly synthesized double perovskite Ba$_2$NiIrO$_6$, using crystal field theory, superexchange model analysis, density functional calculations, and parallel tempering Monte Carlo (PTMC) simulations. Our results indicate that Ba$_2$NiIrO$_6$ has the Ni$^{2+}$ ($t_{2g}^{6}e_{g}^{2}$)-Ir$^{6+}$ ($t_{2g}^{3}$) charge states. The first nearest-neighboring (1NN) Ni$^{2+}$-Ir$^{6+}$ ions prefer a ferromagnetic (FM) coupling as expected from the Goodenough-Kanamori-Anderson rules, which contradicts the experimental antiferromagnetic (AF) order in Ba$_2$NiIrO$_6$. We find that the strong 2NN AF couplings are frustrated in the fcc sublattices, and they play a major role in determining the observed AF ground state. We also prove that the $J_{\rm eff}$ = 3/2 and $J_{\rm eff}$ = 1/2 states induced by spin-orbit coupling, which would be manifested in low-dimensional (e.g., layered) iridates, do not occur in cubic Ba$_2$NiIrO$_6$. Our PTMC simulations show that when the long-range (2NN and 3NN) AF interactions are included, an AF transition with $T_{\rm N}$ = 66 K is obtained, which is comparable with the experimental 51 K. We also propose a possible 2$\times$2$\times$2 noncollinear AF structure for Ba$_2$NiIrO$_6$.

preprint | 2022 | arXiv

Optimal charging of a superconducting quantum battery

Quantum batteries are miniature energy storage devices and play a very important role in quantum thermodynamics. In recent years, quantum batteries have been extensively studied, but mostly at the theoretical level. Here we report the experimental realization of a quantum battery based on superconducting qubits. Our model explores dark and bright states to achieve stable and powerful charging processes, respectively. Our scheme makes use of the quantum adiabatic brachistochrone, which allows us to speed up the battery ergotropy injection. Due to the inherent interaction of the system with its surroundings, the battery exhibits a self-discharge, which is shown to be described by a supercapacitor-like self-discharging mechanism. Our results pave the way for proposals of new superconducting circuits able to store extractable work for further usage.

preprint | 2022 | arXiv

SP-ViT: Learning 2D Spatial Priors for Vision Transformers

Recently, transformers have shown great potential in image classification and established state-of-the-art results on the ImageNet benchmark. However, compared to CNNs, transformers converge slowly and are prone to overfitting in low-data regimes due to the lack of spatial inductive biases. Such spatial inductive biases can be especially beneficial since the 2D structure of an input image is not well preserved in transformers. In this work, we present Spatial Prior-enhanced Self-Attention (SP-SA), a novel variant of vanilla Self-Attention (SA) tailored for vision transformers. Spatial Priors (SPs) are our proposed family of inductive biases that highlight certain groups of spatial relations. Unlike convolutional inductive biases, which are forced to focus exclusively on hard-coded local regions, our proposed SPs are learned by the model itself and take a variety of spatial relations into account. Specifically, the attention score is calculated with emphasis on certain kinds of spatial relations at each head, and such learned spatial foci can be complementary to each other. Based on SP-SA we propose the SP-ViT family, which consistently outperforms other ViT models with similar GFlops or parameters. Our largest model, SP-ViT-L, achieves a record-breaking 86.3% Top-1 accuracy with a reduction in the number of parameters by almost 50% compared to the previous state-of-the-art model (150M for SP-ViT-L vs 271M for CaiT-M-36) among all ImageNet-1K models trained on 224x224 and fine-tuned on 384x384 resolution without extra data.
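
The parameter comparison quoted in the abstract (150M vs 271M) can be checked with a short calculation. The sketch below is only illustrative arithmetic; the helper name `relative_reduction` is hypothetical and not taken from the paper.

```python
def relative_reduction(new: float, baseline: float) -> float:
    """Fraction by which `new` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 1.0 - new / baseline


# Parameter counts as quoted in the abstract above.
sp_vit_l_params = 150e6
cait_m36_params = 271e6

reduction = relative_reduction(sp_vit_l_params, cait_m36_params)
print(f"{reduction:.1%}")  # prints "44.6%"
```

Under these quoted figures the reduction is about 44.6%, which the abstract rounds to "almost 50%".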

preprint | 2022 | arXiv

Table-based Fact Verification with Self-adaptive Mixture of Experts

The table-based fact verification task has recently gained widespread attention, yet it remains a very challenging problem. It inherently requires informative reasoning over natural language together with different numerical and logical reasoning on tables (e.g., count, superlative, comparative). Considering that, we exploit mixture-of-experts and present in this paper a new method: Self-adaptive Mixture-of-Experts Network (SaMoE). Specifically, we have developed a mixture-of-experts neural network to recognize and execute different types of reasoning -- the network is composed of multiple experts, each handling a specific part of the semantics for reasoning, whereas a management module is applied to decide the contribution of each expert network to the verification result. A self-adaptive method is developed to teach the management module to combine the results of different experts more efficiently without external knowledge. The experimental results illustrate that our framework achieves 85.1% accuracy on the benchmark dataset TabFact, comparable with the previous state-of-the-art models. We hope our framework can serve as a new baseline for table-based verification. Our code is available at https://github.com/THUMLP/SaMoE.
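
The general idea of a management module weighting expert outputs can be sketched as follows. This is a generic mixture-of-experts combination under stated assumptions, not the SaMoE implementation: the function names, the softmax weighting, and the toy numbers are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(expert_probs, manager_scores):
    """Weighted average of per-expert probabilities.

    expert_probs:   each expert's probability that the statement is
                    supported by the table (hypothetical inputs).
    manager_scores: raw scores from a manager module, one per expert
                    (hypothetical; assumed here to be softmax-normalized).
    """
    if len(expert_probs) != len(manager_scores):
        raise ValueError("need one manager score per expert")
    weights = softmax(manager_scores)
    return sum(w * p for w, p in zip(weights, expert_probs))


# Toy example: three experts, the manager favors the second one.
probs = [0.9, 0.2, 0.6]
scores = [0.0, 2.0, 0.0]
print(round(combine_experts(probs, scores), 3))
```

With equal manager scores the result reduces to a plain average of the expert outputs; larger scores shift the verdict toward the favored expert.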

preprint | 2020 | arXiv

Quantum Control via Stimulated Raman User-defined Passage

Stimulated Raman adiabatic passage (STIRAP) is a widely used technique of coherent state-to-state manipulation for many applications in physics, chemistry, and beyond. The adiabatic evolution of the state involved in STIRAP, called adiabatic passage, guarantees its robustness against control errors, but also leads to problems of low efficiency and decoherence. Here we propose and experimentally demonstrate an alternative approach, termed stimulated Raman "user-defined" passage (STIRUP), where a parameterized state is employed for constructing desired evolutions to replace the adiabatic passage in STIRAP. The user-defined passages can be flexibly designed for optimizing different objectives for different tasks, e.g., minimizing leakage error. To experimentally benchmark its performance, we apply STIRUP to the task of coherent state transfer in a superconducting Xmon qutrit. We found that STIRUP completed the transfer more than four times faster than STIRAP with enhanced robustness, and achieved a fidelity of 99.5%, which is the highest among all recent experiments based on STIRAP and its variants. In practice, STIRUP differs from STIRAP only in the design of driving pulses; therefore, most existing applications of STIRAP can be readily implemented with STIRUP.