Researcher profile

Qi Gu

Qi Gu contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
6 works
0 followers
9 topics
4 close collaborators

Published work

6 published items

Preprint, 2026, arXiv

DORA: A Scalable Asynchronous Reinforcement Learning System for Language Model Training

Reinforcement learning (RL) has become a critical paradigm for LLM post-training, yet the rollout phase -- accounting for 50--80% of total step time -- is bottlenecked by skewed generation: long-tailed trajectories indispensable for model performance block the entire training pipeline. Asynchronous training offers a natural remedy by overlapping generation with training, but introduces a fundamental tension between efficiency and algorithmic correctness. We identify three constraints that asynchronous training must satisfy to preserve convergence: intra-trajectory policy consistency, data integrity, and bounded staleness. Existing approaches either fail to intrinsically address the long-tailed trajectory problem, which is further exacerbated by the load imbalance characteristic of Mixture-of-Experts models, or deviate from the standard RL training formulation, thereby hindering model convergence. Therefore, we propose DORA (Dynamic ORchestration for Asynchronous Rollout), which addresses this challenge through algorithm-system co-design. DORA introduces multi-version streaming rollout, a novel asynchronous paradigm that maintains multiple policy versions concurrently, achieving full bubble elimination without compromising algorithmic constraints. Experimental results demonstrate that our DORA system achieves substantial improvements in throughput -- up to 2--3 times higher than state-of-the-art systems on open-source benchmarks -- without compromising convergence. Furthermore, in large-scale industrial applications with tens of thousands of accelerators, DORA accelerates RL training by 2--4 times compared to synchronous training across various scenarios. The resultant open-source models, LongCat-Flash-Thinking, exhibit competitive performance on complex reasoning benchmarks, matching the capability of most advanced LLMs.
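
As an illustration only, the bounded-staleness constraint named in the abstract can be read as a filter on which rollouts a trainer may consume. The sketch below is not DORA's implementation: the `Trajectory` class, the `policy_version` field, and the `max_staleness` parameter are hypothetical names, and it assumes each trajectory was generated end to end by a single policy version (one reading of intra-trajectory policy consistency).

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Hypothetical rollout record; not taken from the DORA paper."""
    tokens: list = field(default_factory=list)
    policy_version: int = 0  # single policy version assumed to have produced the whole trajectory


def admissible(traj: Trajectory, trainer_version: int, max_staleness: int) -> bool:
    """Return True if the trajectory lags the trainer by at most max_staleness versions."""
    lag = trainer_version - traj.policy_version
    return 0 <= lag <= max_staleness


def select_batch(buffer: list, trainer_version: int, max_staleness: int) -> list:
    """Keep only trajectories whose staleness is within the assumed bound."""
    return [t for t in buffer if admissible(t, trainer_version, max_staleness)]
```

With a trainer at version 7 and `max_staleness=2`, trajectories generated by versions 5, 6, or 7 would be kept and older ones dropped.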

Preprint, 2022, arXiv

System-level Simulation of Reconfigurable Intelligent Surface assisted Wireless Communications System

Reconfigurable intelligent surface (RIS) is an emerging technique employing a metasurface to reflect the signal from the source node to the destination node. By smartly reconfiguring the electromagnetic (EM) properties of the metasurface and adjusting the EM parameters of the reflected radio waves, RIS can turn the uncontrollable propagation environment into an artificially reconfigurable space, and thus can significantly increase the communications capacity and improve the coverage of the system. In this paper, we investigate the far-field channel in which line-of-sight (LOS) propagation is dominant. We propose an antenna model that can characterize the radiation patterns of realistic RIS elements, and consider the signal power received from the two-hop path through the RIS. System-level simulations of network performance are carried out under various scenarios and parameters.

Preprint, 2021, arXiv

Performance Comparison between Reconfigurable Intelligent Surface and Relays: Theoretical Methods and a Perspective from Operator

Reconfigurable intelligent surface (RIS) is an emerging technique employing a metasurface to reflect the signal from the source node to the destination node without consuming any energy. Both the spectral efficiency and the energy efficiency can be improved through RIS. Essentially, RIS can be considered a passive relay between the source and destination nodes. By contrast, a relay node in a traditional relay network has to be active, meaning that it consumes energy when relaying the signal or information between the source and destination nodes. In this paper, we compare the performance of RIS and active relays for a general multiple-input multiple-output (MIMO) system. To make the comparison fair and comprehensive, the performance of both RIS and active relays is optimized on a best-effort basis. For the RIS, the transmit beamforming and the reflecting coefficients at the RIS are jointly optimized to maximize the end-to-end throughput. Although the optimization problem is non-convex, it is transformed into an equivalent weighted mean-square error (MSE) minimization problem, and an alternating optimization algorithm is proposed that ensures convergence to a stationary point. For the active relay, both half-duplex relay (HDR) and full-duplex relay (FDR) are considered, and the end-to-end throughput is maximized via an alternating optimization method. Numerical results are presented to demonstrate the effectiveness of the proposed algorithm. Finally, RIS and relays are compared from the perspectives of system model, performance, deployment, and control method.

Preprint, 2021, arXiv

Phonon Stability and Sound Velocity of Quantum Droplets in a Boson Mixture

Quantum droplets have been realized in experiments on binary boson mixtures and dipolar Bose gases. In these systems, the mean-field energy of the Bose-Einstein condensate is attractive, and the repulsive Lee-Huang-Yang energy is crucial for stability. The Bogoliubov theory incorrectly predicts that the phonon mode is dynamically unstable in the long-wavelength limit. In this work, we go beyond the Bogoliubov theory to study how the phonon mode is stabilized in the quantum droplet of a binary boson mixture. Similar to Beliaev's approach to a single-component Bose gas, we compute higher-order contributions to the self-energy of the boson propagator. We find that the interaction between spin and phonon excitations is the key to phonon stability. We obtain the sound velocity, which can be tested by measuring the superfluid critical velocity of the droplet in experiments. Beliaev damping in this quantum droplet is also discussed.

Preprint, 2020, arXiv

Deep learning for thermal plasma simulation: solving 1-D arc model as an example

Numerical modelling is an essential approach to understanding the behavior of thermal plasmas in various industrial applications. We propose a deep learning method for solving the partial differential equations in thermal plasma models. In this method, a deep feed-forward neural network is constructed as a surrogate for the solution of the model. A loss function is designed to measure the discrepancy between the neural network and the equations describing thermal plasmas. A good neural network is obtained by minimizing this loss function. We demonstrate the power of this deep learning method by solving a 1-D arc decaying model which consists of three cases: stationary arc, transient arc without considering radial velocity, and transient arc with radial velocity. The results show that deep neural networks have an excellent ability to express the differential equations describing thermal plasmas. This could provide a new and promising numerical tool for thermal plasma modelling.
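
For orientation, a loss of the kind the abstract describes is commonly written as a sum of squared equation residuals at sampled points. The form below is a generic sketch, not the paper's exact loss: the symbols $u_\theta$ (network surrogate with parameters $\theta$), $\mathcal{N}$ (residual of the governing differential equations), $\mathcal{B}$ (residual of the boundary and initial conditions), and the sample counts $N_r$ and $N_b$ are all assumptions introduced here, and any weighting between the two terms is omitted.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N_r} \sum_{i=1}^{N_r} \left| \mathcal{N}[u_\theta](x_i, t_i) \right|^2
  + \frac{1}{N_b} \sum_{j=1}^{N_b} \left| \mathcal{B}[u_\theta](x_j, t_j) \right|^2
```

Minimizing $\mathcal{L}(\theta)$ drives both residuals toward zero at the sampled points, which is the sense in which the network comes to approximate a solution of the model.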

Preprint, 2019, arXiv

Bound-State Band Reconstruction and Resonance in Spin-1/2 Bose Gas with 1D Spin-Orbit Coupling

In this work, we study two-body bound states in a two-component Bose gas with one-dimensional (1D) spin-orbit coupling (SOC) induced by Raman lasers. The finite Raman coupling strength generates coupling among three spin channels, resulting in the reconstruction of three bound-state bands. In addition, multiple resonances can be induced at finite scattering lengths. By tuning the interaction in one intra-species channel, one bound-state band can be lifted and three resonances can be achieved at different center-of-mass momenta, which should be observable under current experimental conditions in ${}^{87}$Rb atoms.