Researcher profile

Qiqi Wang

Qiqi Wang contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
11 works
0 followers
13 topics
4 close collaborators

Published work

11 published items

preprint, 2026, arXiv

DiRL: An Efficient Post-Training Framework for Diffusion Language Models

Diffusion Language Models (dLLMs) have emerged as promising alternatives to Auto-Regressive (AR) models. While recent efforts have validated their pre-training potential and accelerated inference speeds, the post-training landscape for dLLMs remains underdeveloped. Existing methods suffer from computational inefficiency and objective mismatches between training and inference, severely limiting performance on complex reasoning tasks such as mathematics. To address this, we introduce DiRL, an efficient post-training framework that tightly integrates FlexAttention-accelerated blockwise training with LMDeploy-optimized inference. This architecture enables a streamlined online model update loop, facilitating efficient two-stage post-training (Supervised Fine-Tuning followed by Reinforcement Learning). Building on this framework, we propose DiPO, the first unbiased Group Relative Policy Optimization (GRPO) implementation tailored for dLLMs. We validate our approach by training DiRL-8B-Instruct on high-quality math data. Our model achieves state-of-the-art math performance among dLLMs and surpasses comparable models in the Qwen2.5 series on several benchmarks.
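The abstract above builds on Group Relative Policy Optimization. As a rough illustration only, the sketch below shows the group-relative advantage step as it is commonly described for standard GRPO: rewards for several sampled responses to the same prompt are standardized within the group. It is not DiPO and does not reflect the DiRL code; the function name, the `eps` guard, and the example rewards are assumptions made for this sketch.

```python
import math

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration; not taken from DiRL or DiPO.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    # eps keeps the division finite when every reward in the group is equal
    return [(r - mean) / (std + eps) for r in rewards]

# Made-up binary rewards, e.g. 1.0 for a correct math answer and 0.0 otherwise
rewards = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group mean receive positive advantages and the rest receive negative ones, so no separate value model is needed to form the baseline.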

preprint, 2022, arXiv

A data driven heuristic for rapid convergence of Scheduled Relaxation Jacobi schemes

The Scheduled Relaxation Jacobi (SRJ) method is a viable candidate as a high performance linear solver for elliptic partial differential equations (PDEs). The method greatly improves the convergence of the standard Jacobi iteration by applying a sequence of $M$ well-chosen overrelaxation and underrelaxation factors in each cycle of the algorithm to effectively attenuate the solution error. In previous work, optimal SRJ schemes (sets of relaxation factors) have been derived to accelerate convergence for specific discretizations of elliptic PDEs. In this work, we develop a family of SRJ schemes which can be applied to solve elliptic PDEs regardless of the specific discretization employed. To achieve favorable convergence, we train an algorithm to select which scheme in this family to apply at each cycle of the linear solve process, based on convergence data collected from applying these schemes to the one-dimensional Poisson equation. The automatic selection heuristic that is developed based on this limited data is found to provide good convergence for a wide range of problems.
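To make the idea of cycling through $M$ relaxation factors concrete, here is a minimal sketch on the one-dimensional Poisson equation. The factors used are reciprocals of Chebyshev nodes on the Jacobi spectrum, chosen here only because they are easy to write down and guaranteed to converge for this matrix; they are an assumption of this sketch, not the schemes or the data-driven selection heuristic from the paper. All function names are hypothetical.

```python
import math

def residual(u, b, h):
    """Return r = b - A u for the 1D Poisson matrix A = tridiag(-1, 2, -1) / h^2."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * u[i] - left - right) / h**2)
    return r

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def relaxed_jacobi(b, h, factors, cycles):
    """Run `cycles` passes over `factors`, one weighted Jacobi sweep per factor."""
    u = [0.0] * len(b)
    diag = 2.0 / h**2
    for _ in range(cycles):
        for omega in factors:
            r = residual(u, b, h)
            u = [ui + omega * ri / diag for ui, ri in zip(u, r)]
    return u

def chebyshev_factors(n, m):
    """Illustrative schedule (assumption): reciprocals of Chebyshev nodes."""
    d = math.cos(math.pi / (n + 1))  # eigenvalues of D^-1 A lie in [1 - d, 1 + d]
    return [1.0 / (1.0 + d * math.cos(math.pi * (2 * i + 1) / (2 * m)))
            for i in range(m)]

n, m = 31, 5
h = 1.0 / (n + 1)
b = [1.0] * n
schedule = chebyshev_factors(n, m)

# Same total number of sweeps (500) for both runs
u_plain = relaxed_jacobi(b, h, [1.0], cycles=500)
u_sched = relaxed_jacobi(b, h, schedule, cycles=100)
res_plain = norm(residual(u_plain, b, h))
res_sched = norm(residual(u_sched, b, h))
print(schedule, res_plain, res_sched)
```

The schedule mixes underrelaxation (factors below 1) with overrelaxation (factors above 1). Individual overrelaxed sweeps amplify some error modes, but the product over a full cycle damps all of them, which is why the residual is compared only at the end of whole cycles.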

preprint, 2022, arXiv

Approximating linear response of physical chaos

Parametric derivatives of statistics are highly desired quantities in prediction, design optimization and uncertainty quantification. In the presence of chaos, the rigorous computation of these quantities is certainly possible, but mathematically complicated and computationally expensive. Based on Ruelle's formalism, this paper shows that the sophisticated linear response algorithm can be dramatically simplified in higher-dimensional systems featuring a statistical homogeneity in the physical space. We argue that the contribution of the SRB (Sinai-Ruelle-Bowen) measure change, which is an integral part of the full linear response, can be completely neglected if the objective function is appropriately aligned with unstable manifolds. This abstract condition could potentially be satisfied by a vast family of real-world chaotic systems, regardless of the physical meaning and mathematical form of the objective function and perturbed parameter. We demonstrate several numerical examples that support these conclusions and that present the use and performance of a reduced linear response algorithm. In the numerical experiments, we consider physical models described by differential equations, including Lorenz 63, Lorenz 96, and Kuramoto-Sivashinsky.

preprint, 2022, arXiv

Assessment of Detached Eddy Simulation and Sliding Mesh Interface in Predicting Tiltrotor Performance in Helicopter and Airplane Modes

This paper presents a numerical investigation of the performance and flow field of the full-scale XV-15 tiltrotor in both helicopter mode (hovering flight and forward flight) and aeroplane propeller mode using Detached Eddy Simulation, in which the movement of the rotor is achieved using a Sliding Mesh Interface. Our CFD results are compared against experimental data and other CFD results.

preprint, 2022, arXiv

DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning

The any-to-any voice conversion problem aims to convert voices for source and target speakers that are outside the training data. Previous works widely utilize disentangle-based models. A disentangle-based model assumes that speech consists of content and speaker style information and aims to untangle them so that the style information can be changed for conversion. Previous works focus on reducing the dimension of speech to get the content information. However, the appropriate size is hard to determine, which leads to the untangle overlapping problem. We propose the Disentangled Representation Voice Conversion (DRVC) model to address this issue. The DRVC model is an end-to-end self-supervised model consisting of a content encoder, a timbre encoder, and a generator. Instead of reducing speech size to get content, as in previous work, we propose a cycle that restricts the disentanglement through the Cycle Reconstruct Loss and Same Loss. The experiments show an improvement in the quality and voice similarity of the converted speech.

preprint, 2022, arXiv

Efficient computation of linear response of chaotic attractors with one-dimensional unstable manifolds

This paper presents the space-split sensitivity or the S3 algorithm to transform Ruelle's linear response formula into a well-conditioned ergodic-averaging computation. We prove a decomposition of Ruelle's formula that is differentiable on the unstable manifold, which we assume to be one-dimensional. This decomposition of Ruelle's formula ensures that one of the resulting terms, the stable contribution, can be computed using a regularized tangent equation, similar to that in a non-chaotic system. The remaining term, known as the unstable contribution, is regularized and converted into an efficiently computable ergodic average. In this process, we develop new algorithms, which may be useful beyond linear response, to compute i) a fundamental statistical quantity we introduce called the density gradient, and ii) the unstable derivatives of the regularized tangent vector field and the unstable direction. We prove that the S3 algorithm, which combines these computational ingredients that enter the stable and unstable contribution, converges like a Monte Carlo approximation of Ruelle's formula. The algorithm presented here is hence a first step toward full-fledged applications of sensitivity analysis in chaotic systems, wherever such applications have been limited due to lack of availability of long-term sensitivities.

preprint, 2021, arXiv

A trajectory-driven algorithm for differentiating SRB measures on unstable manifolds

SRB measures are limiting stationary distributions describing the statistical behavior of chaotic dynamical systems. Directional derivatives of SRB measure densities conditioned on unstable manifolds are critical in the sensitivity analysis of hyperbolic chaos. These derivatives, known as the SRB density gradients, are by-products of the regularization of Lebesgue integrals appearing in the original linear response expression. In this paper, we propose a novel trajectory-driven algorithm for computing the SRB density gradient defined for systems with high-dimensional unstable manifolds. We apply the concept of measure preservation together with the chain rule on smooth manifolds. Due to the recursive one-step nature of our derivations, the proposed procedure is memory-efficient and can be naturally integrated with existing Monte Carlo schemes widely used in computational chaotic dynamics. We numerically show the exponential convergence of our scheme, analyze the computational cost, and present its use in the context of Monte Carlo integration.

preprint, 2020, arXiv

A BeiDou Signal Acquisition Approach Using Variable Length Data Accumulation based on Signal Delay and Multiplication

The secondary modulation with the Neumann-Hoffman code increases the possibility of bit sign transition. Unlike other GNSS signals, there is no pilot component for synchronization in BeiDou B1/B3 signals, which increases the complexity of acquisition. A previous study has shown that the delay and multiplication (DAM) method is able to eliminate the bit sign transition problem, but it only applies to relatively strong signals. In this paper, a DAM-based BeiDou signal acquisition approach, called variable length data accumulation (VLDA), is proposed to acquire weak satellite signals. Firstly, the performance of the DAM method versus the different delays is analyzed. The DAM operation not only eliminates bit sign transition, but it also increases noise power. Secondly, the long-term signal is periodically accumulated to improve signal intensity in order to acquire weak signals. Because of the Doppler frequency shift of the ranging codes, the signal length must be compensated before accumulating the long-term signal. Finally, the fast-Fourier-transform based parallel code phase algorithm is used for acquisition. The simulation results indicate that the proposed VLDA method has better acquisition sensitivity than the traditional non-coherent integration method under the same calculation amount. The VLDA method only requires approximately 27.5% of the calculations to achieve the same acquisition sensitivity (35 dB-Hz). Moreover, the actual experimental results verify the feasibility of the VLDA method. It can be concluded that the proposed approach is an effective and feasible method for solving the bit sign transition problem.

preprint, 2020, arXiv

A computable realization of Ruelle's formula for linear response of statistics in chaotic systems

We present a computable reformulation of Ruelle's linear response formula for chaotic systems. The new formula, called Space-Split Sensitivity or S3, achieves an error convergence of the order ${\cal O}(1/\sqrt{N})$ using $N$ phase points. The reformulation is based on splitting the overall sensitivity into that to stable and unstable components of the perturbation. The unstable contribution to the sensitivity is regularized using ergodic properties and the hyperbolic structure of the dynamics. Numerical examples of uniformly hyperbolic attractors are used to validate the S3 formula against a naïve finite-difference calculation; sensitivities match closely, with far fewer sample points required by S3.

preprint, 2020, arXiv

Ergodic Sensitivity Analysis of One-Dimensional Chaotic Maps

Sensitivity analysis in chaotic dynamical systems is a challenging task from a computational point of view. In this work, we present a numerical investigation of a novel approach, known as the space-split sensitivity or S3 algorithm. The S3 algorithm is an ergodic-averaging method to differentiate statistics in ergodic, chaotic systems, rigorously based on the theory of hyperbolic dynamics. We illustrate S3 on one-dimensional chaotic maps, revealing its computational advantage over naive finite difference computations of the same statistical response. In addition, we provide an intuitive explanation of the key components of the S3 algorithm, including the density gradient function.

preprint, 2020, arXiv

Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

High fidelity scientific simulations modeling physical phenomena typically require solving large linear systems of equations which result from discretization of a partial differential equation (PDE) by some numerical method. This step often takes a vast amount of computational time to complete, and therefore presents a bottleneck in simulation work. Solving these linear systems efficiently requires the use of massively parallel hardware with high computational throughput, as well as the development of algorithms which respect the memory hierarchy of these hardware architectures to achieve high memory bandwidth. In this paper, we present an algorithm to accelerate Jacobi iteration for solving structured problems on graphics processing units (GPUs) using a hierarchical approach in which multiple iterations are performed within on-chip shared memory every cycle. A domain decomposition style procedure is adopted in which the problem domain is partitioned into subdomains whose data is copied to the shared memory of each GPU block. Jacobi iterations are performed internally within each block's shared memory, avoiding the need to perform expensive global memory accesses every step. We test our algorithm on the linear systems arising from discretization of Poisson's equation in 1D and 2D, and compare our shared memory approach with a traditional Jacobi implementation which only uses global memory on the GPU. We observe an 8x speedup in convergence in the 1D problem and a nearly 6x speedup in the 2D case from the use of shared memory.
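The hierarchical structure described above can be emulated on the CPU to show the control flow. The sketch below is a plain Python stand-in, not the paper's GPU implementation: a local list plays the role of a block's shared memory, subdomains are assumed to be non-overlapping, and halo values are assumed to stay frozen for the whole cycle. The block size, inner iteration count, and function names are hypothetical choices made for this example.

```python
def jacobi_sweep(u, left, right, rhs):
    """One Jacobi sweep on a segment with fixed boundary values `left`, `right`."""
    n = len(u)
    new = []
    for i in range(n):
        lo = u[i - 1] if i > 0 else left
        hi = u[i + 1] if i < n - 1 else right
        new.append(0.5 * (rhs[i] + lo + hi))
    return new

def hierarchical_cycle(u, rhs, block, inner):
    """Copy each block and its halo out of `u`, sweep it `inner` times, write back."""
    n = len(u)
    out = list(u)
    for start in range(0, n, block):
        stop = min(start + block, n)
        left = u[start - 1] if start > 0 else 0.0  # halo, frozen for this cycle
        right = u[stop] if stop < n else 0.0
        local = u[start:stop]                      # stands in for shared memory
        for _ in range(inner):
            local = jacobi_sweep(local, left, right, rhs[start:stop])
        out[start:stop] = local                    # single write back per cycle
    return out

def max_error(u, exact):
    return max(abs(a - b) for a, b in zip(u, exact))

# 1D Poisson problem -u'' = 1 on (0, 1) with zero boundary values
n = 32
h = 1.0 / (n + 1)
rhs = [h * h] * n
exact = [0.5 * (i + 1) * h * (1.0 - (i + 1) * h) for i in range(n)]

u_plain = [0.0] * n
u_hier = [0.0] * n
for _ in range(100):  # 100 global synchronizations for each method
    u_plain = jacobi_sweep(u_plain, 0.0, 0.0, rhs)
    u_hier = hierarchical_cycle(u_hier, rhs, block=8, inner=4)

err_plain = max_error(u_plain, exact)
err_hier = max_error(u_hier, exact)
print(err_plain, err_hier)
```

The comparison here is per global synchronization, not per unit of arithmetic: each hierarchical cycle does `inner` times more sweeps, but reads and writes the global array only once. On a GPU that trade is favorable when on-chip memory accesses are much cheaper than global ones, which is the effect the abstract reports.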