Researcher profile

Yuan Wu

Yuan Wu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
16works
0followers
14topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

16 published item(s)

preprint2026arXiv

AGGC: Adaptive Group Gradient Clipping for Stabilizing Large Language Model Training

To stabilize the training of Large Language Models (LLMs), gradient clipping is a nearly ubiquitous heuristic used to alleviate exploding gradients. However, traditional global norm clipping erroneously presupposes gradient homogeneity across different functional modules, leading to an adverse "spill-over" effect where volatile parameters force unnecessary scaling on stable ones. To overcome this, we propose Adaptive Group-wise Gradient Clipping (AGGC). AGGC partitions parameters into groups based on functional types and regulates each according to its historical behavior using an Exponential Moving Average (EMA). Specifically, it constructs an adaptive interval to simultaneously mitigate gradient explosion and vanishing, while employing a time-dependent scheduling mechanism to balance exploration and convergence. Experiments on LLaMA 2-7B, Mistral-7B, and Gemma-7B models show that AGGC consistently outperforms LoRA and frequently surpasses Full Fine-Tuning. On the GSM8K benchmark, Mistral-7B fine-tuned with AGGC achieves an accuracy of 72.93%, exceeding LoRA's 69.5%. AGGC also effectively stabilizes Reinforcement Learning with Verifiable Rewards (RLVR), enhancing the logic deduction of Qwen 2.5 and Llama 3.2 models. Experimental results demonstrate that AGGC effectively addresses the limitations of traditional gradient clipping methods, particularly in overcoming gradient heterogeneity, by utilizing a modular, adaptive clipping strategy to stabilize the training process. Due to its lightweight design, AGGC can be seamlessly integrated into existing post-training pipelines with negligible overhead.

preprint2026arXiv

GPS-Synchronized Monitoring of Core-collapse Supernova Bursts with PandaX-4T via Coherent Elastic Neutrino Nuclear Scattering

The landmark detection of neutrinos from SN1987A marked the dawn of neutrino astrophysics. The neutrino burst provided essential insights into fundamental properties of neutrinos, and served as key probes of stellar evolution and supernova dynamics. The recent advancement in coherent elastic neutrino-nucleus scattering enables the detection of core-collapse supernova burst neutrinos using tonne-scale liquid xenon detectors originally designed for dark matter direct detection. Leveraging this capability, we developed and deployed an online supernova monitoring system for the PandaX-4T experiment. This system features a GPS module with millisecond-level timing precision, a low false-alarm rate, and high sensitivity to galactic core-collapse supernova explosion events. The methodology is robust, directly scalable, and planned for implementation in the next-generation PandaX-20T experiment.

preprint2026arXiv

Height-Guided Projection Reparameterization for Camera-LiDAR Occupancy

3D occupancy prediction aims to infer dense, voxel-wise scene semantics from sensor observations, where the 2D-to-3D view transformation serves as a crucial step in bridging image features and volumetric representations. Most previous methods rely on a fixed projection space, where 3D reference points are uniformly sampled along pillars. However, such sampling struggles to capture the sparsity and height variations of real-world scenes, leading to ambiguous correspondences and unreliable feature aggregation. To address these challenges, we propose HiPR, a camera-LiDAR occupancy framework with Height-Guided Projection Reparameterization. HiPR first encodes LiDAR into a BEV height map to capture the maximum height of the point cloud. HiPR then adjusts the sampling range of each pillar using the height prior, enabling adaptive reparameterization of the projection space. As a result, the projected points are redistributed into geometrically meaningful regions rather than fixed ranges. Meanwhile, we mask out the invalid parts of the height map to avoid misleading the feature aggregation. In addition, to alleviate the training instability caused by noisy LiDAR-derived heights, we introduce a training-time Progressive Height Conditioning strategy, which gradually transitions the conditioning signal from ground-truth heights to LiDAR heights. Extensive experiments demonstrate that HiPR consistently outperforms existing state-of-the-art methods while maintaining real-time inference. The code and pretrained models can be found at https://github.com/yanzq95/HiPR.

preprint2023arXiv

A Survey on Evaluation of Large Language Models

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, educations, natural and social sciences, agent applications, and other areas. Secondly, we answer the `where' and `how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey.

preprint2022arXiv

Achieving Multi-beam Gain in Intelligent Reflecting Surface Assisted Wireless Energy Transfer

Intelligent reflecting surface (IRS) is a promising technology to boost the efficiency of wireless energy transfer (WET) systems. However, for a multiuser WET system, simultaneous multi-beam energy transmission is generally required to achieve the maximum performance, which may not be implemented by using the IRS having only a single set of coefficients. As a result, it remains unknowns how to exploit the IRS to approach such a performance upper bound. To answer this question, we aim to maximize the total harvested energy of a multiuser WET system subject to the user fairness constraints and the non-linear energy harvesting model. We first consider the static IRS beamforming scheme, which shows that the optimal IRS reflection matrix obtained by applying semidefinite relaxation is indeed of high rank in general as the number of energy receivers (ERs) increases, due to which the resulting rank-one solution by applying Gaussian Randomization may lead to significant loss. To achieve the multi-beam gain, we then propose a general time-division based novel framework by exploiting the IRS's dynamic passive beamforming. Moreover, it is able to achieve a good balance between the system performance and complexity by controlling the number of IRS shift patterns. Finally, we also propose a time-division multiple access (TDMA) based passive beamforming design for performance comparison. Simulation results demonstrate the necessity of multi-beam transmission and the superiority of the proposed dynamic IRS beamforming scheme over existing schemes.

preprint2022arXiv

Co-Regularized Adversarial Learning for Multi-Domain Text Classification

Multi-domain text classification (MDTC) aims to leverage all available resources from multiple domains to learn a predictive model that can generalize well on these domains. Recently, many MDTC methods adopt adversarial learning, shared-private paradigm, and entropy minimization to yield state-of-the-art results. However, these approaches face three issues: (1) Minimizing domain divergence can not fully guarantee the success of domain alignment; (2) Aligning marginal feature distributions can not fully guarantee the discriminability of the learned features; (3) Standard entropy minimization may make the predictions on unlabeled data over-confident, deteriorating the discriminability of the learned features. In order to address the above issues, we propose a co-regularized adversarial learning (CRAL) mechanism for MDTC. This approach constructs two diverse shared latent spaces, performs domain alignment in each of them, and punishes the disagreements of these two alignments with respect to the predictions on unlabeled data. Moreover, virtual adversarial training (VAT) with entropy minimization is incorporated to impose consistency regularization to the CRAL method. Experiments show that our model outperforms state-of-the-art methods on two MDTC benchmarks.

preprint2022arXiv

Erasing the which-path information of photons

Which-path information of a quantum particle in interferometers is the key to infer the past of quantum particle. It arises many extensive discussions including quantum complementarity and path-visibility relation. The basic of these discussions are the description, detection and control of which-path information. In this article, we focus on the investigation of multidimensional which-path information in nested Mach-Zehnder interferometer. A general expression of which-path information is given and can be partially extracted by different detection method. Further analysis shows that the which-path information can be controlled by the phase differences and beam splitting ratios between the arms of nested Mach-Zehnder interferometer. Moreover, a new which-path information elimination phenomenon has been predicted and demonstrated experimentally. Our work can help to understand the physics of quantum particles, potentially apply to quantum information process and quantum metrology.

preprint2022arXiv

Maximum Batch Frobenius Norm for Multi-Domain Text Classification

Multi-domain text classification (MDTC) has obtained remarkable achievements due to the advent of deep learning. Recently, many endeavors are devoted to applying adversarial learning to extract domain-invariant features to yield state-of-the-art results. However, these methods still face one challenge: transforming original features to be domain-invariant distorts the distributions of the original features, degrading the discriminability of the learned features. To address this issue, we first investigate the structure of the batch classification output matrix and theoretically justify that the discriminability of the learned features has a positive correlation with the Frobenius norm of the batch output matrix. Based on this finding, we propose a maximum batch Frobenius norm (MBF) method to boost the feature discriminability for MDTC. Experiments on two MDTC benchmarks show that our MBF approach can effectively advance the performance of the state-of-the-art.

preprint2022arXiv

Throughput Maximization for Active Intelligent Reflecting Surface Aided Wireless Powered Communications

This paper considers an active intelligent reflecting surface (IRS)-aided wireless powered communication network (WPCN), where devices first harvest energy and then transmit information to a hybrid access point (HAP). Different from the existing works on passive IRS-aided WPCNs, this is the first work that introduces the active IRS in WPCNs. To guarantee fairness, the problem is formulated as an amplifying power-limited weighted sum throughput (WST) maximization problem, which is solved by successive convex approximation technique and fractional programming alternatively. To balance the performance and complexity tradeoff, three beamforming setups are considered at the active IRS, namely user-adaptive IRS beamforming, uplink-adaptive IRS beamforming, and static IRS beamforming. Numerical results demonstrate the significant superiority of employing active IRS in WPCNs and the benefits of dynamic IRS beamforming. Specifically, it is found that compared to the passive IRS, the active IRS not only improves the WST greatly, but also is more energy-efficient and can significantly extend the transmission coverage. Moreover, different from the symmetric deployment strategy of passive IRS, it is more preferable to deploy the active IRS near the devices.

preprint2021arXiv

Conditional Adversarial Networks for Multi-Domain Text Classification

In this paper, we propose conditional adversarial networks (CANs), a framework that explores the relationship between the shared features and the label predictions to impose more discriminability to the shared features, for multi-domain text classification (MDTC). The proposed CAN introduces a conditional domain discriminator to model the domain variance in both shared feature representations and class-aware information simultaneously and adopts entropy conditioning to guarantee the transferability of the shared features. We provide theoretical analysis for the CAN framework, showing that CAN's objective is equivalent to minimizing the total divergence among multiple joint distributions of shared features and label predictions. Therefore, CAN is a theoretically sound adversarial network that discriminates over multiple distributions. Evaluation results on two MDTC benchmarks show that CAN outperforms prior methods. Further experiments demonstrate that CAN has a good ability to generalize learned knowledge to unseen domains.

preprint2021arXiv

Learning Based Task Offloading in Digital Twin Empowered Internet of Vehicles

Mobile edge computing has become an effective and fundamental paradigm for futuristic autonomous vehicles to offload computing tasks. However, due to the high mobility of vehicles, the dynamics of the wireless conditions, and the uncertainty of the arrival computing tasks, it is difficult for a single vehicle to determine the optimal offloading strategy. In this paper, we propose a Digital Twin (DT) empowered task offloading framework for Internet of Vehicles. As a software agent residing in the cloud, a DT can obtain both global network information by using communications among DTs, and historical information of a vehicle by using the communications within the twin. The global network information and historical vehicular information can significantly facilitate the offloading. In specific, to preserve the precious computing resource at different levels for most appropriate computing tasks, we integrate a learning scheme based on the prediction of futuristic computing tasks in DT. Accordingly, we model the offloading scheduling process as a Markov Decision Process (MDP) to minimize the long-term cost in terms of a trade off between task latency, energy consumption, and renting cost of clouds. Simulation results demonstrate that our algorithm can effectively find the optimal offloading strategy, as well as achieve the fast convergence speed and high performance, compared with other existing approaches.

preprint2021arXiv

Mixup Regularized Adversarial Networks for Multi-Domain Text Classification

Using the shared-private paradigm and adversarial training has significantly improved the performances of multi-domain text classification (MDTC) models. However, there are two issues for the existing methods. First, instances from the multiple domains are not sufficient for domain-invariant feature extraction. Second, aligning on the marginal distributions may lead to fatal mismatching. In this paper, we propose a mixup regularized adversarial network (MRAN) to address these two issues. More specifically, the domain and category mixup regularizations are introduced to enrich the intrinsic features in the shared latent space and enforce consistent predictions in-between training instances such that the learned features can be more domain-invariant and discriminative. We conduct experiments on two benchmarks: The Amazon review dataset and the FDU-MTL dataset. Our approach on these two datasets yields average accuracies of 87.64\% and 89.0\% respectively, outperforming all relevant baselines.

preprint2021arXiv

Visualizing Deep Learning-based Radio Modulation Classifier

Deep learning has recently been successfully applied in automatic modulation classification by extracting and classifying radio features in an end-to-end way. However, deep learning-based radio modulation classifiers are lack of interpretability, and there is little explanation or visibility into what kinds of radio features are extracted and chosen for classification. In this paper, we visualize different deep learning-based radio modulation classifiers by introducing a class activation vector. Specifically, both convolutional neural networks (CNN) based classifier and long short-term memory (LSTM) based classifier are separately studied, and their extracted radio features are visualized. Extensive numerical results show both the CNN-based classifier and LSTM-based classifier extract similar radio features relating to modulation reference points. In particular, for the LSTM-based classifier, its obtained radio features are similar to the knowledge of human experts. Our numerical results indicate the radio features extracted by deep learning-based classifiers greatly depend on the contents carried by radio signals, and a short radio sample may lead to misclassification.

preprint2020arXiv

Atom-light hybrid quantum gyroscope

A new type of atom-light hybrid quantum gyroscope (ALHQG) is proposed due to its high rotation sensitivity. It consists of an optical Sagnac loop to couple rotation rate and an atomic ensemble as quantum beam splitter/recombiner (QBS/C) based on atomic Raman amplification process to realize the splitting and recombination of the optical wave and the atomic spin wave. The rotation sensitivity can be enhanced by the quantum correlation between Sagnac loop and QBS/C. The optimal working condition is investigated to achieve the best sensitivity. The numerical results show that the rotation sensitivity can beat the standard quantum limit (SQL) in ideal condition. Even in the presence of the attenuation under practical condition, the best sensitivity of the ALHQG can still beat the SQL and is better than that of a fiber optic gyroscope (FOG). Such an ALHQG could be practically applied for modern inertial navigation system.

preprint2020arXiv

Dual Mixup Regularized Learning for Adversarial Domain Adaptation

Recent advances on unsupervised domain adaptation (UDA) rely on adversarial learning to disentangle the explanatory and transferable features for domain adaptation. However, there are two issues with the existing methods. First, the discriminability of the latent space cannot be fully guaranteed without considering the class-aware information in the target domain. Second, samples from the source and target domains alone are not sufficient for domain-invariant feature extracting in the latent space. In order to alleviate the above issues, we propose a dual mixup regularized learning (DMRL) method for UDA, which not only guides the classifier in enhancing consistent predictions in-between samples, but also enriches the intrinsic structures of the latent space. The DMRL jointly conducts category and domain mixup regularizations on pixel level to improve the effectiveness of models. A series of empirical studies on four domain adaptation benchmarks demonstrate that our approach can achieve the state-of-the-art.