Researcher profile

Li Wei

Li Wei contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
7 topics
4 close collaborators

Published work

10 published items

preprint, 2026, arXiv

Efficient Context Scaling with LongCat ZigZag Attention

We introduce LongCat ZigZag Attention (LoZA), a sparse attention scheme designed to transform any existing full-attention model into a sparse version with a rather limited compute budget. In long-context scenarios, LoZA can achieve significant speed-ups both for prefill-intensive (e.g., retrieval-augmented generation) and decode-intensive (e.g., tool-integrated reasoning) cases. Specifically, by applying LoZA to LongCat-Flash during mid-training, we serve LongCat-Flash-Exp as a long-context foundation model that can swiftly process up to 1 million tokens, enabling efficient long-term reasoning and long-horizon agentic capabilities.

preprint, 2026, arXiv

ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging

Despite the rapid advancements in large language model (LLM) development, fine-tuning them for specific tasks often results in the catastrophic forgetting of their general, language-based reasoning abilities. This work investigates and addresses this challenge in the context of the Generative Retrieval (GenRetrieval) task. During GenRetrieval fine-tuning, we find this forgetting occurs rapidly and correlates with the distance between the fine-tuned and original model parameters. Given these observations, we propose ORBIT, a novel approach that actively tracks the distance between fine-tuned and initial model weights, and uses a weight averaging strategy to constrain model drift during GenRetrieval fine-tuning when this inter-model distance exceeds a maximum threshold. Our results show that ORBIT retains substantial text and retrieval performance by outperforming both common continual learning baselines and related regularization methods that also employ weight averaging.
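The drift-constraining idea in this abstract can be illustrated with a minimal sketch. The abstract does not state which distance metric, threshold, or averaging coefficient ORBIT uses, so the Euclidean distance, the `max_distance` parameter, the `mix` coefficient, and the function names below are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two flat weight vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def constrain_drift(current, origin, max_distance, mix=0.5):
    """Hypothetical drift check: average back toward the original weights.

    If the fine-tuned weights are within max_distance of the original
    weights, they are returned unchanged. Otherwise each weight is replaced
    by a convex combination of the original and fine-tuned values.
    """
    if l2_distance(current, origin) <= max_distance:
        return list(current)
    return [mix * o + (1.0 - mix) * c for c, o in zip(current, origin)]


# Toy example: the fine-tuned weights have drifted a distance of 5.0.
origin = [0.0, 0.0]
drifted = [3.0, 4.0]
merged = constrain_drift(drifted, origin, max_distance=2.0, mix=0.5)
```

In a training loop, a check of this kind would presumably run periodically between optimizer steps; how often, and with which coefficient, is not specified in the abstract.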

preprint, 2022, arXiv

Channel Estimation for RIS-Empowered Multi-User MISO Wireless Communications

Reconfigurable Intelligent Surfaces (RISs) have been recently considered as an energy-efficient solution for future wireless networks due to their fast and low-power configuration, which has increased potential in enabling massive connectivity and low-latency communications. Accurate and low-overhead channel estimation in RIS-based systems is one of the most critical challenges due to the usually large number of RIS unit elements and their distinctive hardware constraints. In this paper, we focus on the uplink of a RIS-empowered multi-user Multiple Input Single Output (MISO) communication system and propose a channel estimation framework based on the parallel factor decomposition to unfold the resulting cascaded channel model. We present two iterative estimation algorithms for the channels between the base station and RIS, as well as the channels between RIS and users. One is based on alternating least squares (ALS), while the other uses vector approximate message passing to iteratively reconstruct two unknown channels from the estimated vectors. To theoretically assess the performance of the ALS-based algorithm, we derive its estimation Cramér-Rao Bound (CRB). We also discuss the downlink achievable sum rate computation with estimated channels and different precoding schemes for the base station. Our extensive simulation results show that our algorithms outperform benchmark schemes and that the ALS technique achieves the CRB. It is also demonstrated that the sum rate using the estimated channels always reaches that of perfect channels under various settings, thus verifying the effectiveness and robustness of the proposed estimation algorithms.

preprint, 2022, arXiv

Joint Channel Estimation and Signal Recovery for RIS-Empowered Multi-User Communications

Reconfigurable intelligent surfaces (RISs) have been recently considered as a promising candidate for energy-efficient solutions in future wireless networks. Their dynamic and low-power configuration enables coverage extension, massive connectivity, and low-latency communications. Due to the large number of unknown variables associated with the RIS unit elements and the transmitted signals, channel estimation and signal recovery in RIS-based systems are among the most critical technical challenges. To address this problem, we focus on the RIS-assisted wireless communication system and present two joint channel estimation and signal recovery schemes based on message passing algorithms. Specifically, the proposed bidirectional scheme applies the Taylor series expansion and Gaussian approximation to simplify the sum-product procedure in the formulated problem. In addition, an inner iteration that adopts two variants of approximate message passing algorithms is incorporated to ensure robustness and convergence. Two ambiguity removal methods are also discussed. Our simulation results show that the proposed schemes outperform the state-of-the-art benchmark method. We also provide insights on the impact of different RIS parameter settings on the proposed schemes.

preprint, 2022, arXiv

Multi-hop RIS-Empowered Terahertz Communications: A DRL-based Hybrid Beamforming Design

Wireless communication in the TeraHertz band (0.1-10 THz) is envisioned as one of the key enabling technologies for the future sixth generation (6G) wireless communication systems scaled up beyond massive multiple input multiple output (Massive-MIMO) technology. However, very high propagation attenuations and molecular absorptions of THz frequencies often limit the signal transmission distance and coverage range. Benefiting from the recent breakthrough in reconfigurable intelligent surfaces (RIS) for realizing smart radio propagation environments, we propose a novel hybrid beamforming scheme for multi-hop RIS-assisted communication networks to improve the coverage range at THz-band frequencies. In particular, multiple passive and controllable RISs are deployed to assist the transmissions between the base station (BS) and multiple single-antenna users. We investigate the joint design of the digital beamforming matrix at the BS and the analog beamforming matrices at the RISs, by leveraging the recent advances in deep reinforcement learning (DRL) to combat the propagation loss. To improve the convergence of the proposed DRL-based algorithm, two algorithms are then designed to initialize the digital beamforming and the analog beamforming matrices utilizing the alternating optimization technique. Simulation results show that our proposed scheme is able to improve the coverage range of THz communications by 50% compared with the benchmarks. Furthermore, it is also shown that our proposed DRL-based method is a state-of-the-art method to solve the NP-hard beamforming problem, especially when the signals at RIS-assisted THz communication networks experience multiple hops.

preprint, 2022, arXiv

Multi-User Holographic MIMO Surfaces: Channel Modeling and Spectral Efficiency Analysis

The multi-user Holographic Multiple-Input and Multiple-Output Surface (MU-HMIMOS) paradigm, which is capable of realizing large continuous apertures with minimal power consumption, has been recently considered as an energy-efficient solution for future wireless networks, offering increased flexibility in impacting electromagnetic (EM) wave propagation according to the desired communication, localization, and sensing objectives. The tractable channel modeling in MU-HMIMOS wireless systems is one of the most critical research challenges, mainly due to the coupling effect induced by the excessively large number of closely spaced patch antennas. In this paper, we focus on this challenge for the downlink of multi-user MIMO communications and extend an EM-compliant channel model to the multi-user case, which is expressed in the wavenumber domain using the Fourier plane wave approximation. Based on the presented channel model, we investigate the spectral efficiency of maximum-ratio transmission and Zero-Forcing (ZF) precoding schemes. We also introduce a novel hardware-efficient ZF precoder, leveraging Neumann series (NS) expansion to replace the required matrix inversion operation, which is very hard to compute in the conventional way due to the extremely large number of patch antennas in the envisioned MU-HMIMOS communication systems. In comparison with the conventional independent and identically distributed Rayleigh fading channels that ignore antenna coupling effects, the proposed EM-compliant channel model captures the mutual couplings induced by the very small antenna spacing. Our extensive performance evaluation results demonstrate that our theoretical performance expressions approximate sufficiently well ...
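As a rough illustration of how a Neumann series can stand in for an explicit matrix inversion, the sketch below approximates the inverse of a small real matrix using only multiplications and additions. It is not the precoder from the paper: the 2x2 matrix is a made-up stand-in for the Gram matrix that a ZF precoder would invert, the diagonal splitting is one common choice of preconditioner, and the names `matmul`, `neumann_inverse`, and `terms` are hypothetical. The series converges only when the spectral radius of I - D^{-1}A is below one, which holds for diagonally dominant matrices such as this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def neumann_inverse(a, terms):
    """Approximate inv(a) as sum_{k < terms} (I - D^-1 a)^k D^-1.

    D is the diagonal part of a. No explicit matrix inversion is used,
    only reciprocals of the diagonal entries.
    """
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d_inv = [[1.0 / a[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    scaled = matmul(d_inv, a)
    e = [[identity[i][j] - scaled[i][j] for j in range(n)] for i in range(n)]
    acc = [[0.0] * n for _ in range(n)]
    power = identity
    for _ in range(terms):
        # Accumulate the current power of e, then advance to the next power.
        acc = [[acc[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, e)
    return matmul(acc, d_inv)


# Made-up diagonally dominant matrix standing in for a Gram matrix.
gram = [[4.0, 1.0], [1.0, 3.0]]
approx_inv = neumann_inverse(gram, terms=30)
residual = matmul(gram, approx_inv)  # should be close to the identity
```

Fewer terms trade accuracy for computation, which is the usual motivation for this kind of approximation when the matrix is very large.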

preprint, 2022, arXiv

Multi-User Wireless Communications with Holographic MIMO Surfaces: A Convenient Channel Model and Spectral Efficiency Analysis

The multi-user Holographic Multiple-Input and Multiple-Output Surface (MU-HMIMOS) paradigm, which is capable of realizing large continuous apertures with minimal power consumption and of shaping radio wave propagation at will, has been recently considered as an energy-efficient solution for future wireless networks. The tractable channel modeling of MU-HMIMOS signal propagation is one of the most critical challenges, mainly due to the coupling effect induced by the excessively large number of closely spaced patch antennas. In this paper, we focus on this challenge for downlink communications and model the electromagnetic channel in the wavenumber domain using the Fourier plane wave representation. Based on the proposed model, we devise a Zero-Forcing (ZF) precoding scheme, capitalizing on the sampled channel variance that depends on the number and spacing of the HMIMOS patch antennas, and perform a spectral efficiency analysis. Our simulation results show that the performance of the considered MU-HMIMOS system improves with more patch antennas and with larger spacing between them. In addition, it is demonstrated that our theoretical performance expressions approximate the simulated spectral efficiency sufficiently well, even for the highly correlated cases, thus verifying the effectiveness and robustness of the presented analytical framework.

preprint, 2020, arXiv

Hybrid Beamforming for RIS-Empowered Multi-hop Terahertz Communications: A DRL-based Method

Wireless communication in the TeraHertz band (0.1-10 THz) is envisioned as one of the key enabling technologies for the future sixth generation (6G) wireless communication systems. However, very high propagation attenuations and molecular absorptions of THz frequencies often limit the signal transmission distance and coverage range. Benefiting from the recent breakthrough in reconfigurable intelligent surfaces (RIS) for realizing smart radio propagation environments, we propose a novel hybrid beamforming scheme for multi-hop RIS-assisted communication networks to improve the coverage range at THz-band frequencies. We investigate the joint design of the digital beamforming matrix at the BS and the analog beamforming matrices at the RISs, by leveraging the recent advances in deep reinforcement learning (DRL) to combat the propagation loss. Simulation results show that our proposed scheme is able to improve the coverage range of THz communications by 50% compared with the benchmarks. Furthermore, it is also shown that our proposed DRL-based method is a state-of-the-art method to solve the NP-hard beamforming problem, especially when the signals at RIS-empowered THz communication networks experience multiple hops.

preprint, 2020, arXiv

Parallel Factor Decomposition Channel Estimation in RIS-Assisted Multi-User MISO Communication

Reconfigurable Intelligent Surfaces (RISs) have been recently considered as an energy-efficient solution for future wireless networks due to their fast and low-power configuration, enabling massive connectivity and low-latency communications. Channel estimation in RIS-based systems is one of the most critical challenges due to the large number of reflecting unit elements and their distinctive hardware constraints. In this paper, we focus on the downlink of a RIS-assisted multi-user Multiple Input Single Output (MISO) communication system and present a method based on the PARAllel FACtor (PARAFAC) decomposition to unfold the resulting cascaded channel model. The proposed method includes an alternating least squares algorithm to iteratively estimate the channel between the base station and RIS, as well as the channels between RIS and users. Our selected simulation results show that the proposed iterative channel estimation method outperforms a benchmark scheme using genie-aided information. We also provide insights on the impact of different RIS settings on the proposed algorithm.

preprint, 2020, arXiv

TEAM: An Taylor Expansion-Based Method for Generating Adversarial Examples

Although Deep Neural Networks (DNNs) have achieved successful applications in many fields, they are vulnerable to adversarial examples. Adversarial training is one of the most effective methods to improve the robustness of DNNs, and it is generally considered as solving a saddle point problem that minimizes risk and maximizes perturbation. Therefore, powerful adversarial examples can effectively replicate the situation of perturbation maximization to solve the saddle point problem. The method proposed in this paper approximates the output of DNNs in the input neighborhood by using the Taylor expansion, and then optimizes it by using the Lagrange multiplier method to generate adversarial examples. If it is used for adversarial training, the DNNs can be effectively regularized and the defects of the model can be mitigated.