Researcher profile

Wenqi Zhang

Wenqi Zhang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

Fast Collaborative Inference via Distributed Speculative Decoding

Speculative decoding accelerates large language model (LLM) inference by allowing a small draft model to predict multiple future tokens for verification by a larger target model. In AI-native radio access networks (AI-RAN), this enables device-edge collaborative inference but introduces significant uplink overhead, as existing distributed speculative decoding schemes transmit full vocabulary logits at every step. We propose a sparsify-then-sample strategy, Truncated Sparse Logits Transmission (TSLT), which transmits only the logits and indices of a truncated candidate set. We provide theoretical guarantees showing that the acceptance rate is preserved under TSLT. TSLT is further extended to multi-candidate case, where multiple draft candidates per step increase acceptance probability. Experiments show that TSLT significantly reduces uplink communication while maintaining end-to-end inference latency and model quality, demonstrating its effectiveness for scalable, communication-efficient distributed LLM inference in future AI-RAN systems.

preprint2026arXiv

Ground What You See: Hallucination-Resistant MLLMs via Caption Feedback, Diversity-Aware Sampling, and Conflict Regularization

While Multimodal Large Language Models (MLLMs) have achieved remarkable success across diverse tasks, their practical deployment is severely hindered by hallucination issues, which become particularly acute during Reinforcement Learning (RL) optimization. This paper systematically analyzes the root causes of hallucinations in MLLMs under RL training, identifying three critical factors: (1) an over-reliance on chained visual reasoning, where inaccurate initial descriptions or redundant information anchor subsequent inferences to incorrect premises; (2) insufficient exploration diversity during policy optimization, leading the model to generate overly confident but erroneous outputs; and (3) destructive conflicts between training samples, where Neural Tangent Kernel (NTK) similarity causes false associations and unstable parameter updates. To address these challenges, we propose a comprehensive framework comprising three core modules. First, we enhance visual localization by introducing dedicated planning and captioning stages before the reasoning phase, employing a quality-based caption reward to ensure accurate initial anchoring. Second, to improve exploration, we categorize samples based on the mean and variance of their reward distributions, prioritizing samples with high variance to focus the model on diverse and informative data. Finally, to mitigate sample interference, we regulate NTK similarity by grouping sample pairs and applying an InfoNCE loss to push overly similar pairs apart and pull dissimilar ones closer, thereby guiding gradient interactions toward a balanced range. Experimental results demonstrate that our proposed method significantly reduces hallucination rates and effectively enhances the inference accuracy of MLLMs.

preprint2026arXiv

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

LLM agents rely on prompts to implement task-specific capabilities based on foundation LLMs, making agent prompts valuable intellectual property. However, in untrusted deployments, adversaries can copy and reuse these prompts with other proprietary LLMs, causing economic losses. To protect these prompts, we identify four key challenges: proactivity, runtime protection, usability, and non-portability that existing approaches fail to address. We present PragLocker, a prompt protection scheme that satisfies these requirements. PragLocker constructs function-preserving obfuscated prompts by anchoring semantics with code symbols and then using target-model feedback to inject noise, yielding prompts that only work on the target LLM. Experiments across multiple agent systems, datasets, and foundation LLMs show that PragLocker substantially reduces cross-LLM portability, maintains target performance, and remains robust against adaptive attackers.

preprint2026arXiv

Towards a Theoretical Framework for Robust Node Deployment in Cooperative ISAC Networks

This paper investigates node deployment strategies for robust multi-node cooperative localization in integrated sensing and communication (ISAC) networks.We first analyze how steering vector correlation across different positions affects localization performance and introduce a novel distance-weighted correlation metric to characterize this effect. Building upon this insight, we propose a deployment optimization framework that minimizes the maximum weighted steering vector correlation by optimizing simultaneously node positions and array orientations, thereby enhancing worst-case network robustness. Then, a genetic algorithm (GA) is developed to solve this min-max optimization, yielding optimized node positions and array orientations. Extensive simulations using both multiple signal classification (MUSIC) and neural-network (NN)-based localization validate the effectiveness of the proposed methods, demonstrating significant improvements in robust localization performance.

preprint2026arXiv

Towards Steering without Sacrifice: Principled Training of Steering Vectors for Prompt-only Interventions

Recently, steering vectors (SVs) have emerged as an effective and lightweight approach to steer behaviors of large language models (LLMs), among which fine-tuned SVs are more effective than optimization-free ones. However, current approaches to fine-tuned SVs suffer from two limitations. First, they require careful selection of steering factors on a per-SV basis to balance steering effectiveness and generation quality at inference time. Second, they operate as full-sequence SVs (FSSVs), which can sacrifice generation quality regardless of factor selection due to excessive intervention on the model generation process. To address the first limitation, we propose joint training of steering factors and directions, such that post-hoc factor selection is no longer required. Using neural network scaling theory, we find that moderately large initialization sizes and learning rates for steering factors are essential for stability and efficiency of joint training. To tackle the second limitation, we draw inspiration from representation fine-tuning and introduce Prompt-only SV (PrOSV), an SV that intervenes only on a few prompt tokens. Our empirical results show that PrOSV outperforms traditional FSSVs on AxBench when using our joint training scheme. We also find that PrOSV achieves a better tradeoff between general model utility and adversarial robustness than FSSV.

preprint2025arXiv

No Vision, No Wearables: 5G-based 2D Human Pose Recognition with Integrated Sensing and Communications

With the increasing maturity of contactless human pose recognition (HPR) technology, indoor interactive applications have raised higher demands for natural, controller-free interaction methods. However, current mainstream HPR solutions relying on vision or radio-frequency (RF) (including WiFi, radar) still face various challenges in practical deployment, such as privacy concerns, susceptibility to occlusion, dedicated equipment and functions, and limited sensing resolution and range. 5G-based integrated sensing and communication (ISAC) technology, by merging communication and sensing functions, offers a new approach to address these challenges in contactless HPR. We propose a practical 5G-based ISAC system capable of inferring 2D HPR from uplink sounding reference signals (SRS). Specifically, rich features are extracted from multiple domains and employ an encoder to achieve unified alignment and representation in a latent space. Subsequently, low-dimensional features are fused to output the human pose state. Experimental results demonstrate that in typical indoor environments, our proposed 5G-based ISAC HPR system significantly outperforms current mainstream baseline solutions in HPR performance, providing a solid technical foundation for universal human-computer interaction.