Source author record

Zhiyuan Jiang

Zhiyuan Jiang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Information Theory, math.IT, Networking and Internet Architecture, eess.SP, Artificial Intelligence, Computer Vision, eess.SY, Systems and Control, Computation and Language, Machine Learning, Robotics

Catalog footprint

What is connected

10 works

11 topics

4 close collaborators

Preprint, 2026, arXiv

Tone Matters: The Impact of Linguistic Tone on Hallucination in VLMs

Vision-Language Models (VLMs) are increasingly used in safety-critical applications that require reliable visual grounding. However, these models often hallucinate details that are not present in the image to satisfy user prompts. While recent datasets and benchmarks have been introduced to evaluate systematic hallucinations in VLMs, many hallucination behaviors remain insufficiently characterized. In particular, prior work primarily focuses on object presence or absence, leaving it unclear how prompt phrasing and structural constraints can systematically induce hallucinations. In this paper, we investigate how different forms of prompt pressure influence hallucination behavior. We introduce Ghost-100, a procedurally generated dataset of synthetic scenes in which key visual details are deliberately removed, enabling controlled analysis of absence-based hallucinations. Using a structured 5-Level Prompt Intensity Framework, we vary prompts from neutral queries to toxic demands and rigid formatting constraints. We evaluate three representative open-weight VLMs: MiniCPM-V 2.6-8B, Qwen2-VL-7B, and Qwen3-VL-8B. Across all three models, hallucination rates do not increase monotonically with prompt intensity. All models exhibit reductions at higher intensity levels at different thresholds, though not all show sustained reduction under maximum coercion. These results suggest that current safety alignment is more effective at detecting semantic hostility than structural coercion, revealing model-specific limitations in handling compliance pressure. Our dataset is available at: https://github.com/bli1/tone-matters

Preprint, 2026, arXiv

Wow, wo, val! A Comprehensive Embodied World Model Evaluation Turing Test

As world models gain momentum in Embodied AI, an increasing number of works explore using video foundation models as predictive world models for downstream embodied tasks like 3D prediction or interactive generation. However, before exploring these downstream tasks, two critical questions about video foundation models remain unanswered: (1) whether their generative generalization is sufficient to maintain perceptual fidelity in the eyes of human observers, and (2) whether they are robust enough to serve as a universal prior for real-world embodied agents. To provide a standardized framework for answering these questions, we introduce the Embodied Turing Test benchmark: WoW-World-Eval (Wow,wo,val). Building upon 609 robot manipulation data, Wow-wo-val examines five core abilities, including perception, planning, prediction, generalization, and execution. We propose a comprehensive evaluation protocol with 22 metrics to assess the models' generation ability, which achieves a high Pearson correlation between the overall score and human preference (>0.93) and establishes a reliable foundation for the Human Turing Test. On Wow-wo-val, models achieve only 17.27 on long-horizon planning and at best 68.02 on physical consistency, indicating limited spatiotemporal consistency and physical reasoning. For the Inverse Dynamic Model Turing Test, we first use an IDM to evaluate the video foundation models' execution accuracy in the real world. However, most models collapse to approximately 0% success, while WoW maintains a 40.74% success rate. These findings point to a noticeable gap between the generated videos and the real world, highlighting the urgency and necessity of benchmarking world models in Embodied AI.
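The abstract above reports a Pearson correlation between the benchmark's overall score and human preference. As a minimal sketch of how such a correlation is computed, the following uses the textbook formula; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


# Hypothetical numbers for illustration only: one overall benchmark score and
# one human-preference rating per evaluated model.
benchmark_scores = [41.0, 55.5, 62.0, 70.2, 48.3]
human_preference = [0.30, 0.52, 0.61, 0.78, 0.40]
r = pearson(benchmark_scores, human_preference)
```

A value of `r` close to 1 means the automatic score ranks models in nearly the same way human raters do.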

Preprint, 2022, arXiv

A Hard and Soft Hybrid Slicing Framework for Service Level Agreement Guarantee via Deep Reinforcement Learning

Network slicing is a critical driver for guaranteeing the diverse service level agreements (SLA) in 5G and future networks. Recently, deep reinforcement learning (DRL) has been widely utilized for resource allocation in network slicing. However, existing related works do not consider the performance loss associated with the initial exploration phase of DRL. This paper proposes a new performance-guaranteed slicing strategy with a soft and hard hybrid slicing setting. Specifically, a common slice setting is applied to guarantee the slices' SLA while the neural network is being trained. Moreover, as DRL training proceeds, the resources of the common slice are progressively redistributed to the individual slices until the training converges. Furthermore, experimental results confirm the effectiveness of our proposed slicing framework: the slices' SLA can be guaranteed during the training phase, and the proposed algorithm can achieve near-optimal performance in terms of the SLA satisfaction ratio, isolation degree and spectrum maximization after convergence.

Preprint, 2022, arXiv

Prediction-based Hybrid Slicing Framework for Service Level Agreement Guarantee in Mobility Scenarios: A Deep Learning Approach

Network slicing is a critical driver for guaranteeing the diverse service level agreements (SLA) in 5G and future networks. Inter-slice radio resource allocation (IS-RRA) in the radio access network (RAN) is very important. However, user mobility brings new challenges for optimal IS-RRA. This paper first proposes a soft and hard hybrid slicing framework where a common slice is introduced to realize a trade-off between isolation and spectrum efficiency (SE). To address the challenges posed by user mobility, we propose a two-step deep learning-based algorithm: joint long short-term memory (LSTM)-based network state prediction and deep Q network (DQN)-based slicing strategy. In the proposal, LSTM networks are employed to predict traffic demand and the location of each user in a slicing window level. Moreover, channel gain is mapped by location and a radio map. Then, the predicted channel gain and traffic demand are input to the DQN to output the precise slicing adjustment. Finally, experiment results confirm the effectiveness of our proposed slicing framework: the slices' SLA can be guaranteed well, and the proposed algorithm can achieve near-optimal performance in terms of the SLA satisfaction ratio, isolation degree and SE.

Preprint, 2021, arXiv

Predictive Wireless Based Status Update for Communication-Agnostic Sampling

In a wireless network that conveys status updates from sources (i.e., sensors) to destinations, one of the key issues studied by existing literature is how to design an optimal source sampling strategy on account of the communication constraints which are often modeled as queues. In this paper, an alternative perspective is presented: a novel status-aware communication scheme, namely parallel communications, is proposed, which allows sensors to be communication-agnostic. Specifically, the proposed scheme can determine, based on an online prediction functionality, whether a status packet is worth transmitting considering both the network condition and status prediction, such that sensors can generate status packets without communication constraints. We evaluate the proposed scheme on a Software-Defined-Radio (SDR) test platform, which is integrated with a collaborative autonomous driving simulator, i.e., Simulation-of-Urban-Mobility (SUMO), to produce realistic vehicle control models and road conditions. The results show that with online status predictions, the channel occupancy is significantly reduced, while guaranteeing low status recovery error. Then the framework is applied to two scenarios: a multi-density platooning scenario, and a flight formation control scenario. Simulation results show that the scheme achieves better performance on the network level, in terms of keeping the minimum safe distance in both vehicle platooning and flight control.

Preprint, 2020, arXiv

Age of Information Optimized MAC in V2X Sidelink via Piggyback-Based Collaboration

Real-time status update in future vehicular networks is vital to enable control-level cooperative autonomous driving. Cellular Vehicle-to-Everything (C-V2X), as one of the most promising vehicular wireless technologies, adopts a Semi-Persistent Scheduling (SPS) based Medium-Access-Control (MAC) layer protocol for its sidelink communications. Despite the recent and ongoing efforts to optimize SPS, very few works have considered the status update performance of SPS. In this paper, Age of Information (AoI) is first leveraged to evaluate the MAC layer performance of C-V2X sidelink. Critical issues of SPS, i.e., persistent packet collisions and Half-Duplex (HD) effects, are identified to hinder its AoI performance. Therefore, a piggyback-based collaboration method is proposed accordingly, whereby vehicles collaborate to inform each other of potential collisions and collectively afford HD errors, while entailing only a small signaling overhead. Closed-form AoI performance is derived for the proposed scheme, optimal configurations for key parameters are hence calculated, and the convergence property is proved for decentralized implementation. Simulation results show that compared with the standardized SPS and its state-of-the-art enhancement schemes, the proposed scheme shows significantly better performance, not only in terms of AoI, but also of conventional metrics such as transmission reliability.
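The abstract above uses Age of Information (AoI) as its metric. As a minimal sketch of the standard definition (the age at time t is t minus the generation time of the freshest update received so far), the following computes the time-average AoI of a single link. The function name, the input format and the `initial_age` parameter are assumptions for illustration; this is not the closed-form result derived in the paper.

```python
def average_aoi(updates, horizon, initial_age=0.0):
    """Time-average Age of Information over [0, horizon].

    updates: iterable of (generated_at, received_at) pairs.
    initial_age: assumed age of the receiver's information at time 0.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    area = 0.0
    t = 0.0
    newest_gen = -initial_age  # generation time of the freshest update held
    for gen, recv in sorted(updates, key=lambda u: u[1]):
        if recv > horizon:
            break
        if gen > recv:
            raise ValueError("an update cannot be received before it is generated")
        # Age grows linearly between receptions: integrate the trapezoid.
        area += 0.5 * ((t - newest_gen) + (recv - newest_gen)) * (recv - t)
        t = recv
        # A stale (out-of-order) update does not make the receiver any fresher.
        newest_gen = max(newest_gen, gen)
    # Tail segment from the last reception to the end of the horizon.
    area += 0.5 * ((t - newest_gen) + (horizon - newest_gen)) * (horizon - t)
    return area / horizon
```

For example, an update generated at time 1 and received at time 2, observed over a horizon of 4, gives an age that rises from 0 to 2, drops to 1 on reception, and rises to 3, for an average of 1.5.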

Preprint, 2020, arXiv

Deep Reinforcement Learning-Based Beam Tracking for Low-Latency Services in Vehicular Networks

Ultra-Reliable and Low-Latency Communications (URLLC) services in vehicular networks on millimeter-wave bands present a significant challenge, considering the necessity of constantly adjusting the beam directions. Conventional methods are mostly based on classical control theory, e.g., Kalman filter and its variations, which mainly deal with stationary scenarios. Therefore, severe application limitations exist, especially with complicated, dynamic Vehicle-to-Everything (V2X) channels. This paper gives a thorough study of this subject, by first modifying the classical approaches, e.g., Extended Kalman Filter (EKF) and Particle Filter (PF), for non-stationary scenarios, and then proposing a Reinforcement Learning (RL)-based approach that can achieve the URLLC requirements in a typical intersection scenario. Simulation results based on a commercial ray-tracing simulator show that the enhanced EKF and PF methods incur a packet delay of more than 10 ms, whereas the proposed deep RL-based method can reduce the latency to about 6 ms, by extracting context information from the training data.

Preprint, 2020, arXiv

Revealing Much While Saying Less: Predictive Wireless for Status Update

Wireless communications for status update are becoming increasingly important, especially for machine-type control applications. Existing work has been mainly focused on Age of Information (AoI) optimizations. In this paper, a status-aware predictive wireless interface design, networking and implementation are presented which aim to minimize the status recovery error of a wireless networked system by leveraging online status model predictions. Two critical issues of predictive status update are addressed: practicality and usefulness. Link-level experiments on a Software-Defined-Radio (SDR) testbed are conducted and test results show that the proposed design can significantly reduce the number of wireless transmissions while maintaining a low status recovery error. A Status-aware Multi-Agent Reinforcement learning neTworking solution (SMART) is proposed to dynamically and autonomously control the transmit decisions of devices in an ad hoc network based on their individual statuses. System-level simulations of a multi dense platooning scenario are carried out on a road traffic simulator. Results show that the proposed schemes can greatly improve the platooning control performance in terms of the minimum safe distance between successive vehicles, in comparison with the AoI-optimized status-unaware and communication latency-optimized schemes. This demonstrates the usefulness of our proposed status update schemes in a real-world application.
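The general idea of prediction-gated status updates described above can be sketched as follows: sender and receiver run the same predictor, and the sender transmits only when the prediction error exceeds a threshold. The linear extrapolation predictor, the threshold rule and all names here are assumptions chosen for illustration; they are not the interface design or the SMART solution from the paper.

```python
def run_predictive_updates(samples, threshold):
    """Decide which status samples to transmit.

    samples: list of (time, value) pairs with strictly increasing times.
    threshold: largest prediction error tolerated without transmitting.
    Returns (transmit_times, max_recovery_error_at_receiver).
    """
    if not samples:
        return [], 0.0
    t0, v0 = samples[0]
    sent = [t0]  # the first sample is always transmitted
    last_t, last_v, slope = t0, v0, 0.0
    max_err = 0.0
    for t, v in samples[1:]:
        # Both sides extrapolate linearly from the last transmitted sample.
        predicted = last_v + slope * (t - last_t)
        err = abs(v - predicted)
        if err > threshold:
            # Transmit: the receiver learns the true value and both sides
            # refresh the slope estimate from the last two transmitted samples.
            slope = (v - last_v) / (t - last_t)
            last_t, last_v = t, v
            sent.append(t)
            err = 0.0
        max_err = max(max_err, err)
    return sent, max_err
```

With a perfectly linear signal such as `[(t, 2 * t) for t in range(10)]`, only the first two samples are transmitted, because the receiver's extrapolation matches every later sample. By construction, the recovery error at the receiver never exceeds `threshold`.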

Preprint, 2014, arXiv

Achievable Rates of FDD Massive MIMO Systems with Spatial Channel Correlation

It is well known that the performance of frequency-division-duplex (FDD) massive MIMO systems with i.i.d. channels is disappointing compared with that of time-division-duplex (TDD) systems, due to the prohibitively large overhead for acquiring channel state information at the transmitter (CSIT). In this paper, we investigate the achievable rates of FDD massive MIMO systems with spatially correlated channels, considering the CSIT acquisition dimensionality loss, the imperfection of CSIT and the regularized-zero-forcing linear precoder. The achievable rates are optimized by judiciously designing the downlink channel training sequences and user CSIT feedback codebooks, exploiting the multiuser spatial channel correlation. We compare our achievable rates with TDD massive MIMO systems, i.i.d. FDD systems, and the joint spatial division and multiplexing (JSDM) scheme, by deriving the deterministic equivalents of the achievable rates, based on popular channel models. It is shown that, based on the proposed eigenspace channel estimation schemes, the rate-gap between FDD systems and TDD systems is significantly narrowed, and even closed with a moderate number of base station antennas. Compared to the JSDM scheme, our proposal achieves dimensionality-reduction channel estimation without channel pre-projection, and higher throughput for a moderate number of antennas and moderate to large channel coherence time, though at higher computational complexity.

Preprint, 2014, arXiv

Dynamic Channel Acquisition in MU-MIMO

Multiuser multiple-input-multiple-output (MU-MIMO) systems are known to be hindered by dimensionality loss due to channel state information (CSI) acquisition overhead. In this paper, we investigate user-scheduling in MU-MIMO systems on account of CSI acquisition overhead, where a base station dynamically acquires user channels to avoid choking the system with CSI overhead. The genie-aided optimization problem (GAP) is first formulated to maximize the Lyapunov-drift every scheduling step, incorporating user queue information and taking channel fluctuations into consideration. The scheduling scheme based on GAP, namely the GAP-rule, is proved to be throughput-optimal but practically infeasible, and thus serves as a performance bound. In view of the implementation overhead and delay unfairness of the GAP-rule, the T-frame dynamic channel acquisition scheme and the power-law DCA scheme are further proposed to mitigate the implementation overhead and delay unfairness, respectively. Both schemes are based on the GAP-rule and proved throughput-optimal. To make the schemes practically feasible, we then propose the heuristic schemes, queue-based quantized-block-length user scheduling scheme (QQS), T-frame QQS, and power-law QQS, which are the practical versions of the aforementioned GAP-based schemes, respectively. The QQS-based schemes substantially decrease the complexity, and also perform fairly close to the optimum. Numerical results evaluate the proposed schemes under various system parameters.
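The trade-off described above, where serving more users costs more CSI acquisition overhead, can be illustrated with a toy queue-weighted scheduler. This sketch assumes each acquired user consumes a fixed fraction `overhead_per_user` of the frame and that per-user rates are known in advance; the objective, the names and the model are hypothetical simplifications and do not reproduce the GAP-rule or the QQS schemes.

```python
def max_weight_with_overhead(queues, rates, overhead_per_user):
    """Pick the user set maximizing (1 - k * overhead) * sum(queue * rate).

    queues: backlog of each user.
    rates: achievable rate of each user (assumed known, a toy simplification).
    overhead_per_user: fraction of the frame spent acquiring one user's channel.
    Returns (selected_user_indices, objective_value).
    """
    # For a fixed number of users k, the k largest queue-rate products are best,
    # so it suffices to scan prefixes of the users sorted by that product.
    order = sorted(range(len(queues)),
                   key=lambda i: queues[i] * rates[i], reverse=True)
    best_users, best_value = [], 0.0
    running = 0.0
    for k, i in enumerate(order, start=1):
        usable = 1.0 - k * overhead_per_user  # frame fraction left for data
        if usable <= 0:
            break
        running += queues[i] * rates[i]
        value = usable * running
        if value > best_value:
            best_users, best_value = order[:k], value
    return best_users, best_value
```

With queues `[10, 1, 5]`, unit rates and an overhead of 0.3 per user, only user 0 is scheduled. With an overhead of 0.05, all three users are scheduled, which shows how the acquisition cost limits the number of users worth serving.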
