Researcher profile

Yuhang Wang

Yuhang Wang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
13 works
0 followers
14 topics
4 close collaborators

Published work

13 published items

preprint · 2026 · arXiv

A Closed-loop, State-centric, Multi-agent Framework for Passenger Load Estimation from Heterogeneous Data Streams

To support operations and passenger-facing services, transit agencies need reliable passenger load trajectories. Currently, load estimates are typically inferred from imperfect sensing systems rather than fully observed, and the accuracy of modern automatic passenger counting (APC) systems still varies with station layout, flow intensity, and operating conditions. To address the challenges of robust passenger load estimation from heterogeneous data streams, including incremental count errors, evidence conflicts, and context-dependent sensor reliability, we propose a closed-loop, state-centric, multi-agent framework. This method enforces physical feasibility at every step, allocates trust dynamically among evidence sources, and feeds physics-derived violation residuals back into training for robustness improvement. The architecture consists of a unified stop-event backbone, a coupled Perception–Physical–Fusion loop for stop-by-stop inference, and optional trip-level macro-correction and closed-loop calibration modules.
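As a rough illustration of what "physical feasibility at every step" could mean for a load trajectory, the sketch below propagates the on-board load stop by stop and clamps it to physically possible values. It is not the framework from the abstract: the function name `feasible_loads`, the clamping rule, and the single fixed `capacity` are all assumptions made for this example.

```python
from typing import List, Sequence


def feasible_loads(boardings: Sequence[int],
                   alightings: Sequence[int],
                   capacity: int) -> List[int]:
    """Propagate on-board load stop by stop, keeping it within [0, capacity].

    Hypothetical helper: the clamping rule is an assumption for illustration.
    """
    if len(boardings) != len(alightings):
        raise ValueError("boardings and alightings must have the same length")
    loads: List[int] = []
    load = 0
    for on, off in zip(boardings, alightings):
        # Passengers cannot alight unless they are on board.
        off = min(off, load)
        # Conservation: new load = old load - alighting + boarding.
        load = load - off + on
        # The vehicle cannot carry more than its capacity.
        load = min(load, capacity)
        loads.append(load)
    return loads
```

For example, `feasible_loads([10, 5, 0], [0, 12, 8], capacity=12)` returns `[10, 5, 0]`: the raw counts at the second and third stops would drive the load negative, so the alighting counts are capped at the current load.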

preprint · 2026 · arXiv

EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation

Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. We find that existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem, where correct final answers are derived from hallucinated, redundant, or logically invalid intermediate steps. This paper proposes EntroCoT, a unified framework for automatically identifying and refining low-quality CoT supervision traces. EntroCoT first uses an entropy-based mechanism to segment the reasoning trace into multiple steps at uncertain junctures, and then introduces a Monte Carlo rollout-based mechanism to evaluate the marginal contribution of each step. By accurately filtering deceptive reasoning samples, EntroCoT constructs a high-quality dataset where every intermediate step in each reasoning trace facilitates the final answer. Extensive experiments on mathematical benchmarks demonstrate that fine-tuning on the subset constructed by EntroCoT consistently outperforms the baselines of full-dataset supervision.
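The idea of segmenting a reasoning trace at uncertain junctures can be sketched with per-token Shannon entropy. This is a minimal illustration, not EntroCoT's actual procedure: the fixed `threshold`, the function names, and the rule "cut before any high-entropy token" are assumptions, and the next-token distributions are supplied by hand rather than taken from a model.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def segment_by_entropy(tokens: Sequence[str],
                       dists: Sequence[Sequence[float]],
                       threshold: float) -> List[List[str]]:
    """Start a new segment before each token whose distribution is high-entropy.

    Hypothetical rule: a fixed threshold stands in for whatever adaptive
    criterion a real system would use.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    for tok, dist in zip(tokens, dists):
        # Only cut when there is already something in the current segment,
        # so the first token never produces an empty segment.
        if current and token_entropy(dist) > threshold:
            segments.append(current)
            current = []
        current.append(tok)
    if current:
        segments.append(current)
    return segments
```

With tokens `["a", "b", "c", "d"]`, distributions `[[1.0], [0.5, 0.5], [1.0], [0.9, 0.1]]`, and a threshold of 0.5 nats, only the second token exceeds the threshold, so the trace splits into `[["a"], ["b", "c", "d"]]`.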

preprint · 2026 · arXiv

Hybrid Distillation with CoT Guidance for Edge-Drone Control Code Generation

With large language models demonstrating significant potential in code generation tasks, their application to onboard control of resource-constrained Unmanned Aerial Vehicles (UAVs) has emerged as an important research direction. However, a notable contradiction exists between the high resource consumption of large models and the real-time, lightweight requirements of UAV platforms. This paper proposes an integrated approach that combines knowledge distillation, chain-of-thought guidance, and supervised fine-tuning for UAV multi-SDK control tasks, aiming to efficiently transfer complex reasoning and code generation capabilities to smaller models. First, a high-quality dataset covering various mainstream UAV SDKs is constructed; it features instruction-code-reasoning chains and incorporates counterfactual negative samples for data augmentation, guiding the model to learn the end-to-end logic from instruction parsing to code generation. Second, using DeepSeek-Coder-V2-Lite quantized via QLoRA as the teacher model, and based on a hybrid black-box and white-box distillation strategy, high-quality chain-of-thought soft labels are generated. These are combined with a weighted cross-entropy loss using hard labels to transfer complex reasoning capabilities to the smaller student model. Finally, through prompt tuning engineering optimized for the UAV control scenario, the model performance on core tasks such as SDK type recognition and function call matching is enhanced. Experimental results indicate that the distilled lightweight model maintains high code generation accuracy while achieving significant improvements in deployment and inference efficiency, effectively demonstrating the feasibility and superiority of our approach in achieving precise and lightweight intelligent control for UAVs.
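Combining teacher soft labels with a hard-label cross-entropy term is commonly written as a weighted sum of a temperature-scaled KL divergence and a standard cross-entropy. The sketch below shows that textbook formulation for a single prediction; it is an assumption that the paper's loss resembles it, and `alpha`, `temperature`, and the function names are illustrative choices, not values from the abstract.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      hard_label: int,
                      alpha: float = 0.5,
                      temperature: float = 2.0) -> float:
    """Weighted sum of a soft-label KL term and a hard-label cross-entropy.

    Hypothetical single-example loss; alpha weights the soft-label term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    # Cross-entropy of the student against the ground-truth class index.
    ce = -math.log(softmax(student_logits)[hard_label])
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce
```

When student and teacher logits are identical, the KL term vanishes and only the weighted hard-label term remains, which gives a quick sanity check for the implementation.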

preprint · 2026 · arXiv

ICU-Bench: Benchmarking Continual Unlearning in Multimodal Large Language Models

Although Multimodal Large Language Models (MLLMs) have achieved remarkable progress across many domains, their training on large-scale multimodal datasets raises serious privacy concerns, making effective machine unlearning increasingly necessary. However, existing benchmarks mainly focus on static or short-sequence settings, offering limited support for evaluating continual privacy deletion requests in realistic deployments. To bridge this gap, we introduce ICU-Bench, a continual multimodal unlearning benchmark built on privacy-critical document data. ICU-Bench contains 1,000 privacy-sensitive profiles from two document domains, medical reports and labor contracts, with 9,500 images, 16,000 question-answer pairs, and 100 forget tasks. Additionally, new continual unlearning metrics are introduced, facilitating a comprehensive analysis of forgetting effectiveness, historical forgetting preservation, retained utility, and stability throughout the continual unlearning process. Through extensive experiments with representative unlearning methods on ICU-Bench, we show that existing methods generally struggle in continual settings and exhibit clear limitations in balancing forgetting quality, utility preservation, and scalability over long task sequences. These findings highlight the need for multimodal unlearning methods explicitly designed for continual privacy deletion.

preprint · 2026 · arXiv

Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning

The core challenge of machine unlearning is to strike a balance between target knowledge removal and non-target knowledge retention. In the context of Multimodal Large Language Models (MLLMs), this challenge becomes even more pronounced, as knowledge is further divided into visual and textual modalities that are tightly intertwined. In this paper, we introduce an MLLM unlearning approach that aims to forget target visual knowledge while preserving non-target visual knowledge and all textual knowledge. Specifically, we freeze the LLM backbone and achieve unlearning by fine-tuning the visual module. First, we propose a Contrastive Visual Forgetting (CVF) mechanism to separate target visual knowledge from retained visual knowledge, guiding the representations of target visual concepts toward appropriate regions in the feature space. Second, we identify the null space associated with retained knowledge and constrain the unlearning process within this space, thereby significantly mitigating degradation in knowledge retention. Third, beyond static unlearning scenarios, we extend our approach to continual unlearning, where forgetting requests arrive sequentially. Extensive experiments across diverse benchmarks demonstrate that our approach achieves a strong balance between effective forgetting and robust knowledge retention.

preprint · 2026 · arXiv

Out-of-Distribution Semantic Occupancy Prediction

3D semantic occupancy prediction is crucial for autonomous driving, providing a dense, semantically rich environmental representation. However, existing methods focus on in-distribution scenes, making them susceptible to Out-of-Distribution (OoD) objects and long-tail distributions, which increases the risk of undetected anomalies and misinterpretations, posing safety hazards. To address these challenges, we introduce Out-of-Distribution Semantic Occupancy Prediction, targeting OoD detection in 3D voxel space. To fill dataset gaps, we propose a Realistic Anomaly Augmentation that injects synthetic anomalies while preserving realistic spatial and occlusion patterns, enabling the creation of two datasets: VAA-KITTI and VAA-KITTI-360. Then, a novel framework that integrates OoD detection into 3D semantic occupancy prediction, OccOoD, is proposed, which uses Cross-Space Semantic Refinement (CSSR) to refine semantic predictions from complementary voxel and BEV representations, improving OoD detection. Experimental results demonstrate that OccOoD achieves state-of-the-art OoD detection with an AuROC of 65.50% and an AuPRCr of 31.83 within a 1.2m region, while maintaining competitive semantic occupancy prediction performance and generalization in real-world urban driving scenes. The established datasets and source code will be made publicly available at https://github.com/7uHeng/OccOoD.

preprint · 2026 · arXiv

Self-Guided Defense: Adaptive Safety Alignment for Reasoning Models via Synthesized Guidelines

Reasoning models have demonstrated remarkable capabilities in complex reasoning tasks. However, ensuring their safety against adversarial jailbreak prompts remains a critical challenge. Because such prompts are covert and deceptive, they can often evade built-in safety mechanisms and lead to the generation of harmful content. This underscores the need for an adaptive safety alignment approach that enables models to autonomously reinforce their defenses in response to adversarial inputs. This paper introduces the Synthesized Guideline-based Adaptive Safety Alignment (SGASA) framework, which internalizes model-generated safety guidelines to strengthen models' robustness against harmful adversarial prompts while minimizing unnecessary refusals of benign requests. SGASA consists of two key stages: Data Pre-synthesis, which generates safety guidelines and augmented prompts; and Alignment Fine-tuning, which leverages Supervised Fine-tuning (SFT) and Direct Preference Optimization (DPO) to embed these guidelines into the model. Extensive experiments across multiple datasets demonstrate that SGASA significantly improves model safety, validating its adaptive and scalable effectiveness.

preprint · 2026 · arXiv

Unified Modeling of Lane and Lane Topology for Driving Scene Reasoning

Autonomous vehicles need to perceive not only physical elements in the driving scene, such as lane lines and traffic lights, but also logical elements like lane centerlines and their topology. Existing lane topology reasoning methods typically follow a reasoning-by-detection paradigm, where lane topological relationships are primarily derived from lane detection results. In this paper, we propose an innovative method called Unified Modeling of Lane and Lane Topology (UniTopo), which represents the topological relationships between lanes as connected lanes, encompassing predecessor lanes, successor lanes, and their interconnections. This unified representation of lanes and lane topology allows us to simultaneously obtain both the positions and topological information of lanes within a shared perception pipeline, establishing a new paradigm for directly perceiving lane topology from original image features. We validate our method on the driving scene reasoning benchmark OpenLane-V2, which consists of two subsets, built based on Argoverse2 and nuScenes, respectively. Our method achieves TOP_ll of 30.1% and 31.8% on the two subsets, significantly surpassing the existing state-of-the-art method T^2SG by 6.0% and 8.6%.

preprint · 2023 · arXiv

Simulation of CO2 Storage using a Parameterization Method for Essential Trapping Physics: FluidFlower Benchmark Study

An efficient compositional framework is developed for simulation of CO2 storage in saline aquifers over the full cycle of injection, migration, and post-migration processes. Essential trapping mechanisms, including structural, dissolution, and residual trapping, which operate at different time scales, are accurately captured in the presented unified framework. In particular, a parameterization method is proposed to efficiently describe the relevant physical processes. The proposed framework is validated by comparing the dynamics of gravity-induced convective transport with that reported in the literature. Results show good agreement for both the characteristics of descending fingers and the associated dissolution rate. The developed simulator is then applied to study the FluidFlower benchmark model. An experimental setup with heterogeneous geological layers is discretized into a two-dimensional computational domain where numerical simulation is performed. The impacts of hysteresis and of CO2 diffusion in the liquid phase on the migration and trapping of the CO2 plume are investigated. Inclusion of the hysteresis effect does not affect plume migration in this benchmark model, whereas diffusion plays an important role in promoting convective mixing. This work offers a promising approach to predict the migration of the CO2 plume and to assess the amount of trapping from different mechanisms for long-term CO2 storage.

preprint · 2022 · arXiv

Detecting fake news by enhanced text representation with multi-EDU-structure awareness

Because fake news poses a serious threat to society and individuals, numerous detection studies have drawn on text, propagation, and user profiles. Owing to the difficulty of data collection, methods based on propagation and user profiles are less applicable in the early stages. A good alternative is to detect fake news from its text as soon as it is released, and many text-based methods have been proposed, usually with words, sentences, or paragraphs as basic units. However, a word is too fine-grained a unit to express coherent information well, while a sentence or paragraph is too coarse to show specific information. Which granularity is better, and how to use it to enhance text representation for fake news detection, are two key problems. In this paper, we introduce the Elementary Discourse Unit (EDU), whose granularity lies between word and sentence, and propose a multi-EDU-structure awareness model, EDU4FD, to improve text representation for fake news detection. For the multi-EDU-structure awareness, we build sequence-based and graph-based EDU representations. The former is obtained by modeling the coherence between consecutive EDUs with TextCNN, which reflects semantic coherence. For the latter, we first extract rhetorical relations to build the EDU dependency graph, which can show the global narrative logic and help deliver the main idea truthfully. Then a Relation Graph Attention Network (RGAT) is used to obtain the graph-based EDU representation. Finally, the two EDU representations are combined into the enhanced text representation for fake news detection, using a gated recursive unit combined with a global attention mechanism. Experiments on four cross-source fake news datasets show that our model outperforms the state-of-the-art text-based methods.

preprint · 2022 · arXiv

Three-dimensional discontinuous Galerkin based high-order gas-kinetic scheme and GPU implementation

In this paper, discontinuous Galerkin based high-order gas-kinetic schemes (DG-HGKS) are developed for the three-dimensional Euler and Navier-Stokes equations. Unlike traditional discontinuous Galerkin (DG) methods with Riemann solvers, the current method adopts a kinetic evolution process, which is provided by the integral solution of the Bhatnagar-Gross-Krook (BGK) model. In the weak formulation of the DG method, a time-dependent evolution function is provided, and both inviscid and viscous fluxes can be calculated uniformly. Temporal accuracy is achieved by a two-stage fourth-order discretization, and a second-order gas-kinetic solver is adopted for the fluxes over the cell interface and the fluxes inside a cell. Numerical examples, including accuracy tests and the Taylor-Green vortex problem, are presented to validate the efficiency and accuracy of DG-HGKS. Both optimal convergence and super-convergence are achieved by the current scheme. A comparison between DG-HGKS and the high-order gas-kinetic scheme with weighted essentially non-oscillatory reconstruction (WENO-HGKS) is also given, and the numerical performance is comparable for approximately the same number of degrees of freedom. To accelerate the computation, DG-HGKS is implemented on the graphics processing unit (GPU) using the compute unified device architecture (CUDA). The results are also compared with those of the central processing unit (CPU) code in terms of computational efficiency. The speedup of the GPU code suggests the potential of high-order gas-kinetic schemes for large-scale computation.

preprint · 2021 · arXiv

Path-specific Underwater Acoustic Channel Tracking and its Application in Passive Time Reversal Mirror

In this work, we consider the underwater acoustic channel, which is time-variant and doubly spread. Since conventional channel estimation and the decision feedback equalizer (DFE) cannot work well for this type of channel, we propose path-specific underwater acoustic channel tracking based on the Kalman filter framework. We provide a simplified sound propagation model as the state transition model and propose a multipath tracker that tolerates model mismatch. We can then obtain the time-variant path number and path-specific parameters such as delay and Doppler scaling factor. We also consider an application of the proposed channel tracking: we propose two types of passive time reversal mirror (PTRM) that use our path-specific parameters for time-variant and doubly spread underwater acoustic channels. With the path-specific parameters obtained by the proposed channel tracking, the proposed PTRM can match not only the time dispersion, as conventional PTRM does, but also the doubly spread channel, since the path-specific delay and Doppler scaling factor help to match the channel in both the time and frequency domains. For extensively doubly spread channels, we can further apply path-specific compensation to the PTRM. Both simulations and experimental results using data from the 2016 Qiandao Lake experiment show the efficiency of the proposed path-specific channel tracking and the proposed PTRMs with path-specific parameters.
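For readers unfamiliar with the Kalman filter framework mentioned in this abstract, a scalar predict/update loop is sketched below. It tracks one slowly varying quantity (for instance, a single path delay) under a random-walk state model. That model is an assumption made for brevity and is not the simplified sound propagation model the abstract refers to; the function name and the noise parameters `q` and `r` are likewise illustrative.

```python
from typing import List, Sequence


def kalman_track(measurements: Sequence[float],
                 q: float,
                 r: float,
                 x0: float = 0.0,
                 p0: float = 1.0) -> List[float]:
    """Scalar Kalman filter with a random-walk state model.

    q: process noise variance, r: measurement noise variance,
    x0, p0: initial state estimate and its variance (all hypothetical).
    """
    x, p = x0, p0
    estimates: List[float] = []
    for z in measurements:
        # Predict: the state is assumed to persist, so only uncertainty grows.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

A larger `q` makes the tracker follow fast changes more closely at the cost of noisier estimates, while a larger `r` makes it trust the measurements less.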

preprint · 2020 · arXiv

Novel algorithms and high-performance cloud computing enable efficient fully quantum mechanical protein-ligand scoring

Ranking the binding of small molecules to protein receptors through physics-based computation remains challenging. Though inroads have been made using free energy methods, these fail when the underlying classical mechanical force fields are insufficient. In principle, a more accurate approach is provided by quantum mechanical density functional theory (DFT) scoring, but even with approximations, this has yet to become practical on drug discovery-relevant timescales and resources. Here, we describe how to overcome this barrier using algorithms for DFT calculations that scale on widely available cloud architectures, enabling full density functional theory, without approximations, to be applied to protein-ligand complexes with approximately 2500 atoms in tens of minutes. Applying this to a realistic example of 22 ligands binding to MCL1 reveals that density functional scoring outperforms classical free energy perturbation theory for this system. This raises the possibility of broadly applying fully quantum mechanical scoring to real-world drug discovery pipelines.