Researcher profile

Wenhao Zhang

Wenhao Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
8 topics
4 close collaborators

Published work

14 published items

Preprint, 2026, arXiv

Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brightness changes, providing high dynamic range and high temporal resolution. However, existing event-based trackers often neglect the intrinsic spatial sparsity and temporal density of event data, while relying on a single fixed temporal-window sampling strategy that is suboptimal under varying motion dynamics. In this paper, we propose an event sparsity-aware tracking framework that explicitly models event-density variations across multiple temporal scales. Specifically, the proposed framework progressively injects sparse, medium-density, and dense event search regions into a three-stage Vision Transformer backbone, enabling hierarchical multi-density feature learning. Furthermore, we introduce a sparsity-aware Mixture-of-Experts module to encourage expert specialization under different sparsity patterns, and design a dynamic pondering strategy to adaptively adjust the inference depth according to tracking difficulty. Extensive experiments on FE240hz, COESOT, and EventVOT demonstrate that the proposed approach achieves a favorable trade-off between tracking accuracy and computational efficiency. The source code will be released on https://github.com/Event-AHU/OpenEvTracking.

Preprint, 2026, arXiv

Imaging Intermediate Melting Phases of Dual Magnetic-Field-Stabilized Wigner Crystals

The competition between Coulomb repulsion and kinetic energy in correlated systems can allow electrons to crystallize into Wigner solids. Despite research across diverse two-dimensional Wigner platforms, the microscopic melting processes through possible intermediate phases remain largely unknown. Here, we present the visualization of electron-lattice melting in monolayer VCl3 on graphite, where two Wigner crystals coexist with markedly different critical temperatures Tc and lattice periods, as stabilized by a high magnetic field. One Wigner crystal possesses both record-high Tc and electron density, and undergoes melting through an intermediate nematic phase upon decreasing magnetic field. In contrast, the other Wigner crystal, with a lower Tc, yields a different intermediate phase during melting, exhibiting an anomalous electron liquid with an energy-independent modulation period. First-principles calculations corroborate the band-selective occupations of interface-transferred electrons in the formation of dual Wigner crystals. Our atomically resolved intermediate phases provide crucial insights into the microscopic melting pathways of Wigner crystals, enabling a phase diagram parameterized by both quantum and thermal fluctuations.

Preprint, 2026, arXiv

Spectral Visualization of Excitonic Pair Breaking at Individual Impurities in Ta2Pd3Te5

Excitonic insulators host condensates of bound electron-hole pairs, offering a platform for studying correlated bosonic quantum states. Yet, how macroscopic coherence emerges from locally collapsed pairing remains elusive. Here, using scanning tunnelling spectroscopy, we report impurity-induced pair breaking in the excitonic insulator Ta2Pd3Te5. Individual Te vacancies are found to generate a pair of spectral peaks within the excitonic gap. Their energies depend sensitively on the defect configurations and are continuously tunable by the tip electric field, indicating controllable impurity scattering. Spectral mapping shows spatially anisotropic and electronically coupled electron-hole components of the subgap states. These observations, together with mean-field modelling, suggest an excitonic pair-breaking origin. In the strongly electron-hole imbalanced region, a secondary pair-breaking effect, manifesting as an additional pair of subgap states with distinctly lower energies, can emerge, revealing the interplay of pair breaking with different excitonic order parameters. Our findings demonstrate the spectroscopic 'fingerprint' of local excitonic depairing at the atomic level, offering a crucial clue to the critical behavior across excitonic condensation.

Preprint, 2026, arXiv

VideoRouter: Query-Adaptive Dual Routing for Efficient Long-Video Understanding

Video large multimodal models increasingly face a scalability bottleneck: long videos produce excessively long visual-token sequences, which sharply increase memory and latency during inference. While existing compression methods are effective in specific settings, most are either weakly query-aware or apply a fixed compression policy across frames, proving suboptimal when visual evidence is unevenly distributed over time. To address this, we present VideoRouter, a query-adaptive dual-router framework built on InternVL for budgeted evidence allocation. The Semantic Router predicts the dominant allocation policy, choosing between broad temporal coverage and adaptive high-resolution preservation, while the Image Router uses early LLM layers to score frame relevance. This enables aggressive compression on less relevant frames while preserving detail on critical evidence frames. To train both routers, we build Video-QTR-10K for allocation-policy supervision and Video-FLR-200K for frame-relevance supervision. Experiments on VideoMME, MLVU, and LongVideoBench show that VideoRouter consistently improves over the InternVL baseline under comparable or lower budgets, achieving up to a 67.9% token reduction.

Preprint, 2025, arXiv

VideoCuRL: Video Curriculum Reinforcement Learning with Orthogonal Difficulty Decomposition

Reinforcement Learning (RL) is crucial for empowering VideoLLMs with complex spatiotemporal reasoning. However, current RL paradigms predominantly rely on random data shuffling or naive curriculum strategies based on scalar difficulty metrics. We argue that scalar metrics fail to disentangle two orthogonal challenges in video understanding: Visual Temporal Perception Load and Cognitive Reasoning Depth. To address this, we propose VideoCuRL, a novel framework that decomposes difficulty into these two axes. We employ efficient, training-free proxies (optical flow and keyframe entropy for visual complexity, Calibrated Surprisal for cognitive complexity) to map data onto a 2D curriculum grid. A competence-aware Diagonal Wavefront strategy then schedules training from base alignment to complex reasoning. Furthermore, we introduce Dynamic Sparse KL and Structured Revisiting to stabilize training against reward collapse and catastrophic forgetting. Extensive experiments show that VideoCuRL surpasses strong RL baselines on reasoning (+2.5 on VSI-Bench) and perception (+2.9 on VideoMME) tasks. Notably, VideoCuRL eliminates the prohibitive inference overhead of generation-based curricula, offering a scalable solution for robust video post-training.
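One way to picture a diagonal schedule over a 2D curriculum grid is the rough sketch below. It only illustrates visiting grid cells in anti-diagonal order, from the easiest corner to the hardest. The abstract does not describe the actual schedule or its competence gating, so the function name `diagonal_wavefront`, the bin counts, and the ordering rule are all assumptions made for illustration.

```python
def diagonal_wavefront(n_visual, n_cognitive):
    """Group grid cells (visual_bin, cognitive_bin) into waves by anti-diagonal.

    Hypothetical illustration: wave d holds every cell whose two difficulty
    indices sum to d, so easy-easy cells come first and hard-hard cells last.
    """
    waves = []
    for d in range(n_visual + n_cognitive - 1):
        wave = [(v, d - v) for v in range(n_visual) if 0 <= d - v < n_cognitive]
        waves.append(wave)
    return waves


# Toy 3x3 grid (bin counts are assumed, not taken from the paper).
for step, cells in enumerate(diagonal_wavefront(3, 3)):
    print(step, cells)
```

A real scheduler would presumably decide when to advance to the next wave based on measured model competence; that part is not sketched here.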

Preprint, 2022, arXiv

Causal Inference in medicine and in health policy, a summary

A data science task can be viewed as making sense of the data or testing a hypothesis about it. The conclusions inferred from data can greatly guide us to make informed decisions. Big data has enabled us to carry out countless prediction tasks in conjunction with machine learning, such as identifying high-risk patients suffering from a certain disease and taking preventive measures. However, healthcare practitioners are not content with mere predictions; they are also interested in the cause-effect relation between input features and clinical outcomes. Understanding such relations will help doctors treat patients and reduce risk effectively. Causality is typically identified by randomized controlled trials. Often such trials are not feasible, so scientists and researchers turn to observational studies and attempt to draw inferences. However, observational studies may also be affected by selection and/or confounding biases that can result in wrong causal conclusions. In this chapter, we will try to highlight some of the drawbacks that may arise when traditional machine learning and statistical approaches are used to analyze observational data, particularly in the healthcare data analytics domain. We will discuss causal inference and ways to discover cause-effect relations from observational studies in the healthcare domain. Moreover, we will demonstrate the applications of causal inference in tackling some common machine learning issues such as missing data and model transportability. Finally, we will discuss the possibility of integrating reinforcement learning with causality as a way to counter confounding bias.

Preprint, 2022, arXiv

ECG Heartbeat classification using deep transfer learning with Convolutional Neural Network and STFT technique

Electrocardiogram (ECG) is a simple non-invasive measure to identify heart-related issues such as irregular heartbeats known as arrhythmias. As artificial intelligence and machine learning are being utilized in a wide range of healthcare-related applications and datasets, many arrhythmia classifiers using deep learning methods have been proposed in recent years. However, the available datasets from which to build and assess machine learning models are often very small, and the lack of well-annotated public ECG datasets is evident. In this paper, we propose a deep transfer learning framework that aims to perform classification on a small training dataset. The proposed method fine-tunes a general-purpose image classifier, ResNet-18, on the MIT-BIH arrhythmia dataset in accordance with the AAMI EC57 standard. This paper further investigates many existing deep learning models that have failed to avoid data leakage, contrary to AAMI recommendations. We compare how different data split methods impact the model performance. This comparison study implies that future work in arrhythmia classification should follow the AAMI EC57 standard when using any dataset, including the MIT-BIH arrhythmia dataset.
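The effect of the data split can be illustrated with a small sketch. It assumes that "data leakage" here means heartbeats from the same recording ending up in both the training and test sets; the abstract does not spell this out, and the record IDs, labels, and function names below are hypothetical. It is not the paper's split procedure.

```python
def split_by_beat(beats, every_nth=3):
    """Beat-wise split: every n-th beat goes to test, regardless of its record."""
    test = beats[::every_nth]
    train = [b for i, b in enumerate(beats) if i % every_nth != 0]
    return train, test


def split_by_record(beats, test_records):
    """Record-wise split: all beats of a held-out record go to test."""
    held_out = set(test_records)
    train = [b for b in beats if b[0] not in held_out]
    test = [b for b in beats if b[0] in held_out]
    return train, test


def shares_records(train, test):
    """True if any record contributes beats to both sets (possible leakage)."""
    return bool({rec for rec, _ in train} & {rec for rec, _ in test})


# Toy data: (record_id, beat_label) pairs with made-up IDs and labels.
beats = [(rec, label)
         for rec in ("rec_a", "rec_b", "rec_c")
         for label in ("N", "N", "V", "N")]

beat_train, beat_test = split_by_beat(beats)
rec_train, rec_test = split_by_record(beats, ["rec_c"])
print(shares_records(beat_train, beat_test))  # True
print(shares_records(rec_train, rec_test))    # False
```

With the beat-wise split, a model can score well simply by recognising a patient's waveform shape, which is why the two split styles can report very different accuracy.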

Preprint, 2022, arXiv

Effect of Stacking Order on the Electronic State of 1T-TaS$_2$

New theoretical proposals and experimental findings on the transition metal dichalcogenide 1T-TaS$_2$ have revived interest in its possible Mott insulating state. We perform a comprehensive scanning tunneling microscopy and spectroscopy experiment on different single-step areas in pristine 1T-TaS$_2$. After accurately determining the relative displacement of the Star-of-David super-lattices in two layers, we find that different stacking orders can correspond to a similar large-gap spectrum on the upper terrace. When the measurement is performed away from the step edge, the large-gap spectrum is always maintained. The stacking order seems to rarely disturb the large-gap spectrum in the ideal bulk material. We conclude that the large insulating gap is a single-layer property, namely a correlation-induced Mott gap based on the single-band Hubbard model. Specific stacking orders can perturb the state and induce a small-gap or metallic spectrum in a limited area around the step edge, which we attribute to a surface and edge phenomenon. Our work provides more evidence about the surface electronic state and deepens our understanding of the Mott insulating state in 1T-TaS$_2$.

Preprint, 2022, arXiv

Range of Motion Sensors for Monitoring Recovery of Total Knee Arthroplasty

A low-cost, accurate device to measure and record knee range of motion (ROM) is essential to improve confidence in at-home rehabilitation and to reduce hospital stay duration and overall medical cost after Total Knee Arthroplasty (TKA) procedures. The shift in Medicare funding from pay-as-you-go to the Bundled Payments for Care Improvement (BPCI) has created a push towards at-home care over extended hospital stays. It has heavily affected TKA patients, who typically undergo physical therapy at the clinic after the procedure to ensure full recovery of ROM. In this paper, we use accelerometers to create a ROM sensor that can be integrated into the post-operative surgical dressing, so that the cost of the sensors can be included in the bundled payments. We demonstrate the efficacy of our method in comparison to the baseline computer vision method. Our results suggest that calculating angular displacement from accelerometer sensors yields accurate ROM recordings under both stationary and walking conditions. The device would keep track of angle measurements and alert the patient when certain angle thresholds have been crossed, allowing patients to recover safely at home instead of going to multiple physical therapy sessions. The affordability of our sensor makes it more accessible to patients in need.
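As a rough illustration of computing an angle from accelerometer readings, the sketch below takes the angle between the gravity vectors reported by two 3-axis accelerometers, assumed here to sit on the thigh and the shank. The abstract does not give the sensor placement or the computation, so this setup, the function names, and the threshold value are assumptions. The sketch also only holds when the limb is close to stationary, since motion adds non-gravity acceleration.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two 3D acceleration vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


def crossed_threshold(angle_deg, threshold_deg):
    """True when the measured angle reaches a (hypothetical) alert threshold."""
    return angle_deg >= threshold_deg


# Made-up readings in units of g: thigh sensor upright, shank sensor rotated.
thigh = (0.0, 0.0, 1.0)
shank = (0.0, 1.0, 0.0)
angle = angle_between_deg(thigh, shank)
print(round(angle, 1))                 # 90.0
print(crossed_threshold(angle, 80.0))  # True
```

Handling walking conditions, as the abstract reports, would need additional filtering or sensor fusion that is not shown here.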

Preprint, 2022, arXiv

Reconcile the Bulk Metallic and Surface Insulating state in 1T-TaSe$_2$

The transition metal dichalcogenides 1T-TaS$_2$ and 1T-TaSe$_2$ have been extensively studied for their complicated correlated electronic properties. The origin of different surface electronic states remains controversial. We apply scanning tunneling microscopy and spectroscopy to restudy the surface electronic state of bulk 1T-TaSe$_2$. Both insulating and metallic states are identified in different areas of the same sample. The insulating state is similar to that in 1T-TaS$_2$, concerning both the dI/dV spectrum and the orbital texture. With further investigations in single-step areas, the discrepancy of electronic states is found to be associated with different stacking orders. The insulating state is most likely a single-layer property, modulated to a metallic state in some particular stacking orders. Both the metallic and large-gap insulating spectra, together with their corresponding stacking orders, are dominant in 1T-TaSe$_2$. The connected metallic areas lead to the metallic transport behavior. We thus reconcile the bulk metallic and surface insulating states in 1T-TaSe$_2$. The rich phenomena in 1T-TaSe$_2$ deepen our understanding of the correlated electronic state in bulk 1T-TaSe$_2$ and 1T-TaS$_2$.

Preprint, 2020, arXiv

AIM 2020 Challenge on Video Extreme Super-Resolution: Methods and Results

This paper reviews the video extreme super-resolution challenge associated with the AIM 2020 workshop at ECCV 2020. Common scaling factors for learned video super-resolution (VSR) do not go beyond factor 4. Missing information can be restored well in this region, especially in HR videos, where the high-frequency content mostly consists of texture details. The task in this challenge is to upscale videos with an extreme factor of 16, which results in more serious degradations that also affect the structural integrity of the videos. A single pixel in the low-resolution (LR) domain corresponds to 256 pixels in the high-resolution (HR) domain. Due to this massive information loss, it is hard to accurately restore the missing information. Track 1 is set up to gauge the state-of-the-art for such a demanding task, where fidelity to the ground truth is measured by PSNR and SSIM. Perceptually higher quality can be achieved in trade-off for fidelity by generating plausible high-frequency content. Track 2 therefore aims at generating visually pleasing results, which are ranked according to human perception, evaluated by a user study. In contrast to single image super-resolution (SISR), VSR can benefit from additional information in the temporal domain. However, this also imposes an additional requirement, as the generated frames need to be consistent along time.
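The scale arithmetic and the Track 1 fidelity measure can be checked with a small sketch. A per-axis factor of 16 means one LR pixel covers 16 × 16 = 256 HR pixels, against 16 for the common factor of 4. PSNR is computed with its standard definition, 10·log10(MAX² / MSE). The function names and the toy 8-bit frames are assumptions for illustration, not the challenge's evaluation code.

```python
import math


def pixels_per_lr_pixel(scale):
    """HR pixels covered by one LR pixel when upscaling by `scale` along each axis."""
    return scale * scale


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D frames."""
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


print(pixels_per_lr_pixel(16))  # 256
print(pixels_per_lr_pixel(4))   # 16

# Toy 2x2 frames with made-up intensities.
ground_truth = [[10, 20], [30, 40]]
restored = [[12, 18], [30, 44]]
print(round(psnr(ground_truth, restored), 2))
```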

Preprint, 2020, arXiv

Large-scale Causal Approaches to Debiasing Post-click Conversion Rate Estimation with Multi-task Learning

Post-click conversion rate (CVR) estimation is a critical task in e-commerce recommender systems. This task is deemed quite challenging in the industrial setting because of two major issues: 1) selection bias caused by user self-selection, and 2) data sparsity due to the rarity of click events. A successful conversion typically has the following sequential events: "exposure -> click -> conversion". Conventional CVR estimators are trained in the click space, but the inference is done in the entire exposure space. They fail to account for the causes of the missing data and treat them as missing at random. Hence, their estimates are highly likely to deviate from the real values by a large margin. In addition, the data sparsity issue can also handicap many industrial CVR estimators, which usually have large parameter spaces. In this paper, we propose two principled, efficient and highly effective CVR estimators for industrial CVR estimation, namely, Multi-IPW and Multi-DR. The proposed models approach the CVR estimation from a causal perspective and account for the causes of data missing not at random. In addition, our methods are based on the multi-task learning framework and mitigate the data sparsity issue. Extensive experiments on industrial-level datasets show that our methods outperform the state-of-the-art CVR models.
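The selection-bias problem can be made concrete with a minimal sketch of generic inverse propensity weighting (IPW). This is not the paper's Multi-IPW model: the click propensities are assumed known here, whereas a real system would have to estimate them, and the toy data and function names are made up. Each clicked sample is weighted by the inverse of its click probability so that the average refers to the whole exposure space rather than only the click space.

```python
def naive_cvr(samples):
    """Average conversion over clicked samples only (click-space estimate)."""
    clicked = [s for s in samples if s["clicked"]]
    return sum(s["converted"] for s in clicked) / len(clicked)


def ipw_cvr(samples):
    """IPW estimate of the average conversion over all exposures."""
    total = 0.0
    for s in samples:
        if s["clicked"]:
            total += s["converted"] / s["click_propensity"]
    return total / len(samples)


def make_group(n, propensity, n_clicked, converts):
    """Toy group: the first `n_clicked` of `n` exposures were clicked."""
    return [
        {
            "clicked": i < n_clicked,
            "click_propensity": propensity,
            "converted": converts if i < n_clicked else 0,
        }
        for i in range(n)
    ]


# Group A clicks often and converts; group B clicks rarely and does not convert.
samples = make_group(4, 0.5, 2, 1) + make_group(4, 0.25, 1, 0)

print(naive_cvr(samples))  # 2 of 3 clicked samples converted
print(ipw_cvr(samples))    # 0.5
```

In this toy setup the two groups are equally sized, so the exposure-space rate is 0.5. The click-space average overstates it because the group that converts is also the group that clicks more.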

Preprint, 2020, arXiv

Possible strain induced Mott gap collapse in 1T-TaS$_2$

Tuning the electronic properties of a material is of fundamental interest in scientific research as well as in applications. Recently, the Mott insulator-metal transition has been reported in the pristine layered transition metal dichalcogenide 1T-TaS$_2$, with the transition triggered by an optical excitation, a gate-controlled intercalation, or a voltage pulse. However, the sudden insulator-metal transition hinders an exploration of how the transition evolves. Here, we report strain as a possible new tuning parameter to induce Mott gap collapse in 1T-TaS$_2$. In a strain-rich area, we find a mosaic state with distinct electronic densities of states within different domains. On a corrugated surface, we further observe and analyze a smooth evolution from a Mott gap state to a metallic state. Our results shed new light on the understanding of the insulator-metal transition and promote controllable strain engineering in the design of future switching devices.

Preprint, 2020, arXiv

Projective Quasiparticle Interference of a Single Scatterer to Analyze the Electronic Band Structure of ZrSiS

Quasiparticle interference (QPI) of the electronic states has been widely applied in scanning tunneling microscopy (STM) to analyze the electronic band structure of materials. Single-defect induced QPI reveals defect-dependent interaction between a single atomic defect and electronic states, which deserves special attention. Due to the weak signal of single-defect-induced QPI, the signal-to-noise ratio (SNR) is relatively low in a standard two-dimensional QPI measurement. In this paper, we introduce a projective quasiparticle interference (PQPI) method, in which a one-dimensional measurement is taken along high-symmetry directions centered on a specified defect. We apply the PQPI method to a topological nodal-line semimetal ZrSiS. We focus on two special types of atomic defects that scatter the surface and bulk electronic bands. With enhanced SNR in PQPI, the energy dispersions are clearly resolved along high symmetry directions. We discuss the defect-dependent scattering of bulk bands with the non-symmorphic symmetry-enforced selection rules. Furthermore, an energy shift of the surface floating band is observed and a new branch of energy dispersion (q6) is resolved. This PQPI method can be applied to other complex materials to explore defect-dependent interactions in the future.