Researcher profile

Hao Wen

Hao Wen contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
7 topics
4 close collaborators

Published work

5 published items

preprint, 2026, arXiv

Multi-Stage Bi-Atrial Segmentation Framework from 3D Late Gadolinium-Enhanced MRI using V-Net Family Models

We report our multi-stage framework for multi-class bi-atrial segmentation from 3D late gadolinium-enhanced (LGE) MRI of the human heart. The pipeline consists of a preprocessing step using multidimensional contrast-limited adaptive histogram equalization (MCLAHE); coarse region segmentation from the MCLAHE-enhanced and down-sampled MRI using a V-Net family model; and fine segmentation from the coarse region using another V-Net model. An asymmetric loss is adopted to optimize the model weights.
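
The coarse-to-fine structure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the MCLAHE step and the V-Net models are not reproduced, and `coarse_model`, `fine_model`, `factor` and `margin` are hypothetical names introduced here. The two models are assumed to be plain callables that take a 3D array and return a label array of the same shape.

```python
import numpy as np


def bounding_box(mask, margin=0):
    """Return slices covering the nonzero region of `mask`, padded by `margin`."""
    coords = np.argwhere(mask)
    if coords.size == 0:
        # Nothing detected: fall back to the whole volume.
        return tuple(slice(0, s) for s in mask.shape)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))


def coarse_to_fine(volume, coarse_model, fine_model, factor=2, margin=2):
    """Locate a region on a down-sampled volume, then segment it at full resolution."""
    # Stage 1: coarse localisation on a strided (down-sampled) copy.
    small = volume[::factor, ::factor, ::factor]
    coarse_mask = coarse_model(small) > 0

    # Bring the coarse mask back to full resolution by nearest-neighbour repetition.
    full_mask = coarse_mask
    for axis in range(3):
        full_mask = np.repeat(full_mask, factor, axis=axis)
    full_mask = full_mask[: volume.shape[0], : volume.shape[1], : volume.shape[2]]

    # Stage 2: fine segmentation restricted to the padded bounding box.
    box = bounding_box(full_mask, margin)
    labels = np.zeros(volume.shape, dtype=np.int64)
    labels[box] = fine_model(volume[box])
    return labels


if __name__ == "__main__":
    # Toy volume with a bright cube; simple thresholds stand in for the two models.
    vol = np.zeros((8, 8, 8))
    vol[2:6, 2:6, 2:6] = 1.0
    out = coarse_to_fine(
        vol,
        coarse_model=lambda v: v > 0.5,
        fine_model=lambda v: (v > 0.5).astype(np.int64),
        margin=1,
    )
    print(out.shape, int(out.sum()))
```

Restricting the second stage to a padded bounding box keeps the fine model's input small, which is the usual motivation for a coarse-to-fine design.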

preprint, 2025, arXiv

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Multimodal Large Language Models (MLLMs) have made remarkable progress in video understanding. However, they suffer from a critical vulnerability: an over-reliance on language priors, which can lead to visually ungrounded hallucinations, especially when processing counterfactual videos that defy common sense. This limitation, stemming from the intrinsic data imbalance between text and video, is challenging to address due to the substantial cost of collecting and annotating counterfactual data. To address this, we introduce DualityForge, a novel counterfactual data synthesis framework that employs controllable, diffusion-based video editing to transform real-world videos into counterfactual scenarios. By embedding structured contextual information into the video editing and QA generation processes, the framework automatically produces high-quality QA pairs together with original-edited video pairs for contrastive training. Based on this, we build DualityVidQA, a large-scale video dataset designed to reduce MLLM hallucinations. In addition, to fully exploit the contrastive nature of our paired data, we propose Duality-Normalized Advantage Training (DNA-Train), a two-stage SFT-RL training regime where the RL phase applies pair-wise $\ell_1$ advantage normalization, thereby enabling more stable and efficient policy optimization. Experiments on DualityVidQA-Test demonstrate that our method substantially reduces model hallucinations on counterfactual videos, yielding a relative improvement of 24.0% over the Qwen2.5-VL-7B baseline. Moreover, our approach achieves significant gains across both hallucination and general-purpose benchmarks, indicating strong generalization capability. We will open-source our dataset and code.

preprint, 2024, arXiv

DroidBot-GPT: GPT-powered UI Automation for Android

This paper introduces DroidBot-GPT, a tool that utilizes GPT-like large language models (LLMs) to automate interactions with Android mobile applications. Given a natural language description of a desired task, DroidBot-GPT can automatically generate and execute actions that navigate the app to complete the task. It works by translating the app GUI state information and the available actions on the smartphone screen into natural language prompts and asking the LLM to choose an action. Since the LLM is typically trained on a large amount of data, including the how-to manuals of diverse software applications, it is able to make reasonable choices of actions based on the provided information. We evaluate DroidBot-GPT with a self-created dataset that contains 33 tasks collected from 17 Android applications spanning 10 categories. It can successfully complete 39.39% of the tasks, and the average partial completion progress is about 66.76%. Given that our method is fully unsupervised (no modification is required to either the app or the LLM), we believe there is great potential to enhance automation performance with better app development paradigms and/or custom model training.
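
The idea of turning a GUI state and its available actions into a natural language prompt, then mapping the model's reply back to one action, can be sketched as below. This is an illustrative sketch only and does not reflect DroidBot-GPT's actual prompt format or code: `build_prompt`, `parse_choice`, the prompt wording and the example task are all hypothetical, and no LLM is called.

```python
import re
from typing import List, Optional


def build_prompt(task: str, screen_summary: str, actions: List[str]) -> str:
    """Render the task, screen state and numbered candidate actions as plain text."""
    lines = [
        f"Task: {task}",
        f"Current screen: {screen_summary}",
        "Available actions:",
    ]
    lines += [f"{i}. {action}" for i, action in enumerate(actions)]
    lines.append("Reply with the number of the single best next action.")
    return "\n".join(lines)


def parse_choice(reply: str, num_actions: int) -> Optional[int]:
    """Extract the first integer in the reply; return None if absent or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group())
    return index if 0 <= index < num_actions else None


if __name__ == "__main__":
    actions = ["tap 'New contact'", "scroll down", "press back"]
    print(build_prompt("Add a contact named Alice", "Contacts list", actions))
    # A reply such as "I would choose 0." maps back to the first action.
    print(parse_choice("I would choose 0.", len(actions)))
```

Validating the parsed index against the number of offered actions matters because a free-text reply may name an action that is not on the screen.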

preprint, 2020, arXiv

Ultra-low-frequency electromagnetic waves as signals and special counterparts of gravitational waves (from binary mergers) having tensorial and possible nontensorial polarizations

Gravitational waves (GWs, from a binary merger) interacting with the super-strong magnetic fields of the neutron star (in the same binary system) would lead to perturbed electromagnetic waves [EMWs, at the same frequencies as these GWs, partially in the ultra-low-frequency (ULF) band for the EMWs]. Such perturbed ULF-EMWs are not only signals, but also a new type of special EM counterpart of the GWs. Here, generation of the perturbed ULF-EMWs is investigated for the first time, and the strengths of their magnetic components are estimated to be around 10^{-12} Tesla to 10^{-17} Tesla (at fISCO) at the Earth for various cases [not including the influence of the interstellar medium (ISM)]. The higher-frequency components of the ULF-EMWs (e.g., especially those produced by the GWs of the post-merger stage) above 1.8 kHz (the typical plasma frequency around the solar system in the Milky Way) could propagate through the ISM from the source to the Earth, and the perturbed ULF-EMWs will be reprocessed by the ISM before they arrive at the Earth. Also, the waveforms of the perturbed ULF-EMWs will be modified into shapes different from but related to the waveforms of the GWs, by the amplification process during the binary merger, which could amplify the magnetic fields to 10^{12} Tesla or even higher. Specific relationships between the polarizations of the perturbed ULF-EMWs and the polarizations (tensorial and possible nontensorial) of the GWs of binary mergers are also addressed here. Characteristic properties of the perturbed ULF-EMWs (which would bring us new information about fundamental properties of gravity and the Universe) will be very helpful for extracting the signals from background noise in possible future observations.
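
The 1.8 kHz threshold in this abstract can be related to the standard cold-plasma cutoff: an electromagnetic wave propagates through a plasma only if its frequency exceeds the electron plasma frequency f_p = (1/2π)·sqrt(n_e e² / (ε₀ m_e)). The sketch below evaluates that textbook formula. The electron density of 0.04 cm⁻³ is an assumption chosen here because it gives roughly 1.8 kHz; it is not stated in the abstract, and the function names are illustrative.

```python
import math

# CODATA values in SI units.
ELECTRON_CHARGE = 1.602176634e-19      # C
ELECTRON_MASS = 9.1093837015e-31       # kg
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m


def plasma_frequency_hz(electron_density_per_cm3: float) -> float:
    """Electron plasma frequency in Hz for a density given in cm^-3."""
    n_si = electron_density_per_cm3 * 1e6  # convert cm^-3 to m^-3
    omega_sq = n_si * ELECTRON_CHARGE**2 / (VACUUM_PERMITTIVITY * ELECTRON_MASS)
    return math.sqrt(omega_sq) / (2.0 * math.pi)


def can_propagate(wave_frequency_hz: float, electron_density_per_cm3: float) -> bool:
    """True if the wave frequency lies above the plasma cutoff."""
    return wave_frequency_hz > plasma_frequency_hz(electron_density_per_cm3)


if __name__ == "__main__":
    n_e = 0.04  # cm^-3, assumed local interstellar electron density
    print(f"plasma frequency: {plasma_frequency_hz(n_e):.0f} Hz")
    print("2.5 kHz propagates:", can_propagate(2500.0, n_e))
    print("0.5 kHz propagates:", can_propagate(500.0, n_e))
```

Because f_p scales as the square root of the electron density, denser regions along the line of sight raise the cutoff, which is one reason the abstract treats the ISM's influence separately.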

preprint, 2017, arXiv

Counting Multiplicities in a Hypersurface over a Number Field

We fix a counting function of multiplicities of algebraic points in a projective hypersurface over a number field, and take the sum over all algebraic points of bounded height and fixed degree. An upper bound for the sum with respect to this counting function will be given in terms of the degree of the hypersurface, the dimension of the singular locus, the upper bounds of height, and the degree of the field of definition.