Source author record

Hao Wen

Hao Wen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Computer Vision, physics.gen-ph, gr-qc, Machine Learning, math.AG, math.CV, math.NT, Software Engineering

Catalog footprint

What is connected

8 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Multi-Stage Bi-Atrial Segmentation Framework from 3D Late Gadolinium-Enhanced MRI using V-Net Family Models

We report our multi-stage framework for multi-class bi-atrial segmentation from 3D late gadolinium-enhanced (LGE) MRI of the human heart. The pipeline consists of a preprocessing step using multidimensional contrast-limited adaptive histogram equalization (MCLAHE); coarse region segmentation from the MCLAHE-enhanced and down-sampled MRI using a V-Net family model; and fine segmentation within the coarse region using another V-Net model. An asymmetric loss is adopted to optimize the model weights.
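The hand-off between the coarse and fine stages can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the nested-list volume layout, the integer down-sampling `factor`, and the `margin` padding are all assumptions introduced here. It finds the bounding box of a coarse mask, maps it back to the full-resolution grid, and crops the region that a fine model would then segment.

```python
from typing import List, Tuple

Volume = List[List[List[int]]]      # hypothetical layout, indexed as [z][y][x]
Box = Tuple[Tuple[int, int], ...]   # per-axis (start, stop), stop exclusive


def coarse_bounding_box(mask: Volume) -> Box:
    """Smallest axis-aligned box containing every nonzero voxel of the coarse mask."""
    coords = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        raise ValueError("coarse mask is empty")
    return tuple(
        (min(c[axis] for c in coords), max(c[axis] for c in coords) + 1)
        for axis in range(3)
    )


def to_full_resolution(box: Box, factor: int, margin: int,
                       full_shape: Tuple[int, int, int]) -> Box:
    """Scale a box from the down-sampled grid to the original grid,
    pad it by `margin` voxels, and clamp it to the volume bounds."""
    return tuple(
        (max(lo * factor - margin, 0), min(hi * factor + margin, size))
        for (lo, hi), size in zip(box, full_shape)
    )


def crop(volume: Volume, box: Box) -> Volume:
    """Extract the sub-volume that the fine-stage model would receive."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]


# Toy example: a 4x4x4 coarse mask with two foreground voxels,
# produced from an 8x8x8 volume down-sampled by a factor of 2.
coarse = [[[0] * 4 for _ in range(4)] for _ in range(4)]
coarse[1][1][1] = 1
coarse[2][2][1] = 1
full = [[[0] * 8 for _ in range(8)] for _ in range(8)]

box = to_full_resolution(coarse_bounding_box(coarse), factor=2, margin=1,
                         full_shape=(8, 8, 8))
patch = crop(full, box)
```

Padding the box before cropping is a common precaution so that structures clipped by the low-resolution mask still fall inside the fine-stage input; whether the paper uses a margin is not stated in the abstract.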

preprint · 2025 · arXiv

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Multimodal Large Language Models (MLLMs) have made remarkable progress in video understanding. However, they suffer from a critical vulnerability: an over-reliance on language priors, which can lead to visually ungrounded hallucinations, especially when processing counterfactual videos that defy common sense. This limitation, stemming from the intrinsic data imbalance between text and video, is challenging to address due to the substantial cost of collecting and annotating counterfactual data. To address this, we introduce DualityForge, a novel counterfactual data synthesis framework that employs controllable, diffusion-based video editing to transform real-world videos into counterfactual scenarios. By embedding structured contextual information into the video editing and QA generation processes, the framework automatically produces high-quality QA pairs together with original-edited video pairs for contrastive training. Based on this, we build DualityVidQA, a large-scale video dataset designed to reduce MLLM hallucinations. In addition, to fully exploit the contrastive nature of our paired data, we propose Duality-Normalized Advantage Training (DNA-Train), a two-stage SFT-RL training regime in which the RL phase applies pair-wise $\ell_1$ advantage normalization, enabling more stable and efficient policy optimization. Experiments on DualityVidQA-Test demonstrate that our method substantially reduces model hallucinations on counterfactual videos, yielding a relative improvement of 24.0% over the Qwen2.5-VL-7B baseline. Moreover, our approach achieves significant gains across both hallucination and general-purpose benchmarks, indicating strong generalization capability. We will open-source our dataset and code.
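The abstract does not define pair-wise $\ell_1$ advantage normalization, so the sketch below shows only one possible reading and should not be taken as the DNA-Train procedure. It assumes that each original-edited video pair yields two groups of sampled rewards, that advantages are rewards centered on their group mean, and that each group is then divided by its own $\ell_1$ norm so both sides of the pair carry the same total weight. All function names are hypothetical.

```python
from typing import List, Tuple


def centered(rewards: List[float]) -> List[float]:
    """Group-relative advantages: each reward minus the group mean (an assumption)."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


def l1_scaled(advantages: List[float], eps: float = 1e-8) -> List[float]:
    """Divide by the l1 norm so the absolute advantages sum to (about) one."""
    norm = sum(abs(a) for a in advantages)
    return [a / (norm + eps) for a in advantages]


def normalize_pair(real_rewards: List[float],
                   edited_rewards: List[float]) -> Tuple[List[float], List[float]]:
    """Normalize the two halves of an original/edited pair to equal l1 mass."""
    return l1_scaled(centered(real_rewards)), l1_scaled(centered(edited_rewards))


# Rewards for four sampled answers on the real video and on its edited twin.
adv_real, adv_edit = normalize_pair([1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0])
```

Under this reading, a video on which the model is almost always right or almost always wrong cannot dominate the update, because its advantages are rescaled to the same total magnitude as its twin's.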

preprint · 2024 · arXiv

DroidBot-GPT: GPT-powered UI Automation for Android

This paper introduces DroidBot-GPT, a tool that uses GPT-like large language models (LLMs) to automate interactions with Android mobile applications. Given a natural language description of a desired task, DroidBot-GPT can automatically generate and execute actions that navigate the app to complete the task. It works by translating the app's GUI state and the actions available on the smartphone screen into natural language prompts and asking the LLM to choose an action. Since the LLM is typically trained on a large amount of data, including the how-to manuals of diverse software applications, it can make reasonable choices of actions based on the provided information. We evaluate DroidBot-GPT with a self-created dataset that contains 33 tasks collected from 17 Android applications spanning 10 categories. It can successfully complete 39.39% of the tasks, and the average partial completion progress is about 66.76%. Given that our method is fully unsupervised (no modification is required of either the app or the LLM), we believe there is great potential to enhance automation performance with better app development paradigms and/or custom model training.
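The translate-then-choose loop described above can be sketched as follows. This is an illustration under assumptions, not DroidBot-GPT's actual prompt format or API: `build_prompt`, `parse_choice`, the prompt wording, and the example screen are all hypothetical, and the LLM call itself is left out.

```python
import re
from typing import List, Optional


def build_prompt(task: str, actions: List[str]) -> str:
    """Render the task and the on-screen actions as a numbered text prompt."""
    lines = [f"Task: {task}", "The current screen offers these actions:"]
    lines += [f"{i}. {description}" for i, description in enumerate(actions)]
    lines.append("Reply with the number of the single best action.")
    return "\n".join(lines)


def parse_choice(reply: str, n_actions: int) -> Optional[int]:
    """Pull the first integer out of the model's reply; reject out-of-range picks."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group())
    return index if index < n_actions else None


# Hypothetical screen state for a contacts app.
actions = ['tap button "New contact"', 'tap button "Search"', "go back"]
prompt = build_prompt("Add a contact named Alice", actions)
choice = parse_choice("I would pick action 0.", len(actions))
```

Returning `None` for an unparseable or out-of-range reply lets the caller re-prompt instead of executing an arbitrary action. For scale, the reported 39.39% success rate corresponds to 13 of the 33 tasks (13/33 ≈ 0.3939).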

preprint · 2020 · arXiv

Ultra-low-frequency electromagnetic waves as signals and special counterparts of gravitational waves (from binary mergers) having tensorial and possible nontensorial polarizations

Gravitational waves (GWs) from a binary merger, interacting with the super-strong magnetic field of a neutron star in the same binary system, would produce perturbed electromagnetic waves (EMWs) at the same frequencies as the GWs, partially in the ultra-low-frequency (ULF) band. Such perturbed ULF-EMWs are not only signals but also a new type of special EM counterpart of the GWs. Here, generation of the perturbed ULF-EMWs is investigated for the first time, and the strengths of their magnetic components are estimated to be around $10^{-12}$ Tesla to $10^{-17}$ Tesla (at the frequency $f_{\rm ISCO}$) at the Earth for various cases, not including the influence of the interstellar medium (ISM). The higher-frequency components of the ULF-EMWs (especially those produced by the GWs of the post-merger stage) above 1.8 kHz (the typical plasma frequency around the solar system in the Milky Way) could propagate through the ISM from the source to the Earth, and the perturbed ULF-EMWs will be reprocessed by the ISM before they arrive at the Earth. Also, the waveforms of the perturbed ULF-EMWs will be modified into shapes different from but related to the waveforms of the GWs by the amplification process during the binary merger, which could amplify the magnetic fields to $10^{12}$ Tesla or even higher. The specific relationships between the polarizations of the perturbed ULF-EMWs and the polarizations (tensorial and possibly nontensorial) of the GWs from binary mergers are also addressed. The characteristic properties of the perturbed ULF-EMWs (which would bring new information about the fundamental properties of gravity and the Universe) will be very helpful for extracting the signals from background noise in possible future observations.
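The 1.8 kHz threshold can be checked with the standard cold-plasma formula $f_p = \frac{1}{2\pi}\sqrt{n_e e^2/(\varepsilon_0 m_e)}$: an electromagnetic wave propagates through an unmagnetized plasma only if its frequency exceeds $f_p$. The sketch below evaluates this formula; the electron density of 0.04 cm$^{-3}$ is an assumption chosen here because it reproduces the quoted figure, not a value taken from the paper.

```python
import math

# CODATA constants in SI units.
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg


def plasma_frequency_hz(n_e_per_cm3: float) -> float:
    """Electron plasma frequency for a given electron density in cm^-3."""
    n_e = n_e_per_cm3 * 1e6  # convert cm^-3 to m^-3
    omega_p = math.sqrt(n_e * E_CHARGE ** 2 / (EPS0 * M_E))
    return omega_p / (2 * math.pi)


def can_propagate(wave_hz: float, n_e_per_cm3: float) -> bool:
    """True if the wave frequency lies above the plasma cutoff."""
    return wave_hz > plasma_frequency_hz(n_e_per_cm3)


# Assumed local interstellar electron density (illustrative value only).
cutoff = plasma_frequency_hz(0.04)
```

With this assumed density the cutoff comes out close to 1.8 kHz, so a 2 kHz post-merger component would pass while a 1 kHz component would be blocked.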

preprint · 2017 · arXiv

Counting Multiplicities in a Hypersurface over a Number Field

We fix a counting function of multiplicities of algebraic points in a projective hypersurface over a number field, and take the sum over all algebraic points of bounded height and fixed degree. An upper bound for the sum with respect to this counting function will be given in terms of the degree of the hypersurface, the dimension of the singular locus, the upper bounds of height, and the degree of the field of definition.

preprint · 2016 · arXiv

Resonance of Gaussian electromagnetic field to the high frequency gravitational waves

We consider a Gaussian beam (GB) resonant system for the detection of high-frequency gravitational waves (HFGWs). We find the theoretically optimal signal strength by setting the magnetic component of the GB to a standard Gaussian form. Under the synchro-resonance condition, we study the signal strength, i.e., the transverse perturbative photon fluxes (PPFs), from relic HFGWs (predicted by the ordinary inflationary model) and braneworld HFGWs (from braneworld scenarios). Both would generate potentially detectable transverse PPFs. Furthermore, we find optimal system parameters and the relationship between the frequency and the effective width of energy flux accumulation.

preprint · 2015 · arXiv

A twisted $\bar{\partial}_f$-Neumann problem and Toeplitz $n$-tuples from singularity theory

A twisted $\bar{\partial}_f$-Neumann problem associated to a singularity $(\mathscr{O}_n,f)$ is established. By constructing the connection to the Koszul complex for Toeplitz $n$-tuples $(f_1,\cdots,f_n)$ on Bergman spaces $B^0(D)$, we can solve this $\bar{\partial}_f$-Neumann problem. Moreover, we can compute the cohomology of the $L^2$ holomorphic Koszul complex $(B^*(D),\partial f\wedge)$ explicitly.

preprint · 2015 · arXiv

Signal photon flux generated by high-frequency relic gravitational waves

The power spectrum of primordial tensor perturbations $\mathcal{P}_t$ increases rapidly in the high-frequency region if the spectral index $n_t>0$. It is shown that the amplitude of the relic gravitational wave $h_t$ at $5\times10^9$ Hz varies from $10^{-36}$ to $10^{-25}$ as $n_t$ varies from $-6.25\times 10^{-3}$ to $0.87$. The high-frequency gravitational wave detector proposed by F.-Y. Li detects gravitational waves by observing the perturbed photon flux generated by the interaction between the relic gravitational waves and an electromagnetic system. It is shown that the perturbative photon flux $N_x^1$ at $5\times10^9$ Hz varies from $1.40\times10^{-4}\,\rm s^{-1}$ to $2.85\times10^{7}\,\rm s^{-1}$ as $n_t$ varies from $-6.25\times 10^{-3}$ to $0.87$. Correspondingly, the ratio of the transverse perturbative photon flux $N_x^1$ to the background photon flux varies from $10^{-28}$ to $10^{-16}$.
