Researcher profile

Haotian Liu

Haotian Liu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
12 topics
4 close collaborators

Published work

9 published items

preprint · 2026 · arXiv

Beyond Target-Level: ISAC-Enabled Event-Level Sensing for Behavioral Intention Prediction

Integrated Sensing and Communication (ISAC) holds great promise for enabling event-level sensing, such as behavioral intention prediction (BIP) in autonomous driving, particularly under non-line-of-sight (NLoS) or adverse weather conditions where conventional sensors degrade. However, as a key instance of event-level sensing, ISAC-based BIP remains unexplored. To address this gap, we propose an ISAC-enabled BIP framework and validate its feasibility and effectiveness through extensive simulations. Our framework achieves robust performance in safety-critical scenarios, improving the F1-score by 11.4% over sensor-based baselines in adverse weather, thereby demonstrating ISAC's potential for intelligent event-level sensing.

preprint · 2026 · arXiv

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing

While GUI agents have made significant progress in web navigation and basic operating system tasks, their capabilities in professional creative workflows remain largely underexplored. To bridge this gap, we introduce CutVerse, a benchmark designed to systematically evaluate autonomous GUI agents in realistic media post-production environments. We curate expert demonstrations across 7 professional applications (e.g., Premiere Pro, Photoshop), covering 186 complex, long-horizon tasks grounded in authentic editing workflows, involving dense multimodal interfaces and tightly coupled interaction sequences. To support scalable evaluation, we develop a lightweight parser that transforms raw screen recordings and low-level interaction logs into structured, compositional GUI action trajectories with precise grounding. Extensive evaluations reveal that existing agents achieve only 36.0% task success on realistic media editing tasks, underscoring the challenges posed by complex, long-horizon media post-production workflows in our benchmark. While current models demonstrate promising spatial grounding, multimodal alignment, and coordinated action execution, they remain limited in long-horizon reliability and domain-specific planning.

preprint · 2023 · arXiv

Learning Customized Visual Models with Retrieval-Augmented Knowledge

Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability. The high generality and usability of these visual models is achieved via a web-scale data collection process to ensure broad concept coverage, followed by expensive pre-training to feed all the knowledge into model weights. Alternatively, we propose REACT, REtrieval-Augmented CusTomization, a framework to acquire the relevant web knowledge to build customized visual models for target domains. We retrieve the most relevant image-text pairs (~3% of CLIP pre-training data) from the web-scale database as external knowledge, and propose to customize the model by only training new modularized blocks while freezing all the original weights. The effectiveness of REACT is demonstrated via extensive experiments on classification, retrieval, detection and segmentation tasks, including zero, few, and full-shot settings. Particularly, on the zero-shot classification task, compared with CLIP, it achieves up to 5.4% improvement on ImageNet and 3.7% on the ELEVATER benchmark (20 datasets).

preprint · 2022 · arXiv

Controllable Production of Degenerate Fermi Gases of $^6$Li Atoms in the 2D-3D Crossover

The many-body physics in the dimensional crossover regime attracts much attention in cold atom experiments, but has yet to be explored systematically. One technical difficulty in these experiments is the lack of a technique to quantitatively tune the atom occupation ratio of the different lattice bands. In this letter, we report such a technique in the process of transferring a 3D Fermi gas into a 1D optical lattice, where the occupation of the energy bands is tuned by varying the trapping potentials of the optical dipole trap (ODT) and the lattice, respectively. We can quantitatively tune the occupation of the lowest band of a Fermi gas from unity to 50%. This provides a route to experimentally study the dependence of many-body interactions on dimensionality in a Fermi gas.

preprint · 2022 · arXiv

End-to-End Instance Edge Detection

Edge detection has long been an important problem in the field of computer vision. Previous works have explored category-agnostic or category-aware edge detection. In this paper, we explore edge detection in the context of object instances. Although object boundaries could be easily derived from segmentation masks, in practice, instance segmentation models are trained to maximize IoU to the ground-truth mask, which means that segmentation boundaries are not enforced to precisely align with ground-truth edge boundaries. Thus, the task of instance edge detection itself is different and critical. Since precise edge detection requires high-resolution feature maps, we design a novel transformer architecture that efficiently combines an FPN and a transformer decoder to enable cross attention on multi-scale high-resolution feature maps within a reasonable computation budget. Further, we propose a lightweight dense prediction head that is applicable to both instance edge and mask detection. Finally, we use a penalty-reduced focal loss to effectively train the model with point supervision on instance edges, which can reduce annotation costs. We demonstrate highly competitive instance edge detection performance compared to state-of-the-art baselines, and also show that the proposed task and loss are complementary to instance segmentation and object detection.

preprint · 2022 · arXiv

Masked Discrimination for Self-Supervised Learning on Point Clouds

Masked autoencoding has achieved great success for self-supervised learning in the image and language domains. However, mask-based pretraining has yet to show benefits for point cloud understanding, likely due to standard backbones like PointNet being unable to properly handle the training versus testing distribution mismatch introduced by masking during training. In this paper, we bridge this gap by proposing a discriminative mask pretraining Transformer framework, MaskPoint, for point clouds. Our key idea is to represent the point cloud as discrete occupancy values (1 if part of the point cloud; 0 if not), and perform simple binary classification between masked object points and sampled noise points as the proxy task. In this way, our approach is robust to the point sampling variance in point clouds, and facilitates learning rich representations. We evaluate our pretrained models across several downstream tasks, including 3D shape classification, segmentation, and real-world object detection, and demonstrate state-of-the-art results while achieving a significant pretraining speedup (e.g., 4.1x on ScanNet) compared to the prior state-of-the-art Transformer baseline. Code is available at https://github.com/haotian-liu/MaskPoint.
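
The occupancy-based proxy task described in this abstract can be illustrated with a small data-construction sketch. This is not the authors' implementation: the function name `build_occupancy_queries`, the per-point random masking, the 50% mask ratio, and the choice to draw noise points uniformly from the cloud's bounding box are all assumptions made for illustration. It only shows how masked real points (label 1) and sampled noise points (label 0) could be assembled into a binary classification set.

```python
import random


def build_occupancy_queries(points, mask_ratio=0.5, seed=0):
    """Split a point cloud into visible points and labelled query points.

    Hypothetical helper: masked real points get occupancy label 1,
    sampled noise points get occupancy label 0.
    """
    rng = random.Random(seed)

    # Randomly choose which real points to hide (assumption: per-point masking).
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_masked = int(len(shuffled) * mask_ratio)
    masked, visible = shuffled[:n_masked], shuffled[n_masked:]

    # Axis-aligned bounding box of the full cloud.
    lows = [min(p[i] for p in points) for i in range(3)]
    highs = [max(p[i] for p in points) for i in range(3)]

    # Assumption: noise points are drawn uniformly inside the bounding box,
    # one noise point per masked point so the two classes are balanced.
    noise = [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
        for _ in range(n_masked)
    ]

    # Binary occupancy labels: 1 = part of the point cloud, 0 = not.
    queries = [(p, 1) for p in masked] + [(q, 0) for q in noise]
    rng.shuffle(queries)
    return visible, queries


if __name__ == "__main__":
    cloud = [(float(i), float(i % 3), float(i % 5)) for i in range(20)]
    visible, queries = build_occupancy_queries(cloud, mask_ratio=0.5, seed=1)
    print(len(visible), "visible points,", len(queries), "labelled queries")
```

In a full pipeline, an encoder would see only the visible points and a classifier would predict the label of each query; that part is omitted here.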

preprint · 2022 · arXiv

Mass Testing and Characterization of 20-inch PMTs for JUNO

The main goal of the JUNO experiment is to determine the neutrino mass ordering using a 20 kt liquid-scintillator detector. Its key feature is an excellent energy resolution of at least 3% at 1 MeV, for which its instruments need to meet a certain quality and thus have to be fully characterized. More than 20,000 20-inch PMTs have been received and assessed by JUNO after a detailed testing program which began in 2017 and lasted about four years. Based on this mass characterization and a set of specific requirements, a good quality of all accepted PMTs could be ascertained. This paper presents the testing procedure and the testing systems designed for it, as well as the statistical characteristics of all 20-inch PMTs intended to be used in the JUNO experiment, covering more than fifteen performance parameters including the photocathode uniformity. This constitutes the largest sample of 20-inch PMTs ever produced and studied in detail to date, i.e. 15,000 of the newly developed 20-inch MCP-PMTs from Northern Night Vision Technology Co. (NNVT) and 5,000 dynode PMTs from Hamamatsu Photonics K.K. (HPK).

preprint · 2022 · arXiv

Reducing Learning Difficulties: One-Step Two-Critic Deep Reinforcement Learning for Inverter-based Volt-Var Control

A one-step two-critic deep reinforcement learning (OSTC-DRL) approach for inverter-based volt-var control (IB-VVC) in active distribution networks is proposed in this paper. First, since IB-VVC can be formulated as a single-period optimization problem, we formulate it as a one-step Markov decision process rather than the standard Markov decision process, which simplifies the DRL learning task. We then design a one-step actor-critic DRL scheme, a simplified version of recent DRL algorithms that avoids the issue of Q-value overestimation. Furthermore, considering the two objectives of VVC, minimizing power loss and eliminating voltage violations, we use two critics to approximate the rewards of the two objectives separately. This simplifies the approximation task of each critic and avoids the interaction between the two objectives during critic learning. The OSTC-DRL approach integrates the one-step actor-critic DRL scheme and the two-critic technique. Based on OSTC-DRL, we design two centralized DRL algorithms. Further, we extend OSTC-DRL to multi-agent OSTC-DRL for decentralized IB-VVC and design two multi-agent DRL algorithms. Simulations demonstrate that the proposed OSTC-DRL has a faster convergence rate and better control performance, and that the multi-agent OSTC-DRL works well for decentralized IB-VVC problems.
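
The one-step, two-critic idea in this abstract can be written out as a short worked formulation. This is a sketch under stated assumptions, not the paper's exact losses: the symbols $r_p$ and $r_v$ (power-loss and voltage-violation rewards), the critics $Q_p$ and $Q_v$, the squared-error critic loss, and the unweighted sum in the actor objective are all introduced here for illustration. In a one-step Markov decision process the episode ends after a single action, so each critic regresses directly on the observed reward and no bootstrapped next-state value appears in the target.

```latex
% Critic targets in a one-step MDP: the reward itself, with no bootstrapping
% (assumed squared-error form; i indexes the two objectives).
\mathcal{L}_{Q_i} = \mathbb{E}_{(s,a,r_i)}\!\left[\bigl(Q_i(s,a) - r_i\bigr)^2\right],
\qquad i \in \{p, v\}

% Actor objective (assumption: the two critic outputs are simply summed).
\max_{\pi}\; \mathbb{E}_{s}\!\left[\,Q_p\bigl(s,\pi(s)\bigr) + Q_v\bigl(s,\pi(s)\bigr)\right]
```

Because the target contains no maximum over next-state action values, the usual source of Q-value overestimation is absent, which is consistent with the claim in the abstract.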

preprint · 2020 · arXiv

Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR Control in Active Distribution Networks

Model-based Volt/VAR optimization methods are widely used to eliminate voltage violations and reduce network losses. However, the parameters of active distribution networks (ADNs) are not identified on site, so the model may contain significant errors that make model-based methods infeasible. To cope with this critical issue, we propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources, which consists of an offline stage and an online stage. In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch. In the subsequent online stage, we transfer the offline agent safely as the online agent to perform continuous learning and control online with significantly improved safety and efficiency. Numerical simulations on IEEE test cases not only demonstrate that the proposed adversarial reinforcement learning algorithm outperforms the state-of-the-art algorithm, but also show that our proposed two-stage method achieves much better performance than existing DRL-based methods in the online application.