Researcher profile

Haodong Li

Haodong Li contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

Aligning multimodal generative models with human preferences demands reward signals that respect the compositional, multi-dimensional structure of human judgment. Prevailing RLHF approaches reduce this structure to scalar or pairwise labels, collapsing nuanced preferences into opaque parametric proxies and exposing vulnerabilities to reward hacking. While recent Rubrics-as-Reward (RaR) methods attempt to recover this structure through explicit criteria, generating rubrics that are simultaneously reliable, scalable, and data-efficient remains an open problem. We introduce Auto-Rubric as Reward (ARR), a framework that reframes reward modeling from implicit weight optimization to explicit, criteria-based decomposition. Before any pairwise comparison, ARR externalizes a VLM's internalized preference knowledge as prompt-specific rubrics, translating holistic intent into independently verifiable quality dimensions. This conversion of implicit preference structure into inspectable, interpretable constraints substantially suppresses evaluation biases including positional bias, enabling both zero-shot deployment and few-shot conditioning on minimal supervision. To extend these gains into generative training, we propose Rubric Policy Optimization (RPO), which distills ARR's structured multi-dimensional evaluation into a robust binary reward, replacing opaque scalar regression with rubric-conditioned preference decisions that stabilize policy gradients. On text-to-image generation and image editing benchmarks, ARR-RPO outperforms pairwise reward models and VLM judges, demonstrating that explicitly externalizing implicit preference knowledge into structured rubrics achieves more reliable, data-efficient multimodal alignment, revealing that the bottleneck is the absence of a factorized interface, not a deficit of knowledge.

preprint2026arXiv

Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible, reliance on third-party or internet data is common. Recent studies show CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL, revealing limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., downstream task knowledge). Interestingly, trigger samples exhibit distinguishable statistical divergence from clean samples, which inspires repurposing it as a watermark for dataset IP protection. Direct repurposing is challenging due to low success rates; we overcome this by statistical verification using a unified density metric. We further propose a multi-level watermarking scheme adapting to feature-level, soft-label, or hard-label outputs in CL. Experiments show some backdoor attacks can be repurposed as effective watermarks with trade-offs among fidelity, verifiability, and robustness. This work demonstrates weak backdoor effects become reliable signals for dataset IP protection in challenging CL settings.

preprint2024arXiv

Digger: Detecting Copyright Content Mis-usage in Large Language Model Training

Pre-training, which utilizes extensive and varied datasets, is a critical factor in the success of Large Language Models (LLMs) across numerous applications. However, the detailed makeup of these datasets is often not disclosed, leading to concerns about data security and potential misuse. This is particularly relevant when copyrighted material, still under legal protection, is used inappropriately, either intentionally or unintentionally, infringing on the rights of the authors. In this paper, we introduce a detailed framework designed to detect and assess the presence of content from potentially copyrighted books within the training datasets of LLMs. This framework also provides a confidence estimation for the likelihood of each content sample's inclusion. To validate our approach, we conduct a series of simulated experiments, the results of which affirm the framework's effectiveness in identifying and addressing instances of content misuse in LLM training processes. Furthermore, we investigate the presence of recognizable quotes from famous literary works within these datasets. The outcomes of our study have significant implications for ensuring the ethical use of copyrighted materials in the development of LLMs, highlighting the need for more transparent and responsible data management practices in this field.

preprint2022arXiv

Robust Coordinated Longitudinal Control of MAV Based on Energy State

Fixed-wing Miniature Air Vehicle (MAV) is not only coupled with longitudinal motion, but also more susceptible to wind disturbance due to its lighter weight, which brings more challenges to its altitude and airspeed controller design. Therefore, in this paper, an improved longitudinal control strategy based on energy state, is proposed to address the above-mentioned issues. The control strategy utilizes the Linear Extended State Observer (LESO) to observe the energy states and the disturbance of the MAV, and then designs a Multiple-Input Multiple-Output (MIMO) controller based on a more coordinated Total Energy Control (TEC) strategy to control the airspeed and altitude of the MAV. The performance of this control strategy has been successfully verified in a Model-in-the-Loop (MIL) simulation with Simulink, and a comparative test with the classical TEC algorithm is carried out.

preprint2020arXiv

Identification of Deep Network Generated Images Using Disparities in Color Components

With the powerful deep network architectures, such as generative adversarial networks, one can easily generate photorealistic images. Although the generated images are not dedicated for fooling human or deceiving biometric authentication systems, research communities and public media have shown great concerns on the security issues caused by these images. This paper addresses the problem of identifying deep network generated (DNG) images. Taking the differences between camera imaging and DNG image generation into considerations, we analyze the disparities between DNG images and real images in different color components. We observe that the DNG images are more distinguishable from real ones in the chrominance components, especially in the residual domain. Based on these observations, we propose a feature set to capture color image statistics for identifying DNG images. Additionally, we evaluate several detection situations, including the training-testing data are matched or mismatched in image sources or generative models and detection with only real images. Extensive experimental results show that the proposed method can accurately identify DNG images and outperforms existing methods when the training and testing data are mismatched. Moreover, when the GAN model is unknown, our methods also achieves good performance with one-class classification by using only real images for training.

preprint2017arXiv

Image Processing Operations Identification via Convolutional Neural Network

In recent years, image forensics has attracted more and more attention, and many forensic methods have been proposed for identifying image processing operations. Up to now, most existing methods are based on hand crafted features, and just one specific operation is considered in their methods. In many forensic scenarios, however, multiple classification for various image processing operations is more practical. Besides, it is difficult to obtain effective features by hand for some image processing operations. In this paper, therefore, we propose a new convolutional neural network (CNN) based method to adaptively learn discriminative features for identifying typical image processing operations. We carefully design the high pass filter bank to get the image residuals of the input image, the channel expansion layer to mix up the resulting residuals, the pooling layers, and the activation functions employed in our method. The extensive results show that the proposed method can outperform the currently best method based on hand crafted features and three related methods based on CNN for image steganalysis and/or forensics, achieving the state-of-the-art results. Furthermore, we provide more supplementary results to show the rationality and robustness of the proposed model.