Researcher profile

Harold Haodong Chen

Harold Haodong Chen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

Go with Your Gut: Scaling Confidence for Autoregressive Image Generation

Test-time scaling (TTS) has demonstrated remarkable success in enhancing large language models, yet its application to next-token prediction (NTP) autoregressive (AR) image generation remains largely uncharted. Existing TTS approaches for visual AR (VAR), which rely on frequent partial decoding and external reward models, are ill-suited for NTP-based image generation due to the inherent incompleteness of intermediate decoding results. To bridge this gap, we introduce ScalingAR, the first TTS framework specifically designed for NTP-based AR image generation that eliminates the need for early decoding or auxiliary rewards. ScalingAR leverages token entropy as a novel signal in visual token generation and operates at two complementary scaling levels: (i) Profile Level, which streams a calibrated confidence state by fusing intrinsic and conditional signals; and (ii) Policy Level, which utilizes this state to adaptively terminate low-confidence trajectories and dynamically schedule guidance for phase-appropriate conditioning strength. Experiments on both general and compositional benchmarks show that ScalingAR (1) improves base models by 12.5% on GenEval and 15.2% on TIIF-Bench, (2) efficiently reduces visual token consumption by 62.0% while outperforming baselines, and (3) successfully enhances robustness, mitigating performance drops by 26.0% in challenging scenarios.

preprint2026arXiv

RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

The scalability of robotic manipulation is fundamentally bottlenecked by the scarcity of task-aligned physical interaction data. While vision-language models (VLMs) and video generation models (VGMs) hold promise for autonomous data synthesis, they suffer from semantic-spatial misalignment and physical hallucinations, respectively. To bridge this gap, we introduce RoboEvolve, a novel framework that couples a VLM planner and a VGM simulator into a mutually reinforcing co-evolutionary loop. Operating purely on unlabeled seed images, RoboEvolve leverages a cognitive-inspired dual-phase mechanism: (i) daytime exploration fosters physically grounded behavioral discovery through a semantic-controlled multi-granular reward, and (ii) nighttime consolidation mines "near-miss" failures to stabilize policy optimization. Guided by an autonomous progressive curriculum, the system naturally scales from simple atomic actions to complex tasks. Extensive experiments demonstrate that RoboEvolve (I) achieves superior effectiveness, elevating base planners by 30 absolute points and amplifying simulator success by 48% on average; (II) exhibits extreme data efficiency, surpassing fully supervised baselines with merely 500 unlabeled seeds--a 50x reduction; and (III) demonstrates robust continual learning without catastrophic forgetting.