Researcher profile

Xinyu Qiu

Xinyu Qiu contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
2 topics
4 close collaborators

Published work

2 published items

Preprint, 2026, arXiv

Unified Generation and Self-Verification for Vision-Language Models via Advantage Decoupled Preference Optimization

Parallel test-time scaling typically trains separate generation and verification models, incurring high training and inference costs. We propose Advantage Decoupled Preference Optimization (ADPO), a unified reinforcement learning framework that jointly learns answer generation and self-verification within a single policy. ADPO introduces two innovations: a preference verification reward improving verification capability and a decoupled optimization mechanism enabling synergistic optimization of generation and verification. Specifically, the preference verification reward computes mean verification scores from positive and negative samples as decision thresholds, providing positive feedback when prediction correctness aligns with answer correctness. Meanwhile, the advantage decoupled optimization computes separate advantages for generation and verification, applies token masks to isolate gradients, and combines masked GRPO objectives, preserving generation quality while calibrating verification scores. ADPO achieves up to +34.1% higher verification AUC and -53.5% lower inference time, with significant gains of +2.8%/+1.4% accuracy on MathVista/MMMU, +1.9 cIoU on ReasonSeg, and +1.7%/+1.0% step success rate on AndroidControl/GUI Odyssey.
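The decoupled optimization step described in this abstract can be illustrated with a minimal sketch. It assumes a GRPO-style group-relative advantage (each reward centered and scaled within one sampled group) and binary token masks that mark which tokens belong to the answer and which to the self-verification. The function names, the `eps` constant, and the mask layout are hypothetical and are not taken from the paper; the preference verification reward itself is not modeled here.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Center and scale rewards within one sampled group (GRPO-style, assumed)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def decoupled_token_advantages(gen_rewards, ver_rewards, gen_masks, ver_masks):
    """Assign each token the advantage of the task it belongs to.

    gen_masks[i][t] is 1 if token t of sample i is an answer token,
    ver_masks[i][t] is 1 if it is a verification token (hypothetical layout).
    Because the masks are disjoint, the generation advantage never reaches
    verification tokens, and vice versa.
    """
    gen_adv = group_advantages(gen_rewards)
    ver_adv = group_advantages(ver_rewards)
    per_token = []
    for a_g, a_v, m_g, m_v in zip(gen_adv, ver_adv, gen_masks, ver_masks):
        per_token.append([a_g * g + a_v * v for g, v in zip(m_g, m_v)])
    return per_token


# Two samples of three tokens each: two answer tokens, then one verification token.
token_adv = decoupled_token_advantages(
    gen_rewards=[1.0, 0.0],
    ver_rewards=[0.0, 1.0],
    gen_masks=[[1, 1, 0], [1, 1, 0]],
    ver_masks=[[0, 0, 1], [0, 0, 1]],
)
```

In this toy group the first sample has the better answer but the worse verification, so its answer tokens receive a positive advantage and its verification token a negative one. In a full training loop these per-token advantages would weight the policy-gradient terms of the combined masked objectives.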

Preprint, 2022, arXiv

Quantum cost of dense coding and teleportation

The quantum cost is a key ingredient in evaluating the quality of quantum protocols from a practical viewpoint. We show that the quantum cost of the d-dimensional dense coding protocol equals d+3 when transmitting the classical message (0,0) and d+4 when transmitting any other classical message. The cost grows linearly with the dimension, which makes the protocol practical to implement. In contrast, the quantum cost of high-dimensional teleportation protocols equals 13, which is the maximum value of the cost for the two-dimensional case. As an application, we establish the relation between the quantum cost and the fidelity of dense coding protocols under four typical noise scenarios.
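The dense coding result stated in this abstract can be written as a small lookup. The sketch below only encodes the two values the abstract reports (d+3 and d+4); the function name, the message encoding as a pair of integers in the range 0 to d-1, and the input checks are assumptions made for illustration.

```python
def dense_coding_cost(d, message):
    """Quantum cost of d-dimensional dense coding, as reported in the abstract.

    message is a pair (m, n) with 0 <= m, n < d (assumed encoding).
    """
    if d < 2:
        raise ValueError("dimension d must be at least 2")
    m, n = message
    if not (0 <= m < d and 0 <= n < d):
        raise ValueError("message components must lie in range(d)")
    # The message (0,0) costs d+3; every other message costs d+4.
    return d + 3 if (m, n) == (0, 0) else d + 4


# The cost grows linearly with the dimension.
costs_for_nonzero_message = [dense_coding_cost(d, (0, 1)) for d in range(2, 6)]
print(costs_for_nonzero_message)  # [6, 7, 8, 9]
```

Each step in dimension adds one to the cost, which is the linear growth the abstract refers to.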