Researcher profile

Chenxiao Zhao

Chenxiao Zhao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

preprint | 2026 | arXiv

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

On-policy self-distillation, where a student is pulled toward a copy of itself conditioned on privileged context (e.g., a verified solution or feedback), offers a promising direction for advancing reasoning capability without a stronger external teacher. Yet in math reasoning the gains are inconsistent, even when the same approach succeeds elsewhere. A pointwise mutual information analysis traces the failure to the privileged context itself: it inflates the teacher's confidence on tokens already implied by the solution (structural connectives, verifiable claims) and deflates it on deliberation tokens ("Wait", "Let", "Maybe") that drive multi-step search. We propose Anti-Self-Distillation (AntiSD), which ascends a divergence between student and teacher rather than descending it: this reverses the per-token sign and yields a naturally bounded advantage in one step. An entropy-triggered gate disables the term once the teacher entropy collapses, completing a drop-in replacement for default self-distillation. Across five models from 4B to 30B parameters on math reasoning benchmarks, AntiSD reaches the GRPO baseline's accuracy in 2 to 10x fewer training steps and improves final accuracy by up to 11.5 points. AntiSD opens a path to scalable self-improvement, where a language model bootstraps its own reasoning through its training signal.

preprint | 2026 | arXiv

From Generic Correlation to Input-Specific Credit in On-Policy Self Distillation

On-policy self-distillation has emerged as a promising paradigm for post-training language models, in which the model conditions on environment feedback to serve as its own teacher, providing dense token-level rewards without external teacher models or step-level annotations. Despite its empirical success, what this reward actually measures and what kind of credit it assigns remain unclear. Under a posterior-compatibility interpretation of feedback conditioning, standard in the implicit-reward literature, we show that the self-distillation token reward is a Bayesian filtering increment whose trajectory sum is exactly the pointwise mutual information between the response and the feedback given the input. This pMI can be raised by input-specific reasoning or by input-generic shortcuts, so we further decompose the teacher log-probability along the input axis. Based on this analysis, we propose CREDIT (Contrastive REward from DIsTillation), which isolates the input-specific component with a batch-contrastive baseline. At the sequence level, CREDIT is a teacher-side surrogate for a contrastive pMI objective that also penalizes responses remaining likely under unrelated inputs. Across coding, scientific reasoning, and tool-use benchmarks on two model families, CREDIT delivers the strongest aggregate performance at negligible additional compute.
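
The telescoping claim in this abstract can be checked numerically on a toy problem. The sketch below is not the paper's code: it assumes the token reward is the teacher log-probability (conditioned on feedback f) minus the student log-probability, holds the input fixed so it drops out of the notation, and uses a small random joint distribution as a stand-in for a language model. The helper names (make_joint, prefix_prob, token_rewards, filtering_increments, pmi) are hypothetical.

```python
import itertools
import math
import random


def make_joint(n_feedback=2, vocab=3, length=2, seed=0):
    """Random joint distribution P(f, y) over feedback f and responses y (toy assumption)."""
    rng = random.Random(seed)
    keys = [(f, y)
            for f in range(n_feedback)
            for y in itertools.product(range(vocab), repeat=length)]
    weights = [rng.random() + 0.1 for _ in keys]
    total = sum(weights)
    return {k: w / total for k, w in zip(keys, weights)}


def prefix_prob(joint, prefix, f=None):
    """P(response starts with prefix), jointly with f when f is given."""
    n = len(prefix)
    return sum(p for (g, y), p in joint.items()
               if y[:n] == prefix and (f is None or g == f))


def token_rewards(joint, f, y):
    """Assumed reward: log P(y_t | f, y_<t) - log P(y_t | y_<t) for each token."""
    rewards = []
    for t in range(len(y)):
        teacher = prefix_prob(joint, y[:t + 1], f) / prefix_prob(joint, y[:t], f)
        student = prefix_prob(joint, y[:t + 1]) / prefix_prob(joint, y[:t])
        rewards.append(math.log(teacher) - math.log(student))
    return rewards


def filtering_increments(joint, f, y):
    """Bayesian filtering view: log P(f | y_<=t) - log P(f | y_<t) for each token."""
    increments = []
    for t in range(len(y)):
        post_new = prefix_prob(joint, y[:t + 1], f) / prefix_prob(joint, y[:t + 1])
        post_old = prefix_prob(joint, y[:t], f) / prefix_prob(joint, y[:t])
        increments.append(math.log(post_new) - math.log(post_old))
    return increments


def pmi(joint, f, y):
    """Pointwise mutual information log P(f, y) / (P(f) P(y))."""
    p_f = prefix_prob(joint, (), f)
    p_y = prefix_prob(joint, y)
    return math.log(joint[(f, y)] / (p_f * p_y))


if __name__ == "__main__":
    joint = make_joint()
    f, y = 1, (0, 2)
    print("sum of token rewards:", sum(token_rewards(joint, f, y)))
    print("sequence-level pMI:  ", pmi(joint, f, y))
```

In this toy setting the per-token rewards coincide with the filtering increments, and their sum matches the sequence-level pMI up to floating-point error. Nothing here separates input-specific from input-generic contributions; that decomposition is what the abstract attributes to CREDIT.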

preprint | 2020 | arXiv

Combining quantum spin hall effect and superconductivity in few-layer stanene

Stanene was proposed to be a quantum spin Hall insulator containing topological edge states and a time-reversal-invariant topological superconductor hosting helical Majorana edge modes. Recently, experimental evidence for topological edge states has been found in monolayer stanene films, and superconductivity has been observed in few-layer stanene films but not in single layers. An integrated system with both topological edge states and superconductivity is highly pursued as a possible platform for realizing topological superconductivity. Few-layer stanene shows great potential to meet this requirement and is highly desired in experiment. Here we successfully grow few-layer stanene on a bismuth (111) substrate. Both topological edge states and superconducting gaps are observed by in-situ scanning tunneling microscopy/spectroscopy (STM/STS). Our results take a further step towards topological superconductivity in stanene films.

preprint | 2020 | arXiv

Stanene: A Good Platform for Topological Insulator and Topological Superconductor

Two-dimensional (2D) topological insulators (TIs) and topological superconductors (TSCs) have been intensively studied in recent years due to their great potential for dissipationless electron transport and fault-tolerant quantum computing, respectively. Here we focus on stanene, the tin analogue of graphene, and give a brief review of its development as a candidate for both 2D TI and TSC. Stanene is proposed to be a TI with a large gap of 0.3 eV, and its topological properties are sensitive to various factors, e.g., the lattice constants, chemical functionalization and layer thickness, which offer various methods for phase tuning. Experimentally, the inverted gap and edge states have recently been observed, which is strong evidence for a TI phase. In addition, stanene is predicted to be a time-reversal-invariant TSC when inversion symmetry is broken, supporting helical Majorana edge modes. The layer-dependent superconductivity of stanene was recently confirmed by both transport and scanning tunneling microscopy measurements. This review gives a detailed introduction to stanene and its topological properties, and some prospects are also discussed.