Researcher profile

Xincheng Yao

Xincheng Yao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
9 topics
4 close collaborators

Published work

5 published items

preprint | 2026 | arXiv

HTPO: Towards Exploration-Exploitation Balanced Policy Optimization via Hierarchical Token-level Objective Control

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a pivotal technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, mainstream RL algorithms in practice treat all tokens of a response equally and assign the same optimization objective to each token, failing to provide granular guidance for the reasoning process. In Chain-of-Thought (CoT) reasoning, by contrast, different tokens usually play distinct roles. Current RL algorithms therefore lack an effective mechanism to dynamically balance the exploration-exploitation trade-off during learning. To this end, we propose Hierarchical Token-level Objective Control Policy Optimization (HTPO), a novel RL algorithm that uses a divide-and-conquer approach to hierarchically partition the response tokens into functional groups along three dimensions (prompt difficulty, answer correctness, and token entropy). Within each group, we design specialized optimization objectives according to the tokens' contributions to exploration or exploitation, so that each token can carry out its expected function. In this way, HTPO achieves a more balanced exploration-exploitation trade-off. Extensive experiments on challenging reasoning benchmarks validate the superiority of our HTPO algorithm, which significantly outperforms the strong DAPO baseline (e.g., +8.6% and +6.7% on AIME'24 and AIME'25, respectively). When scaling test-time compute, the HTPO-trained model maintains a consistent performance advantage over the DAPO baseline, and the gap widens as the sampling budget increases, validating that our adaptive token-level control method fosters effective exploration without sacrificing exploitation performance. Code will be available at https://github.com/xcyao00/HTPO.
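The three-way partition described in this abstract can be illustrated with a minimal sketch. It is not the HTPO implementation: the function names, the group labels, and the entropy threshold of 1.0 nat are all assumptions made for illustration, and the per-group optimization objectives are omitted because the abstract does not specify them.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def group_tokens(token_probs, prompt_is_hard, answer_is_correct,
                 entropy_threshold=1.0):
    """Label each token of one response with a (difficulty, correctness, entropy) group.

    The first two labels are shared by every token in the response; only the
    entropy label varies per token. The threshold is a hypothetical value.
    """
    difficulty = "hard" if prompt_is_hard else "easy"
    correctness = "correct" if answer_is_correct else "incorrect"
    groups = []
    for probs in token_probs:
        if token_entropy(probs) >= entropy_threshold:
            level = "high_entropy"
        else:
            level = "low_entropy"
        groups.append((difficulty, correctness, level))
    return groups


# Toy response with two tokens over a four-word vocabulary.
response = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain prediction
]
labels = group_tokens(response, prompt_is_hard=True, answer_is_correct=False)
for label in labels:
    print(label)
```

In a setup like this, a training loop could look up a different loss term per group label; how HTPO actually does so is described in the paper, not here.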

preprint | 2022 | arXiv

ADC-Net: An Open-Source Deep Learning Network for Automated Dispersion Compensation in Optical Coherence Tomography

Chromatic dispersion is a common problem that degrades system resolution in optical coherence tomography (OCT). This study develops a deep learning network for automated dispersion compensation (ADC-Net) in OCT. The ADC-Net is based on a redesigned UNet architecture that employs an encoder-decoder pipeline. The input consists of partially compensated OCT B-scans, each with individual retinal layers optimized. The corresponding output is a fully compensated OCT B-scan with all retinal layers optimized. Two numeric parameters, i.e., peak signal-to-noise ratio (PSNR) and the structural similarity index metric computed at multiple scales (MS-SSIM), were used for objective assessment of the ADC-Net performance. A comparative analysis of training models with one, three, five, seven and nine input channels was performed. The five-input-channel implementation was observed to be the optimal mode for ADC-Net training to achieve robust dispersion compensation in OCT.
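Of the two metrics named in this abstract, PSNR has a short closed form, PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for two small grayscale images stored as nested lists. The function name, the 8-bit peak value of 255, and the toy pixel data are assumptions for illustration and are not taken from the ADC-Net code. MS-SSIM is not shown because it needs a multi-scale filtering pipeline.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(row) for row in reference)
    if n_pixels == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for r, t in zip(reference, test)
        for a, b in zip(r, t)
    ) / n_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images: every pixel of `noisy` is off by 5 gray levels.
reference = [[0, 255], [255, 0]]
noisy = [[5, 250], [250, 5]]
value = psnr(reference, noisy)
print(round(value, 2))
```

A higher PSNR means the compensated image is closer to the reference.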

preprint | 2022 | arXiv

Depth-resolved vascular profile features for artery-vein classification in OCT and OCT angiography of human retina

This study characterizes reflectance profiles of retinal blood vessels in optical coherence tomography (OCT) and validates these vascular features as a guide for artery-vein classification in OCT angiography (OCTA) of the human retina. Depth-resolved OCT reveals unique features of retinal arteries and veins. Retinal arteries show hyper-reflective boundaries at both the upper (inner side, towards the vitreous) and lower (outer side, towards the choroid) walls. In contrast, retinal veins reveal hyper-reflectivity at the upper boundary only. Uniform lumen intensity was observed in both small and large arteries. However, the vein lumen intensity was dependent on the vessel size. Small veins exhibit a hyper-reflective zone at the bottom half of the lumen, while large veins show a hypo-reflective zone at the bottom half of the lumen.

preprint | 2022 | arXiv

Normalized Blood Flow Index in Optical Coherence Tomography Angiography Provides a Sensitive Biomarker of Early Diabetic Retinopathy

Purpose: To evaluate the sensitivity of normalized blood flow index (NBFI) for detecting early diabetic retinopathy (DR). Methods: Optical coherence tomography angiography (OCTA) images of 30 eyes from 20 healthy controls, 21 eyes of diabetic patients with no DR (NoDR) and 26 eyes from 22 patients with mild non-proliferative DR (NPDR) were analyzed in this study. The OCTA images were centered on the fovea and covered a 6 mm x 6 mm area. Enface projections of the superficial vascular plexus (SVP) and the deep capillary plexus (DCP) were obtained for the quantitative OCTA feature analysis. Three quantitative OCTA features were examined: blood vessel density (BVD), blood flow flux (BFF), and normalized blood flow index (NBFI). Each feature was calculated from both the SVP and DCP, and its sensitivity in distinguishing the three cohorts of the study was evaluated. Results: The only quantitative feature that was capable of distinguishing between all three cohorts was NBFI in the DCP image. Comparative study revealed that both BVD and BFF were able to distinguish the controls from NoDR and mild NPDR. However, neither BVD nor BFF was sensitive enough to separate NoDR from the healthy controls. Conclusion: The NBFI has been demonstrated as a sensitive biomarker of early DR, revealing retinal blood flow abnormality better than traditional BVD and BFF. The NBFI in the DCP was verified as the most sensitive biomarker, supporting that diabetes affects the DCP earlier than the SVP in DR.
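As a rough illustration of the simplest of the three features, the sketch below computes blood vessel density from a binarized vessel map. It assumes the common definition of BVD as the fraction of image pixels classified as vessel. The abstract does not give the formula, so this definition, the function name, and the toy mask are all assumptions. BFF and NBFI are not shown because their definitions are not stated here.

```python
def blood_vessel_density(vessel_mask):
    """Fraction of pixels marked as vessel in a binary 2D mask (assumed BVD definition)."""
    total_pixels = sum(len(row) for row in vessel_mask)
    if total_pixels == 0:
        raise ValueError("mask must not be empty")
    vessel_pixels = sum(1 for row in vessel_mask for pixel in row if pixel)
    return vessel_pixels / total_pixels


# Toy 2x4 mask: 1 marks a vessel pixel, 0 marks background.
mask = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
bvd = blood_vessel_density(mask)
print(bvd)
```

In a study like this one, such a value would be computed separately for the SVP and DCP projections of each eye.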

preprint | 2020 | arXiv

AV-Net: Deep learning for fully automated artery-vein classification in optical coherence tomography angiography

This study demonstrates deep learning for automated artery-vein (AV) classification in optical coherence tomography angiography (OCTA). The AV-Net, a fully convolutional network (FCN) based on a modified U-shaped CNN architecture, incorporates enface OCT and OCTA to differentiate arteries and veins. For the multi-modal training process, the enface OCT works as a near-infrared fundus image to provide vessel intensity profiles, and the OCTA contains blood flow strength and vessel geometry features. A transfer learning process is also integrated to compensate for the limited size of available OCTA datasets, as OCTA is a relatively new imaging modality. By providing an average accuracy of 86.75%, the AV-Net promises a fully automated platform to foster clinical deployment of differential AV analysis in OCTA.