Researcher profile

Yucheng Guo

Yucheng Guo contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
10 topics
4 close collaborators

Published work

10 published items

preprint, 2026, arXiv

AdaptiveLoad: Towards Efficient Video Diffusion Transformer Training

In video generation models, particularly world models, training large-scale video diffusion Transformers (such as DiT and MMDiT) poses significant computational challenges due to the extreme variance in sequence lengths within mixed-mode datasets. Existing bucket-based data loading strategies typically rely on "equal token length" constraints. This approach fails to account for the quadratic complexity of self-attention mechanisms, leading to severe load imbalance and underutilization of GPU resources. This paper proposes AdaptiveLoad, an integrated optimization framework consisting of two core components: (1) A dual-constraint adaptive load balancing system, which eliminates long-sequence bottlenecks by simultaneously limiting memory consumption and computational load ($B \times S^p \le M_{\text{comp}}$); (2) A fused LayerNorm-Modulate CUDA kernel, which utilizes a D-tile coalesced reduction strategy to increase throughput and alleviate memory pressure. Experimental results on the Wan 2.1 world model demonstrate that our method reduces the computational imbalance rate from 39% to 18.9%, improves peak VRAM utilization efficiency by 22.7%, and achieves an overall training throughput increase of 27.2%.
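
The dual constraint described in this abstract can be illustrated with a minimal sketch. It assumes the memory constraint takes the token-budget form $B \times S \le M_{\text{mem}}$ (the abstract only spells out the compute constraint), and the names `max_batch_size`, `mem_budget`, `comp_budget` and the default exponent `p = 2` are hypothetical, not taken from the paper.

```python
def max_batch_size(seq_len: int, mem_budget: float, comp_budget: float,
                   p: float = 2.0) -> int:
    """Largest batch size B with B * S <= mem_budget and B * S**p <= comp_budget.

    Assumptions (not from the paper): memory scales linearly with tokens,
    compute scales as S**p, and every bucket keeps at least one sample.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    by_memory = int(mem_budget // seq_len)         # equal-token-length limit
    by_compute = int(comp_budget // seq_len ** p)  # attention-cost limit
    return max(1, min(by_memory, by_compute))


if __name__ == "__main__":
    # Illustrative budgets only: 65536 tokens of memory, 2**28 compute units.
    for s in (1024, 4096, 8192, 16384):
        print(s, max_batch_size(s, mem_budget=65536, comp_budget=2 ** 28))
```

With these illustrative budgets, short sequences are capped by the memory term, while sequences of 8192 tokens and longer are capped by the compute term, which is the case an equal-token-length rule alone would miss.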

preprint, 2026, arXiv

D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models

The rapid evolution of Embodied AI has enabled Vision-Language-Action (VLA) models to excel in multimodal perception and task execution. However, applying Reinforcement Learning (RL) to these massive models in large-scale distributed environments faces severe systemic bottlenecks, primarily due to the resource conflict between high-fidelity physical simulation and the intensive VRAM/bandwidth demands of deep learning. This conflict often leaves overall throughput constrained by execution-phase inefficiencies. To address these challenges, we propose D-VLA, a high-concurrency, low-latency distributed RL framework for large-scale embodied foundation models. D-VLA introduces "Plane Decoupling," physically isolating high-frequency training data from low-frequency weight control to eliminate interference between simulation and optimization. We further design a four-thread asynchronous "Swimlane" pipeline, enabling full parallel overlap of sampling, inference, gradient computation, and parameter distribution. Additionally, a dual-pool VRAM management model and topology-aware replication resolve memory fragmentation and optimize communication efficiency. Experiments on benchmarks like LIBERO show that D-VLA significantly outperforms mainstream RL frameworks in throughput and sampling efficiency for billion-parameter VLA models. In trillion-parameter scalability tests, our framework maintains exceptional stability and linear speedup, providing a robust system for high-performance general-purpose embodied agents.
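
The four-stage overlap described in this abstract follows the general shape of a producer-consumer pipeline. The sketch below is a generic illustration of that shape using Python threads and bounded queues; it is not D-VLA's implementation, and the stage names, the `run_pipeline` helper, and the queue size are all hypothetical. Python threads here only show the structure, not real compute parallelism.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def _worker(fn, inbox, outbox):
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def _tag(name):
    """Placeholder stage: records that the stage processed the item."""
    def fn(item):
        item["log"].append(name)
        return item
    return fn


def run_pipeline(num_steps):
    """Push num_steps items through four concurrently running stages."""
    # Bounded queues add back-pressure so no stage runs far ahead of the next.
    q1, q2, q3 = (queue.Queue(maxsize=2) for _ in range(3))
    q_out = queue.Queue()

    def sample():
        for step in range(num_steps):
            q1.put({"step": step, "log": ["sample"]})
        q1.put(_DONE)

    threads = [
        threading.Thread(target=sample),
        threading.Thread(target=_worker, args=(_tag("infer"), q1, q2)),
        threading.Thread(target=_worker, args=(_tag("grad"), q2, q3)),
        threading.Thread(target=_worker, args=(_tag("distribute"), q3, q_out)),
    ]
    for t in threads:
        t.start()
    results = []
    while (item := q_out.get()) is not _DONE:  # drain before joining
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Each stage starts work on the next item as soon as it hands the current one downstream, so sampling for step `k + 1` can overlap with gradient computation for step `k`.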

preprint, 2026, arXiv

RPO: Fine-Tuning Visual Generative Models via Rich Vision-Language Preferences

Traditional preference tuning methods for LLMs/Visual Generative Models often rely solely on reward model labeling, which can be opaque, offer limited insights into the rationale behind preferences, and are prone to issues such as reward hacking or overfitting. We introduce Rich Preference Optimization (RPO), a novel pipeline that leverages rich feedback signals from Vision Language Models (VLMs) to improve the curation of preference pairs for fine-tuning visual generative models like text-to-image diffusion models. Our approach begins with prompting VLMs to generate detailed critiques of synthesized images, from which we further prompt VLMs to extract reliable and actionable image editing instructions. By implementing these instructions, we create refined images, resulting in synthetic, informative preference pairs that serve as enhanced tuning datasets. We demonstrate the effectiveness of our pipeline and the resulting datasets in fine-tuning state-of-the-art diffusion models.
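
The critique, edit, and pair-building steps in this abstract can be sketched as a small data flow. Everything below is a hypothetical illustration: the four callables stand in for the image generator, the VLM critique and instruction-extraction prompts, and the image editor, and treating the refined image as the preferred ("chosen") sample is an assumption drawn from the abstract's wording rather than a confirmed detail.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class PreferencePair:
    prompt: str
    chosen: Any    # refined image (assumed to be the preferred sample)
    rejected: Any  # original synthesized image


def build_preference_pairs(
    prompts: List[str],
    generate: Callable[[str], Any],            # hypothetical: prompt -> image
    critique: Callable[[str, Any], str],       # hypothetical: VLM critique
    extract_edits: Callable[[str], List[str]], # hypothetical: critique -> edits
    apply_edits: Callable[[Any, List[str]], Any],  # hypothetical: image editor
) -> List[PreferencePair]:
    """Turn prompts into (refined, original) preference pairs."""
    pairs = []
    for prompt in prompts:
        image = generate(prompt)
        feedback = critique(prompt, image)
        instructions = extract_edits(feedback)
        if not instructions:
            continue  # no actionable edit, so no informative pair
        refined = apply_edits(image, instructions)
        pairs.append(PreferencePair(prompt, chosen=refined, rejected=image))
    return pairs
```

Passing the models in as callables keeps the sketch independent of any particular VLM or diffusion library.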

preprint, 2023, arXiv

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation. However, identifying novel drug combinations through wet-lab experiments is resource intensive due to the vast combinatorial search space. Recently, computational approaches, specifically deep learning models, have emerged as an efficient way to discover synergistic combinations. While previous methods reported fair performance, their models usually do not take advantage of multi-modal data and are unable to handle new drugs or cell lines. In this study, we collected data from various datasets covering various drug-related aspects. Then, we take advantage of large-scale pre-training models to generate informative representations and features for drugs, proteins, and diseases. Based on that, a message-passing graph is built on top to propagate information together with graph structure learning flexibility. This is first introduced in the biological networks and enables us to generate pseudo-relations in the graph. Our framework achieves state-of-the-art results in comparison with other deep learning-based methods on synergistic prediction benchmark datasets. It can also run inference on new drug combination data: in a test on an independent set released by AstraZeneca, a 10% improvement over previous methods is observed. In addition, the framework is robust against unseen drugs and surpasses the second-best model by almost 15% in AUROC. We believe our framework contributes to both the future wet-lab discovery of novel drugs and the building of promising guidance for precise combination medicine.

preprint, 2022, arXiv

Complex structure due to As bonding and interplay with electronic structure in superconducting BaNi2As2

BaNi2As2 is a superconductor chemically related to the Fe-based superconductors, with a complex and poorly understood structural phase transition. We show based on first principles calculations that in fact there are two distinct competing structures. These structures are very different from electronic, transport and bonding points of view but are close in energy. These arise due to complex As bonding patterns and drive distortions of the Ni layers. This is supported by photoemission experiments. This leads to an interplay of electronic and structural behavior including induced anisotropy of the electronic transport. The competition between these distortions is associated with the complex behavior observed in BaNi2As2 samples.

preprint, 2020, arXiv

Metal enrichment in the circumgalactic medium and Lyα haloes around quasars at $z\sim3$

Deep observations have detected extended Lyα emission nebulae surrounding tens of quasars at redshift 2 to 6. However, the metallicity of such extended haloes is still poorly understood. We perform a detailed analysis on a large sample of 80 quasars at $z\sim3$ based on MUSE-VLT data. We find clear evidence of extended emission of the UV nebular lines such as CIV or HeII for about 20% of the sample, while CIII] is only marginally detected in a few objects. By stacking the cubes we detect emission of CIV, HeII and CIII] out to a radius of about 45 kpc. CIV and HeII show a radial decline much steeper than Lyα, while CIII] shows a shallower profile similar to Lyα in the inner 45 kpc. We infer that the average metallicity of the circumgalactic gas within the central 30-50 kpc is $\sim$0.5 solar, or even higher. However, we also find evidence of a component of the Lyα haloes, which has much weaker metal emission lines relative to Lyα. We suggest that the high metallicity of the circumgalactic medium within the central 30-50 kpc is associated with chemical pre-enrichment by past quasar-driven outflows and that there is a more extended component of the CGM that has much lower metallicity and is likely associated with near-pristine gas accreted from the intergalactic medium. We show that our observational results are in good agreement with the expectations of the FABLE zoom-in cosmological simulations.

preprint, 2020, arXiv

Orbital-collaborative Charge Density Wave in Monolayer VTe2

Charge density waves in transition metal dichalcogenides have been intensively studied for their close correlation with Mott insulator, charge-transfer insulator, and superconductor. Monolayer VTe2 has recently attracted attention because of its prominent electron correlations and the mysterious origin of its CDW orders. As a metal hosting more than one type of charge density wave, it involves complicated electron-electron and electron-phonon interactions. Through a scanning tunneling microscopy study, we observed triple-Q 4-by-4 and single-Q 4-by-1 modulations with significant charge and orbital separation. The triple-Q 4-by-4 order arises strongly from the p-d hybridized states, resulting in a charge distribution in agreement with the V-atom clustering model. Associated with a lower Fermi level, the local single-Q 4-by-1 electronic pattern is generated with the p-d hybridized states remaining 4-by-4 ordered. In the spectroscopic study, orbital- and atomic-selective charge-density-wave gaps with sizes up to ~400 meV were resolved on the atomic scale.

preprint, 2020, arXiv

The Sloan Digital Sky Survey Reverberation Mapping Project: Photometric g and i Light Curves

The Sloan Digital Sky Survey Reverberation Mapping (SDSS-RM) program monitors 849 active galactic nuclei (AGN) both spectroscopically and photometrically. The photometric observations used in this work span over four years and provide an excellent baseline for variability studies of these objects. We present the photometric light curves from 2014 to 2017 obtained by the Steward Observatory's Bok telescope and the CFHT telescope with MegaCam. We provide details on the data acquisition and processing of the data from each telescope, the difference imaging photometry used to produce the light curves, and the calculation of a variability index to quantify each AGN's variability. We find that the Welch-Stetson J-index provides a useful characterization of AGN variability and can be used to select AGNs for further study.
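
As a rough illustration of the variability index mentioned in this abstract, the sketch below computes the J index in the form commonly attributed to Stetson (1996). The abstract does not state how observations are paired or weighted, so the choices here (consecutive observations paired, equal weights, inverse-variance weighted mean) and the function name `stetson_j` are assumptions.

```python
import math


def stetson_j(mags, errs):
    """Stetson J index for a single-band light curve.

    Assumptions: consecutive observations form pairs, a leftover
    observation is treated as a single, and all weights equal 1.
    """
    n = len(mags)
    if n < 2 or n != len(errs):
        raise ValueError("need at least two points with matching errors")
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    scale = math.sqrt(n / (n - 1))
    # Normalized residuals delta_i = sqrt(n / (n - 1)) * (m_i - mean) / sigma_i
    delta = [scale * (m - mean) / e for m, e in zip(mags, errs)]
    # P_k = delta_i * delta_j for a pair, delta_i**2 - 1 for a single point
    products = [delta[i] * delta[i + 1] for i in range(0, n - 1, 2)]
    if n % 2:
        products.append(delta[-1] ** 2 - 1.0)
    # J = mean of sgn(P_k) * sqrt(|P_k|)
    return sum(math.copysign(math.sqrt(abs(p)), p) for p in products) / len(products)
```

Correlated departures from the mean within a pair push J positive, while uncorrelated noise scatters around zero, which is why the index is useful for ranking objects by variability.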

preprint, 2020, arXiv

The Third Data Release of the Beijing-Arizona Sky Survey

The Beijing-Arizona Sky Survey (BASS) is a wide and deep imaging survey to cover a 5400 deg$^2$ area in the Northern Galactic Cap with the 2.3m Bok telescope using two filters ($g$ and $r$ bands). The Mosaic $z$-band Legacy Survey (MzLS) covers the same area in $z$ band with the 4m Mayall telescope. These two surveys will be used for spectroscopic targeting of the Dark Energy Spectroscopic Instrument (DESI). The BASS survey observations were completed in 2019 March. This paper describes the third data release (DR3) of BASS, which contains the photometric data from all BASS and MzLS observations between 2015 January and 2019 March. The median astrometric precision relative to Gaia positions is about 17 mas and the median photometric offset relative to the PanSTARRS1 photometry is within 5 mmag. The median $5\sigma$ AB magnitude depths for point sources are 24.2, 23.6, and 23.0 mag for $g$, $r$, and $z$ bands, respectively. The photometric depth within the survey area is highly homogeneous, with the difference between the 20% and 80% depth less than 0.3 mag. The DR3 data, including raw data, calibrated single-epoch images, single-epoch photometric catalogs, stacked images, and co-added photometric catalogs, are publicly accessible at http://batc.bao.ac.cn/BASS/doku.php?id=datarelease:home.

preprint, 2019, arXiv

The Sloan Digital Sky Survey Reverberation Mapping Project: Initial CIV Lag Results from Four Years of Data

We present reverberation-mapping lags and black-hole mass measurements using the CIV 1549 broad emission line from a sample of 349 quasars monitored as a part of the Sloan Digital Sky Survey Reverberation Mapping Project. Our data span four years of spectroscopic and photometric monitoring for a total baseline of 1300 days. We report significant time delays between the continuum and the CIV 1549 emission line in 52 quasars, with an estimated false-positive detection rate of 10%. Our analysis of marginal lag measurements indicates that there are on the order of 100 additional lags that should be recoverable by adding more years of data from the program. We use our measurements to calculate black-hole masses and fit an updated CIV radius-luminosity relationship. Our results significantly increase the sample of quasars with CIV RM results, with the quasars spanning two orders of magnitude in luminosity toward the high-luminosity end of the CIV radius-luminosity relation. In addition, these quasars are located at among the highest redshifts (z~1.4-2.8) of quasars with black hole masses measured with reverberation mapping. This work constitutes the first large sample of CIV reverberation-mapping measurements in more than a dozen quasars, demonstrating the utility of multi-object reverberation mapping campaigns.
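
For readers unfamiliar with how a measured lag becomes a black-hole mass, the sketch below applies the standard virial relation used in reverberation mapping, $M_{\rm BH} = f\,c\,\tau\,\Delta V^2 / G$, with the lag $\tau$ taken in the quasar rest frame. The function name, the input values in the usage note, and the default scale factor `f = 1.0` are illustrative assumptions; the abstract does not give the scale factor or line-width measure adopted in the paper.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
DAY = 86_400.0       # seconds per day


def virial_bh_mass(lag_days_obs, z, line_width_kms, f=1.0):
    """Virial black-hole mass in solar masses.

    lag_days_obs   observed-frame lag in days
    z              quasar redshift (converts the lag to the rest frame)
    line_width_kms broad-line velocity width in km/s
    f              dimensionless virial scale factor (assumed, default 1)
    """
    tau_rest = lag_days_obs / (1.0 + z) * DAY  # rest-frame lag, seconds
    radius = C * tau_rest                      # broad-line region size, m
    dv = line_width_kms * 1.0e3                # velocity width, m/s
    return f * radius * dv ** 2 / (G * M_SUN)
```

For example, an observed lag of 300 days at z = 2 (a 100-day rest-frame lag) with a 3000 km/s line width and `f = 1` gives a mass on the order of 10^8 solar masses.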