Source author record

Xin Zheng

Xin Zheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning, Artificial Intelligence, Computation and Language, physics.atom-ph, physics.optics, quant-ph, Computer Vision, Methodology, physics.ins-det, Robotics, Software Engineering, Sound

Catalog footprint

What is connected

13 works

12 topics

4 close collaborators

preprint, 2026, arXiv

Bridging Sequence and Graph Structure for Epigenetic Age Prediction

Epigenetic clocks based on DNA methylation have emerged as powerful tools for estimating biological age, with broad applications in aging research, age-related disease studies, and longevity science. Despite advances across machine learning approaches to epigenetic age prediction, spanning penalised linear regression, deep feedforward networks, residual architectures, and graph neural networks, no existing method jointly models co-methylation graph structure and site-specific DNA sequence context within a unified framework. We propose a unified sequence-graph integration framework for epigenetic age prediction that addresses this gap, integrating eight-dimensional DNA sequence statistical features through a lightweight gated modulation mechanism that adaptively scales each site's methylation signal according to its sequence-determined biological relevance prior to graph convolution. Evaluated on 3,707 blood methylation samples against a comprehensive set of baselines, our method achieves a test MAE of 3.149 years, a 12.8% improvement over the strongest graph-based baseline. Biologically informed statistical features outperform CNN-based sequence encoding, demonstrating that handcrafted sequence features are more effective than end-to-end learned representations in this data regime. Post-hoc interpretability analysis identifies CpG density and local adenine frequency as features with age-dependent importance shifts, consistent with known mechanisms of age-related hypermethylation at CpG-dense promoter regions. Our code is at https://github.com/yaoli2022/graphage-seq.

preprint, 2026, arXiv

ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models

Code sandboxes have emerged as a critical infrastructure for advancing the coding capabilities of large language models, providing verifiable feedback for both RL training and evaluation. However, existing systems fail to provide accurate verification and efficiency under high-concurrency workloads. We present ScaleBox, a high-fidelity and scalable system designed to address these limitations in large-scale code training. ScaleBox introduces automated special-judge generation and management, fine-grained parallel execution across test cases with seamless multi-node coordination, and a configuration-driven evaluation suite for reproducible benchmarking. A series of experiments demonstrates that ScaleBox significantly enhances code verification accuracy and efficiency. Our further RLVR experiments show that ScaleBox substantially improves both performance on LiveCodeBench and training stability, significantly outperforming heuristic-matching baselines. By providing a reliable and high-throughput infrastructure, ScaleBox facilitates more effective research and development in large-scale code training.

preprint, 2026, arXiv

ST-TGExplainer: Disentangling Stability and Transition Patterns for Temporal GNN Interpretability

Temporal graph neural networks (TGNNs) have gained significant traction for solving real-world temporal graph tasks. However, their interpretability remains limited, as most TGNNs fail to identify which historical interactions most influence a given prediction. Despite promising progress on interpretable TGNNs, existing methods predominantly focus on previously seen historical interactions, which we term stability patterns, while overlooking newly emerging first-time interactions, which we term transition patterns. Both types of patterns are essential for faithful temporal explanations. To address this limitation, we propose ST-TGExplainer, a self-explainable TGNN that disentangles Stability and Transition patterns in temporal graphs for a more faithful Temporal GNN Explainer. Guided by a disentangled information bottleneck objective, ST-TGExplainer learns a compact explanatory subgraph that remains predictive of the event label while explicitly suppressing label-conditioned redundancy between stability and transition patterns. Extensive experiments demonstrate that ST-TGExplainer achieves strong predictive performance and yields more faithful explanations. Code is available at https://github.com/hjchen-hdu/ST-TGExplainer.

preprint, 2022, arXiv

Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation

Recently, $k$NN-MT has shown a promising capability to combine a pre-trained neural machine translation (NMT) model directly with domain-specific token-level $k$-nearest-neighbor ($k$NN) retrieval, achieving domain adaptation without retraining. Despite being conceptually attractive, it relies heavily on high-quality in-domain parallel corpora, which limits its capability in unsupervised domain adaptation, where in-domain parallel corpora are scarce or nonexistent. In this paper, we propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval. To this end, we first introduce an autoencoder task based on the target language, and then insert lightweight adapters into the original NMT model to map the token-level representation of this task to the ideal representation of the translation task. Experiments on multi-domain datasets demonstrate that our proposed approach significantly improves translation accuracy with target-side monolingual data, while achieving performance comparable to back-translation.

preprint, 2021, arXiv

Efficient LiDAR Odometry for Autonomous Driving

LiDAR odometry plays an important role in self-localization and mapping for autonomous navigation and is usually treated as a scan registration problem. Although it achieves promising performance on the KITTI odometry benchmark, the conventional search-tree-based approach still has difficulty handling large-scale point clouds efficiently. The recent spherical range image-based method benefits from fast nearest-neighbor search through spherical mapping. However, it is not very effective at handling ground points that are nearly parallel to the LiDAR beams. To address these issues, we propose a novel, efficient LiDAR odometry approach that uses both a non-ground spherical range image and a bird's-eye-view map for ground points. Moreover, a range-adaptive method is introduced to robustly estimate the local surface normal. Additionally, a very fast and memory-efficient model update scheme is proposed to fuse the points and their corresponding normals at different timestamps. We have conducted extensive experiments on the KITTI odometry benchmark, and the promising results demonstrate that our proposed approach is effective.

preprint, 2021, arXiv

High precision differential clock comparisons with a multiplexed optical lattice clock

Rapid progress in the precision and accuracy of optical atomic clocks over the last decade has advanced the frontiers of timekeeping, metrology, and quantum science. However, the stabilities of most optical clocks remain limited by the local oscillator rather than the atoms themselves, leaving room for further progress. Here we implement a "multiplexed" one-dimensional optical lattice clock, in which spatially-resolved, movable ensembles of ultra-cold strontium atoms are trapped in the same optical lattice, interrogated simultaneously by a shared clock laser, and read out in parallel. By performing synchronized Ramsey interrogations of ensemble pairs we observe atom-atom coherence times up to 26 seconds, a 270-fold improvement over the atom-laser coherence time, demonstrate a relative stability of $9.7(4)\times10^{-18}/\sqrt{\tau}$ (where $\tau$ is the averaging time in seconds), and reach a fractional uncertainty of $8.9(3)\times 10^{-20}$ after 3.3 hours of averaging. These results demonstrate that applications requiring ultra-high-precision comparisons between optical atomic clocks need not be limited by the stability of the local oscillator. With multiple ensemble pairs, we realize a miniaturized clock network consisting of 6 atom ensembles, resulting in 15 unique pairwise clock comparisons with relative stabilities below $3\times10^{-17}/\sqrt{\tau}$. Finally, we demonstrate the capability to simultaneously load spatially-resolved, heterogeneous ensemble pairs of all four stable isotopes of strontium in a lattice. The unique capabilities offered by this platform pave the way for future studies of precision isotope shift measurements, spatially resolved characterization of limiting clock systematics, development of clock-based gravitational wave and dark matter detectors, and novel tests of relativity including measurements of the gravitational redshift at sub-centimeter scales.
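As a rough consistency check on the figures quoted in this abstract, the sketch below assumes that the instability averages down as white noise, sigma(tau) = sigma_1s / sqrt(tau), and counts the unordered pairs among 6 ensembles. The function and variable names are illustrative only and do not come from the paper.

```python
import math
from itertools import combinations


def averaged_instability(sigma_1s: float, tau_s: float) -> float:
    """Instability after tau_s seconds, assuming 1/sqrt(tau) averaging."""
    return sigma_1s / math.sqrt(tau_s)


# Values quoted in the abstract: 9.7e-18 at 1 s, 3.3 hours of averaging.
tau_s = 3.3 * 3600.0
sigma_3p3h = averaged_instability(9.7e-18, tau_s)

# Unordered pairs among 6 ensembles (hypothetical labels 0..5).
n_pairs = len(list(combinations(range(6), 2)))

print(f"instability after 3.3 h: {sigma_3p3h:.2e}")
print(f"pairwise comparisons: {n_pairs}")
```

Under that averaging assumption, the result after 3.3 hours is about 8.9e-20, in line with the reported fractional uncertainty, and 6 ensembles give the 15 pairwise comparisons mentioned above.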

preprint, 2021, arXiv

Towards Faithfulness in Open Domain Table-to-text Generation from an Entity-centric View

In open domain table-to-text generation, we notice that unfaithful generations usually contain hallucinated content that cannot be aligned to any input table record. We therefore evaluate generation faithfulness with two entity-centric metrics: table record coverage and the ratio of hallucinated entities in the text, both of which are shown to have strong agreement with human judgements. Based on these metrics, we then quantitatively analyze the correlation between training data quality and generation fidelity, which indicates the potential use of entity information in faithful generation. Motivated by these findings, we propose two methods for faithful generation: 1) augmented training that incorporates auxiliary entity information, including both an augmented plan-based model and an unsupervised model, and 2) training instance selection based on faithfulness ranking. We show that these approaches improve generation fidelity in both full-dataset and few-shot learning settings, according to both automatic and human evaluations.

preprint, 2020, arXiv

Impacts of random filling on spin squeezing via Rydberg dressing in optical clocks

We analyze spin squeezing via Rydberg dressing in optical lattice clocks with random fractional filling. We compare the achievable clock stability in different lattice geometries, including unity-filled tweezer clock arrays and fractionally filled lattice clocks with varying dimensionality. We provide practical considerations and useful tools in the form of approximate analytical expressions and fitting functions to aid in the experimental implementation of Rydberg-dressed spin squeezing. We demonstrate that spin squeezing via Rydberg dressing in one-, two-, and three-dimensional optical lattices can provide significant improvements in stability in the presence of random fractional filling.

preprint, 2020, arXiv

Quantile Treatment Effects and Bootstrap Inference under Covariate-Adaptive Randomization

In this paper, we study the estimation and inference of the quantile treatment effect under covariate-adaptive randomization. We propose two estimation methods: (1) the simple quantile regression and (2) the inverse propensity score weighted quantile regression. For the two estimators, we derive their asymptotic distributions uniformly over a compact set of quantile indexes, and show that, when the treatment assignment rule does not achieve strong balance, the inverse propensity score weighted estimator has a smaller asymptotic variance than the simple quantile regression estimator. For the inference of method (1), we show that the Wald test using a weighted bootstrap standard error under-rejects. But for method (2), its asymptotic size equals the nominal level. We also show that, for both methods, the asymptotic size of the Wald test using a covariate-adaptive bootstrap standard error equals the nominal level. We illustrate the finite sample performance of the new estimation and inference methods using both simulated and real datasets.

preprint, 2015, arXiv

Application of Deep Neural Network in Estimation of the Weld Bead Parameters

We present a deep learning approach to estimating the bead parameters in welding tasks. Our model is based on a four-hidden-layer neural network architecture. More specifically, the first three hidden layers of this architecture use the sigmoid function to produce their respective intermediate outputs. The last hidden layer, on the other hand, uses a linear transformation to generate the final output of the architecture. This transforms our deep network architecture from a classifier into a non-linear regression model. We compare the performance of our deep network with a selected number of results in the literature and show a considerable reduction in the estimation errors for these values. Furthermore, we show its scalability by estimating the weld bead parameters with the same level of accuracy on a combination of datasets that pertain to different welding techniques. This is a nontrivial result that is counter-intuitive to the general belief in this field of research.

preprint, 2015, arXiv

Black phosphorus as a new broadband saturable absorber for infrared passively Q-switched fiber lasers

Black phosphorus (BP), with its enticing electric and optical properties, is intensely researched in the field of optoelectronics. In this paper, Q-switched pulses at 1550 nm and 2 μm wavelengths are obtained by inserting a bulk-structured BP-based saturable absorber (SA) into an erbium-doped fiber laser (EDFL) and a thulium/holmium-doped fiber laser (THDFL), respectively. The BP-SA was prepared by depositing powdered BP material onto the flat side of a side-polished single-mode fiber. Q-switched 1550 nm pulses with widths tunable from 9.35 to 31 μs were obtained for the EDFL. For the THDFL, a wavelength range of over 100 nm, from 1832 to 1935 nm, could be achieved by adjusting the pump power. To the best of our knowledge, these results demonstrate the broadband saturable absorption property of BP and, for the first time, verify BP as a new two-dimensional material for applications in saturable absorption devices.

preprint, 2015, arXiv

Mid-infrared ultra-short mode-locked fiber laser utilizing topological insulator Bi2Te3 nano-sheets as the saturable absorber

The newly emergent two-dimensional topological insulators (TIs) have shown unique electronic and optical properties, such as good thermal management, a high nonlinear refractive index and an ultrafast relaxation time. Their narrow energy band gaps suggest that their optical absorption extends further into the mid-infrared region and that they could serve as very broadband light modulators ranging from the visible to the mid-infrared region. In this paper, a mid-infrared mode-locked fluoride fiber laser with TI Bi2Te3 nano-sheets as the saturable absorber is presented. Continuous wave lasing, Q-switched and continuous-wave mode-locking (CW-ML) operations of the laser are observed sequentially by increasing the pump power. The observed CW-ML pulse train has a pulse repetition rate of 10.4 MHz, a pulse width of ~6 ps, and a center wavelength of 2830 nm. The maximum achievable pulse energy is 8.6 nJ with average power up to 90 mW. This work clearly demonstrates the promising applications of two-dimensional TIs for ultra-short laser operation and nonlinear optics in the mid-infrared region.
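The pulse figures in this abstract can be cross-checked with the standard relation for a mode-locked pulse train, pulse energy = average power / repetition rate. The sketch below is only an illustration: the function names are hypothetical, and the peak-power line is a crude estimate that divides energy by pulse width and ignores any pulse-shape factor.

```python
def pulse_energy_j(avg_power_w: float, rep_rate_hz: float) -> float:
    """Energy per pulse for a pulse train: average power / repetition rate."""
    return avg_power_w / rep_rate_hz


# Values quoted in the abstract: 90 mW average power, 10.4 MHz, ~6 ps pulses.
energy_j = pulse_energy_j(90e-3, 10.4e6)
energy_nj = energy_j * 1e9

# Crude peak-power estimate (assumption: rectangular pulse, no shape factor).
peak_power_w = energy_j / 6e-12

print(f"pulse energy: {energy_nj:.2f} nJ")
print(f"rough peak power: {peak_power_w:.0f} W")
```

This gives roughly 8.65 nJ per pulse, consistent with the quoted maximum of 8.6 nJ, and a peak power on the order of 1.4 kW under the stated simplification.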

preprint, 2013, arXiv

Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition

In this paper, we first present a new variant of the Gaussian restricted Boltzmann machine (GRBM) called the multivariate Gaussian restricted Boltzmann machine (MGRBM), with its definition and learning algorithm. Then we propose using a learned GRBM or MGRBM to extract better features for robust speech recognition. Our experiments on Aurora2 show that both GRBM-extracted and MGRBM-extracted features perform much better than Mel-frequency cepstral coefficients (MFCC) with either an HMM-GMM or a hybrid HMM-deep neural network (DNN) acoustic model, and that the MGRBM-extracted features are slightly better.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Source provenance

Where this author record came from

arxiv, confidence 95%

external id: arxiv:2605.10541:author:5:xin-zheng

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.19822:author:2:xin-zheng

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2604.27467:author:2:xin-zheng

Imported May 20, 2026. Synced May 20, 2026.

Feng Xia, Researcher, 2 works

Hao Yu, Researcher, 2 works

Ke Yin, Researcher, 2 works

Shimon Kolkowitz, Researcher, 2 works
