Researcher profile

Ruixiang Zhang

Ruixiang Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
15 topics
4 close collaborators

Published work

12 published items

preprint, 2026, arXiv

Primal Generation, Dual Judgment: Self-Training from Test-Time Scaling

Code generation is typically trained in the primal space of programs: a model produces a candidate solution and receives sparse execution feedback, often a single pass/fail bit. Test-time scaling enriches the inference procedure by sampling multiple candidates and judging among them, but the comparative information this process reveals is discarded after inference. We argue that this information defines a dual judgment space that provides a far richer training signal: the model learns not from an isolated success or failure, but from the relative correctness structure across its own plausible attempts, identifying which succeed, which fail, and what distinguishes them. We introduce DuST (Dual Self-Training), a framework for self-training from the dual judgment space. DuST samples candidate programs from the model's own distribution, labels them through sandbox execution, retains groups containing both successes and failures, and trains the model to rank candidates by execution correctness using GRPO. The objective is purely discriminative: the model is never directly rewarded for generating correct programs. Dual self-training improves both judgment and generation. Across five models spanning two families and three scales (4B to 30B), DuST consistently improves Best-of-4 test-time scaling on LiveCodeBench. For Qwen3-30B-Thinking on LiveCodeBench v6, judgment quality improves by +6.2 NDCG, single-sample pass@1 improves by +3.1, and Best-of-4 accuracy improves by +4.1. The trained model's single rollout matches the base model's Best-of-4 performance. SFT on the same ranking data improves judgment without improving generation, confirming that on-policy RL is the mechanism that transfers dual-space learning back into primal generation.
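The group-filtering and ranking-quality steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `keep_mixed_groups` and `ndcg` are hypothetical, pass/fail labels are assumed to come from an external sandbox, and the NDCG variant (binary relevance, log2 discount) is an assumption, since the abstract does not specify one.

```python
import math
from typing import List, Sequence


def keep_mixed_groups(groups: Sequence[Sequence[bool]]) -> List[List[bool]]:
    """Keep only candidate groups that contain both a pass and a fail.

    Each group holds the execution labels (True = passed) of several
    candidate programs sampled for one problem.
    """
    return [list(g) for g in groups if any(g) and not all(g)]


def ndcg(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """NDCG of the ranking induced by descending scores.

    Assumes binary relevance (1 for a passing program, 0 otherwise)
    and the standard 1 / log2(rank + 1) discount with 1-based ranks.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    dcg = sum(float(labels[i]) / math.log2(pos + 2) for pos, i in enumerate(order))
    ideal = sorted((float(label) for label in labels), reverse=True)
    idcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

Groups where every candidate passes or every candidate fails carry no comparative signal, which is why only mixed groups are retained before a ranking objective is applied.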

preprint, 2024, arXiv

Oscillatory integral operators on manifolds and related Kakeya and Nikodym problems

We consider Carleson-Sjölin operators on Riemannian manifolds that arise naturally from the study of Bochner-Riesz problems on manifolds. They are special cases of Hörmander-type oscillatory integral operators. We obtain improved $L^p$ bounds of Carleson-Sjölin operators in two cases: The case where the underlying manifold has constant sectional curvature and the case where the manifold satisfies Sogge's chaotic curvature condition. The two results rely on very different methods: To prove the former result, we show that on a Riemannian manifold, the distance function satisfies Bourgain's condition if and only if the manifold has constant sectional curvature. To obtain the second result, we introduce the notion of "contact orders" to Hörmander-type oscillatory integral operators, prove that if a Hörmander-type oscillatory integral operator is of a finite contact order, then it always has better $L^p$ bounds than "worst cases" (in the spirit of Bourgain and Guth, and Guth, Hickman and Iliopoulou), and eventually verify that for Riemannian manifolds that satisfy Sogge's chaotic curvature condition, their distance functions always have finite contact orders. As byproducts, we obtain new bounds for Nikodym maximal functions on manifolds of constant sectional curvatures.

preprint, 2022, arXiv

Learning Representation from Neural Fisher Kernel with Low-rank Approximation

In this paper, we study the representation of neural networks from the view of kernels. We first define the Neural Fisher Kernel (NFK), which is the Fisher Kernel applied to neural networks. We show that NFK can be computed for both supervised and unsupervised learning models, which can serve as a unified tool for representation extraction. Furthermore, we show that practical NFKs exhibit low-rank structures. We then propose an efficient algorithm that computes a low rank approximation of NFK, which scales to large datasets and networks. We show that the low-rank approximation of NFKs derived from unsupervised generative models and supervised learning models gives rise to high-quality compact representations of data, achieving competitive results on a variety of machine learning tasks.

preprint, 2022, arXiv

Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation

LiDAR sensors are essential to the perception systems of autonomous vehicles and intelligent robots. To meet the real-time requirements of real-world applications, LiDAR scans must be segmented efficiently. Most previous approaches project the 3D point cloud directly onto a 2D spherical range image so that they can use efficient 2D convolutional operations for image segmentation. Although these approaches achieve encouraging results, neighborhood information is not well preserved in the spherical projection. Moreover, temporal information is not taken into account in the single-scan segmentation task. To tackle these problems, we propose a novel approach to semantic segmentation for LiDAR sequences named Meta-RangeSeg, in which a new range residual image representation is introduced to capture spatial-temporal information. Specifically, Meta-Kernel is employed to extract meta features, which reduces the inconsistency between the 2D range image coordinates of the input and the 3D Cartesian coordinates of the output. An efficient U-Net backbone is used to obtain multi-scale features. Furthermore, a Feature Aggregation Module (FAM) strengthens the role of the range channel and aggregates features at different levels. We have conducted extensive experiments for performance evaluation on SemanticKITTI and SemanticPOSS. The results show that the proposed Meta-RangeSeg method is more efficient and effective than existing approaches. Our full implementation is publicly available at https://github.com/songw-zju/Meta-RangeSeg.

preprint, 2022, arXiv

On the multiparameter Falconer distance problem

We study an extension of the Falconer distance problem in the multiparameter setting. Let $\ell\geq 1$ and $\mathbb{R}^{d}=\mathbb{R}^{d_1}\times\cdots \times\mathbb{R}^{d_\ell}$ with $d_i\geq 2$. For any compact set $E\subset \mathbb{R}^{d}$ with Hausdorff dimension larger than $d-\frac{\min(d_i)}{2}+\frac{1}{4}$ if $\min(d_i)$ is even, or $d-\frac{\min(d_i)}{2}+\frac{1}{4}+\frac{1}{4\min(d_i)}$ if $\min(d_i)$ is odd, we prove that the multiparameter distance set of $E$ has positive $\ell$-dimensional Lebesgue measure. A key ingredient in the proof is a new multiparameter radial projection theorem for fractal measures.
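The dimension threshold stated in this abstract is easier to read with numbers plugged in. The following minimal sketch evaluates it exactly; the function name `falconer_threshold` is hypothetical and the code only restates the formula from the abstract.

```python
from fractions import Fraction
from typing import Sequence


def falconer_threshold(dims: Sequence[int]) -> Fraction:
    """Dimension threshold from the abstract for R^{d_1} x ... x R^{d_l}.

    d = sum of the d_i and m = min(d_i). The threshold is
    d - m/2 + 1/4, plus an extra 1/(4m) when m is odd.
    """
    if not dims or any(d_i < 2 for d_i in dims):
        raise ValueError("need at least one factor, each with d_i >= 2")
    d = sum(dims)
    m = min(dims)
    threshold = Fraction(d) - Fraction(m, 2) + Fraction(1, 4)
    if m % 2 == 1:
        threshold += Fraction(1, 4 * m)
    return threshold
```

For example, with factors of dimension 2 and 3 the ambient dimension is 5 and the threshold is 17/4. With two factors of dimension 3 it is 29/6.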

preprint, 2022, arXiv

Rank-Constrained Least-Squares: Prediction and Inference

In this work, we focus on the high-dimensional trace regression model with a low-rank coefficient matrix. We establish a nearly optimal in-sample prediction risk bound for the rank-constrained least-squares estimator under no assumptions on the design matrix. Lying at the heart of the proof is a covering number bound for the family of projection operators corresponding to the subspaces spanned by the design. By leveraging this complexity result, we perform a power analysis for a permutation test on the existence of a low-rank signal under the high-dimensional trace regression model. We show that the permutation test based on the rank-constrained least-squares estimator achieves non-trivial power with no assumptions on the minimum (restricted) eigenvalue of the covariance matrix of the design. Finally, we use alternating minimization to approximately solve the rank-constrained least-squares problem to evaluate its empirical in-sample prediction risk and power of the resulting permutation test in our numerical study.
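The alternating minimization mentioned at the end of this abstract can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' code: the model is taken to be $y_i = \langle X_i, \Theta\rangle + \varepsilon_i$ with $\Theta = UV^\top$ of rank at most $r$, the names `rank_constrained_ls` and `in_sample_risk` are hypothetical, and the random initialization and fixed iteration count are arbitrary choices. It requires NumPy.

```python
import numpy as np


def rank_constrained_ls(X, y, r, n_iter=50, seed=0):
    """Approximately minimize sum_i (y_i - <X_i, U V^T>)^2 over U, V.

    X has shape (N, m, n), y has shape (N,). Returns an (m, n) matrix
    of rank at most r. The problem is non-convex, so this heuristic
    may stop at a local minimum.
    """
    rng = np.random.default_rng(seed)
    N, m, n = X.shape
    V = rng.standard_normal((n, r))
    for _ in range(n_iter):
        # Fix V: <X_i, U V^T> = <X_i V, U> is linear in U.
        A = (X @ V).reshape(N, m * r)
        U = np.linalg.lstsq(A, y, rcond=None)[0].reshape(m, r)
        # Fix U: <X_i, U V^T> = <X_i^T U, V> is linear in V.
        B = (X.transpose(0, 2, 1) @ U).reshape(N, n * r)
        V = np.linalg.lstsq(B, y, rcond=None)[0].reshape(n, r)
    return U @ V.T


def in_sample_risk(X, theta_hat, theta_true):
    """Mean squared difference of fitted and true signals on the design."""
    diff = np.einsum("imn,mn->i", X, theta_hat - theta_true)
    return float(np.mean(diff ** 2))
```

Each half-step is an ordinary least-squares solve, so the in-sample residual never increases from one step to the next, although convergence to the global minimizer is not guaranteed.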

preprint, 2022, arXiv

The Brascamp-Lieb inequality and its influence on Fourier analysis

Brascamp-Lieb inequalities have been important in analysis, mathematical physics and neighboring areas. Recently, these inequalities have had a deep influence on Fourier analysis and, in particular, on Fourier restriction theory. In this article we motivate and explain this connection. A lot of our examples are taken from a rapidly developing subarea called "decoupling". It is the author's hope that this article will be accessible to graduate students in fields broadly related to analysis.

preprint, 2021, arXiv

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

AI safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an important natural problem is how to reliably verify the model's prediction. In this paper, we propose a novel framework, deep verifier networks (DVN), to verify the inputs and outputs of deep discriminative models with deep generative models. Our proposed model is based on conditional variational auto-encoders with disentanglement constraints. We give both intuitive and theoretical justifications of the model. Our verifier network is trained independently of the prediction model, which eliminates the need to retrain the verifier network for a new model. We test the verifier network on out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation. We achieve state-of-the-art results on all of these problems.

preprint, 2020, arXiv

Perceptual Generative Autoencoders

Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and an arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.

preprint, 2020, arXiv

Polynomial Blow-up Upper Bounds for the Einstein-scalar field System Under Spherical Symmetry

For general gravitational collapse, inside the black-hole region, singularities $(r=0)$ may arise. In this article, we aim to answer how strong these singularities could be. We analyse the behaviours of various geometric quantities. In particular, we show that in the most singular scenario, the Kretschmann scalar obeys polynomial blow-up upper bounds $O(1/r^N)$. This improves previously best-known double-exponential upper bounds $O\big(\exp\exp(1/r)\big)$. Our result is sharp in the sense that there are known examples showing that no sub-polynomial upper bound could hold. Finally we do a case study on perturbations of the Schwarzschild solution.