Source author record

Ruixiang Zhang

Ruixiang Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: math.CA, Machine Learning, math.AP, math.DG, Computer Vision, math-ph, math.CO, math.MP, Artificial Intelligence, Computation and Language, gr-qc, math.NT, math.ST, Multimedia, Robotics, Software Engineering, Statistics Theory

Catalog footprint

16 works

17 topics

4 close collaborators


preprint, 2026, arXiv

Primal Generation, Dual Judgment: Self-Training from Test-Time Scaling

Code generation is typically trained in the primal space of programs: a model produces a candidate solution and receives sparse execution feedback, often a single pass/fail bit. Test-time scaling enriches the inference procedure by sampling multiple candidates and judging among them, but the comparative information this process reveals is discarded after inference. We argue that this information defines a dual judgment space that provides a far richer training signal: the model learns not from an isolated success or failure, but from the relative correctness structure across its own plausible attempts, identifying which succeed, which fail, and what distinguishes them. We introduce DuST (Dual Self-Training), a framework for self-training from the dual judgment space. DuST samples candidate programs from the model's own distribution, labels them through sandbox execution, retains groups containing both successes and failures, and trains the model to rank candidates by execution correctness using GRPO. The objective is purely discriminative: the model is never directly rewarded for generating correct programs. Dual self-training improves both judgment and generation. Across five models spanning two families and three scales (4B to 30B), DuST consistently improves Best-of-4 test-time scaling on LiveCodeBench. For Qwen3-30B-Thinking on LiveCodeBench v6, judgment quality improves by +6.2 NDCG, single-sample pass@1 improves by +3.1, and Best-of-4 accuracy improves by +4.1. The trained model's single rollout matches the base model's Best-of-4 performance. SFT on the same ranking data improves judgment without improving generation, confirming that on-policy RL is the mechanism that transfers dual-space learning back into primal generation.
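The group-selection and evaluation steps described in this abstract can be sketched in a few lines. This is a minimal illustration, not the DuST implementation: the function names, the 0/1 pass labels, and the binary-relevance NDCG formula are assumptions, since the abstract does not give the exact metric definition or the GRPO reward.

```python
import math

def keep_mixed_groups(groups):
    """Keep groups with at least one passing (1) and one failing (0) candidate."""
    return [g for g in groups if any(g) and not all(g)]

def ndcg(labels_in_ranked_order):
    """NDCG for binary labels listed in the judge's ranked order (best first).

    Assumed formula: gain = label, discount = 1 / log2(position + 1).
    """
    def dcg(labels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(labels))

    ideal = dcg(sorted(labels_in_ranked_order, reverse=True))
    return dcg(labels_in_ranked_order) / ideal if ideal > 0 else 0.0

def best_of_n(scores, labels):
    """Execution outcome (0/1) of the candidate the judge scores highest."""
    top = max(range(len(scores)), key=lambda i: scores[i])
    return labels[top]

# Hypothetical sandbox outcomes for three prompts, four samples each.
groups = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 1]]
mixed = keep_mixed_groups(groups)  # only the third group carries a ranking signal
```

Groups in which every candidate passes or every candidate fails contain no relative-correctness information, which is why only mixed groups are retained for ranking.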

preprint, 2024, arXiv

Oscillatory integral operators on manifolds and related Kakeya and Nikodym problems

We consider Carleson-Sjölin operators on Riemannian manifolds that arise naturally from the study of Bochner-Riesz problems on manifolds. They are special cases of Hörmander-type oscillatory integral operators. We obtain improved $L^p$ bounds for Carleson-Sjölin operators in two cases: the case where the underlying manifold has constant sectional curvature and the case where the manifold satisfies Sogge's chaotic curvature condition. The two results rely on very different methods. To prove the former result, we show that on a Riemannian manifold, the distance function satisfies Bourgain's condition if and only if the manifold has constant sectional curvature. To obtain the second result, we introduce the notion of "contact orders" for Hörmander-type oscillatory integral operators, prove that if a Hörmander-type oscillatory integral operator has finite contact order, then it always has better $L^p$ bounds than the "worst cases" (in the spirit of Bourgain and Guth, and Guth, Hickman and Iliopoulou), and finally verify that for Riemannian manifolds that satisfy Sogge's chaotic curvature condition, the distance functions always have finite contact orders. As byproducts, we obtain new bounds for Nikodym maximal functions on manifolds of constant sectional curvature.

preprint, 2022, arXiv

Learning Representation from Neural Fisher Kernel with Low-rank Approximation

In this paper, we study the representation of neural networks from the view of kernels. We first define the Neural Fisher Kernel (NFK), which is the Fisher Kernel applied to neural networks. We show that NFK can be computed for both supervised and unsupervised learning models, which can serve as a unified tool for representation extraction. Furthermore, we show that practical NFKs exhibit low-rank structures. We then propose an efficient algorithm that computes a low rank approximation of NFK, which scales to large datasets and networks. We show that the low-rank approximation of NFKs derived from unsupervised generative models and supervised learning models gives rise to high-quality compact representations of data, achieving competitive results on a variety of machine learning tasks.

preprint, 2022, arXiv

Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation

LiDAR sensors are essential to the perception systems of autonomous vehicles and intelligent robots. To meet real-time requirements in real-world applications, LiDAR scans must be segmented efficiently. Most previous approaches directly project the 3D point cloud onto a 2D spherical range image so that they can use efficient 2D convolutional operations for image segmentation. Although these approaches achieve encouraging results, neighborhood information is not well preserved in the spherical projection. Moreover, temporal information is not taken into account in the single-scan segmentation task. To tackle these problems, we propose a novel approach to semantic segmentation for LiDAR sequences named Meta-RangeSeg, where a new range residual image representation is introduced to capture the spatial-temporal information. Specifically, Meta-Kernel is employed to extract the meta features, which reduces the inconsistency between the 2D range image coordinates input and 3D Cartesian coordinates output. An efficient U-Net backbone is used to obtain the multi-scale features. Furthermore, a Feature Aggregation Module (FAM) strengthens the role of the range channel and aggregates features at different levels. We have conducted extensive experiments for performance evaluation on SemanticKITTI and SemanticPOSS. The promising results show that our proposed Meta-RangeSeg method is more efficient and effective than the existing approaches. Our full implementation is publicly available at https://github.com/songw-zju/Meta-RangeSeg.
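The spherical range-image projection that this abstract attributes to earlier approaches can be sketched as follows. This is a generic illustration of the commonly used projection, not code from Meta-RangeSeg and not its range residual image; the function name, the 64 x 2048 image size, and the field-of-view values are assumptions chosen for the example.

```python
import math

def project_to_range_image(point, width=2048, height=64,
                           fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map a 3D point (x, y, z) to (row, col, range) in a spherical range image.

    All default parameters are illustrative assumptions, not values from the paper.
    """
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the sensor origin has no direction to project")

    yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
    pitch = math.asin(z / r)  # vertical angle in [-pi/2, pi/2]

    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down

    # Normalise both angles to [0, 1], then scale to pixel coordinates.
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    # Clamp so points on or outside the field of view land on a border pixel.
    col = min(width - 1, max(0, int(u)))
    row = min(height - 1, max(0, int(v)))
    return row, col, r

row, col, rng = project_to_range_image((1.0, 0.0, 0.0))
```

Two points that are far apart in 3D can land on adjacent pixels under this mapping, which is one way to see why neighborhood information is not well preserved.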

preprint, 2022, arXiv

On the multiparameter Falconer distance problem

We study an extension of the Falconer distance problem in the multiparameter setting. Let $\ell\geq 1$ and $\mathbb{R}^{d}=\mathbb{R}^{d_1}\times\cdots \times\mathbb{R}^{d_\ell}$ with $d_i\geq 2$. For any compact set $E\subset \mathbb{R}^{d}$ with Hausdorff dimension larger than $d-\frac{\min(d_i)}{2}+\frac{1}{4}$ if $\min(d_i)$ is even, or larger than $d-\frac{\min(d_i)}{2}+\frac{1}{4}+\frac{1}{4\min(d_i)}$ if $\min(d_i)$ is odd, we prove that the multiparameter distance set of $E$ has positive $\ell$-dimensional Lebesgue measure. A key ingredient in the proof is a new multiparameter radial projection theorem for fractal measures.
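The dimension threshold stated in this abstract is easy to evaluate for concrete factor dimensions. The sketch below only transcribes the formula from the abstract into exact rational arithmetic; the function name `falconer_threshold` is hypothetical.

```python
from fractions import Fraction

def falconer_threshold(dims):
    """Hausdorff-dimension threshold from the abstract for R^{d_1} x ... x R^{d_l}.

    dims is the list of factor dimensions d_i, each required to be at least 2.
    """
    if not dims or any(d < 2 for d in dims):
        raise ValueError("need at least one factor, each of dimension >= 2")
    d = sum(dims)  # ambient dimension
    m = min(dims)  # smallest factor dimension
    threshold = Fraction(d) - Fraction(m, 2) + Fraction(1, 4)
    if m % 2 == 1:
        # Extra term in the odd case.
        threshold += Fraction(1, 4 * m)
    return threshold

print(falconer_threshold([2, 2]))  # 13/4
print(falconer_threshold([3]))     # 11/6
```

For $\mathbb{R}^2\times\mathbb{R}^2$ this gives $4-1+\frac14=\frac{13}{4}$, and for a single factor $\mathbb{R}^3$ it gives $3-\frac32+\frac14+\frac1{12}=\frac{11}{6}$.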

preprint, 2022, arXiv

Rank-Constrained Least-Squares: Prediction and Inference

In this work, we focus on the high-dimensional trace regression model with a low-rank coefficient matrix. We establish a nearly optimal in-sample prediction risk bound for the rank-constrained least-squares estimator under no assumptions on the design matrix. Lying at the heart of the proof is a covering number bound for the family of projection operators corresponding to the subspaces spanned by the design. By leveraging this complexity result, we perform a power analysis for a permutation test on the existence of a low-rank signal under the high-dimensional trace regression model. We show that the permutation test based on the rank-constrained least-squares estimator achieves non-trivial power with no assumptions on the minimum (restricted) eigenvalue of the covariance matrix of the design. Finally, we use alternating minimization to approximately solve the rank-constrained least-squares problem to evaluate its empirical in-sample prediction risk and power of the resulting permutation test in our numerical study.

preprint, 2022, arXiv

The Brascamp-Lieb inequality and its influence on Fourier analysis

Brascamp-Lieb inequalities have been important in analysis, mathematical physics and neighboring areas. Recently, these inequalities have had a deep influence on Fourier analysis and, in particular, on Fourier restriction theory. In this article we motivate and explain this connection. A lot of our examples are taken from a rapidly developing subarea called "decoupling". It is the author's hope that this article will be accessible to graduate students in fields broadly related to analysis.

preprint, 2021, arXiv

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

AI safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an important natural problem is how to reliably verify the model's prediction. In this paper, we propose a novel framework, deep verifier networks (DVN), to verify the inputs and outputs of deep discriminative models with deep generative models. Our proposed model is based on conditional variational auto-encoders with disentanglement constraints. We give both intuitive and theoretical justifications of the model. Our verifier network is trained independently of the prediction model, which eliminates the need to retrain the verifier network for a new model. We test the verifier network on out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation. We achieve state-of-the-art results in all of these problems.

preprint, 2020, arXiv

A sharp square function estimate for the cone in $\mathbb{R}^3$

We prove a sharp square function estimate for the cone in $\mathbb{R}^3$ and consequently the local smoothing conjecture for the wave equation in $2+1$ dimensions.

preprint, 2020, arXiv

Perceptual Generative Autoencoders

Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce the consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.

preprint, 2020, arXiv

Polynomial Blow-up Upper Bounds for the Einstein-scalar field System Under Spherical Symmetry

For general gravitational collapse, inside the black-hole region, singularities $(r=0)$ may arise. In this article, we aim to answer how strong these singularities could be. We analyse the behaviours of various geometric quantities. In particular, we show that in the most singular scenario, the Kretschmann scalar obeys polynomial blow-up upper bounds $O(1/r^N)$. This improves on the previously best-known double-exponential upper bounds $O\big(\exp\exp(1/r)\big)$. Our result is sharp in the sense that there are known examples showing that no sub-polynomial upper bound could hold. Finally, we present a case study on perturbations of the Schwarzschild solution.

preprint, 2019, arXiv

Lower bounds for estimates of the Schrödinger maximal function

We give new lower bounds for $L^p$ estimates of the Schrödinger maximal function by generalizing an example of Bourgain.

preprint, 2016, arXiv

The Least Number with Prescribed Legendre Symbols

In this article we estimate the number of integers up to $X$ which can be represented by a positive-definite, binary integral quadratic form of discriminant which is small relative to $X$. This follows from understanding the vector of signs when computing the Legendre symbol of small integers $n$ at multiple primes.

preprint, 2015, arXiv

Bounds of incidences between points and algebraic curves

We prove new bounds on the number of incidences between points and higher degree algebraic curves. The key ingredient is an improved initial bound, which is valid for all fields. Then we apply the polynomial method to obtain global bounds on $\mathbb{R}$ and $\mathbb{C}$.

preprint, 2014, arXiv

On sharp local turns of planar polynomials

We show that for a real polynomial of degree $n$ in two variables $x$ and $y$, any local "sharp turn" must have its "size" $\gtrsim e^{-Cn^{2}}$. We also show that there is indeed an example that has a sharp turn of size $\lesssim e^{-Cn}$. This gives a quite satisfactory answer to a problem raised by Guth. The problem was inspired by applications of the polynomial method in the study of the Kakeya conjecture.

preprint, 2014, arXiv

Polynomials with dense zero sets and discrete models of the Kakeya conjecture and the Furstenberg set problem

We prove the discrete analogue of the Kakeya conjecture over $\mathbb{R}^n$. This result suggests that a (hypothetically) low dimensional Kakeya set cannot be constructed directly from discrete configurations. We also prove a generalization which completely solves the discrete analogue of the Furstenberg set problem in all dimensions. The difference between our theorems and the (true) problems is only the (still difficult) issue of continuity since no transversality-at-incidences assumptions are imposed. The main tool of the proof is a theorem of Wongkew which states that the zero set of a low degree polynomial cannot be too dense inside the unit cube, coupled with Dvir-type polynomial arguments. From the viewpoint of the proofs, we also state a conjecture that is stronger than and almost equivalent to the (lower) Minkowski version of the Kakeya conjecture and prove some results towards it. We also present our own version of the proof of Wongkew's theorem. Our proof shows that this theorem follows from a combination of properties of zero sets of polynomials and a general proposition about hypersurfaces which might be of independent interest. Finally, we discuss how to generalize Bourgain's conjecture to high dimensions, which is closely related to the theme here.
