Source author record

Yifeng Yu

Yifeng Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

math.AP, math.DS, Machine Learning, math.NA, math.PR, physics.flu-dyn, Artificial Intelligence, Computation and Language, math.MG, math.OC, math.ST, nlin.PS, Statistics Theory

Catalog footprint

What is connected

18 works

13 topics

4 close collaborators

preprint, 2026, arXiv

BubbleSpec: Turning Long-Tail Bubbles into Speculative Rollout Drafts for Synchronous Reinforcement Learning

Reinforcement Learning (RL) has become a cornerstone for improving the performance of Large Language Models (LLMs). However, its rollout phase constitutes a significant efficiency bottleneck, mainly arising from the long-tail bubbles across data parallel ranks, particularly in long-context scenarios where faster GPUs remain idle while waiting for stragglers. Existing solutions, such as partial rollout or asynchronous RL, mitigate these bubbles by compromising the algorithm's strict synchronous nature. We propose BubbleSpec, a novel framework that accelerates RL rollouts while strictly preserving mathematical exactness. Rather than attempting to eliminate bubbles, BubbleSpec exploits them: it uses the idle time windows of faster ranks to pre-generate rollout results for subsequent steps, which then serve as drafts for speculative decoding. Unlike prior speculative methods that rely on historical epoch similarity and warm-ups, BubbleSpec is agnostic to dataset size and provides immediate acceleration from the onset of training. Extensive evaluations demonstrate that BubbleSpec reduces decoding steps by 50% and increases rollout throughput by up to 1.8x. Critically, BubbleSpec is seamlessly compatible with various RL frameworks and strategies because it preserves the strict synchronous property of RL algorithms.
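
As a rough illustration of how a pre-generated draft can be checked without changing the output, the sketch below shows generic greedy draft verification. It is not BubbleSpec's algorithm: the names `verify_draft` and `toy_next_token` are hypothetical, the toy model stands in for an LLM, and a real system would score all draft positions in one parallel forward pass rather than one token at a time.

```python
from typing import Callable, List, Sequence


def toy_next_token(context: Sequence[int]) -> int:
    """Hypothetical stand-in for a greedy target model: last token plus one, mod 10."""
    return (context[-1] + 1) % 10


def verify_draft(
    prefix: Sequence[int],
    draft: Sequence[int],
    next_token: Callable[[Sequence[int]], int],
) -> List[int]:
    """Greedy draft verification (illustrative only).

    Walk through the draft; at each position emit the target model's own
    token. Stop after the first position where the draft disagrees, so the
    emitted tokens are exactly what plain greedy decoding would produce.
    """
    context = list(prefix)
    for drafted in draft:
        target = next_token(context)
        context.append(target)  # always keep the target model's token
        if target != drafted:   # first mismatch: discard the rest of the draft
            break
    return context[len(prefix):]


accepted = verify_draft([0], [1, 2, 9, 4], toy_next_token)
```

In this toy run the first two draft tokens agree with the target model, the third does not, and the target's own token replaces it, so the draft only affects how many tokens are confirmed per call, not which tokens are produced.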

preprint, 2026, arXiv

Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees

Score-based diffusion models have become a powerful framework for generative modeling, with score estimation as a central statistical bottleneck. Existing guarantees for score estimation largely focus on light-tailed targets or rely on restrictive assumptions such as compact support, which are often violated by heavy-tailed data in practice. In this work, we study conventional (Gaussian) score-based diffusion models when the target distribution is heavy-tailed and belongs to a Sobolev class with smoothness parameter $β>0$. We consider both exponential and polynomial tail decay, indexed by a tail parameter $γ$. Using kernel density estimation, we derive sharp minimax rates for score estimation, revealing a qualitative dichotomy: under exponential tails, the rate matches the light-tailed case up to polylogarithmic factors, whereas under polynomial tails the rate depends explicitly on $γ$. We further provide sampling guarantees for the associated continuous reverse dynamics. In total variation, the generated distribution converges at the minimax optimal rate $n^{-β/(2β+d)}$ under exponential tails (up to logarithmic factors), and at a $γ$-dependent rate under polynomial tails. Whether the latter sampling rate is minimax optimal remains an open question. These results characterize the statistical limits of score estimation and the resulting sampling accuracy for heavy-tailed targets, extending diffusion theory beyond the light-tailed setting.
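
To get a feel for the rate $n^{-β/(2β+d)}$ quoted above, the following minimal sketch evaluates it numerically. The function name `minimax_rate` is hypothetical, and constants and logarithmic factors are ignored, so the numbers indicate scaling only.

```python
def minimax_rate(n: int, beta: float, d: int) -> float:
    """Evaluate n^(-beta / (2*beta + d)), ignoring constants and log factors.

    n    : sample size
    beta : Sobolev smoothness parameter (beta > 0)
    d    : data dimension
    """
    if n <= 0 or beta <= 0 or d <= 0:
        raise ValueError("n, beta and d must all be positive")
    return n ** (-beta / (2.0 * beta + d))


# Same sample size and smoothness, increasing dimension: the rate degrades.
rates_by_dim = {d: minimax_rate(10_000, 1.0, d) for d in (2, 8, 32)}
```

With $β=1$ and $d=2$ the exponent is $-1/4$, so ten thousand samples give a rate of order $0.1$; larger $d$ pushes the exponent toward zero, which is the usual curse of dimensionality.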

preprint, 2026, arXiv

InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition

Upweighting high-quality data in LLM pretraining often improves performance, but in data-limited regimes, especially under overtraining, stronger upweighting increases repetition and can degrade performance. However, standard scaling laws do not reliably extrapolate across mixture recipes or under repetition, making the selection of optimal data recipes at scale underdetermined. To solve this, we introduce InfoLaw (Information Scaling Laws), a data-aware scaling framework that predicts loss from consumed tokens, model size, data mixture weights, and repetition. The key idea is to model pretraining as information accumulation, where quality controls information density and repetition induces scale-dependent diminishing returns. We first collect model performance after training on datasets that vary in scale, quality distribution, and repetition level. We then build an information model so that the accumulated information accurately predicts this performance. InfoLaw predicts performance on unseen data recipes and larger scale runs (up to 7B, 425B tokens) with 0.15% mean and 0.96% max absolute error in loss, and it extrapolates reliably across overtraining levels, enabling efficient data-recipe selection under varying compute budgets.

preprint, 2026, arXiv

On the Limits of Latent Reuse in Diffusion Models

Diffusion models are often trained in low-dimensional latent spaces, which are then reused for related but shifted datasets. In this work, we study when such latent reuse remains reliable under distribution shift. We consider a source-target setting in which both datasets are approximately low-dimensional but may lie near different subspaces. We show that freezing and reusing a source latent space induces a target-domain score error governed by two quantities: the principal-angle misalignment between the source and target subspaces, and the target ambient noise amplified by the diffusion time scale. Motivated by these limits, we further study mixed source-target training and characterize how the required shared latent dimension depends on the relative geometry of the two distributions. Our results provide theoretical guidance on when latent reuse is reliable and when learning a shared representation may be necessary.

preprint, 2022, arXiv

Differentiability of effective fronts in the continuous setting in two dimensions

We study the effective front associated with first-order front propagations in two dimensions ($n=2$) in the periodic setting with continuous coefficients. Our main result says that the boundary of the effective front is differentiable at every irrational point. Equivalently, the stable norm associated with a continuous $\mathbb{Z}^2$-periodic Riemannian metric is differentiable at irrational points. This conclusion was obtained decades ago for smooth metrics ([3,5]). To the best of our knowledge, our result provides the first nontrivial property of the effective fronts in the continuous setting, which is the standard assumption in the PDE theory. Combined with the sufficiency result in [12], our result implies that for continuous coefficients, a polygon can be an effective front if and only if it is centrally symmetric with rational vertices and nonempty interior.

preprint, 2022, arXiv

Optimal convergence rate for periodic homogenization of convex Hamilton-Jacobi equations

In this paper, we show that the rate of convergence in periodic homogenization of convex Hamilton-Jacobi equations is always $O(\varepsilon)$, which is optimal. This is a natural extension of a result concerning stable norms in metric geometry [4] that is essentially equivalent to the homogenization of convex static Hamilton-Jacobi equations. Another extremely interesting question in this direction is whether the $O(\varepsilon)$ rate holds in the nonconvex setting. We present a special nonconvex example with $O(\varepsilon)$ convergence rate, which relies on identifying the shape of the effective Hamiltonian and game theory interpretation formulas.

preprint, 2022, arXiv

Remarks on optimal rates of convergence in periodic homogenization of linear elliptic equations in non-divergence form

We study and characterize the optimal rates of convergence in periodic homogenization of linear elliptic equations in non-divergence form. We obtain that the optimal rate of convergence is either $O(\varepsilon)$ or $O(\varepsilon^2)$ depending on the diffusion matrix $A$, source term $f$, and boundary data $g$. Moreover, we show that the set of diffusion matrices $A$ that give optimal rate $O(\varepsilon)$ is open and dense in the set of $C^{2,α}$ periodic, symmetric, and positive definite matrices, which means that generically, the optimal rate is $O(\varepsilon)$.

preprint, 2020, arXiv

High Degeneracy of Effective Hamiltonian in Two Dimensions

Consider the effective Hamiltonian $\overline H(p)$ associated with the mechanical Hamiltonian $H(p,x)={1\over 2}|p|^2+V(x)$. We prove that for generic $V$, $\overline H$ is piecewise 1d in a dense open set in two dimensions using Aubry-Mather theory.

preprint, 2016, arXiv

Ballistic Orbits and Front Speed Enhancement for ABC Flows

We study the two main types of trajectories of the ABC flow in the near-integrable regime: spiral orbits and edge orbits. The former are helical orbits which are perturbations of similar orbits that exist in the integrable regime, while the latter exist only in the non-integrable regime. We prove existence of ballistic (i.e., linearly growing) spiral orbits by using the contraction mapping principle in the Hamiltonian formulation, and we also find and analyze ballistic edge orbits. We discuss the relationship of existence of these orbits with questions concerning front propagation in the presence of flows, in particular, the question of linear (i.e., maximal possible) front speed enhancement rate for ABC flows.

preprint, 2016, arXiv

Periodic orbits of the ABC flow with $A=B=C=1$

In this paper, we prove that the ODE system $$ \begin{aligned} \dot x &=\sin z+\cos y\\ \dot y &= \sin x+\cos z\\ \dot z &=\sin y + \cos x, \end{aligned} $$ whose right-hand side is the Arnold-Beltrami-Childress (ABC) flow with parameters $A=B=C=1$, has periodic orbits on $(2π\mathbb T)^3$ with rotation vectors parallel to $(1,0,0)$, $(0,1,0)$, and $(0,0,1)$. An application of this result is that the well-known G-equation model for turbulent combustion with this ABC flow on $\mathbb R^3$ has a linear (i.e., maximal possible) flame speed enhancement rate as the amplitude of the flow grows.
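
For readers who want to experiment with this system, here is a minimal sketch of its right-hand side together with a classical fourth-order Runge-Kutta step. The names `abc_field`, `rk4_step`, and `integrate`, the step size, and the starting point are illustrative assumptions; nothing here reproduces the paper's proof or locates the periodic orbits it establishes.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def abc_field(p: Vec3) -> Vec3:
    """Right-hand side of the ABC flow with A = B = C = 1, as in the abstract."""
    x, y, z = p
    return (
        math.sin(z) + math.cos(y),
        math.sin(x) + math.cos(z),
        math.sin(y) + math.cos(x),
    )


def _shift(p: Vec3, k: Vec3, h: float) -> Vec3:
    """Return p + h * k componentwise."""
    return (p[0] + h * k[0], p[1] + h * k[1], p[2] + h * k[2])


def rk4_step(p: Vec3, h: float) -> Vec3:
    """Advance the state by one classical RK4 step of size h."""
    k1 = abc_field(p)
    k2 = abc_field(_shift(p, k1, h / 2))
    k3 = abc_field(_shift(p, k2, h / 2))
    k4 = abc_field(_shift(p, k3, h))
    return (
        p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        p[2] + h / 6 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]),
    )


def integrate(p0: Vec3, h: float, steps: int) -> Vec3:
    """Apply `steps` RK4 steps starting from p0 (coordinates are not wrapped)."""
    p = p0
    for _ in range(steps):
        p = rk4_step(p, h)
    return p


# Arbitrary illustrative choices of starting point, step size, and duration.
end_point = integrate((0.1, 0.2, 0.3), h=0.01, steps=1000)
```

Coordinates are deliberately left unwrapped so that displacement over time can be read off directly; dividing the displacement by the elapsed time gives a crude numerical estimate of a rotation vector for the chosen trajectory.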

preprint, 2015, arXiv

Performance analysis and signal design for a stationary signalized ring road

Existing methods for traffic signal design are either too simplistic to capture realistic traffic characteristics or too complicated to be mathematically tractable. In this study, we attempt to fill the gap by presenting a new method based on the LWR model for performance analysis and signal design in a stationary signalized ring road. We first solve the link transmission model to obtain an equation for the boundary flow in stationary states, which are defined to be time-periodic solutions in both flow-rate and density with a period of the cycle length. We then derive an explicit macroscopic fundamental diagram (MFD), in which the average flow-rate in stationary states is a function of both traffic density and signal settings. Finally, we present simple formulas for optimal cycle lengths under five levels of congestion with a start-up lost time. With numerical examples we verify our analytical results and discuss the existence of near-optimal cycle lengths. This study lays the foundation for future studies on performance analysis and signal design for more general urban networks based on the kinematic wave theory.

preprint, 2015, arXiv

Some inverse problems in periodic homogenization of Hamilton-Jacobi equations

We look at the effective Hamiltonian $\bar H$ associated with the Hamiltonian $H(p,x)=H(p)+V(x)$ in the periodic homogenization theory. Our central goal is to understand the relation between $V$ and $\bar H$. We formulate some inverse problems concerning this relation. Inverse problems of this type are in general very challenging. In the paper, we discuss several special cases in both convex and nonconvex settings.

preprint, 2013, arXiv

Stochastic homogenization of a nonconvex Hamilton-Jacobi equation

We present a proof of qualitative stochastic homogenization for a nonconvex Hamilton-Jacobi equation. The new idea is to introduce a family of "sub-equations" and to control solutions of the original equation by the maximal subsolutions of the latter, which have deterministic limits by the subadditive ergodic theorem and maximality.

preprint, 2012, arXiv

A Numerical Study of Turbulent Flame Speeds of Curvature and Strain G-equations in Cellular Flows

We study front speeds of curvature and strain G-equations arising in turbulent combustion. These G-equations are Hamilton-Jacobi type level set partial differential equations (PDEs) with non-coercive Hamiltonians and degenerate nonlinear second order diffusion. The Hamiltonian of strain G-equation is also non-convex. Numerical computation is performed based on monotone discretization and weighted essentially nonoscillatory (WENO) approximation of transformed G-equations on a fixed periodic domain. The advection field in the computation is a two dimensional Hamiltonian flow consisting of a periodic array of counter-rotating vortices, or cellular flows. Depending on whether the evolution is predominantly in the hyperbolic or parabolic regimes, suitable explicit and semi-implicit time stepping methods are chosen. The turbulent flame speeds are computed as the linear growth rates of large time solutions. A new nonlinear parabolic PDE is proposed for the reinitialization of level set functions to prevent piling up of multiple bundles of level sets on the periodic domain. We found that the turbulent flame speed $s_T$ of the curvature G-equation is enhanced as the intensity $A$ of cellular flows increases, at a rate between those of the inviscid and viscous G-equations. The $s_T$ of the strain G-equation increases in small $A$, decreases in larger $A$, then drops down to zero at a large enough but finite value $A_{*}$. The flame front ceases to propagate at this critical intensity $A_*$, and is quenched by the cellular flow.

preprint, 2012, arXiv

Nonuniqueness of infinity ground states

In this paper, we construct a dumbbell domain for which the associated principal $\infty$-eigenvalue is not simple. This gives a negative answer to the outstanding problem posed by Juutinen-Lindqvist-Manfredi ("The $\infty$-eigenvalue problem", Arch. Ration. Mech. Anal. 148, 1999, no.2, 89-105). It remains a challenge to determine whether simplicity holds for convex domains.

preprint, 2012, arXiv

Turbulent Flame Speeds of G-equation Models in Unsteady Cellular Flows

We perform a computational study of front speeds of G-equation models in time dependent cellular flows. The G-equations arise in premixed turbulent combustion, and are Hamilton-Jacobi type level set partial differential equations (PDEs). The curvature-strain G-equations are also non-convex with degenerate diffusion. The computation is based on monotone finite difference discretization and weighted essentially nonoscillatory (WENO) methods. We found that the large time front speeds lock into the frequency of time periodic cellular flows in curvature-strain G-equations, similar to what occurs in the basic inviscid G-equation. However, this frequency locking phenomenon disappears in the viscous G-equation, and in the inviscid G-equation if time periodic oscillation of the cellular flow is replaced by time stochastic oscillation.

preprint, 2011, arXiv

Analysis and Comparison of Large Time Front Speeds in Turbulent Combustion Models

Predicting turbulent flame speed (the large time front speed) is a fundamental problem in turbulent combustion theory. Several models have been proposed to study the turbulent flame speed, such as the G-equations, the F-equations (Majda-Souganidis model) and reaction-diffusion-advection (RDA) equations. In the first part of this paper, we show that flow induced strain reduces front speeds of G-equations in periodic compressible and shear flows. The F-equations arise in asymptotic analysis of reaction-diffusion-advection equations and are quadratically nonlinear analogues of the G-equations. In the second part of the paper, we compare asymptotic growth rates of the turbulent flame speeds from the G-equations, the F-equations and the RDA equations in the large amplitude ($A$) regime of spatially periodic flows. The F and G equations share the same asymptotic front speed growth rate; in particular, the same sublinear growth law $A\over \log(A)$ holds in cellular flows. Moreover, in two space dimensions, if one of these three models (G-equation, F-equation and the RDA equation) predicts the bending effect (sublinear growth in the large flow), so will the other two. The nonoccurrence of speed bending is characterized by the existence of periodic orbits on the torus and the property of their rotation vectors in the advective flow fields. The cat's eye flow is discussed as a typical example of directional dependence of the front speed bending. The large time front speeds of the viscous F-equation have the same growth rate as those of the inviscid F and G-equations in two dimensional periodic incompressible flows.

preprint, 2009, arXiv

Uniqueness of values of Aronsson operators and running costs in "tug-of-war" games

Let $A_H$ be the Aronsson operator associated with a Hamiltonian $H(x,z,p).$ Aronsson operators arise from $L^\infty$ variational problems, two person game theory, control problems, etc. In this paper, we prove, under suitable conditions, that if $u\in W^{1,\infty}_{\rm loc}(Ω)$ is simultaneously a viscosity solution of both of the equations $A_H(u)=f(x)$ and $A_H(u)=g(x)$ in $Ω$, where $f, g\in C(Ω),$ then $f=g.$ The assumption $u\in W^{1,\infty}_{\rm loc}(Ω)$ can be relaxed to $u\in C(Ω)$ in many interesting situations. Also, we prove that if $f,g,u\in C(Ω)$ and $u$ is simultaneously a viscosity solution of the equations ${Δ_\infty u\over |Du|^2}=-f(x)$ and ${Δ_{\infty}u\over |Du|^2}=-g(x)$ in $Ω$ then $f=g.$ This answers a question posed in Peres, Schramm, Sheffield and Wilson [PSSW] concerning whether or not the value function uniquely determines the running cost in the "tug-of-war" game.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Source provenance

Where this author record came from

arxiv, confidence 95%

external id: arxiv:2605.08862:author:5:yifeng-yu

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.13448:author:1:yifeng-yu

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.02364:author:8:yifeng-yu

Imported May 20, 2026. Synced May 20, 2026.

Close collaborators

Hung V. Tran (5 works)

Jack Xin (5 works)

Andrej Zlatoš (2 works)

Yu-Yu Liu (2 works)
