Source author record

Wenpin Tang

Wenpin Tang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.PR, Machine Learning, Artificial Intelligence, math.ST, Statistics Theory, econ.GN, q-fin.EC, econ.TH, math.AP, math.CO, math.OC, q-fin.GN

Catalog footprint

What is connected

17 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning

We propose DiFFPO, Diffusion Fast and Furious Policy Optimization, a unified framework for training masked diffusion large language models (dLLMs) to reason not only better (furious), but also faster via reinforcement learning (RL). We first unify existing baseline approaches such as d1 by proposing to train surrogate policies via off-policy RL, whose likelihood is much more tractable as an approximation to the true dLLM policy. This naturally motivates a more accurate and informative two-stage likelihood approximation combined with importance sampling correction, which leads to generalized RL algorithms with better sample efficiency and superior task performance. Second, we propose a new direction of jointly training efficient samplers/controllers of the dLLM policy. Via RL, we incentivize dLLMs' natural multi-token prediction capabilities by letting the model learn to adaptively allocate an inference threshold for each prompt. By jointly training the sampler, we achieve better accuracy with fewer function evaluations (NFEs) than training the model alone, obtaining the best performance in improving the Pareto frontier of the inference-time compute of dLLMs. We showcase the effectiveness of our pipeline by training open-source large diffusion language models on benchmark math and planning tasks.

preprint · 2026 · arXiv

Improved techniques for fine-tuning flow models via adjoint matching: a deterministic control pipeline

We propose a deterministic adjoint matching framework that formulates human preference alignment for flow-based generative models as an optimal control problem over velocity fields. One can directly regress the control toward a value-gradient-induced target under the current policy, leading to a simple and stable training objective. Building on this perspective, we introduce a truncated adjoint scheme that focuses computation on the terminal portion of the trajectory, where reward-relevant signals concentrate, which yields substantial computational savings while preserving alignment quality. We further generalize the framework beyond standard KL-based regularization, allowing more flexible trade-offs between alignment strength and distributional preservation. Experiments on SiT-XL/2 and FLUX.2-Klein-4B demonstrate consistent gains across multiple alignment metrics, along with substantially improved diversity and mode preservation.

preprint · 2026 · arXiv

RPO: Fine-Tuning Visual Generative Models via Rich Vision-Language Preferences

Traditional preference tuning methods for LLMs/Visual Generative Models often rely solely on reward model labeling, which can be opaque, offer limited insight into the rationale behind preferences, and be prone to issues such as reward hacking or overfitting. We introduce Rich Preference Optimization (RPO), a novel pipeline that leverages rich feedback signals from Vision Language Models (VLMs) to improve the curation of preference pairs for fine-tuning visual generative models like text-to-image diffusion models. Our approach begins with prompting VLMs to generate detailed critiques of synthesized images, from which we further prompt VLMs to extract reliable and actionable image editing instructions. By implementing these instructions, we create refined images, resulting in synthetic, informative preference pairs that serve as enhanced tuning datasets. We demonstrate the effectiveness of our pipeline and the resulting datasets in fine-tuning state-of-the-art diffusion models.

preprint · 2026 · arXiv

Tweedie's Formulae and Diffusion Generative Models Beyond Gaussian

Diffusion models have achieved remarkable success in generating samples from unknown data distributions. Most popular stochastic differential equation-based diffusion models perturb the target distribution by adding Gaussian noise, transforming it into a simple prior, and then use denoising score matching, a consequence of Tweedie's formula, to learn the score function and generate clean samples from noise. However, non-Gaussian diffusion models with state-dependent diffusion coefficient have been largely underexplored, as have the corresponding Tweedie's formulae. In this work, we extend Tweedie's formula to important non-Gaussian processes, including geometric Brownian motion (GBM), squared Bessel (BESQ) processes, and Cox-Ingersoll-Ross (CIR) processes, thereby yielding the corresponding denoising score-matching objectives. We then apply the derived formulae to image and financial time series generation using GBM- and CIR-based diffusion models, and to empirical Bayes estimation under the BESQ setting. The reported experimental results demonstrate the potential of non-Gaussian models.

preprint · 2024 · arXiv

Polynomial Voting Rules

We propose and study a new class of polynomial voting rules for a general decentralized decision/consensus system, and more specifically for the PoS (Proof of Stake) protocol. The main idea, inspired by the Penrose square-root law and the more recent quadratic voting rule, is to differentiate a voter's voting power from the voter's share (fraction of the total in the system). We show that while voter shares form a martingale process that converges to a Dirichlet distribution, their voting powers follow a super-martingale process that decays to zero over time. This prevents any voter from controlling the voting process, and thus enhances security. For both limiting results, we also provide explicit rates of convergence. When the initial total volume of votes (or stakes) is large, we show a phase transition in share stability (or the lack thereof), corresponding to the voter's initial share relative to the total. We also study the scenario in which trading (of votes/stakes) among the voters is allowed, and quantify the level of risk sensitivity (or risk aversion) in three categories, corresponding to the voter's utility being a super-martingale, a sub-martingale, and a martingale. For each category, we identify the voter's best strategy in terms of participation and trading.

preprint · 2023 · arXiv

Fixed-Domain Asymptotics Under Vecchia's Approximation of Spatial Process Likelihoods

Statistical modeling for massive spatial data sets has generated a substantial literature on scalable spatial processes based upon Vecchia's approximation. Vecchia's approximation for Gaussian process models enables fast evaluation of the likelihood by restricting dependencies at a location to its neighbors. We establish inferential properties of microergodic spatial covariance parameters within the paradigm of fixed-domain asymptotics when they are estimated using Vecchia's approximation. The conditions required to formally establish these properties are explored, theoretically and empirically, and the effectiveness of Vecchia's approximation is further corroborated from the standpoint of fixed-domain asymptotics.

preprint · 2023 · arXiv

One-dependent colorings of the star graph

This paper is concerned with symmetric $1$-dependent colorings of the $d$-ray star graph $\mathscr{S}^d$ for $d \ge 2$. We compute the critical point of the $1$-dependent hard-core processes on $\mathscr{S}^d$, which gives a lower bound for the number of colors needed for a $1$-dependent coloring of $\mathscr{S}^d$. We provide an explicit construction of a $1$-dependent $q$-coloring for any $q \ge 5$ of the infinite subgraph $\mathscr{S}^3_{(1,1,\infty)}$, which is symmetric in the colors and whose restriction to any path is some symmetric $1$-dependent $q$-coloring. We also prove that there is no such coloring of $\mathscr{S}^3_{(1,1,\infty)}$ with $q = 4$ colors. A list of open problems is presented.

preprint · 2022 · arXiv

Escaping Saddle Points Efficiently with Occupation-Time-Adapted Perturbations

Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively, and thus they are guaranteed to avoid getting stuck at non-degenerate saddle points. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, AMSGrad, and RMSProp.

preprint · 2022 · arXiv

Stability of shares in the Proof of Stake Protocol -- Concentration and Phase Transitions

This paper is concerned with the stability of shares in a cryptocurrency where the new coins are issued according to the Proof of Stake protocol. We identify large, medium and small investors under various rewarding schemes, and show that the limiting behaviors of these investors are different -- for large investors their shares are stable, while for medium to small investors their shares may be volatile or even shrink to zero. For instance, with a geometric reward there is chaotic centralization, where all the shares will eventually concentrate on one investor in a random manner. This leads to the phase transition phenomenon, and the thresholds for stability are characterized. In response to the increasing activities in blockchain networks, we also propose and analyze a dynamical population model for the PoS protocol, which allows the number of investors to grow over the time. Numerical experiments are provided to corroborate our theory.

preprint · 2021 · arXiv

Simulated annealing from continuum to discretization: a convergence analysis via the Eyring--Kramers law

We study the convergence rate of continuous-time simulated annealing $(X_t; \, t \ge 0)$ and its discretization $(x_k; \, k =0,1, \ldots)$ for approximating the global optimum of a given function $f$. We prove that the tail probability $\mathbb{P}(f(X_t) > \min f +δ)$ (resp. $\mathbb{P}(f(x_k) > \min f +δ)$) decays polynomially in time (resp. in cumulative step size), and provide an explicit rate as a function of the model parameters. Our argument applies recent developments on functional inequalities for the Gibbs measure at low temperatures -- the Eyring-Kramers law. In the discrete setting, we obtain a condition on the step size to ensure the convergence.

preprint · 2020 · arXiv

Arcsine laws for random walks generated from random permutations with applications to genomics

A classical result for the simple symmetric random walk with $2n$ steps is that the number of steps above the origin, the time of the last visit to the origin, and the time of the maximum height all have exactly the same distribution and converge when scaled to the arcsine law. Motivated by applications in genomics, we study the distributions of these statistics for the non-Markovian random walk generated from the ascents and descents of a uniform random permutation and a Mallows($q$) permutation and show that they have the same asymptotic distributions as for the simple random walk. We also give an unexpected conjecture, along with numerical evidence and a partial proof in special cases, for the result that the number of steps above the origin by step $2n$ for the uniform permutation generated walk has exactly the same discrete arcsine distribution as for the simple random walk, even though the other statistics for these walks have very different laws. We also give explicit error bounds to the limit theorems using Stein's method for the arcsine distribution, as well as functional central limit theorems and a strong embedding of the Mallows$(q)$ permutation which is of independent interest.

preprint · 2020 · arXiv

Parallel Search for Information

We consider the problem of a decision-maker searching for information on multiple alternatives when information is learned on all alternatives simultaneously. The decision-maker has a running cost of searching for information, and has to decide when to stop searching for information and choose one alternative. The expected payoff of each alternative evolves as a diffusion process when information is being learned. We present necessary and sufficient conditions for the solution, establishing existence and uniqueness. We show that the optimal boundary where search is stopped (free boundary) is star-shaped, and present an asymptotic characterization of the value function and the free boundary. We show properties of how the distance between the free boundary and the diagonal varies with the number of alternatives, and how the free boundary under parallel search relates to the one under sequential search, with and without economies of scale on the search costs.

preprint · 2015 · arXiv

Patterns in random walks and Brownian motion

We ask if it is possible to find some particular continuous paths of unit length in linear Brownian motion. Beginning with a discrete version of the problem, we derive the asymptotics of the expected waiting time for several interesting patterns. These suggest corresponding results on the existence/non-existence of continuous paths embedded in Brownian motion. With further effort we are able to prove some of these existence and non-existence results by various stochastic analysis arguments. A list of open problems is presented.

preprint · 2015 · arXiv

The Slepian zero set, and Brownian bridge embedded in Brownian motion by a spacetime shift

This paper is concerned with various aspects of the Slepian process $(B_{t+1} - B_t, t \ge 0)$ derived from a one-dimensional Brownian motion $(B_t, t \ge 0 )$. In particular, we offer an analysis of the local structure of the Slepian zero set $\{t : B_{t+1} = B_t \}$, including a path decomposition of the Slepian process for $0 \le t \le 1$. We also establish the existence of a random time $T$ such that $T$ falls in the Slepian zero set almost surely and the process $(B_{T+u} - B_T, 0 \le u \le 1)$ is a standard Brownian bridge.

preprint · 2015 · arXiv

The Vervaat transform of Brownian bridges and Brownian motion

For a continuous function $f \in \mathcal{C}([0,1])$, define the Vervaat transform $V(f)(t):=f(τ(f)+t \bmod 1)+f(1)1_{\{t+τ(f) \geq 1\}}-f(τ(f))$, where $τ(f)$ corresponds to the first time at which the minimum of $f$ is attained. Motivated by recent study of quantile transforms of random walks and Brownian motion, we investigate the Vervaat transform of Brownian motion and Brownian bridges with arbitrary endpoints. When the two endpoints of the bridge are not the same, the Vervaat transform is not Markovian. We describe its distribution by path decomposition and study its semi-martingale property. The same study is done for the Vervaat transform of unconditioned Brownian motion, the expectation and variance of which are also derived.
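
The definition in this abstract can be illustrated on a discretized path. The following is a minimal sketch, not code from the paper: the function name `vervaat_transform` and the assumption that the path is sampled on a uniform grid of $[0,1]$ are illustrative choices.

```python
def vervaat_transform(path):
    """Discrete analogue of V(f)(t) = f((tau + t) mod 1) + f(1) 1{t + tau >= 1} - f(tau).

    Assumption (illustrative): path[i] approximates f(i / n) for i = 0..n,
    with n = len(path) - 1. Returns n + 1 values approximating V(f)(i / n).
    """
    n = len(path) - 1
    if n < 1:
        raise ValueError("need at least two samples")
    # tau: first grid index at which the minimum of the path is attained
    # (min returns the first minimal element, matching "first time").
    tau = min(range(n + 1), key=lambda i: path[i])
    out = []
    for i in range(n + 1):
        j = tau + i
        if j >= n:
            # t + tau >= 1: wrap around to the start of the path and add f(1)
            out.append(path[j - n] + path[n] - path[tau])
        else:
            out.append(path[j] - path[tau])
    return out


# A bridge-like path (equal endpoints) is mapped to a nonnegative excursion.
bridge = [0, 1, -2, -1, 0]
excursion = vervaat_transform(bridge)

# A path with different endpoints: the transform starts at 0 and ends at f(1).
walk = [0, 1, -1, 0.5, 2]
shifted = vervaat_transform(walk)
```

For the bridge-like input, the output starts and ends at 0 and stays nonnegative, which is the classical picture of the transform turning a bridge into an excursion. For unequal endpoints the output ends at the path's terminal value, the case the abstract notes is not Markovian.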

preprint · 2014 · arXiv

A note on the frontier of a branching reflected Brownian motion

In this note, we study the asymptotic frontier behavior of a branching reflected Brownian motion. There is essentially no difference in maximal displacement between a branching Brownian motion and its reflected counterpart. We provide two proofs of this fact, one via a soft argument on the dependence of two-sided extremal particles in a branching Brownian motion and the other based on direct computations as in Roberts. The asymptotics of minimal displacement is also given.

preprint · 2013 · arXiv

On Vervaat transform of Brownian bridges and Brownian motion

For a continuous function $f \in \mathcal{C}([0,1])$, define the Vervaat transform $V(f)(t):=f(τ(f)+t \bmod 1)+f(1)1_{\{t+τ(f) \geq 1\}}-f(τ(f))$, where $τ(f)$ corresponds to the first time at which the minimum of $f$ is attained. Motivated by recent study of quantile transforms for random walks and Brownian motion, we study the Vervaat transform of Brownian motion and Brownian bridges with arbitrary endpoints. When the two endpoints of the bridge are not the same, the Vervaat transform is not Markovian. We describe its distribution by path decompositions and study its semimartingale properties. The expectation and variance of the Vervaat transform of Brownian motion are also derived.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint