Source author record

Kevin Tian

Kevin Tian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms math.OC Machine Learning Computation Discrete Mathematics math.NA math.PR math.ST Numerical Analysis Populations and Evolution quant-ph Statistics Theory

Catalog footprint

What is connected

12works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Perspectives on Stochastic Localization

We survey different perspectives on the stochastic localization process of Eldan, a powerful construction that has had many exciting recent applications in high-dimensional probability and algorithm design. Unlike prior surveys on this topic, our focus is on giving a self-contained presentation of all known alternative constructions of Eldan's stochastic localization, with an emphasis on connections between different constructions. Our hope is that by collecting these perspectives, some of which had primarily arisen within a particular community (e.g., probability theory, theoretical computer science, information theory, or machine learning), we can broaden the accessibility of stochastic localization, and ease its future use.

preprint2026arXiv

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called iterative polynomial filtering and show a number of striking applications for supervised learning with contamination: (1) We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as nasty noise). This is a surprising resolution to a longstanding gap between the complexity of agnostic learning and learning with contamination, as it was widely believed that low-degree approximators only implied tolerance to label noise. In particular, it implies the first efficient algorithm for learning halfspaces with $η$-bounded contamination up to error $2η+ε$ with respect to the Gaussian distribution. (2) For any function class that admits the (stronger) notion of sandwiching approximators, we obtain near-optimal learning guarantees even with respect to heavy additive contamination, where far more than $1/2$ of the training set may be added adversarially. Prior related work held only for regression and in a list-decodable setting. (3) We obtain the first efficient algorithms for tolerant testable learning of functions of halfspaces with respect to any fixed log-concave distribution. Even the non-tolerant case for a single halfspace in this setting had remained open. These results significantly advance our understanding of efficient supervised learning under contamination, a setting that has been much less studied than its unsupervised counterpart.

preprint2023arXiv

Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling

We give a quantum algorithm for computing an $ε$-approximate Nash equilibrium of a zero-sum game in a $m \times n$ payoff matrix with bounded entries. Given a standard quantum oracle for accessing the payoff matrix our algorithm runs in time $\widetilde{O}(\sqrt{m + n}\cdot ε^{-2.5} + ε^{-3})$ and outputs a classical representation of the $ε$-approximate Nash equilibrium. This improves upon the best prior quantum runtime of $\widetilde{O}(\sqrt{m + n} \cdot ε^{-3})$ obtained by [vAG19] and the classic $\widetilde{O}((m + n) \cdot ε^{-2})$ runtime due to [GK95] whenever $ε= Ω((m +n)^{-1})$. We obtain this result by designing new quantum data structures for efficiently sampling from a slowly-changing Gibbs distribution.

preprint2022arXiv

Semi-Random Sparse Recovery in Nearly-Linear Time

Sparse recovery is one of the most fundamental and well-studied inverse problems. Standard statistical formulations of the problem are provably solved by general convex programming techniques and more practical, fast (nearly-linear time) iterative methods. However, these latter "fast algorithms" have previously been observed to be brittle in various real-world settings. We investigate the brittleness of fast sparse recovery algorithms to generative model changes through the lens of studying their robustness to a "helpful" semi-random adversary, a framework which tests whether an algorithm overfits to input assumptions. We consider the following basic model: let $\mathbf{A} \in \mathbb{R}^{n \times d}$ be a measurement matrix which contains an unknown subset of rows $\mathbf{G} \in \mathbb{R}^{m \times d}$ which are bounded and satisfy the restricted isometry property (RIP), but is otherwise arbitrary. Letting $x^\star \in \mathbb{R}^d$ be $s$-sparse, and given either exact measurements $b = \mathbf{A} x^\star$ or noisy measurements $b = \mathbf{A} x^\star + ξ$, we design algorithms recovering $x^\star$ information-theoretically optimally in nearly-linear time. We extend our algorithm to hold for weaker generative models relaxing our planted RIP assumption to a natural weighted variant, and show that our method's guarantees naturally interpolate the quality of the measurement matrix to, in some parameter regimes, run in sublinear time. Our approach differs from prior fast iterative methods with provable guarantees under semi-random generative models: natural conditions on a submatrix which make sparse recovery tractable are NP-hard to verify. We design a new iterative method tailored to the geometry of sparse recovery which is provably robust to our semi-random model. We hope our approach opens the door to new robust, efficient algorithms for natural statistical inverse problems.

preprint2022arXiv

Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods

We design accelerated algorithms with improved rates for several fundamental classes of optimization problems. Our algorithms all build upon techniques related to the analysis of primal-dual extragradient methods via relative Lipschitzness proposed recently by [CST21]. (1) Separable minimax optimization. We study separable minimax optimization problems $\min_x \max_y f(x) - g(y) + h(x, y)$, where $f$ and $g$ have smoothness and strong convexity parameters $(L^x, μ^x)$, $(L^y, μ^y)$, and $h$ is convex-concave with a $(Λ^{xx}, Λ^{xy}, Λ^{yy})$-blockwise operator norm bounded Hessian. We provide an algorithm with gradient query complexity $\tilde{O}\left(\sqrt{\frac{L^{x}}{μ^{x}}} + \sqrt{\frac{L^{y}}{μ^{y}}} + \frac{Λ^{xx}}{μ^{x}} + \frac{Λ^{xy}}{\sqrt{μ^{x}μ^{y}}} + \frac{Λ^{yy}}{μ^{y}}\right)$. Notably, for convex-concave minimax problems with bilinear coupling (e.g.\ quadratics), where $Λ^{xx} = Λ^{yy} = 0$, our rate matches a lower bound of [ZHZ19]. (2) Finite sum optimization. We study finite sum optimization problems $\min_x \frac{1}{n}\sum_{i\in[n]} f_i(x)$, where each $f_i$ is $L_i$-smooth and the overall problem is $μ$-strongly convex. We provide an algorithm with gradient query complexity $\tilde{O}\left(n + \sum_{i\in[n]} \sqrt{\frac{L_i}{nμ}} \right)$. Notably, when the smoothness bounds $\{L_i\}_{i\in[n]}$ are non-uniform, our rate improves upon accelerated SVRG [LMH15, FGKS15] and Katyusha [All17] by up to a $\sqrt{n}$ factor. (3) Minimax finite sums. We generalize our algorithms for minimax and finite sum optimization to solve a natural family of minimax finite sum optimization problems at an accelerated rate, encapsulating both above results up to a logarithmic factor.

preprint2020arXiv

Acceleration with a Ball Optimization Oracle

Consider an oracle which takes a point $x$ and returns the minimizer of a convex function $f$ in an $\ell_2$ ball of radius $r$ around $x$. It is straightforward to show that roughly $r^{-1}\log\frac{1}ε$ calls to the oracle suffice to find an $ε$-approximate minimizer of $f$ in an $\ell_2$ unit ball. Perhaps surprisingly, this is not optimal: we design an accelerated algorithm which attains an $ε$-approximate minimizer with roughly $r^{-2/3} \log \frac{1}ε$ oracle queries, and give a matching lower bound. Further, we implement ball optimization oracles for functions with locally stable Hessians using a variant of Newton's method. The resulting algorithm applies to a number of problems of practical and theoretical import, improving upon previous results for logistic and $\ell_\infty$ regression and achieving guarantees comparable to the state-of-the-art for $\ell_p$ regression.

preprint2020arXiv

Composite Logconcave Sampling with a Restricted Gaussian Oracle

We consider sampling from composite densities on $\mathbb{R}^d$ of the form $dπ(x) \propto \exp(-f(x) - g(x))dx$ for well-conditioned $f$ and convex (but possibly non-smooth) $g$, a family generalizing restrictions to a convex set, through the abstraction of a restricted Gaussian oracle. For $f$ with condition number $κ$, our algorithm runs in $O \left(κ^2 d \log^2\tfrac{κd}ε\right)$ iterations, each querying a gradient of $f$ and a restricted Gaussian oracle, to achieve total variation distance $ε$. The restricted Gaussian oracle, which draws samples from a distribution whose negative log-likelihood sums a quadratic and $g$, has been previously studied and is a natural extension of the proximal oracle used in composite optimization. Our algorithm is conceptually simple and obtains stronger provable guarantees and greater generality than existing methods for composite sampling. We conduct experiments showing our algorithm vastly improves upon the hit-and-run algorithm for sampling the restriction of a (non-diagonal) Gaussian to the positive orthant.

preprint2020arXiv

Coordinate Methods for Accelerating $\ell_\infty$ Regression and Faster Approximate Maximum Flow

We provide faster algorithms for approximately solving $\ell_{\infty}$ regression, a fundamental problem prevalent in both combinatorial and continuous optimization. In particular, we provide accelerated coordinate descent methods capable of provably exploiting dynamic measures of coordinate smoothness, and apply them to $\ell_\infty$ regression over a box to give algorithms which converge in $k$ iterations at a $O(1/k)$ rate. Our algorithms can be viewed as an alternative approach to the recent breakthrough result of Sherman [She17] which achieves a similar runtime improvement over classic algorithmic approaches, i.e. smoothing and gradient descent, which either converge at a $O(1/\sqrt{k})$ rate or have running times with a worse dependence on problem parameters. Our runtimes match those of [She17] across a broad range of parameters and achieve improvement in certain structured cases. We demonstrate the efficacy of our result by providing faster algorithms for the well-studied maximum flow problem. Directly leveraging our accelerated $\ell_\infty$ regression algorithms imply a $\tilde{O}\left(m + \sqrt{mn}/ε\right)$ runtime to compute an $ε$-approximate maximum flow for an undirected graph with $m$ edges and $n$ vertices, generically improving upon the previous best known runtime of $\tilde{O}\left(m/ε\right)$ in [She17] whenever the graph is slightly dense. We further design an algorithm adapted to the structure of the regression problem induced by maximum flow obtaining a runtime of $\tilde{O}\left(m + \max(n, \sqrt{ns})/ε\right)$, where $s$ is the squared $\ell_2$ norm of the congestion of any optimal flow. Moreover, we show how to leverage this result to achieve improved exact algorithms for maximum flow on a variety of unit capacity graphs. We hope that our work serves as an important step towards achieving even faster maximum flow algorithms.

preprint2020arXiv

Coordinate Methods for Matrix Games

We develop primal-dual coordinate methods for solving bilinear saddle-point problems of the form $\min_{x \in \mathcal{X}} \max_{y\in\mathcal{Y}} y^\top A x$ which contain linear programming, classification, and regression as special cases. Our methods push existing fully stochastic sublinear methods and variance-reduced methods towards their limits in terms of per-iteration complexity and sample complexity. We obtain nearly-constant per-iteration complexity by designing efficient data structures leveraging Taylor approximations to the exponential and a binomial heap. We improve sample complexity via low-variance gradient estimators using dynamic sampling distributions that depend on both the iterates and the magnitude of the matrix entries. Our runtime bounds improve upon those of existing primal-dual methods by a factor depending on sparsity measures of the $m$ by $n$ matrix $A$. For example, when rows and columns have constant $\ell_1/\ell_2$ norm ratios, we offer improvements by a factor of $m+n$ in the fully stochastic setting and $\sqrt{m+n}$ in the variance-reduced setting. We apply our methods to computational geometry problems, i.e. minimum enclosing ball, maximum inscribed ball, and linear regression, and obtain improved complexity bounds. For linear regression with an elementwise nonnegative matrix, our guarantees improve on exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/(m+n)}$.

preprint2020arXiv

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean. This removes a barrier in the prior state-of-the-art analysis for the well-studied Metropolized Hamiltonian Monte Carlo (HMC) algorithm for sampling from a strongly logconcave distribution. We correspondingly demonstrate that Metropolized HMC mixes in $\tilde{O}(κd)$ iterations, improving upon the $\tilde{O}(κ^{1.5}\sqrt{d} + κd)$ runtime of (Dwivedi et. al. '18, Chen et. al. '19) by a factor $(κ/d)^{1/2}$ when the condition number $κ$ is large. Our mixing time analysis introduces several techniques which to our knowledge have not appeared in the literature and may be of independent interest, including restrictions to a nonconvex set with good conductance behavior, and a new reduction technique for boosting a constant-accuracy total variation guarantee under weak warmness assumptions. This is the first high-accuracy mixing time result for logconcave distributions using only first-order function information which achieves linear dependence on $κ$; we also give evidence that this dependence is likely to be necessary for standard Metropolized first-order methods.

preprint2020arXiv

Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

We develop two methods for the following fundamental statistical task: given an $ε$-corrupted set of $n$ samples from a $d$-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix. Our first robust PCA algorithm runs in polynomial time, returns a $1 - O(ε\logε^{-1})$-approximate top eigenvector, and is based on a simple iterative filtering approach. Our second, which attains a slightly worse approximation factor, runs in nearly-linear time and sample complexity under a mild spectral gap assumption. These are the first polynomial-time algorithms yielding non-trivial information about the covariance of a corrupted sub-Gaussian distribution without requiring additional algebraic structure of moments. As a key technical tool, we develop the first width-independent solvers for Schatten-$p$ norm packing semidefinite programs, giving a $(1 + ε)$-approximate solution in $O(p\log(\tfrac{nd}ε)ε^{-1})$ input-sparsity time iterations (where $n$, $d$ are problem dimensions).

preprint2014arXiv

Variance on the Leaves of a Tree Markov Random Field: Detecting Character Dependencies in Phylogenies

Stochastic models of evolution (Markov random fields on trivalent trees) generally assume that different characters (different runs of the stochastic process) are independent and identically distributed. In this paper we take the first steps towards addressing dependent characters. Specifically we show that, under certain technical assumptions regarding the evolution of individual characters, we can detect any significant, history independent, correlation between any pair of multistate characters. For the special case of the Cavender-Farris-Neyman (CFN) model on two states with symmetric transition matrices, our analysis needs milder assumptions. To perform the analysis, we need to prove a new concentration result for multistate random variables of a Markov random field on arbitrary trivalent trees: we show that the random variable counting the number of leaves in any particular subset of states has variance that is subquadratic in the number of leaves.

Kevin Tian

What is connected

Connect this record

See the researcher in context

Building this map preview

12 published item(s)

Perspectives on Stochastic Localization

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling

Semi-Random Sparse Recovery in Nearly-Linear Time

Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods

Acceleration with a Ball Optimization Oracle

Composite Logconcave Sampling with a Restricted Gaussian Oracle

Coordinate Methods for Accelerating $\ell_\infty$ Regression and Faster Approximate Maximum Flow

Coordinate Methods for Matrix Games

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Variance on the Leaves of a Tree Markov Random Field: Detecting Character Dependencies in Phylogenies