Source author record

Samuel B. Hopkins

Samuel B. Hopkins appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Data Structures and Algorithms, Machine Learning, Computational Complexity, math.ST, Statistics Theory, Computer Science and Game Theory, Cryptography and Security, Information Theory, math.IT, math.NA, Numerical Analysis

Catalog footprint

What is connected

11 works

11 topics

4 close collaborators


Preprint, 2022, arXiv

A Robust Spectral Algorithm for Overcomplete Tensor Decomposition

We give a spectral algorithm for decomposing overcomplete order-4 tensors, so long as their components satisfy an algebraic non-degeneracy condition that holds for nearly all (all but an algebraic set of measure $0$) tensors over $(\mathbb{R}^d)^{\otimes 4}$ with rank $n \le d^2$. Our algorithm is robust to adversarial perturbations of bounded spectral norm. Our algorithm is inspired by one which uses the sum-of-squares semidefinite programming hierarchy (Ma, Shi, and Steurer STOC'16, arXiv:1610.01980), and we achieve comparable robustness and overcompleteness guarantees under similar algebraic assumptions. However, our algorithm avoids semidefinite programming and may be implemented as a series of basic linear-algebraic operations. We consequently obtain a much faster running time than semidefinite programming methods: our algorithm runs in time $\tilde O(n^2d^3) \le \tilde O(d^7)$, which is subquadratic in the input size $d^4$ (where we have suppressed factors related to the condition number of the input tensor).
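The running-time comparison in the abstract can be checked directly from the rank bound. The symbol $N$ for the input size is introduced here only for illustration; everything else is taken from the abstract.

```latex
\[
n \le d^2
\;\Longrightarrow\;
n^2 d^3 \le (d^2)^2 d^3 = d^7 ,
\qquad
N := d^4
\;\Longrightarrow\;
d^7 = N^{7/4}, \quad \tfrac{7}{4} < 2 .
\]
```

So the stated bound $\tilde O(d^7)$ is $\tilde O(N^{7/4})$, which is what "subquadratic in the input size" refers to.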

Preprint, 2022, arXiv

Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism

We give the first polynomial-time algorithm to estimate the mean of a $d$-variate probability distribution with bounded covariance from $\tilde{O}(d)$ independent samples subject to pure differential privacy. Prior algorithms for this problem either incur exponential running time, require $Ω(d^{1.5})$ samples, or satisfy only the weaker concentrated or approximate differential privacy conditions. In particular, all prior polynomial-time algorithms require $d^{1+Ω(1)}$ samples to guarantee small privacy loss with "cryptographically" high probability, $1-2^{-d^{Ω(1)}}$, while our algorithm retains $\tilde{O}(d)$ sample complexity even in this stringent setting. Our main technique is a new approach to using the powerful Sum of Squares method (SoS) to design differentially private algorithms. SoS proofs-to-algorithms is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators which apparently require exponential running time but whose analysis can be captured by low-degree Sum of Squares proofs can be automatically turned into polynomial-time algorithms with the same provable guarantees. We demonstrate a similar proofs-to-private-algorithms phenomenon: instances of the workhorse exponential mechanism which apparently require exponential time but which can be analyzed with low-degree SoS proofs can be automatically turned into polynomial-time differentially private algorithms. We prove a meta-theorem capturing this phenomenon, which we expect to be of broad use in private algorithm design. Our techniques also draw new connections between differentially private and robust statistics in high dimensions. In particular, viewed through our proofs-to-private-algorithms lens, several well-studied SoS proofs from recent works in algorithmic robust statistics directly yield key components of our differentially private mean estimation algorithm.
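For readers unfamiliar with the exponential mechanism mentioned above, here is a minimal sketch of the textbook version over a small finite candidate set. It is not the paper's sum-of-squares construction; the candidate labels, scores, and function names are hypothetical and chosen only for illustration.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    # Subtracting the maximum score improves numerical stability and
    # leaves the normalized probabilities unchanged.
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng=random):
    """Sample one candidate from the exponential-mechanism distribution."""
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical example: three candidate outputs with made-up utility scores.
candidates = ["a", "b", "c"]
scores = [0.0, 1.0, 3.0]
probs = exponential_mechanism_probs(scores, epsilon=1.0, sensitivity=1.0)
choice = exponential_mechanism(candidates, scores, 1.0, 1.0, random.Random(0))
```

Enumerating candidates like this takes time linear in the size of the candidate set, which is exponential when the candidates form a fine net of a high-dimensional space. That cost is the obstacle the abstract refers to.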

Preprint, 2021, arXiv

Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret Minimization

We study the problem of estimating the mean of a distribution in high dimensions when either the samples are adversarially corrupted or the distribution is heavy-tailed. Recent developments in robust statistics have established efficient and (near) optimal procedures for both settings. However, the algorithms developed on each side tend to be sophisticated and do not directly transfer to the other, with many of them having ad-hoc or complicated analyses. In this paper, we provide a meta-problem and a duality theorem that lead to a new unified view on robust and heavy-tailed mean estimation in high dimensions. We show that the meta-problem can be solved either by a variant of the Filter algorithm from the recent literature on robust estimation or by the quantum entropy scoring scheme (QUE), due to Dong, Hopkins and Li (NeurIPS '19). By leveraging our duality theorem, these results translate into simple and efficient algorithms for both robust and heavy-tailed settings. Furthermore, the QUE-based procedure has run-time that matches the fastest known algorithms on both fronts. Our analysis of Filter is through the classic regret bound of the multiplicative weights update method. This connection allows us to avoid the technical complications in previous works and improve upon the run-time analysis of a gradient-descent-based algorithm for robust mean estimation by Cheng, Diakonikolas, Ge and Soltanolkotabi (ICML '20).
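To make the filtering idea concrete, the following is a simplified sketch of a soft filter with multiplicative down-weighting. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the stopping threshold (`var_bound`, `slack`), the power-iteration eigenvector routine, and the synthetic data are all choices made here.

```python
import math
import random


def weighted_mean(points, w):
    total = sum(w)
    return [sum(wi * p[k] for wi, p in zip(w, points)) / total
            for k in range(len(points[0]))]


def weighted_cov(points, w, mu):
    total, dim = sum(w), len(mu)
    cov = [[0.0] * dim for _ in range(dim)]
    for wi, p in zip(w, points):
        c = [p[k] - mu[k] for k in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += wi * c[a] * c[b] / total
    return cov


def top_eigenpair(mat, iters=200):
    """Approximate top eigenvalue/eigenvector of a symmetric PSD matrix."""
    dim = len(mat)
    v = [1.0 + 0.1 * k for k in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        u = [sum(mat[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:
            break
        v = [x / norm for x in u]
    lam = sum(v[a] * mat[a][b] * v[b] for a in range(dim) for b in range(dim))
    return lam, v


def filter_mean(points, var_bound=1.0, slack=2.0, max_rounds=50):
    """Down-weight points along the top variance direction until it looks benign."""
    w = [1.0] * len(points)
    for _ in range(max_rounds):
        mu = weighted_mean(points, w)
        lam, v = top_eigenpair(weighted_cov(points, w, mu))
        if lam <= slack * var_bound:  # assumed stopping rule
            break
        scores = [sum((p[k] - mu[k]) * v[k] for k in range(len(mu))) ** 2
                  for p in points]
        top = max(s for s, wi in zip(scores, w) if wi > 0)
        # Multiplicative update: the highest-scoring point loses all its weight.
        w = [wi * (1.0 - s / top) if wi > 0 else 0.0 for wi, s in zip(w, scores)]
    return weighted_mean(points, w)


# Synthetic data (an assumption): 90 standard Gaussian inliers, 10 planted outliers.
rng = random.Random(0)
inliers = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
data = inliers + [[10.0, 10.0] for _ in range(10)]
naive = weighted_mean(data, [1.0] * len(data))
robust = filter_mean(data)
```

On this toy input the planted outliers dominate the top variance direction, so they lose their weight in the first round and the filtered mean moves back toward the origin.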

Preprint, 2020, arXiv

Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding Walks

We study symmetric spiked matrix models with respect to a general class of noise distributions. Given a rank-1 deformation of a random noise matrix, whose entries are independently distributed with zero mean and unit variance, the goal is to estimate the rank-1 part. For the case of Gaussian noise, the top eigenvector of the given matrix is a widely-studied estimator known to achieve optimal statistical guarantees, e.g., in the sense of the celebrated BBP phase transition. However, this estimator can fail completely for heavy-tailed noise. In this work, we exhibit an estimator that works for heavy-tailed noise up to the BBP threshold that is optimal even for Gaussian noise. We give a non-asymptotic analysis of our estimator which relies only on the variance of each entry remaining constant as the size of the matrix grows: higher moments may grow arbitrarily fast or even fail to exist. Previously, it was only known how to achieve these guarantees if higher-order moments of the noise are bounded by a constant independent of the size of the matrix. Our estimator can be evaluated in polynomial time by counting self-avoiding walks via a color-coding technique. Moreover, we extend our estimator to spiked tensor models and establish analogous results.
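As a small illustration of what "counting self-avoiding walks" means for a matrix, the sketch below sums, by brute force, the products of matrix entries along all self-avoiding walks between two indices. This is exponential in the walk length and is not the color-coding method from the abstract; the function name and test matrices are hypothetical.

```python
def self_avoiding_walk_sum(M, start, end, length):
    """Sum over self-avoiding walks start -> end with `length` edges
    of the product of the entries of M along the walk (brute force)."""
    n = len(M)

    def extend(vertex, visited, steps_left, weight):
        if steps_left == 0:
            return weight if vertex == end else 0.0
        total = 0.0
        for nxt in range(n):
            if nxt in visited:
                continue  # self-avoiding: never revisit a vertex
            if nxt == end and steps_left != 1:
                continue  # the endpoint may only be entered on the last step
            total += extend(nxt, visited | {nxt}, steps_left - 1,
                            weight * M[vertex][nxt])
        return total

    return extend(start, {start}, length, 1.0)


# Hypothetical 3x3 symmetric matrix: the only length-2 walk 0 -> 2 goes via 1.
M = [[0.0, 2.0, 5.0],
     [2.0, 0.0, 3.0],
     [5.0, 3.0, 0.0]]
via_one = self_avoiding_walk_sum(M, 0, 2, 2)   # 2.0 * 3.0

# All-ones 4x4 matrix: the sum simply counts walks.
ones = [[1.0] * 4 for _ in range(4)]
count = self_avoiding_walk_sum(ones, 0, 3, 3)  # walks 0-1-2-3 and 0-2-1-3
```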

Preprint, 2020, arXiv

Robustly Learning any Clusterable Mixture of Gaussians

We study the efficient learnability of high-dimensional Gaussian mixtures in the outlier-robust setting, where a small constant fraction of the data is adversarially corrupted. We resolve the polynomial learnability of this problem when the components are pairwise separated in total variation distance. Specifically, we provide an algorithm that, for any constant number of components $k$, runs in polynomial time and learns the components of an $ε$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(ε)$, under the assumption that the overlap between any pair of components $P_i, P_j$ (i.e., the quantity $1-TV(P_i, P_j)$) is bounded by $\mathrm{poly}(ε)$. Our separation condition is the qualitatively weakest assumption under which accurate clustering of the samples is possible. In particular, it allows for components with arbitrary covariances and for components with identical means, as long as their covariances differ sufficiently. Ours is the first polynomial-time algorithm for this problem, even for $k=2$. Our algorithm follows the Sum-of-Squares-based proofs-to-algorithms approach. Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system. The key ingredients of this proof are a novel use of SoS-certifiable anti-concentration and a new characterization of pairs of Gaussians with small (dimension-independent) overlap in terms of their parameter distance.
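The overlap $1-TV(P_i, P_j)$ equals the integral of the pointwise minimum of the two densities. The sketch below evaluates it numerically for one-dimensional Gaussians, to show how two components with identical means can still have small overlap when their variances differ a lot. The integration range, step count, and example parameters are assumptions made for this illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def overlap(mu1, sigma1, mu2, sigma2, steps=100_000):
    """Approximate 1 - TV(P, Q) = integral of min(p, q) for two 1-D Gaussians."""
    width = 10.0 * max(sigma1, sigma2)  # assumed cutoff; tails beyond are negligible
    lo = min(mu1, mu2) - width
    hi = max(mu1, mu2) + width
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f = min(normal_pdf(x, mu1, sigma1), normal_pdf(x, mu2, sigma2))
        total += f if 0 < k < steps else 0.5 * f  # trapezoid rule
    return total * h


same = overlap(0.0, 1.0, 0.0, 1.0)      # identical components
mild = overlap(0.0, 1.0, 0.0, 2.0)      # same mean, variances differ mildly
far = overlap(0.0, 1.0, 0.0, 100.0)     # same mean, variances differ greatly
```

The three values decrease in that order: identical components overlap almost completely, while the pair with standard deviations 1 and 100 overlaps only slightly despite sharing a mean.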

Preprint, 2020, arXiv

Smoothed Complexity of 2-player Nash Equilibria

We prove that computing a Nash equilibrium of a two-player ($n \times n$) game with payoffs in $[-1,1]$ is PPAD-hard (under randomized reductions) even in the smoothed analysis setting, smoothing with noise of constant magnitude. This gives a strong negative answer to conjectures of Spielman and Teng [ST06] and Cheng, Deng, and Teng [CDT09]. In contrast to prior work proving PPAD-hardness after smoothing by noise of magnitude $1/\operatorname{poly}(n)$ [CDT09], our smoothed complexity result is not proved via hardness of approximation for Nash equilibria. This is by necessity, since Nash equilibria can be approximated to constant error in quasi-polynomial time [LMM03]. Our results therefore separate smoothed complexity and hardness of approximation for Nash equilibria in two-player games. The key ingredient in our reduction is the use of a random zero-sum game as a gadget to produce two-player games which remain hard even after smoothing. Our analysis crucially shows that all Nash equilibria of random zero-sum games are far from pure (with high probability), and that this remains true even after smoothing.
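For concreteness, the sketch below checks the Nash equilibrium condition for a two-player game given by payoff matrices with entries in $[-1,1]$. The example is matching pennies, a small zero-sum game whose only equilibrium is fully mixed. It is a hand-picked illustration of an equilibrium that is far from pure, not the random gadget used in the reduction; the function names are hypothetical.

```python
def expected_payoff(A, x, y):
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def is_nash(A, B, x, y, tol=1e-9):
    """True if mixed strategies (x, y) form a Nash equilibrium of the
    bimatrix game with row payoffs A and column payoffs B, up to tol."""
    row_value = expected_payoff(A, x, y)
    col_value = expected_payoff(B, x, y)
    # It suffices to rule out profitable deviations to pure strategies.
    best_row = max(sum(A[i][j] * y[j] for j in range(len(y)))
                   for i in range(len(x)))
    best_col = max(sum(x[i] * B[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return best_row <= row_value + tol and best_col <= col_value + tol


# Matching pennies: zero-sum, payoffs in [-1, 1].
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-a for a in row] for row in A]

mixed_ok = is_nash(A, B, [0.5, 0.5], [0.5, 0.5])   # uniform play
pure_ok = is_nash(A, B, [1.0, 0.0], [1.0, 0.0])    # both play their first action
```

Uniform play passes the check, while the pure profile fails because the column player gains by switching actions.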

Preprint, 2020, arXiv

Subexponential LPs Approximate Max-Cut

We show that for every $\varepsilon > 0$, the degree-$n^\varepsilon$ Sherali-Adams linear program (with $\exp(\tilde{O}(n^\varepsilon))$ variables and constraints) approximates the maximum cut problem within a factor of $(\frac{1}{2}+\varepsilon')$, for some $\varepsilon'(\varepsilon) > 0$. Our result provides a surprising converse to known lower bounds against all linear programming relaxations of Max-Cut, and hence resolves the extension complexity of approximate Max-Cut for approximation factors close to $\frac{1}{2}$ (up to the function $\varepsilon'(\varepsilon)$). Previously, only semidefinite programs and spectral methods were known to yield approximation factors better than $\frac 12$ for Max-Cut in time $2^{o(n)}$. We also show that constant-degree Sherali-Adams linear programs (with $\text{poly}(n)$ variables and constraints) can solve Max-Cut with approximation factor close to $1$ on graphs of small threshold rank: this is the first connection of which we are aware between threshold rank and linear programming-based algorithms. Our results separate the power of Sherali-Adams versus Lovász-Schrijver hierarchies for approximating Max-Cut, since it is known that $(\frac{1}{2}+\varepsilon)$ approximation of Max-Cut requires $Ω_\varepsilon (n)$ rounds in the Lovász-Schrijver hierarchy. We also provide a subexponential time approximation for Khot's Unique Games problem: we show that for every $\varepsilon > 0$ the degree-$(n^\varepsilon \log q)$ Sherali-Adams linear program distinguishes instances of Unique Games of value $\geq 1-\varepsilon'$ from instances of value $\leq \varepsilon'$, for some $\varepsilon'(\varepsilon) >0$, where $q$ is the alphabet size. Such guarantees are qualitatively similar to those of previous subexponential-time algorithms for Unique Games but our algorithm does not rely on semidefinite programming or subspace enumeration techniques.
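The factor $\frac{1}{2}$ is the natural baseline because a uniformly random cut already achieves it in expectation. A short check follows; the graph $G=(V,E)$, the random set $S$, and $\mathrm{OPT}$ are notation introduced here for illustration.

```latex
\[
\mathbb{E}_{S \subseteq V}\bigl[\mathrm{cut}(S)\bigr]
= \sum_{\{u,v\} \in E} \Pr\bigl[\,|S \cap \{u,v\}| = 1\,\bigr]
= \sum_{\{u,v\} \in E} \tfrac{1}{2}
= \tfrac{|E|}{2}
\;\ge\; \tfrac{1}{2}\,\mathrm{OPT}.
\]
```

Any guarantee of the form $\frac{1}{2}+\varepsilon'$ therefore has to beat what random assignment gives for free.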

Preprint, 2016, arXiv

A Nearly Tight Sum-of-Squares Lower Bound for the Planted Clique Problem

We prove that with high probability over the choice of a random graph $G$ from the Erdős-Rényi distribution $G(n,1/2)$, the $n^{O(d)}$-time degree $d$ Sum-of-Squares semidefinite programming relaxation for the clique problem will give a value of at least $n^{1/2-c(d/\log n)^{1/2}}$ for some constant $c>0$. This yields a nearly tight $n^{1/2 - o(1)}$ bound on the value of this program for any degree $d = o(\log n)$. Moreover, we introduce a new framework that we call pseudo-calibration to construct Sum-of-Squares lower bounds. This framework is inspired by taking a computational analog of Bayesian probability theory. It yields a general recipe for constructing good pseudo-distributions (i.e., dual certificates for the Sum-of-Squares semidefinite program), and sheds further light on the ways in which this hierarchy differs from others.
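The step from the explicit bound to the $n^{1/2-o(1)}$ form is a one-line calculation, using only the quantities in the abstract:

```latex
\[
d = o(\log n)
\;\Longrightarrow\;
c\,\Bigl(\frac{d}{\log n}\Bigr)^{1/2} = o(1)
\;\Longrightarrow\;
n^{\,1/2 - c (d/\log n)^{1/2}} = n^{\,1/2 - o(1)} .
\]
```

The same expression shows where the bound stops being informative: the exponent is positive only when $c\,(d/\log n)^{1/2} < \tfrac{1}{2}$, that is, when $d < \log n / (4c^2)$.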

Preprint, 2016, arXiv

Fast spectral algorithms from sum-of-squares proofs: tensor decomposition and planted sparse vectors

We consider two problems that arise in machine learning applications: the problem of recovering a planted sparse vector in a random linear subspace and the problem of decomposing a random low-rank overcomplete 3-tensor. For both problems, the best known guarantees are based on the sum-of-squares method. We develop new algorithms inspired by analyses of the sum-of-squares method. Our algorithms achieve the same or similar guarantees as sum-of-squares for these problems but the running time is significantly faster. For the planted sparse vector problem, we give an algorithm with running time nearly linear in the input size that approximately recovers a planted sparse vector with up to constant relative sparsity in a random subspace of $\mathbb R^n$ of dimension up to $\tilde Ω(\sqrt n)$. These recovery guarantees match the best known ones of Barak, Kelner, and Steurer (STOC 2014) up to logarithmic factors. For tensor decomposition, we give an algorithm with running time close to linear in the input size (with exponent $\approx 1.086$) that approximately recovers a component of a random 3-tensor over $\mathbb R^n$ of rank up to $\tilde Ω(n^{4/3})$. The best previous algorithm for this problem due to Ge and Ma (RANDOM 2015) works up to rank $\tilde Ω(n^{3/2})$ but requires quasipolynomial time.

Preprint, 2015, arXiv

SoS and Planted Clique: Tight Analysis of MPW Moments at all Degrees and an Optimal Lower Bound at Degree Four

The problem of finding large cliques in random graphs and its "planted" variant, where one wants to recover a clique of size $ω \gg \log{(n)}$ added to an Erdős-Rényi graph $G \sim G(n,\frac{1}{2})$, have been intensely studied. Nevertheless, existing polynomial-time algorithms can only recover planted cliques of size $ω = Ω(\sqrt{n})$. By contrast, information-theoretically, one can recover planted cliques so long as $ω \gg \log{(n)}$. In this work, we continue the investigation of algorithms from the sum of squares hierarchy for solving the planted clique problem begun by Meka, Potechin, and Wigderson (MPW, 2015) and Deshpande and Montanari (DM, 2015). Our main results improve upon both these previous works by showing: 1. Degree four SoS does not recover the planted clique unless $ω \gg \sqrt{n}\, \mathrm{poly} \log n$, improving upon the bound $ω \gg n^{1/3}$ due to DM. A similar result was obtained independently by Raghavendra and Schramm (2015). 2. For $2 < d = o(\sqrt{\log{(n)}})$, degree $2d$ SoS does not recover the planted clique unless $ω \gg n^{1/(d + 1)} /(2^d\, \mathrm{poly} \log n)$, improving upon the bound due to MPW. Our proof for the second result is based on a fine spectral analysis of the certificate used in the prior works MPW, DM and Feige and Krauthgamer (2003) by decomposing it along an appropriately chosen basis. Along the way, we develop combinatorial tools to analyze the spectrum of random matrices with dependent entries and to understand the symmetries in the eigenspaces of the set symmetric matrices inspired by work of Grigoriev (2001). An argument of Kelner shows that the first result cannot be proved using the same certificate. Rather, our proof involves constructing and analyzing a new certificate that yields the nearly tight lower bound by "correcting" the certificate of previous works.

Preprint, 2015, arXiv

Tensor principal component analysis via sum-of-squares proofs

We study a statistical model for the tensor principal component analysis problem introduced by Montanari and Richard: Given an order-$3$ tensor $T$ of the form $T = τ\cdot v_0^{\otimes 3} + A$, where $τ\geq 0$ is a signal-to-noise ratio, $v_0$ is a unit vector, and $A$ is a random noise tensor, the goal is to recover the planted vector $v_0$. For the case that $A$ has iid standard Gaussian entries, we give an efficient algorithm to recover $v_0$ whenever $τ\geq ω(n^{3/4} \log(n)^{1/4})$, and certify that the recovered vector is close to a maximum likelihood estimator, all with high probability over the random choice of $A$. The previous best algorithms with provable guarantees required $τ\geq Ω(n)$. In the regime $τ\leq o(n)$, natural tensor-unfolding-based spectral relaxations for the underlying optimization problem break down (in the sense that their integrality gap is large). To go beyond this barrier, we use convex relaxations based on the sum-of-squares method. Our recovery algorithm proceeds by rounding a degree-$4$ sum-of-squares relaxation of the maximum-likelihood-estimation problem for the statistical model. To complement our algorithmic results, we show that degree-$4$ sum-of-squares relaxations break down for $τ\leq O(n^{3/4}/\log(n)^{1/4})$, which demonstrates that improving our current guarantees (by more than logarithmic factors) would require new techniques or might even be intractable. Finally, we show how to exploit additional problem structure in order to solve our sum-of-squares relaxations, up to some approximation, very efficiently. Our fastest algorithm runs in nearly-linear time using shifted (matrix) power iteration and has similar guarantees as above. The analysis of this algorithm also confirms a variant of a conjecture of Montanari and Richard about singular vectors of tensor unfoldings.
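The model $T = τ\cdot v_0^{\otimes 3} + A$ is easy to simulate. The sketch below builds a small instance, unfolds it into an $n \times n^2$ matrix, and estimates $v_0$ by plain power iteration on the unfolding. This is only a naive baseline in an easy regime where $τ$ is far larger than the noise; it is not the sum-of-squares algorithm or the shifted power iteration from the paper. The dimension, $τ$, planted vector, and function names are assumptions made for this illustration.

```python
import math
import random


def spiked_tensor(v0, tau, rng):
    """T[i][j][k] = tau * v0[i] * v0[j] * v0[k] + standard Gaussian noise."""
    n = len(v0)
    return [[[tau * v0[i] * v0[j] * v0[k] + rng.gauss(0.0, 1.0)
              for k in range(n)] for j in range(n)] for i in range(n)]


def unfold(T):
    """Flatten modes 2 and 3, giving an n x n^2 matrix."""
    return [[x for row in slab for x in row] for slab in T]


def top_left_singular_vector(M, iters=200):
    """Power iteration on M M^T (an n x n Gram matrix)."""
    n = len(M)
    gram = [[sum(a * b for a, b in zip(M[i], M[j])) for j in range(n)]
            for i in range(n)]
    v = [1.0 + 0.1 * k for k in range(n)]
    for _ in range(iters):
        u = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    return v


rng = random.Random(0)
raw = [1.0, -2.0, 3.0, -1.0, 2.0, -3.0]          # hypothetical planted direction
scale = math.sqrt(sum(x * x for x in raw))
v0 = [x / scale for x in raw]                     # unit vector, n = 6
T = spiked_tensor(v0, tau=100.0, rng=rng)         # tau chosen far above the noise level
v_hat = top_left_singular_vector(unfold(T))
correlation = abs(sum(a * b for a, b in zip(v_hat, v0)))
```

The absolute value in `correlation` accounts for the sign ambiguity of singular vectors.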
