Source author record

Santosh Vempala

Santosh Vempala appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Data Structures and Algorithms; Machine Learning; Computational Complexity; math.PR; math.FA; math.CO; Computational Geometry; Discrete Mathematics; math.OC; Cryptography and Security; Distributed, Parallel, and Cluster Computing; math.NA; Neural and Evolutionary Computing; Numerical Analysis

Catalog footprint

What is connected

31 works

14 topics

4 close collaborators

preprint, 2022, arXiv

Convergence of Gibbs Sampling: Coordinate Hit-and-Run Mixes Fast

The Gibbs Sampler is a general method for sampling high-dimensional distributions, dating back to Turchin, 1971. In each step of the Gibbs Sampler, we pick a random coordinate and re-sample that coordinate from the distribution induced by fixing all other coordinates. While it has become widely used over the past half-century, guarantees of efficient convergence have been elusive. We show that for a convex body $K$ in $\mathbb{R}^{n}$ with diameter $D$, the mixing time of the Coordinate Hit-and-Run (CHAR) algorithm on $K$ is polynomial in $n$ and $D$. We also give a lower bound on the conductance of CHAR, showing that it is strictly worse than hit-and-run or the ball walk in the worst case.
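As a rough illustration of the step described in this abstract, the sketch below runs Coordinate Hit-and-Run on the unit Euclidean ball. The ball is a hypothetical choice of convex body $K$, made only because the chord along a coordinate axis has a closed form; the function names, the starting point and the step count are likewise illustrative assumptions and are not taken from the paper.

```python
import math
import random


def char_step(x, rng):
    """One Coordinate Hit-and-Run step inside the unit ball {y : ||y|| <= 1}."""
    i = rng.randrange(len(x))                      # pick a random coordinate
    rest = sum(v * v for j, v in enumerate(x) if j != i)
    half = math.sqrt(max(0.0, 1.0 - rest))         # chord along axis i is [-half, half]
    y = list(x)
    y[i] = rng.uniform(-half, half)                # resample uniformly on that chord
    return y


def char_sample(n, steps, seed=0):
    """Run `steps` CHAR steps from the origin and return the final point."""
    rng = random.Random(seed)
    x = [0.0] * n
    for _ in range(steps):
        x = char_step(x, rng)
    return x
```

For a general convex body the only part that changes is the chord computation, which would typically be done with a membership oracle and a line search rather than a closed form.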

preprint, 2021, arXiv

Solving Sparse Linear Systems Faster than Matrix Multiplication

Can linear systems be solved faster than matrix multiplication? While there has been remarkable progress for the special cases of graph-structured linear systems, in the general setting, the bit complexity of solving an $n \times n$ linear system $Ax=b$ is $\tilde{O}(n^ω)$, where $ω < 2.372864$ is the matrix multiplication exponent. Improving on this has been an open problem even for sparse linear systems with poly$(n)$ condition number. In this paper, we present an algorithm that solves linear systems in sparse matrices asymptotically faster than matrix multiplication for any $ω > 2$. This speedup holds for any input matrix $A$ with $o(n^{ω-1}/\log(κ(A)))$ nonzeros, where $κ(A)$ is the condition number of $A$. For poly$(n)$-conditioned matrices with $\tilde{O}(n)$ nonzeros, and the current value of $ω$, the bit complexity of our algorithm to solve to within any $1/\text{poly}(n)$ error is $O(n^{2.331645})$. Our algorithm can be viewed as an efficient, randomized implementation of the block Krylov method via recursive low displacement rank factorizations. It is inspired by the algorithm of [Eberly et al., ISSAC '06, '07] for inverting matrices over finite fields. In our analysis of numerical stability, we develop matrix anti-concentration techniques to bound the smallest eigenvalue and the smallest gap in eigenvalues of semi-random matrices.

preprint, 2020, arXiv

Multi-Criteria Dimensionality Reduction with Applications to Fairness

Dimensionality reduction is a classical technique widely used for data analysis. One foundational instantiation is Principal Component Analysis (PCA), which minimizes the average reconstruction error. In this paper, we introduce the "multi-criteria dimensionality reduction" problem where we are given multiple objectives that need to be optimized simultaneously. As an application, our model captures several fairness criteria for dimensionality reduction such as our novel Fair-PCA problem and the Nash Social Welfare (NSW) problem. In Fair-PCA, the input data is divided into $k$ groups, and the goal is to find a single $d$-dimensional representation for all groups for which the minimum variance of any one group is maximized. In NSW, the goal is to maximize the product of the individual variances of the groups achieved by the common low-dimensional space. Our main result is an exact polynomial-time algorithm for the two-criterion dimensionality reduction problem when the two criteria are increasing concave functions. As an application of this result, we obtain a polynomial time algorithm for Fair-PCA for $k=2$ groups and a polynomial time algorithm for NSW objective for $k=2$ groups. We also give approximation algorithms for $k>2$. Our technical contribution in the above results is to prove new low-rank properties of extreme point solutions to semi-definite programs. We conclude with experiments indicating the effectiveness of algorithms based on extreme point solutions of semi-definite programs on several real-world data sets.

preprint, 2020, arXiv

Robustly Clustering a Mixture of Gaussians

We give an efficient algorithm for robustly clustering a mixture of two arbitrary Gaussians, a central open problem in the theory of computationally efficient robust estimation, assuming only that the means of the component Gaussians are well-separated or their covariances are well-separated. Our algorithm and analysis extend naturally to robustly clustering mixtures of well-separated strongly logconcave distributions. The mean separation required is close to the smallest possible to guarantee that most of the measure of each component can be separated by some hyperplane (for covariances, it is the same condition in the second degree polynomial kernel). We also show that for Gaussian mixtures, separation in total variation distance suffices to achieve robust clustering. Our main tools are a new identifiability criterion based on isotropic position and the Fisher discriminant, and a corresponding Sum-of-Squares convex programming relaxation of fixed degree.

preprint, 2020, arXiv

Strong Self-Concordance and Sampling

Motivated by the Dikin walk, we develop aspects of an interior-point theory for sampling in high dimension. Specifically, we introduce a symmetric parameter and the notion of strong self-concordance. These properties imply that the corresponding Dikin walk mixes in $\tilde{O}(n\bar{ν})$ steps from a warm start in a convex body in $\mathbb{R}^{n}$ using a strongly self-concordant barrier with symmetric self-concordance parameter $\bar{ν}$. For many natural barriers, $\bar{ν}$ is roughly bounded by $ν$, the standard self-concordance parameter. We show that this property and strong self-concordance hold for the Lee-Sidford barrier. As a consequence, we obtain the first walk to mix in $\tilde{O}(n^{2})$ steps for an arbitrary polytope in $\mathbb{R}^{n}$. Strong self-concordance for other barriers leads to an interesting (and unexpected) connection: for the universal and entropic barriers, it is implied by the KLS conjecture.

preprint, 2016, arXiv

A Note on Non-Degenerate Integer Programs with Small Sub-Determinants

The intention of this note is two-fold. First, we study integer optimization problems in standard form defined by $A \in \mathbb{Z}^{m\times n}$ and present an algorithm to solve such problems in polynomial time provided that both the largest absolute value of an entry in $A$ and $m$ are constant. Then, this is applied to solve integer programs in inequality form in polynomial time, where the absolute values of all maximal sub-determinants of $A$ lie between $1$ and a constant.

preprint, 2016, arXiv

Agnostic Estimation of Mean and Covariance

We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $η$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is assumed to be from a distribution of known type. The agnostic problem includes many interesting special cases, e.g., learning the parameters of a single Gaussian (or finding the best-fit Gaussian) when $η$ fraction of data is adversarially corrupted, agnostically learning a mixture of Gaussians, agnostic ICA, etc. We present polynomial-time algorithms to estimate the mean and covariance with error guarantees in terms of information-theoretic lower bounds. As a corollary, we also obtain an agnostic algorithm for Singular Value Decomposition.

preprint, 2016, arXiv

Chi-squared Amplification: Identifying Hidden Hubs

We consider the following general hidden hubs model: an $n \times n$ random matrix $A$ with a subset $S$ of $k$ special rows (hubs): entries in rows outside $S$ are generated from the probability distribution $p_0 \sim N(0,σ_0^2)$; for each row in $S$, some $k$ of its entries are generated from $p_1 \sim N(0,σ_1^2)$, $σ_1>σ_0$, and the rest of the entries from $p_0$. The problem is to identify the high-degree hubs efficiently. This model includes and significantly generalizes the planted Gaussian Submatrix Model, where the special entries are all in a $k \times k$ submatrix. There are two well-known barriers: if $k\geq c\sqrt{n\ln n}$, just the row sums are sufficient to find $S$ in the general model. For the submatrix problem, this can be improved by a $\sqrt{\ln n}$ factor to $k \ge c\sqrt{n}$ by spectral methods or combinatorial methods. In the variant with $p_0=\pm 1$ (with probability $1/2$ each) and $p_1\equiv 1$, neither barrier has been broken. We give a polynomial-time algorithm to identify all the hidden hubs with high probability for $k \ge n^{0.5-δ}$ for some $δ>0$, when $σ_1^2>2σ_0^2$. The algorithm extends to the setting where planted entries might have different variances each at least as large as $σ_1^2$. We also show a nearly matching lower bound: for $σ_1^2 \le 2σ_0^2$, there is no polynomial-time Statistical Query algorithm for distinguishing between a matrix whose entries are all from $N(0,σ_0^2)$ and a matrix with $k=n^{0.5-δ}$ hidden hubs for any $δ>0$. The lower bound as well as the algorithm are related to whether the chi-squared distance of the two distributions diverges. At the critical value $σ_1^2=2σ_0^2$, we show that the general hidden hubs problem can be solved for $k\geq c\sqrt n(\ln n)^{1/4}$, improving on the naive row sum-based method.

preprint, 2016, arXiv

Cortical Computation via Iterative Constructions

We study Boolean functions of an arbitrary number of input variables that can be realized by simple iterative constructions based on constant-size primitives. This restricted type of construction needs little global coordination or control and thus is a candidate for neurally feasible computation. Valiant's construction of a majority function can be realized in this manner and, as we show, can be generalized to any uniform threshold function. We study the rate of convergence, finding that while linear convergence to the correct function can be achieved for any threshold using a fixed set of primitives, for quadratic convergence, the size of the primitives must grow as the threshold approaches 0 or 1. We also study finite realizations of this process and the learnability of the functions realized. We show that the constructions realized are accurate outside a small interval near the target threshold, where the size of the construction grows as the inverse square of the interval width. This phenomenon, that errors are higher closer to thresholds (and thresholds closer to the boundary are harder to represent), is a well-known cognitive finding.
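To make the idea of iterating a constant-size primitive concrete, here is a minimal sketch that assumes a 3-input majority gate as the primitive. That gate is a hypothetical stand-in chosen for brevity, not the primitive set analyzed in the paper. If each input is independently 1 with probability $p$, the gate outputs 1 with probability $3p^2 - 2p^3$, and stacking levels of such gates pushes $p$ towards 0 or 1 on either side of the threshold $1/2$.

```python
def maj3_output_prob(p):
    """P(output = 1) for a 3-input majority gate with i.i.d. Bernoulli(p) inputs."""
    return 3 * p**2 - 2 * p**3


def iterate_levels(p, levels):
    """Apply the gate recursively: each level feeds three independent copies upward."""
    for _ in range(levels):
        p = maj3_output_prob(p)
    return p
```

Starting values close to $1/2$ need many more levels to separate than values far from it, which is a toy version of the observation that errors concentrate in a small interval around the target threshold.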

preprint, 2016, arXiv

Gaussian Cooling and O*(n^3) Algorithms for Volume and Gaussian Volume

We present an $O^*(n^3)$ randomized algorithm for estimating the volume of a well-rounded convex body given by a membership oracle, improving on the previous best complexity of $O^*(n^4)$. The new algorithmic ingredient is an accelerated cooling schedule where the rate of cooling increases with the temperature. Previously, the known approach for potentially achieving this asymptotic complexity relied on a positive resolution of the KLS hyperplane conjecture, a central open problem in convex geometry. We also obtain an $O^*(n^3)$ randomized algorithm for integrating a standard Gaussian distribution over an arbitrary convex set containing the unit ball. Both the volume and Gaussian volume algorithms use an improved algorithm for sampling a Gaussian distribution restricted to a convex body. In this latter setting, as we show, the KLS conjecture holds and for a spherical Gaussian distribution with variance $σ^2$, the sampling complexity is $O^*(\max\{n^3, σ^2n^2\})$ for the first sample and $O^*(\max\{n^2, σ^2n^2\})$ for every subsequent sample.

preprint, 2016, arXiv

Geometric Random Edge

We show that a variant of the random-edge pivoting rule results in a strongly polynomial time simplex algorithm for linear programs $\max\{c^Tx \colon Ax\leq b\}$, whose constraint matrix $A$ satisfies a geometric property introduced by Brunsch and Röglin: The sine of the angle of a row of $A$ to a hyperplane spanned by $n-1$ other rows of $A$ is at least $δ$. This property is a geometric generalization of $A$ being integral and all sub-determinants of $A$ being bounded by $Δ$ in absolute value (since $δ\geq 1/(Δ^2 n)$). In particular, linear programs defined by totally unimodular matrices are captured in this framework ($δ\geq 1/n$), for which Dyer and Frieze previously described a strongly polynomial-time randomized algorithm. The number of pivots of the simplex algorithm is polynomial in the dimension and $1/δ$ and independent of the number of constraints of the linear program. Our main result can be viewed as an algorithmic realization of the proof of small diameter for such polytopes by Bonifas et al., using the ideas of Dyer and Frieze.

preprint, 2016, arXiv

Statistical Algorithms and a Lower Bound for Detecting Planted Clique

We introduce a framework for proving lower bounds on computational problems over distributions against algorithms that can be implemented using access to a statistical query oracle. For such algorithms, access to the input distribution is limited to obtaining an estimate of the expectation of any given function on a sample drawn randomly from the input distribution, rather than directly accessing samples. Most natural algorithms of interest in theory and in practice, e.g., moments-based methods, local search, standard iterative methods for convex optimization, MCMC and simulated annealing can be implemented in this framework. Our framework is based on, and generalizes, the statistical query model in learning theory (Kearns, 1998). Our main application is a nearly optimal lower bound on the complexity of any statistical query algorithm for detecting planted bipartite clique distributions (or planted dense subgraph distributions) when the planted clique has size $O(n^{1/2-δ})$ for any constant $δ> 0$. The assumed hardness of variants of these problems has been used to prove hardness of several other problems and as a guarantee for security in cryptographic applications. Our lower bounds provide concrete evidence of hardness, thus supporting these assumptions.

preprint, 2016, arXiv

Statistical Query Algorithms for Mean Vector Estimation and Stochastic Convex Optimization

Stochastic convex optimization, where the objective is the expectation of a random convex function, is an important and widely used method with numerous applications in machine learning, statistics, operations research and other areas. We study the complexity of stochastic convex optimization given only statistical query (SQ) access to the objective function. We show that well-known and popular first-order iterative methods can be implemented using only statistical queries. For many cases of interest we derive nearly matching upper and lower bounds on the estimation (sample) complexity including linear optimization in the most general setting. We then present several consequences for machine learning, differential privacy and proving concrete lower bounds on the power of convex optimization based methods. The key ingredient of our work is SQ algorithms and lower bounds for estimating the mean vector of a distribution over vectors supported on a convex body in $\mathbb{R}^d$. This natural problem has not been previously studied and we show that our solutions can be used to get substantially improved SQ versions of Perceptron and other online algorithms for learning halfspaces.

preprint, 2016, arXiv

Towards Human Computable Passwords

An interesting challenge for the cryptography community is to design authentication protocols that are so simple that a human can execute them without relying on a fully trusted computer. We propose several candidate authentication protocols for a setting in which the human user can only receive assistance from a semi-trusted computer, that is, a computer that stores information and performs computations correctly but does not provide confidentiality. Our schemes use a semi-trusted computer to store and display public challenges $C_i\in[n]^k$. The human user memorizes a random secret mapping $σ:[n]\rightarrow\mathbb{Z}_d$ and authenticates by computing responses $f(σ(C_i))$ to a sequence of public challenges where $f:\mathbb{Z}_d^k\rightarrow\mathbb{Z}_d$ is a function that is easy for the human to evaluate. We prove that any statistical adversary needs to sample $m=\tilde{Ω}(n^{s(f)})$ challenge-response pairs to recover $σ$, for a security parameter $s(f)$ that depends on two key properties of $f$. To obtain our results, we apply the general hypercontractivity theorem to lower bound the statistical dimension of the distribution over challenge-response pairs induced by $f$ and $σ$. Our lower bounds apply to arbitrary functions $f$ (not just to functions that are easy for a human to evaluate), and generalize recent results of Feldman et al. As an application, we propose a family of human computable password functions $f_{k_1,k_2}$ in which the user needs to perform $2k_1+2k_2+1$ primitive operations (e.g., adding two digits or remembering $σ(i)$), and we show that $s(f) = \min\{k_1+1, (k_2+1)/2\}$. For these schemes, we prove that forging passwords is equivalent to recovering the secret mapping. Thus, our human computable password schemes can maintain strong security guarantees even after an adversary has observed the user log in to many different accounts.
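The challenge-response flow in this abstract can be sketched in a few lines. Everything below is an illustrative assumption: indices are 0-based rather than $[n]$, and the response function is a placeholder $f(x_1,\ldots,x_k) = \sum_i x_i \bmod d$ chosen only for brevity. It is not one of the functions $f_{k_1,k_2}$ from the paper, and no security claim is made for it.

```python
import random


def make_secret(n, d, seed=0):
    """Secret mapping sigma: {0, ..., n-1} -> Z_d, chosen at random and memorized."""
    rng = random.Random(seed)
    return [rng.randrange(d) for _ in range(n)]


def respond(sigma, challenge, d):
    """Response f(sigma(C)) using the placeholder f = sum of the mapped values mod d."""
    return sum(sigma[i] for i in challenge) % d
```

In use, the semi-trusted computer would display a challenge such as `(0, 2, 4)` and the user would answer with `respond(sigma, (0, 2, 4), d)` computed mentally.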

preprint, 2015, arXiv

Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's

We present an algorithm for recovering planted solutions in two well-known models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. Our algorithm matches up to a constant factor the best-known bounds for the number of edges (or constraints) needed for perfect recovery and its running time is linear in the number of edges used. The time complexity is significantly better than both spectral and SDP-based approaches. The main contribution of the algorithm is in the case of unequal sizes in the bipartition (corresponding to odd uniformity in the CSP). Here our algorithm succeeds at a significantly lower density than the spectral approaches, surpassing a barrier based on the spectral norm of a random matrix. Other significant features of the algorithm and analysis include: (i) the critical use of power iteration with subsampling, which might be of independent interest and whose analysis requires keeping track of multiple norms of an evolving solution; (ii) it can be implemented statistically, i.e., with very limited access to the input distribution; and (iii) the algorithm is extremely simple to implement and runs in linear time, and thus is practical even for very large instances.

preprint, 2014, arXiv

Efficient Representations for Life-Long Learning and Autoencoding

It has been a long-standing goal in machine learning, as well as in AI more generally, to develop life-long learning systems that learn many different tasks over time, and reuse insights from tasks learned, "learning to learn" as they do so. In this work we pose and provide efficient algorithms for several natural theoretical formulations of this goal. Specifically, we consider the problem of learning many different target functions over time, that share certain commonalities that are initially unknown to the learning algorithm. Our aim is to learn new internal representations as the algorithm learns new target functions, that capture this commonality and allow subsequent learning tasks to be solved more efficiently and from less data. We develop efficient algorithms for two very different kinds of commonalities that target functions might share: one based on learning common low-dimensional and unions of low-dimensional subspaces and one based on learning nonlinear Boolean combinations of features. Our algorithms for learning Boolean feature combinations additionally have a dual interpretation, and can be viewed as giving an efficient procedure for constructing near-optimal sparse Boolean autoencoders under a natural "anchor-set" assumption.

preprint, 2014, arXiv

Fourier PCA and Robust Tensor Decomposition

Fourier PCA is Principal Component Analysis of a matrix obtained from higher order derivatives of the logarithm of the Fourier transform of a distribution. We make this method algorithmic by developing a tensor decomposition method for a pair of tensors sharing the same vectors in rank-$1$ decompositions. Our main application is the first provably polynomial-time algorithm for underdetermined ICA, i.e., learning an $n \times m$ matrix $A$ from observations $y=Ax$ where $x$ is drawn from an unknown product distribution with arbitrary non-Gaussian components. The number of component distributions $m$ can be arbitrarily higher than the dimension $n$ and the columns of $A$ only need to satisfy a natural and efficiently verifiable nondegeneracy condition. As a second application, we give an alternative algorithm for learning mixtures of spherical Gaussians with linearly independent means. These results also hold in the presence of Gaussian noise.

preprint, 2014, arXiv

Principal Component Analysis and Higher Correlations for Distributed Data

We consider algorithmic problems in the setting in which the input data has been partitioned arbitrarily on many servers. The goal is to compute a function of all the data, and the bottleneck is the communication used by the algorithm. We present algorithms for two illustrative problems on massive data sets: (1) computing a low-rank approximation of a matrix $A=A^1 + A^2 + \ldots + A^s$, with matrix $A^t$ stored on server $t$ and (2) computing a function of a vector $a_1 + a_2 + \ldots + a_s$, where server $t$ has the vector $a_t$; this includes the well-studied special case of computing frequency moments and separable functions, as well as higher-order correlations such as the number of subgraphs of a specified type occurring in a graph. For both problems we give algorithms with nearly optimal communication, and in particular the only dependence on $n$, the size of the data, is in the number of bits needed to represent indices and words ($O(\log n)$).

preprint, 2014, arXiv

Stochastic billiards for sampling from the boundary of a convex set

Stochastic billiards can be used for approximate sampling from the boundary of a bounded convex set through the Markov Chain Monte Carlo (MCMC) paradigm. This paper studies how many steps of the underlying Markov chain are required to get samples (approximately) from the uniform distribution on the boundary of the set, for sets with an upper bound on the curvature of the boundary. Our main theorem implies a polynomial-time algorithm for sampling from the boundary of such sets.

preprint, 2014, arXiv

The Cutting Plane Method is Polynomial for Perfect Matchings

The cutting plane approach to optimal matchings has been discussed by several authors over the past decades (e.g., Padberg and Rao '82, Grotschel and Holland '85, Lovasz and Plummer '86, Trick '87, Fischetti and Lodi '07) and its convergence has been an open question. We give a cutting plane algorithm that converges in polynomial time using only Edmonds' blossom inequalities; it maintains half-integral intermediate LP solutions supported by a disjoint union of odd cycles and edges. Our main insight is a method to retain only a subset of the previously added cutting planes based on their dual values. This allows us to quickly find violated blossom inequalities and argue convergence by tracking the number of odd cycles in the support of intermediate solutions.

preprint, 2013, arXiv

A Cubic Algorithm for Computing Gaussian Volume

We present randomized algorithms for sampling the standard Gaussian distribution restricted to a convex set and for estimating the Gaussian measure of a convex set, in the general membership oracle model. The complexity of integration is $O^*(n^3)$ while the complexity of sampling is $O^*(n^3)$ for the first sample and $O^*(n^2)$ for every subsequent sample. These bounds improve on the corresponding state-of-the-art by a factor of $n$. Our improvement comes from several aspects: better isoperimetry, smoother annealing, avoiding transformation to isotropic position and the use of the "speedy walk" in the analysis.

preprint, 2013, arXiv

Integer Feasibility of Random Polytopes

We study integer programming instances over polytopes $P(A,b)=\{x : Ax \le b\}$ where the constraint matrix $A$ is random, i.e., its entries are i.i.d. Gaussian or, more generally, its rows are i.i.d. from a spherically symmetric distribution. The radius of the largest inscribed ball is closely related to the existence of integer points in the polytope. We show that for $m=2^{O(\sqrt{n})}$, there exist constants $c_0 < c_1$ such that with high probability, random polytopes are integer feasible if the radius of the largest ball contained in the polytope is at least $c_1\sqrt{\log(m/n)}$; and integer infeasible if the largest ball contained in the polytope is centered at $(1/2,\ldots,1/2)$ and has radius at most $c_0\sqrt{\log(m/n)}$. Thus, random polytopes transition from having no integer points to being integer feasible within a constant factor increase in the radius of the largest inscribed ball. We show integer feasibility via a randomized polynomial-time algorithm for finding an integer point in the polytope. Our main tool is a simple new connection between integer feasibility and linear discrepancy. We extend a recent algorithm for finding low-discrepancy solutions (Lovett-Meka, FOCS '12) to give a constructive upper bound on the linear discrepancy of random matrices. By our connection between discrepancy and integer feasibility, this upper bound on linear discrepancy translates to the radius lower bound that guarantees integer feasibility of random polytopes.

preprint, 2013, arXiv

The Complexity of Approximating Vertex Expansion

We study the complexity of approximating the vertex expansion of graphs $G = (V,E)$, defined as \[ Φ^V := \min_{S \subset V} n \cdot \frac{|N(S)|}{|S| |V \backslash S|}. \] We give a simple polynomial-time algorithm for finding a subset with vertex expansion $O(\sqrt{OPT \log d})$ where $d$ is the maximum degree of the graph. Our main result is an asymptotically matching lower bound: under the Small Set Expansion (SSE) hypothesis, it is hard to find a subset with expansion less than $C\sqrt{OPT \log d}$ for an absolute constant $C$. In particular, this implies that for every constant $ε > 0$, it is SSE-hard to distinguish whether the vertex expansion is less than $ε$ or at least an absolute constant. The analogous threshold for edge expansion is $\sqrt{OPT}$ with no dependence on the degree; thus our results suggest that vertex expansion is harder to approximate than edge expansion. In particular, while Cheeger's algorithm can certify constant edge expansion, it is SSE-hard to certify constant vertex expansion in graphs. Our proof is via a reduction from the Unique Games instance obtained from the SSE hypothesis to the vertex expansion problem. It involves the definition of a smoother intermediate problem we call Analytic Vertex Expansion, which is representative of both the vertex expansion and the conductance of the graph. Both reductions (from the UGC instance to this problem and from this problem to vertex expansion) use novel proof ideas.
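The definition of $Φ^V$ can be checked on tiny graphs by brute force. The sketch below is an exponential-time illustration only, not the polynomial-time algorithm from the abstract. It assumes that $N(S)$ means the vertices outside $S$ with at least one neighbor in $S$ (the abstract does not spell this out) and that the minimum ranges over nonempty proper subsets; the adjacency-dictionary input format is also an assumption.

```python
from fractions import Fraction
from itertools import combinations


def vertex_expansion(adj):
    """Brute-force min over nonempty proper S of n*|N(S)| / (|S| * |V - S|)."""
    vertices = list(adj)
    n = len(vertices)
    best = None
    for size in range(1, n):
        for subset in combinations(vertices, size):
            s = set(subset)
            # N(S): vertices outside S with at least one neighbor in S (assumed definition)
            boundary = {v for u in s for v in adj[u]} - s
            value = Fraction(n * len(boundary), len(s) * (n - len(s)))
            if best is None or value < best:
                best = value
    return best
```

On the path `{0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}` the minimum is attained by cutting the path in the middle.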

preprint, 2012, arXiv

Near-Optimal Deterministic Algorithms for Volume Computation and Lattice Problems via M-Ellipsoids

We give a deterministic $2^{O(n)}$ algorithm for computing an M-ellipsoid of a convex body, matching a known lower bound. This has several interesting consequences including improved deterministic algorithms for volume estimation of convex bodies and the shortest and closest lattice vector problems under general norms.

preprint, 2011, arXiv

Algorithms for Implicit Hitting Set Problems

A hitting set for a collection of sets is a set that has a non-empty intersection with each set in the collection; the hitting set problem is to find a hitting set of minimum cardinality. Motivated by instances of the hitting set problem where the number of sets to be hit is large, we introduce the notion of implicit hitting set problems. In an implicit hitting set problem the collection of sets to be hit is typically too large to list explicitly; instead, an oracle is provided which, given a set $H$, either determines that $H$ is a hitting set or returns a set that $H$ does not hit. We show a number of examples of classic implicit hitting set problems, and give a generic algorithm for solving such problems optimally. The main contribution of this paper is to show that this framework is valuable in developing approximation algorithms. We illustrate this methodology by presenting a simple on-line algorithm for the minimum feedback vertex set problem on random graphs. In particular our algorithm gives a feedback vertex set of size $n-(1/p)\log(np)(1-o(1))$ with probability at least $3/4$ for the random graph $G_{n,p}$ (the smallest feedback vertex set is of size $n-(2/p)\log(np)(1+o(1))$). We also consider a planted model for the feedback vertex set in directed random graphs. Here we show that a hitting set for a polynomial-sized subset of cycles is a hitting set for the planted random graph and this allows us to exactly recover the planted feedback vertex set.
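One natural way to realize the oracle-driven loop described in this abstract is sketched below. It is a guess at the general shape of such a generic algorithm, not the paper's pseudocode: it keeps the sets returned so far, solves the explicit hitting set problem on them by brute force (an assumption made for brevity, and exponential in the universe size), and stops when the oracle reports no missed set. The names `min_hitting_set`, `implicit_hitting_set` and the oracle signature are hypothetical.

```python
from itertools import combinations


def min_hitting_set(universe, sets):
    """Smallest subset of `universe` meeting every set in `sets` (brute force)."""
    for size in range(len(universe) + 1):
        for cand in combinations(universe, size):
            chosen = set(cand)
            if all(chosen & s for s in sets):
                return chosen
    return None  # happens only if some set is empty or lies outside the universe


def implicit_hitting_set(universe, oracle):
    """Alternate between solving the known sets and asking the oracle for a miss."""
    known = []
    while True:
        h = min_hitting_set(universe, known)
        if h is None:
            raise ValueError("the sets returned so far cannot all be hit")
        missed = oracle(h)          # None means h hits every set in the collection
        if missed is None:
            return h
        known.append(set(missed))
```

The returned set is optimal under these assumptions because it is a minimum hitting set for a subcollection, so it is no larger than the optimum, and the oracle has confirmed that it hits everything.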

preprint, 2011, arXiv

Deterministic Construction of an Approximate M-Ellipsoid and its Application to Derandomizing Lattice Algorithms

We give a deterministic $O(\log n)^n$ algorithm for the Shortest Vector Problem (SVP) of a lattice under any norm, improving on the previous best deterministic bound of $n^{O(n)}$ for general norms and nearly matching the bound of $2^{O(n)}$ for the standard Euclidean norm established by Micciancio and Voulgaris (STOC 2010). Our algorithm can be viewed as a derandomization of the AKS randomized sieve algorithm, which can be used to solve SVP for any norm in $2^{O(n)}$ time with high probability. We use the technique of covering a convex body by ellipsoids, as introduced for lattice problems in (Dadush et al., FOCS 2011). Our main contribution is a deterministic approximation of an M-ellipsoid of any convex body. We achieve this via a convex programming formulation of the optimal ellipsoid with the objective function being an $n$-dimensional integral that we show can be approximated deterministically, a technique that appears to be of independent interest.

preprint, 2011, arXiv

Enumerative Lattice Algorithms in Any Norm via M-Ellipsoid Coverings

We give a novel algorithm for enumerating lattice points in any convex body, and give applications to several classic lattice problems, including the Shortest and Closest Vector Problems (SVP and CVP, respectively) and Integer Programming (IP). Our enumeration technique relies on a classical concept from asymptotic convex geometry known as the M-ellipsoid, and uses as a crucial subroutine the recent algorithm of Micciancio and Voulgaris (STOC 2010) for lattice problems in the $\ell_2$ norm. As a main technical contribution, which may be of independent interest, we build on the techniques of Klartag (Geometric and Functional Analysis, 2006) to give an expected $2^{O(n)}$-time algorithm for computing an M-ellipsoid for any $n$-dimensional convex body. As applications, we give deterministic $2^{O(n)}$-time and -space algorithms for solving exact SVP, and exact CVP when the target point is sufficiently close to the lattice, on $n$-dimensional lattices in any (semi-)norm given an M-ellipsoid of the unit ball. In many norms of interest, including all $\ell_p$ norms, an M-ellipsoid is computable in deterministic $\mathrm{poly}(n)$ time, in which case these algorithms are fully deterministic. Here our approach may be seen as a derandomization of the "AKS sieve" for exact SVP and CVP (Ajtai, Kumar, and Sivakumar; STOC 2001 and CCC 2002). As a further application of our SVP algorithm, we derive an expected $O(f^*(n))^n$-time algorithm for Integer Programming, where $f^*(n)$ denotes the optimal bound in the so-called "flatness theorem," which satisfies $f^*(n) = O(n^{4/3}\,\mathrm{polylog}(n))$ and is conjectured to be $f^*(n)=Θ(n)$. Our runtime improves upon the previous best of $O(n^{2})^{n}$ by Hildebrand and Koppe (2010).

preprint2011arXiv

Many Sparse Cuts via Higher Eigenvalues

Cheeger's fundamental inequality states that any edge-weighted graph has a vertex subset $S$ such that its expansion (a.k.a. conductance) is bounded as follows: \[ ϕ(S) := \frac{w(S,\bar{S})}{\min \{w(S), w(\bar{S})\}} \leq 2\sqrt{λ_2} \] where $w$ is the total edge weight of a subset or a cut and $λ_2$ is the second smallest eigenvalue of the normalized Laplacian of the graph. Here we prove the following natural generalization: for any integer $k \in [n]$, there exist $ck$ disjoint subsets $S_1, \ldots, S_{ck}$, such that \[ \max_i ϕ(S_i) \leq C \sqrt{λ_{k} \log k} \] where $λ_i$ is the $i^{th}$ smallest eigenvalue of the normalized Laplacian and $c<1, C>0$ are suitable absolute constants. Our proof is via a polynomial-time algorithm to find such subsets, consisting of a spectral projection and a randomized rounding. As a consequence, we get the same upper bound for the small set expansion problem, namely for any $k$, there is a subset $S$ whose weight is at most an $O(1/k)$ fraction of the total weight and $ϕ(S) \le C \sqrt{λ_k \log k}$. Both results are the best possible up to constant factors. The underlying algorithmic problem, namely finding $k$ subsets such that the maximum expansion is minimized, besides extending sparse cuts to more than one subset, appears to be a natural clustering problem in its own right.
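
To make the expansion quantity in this abstract concrete, here is a minimal sketch that evaluates ϕ(S) on a small weighted graph and finds the sparsest subset by exhaustive search. It is not the paper's spectral algorithm. The function names are hypothetical, and reading w(S) as the total weighted degree of S is an assumption about the abstract's notation.

```python
from itertools import combinations

def expansion(weights, n, subset):
    """phi(S) = w(S, S-bar) / min(w(S), w(S-bar)).

    Assumption: w(S) is the total weighted degree of the vertices in S.
    `weights` maps an undirected edge (u, v) to its weight.
    """
    s = set(subset)
    degree = [0.0] * n
    cut = 0.0
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w
        if (u in s) != (v in s):  # edge crosses the cut
            cut += w
    vol_s = sum(degree[v] for v in s)
    vol_rest = sum(degree) - vol_s
    # Undefined (division by zero) if one side has no incident edges.
    return cut / min(vol_s, vol_rest)

def sparsest_cut_brute_force(weights, n):
    """Try every subset of size <= n/2; exponential time, tiny graphs only."""
    best = None
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            phi = expansion(weights, n, subset)
            if best is None or phi < best[0]:
                best = (phi, subset)
    return best

# Two triangles joined by a single edge (2, 3).
edges = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
         (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
         (2, 3): 1.0}
phi, subset = sparsest_cut_brute_force(edges, 6)
```

On this example the search returns one of the two triangles, which cuts a single edge out of a total degree of 7 on each side.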

preprint2010arXiv

A Deterministic Polynomial-time Approximation Scheme for Counting Knapsack Solutions

Given n elements with nonnegative integer weights w_1, ..., w_n and an integer capacity C, we consider the counting version of the classic knapsack problem: find the number of distinct subsets whose weights add up to at most the given capacity. We give a deterministic algorithm that estimates the number of solutions to within relative error 1 ± eps in time polynomial in n and 1/eps (fully polynomial approximation scheme). More precisely, our algorithm takes time O(n^3 (1/eps) log (n/eps)). Our algorithm is based on dynamic programming. Previously, randomized polynomial time approximation schemes were known first by Morris and Sinclair via Markov chain Monte Carlo techniques, and subsequently by Dyer via dynamic programming and rejection sampling.
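
For reference, the quantity being approximated can be computed exactly by the textbook pseudo-polynomial dynamic program sketched below. This is a baseline, not the paper's approximation scheme: its running time is O(n·C), which depends on the capacity C rather than only on n and 1/eps. The function name is hypothetical.

```python
def count_knapsack_solutions(weights, capacity):
    """Count subsets of `weights` whose total weight is at most `capacity`.

    Exact, pseudo-polynomial baseline: O(n * capacity) time.
    """
    # ways[c] = number of subsets of the items seen so far
    # whose total weight is exactly c.
    ways = [0] * (capacity + 1)
    ways[0] = 1  # the empty subset
    for w in weights:
        # Iterate downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            ways[c] += ways[c - w]
    return sum(ways)

# Subsets of {1, 2, 3} with total weight <= 3:
# {}, {1}, {2}, {3}, {1, 2}
count = count_knapsack_solutions([1, 2, 3], 3)
```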

preprint2009arXiv

Logconcave Random Graphs

We propose the following model of a random graph on n vertices. Let F be a distribution in R_+^{n(n-1)/2} with a coordinate for every pair ij with 1 \le i,j \le n. Then G_{F,p} is the distribution on graphs with n vertices obtained by picking a random point X from F and defining a graph on n vertices whose edges are pairs ij for which X_{ij} \le p. The standard Erdős-Rényi model is the special case when F is uniform on the 0-1 unit cube. We examine basic properties such as the connectivity threshold for quite general distributions. We also consider cases where the X_{ij} are the edge weights in some random instance of a combinatorial optimization problem. By choosing suitable distributions, we can capture random graphs with interesting properties such as triangle-free random graphs and weighted random graphs with bounded total weight.
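
The model G_{F,p} described above can be sketched in a few lines: draw one point X with a coordinate per vertex pair, then keep the pairs with X_{ij} ≤ p. The names `sample_graph`, `sample_point`, and `exponential_point` are hypothetical. The default sampler is uniform on the unit cube, which gives the Erdős-Rényi special case, and independent exponentials are included as one example of a logconcave F.

```python
import random
from itertools import combinations

def sample_graph(n, p, sample_point=None, rng=None):
    """Draw a graph from G_{F,p}: edges are pairs ij with X_ij <= p.

    `sample_point(m, rng)` must return m nonnegative numbers drawn from F.
    If omitted, F is uniform on the unit cube (Erdos-Renyi with edge
    probability p).
    """
    rng = rng or random.Random()
    pairs = list(combinations(range(n), 2))  # one coordinate per pair i < j
    if sample_point is None:
        x = [rng.random() for _ in pairs]
    else:
        x = sample_point(len(pairs), rng)
    return {pair for pair, x_ij in zip(pairs, x) if x_ij <= p}

def exponential_point(m, rng):
    """Independent Exp(1) coordinates: a product logconcave distribution."""
    return [rng.expovariate(1.0) for _ in range(m)]

rng = random.Random(0)
er_graph = sample_graph(6, 0.5, rng=rng)
exp_graph = sample_graph(6, 0.5, sample_point=exponential_point, rng=rng)
```

Samplers with dependent coordinates fit the same interface, since `sample_point` returns the whole vector X at once.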

preprint2009arXiv

Random Tensors and Planted Cliques

The r-parity tensor of a graph is a generalization of the adjacency matrix, where the tensor's entries denote the parity of the number of edges in subgraphs induced by r distinct vertices. For r=2, it is the adjacency matrix with 1's for edges and -1's for nonedges. It is well-known that the 2-norm of the adjacency matrix of a random graph is O(\sqrt{n}). Here we show that the 2-norm of the r-parity tensor is at most f(r)\sqrt{n}\log^{O(r)}n, answering a question of Frieze and Kannan who proved this for r=3. As a consequence, we get a tight connection between the planted clique problem and the problem of finding a vector that approximates the 2-norm of the r-parity tensor of a random graph. Our proof method is based on an inductive application of concentration of measure.
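
As an illustration of the object in this abstract, the sketch below builds an r-parity tensor for a small graph. It assumes one specific convention: the entry for r distinct vertices is the product, over all pairs among them, of the ±1 adjacency entries. This reproduces the r=2 case stated above (1 for edges, -1 for nonedges), but whether it matches the paper's exact definition for larger r is an assumption. The function name and the sparse dictionary layout are hypothetical.

```python
from itertools import combinations, permutations

def parity_tensor(n, edges, r):
    """Return the r-parity tensor as a dict from index tuples to +1 or -1.

    Assumed convention: the entry for r distinct vertices is the product of
    the +/-1 adjacency entries over all pairs among them. Index tuples with
    a repeated vertex are omitted (treated as 0).
    """
    edge_set = {frozenset(e) for e in edges}

    def sign(u, v):
        return 1 if frozenset((u, v)) in edge_set else -1

    tensor = {}
    for vertices in combinations(range(n), r):
        value = 1
        for u, v in combinations(vertices, 2):
            value *= sign(u, v)
        # The entry is symmetric under any reordering of its indices.
        for index in permutations(vertices):
            tensor[index] = value
    return tensor

# Path 0 - 1 - 2 on three vertices.
path = [(0, 1), (1, 2)]
adjacency = parity_tensor(3, path, 2)   # r = 2: the +/-1 adjacency matrix
triples = parity_tensor(3, path, 3)     # r = 3: one value per vertex triple
```

For the path, the single triple has two edges and one nonedge, so under this convention its entry is -1.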

31 published items

Convergence of Gibbs Sampling: Coordinate Hit-and-Run Mixes Fast

Solving Sparse Linear Systems Faster than Matrix Multiplication

Multi-Criteria Dimensionality Reduction with Applications to Fairness

Robustly Clustering a Mixture of Gaussians

Strong Self-Concordance and Sampling

A Note on Non-Degenerate Integer Programs with Small Sub-Determinants

Agnostic Estimation of Mean and Covariance

Chi-squared Amplification: Identifying Hidden Hubs

Cortical Computation via Iterative Constructions

Gaussian Cooling and O*(n^3) Algorithms for Volume and Gaussian Volume

Geometric Random Edge

Statistical Algorithms and a Lower Bound for Detecting Planted Clique

Statistical Query Algorithms for Mean Vector Estimation and Stochastic Convex Optimization

Towards Human Computable Passwords

Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's

Efficient Representations for Life-Long Learning and Autoencoding

Fourier PCA and Robust Tensor Decomposition

Principal Component Analysis and Higher Correlations for Distributed Data

Stochastic billiards for sampling from the boundary of a convex set

The Cutting Plane Method is Polynomial for Perfect Matchings

A Cubic Algorithm for Computing Gaussian Volume

Integer Feasibility of Random Polytopes

The Complexity of Approximating Vertex Expansion

Near-Optimal Deterministic Algorithms for Volume Computation and Lattice Problems via M-Ellipsoids

Algorithms for Implicit Hitting Set Problems

Deterministic Construction of an Approximate M-Ellipsoid and its Application to Derandomizing Lattice Algorithms

Enumerative Lattice Algorithms in Any Norm via M-Ellipsoid Coverings

Many Sparse Cuts via Higher Eigenvalues

A Deterministic Polynomial-time Approximation Scheme for Counting Knapsack Solutions

Logconcave Random Graphs

Random Tensors and Planted Cliques