Source author record

Moses Charikar

Moses Charikar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Data Structures and Algorithms; Machine Learning; Computational Complexity; math.ST; Statistics Theory; Computation; Distributed, Parallel, and Cluster Computing; Information Theory; math.IT; Computer Science and Game Theory; Cryptography and Security; Databases; Human-Computer Interaction

Catalog footprint

What is connected

24 works

13 topics

4 close collaborators

preprint 2024 arXiv

A Quasi-Monte Carlo Data Structure for Smooth Kernel Evaluations

In the kernel density estimation (KDE) problem one is given a kernel $K(x, y)$ and a dataset $P$ of points in a Euclidean space, and must prepare a data structure that can quickly answer density queries: given a point $q$, output a $(1+ε)$-approximation to $μ:=\frac1{|P|}\sum_{p\in P} K(p, q)$. The classical approach to KDE is the celebrated fast multipole method of [Greengard and Rokhlin]. The fast multipole method combines a basic space partitioning approach with a multidimensional Taylor expansion, which yields a $\approx \log^d (n/ε)$ query time (exponential in the dimension $d$). A recent line of work initiated by [Charikar and Siminelakis] achieved polynomial dependence on $d$ via a combination of random sampling and randomized space partitioning, with [Backurs et al.] giving an efficient data structure with query time $\approx \mathrm{poly}{\log(1/μ)}/ε^2$ for smooth kernels. Quadratic dependence on $ε$, inherent to the sampling methods, is prohibitively expensive for small $ε$. This issue is addressed by quasi-Monte Carlo methods in numerical analysis. The high level idea in quasi-Monte Carlo methods is to replace random sampling with a discrepancy based approach -- an idea recently applied to coresets for KDE by [Phillips and Tai]. The work of Phillips and Tai gives a space efficient data structure with query complexity $\approx 1/(εμ)$. This is polynomially better in $1/ε$, but exponentially worse in $1/μ$. We achieve the best of both: a data structure with $\approx \mathrm{poly}{\log(1/μ)}/ε$ query time for smooth kernel KDE. Our main insight is a new way to combine discrepancy theory with randomized space partitioning inspired by, but significantly more efficient than, that of the fast multipole methods. We hope that our techniques will find further applications to linear algebra for kernel matrices.
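
For orientation, the density $μ$ defined above and the plain random-sampling baseline that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the data structure of the paper: the Gaussian kernel, the bandwidth, the synthetic dataset and all function names are assumptions made for the example. Because kernel values lie in $[0, 1]$, the standard error of the sampled estimate decays like $1/\sqrt{m}$ in the number of samples $m$, which is where the quadratic dependence on $1/ε$ comes from.

```python
import math
import random

def gaussian_kernel(p, q, bandwidth=1.0):
    """Smooth kernel K(p, q) = exp(-||p - q||^2 / bandwidth^2), with values in (0, 1]."""
    sq = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq / bandwidth ** 2)

def exact_kde(points, q, kernel=gaussian_kernel):
    """mu = (1/|P|) * sum_{p in P} K(p, q), computed by a full scan of P."""
    return sum(kernel(p, q) for p in points) / len(points)

def sampled_kde(points, q, m, rng, kernel=gaussian_kernel):
    """Unbiased estimate of mu from m uniform samples drawn with replacement."""
    return sum(kernel(rng.choice(points), q) for _ in range(m)) / m

# Synthetic data (an assumption for the example): 2000 Gaussian points in the plane.
rng = random.Random(0)
P = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
q = (0.5, -0.25)
mu = exact_kde(P, q)
mu_hat = sampled_kde(P, q, m=20000, rng=rng)
```

Here `mu_hat` approximates `mu` additively; obtaining a relative $(1+ε)$ guarantee by sampling alone requires more samples as $μ$ shrinks.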

preprint 2022 arXiv

Almost 3-Approximate Correlation Clustering in Constant Rounds

We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of disagreements with the labels. Our main result is an algorithm that for any $ε> 0$ obtains a $(3+ε)$-approximation in $O(1/ε)$ rounds (of models such as massively parallel computation, local, and semi-streaming). This is a culminating point for the rich literature on parallel correlation clustering. On the one hand, the approximation (almost) matches a natural barrier of 3 for combinatorial algorithms. On the other hand, the algorithm's round-complexity is essentially constant. To achieve this result, we introduce a simple $O(1/ε)$-round parallel algorithm. Our main result is to provide an analysis of this algorithm, showing that it achieves a $(3+ε)$-approximation. Our analysis draws on new connections to sublinear-time algorithms. Specifically, it builds on the work of Yoshida, Yamamoto, and Ito [STOC'09] on bounding the "query complexity" of greedy maximal independent set. To our knowledge, this is the first application of this method in analyzing the approximation ratio of any algorithm.
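
To make the objective concrete, here is a small sketch of the classical sequential Pivot procedure, which is known to give a 3-approximation in expectation, together with a disagreement counter. It is not the parallel algorithm of the paper; the adjacency-set input format and the function names are assumptions for the example. The pivots chosen in random order form a greedy maximal independent set of the "similar" graph, which is the object the abstract's analysis refers to.

```python
import random

def pivot_clustering(nodes, similar, rng):
    """Sequential Pivot: pick a random unclustered node, cluster it with all of
    its still-unclustered 'similar' neighbours, and repeat."""
    order = list(nodes)
    rng.shuffle(order)
    clustered = set()
    clusters = []
    for pivot in order:
        if pivot in clustered:
            continue
        cluster = {pivot} | {v for v in similar[pivot] if v not in clustered}
        clustered |= cluster
        clusters.append(cluster)
    return clusters

def disagreements(nodes, similar, clusters):
    """Count similar pairs that are split plus dissimilar pairs that are merged."""
    label = {v: i for i, c in enumerate(clusters) for v in c}
    nodes = sorted(nodes)
    cost = 0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            same_cluster = label[u] == label[v]
            if same_cluster != (v in similar[u]):
                cost += 1
    return cost

# Two disjoint "similar" cliques {0, 1, 2} and {3, 4}; all other pairs are dissimilar.
similar = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
clusters = pivot_clustering(similar.keys(), similar, random.Random(1))
cost = disagreements(similar.keys(), similar, clusters)
```

On this consistent instance every pivot order recovers the two cliques with zero disagreements; on inconsistent labelings the cost depends on the random order.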

preprint 2022 arXiv

Polylogarithmic Sketches for Clustering

Given $n$ points in $\ell_p^d$, we consider the problem of partitioning points into $k$ clusters with associated centers. The cost of a clustering is the sum of $p^{\text{th}}$ powers of distances of points to their cluster centers. For $p \in [1,2]$, we design sketches of size poly$(\log(nd),k,1/ε)$ such that the cost of the optimal clustering can be estimated to within factor $1+ε$, despite the fact that the compressed representation does not contain enough information to recover the cluster centers or the partition into clusters. This leads to a streaming algorithm for estimating the clustering cost with space poly$(\log(nd),k,1/ε)$. We also obtain a distributed memory algorithm, where the $n$ points are arbitrarily partitioned amongst $m$ machines, each of which sends information to a central party who then computes an approximation of the clustering cost. Prior to this work, no such streaming or distributed-memory algorithm was known with sublinear dependence on $d$ for $p \in [1,2)$.
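
The cost function described above can be written down directly. The sketch below evaluates it exactly for a given set of centers, assigning each point to its nearest center; it is only a reference computation on the uncompressed data, not a sketching or streaming algorithm, and the function names and sample points are assumptions for the example.

```python
def lp_distance_pow(x, c, p):
    """||x - c||_p^p, the p-th power of the l_p distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, c))

def clustering_cost(points, centers, p):
    """Sum over points of the p-th power of the l_p distance to the nearest center."""
    return sum(min(lp_distance_pow(x, c, p) for c in centers) for x in points)

points = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]
cost_l1 = clustering_cost(points, centers, p=1)   # k-median-style cost in l_1
cost_l2 = clustering_cost(points, centers, p=2)   # k-means cost in l_2
```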

preprint 2021 arXiv

Approximation Algorithms for Orthogonal Non-negative Matrix Factorization

In the non-negative matrix factorization (NMF) problem, the input is an $m\times n$ matrix $M$ with non-negative entries and the goal is to factorize it as $M\approx AW$. The $m\times k$ matrix $A$ and the $k\times n$ matrix $W$ are both constrained to have non-negative entries. This is in contrast to singular value decomposition, where the matrices $A$ and $W$ can have negative entries but must satisfy the orthogonality constraint: the columns of $A$ are orthogonal and the rows of $W$ are also orthogonal. The orthogonal non-negative matrix factorization (ONMF) problem imposes both the non-negativity and the orthogonality constraints, and previous work showed that it leads to better performance than NMF on many clustering tasks. We give the first constant-factor approximation algorithm for ONMF when one or both of $A$ and $W$ are subject to the orthogonality constraint. We also show an interesting connection to the correlation clustering problem on bipartite graphs. Our experiments on synthetic and real-world data show that our algorithm achieves similar or smaller errors compared to previous ONMF algorithms while ensuring perfect orthogonality (many previous algorithms do not satisfy the hard orthogonality constraint).

preprint 2020 arXiv

A General Framework for Symmetric Property Estimation

In this paper we provide a general framework for estimating symmetric properties of distributions from i.i.d. samples. For a broad class of symmetric properties we identify the easy region where empirical estimation works and the difficult region where more complex estimators are required. We show that by approximately computing the profile maximum likelihood (PML) distribution [ADOS16] in this difficult region we obtain a symmetric property estimation framework that is sample complexity optimal for many properties in a broader parameter regime than previous universal estimation approaches based on PML. The resulting algorithms based on these pseudo PML distributions are also more practical.
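
As an illustration of "empirical estimation", the sketch below computes the plug-in estimate of Shannon entropy, one standard example of a symmetric property (its value does not depend on how the elements are labeled). The choice of entropy and the function name are assumptions for the example; the PML-based estimator for the difficult region is not shown.

```python
import math
from collections import Counter

def empirical_entropy(samples):
    """Plug-in estimate: Shannon entropy (in nats) of the empirical distribution."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

h = empirical_entropy(["a", "a", "b", "b"])   # empirical distribution is uniform on 2 symbols
```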

preprint 2020 arXiv

A Simple Sublinear Algorithm for Gap Edit Distance

We study the problem of estimating the edit distance between two $n$-character strings. While exact computation in the worst case is believed to require near-quadratic time, previous work showed that in certain regimes it is possible to solve the following gap edit distance problem in sublinear time: distinguish between inputs of distance $\le k$ and $>k^2$. Our main result is a very simple algorithm for this benchmark that runs in time $\tilde O(n/\sqrt{k})$, and in particular settles the open problem of obtaining a truly sublinear time for the entire range of relevant $k$. Building on the same framework, we also obtain a $k$-vs-$k^2$ algorithm for the one-sided preprocessing model with $\tilde O(n)$ preprocessing time and $\tilde O(n/k)$ query time (improving over a recent $\tilde O(n/k+k^2)$-query time algorithm for the same problem [GRS'20]).
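
For reference, the gap problem can be stated in code using the classical quadratic-time dynamic program. This is the exact baseline the abstract alludes to, not the sublinear algorithm; the function names and the "close"/"far"/"either" labels are assumptions for the example.

```python
def edit_distance(s, t):
    """Exact Levenshtein distance via the classical O(|s| * |t|) dynamic program."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        cur = [i]
        for j, b in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return prev[-1]

def gap_answer(s, t, k):
    """Reference answer for the k-vs-k^2 gap problem, computed exactly."""
    d = edit_distance(s, t)
    if d <= k:
        return "close"
    if d > k * k:
        return "far"
    return "either"  # inside the gap, any answer is acceptable
```

A gap algorithm only has to be correct on inputs labeled "close" or "far" here, which is what makes sublinear running time possible.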

preprint 2020 arXiv

Nearest Neighbor Search for Hyperbolic Embeddings

Embedding into hyperbolic space is emerging as an effective representation technique for datasets that exhibit hierarchical structure. This development motivates the need for algorithms that are able to effectively extract knowledge and insights from datapoints embedded in negatively curved spaces. We focus on the problem of nearest neighbor search, a fundamental problem in data analysis. We present efficient algorithmic solutions that build upon established methods for nearest neighbor search in Euclidean space, allowing for easy adoption and integration with existing systems. We prove theoretical guarantees for our techniques and our experiments demonstrate the effectiveness of our approach on real datasets over competing algorithms.
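
To fix ideas, the sketch below computes distances in the Poincaré ball, one common model of hyperbolic space, and answers a nearest neighbor query by linear scan. The abstract does not say which model the paper uses, so the Poincaré ball is an assumption here, as are the function names; the linear scan is a correctness baseline rather than one of the efficient methods the paper describes.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball (Poincare model)."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    nu = sum(a * a for a in u)
    nv = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - nu) * (1.0 - nv)))

def nearest_neighbor(points, q):
    """Exact nearest neighbour by linear scan; a baseline, not an efficient index."""
    return min(points, key=lambda p: poincare_distance(p, q))

points = [(0.0, 0.0), (0.5, 0.0), (0.0, -0.9)]
d = poincare_distance((0.0, 0.0), (0.5, 0.0))   # equals 2 * atanh(0.5) = ln 3
nn = nearest_neighbor(points, (0.45, 0.05))
```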

preprint 2020 arXiv

New lower bounds for Massively Parallel Computation from query complexity

Roughgarden, Vassilvitskii, and Wang (JACM 18) recently introduced a novel framework for proving lower bounds for Massively Parallel Computation using techniques from boolean function complexity. We extend their framework in two different ways, to capture two common features of Massively Parallel Computation. (1) Adaptivity, where machines can write to and adaptively read from shared memory throughout the execution of the computation. Recent work of Behnezhad et al. (SPAA 19) showed that adaptivity enables significantly improved round complexities for a number of central graph problems. (2) Promise problems, where the algorithm only has to succeed on certain inputs. These inputs may have special structure that is of particular interest, or they may be representative of hard instances of the overall problem. Using this extended framework, we give the first unconditional lower bounds on the complexity of distinguishing whether an input graph is a cycle of length $n$ or two cycles of length $n/2$. This promise problem, 1v2-Cycle, has emerged as a central problem in the study of Massively Parallel Computation. We prove that any adaptive algorithm for the 1v2-Cycle problem with I/O capacity $O(n^{\varepsilon})$ per machine requires $Ω(1/\varepsilon)$ rounds, matching a recent upper bound of Behnezhad et al. In addition to strengthening the connections between Massively Parallel Computation and boolean function complexity, we also develop new machinery to reason about the latter. At the heart of our proofs are optimal lower bounds on the query complexity and approximate certificate complexity of the 1v2-Cycle problem.
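
The 1v2-Cycle promise problem itself is easy to state sequentially: with the whole graph in one machine's memory, a single traversal decides it. The sketch below is only that sequential reference (function names are assumptions for the example); the lower bound in the abstract concerns the setting where each machine can read just $O(n^{\varepsilon})$ of the input per round.

```python
def count_cycles(n, edges):
    """Number of connected components of a graph in which every vertex has degree 2."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = set()
    components = 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components

def cycle(vertices):
    """Edges of the cycle that visits `vertices` in order."""
    return [(vertices[i], vertices[(i + 1) % len(vertices)])
            for i in range(len(vertices))]

one = count_cycles(8, cycle(list(range(8))))
two = count_cycles(8, cycle([0, 1, 2, 3]) + cycle([4, 5, 6, 7]))
```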

preprint 2020 arXiv

Storyboard: Optimizing Precomputed Summaries for Aggregation

An emerging class of data systems partition their data and precompute approximate summaries (i.e., sketches and samples) for each segment to reduce query costs. They can then aggregate and combine the segment summaries to estimate results without scanning the raw data. However, given limited storage space each summary introduces approximation errors that affect query accuracy. For instance, systems that use existing mergeable summaries cannot reduce query error below the error of an individual precomputed summary. We introduce Storyboard, a query system that optimizes item frequency and quantile summaries for accuracy when aggregating over multiple segments. Compared to conventional mergeable summaries, Storyboard leverages additional memory available for summary construction and aggregation to derive a more precise combined result. This reduces error by up to 25x over interval aggregations and 4.4x over data cube aggregations on industrial datasets compared to standard summarization methods, with provable worst-case error guarantees.

preprint 2020 arXiv

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that, given $n$ i.i.d. samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions in a broader range of error parameter. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these is an $\exp(O(k \log(N/k)))$ approximation to the permanent of $N \times N$ matrices with non-negative rank at most $k$, improving upon the previous known bounds of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e. with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.
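
The profile of a sample, as defined above, is straightforward to compute. The sketch below represents the multiset of element frequencies as a dictionary from frequency to the number of distinct elements with that frequency; the representation and the function name are assumptions for the example. Relabeling the elements leaves the profile unchanged, which is why it is the natural sufficient statistic for symmetric properties.

```python
from collections import Counter

def profile(samples):
    """Multiset of element frequencies, returned as {frequency: number of elements}."""
    frequencies = Counter(samples)              # element -> how often it occurs
    return dict(Counter(frequencies.values()))  # frequency -> how many elements have it

phi = profile("abracadabra")       # a: 5, b: 2, r: 2, c: 1, d: 1
same = profile("xyzxwxvxyzx")      # the same sample with relabeled symbols
```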

preprint 2016 arXiv

Approximate Hierarchical Clustering via Sparsest Cut and Spreading Metrics

Dasgupta recently introduced a cost function for the hierarchical clustering of a set of points given pairwise similarities between them. He showed that this function is NP-hard to optimize, but a top-down recursive partitioning heuristic based on an $α_n$-approximation algorithm for uniform sparsest cut gives an approximation of $O(α_n \log n)$ (the current best algorithm has $α_n = O(\sqrt{\log n})$). We show that the aforementioned sparsest cut heuristic in fact obtains an $O(α_n)$-approximation for hierarchical clustering. The algorithm also applies to a generalized cost function studied by Dasgupta. Moreover, we obtain a strong inapproximability result, showing that the hierarchical clustering objective is hard to approximate to within any constant factor assuming the Small-Set Expansion (SSE) Hypothesis. Finally, we discuss approximation algorithms based on convex relaxations. We present a spreading metric SDP relaxation for the problem and show that it has integrality gap at most $O(\sqrt{\log n})$. The advantage of the SDP relative to the sparsest cut heuristic is that it provides an explicit lower bound on the optimal solution and could potentially yield an even better approximation for hierarchical clustering. In fact our analysis of this SDP served as the inspiration for our improved analysis of the sparsest cut heuristic. We also show that a spreading metric LP relaxation gives an $O(\log n)$-approximation.
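
Dasgupta's cost of a hierarchy charges each pair of points its similarity times the number of leaves under the pair's lowest common ancestor, so similar points should be separated as low in the tree as possible. A minimal sketch for binary trees written as nested pairs follows; the tree encoding, the weight dictionary keyed by sorted pairs, and the function names are assumptions for the example.

```python
def leaves(tree):
    """Leaves of a binary tree given as nested pairs, e.g. ((0, 1), 2)."""
    if not isinstance(tree, tuple):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def dasgupta_cost(tree, w):
    """Sum over pairs {i, j} of w[i, j] times the number of leaves under lca(i, j)."""
    if not isinstance(tree, tuple):
        return 0.0
    left, right = tree
    ll, rl = leaves(left), leaves(right)
    size = len(ll) + len(rl)
    # Pairs split at this node have this node as their lowest common ancestor.
    split = sum(w.get((min(i, j), max(i, j)), 0.0) for i in ll for j in rl)
    return size * split + dasgupta_cost(left, w) + dasgupta_cost(right, w)

w = {(0, 1): 1.0, (1, 2): 0.1}           # points 0 and 1 are strongly similar
good = dasgupta_cost(((0, 1), 2), w)     # merges 0 and 1 first
bad = dasgupta_cost(((0, 2), 1), w)      # separates 0 and 1 at the root
```

Here the tree that keeps the similar pair together until the last split has cost 2.3, against 3.3 for the alternative.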

preprint 2016 arXiv

Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction

We consider a crowdsourcing model in which $n$ workers are asked to rate the quality of $n$ items previously generated by other workers. An unknown set of $αn$ workers generate reliable ratings, while the remaining workers may behave arbitrarily and possibly adversarially. The manager of the experiment can also manually evaluate the quality of a small number of items, and wishes to curate together almost all of the high-quality items with at most an $ε$ fraction of low-quality items. Perhaps surprisingly, we show that this is possible with an amount of work required of the manager, and each worker, that does not scale with $n$: the dataset can be curated with $\tilde{O}\Big(\frac{1}{βα^3ε^4}\Big)$ ratings per worker, and $\tilde{O}\Big(\frac{1}{βε^2}\Big)$ ratings by the manager, where $β$ is the fraction of high-quality items. Our results extend to the more general setting of peer prediction, including peer grading in online classrooms.

preprint 2015 arXiv

Label optimal regret bounds for online local learning

We resolve an open question from (Christiano, 2014b) posed in COLT'14 regarding the optimal dependency of the regret achievable for online local learning on the size of the label set. In this framework the algorithm is shown a pair of items at each step, chosen from a set of $n$ items. The learner then predicts a label for each item, from a label set of size $L$ and receives a real valued payoff. This is a natural framework which captures many interesting scenarios such as collaborative filtering, online gambling, and online max cut among others. (Christiano, 2014a) designed an efficient online learning algorithm for this problem achieving a regret of $O(\sqrt{nL^3T})$, where $T$ is the number of rounds. Information theoretically, one can achieve a regret of $O(\sqrt{n \log L T})$. One of the main open questions left in this framework concerns closing the above gap. In this work, we provide a complete answer to the question above via two main results. We show, via a tighter analysis, that the semi-definite programming based algorithm of (Christiano, 2014a), in fact achieves a regret of $O(\sqrt{nLT})$. Second, we show a matching computational lower bound. Namely, we show that a polynomial time algorithm for online local learning with lower regret would imply a polynomial time algorithm for the planted clique problem which is widely believed to be hard. We prove a similar hardness result under a related conjecture concerning planted dense subgraphs that we put forth. Unlike planted clique, the planted dense subgraph problem does not have any known quasi-polynomial time algorithms. Computational lower bounds for online learning are relatively rare, and we hope that the ideas developed in this work will lead to lower bounds for other online learning scenarios as well.

preprint 2015 arXiv

Relax, no need to round: integrality of clustering formulations

We study exact recovery conditions for convex relaxations of point cloud clustering problems, focusing on two of the most common optimization problems for unsupervised clustering: $k$-means and $k$-median clustering. Motivations for focusing on convex relaxations are: (a) they come with a certificate of optimality, and (b) they are generic tools which are relatively parameter-free, not tailored to specific assumptions over the input. More precisely, we consider the distributional setting where there are $k$ clusters in $\mathbb{R}^m$ and data from each cluster consists of $n$ points sampled from a symmetric distribution within a ball of unit radius. We ask: what is the minimal separation distance between cluster centers needed for convex relaxations to exactly recover these $k$ clusters as the optimal integral solution? For the $k$-median linear programming relaxation we show a tight bound: exact recovery is obtained given arbitrarily small pairwise separation $ε> 0$ between the balls. In other words, the pairwise center separation is $Δ> 2+ε$. Under the same distributional model, the $k$-means LP relaxation fails to recover such clusters at separation as large as $Δ= 4$. Yet, if we enforce PSD constraints on the $k$-means LP, we get exact cluster recovery at center separation $Δ> 2\sqrt2(1+\sqrt{1/m})$. In contrast, common heuristics such as Lloyd's algorithm (a.k.a. the $k$-means algorithm) can fail to recover clusters in this setting; even with arbitrarily large cluster separation, k-means++ with overseeding by any constant factor fails with high probability at exact cluster recovery. To complement the theoretical analysis, we provide an experimental study of the recovery guarantees for these various methods, and discuss several open problems which these experiments suggest.
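
The remark that Lloyd's algorithm can fail even on well-separated clusters can be seen on a toy instance. The sketch below is a plain implementation of Lloyd's alternating steps; the four-point dataset and the adversarial initialization are assumptions chosen for the example and are not drawn from the distributional model of the paper.

```python
def lloyd(points, centers, iterations=10):
    """Lloyd's heuristic: alternate nearest-center assignment and centroid update."""
    centers = [tuple(c) for c in centers]
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for x in points:
            dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
            groups[dists.index(min(dists))].append(x)
        # Move each center to the centroid of its group; keep it if the group is empty.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups

# Two tight clusters whose centroids are at distance 10 from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good, _ = lloyd(points, [(0.0, 0.0), (10.0, 0.0)])
stuck, _ = lloyd(points, [(0.0, 0.0), (0.0, 1.0)])   # both seeds in the same cluster
```

With both seeds in one cluster, the iteration settles at centers (5, 0) and (5, 1), each serving one point from either cluster, and never recovers the true partition.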

preprint 2015 arXiv

The Hardness of Approximation of Euclidean k-means

The Euclidean $k$-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning and the computational geometry communities. In this problem, we are given a set of $n$ points in Euclidean space $\mathbb{R}^d$, and the goal is to choose $k$ centers in $\mathbb{R}^d$ so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general $k$ and a $(1+ε)$-approximation which runs in time $\mathrm{poly}(n)\, 2^{O(k/ε)}$. At the other extreme, the only known computational complexity result for this problem is NP-hardness [ADHP'09]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in $\mathbb{R}^d$ can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all $k,d$. In this paper we provide the first hardness of approximation for the Euclidean $k$-means problem. Concretely, we show that there exists a constant $ε> 0$ such that it is NP-hard to approximate the $k$-means objective to within a factor of $(1+ε)$. We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest number of vertices which are incident on all the edges. Additionally, we give a proof that the current best hardness results for vertex cover can be carried over to triangle-free graphs. To show this we transform $G$, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph $H$, and showing that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph using a spectral analysis, which might be of independent interest.

preprint 2014 arXiv

Online Bipartite Matching with Decomposable Weights

We study a weighted online bipartite matching problem: $G(V_1, V_2, E)$ is a weighted bipartite graph where $V_1$ is known beforehand and the vertices of $V_2$ arrive online. The goal is to match vertices of $V_2$ as they arrive to vertices in $V_1$, so as to maximize the sum of weights of edges in the matching. If assignments to $V_1$ cannot be changed, no bounded competitive ratio is achievable. We study the weighted online matching problem with free disposal, where vertices in $V_1$ can be assigned multiple times, but only get credit for the maximum weight edge assigned to them over the course of the algorithm. For this problem, the greedy algorithm is $0.5$-competitive and determining whether a better competitive ratio is achievable is a well-known open problem. We identify an interesting special case where the edge weights are decomposable as the product of two factors, one corresponding to each end point of the edge. This is analogous to the well-studied related machines model in the scheduling literature, although the objective functions are different. For this case of decomposable edge weights, we design a $0.5664$-competitive randomized algorithm in complete bipartite graphs. We show that such instances with decomposable weights are non-trivial by establishing upper bounds of $0.618$ for deterministic and $0.8$ for randomized algorithms. A tight competitive ratio of $1-1/e \approx 0.632$ was known previously for both the 0-1 case as well as the case where edge weights depend on the offline vertices only, but for these cases, reassignments cannot change the quality of the solution. Beating $0.5$ for weighted matching where reassignments are necessary has been a significant challenge. We thus give the first online algorithm with competitive ratio strictly better than $0.5$ for a non-trivial case of weighted matching with free disposal.
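
The greedy baseline under free disposal can be sketched as follows: each arriving vertex goes to the offline vertex where it increases the credited weight the most. This is one natural reading of the greedy rule mentioned in the abstract, not the $0.5664$-competitive algorithm of the paper; the input format (one weight dictionary per arrival) and the names are assumptions for the example.

```python
def greedy_free_disposal(offline, arrivals):
    """Assign each arriving vertex to the offline vertex with the largest marginal gain.

    `arrivals` is a list of dicts {offline_vertex: edge_weight}. With free disposal,
    an offline vertex is credited only with the heaviest edge assigned to it so far.
    """
    best = {u: 0.0 for u in offline}      # heaviest edge currently credited to u
    for weights in arrivals:
        gains = {u: w - best[u] for u, w in weights.items()}
        if not gains:
            continue
        u = max(gains, key=gains.get)
        if gains[u] > 0:
            best[u] = weights[u]          # the edge previously held by u is disposed of
    return sum(best.values()), best

total, credited = greedy_free_disposal(
    ["a", "b"],
    [{"a": 2.0, "b": 1.0}, {"a": 3.0}],
)
```

On this input greedy collects weight 3, while the offline optimum assigns the first arrival to `b` and the second to `a` for a total of 4.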

preprint 2014 arXiv

Smoothed Analysis of Tensor Decompositions

Low rank tensor decompositions are a powerful tool for learning generative models, and uniqueness results give them a significant advantage over matrix decomposition methods. However, tensors pose significant algorithmic challenges and tensor analogs of much of the matrix algebra toolkit are unlikely to exist because of hardness results. Efficient decomposition in the overcomplete case (where rank exceeds dimension) is particularly challenging. We introduce a smoothed analysis model for studying these questions and develop an efficient algorithm for tensor decomposition in the highly overcomplete case (rank polynomial in the dimension). In this setting, we show that our algorithm is robust to inverse polynomial error -- a crucial property for applications in learning since we are only allowed a polynomial number of samples. While algorithms are known for exact tensor decomposition in some overcomplete settings, our main contribution is in analyzing their stability in the framework of smoothed analysis. Our main technical contribution is to show that tensor products of perturbed vectors are linearly independent in a robust sense (i.e. the associated matrix has singular values that are at least an inverse polynomial). This key result paves the way for applying tensor methods to learning problems in the smoothed setting. In particular, we use it to obtain results for learning multi-view models and mixtures of axis-aligned Gaussians where there are many more "components" than dimensions. The assumption here is that the model is not adversarially chosen, formalized by a perturbation of model parameters. We believe this is an appealing way to analyze realistic instances of learning problems, since this framework allows us to overcome many of the usual limitations of using tensor methods.

preprint 2013 arXiv

Multireference Alignment using Semidefinite Programming

The multireference alignment problem consists of estimating a signal from multiple noisy shifted observations. Inspired by existing Unique-Games approximation algorithms, we provide a semidefinite program (SDP) based relaxation which approximates the maximum likelihood estimator (MLE) for the multireference alignment problem. Although we show that the MLE problem is Unique-Games hard to approximate within any constant, we observe that our poly-time approximation algorithm for the MLE appears to perform quite well in typical instances, outperforming existing methods. In an attempt to explain this behavior we provide stability guarantees for our SDP under a random noise model on the observations. This case is more challenging to analyze than traditional semi-random instances of Unique-Games: the noise model is on vertices of a graph and translates into dependent noise on the edges. Interestingly, we show that if certain positivity constraints in the SDP are dropped, its solution becomes equivalent to performing phase correlation, a popular method used for pairwise alignment in imaging applications. Finally, we show how symmetry reduction techniques from matrix representation theory can simplify the analysis and computation of the SDP, greatly decreasing its computational cost.
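
For the pairwise case, aligning two cyclically shifted observations can be sketched with plain circular cross-correlation, a simpler relative of the phase correlation method mentioned above. This is not the SDP relaxation of the paper; the noiseless example signal and the function names are assumptions for the illustration.

```python
def circular_shift(x, s):
    """Shift x cyclically to the right by s positions."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]

def best_shift(x, y):
    """Shift s that maximizes the circular cross-correlation of shift(x, s) with y."""
    def corr(s):
        return sum(a * b for a, b in zip(circular_shift(x, s), y))
    return max(range(len(x)), key=corr)

x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
y = circular_shift(x, 2)      # a noiseless shifted observation of x
s = best_shift(x, y)
```

With noise on each observation, pairwise estimates like this can be mutually inconsistent, which is the motivation for estimating all shifts jointly.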

preprint 2013 arXiv

Uniqueness of Tensor Decompositions with Applications to Polynomial Identifiability

We give a robust version of the celebrated result of Kruskal on the uniqueness of tensor decompositions: we prove that given a tensor whose decomposition satisfies a robust form of Kruskal's rank condition, it is possible to approximately recover the decomposition if the tensor is known up to a sufficiently small (inverse polynomial) error. Kruskal's theorem has found many applications in proving the identifiability of parameters for various latent variable models and mixture models such as Hidden Markov models, topic models etc. Our robust version immediately implies identifiability using only polynomially many samples in many of these settings. This polynomial identifiability is an essential first step towards efficient learning algorithms for these models. Recently, algorithms based on tensor decompositions have been used to estimate the parameters of various hidden variable models efficiently in special cases as long as they satisfy certain "non-degeneracy" properties. Our methods give a way to go beyond this non-degeneracy barrier, and establish polynomial identifiability of the parameters under much milder conditions. Given the importance of Kruskal's theorem in the tensor literature, we expect that this robust version will have several applications beyond the settings we explore in this work.

preprint 2011 arXiv

On Quadratic Programming with a Ratio Objective

Quadratic Programming (QP) is the well-studied problem of maximizing over $\{-1,1\}$ values the quadratic form $\sum_{i \ne j} a_{ij} x_i x_j$. QP captures many known combinatorial optimization problems, and assuming the unique games conjecture, semidefinite programming techniques give optimal approximation algorithms. We extend this body of work by initiating the study of Quadratic Programming problems where the variables take values in the domain $\{-1,0,1\}$. The specific problems we study are QP-Ratio: $\max_{x \in \{-1,0,1\}^n} \frac{\sum_{i \ne j} a_{ij} x_i x_j}{\sum_i x_i^2}$, and Normalized QP-Ratio: $\max_{x \in \{-1,0,1\}^n} \frac{\sum_{i \ne j} a_{ij} x_i x_j}{\sum_i d_i x_i^2}$, where $d_i = \sum_j |a_{ij}|$. We consider an SDP relaxation obtained by adding constraints to the natural eigenvalue (or SDP) relaxation for this problem. Using this, we obtain an $\tilde{O}(n^{1/3})$-approximation algorithm for QP-Ratio. We also obtain an $\tilde{O}(n^{1/4})$ approximation for bipartite graphs, and better algorithms for special cases. As with other problems with ratio objectives (e.g. uniform sparsest cut), it seems difficult to obtain inapproximability results based on $\mathrm{P} \neq \mathrm{NP}$. We give two results that indicate that QP-Ratio is hard to approximate to within any constant factor. We also give a natural distribution on instances of QP-Ratio for which an $n^{ε}$ approximation (for $ε$ roughly $1/10$) seems out of reach of current techniques.
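
On tiny instances the QP-Ratio objective can be evaluated exhaustively, which is a convenient way to check the definition. The sketch below enumerates all nonzero vectors in $\{-1,0,1\}^n$ (excluding the zero vector, whose ratio is undefined, is an assumption of the example); it takes exponential time and is unrelated to the SDP-based algorithms of the paper. The matrix and function names are illustrative.

```python
from itertools import product

def qp_ratio(a, x):
    """Objective sum_{i != j} a[i][j] x_i x_j / sum_i x_i^2 for a nonzero x."""
    n = len(x)
    num = sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n) if i != j)
    return num / sum(v * v for v in x)

def brute_force_qp_ratio(a):
    """Exhaustive maximum over nonzero x in {-1, 0, 1}^n; exponential time."""
    n = len(a)
    candidates = (x for x in product((-1, 0, 1), repeat=n) if any(x))
    return max((qp_ratio(a, x), x) for x in candidates)

a = [[0, 1, -1],
     [1, 0, -1],
     [-1, -1, 0]]
value, argmax = brute_force_qp_ratio(a)
```

For this matrix the maximum ratio is 2, attained by giving the first two variables the same sign and the third the opposite sign.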

preprint 2011 arXiv

Polynomial integrality gaps for strong SDP relaxations of Densest k-subgraph

The densest $k$-subgraph (DkS) problem (i.e. find a size $k$ subgraph with maximum number of edges) is one of the notorious problems in approximation algorithms. There is a significant gap between known upper and lower bounds for DkS: the current best algorithm gives an $\approx O(n^{1/4})$ approximation, while even showing a small constant factor hardness requires significantly stronger assumptions than $\mathrm{P} \neq \mathrm{NP}$. In addition to interest in designing better algorithms, a number of recent results have exploited the conjectured hardness of densest $k$-subgraph and its variants. Thus, understanding the approximability of DkS is an important challenge. In this work, we give evidence for the hardness of approximating DkS within polynomial factors. Specifically, we expose the limitations of strong semidefinite programs from SDP hierarchies in solving densest $k$-subgraph. Our results include: (1) A lower bound of $Ω(n^{1/4}/\log^3 n)$ on the integrality gap for $Ω(\log n/\log \log n)$ rounds of the Sherali-Adams relaxation for DkS. This also holds for the relaxation obtained from Sherali-Adams with an added SDP constraint. Our gap instances are in fact Erdős-Rényi random graphs. (2) For every $ε > 0$, a lower bound of $n^{2/53-ε}$ on the integrality gap of $n^{Ω(ε)}$ rounds of the Lasserre SDP relaxation for DkS, and an $n^{Ω_ε(1)}$ gap for $n^{1-ε}$ rounds. Our construction proceeds via a reduction from random instances of a certain Max-CSP over large domains. In the absence of inapproximability results for DkS, our results show that even the most powerful SDPs are unable to beat a factor of $n^{Ω(1)}$, and in fact even improving the best known $n^{1/4}$ factor is a barrier for current techniques.

preprint 2010 arXiv

Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph

In the Densest $k$-Subgraph problem, given a graph $G$ and a parameter $k$, one needs to find a subgraph of $G$ induced on $k$ vertices that contains the largest number of edges. There is a significant gap between the best known upper and lower bounds for this problem. It is NP-hard, and does not have a PTAS unless NP has subexponential time algorithms. On the other hand, the current best known algorithm of Feige, Kortsarz and Peleg, gives an approximation ratio of $n^{1/3-ε}$ for some specific $ε > 0$ (estimated at around $1/60$). We present an algorithm that for every $ε > 0$ approximates the Densest $k$-Subgraph problem within a ratio of $n^{1/4+ε}$ in time $n^{O(1/ε)}$. In particular, our algorithm achieves an approximation ratio of $O(n^{1/4})$ in time $n^{O(\log n)}$. Our algorithm is inspired by studying an average-case version of the problem where the goal is to distinguish random graphs from graphs with planted dense subgraphs. The approximation ratio we achieve for the general case matches the distinguishing ratio we obtain for this planted problem. At a high level, our algorithms involve cleverly counting appropriately defined trees of constant size in $G$, and using these counts to identify the vertices of the dense subgraph. Our algorithm is based on the following principle. We say that a graph $G(V,E)$ has log-density $α$ if its average degree is $Θ(|V|^{α})$. The algorithmic core of our result is a family of algorithms that output $k$-subgraphs of nontrivial density whenever the log-density of the densest $k$-subgraph is larger than the log-density of the host graph.
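
The two quantities in this abstract, log-density and the densest $k$-subgraph, can be computed directly on a toy graph. In the sketch below the log-density of a single finite graph is taken to be $\log(\text{average degree})/\log|V|$, a finite-size stand-in for the asymptotic definition, and the densest $k$-subgraph is found by exhaustive search. The planted instance and the function names are assumptions for the example; none of this is the tree-counting algorithm of the paper.

```python
import math
from itertools import combinations

def log_density(n, edges):
    """alpha such that the average degree equals n ** alpha."""
    avg_degree = 2 * len(edges) / n
    return math.log(avg_degree) / math.log(n)

def densest_k_subgraph(n, edges, k):
    """Exhaustive search over all k-subsets; only feasible for tiny graphs."""
    edge_set = {frozenset(e) for e in edges}
    def induced(s):
        return sum(1 for pair in combinations(s, 2) if frozenset(pair) in edge_set)
    return max(combinations(range(n), k), key=induced)

# A 16-cycle (log-density 1/4) with a planted 4-clique on vertices 0..3.
ring = [(i, (i + 1) % 16) for i in range(16)]
clique = list(combinations(range(4), 2))
alpha_ring = log_density(16, ring)
best = densest_k_subgraph(16, set(ring) | set(clique), 4)
```

The planted clique has average degree 3 on 4 vertices, so its log-density (about 0.79) exceeds that of the host cycle, which is the regime the abstract describes.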

preprint 2010 arXiv

Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)

These are the lecture notes for the DIMACS Tutorial "Limits of Approximation Algorithms: PCPs and Unique Games" held at the DIMACS Center, CoRE Building, Rutgers University on 20-21 July, 2009. This tutorial was jointly sponsored by the DIMACS Special Focus on Hardness of Approximation, the DIMACS Special Focus on Algorithmic Foundations of the Internet, and the Center for Computational Intractability with support from the National Security Agency and the National Science Foundation. The speakers at the tutorial were Matthew Andrews, Sanjeev Arora, Moses Charikar, Prahladh Harsha, Subhash Khot, Dana Moshkovitz and Lisa Zhang. The scribes were Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov, Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard and Gwen Spencer.

preprint 2010 arXiv

Vertex Sparsifiers and Abstract Rounding Algorithms

The notion of vertex sparsification is introduced in \cite{M}, where it was shown that for any graph $G = (V, E)$ and a subset of $k$ terminals $K \subset V$, there is a polynomial time algorithm to construct a graph $H = (K, E_H)$ on just the terminal set so that simultaneously for all cuts $(A, K-A)$, the value of the minimum cut in $G$ separating $A$ from $K -A$ is approximately the same as the value of the corresponding cut in $H$. We give the first super-constant lower bounds for how well a cut-sparsifier $H$ can simultaneously approximate all minimum cuts in $G$. We prove a lower bound of $Ω(\log^{1/4} k)$ -- this is polynomially-related to the known upper bound of $O(\log k/\log \log k)$. This is an exponential improvement on the $Ω(\log \log k)$ bound given in \cite{LM} which in fact was for a stronger vertex sparsification guarantee, and did not apply to cut sparsifiers. Despite this negative result, we show that for many natural problems, we do not need to incur a multiplicative penalty for our reduction. We obtain optimal $O(\log k)$-competitive Steiner oblivious routing schemes, which generalize the results in \cite{R}. We also demonstrate that for a wide range of graph packing problems (which includes maximum concurrent flow, maximum multiflow and multicast routing, among others, as a special case), the integrality gap of the linear program is always at most $O(\log k)$ times the integrality gap restricted to trees. This result helps to explain the ubiquity of the $O(\log k)$ guarantees for such problems. Lastly, we use our ideas to give an efficient construction for vertex-sparsifiers that match the current best existential results -- this was previously open. Our algorithm makes novel use of Earth-mover constraints.
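
The cut-sparsifier condition described above can be checked by brute force on a small instance. The sketch below is an illustration under stated assumptions, not the construction from the paper: for every bipartition of the terminals it computes the minimum cut in G with a plain Edmonds-Karp max-flow and compares it with the corresponding cut in a candidate graph H on the terminals. The names `max_flow` and `terminal_cut_quality`, the symmetric worst-case ratio used as the quality measure, and the star example are all assumptions, and the code assumes every terminal cut has positive value in both graphs.

```python
from collections import deque
from itertools import combinations


def max_flow(n, cap_edges, sources, sinks):
    """Edmonds-Karp max flow on an undirected graph with vertices 0..n-1.

    All `sources` hang off a super source (index n) and all `sinks` feed a
    super sink (index n + 1), so the result is the min cut separating them.
    """
    inf = float("inf")
    size = n + 2
    cap = [[0.0] * size for _ in range(size)]
    for u, v, c in cap_edges:
        cap[u][v] += c
        cap[v][u] += c
    s, t = n, n + 1
    for u in sources:
        cap[s][u] = inf
    for v in sinks:
        cap[v][t] = inf
    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * size
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(size):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow
        # Find the bottleneck capacity, then push it along the path.
        bottleneck, v = inf, t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def terminal_cut_quality(n, g_edges, terminals, h_edges):
    """Worst ratio between H's cut and G's min cut over all terminal bipartitions."""
    worst = 1.0
    for r in range(1, len(terminals)):
        for side in combinations(terminals, r):
            a = set(side)
            rest = [x for x in terminals if x not in a]
            min_cut_g = max_flow(n, g_edges, a, rest)
            cut_h = sum(c for u, v, c in h_edges if (u in a) != (v in a))
            ratio = cut_h / min_cut_g
            worst = max(worst, ratio, 1.0 / ratio)
    return worst


# Example (assumed): a unit-capacity star with center 3 and terminals 0, 1, 2,
# and a candidate sparsifier H that is a triangle with weight 1/2 per edge.
g_edges = [(0, 3, 1.0), (1, 3, 1.0), (2, 3, 1.0)]
h_edges = [(0, 1, 0.5), (1, 2, 0.5), (0, 2, 0.5)]
quality = terminal_cut_quality(4, g_edges, [0, 1, 2], h_edges)
print(quality)
```

In this small example every terminal cut in H matches the minimum cut in G, so the measured ratio is 1. The lower bound in the abstract says that such exact agreement cannot be achieved in general.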

24 published items

A Quasi-Monte Carlo Data Structure for Smooth Kernel Evaluations

Almost 3-Approximate Correlation Clustering in Constant Rounds

Polylogarithmic Sketches for Clustering

Approximation Algorithms for Orthogonal Non-negative Matrix Factorization

A General Framework for Symmetric Property Estimation

A Simple Sublinear Algorithm for Gap Edit Distance

Nearest Neighbor Search for Hyperbolic Embeddings

New lower bounds for Massively Parallel Computation from query complexity

Storyboard: Optimizing Precomputed Summaries for Aggregation

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

Approximate Hierarchical Clustering via Sparsest Cut and Spreading Metrics

Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction

Label optimal regret bounds for online local learning

Relax, no need to round: integrality of clustering formulations

The Hardness of Approximation of Euclidean k-means

Online Bipartite Matching with Decomposable Weights

Smoothed Analysis of Tensor Decompositions

Multireference Alignment using Semidefinite Programming

Uniqueness of Tensor Decompositions with Applications to Polynomial Identifiability

On Quadratic Programming with a Ratio Objective

Polynomial integrality gaps for strong SDP relaxations of Densest k-subgraph

Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph

Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)

Vertex Sparsifiers and Abstract Rounding Algorithms