Source author record

Robert Krauthgamer

Robert Krauthgamer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms; Computational Geometry; Machine Learning; math.CO; math.MG; math.ST; Statistics Theory; Computational Complexity; Discrete Mathematics; eess.SP; Information Theory; math.FA; math.IT; math.OC

Catalog footprint

What is connected

37 works

14 topics

4 close collaborators

preprint, 2022, arXiv

Almost-Smooth Histograms and Sliding-Window Graph Algorithms

We study algorithms for the sliding-window model, an important variant of the data-stream model, in which the goal is to compute some function of a fixed-length suffix of the stream. We extend the smooth-histogram framework of Braverman and Ostrovsky (FOCS 2007) to almost-smooth functions, which includes all subadditive functions. Specifically, we show that if a subadditive function can be $(1+ε)$-approximated in the insertion-only streaming model, then it can be $(2+ε)$-approximated also in the sliding-window model with space complexity larger by factor $O(ε^{-1}\log w)$, where $w$ is the window size. We demonstrate how our framework yields new approximation algorithms with relatively little effort for a variety of problems that do not admit the smooth-histogram technique. For example, in the frequency-vector model, a symmetric norm is subadditive and thus we obtain a sliding-window $(2+ε)$-approximation algorithm for it. Another example is for streaming matrices, where we derive a new sliding-window $(\sqrt{2}+ε)$-approximation algorithm for Schatten $4$-norm. We then consider graph streams and show that many graph problems are subadditive, including maximum submodular matching, minimum vertex-cover, and maximum $k$-cover, thereby deriving sliding-window $O(1)$-approximation algorithms for them almost for free (using known insertion-only algorithms). Finally, we design for every $d\in (1,2]$ an artificial function, based on the maximum-matching size, whose almost-smoothness parameter is exactly $d$.

preprint, 2022, arXiv

Breaking the Cubic Barrier for All-Pairs Max-Flow: Gomory-Hu Tree in Nearly Quadratic Time

In 1961, Gomory and Hu showed that the All-Pairs Max-Flow problem of computing the max-flow between all $n\choose 2$ pairs of vertices in an undirected graph can be solved using only $n-1$ calls to any (single-pair) max-flow algorithm. Even assuming a linear-time max-flow algorithm, this yields a running time of $O(mn)$, which is $O(n^3)$ when $m = Θ(n^2)$. While subsequent work has improved this bound for various special graph classes, no subcubic-time algorithm has been obtained in the last 60 years for general graphs. We break this longstanding barrier by giving an $\tilde{O}(n^{2})$-time algorithm on general, weighted graphs. Combined with a popular complexity assumption, we establish a counter-intuitive separation: all-pairs max-flows are strictly easier to compute than all-pairs shortest-paths. Our algorithm produces a cut-equivalent tree, known as the Gomory-Hu tree, from which the max-flow value for any pair can be retrieved in near-constant time. For unweighted graphs, we refine our techniques further to produce a Gomory-Hu tree in the time of a poly-logarithmic number of calls to any max-flow algorithm. This shows an equivalence between the all-pairs and single-pair max-flow problems, and is optimal up to poly-logarithmic factors. Using the recently announced $m^{1+o(1)}$-time max-flow algorithm (Chen et al., March 2022), our Gomory-Hu tree algorithm for unweighted graphs also runs in $m^{1+o(1)}$-time.
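
To illustrate how a cut-equivalent tree answers queries, the following is a minimal sketch under stated assumptions, not the paper's algorithm. It assumes the tree has already been built and relies on the standard Gomory-Hu property that the max-flow value between two vertices equals the smallest edge weight on their tree path. The function name `min_cut_value` and the 4-vertex `example_tree` are hypothetical, and the query here is a plain tree walk rather than the near-constant-time retrieval mentioned in the abstract.

```python
from collections import defaultdict


def min_cut_value(tree_edges, s, t):
    """Return the smallest edge weight on the unique s-t path of a tree.

    tree_edges is a list of (u, v, weight) triples describing a tree that is
    assumed to be cut-equivalent to some underlying graph.
    """
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Depth-first walk that carries the bottleneck (minimum weight) seen so far.
    stack = [(s, None, float("inf"))]
    while stack:
        node, parent, bottleneck = stack.pop()
        if node == t:
            return bottleneck
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, min(bottleneck, w)))
    raise ValueError("s and t are not connected in the tree")


# Hypothetical tree on vertices a, b, c, d (made up for illustration).
example_tree = [("a", "b", 3), ("b", "c", 5), ("c", "d", 2)]
```

Calling `min_cut_value(example_tree, "a", "d")` walks the path a, b, c, d and returns its lightest edge weight.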

preprint, 2022, arXiv

Distributed Sparse Normal Means Estimation with Sublinear Communication

We consider the problem of sparse normal means estimation in a distributed setting with communication constraints. We assume there are $M$ machines, each holding $d$-dimensional observations of a $K$-sparse vector $μ$ corrupted by additive Gaussian noise. The $M$ machines are connected in a star topology to a fusion center, whose goal is to estimate the vector $μ$ with a low communication budget. Previous works have shown that to achieve the centralized minimax rate for the $\ell_2$ risk, the total communication must be high - at least linear in the dimension $d$. This phenomenon occurs, however, at very weak signals. We show that at signal-to-noise ratios (SNRs) that are sufficiently high - but not enough for recovery by any individual machine - the support of $μ$ can be correctly recovered with significantly less communication. Specifically, we present two algorithms for distributed estimation of a sparse mean vector corrupted by either Gaussian or sub-Gaussian noise. We then prove that above certain SNR thresholds, with high probability, these algorithms recover the correct support with total communication that is sublinear in the dimension $d$. Furthermore, the communication decreases exponentially as a function of signal strength. If in addition $KM\ll \tfrac{d}{\log d}$, then with an additional round of sublinear communication, our algorithms achieve the centralized rate for the $\ell_2$ risk. Finally, we present simulations that illustrate the performance of our algorithms in different parameter regimes.

preprint, 2022, arXiv

Exact Flow Sparsification Requires Unbounded Size

Given a large edge-capacitated network $G$ and a subset of $k$ vertices called terminals, an (exact) flow sparsifier is a small network $G'$ that preserves (exactly) all multicommodity flows that can be routed between the terminals. Flow sparsifiers were introduced by Leighton and Moitra [STOC 2010], and have been studied and used in many algorithmic contexts. A fundamental question that remained open for over a decade, asks whether every $k$-terminal network admits an exact flow sparsifier whose size is bounded by some function $f(k)$ (regardless of the size of $G$ or its capacities). We resolve this question in the negative by proving that there exist $6$-terminal networks $G$ whose flow sparsifiers $G'$ must have arbitrarily large size. This unboundedness is perhaps surprising, since the analogous sparsification that preserves all terminal cuts (called exact cut sparsifier or mimicking network) admits sparsifiers of size $f_0(k)\leq 2^{2^k}$ [Hagerup, Katajainen, Nishimura, and Ragde, JCSS 1998]. We prove our results by analyzing the set of all feasible demands in the network, known as the demand polytope. We identify an invariant of this polytope, essentially the slope of certain facets, that can be made arbitrarily large even for $k=6$, and implies an explicit lower bound on the size of the network. We further use this technique to answer, again in the negative, an open question of Seymour [JCTB 2015] regarding flow-sparsification that uses only contractions and preserves the infeasibility of one demand vector.

preprint, 2022, arXiv

Faster Algorithms for Orienteering and $k$-TSP

We consider the rooted orienteering problem in Euclidean space: Given $n$ points $P$ in $\mathbb R^d$, a root point $s\in P$ and a budget $\mathcal B>0$, find a path that starts from $s$, has total length at most $\mathcal B$, and visits as many points of $P$ as possible. This problem is known to be NP-hard, hence we study $(1-δ)$-approximation algorithms. The previous Polynomial-Time Approximation Scheme (PTAS) for this problem, due to Chen and Har-Peled (2008), runs in time $n^{O(d\sqrt{d}/δ)}(\log n)^{(d/δ)^{O(d)}}$, and improving on this time bound was left as an open problem. Our main contribution is a PTAS with a significantly improved time complexity of $n^{O(1/δ)}(\log n)^{(d/δ)^{O(d)}}$. A known technique for approximating the orienteering problem is to reduce it to solving $1/δ$ correlated instances of rooted $k$-TSP (a $k$-TSP tour is one that visits at least $k$ points). However, the $k$-TSP tours in this reduction must achieve a certain excess guarantee (namely, their length can surpass the optimum length only in proportion to a parameter of the optimum called excess) that is stronger than the usual $(1+δ)$-approximation. Our main technical contribution is to improve the running time of these $k$-TSP variants, particularly in its dependence on the dimension $d$. Indeed, our running time is polynomial even for a moderately large dimension, roughly up to $d=O(\log\log n)$ instead of $d=O(1)$.

preprint, 2022, arXiv

Near-Linear $\varepsilon$-Emulators for Planar Graphs

We study vertex sparsification for distances, in the setting of planar graphs with distortion: Given a planar graph $G$ (with edge weights) and a subset of $k$ terminal vertices, the goal is to construct an $\varepsilon$-emulator, which is a small planar graph $G'$ that contains the terminals and preserves the distances between the terminals up to factor $1+\varepsilon$. We construct the first $\varepsilon$-emulators for planar graphs of near-linear size $\tilde O(k/\varepsilon^{O(1)})$. In terms of $k$, this is a dramatic improvement over the previous quadratic upper bound of Cheung, Goranci and Henzinger, and breaks below known quadratic lower bounds for exact emulators (the case when $\varepsilon=0$). Moreover, our emulators can be computed in (near-)linear time, which leads to fast $(1+\varepsilon)$-approximation algorithms for basic optimization problems on planar graphs, including multiple-source shortest paths, minimum $(s,t)$-cut, graph diameter, and dynamic distance oracle.

preprint, 2022, arXiv

Optimal Vertex-Cut Sparsification of Quasi-Bipartite Graphs

In vertex-cut sparsification, given a graph $G=(V,E)$ with a terminal set $T\subseteq V$, we wish to construct a graph $G'=(V',E')$ with $T\subseteq V'$, such that for every two sets of terminals $A,B\subseteq T$, the size of a minimum $(A,B)$-vertex-cut in $G'$ is the same as in $G$. In the most basic setting, $G$ is unweighted and undirected, and we wish to bound the size of $G'$ by a function of $k=|T|$. Kratsch and Wahlström [JACM 2020] proved that every graph $G$ (possibly directed), admits a vertex-cut sparsifier $G'$ with $O(k^3)$ vertices, which can in fact be constructed in randomized polynomial time. We study (possibly directed) graphs $G$ that are quasi-bipartite, i.e., every edge has at least one endpoint in $T$, and prove that they admit a vertex-cut sparsifier with $O(k^2)$ edges and vertices, which can in fact be constructed in deterministic polynomial time. In fact, this bound naturally extends to all graphs with a small separator into bounded-size sets. Finally, we prove information-theoretically a nearly-matching lower bound, i.e., that $\tilde{Ω}(k^2)$ edges are required to sparsify quasi-bipartite undirected graphs.

preprint, 2020, arXiv

Coresets for Clustering in Excluded-minor Graphs and Beyond

Coresets are modern data-reduction tools that are widely used in data analysis to improve efficiency in terms of running time, space and communication complexity. Our main result is a fast algorithm to construct a small coreset for k-Median in (the shortest-path metric of) an excluded-minor graph. Specifically, we give the first coreset of size that depends only on $k$, $ε$ and the excluded-minor size, and our running time is quasi-linear (in the size of the input graph). The main innovation in our new algorithm is that it is iterative; it first reduces the $n$ input points to roughly $O(\log n)$ reweighted points, then to $O(\log\log n)$, and so forth until the size is independent of $n$. Each step in this iterative size reduction is based on the importance sampling framework of Feldman and Langberg (STOC 2011), with a crucial adaptation that reduces the number of \emph{distinct points}, by employing a terminal embedding (where low distortion is guaranteed only for the distance from every terminal to all other points). Our terminal embedding is technically involved and relies on shortest-path separators, a standard tool in planar and excluded-minor graphs. Furthermore, our new algorithm is applicable also in Euclidean metrics, by simply using a recent terminal embedding result of Narayanan and Nelson (STOC 2019), which extends the Johnson-Lindenstrauss Lemma. We thus obtain an efficient coreset construction in high-dimensional Euclidean spaces, thereby matching and simplifying state-of-the-art results (Sohler and Woodruff, FOCS 2018; Huang and Vishnoi, STOC 2020). In addition, we also employ terminal embedding with additive distortion to obtain small coresets in graphs with bounded highway dimension, and use applications of our coresets to obtain improved approximation schemes, e.g., an improved PTAS for planar k-Median via a new centroid set.

preprint, 2020, arXiv

Cut-Equivalent Trees are Optimal for Min-Cut Queries

Min-Cut queries are fundamental: Preprocess an undirected edge-weighted graph, to quickly report a minimum-weight cut that separates a query pair of nodes $s,t$. The best data structure known for this problem simply builds a cut-equivalent tree, discovered 60 years ago by Gomory and Hu, who also showed how to construct it using $n-1$ minimum $st$-cut computations. Using state-of-the-art algorithms for minimum $st$-cut (Lee and Sidford, FOCS 2014) arXiv:1312.6713, one can construct the tree in time $\tilde{O}(mn^{3/2})$, which is also the preprocessing time of the data structure. (Throughout, we focus on polynomially-bounded edge weights, noting that faster algorithms are known for small/unit edge weights.) Our main result shows the following equivalence: Cut-equivalent trees can be constructed in near-linear time if and only if there is a data structure for Min-Cut queries with near-linear preprocessing time and polylogarithmic (amortized) query time, and even if the queries are restricted to a fixed source. That is, equivalent trees are an essentially optimal solution for Min-Cut queries. This equivalence holds even for every minor-closed family of graphs, such as bounded-treewidth graphs, for which a two-decade old data structure (Arikati et al., J. Algorithms 1998) implies the first near-linear time construction of cut-equivalent trees. Moreover, unlike all previous techniques for constructing cut-equivalent trees, ours is robust to relying on approximation algorithms. In particular, using the almost-linear time algorithm for $(1+ε)$-approximate minimum $st$-cut (Kelner et al., SODA 2014), we can construct a $(1+ε)$-approximate flow-equivalent tree (which is a slightly weaker notion) in time $n^{2+o(1)}$. This leads to the first $(1+ε)$-approximation for All-Pairs Max-Flow that runs in time $n^{2+o(1)}$, and matches the output size almost-optimally.

preprint, 2020, arXiv

Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension

Spectral functions of large matrices contain important structural information about the underlying data, and are thus becoming increasingly important. Many times, large matrices representing real-world data are \emph{sparse} or \emph{doubly sparse} (i.e., sparse in both rows and columns), and are accessed as a \emph{stream} of updates, typically organized in \emph{row-order}. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is \emph{independent of the matrix dimension}, assuming the matrix is doubly-sparse and presented in row-order. Our algorithms approximate the Schatten $p$-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index. We validate these theoretical performance bounds by numerical experiments on real-world matrices representing social networks. We further prove that multiple passes are unavoidable in this setting, and show extensions of our primary technique, including a trade-off between space requirements and number of passes.

preprint, 2020, arXiv

Tight Recovery Guarantees for Orthogonal Matching Pursuit Under Gaussian Noise

Orthogonal Matching Pursuit (OMP) is a popular algorithm to estimate an unknown sparse vector from multiple linear measurements of it. Assuming exact sparsity and that the measurements are corrupted by additive Gaussian noise, the success of OMP is often formulated as exactly recovering the support of the sparse vector. Several authors derived a sufficient condition for exact support recovery by OMP with high probability depending on the signal-to-noise ratio, defined as the magnitude of the smallest non-zero coefficient of the vector divided by the noise level. We make two contributions. First, we derive a slightly sharper sufficient condition for two variants of OMP, in which either the sparsity level or the noise level is known. Next, we show that this sharper sufficient condition is tight, in the following sense: for a wide range of problem parameters, there exist a dictionary of linear measurements and a sparse vector with a signal-to-noise ratio slightly below that of the sufficient condition, for which with high probability OMP fails to recover its support. Finally, we present simulations which illustrate that our condition is tight for a much broader range of dictionaries.

preprint, 2020, arXiv

Universal Streaming of Subset Norms

Most known algorithms in the streaming model of computation aim to approximate a single function such as an $\ell_p$-norm. In 2009, Nelson [\url{https://sublinear.info}, Open Problem 30] asked if it is possible to design \emph{universal algorithms}, that simultaneously approximate multiple functions of the stream. In this paper we answer the question of Nelson for the class of \emph{subset $\ell_0$-norms} in the insertion-only frequency-vector model. Given a family of subsets $\mathcal{S}\subset 2^{[n]}$, we provide a single streaming algorithm that can $(1\pm ε)$-approximate the subset-norm for every $S\in\mathcal{S}$. Here, the subset-$\ell_p$-norm of $v\in \mathbb{R}^n$ with respect to set $S\subseteq [n]$ is the $\ell_p$-norm of vector $v_{|S}$ (which denotes restricting $v$ to $S$, by zeroing all other coordinates). Our main result is a near-tight characterization of the space complexity of every family $\mathcal{S}\subset 2^{[n]}$ of subset-$\ell_0$-norms in insertion-only streams, expressed in terms of the "heavy-hitter dimension" of $\mathcal{S}$, a new combinatorial quantity that is related to the VC-dimension of $\mathcal{S}$. In contrast, we show that the more general turnstile and sliding-window models require a much larger space usage. All these results easily extend to $\ell_1$. In addition, we design algorithms for two other subset-$\ell_p$-norm variants. These can be compared to the Priority Sampling algorithm of Duffield, Lund and Thorup [JACM 2007], which achieves additive approximation $ε\|{v}\|$ for all possible subsets ($\mathcal{S}=2^{[n]}$) in the entry-wise update model. One of our algorithms extends this algorithm to handle turnstile updates, and another one achieves multiplicative approximation given a family $\mathcal{S}$.
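
The definition of the subset-$\ell_p$-norm can be made concrete with a small offline computation. This is only a sketch of the quantity being approximated: it reads the whole vector exactly and is not a streaming algorithm. The function name `subset_norm`, the vector `v`, and the set family `family` are made up for illustration, and indices are 0-based here.

```python
def subset_norm(v, S, p):
    """l_p norm of v restricted to the index set S (other coordinates zeroed).

    For p == 0 this counts the non-zero coordinates of v inside S.
    """
    restricted = [v[i] for i in S]
    if p == 0:
        return sum(1 for x in restricted if x != 0)
    return sum(abs(x) ** p for x in restricted) ** (1.0 / p)


# Hypothetical frequency vector and family of subsets.
v = [3, 0, -4, 0, 1]
family = [{0, 2}, {1, 3}, {0, 1, 4}]

# A universal algorithm would have to answer all of these from one summary.
l0_answers = [subset_norm(v, S, 0) for S in family]
```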

preprint, 2016, arXiv

Cheeger-type approximation for sparsest $st$-cut

We introduce the $st$-cut version of the Sparsest-Cut problem, where the goal is to find a cut of minimum sparsity among those separating two distinguished vertices $s,t\in V$. Clearly, this problem is at least as hard as the usual (non-$st$) version. Our main result is a polynomial-time algorithm for the product-demands setting, that produces a cut of sparsity $O(\sqrt{\mathrm{OPT}})$, where $\mathrm{OPT}$ denotes the optimum, and the total edge capacity and the total demand are assumed (by normalization) to be $1$. Our result generalizes the recent work of Trevisan [arXiv, 2013] for the non-$st$ version of the same problem (Sparsest-Cut with product demands), which in turn generalizes the bound achieved by the discrete Cheeger inequality, a cornerstone of Spectral Graph Theory that has numerous applications. Indeed, Cheeger's inequality handles graph conductance, the special case of product demands that are proportional to the vertex (capacitated) degrees. Along the way, we obtain an $O(\log n)$-approximation, where $n=|V|$, for the general-demands setting of Sparsest $st$-Cut.

preprint, 2016, arXiv

Metric Decompositions of Path-Separable Graphs

A prominent tool in many problems involving metric spaces is a notion of randomized low-diameter decomposition. Loosely speaking, $β$-decomposition refers to a probability distribution over partitions of the metric into sets of low diameter, such that nearby points (parameterized by $β>0$) are likely to be "clustered" together. Applying this notion to the shortest-path metric in edge-weighted graphs, it is known that $n$-vertex graphs admit an $O(\ln n)$-padded decomposition (Bartal, 1996), and that excluded-minor graphs admit $O(1)$-padded decomposition (Klein, Plotkin and Rao 1993, Fakcharoenphol and Talwar 2003, Abraham et al. 2014). We design decompositions for the family of $p$-path-separable graphs, which was defined by Abraham and Gavoille (2006) and refers to graphs that admit vertex-separators consisting of at most $p$ shortest paths in the graph. Our main result is that every $p$-path-separable $n$-vertex graph admits an $O(\ln (p \ln n))$-decomposition, which refines the $O(\ln n)$ bound for general graphs, and provides new bounds for families like bounded-treewidth graphs. Technically, our clustering process differs from previous ones by working in (the shortest-path metric of) carefully chosen subgraphs.

preprint, 2015, arXiv

A Nonlinear Approach to Dimension Reduction

The $l_2$ flattening lemma of Johnson and Lindenstrauss [JL84] is a powerful tool for dimension reduction. It has been conjectured that the target dimension bounds can be refined and bounded in terms of the intrinsic dimensionality of the data set (for example, the doubling dimension). One such problem was proposed by Lang and Plaut [LP01] (see also [GKL03,MatousekProblems07,ABN08,CGT10]), and is still open. We prove another result in this line of work: The snowflake metric $d^{1/2}$ of a doubling set $S \subset l_2$ embeds with constant distortion into $l_2^D$, for dimension $D$ that depends solely on the doubling constant of the metric. In fact, the distortion can be made arbitrarily close to 1, and the target dimension is polylogarithmic in the doubling constant. Our techniques are robust and extend to the more difficult spaces $l_1$ and $l_\infty$, although the dimension bounds here are quantitatively inferior to those for $l_2$.

preprint, 2015, arXiv

Adaptive Metric Dimensionality Reduction

We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling. On the algorithmic front, we describe an analogue of PCA for metric spaces: namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.

preprint, 2015, arXiv

Cutting corners cheaply, or how to remove Steiner points

Our main result is that the Steiner Point Removal (SPR) problem can always be solved with polylogarithmic distortion, which answers in the affirmative a question posed by Chan, Xia, Konjevod, and Richa (2006). Specifically, we prove that for every edge-weighted graph $G = (V,E,w)$ and a subset of terminals $T \subseteq V$, there is a graph $G'=(T,E',w')$ that is isomorphic to a minor of $G$, such that for every two terminals $u,v\in T$, the shortest-path distances between them in $G$ and in $G'$ satisfy $d_{G,w}(u,v) \le d_{G',w'}(u,v) \le O(\log^5|T|) \cdot d_{G,w}(u,v)$. Our existence proof actually gives a randomized polynomial-time algorithm. Our proof features a new variant of metric decomposition. It is well-known that every $n$-point metric space $(X,d)$ admits a $β$-separating decomposition for $β=O(\log n)$, which roughly means for every desired diameter bound $Δ>0$ there is a randomized partitioning of $X$, which satisfies the following separation requirement: for every $x,y \in X$, the probability they lie in different clusters of the partition is at most $β\,d(x,y)/Δ$. We introduce an additional requirement, which is the following tail bound: for every shortest-path $P$ of length $d(P) \leq Δ/β$, the number of clusters of the partition that meet the path $P$, denoted $Z_P$, satisfies $\Pr[Z_P > t] \le 2e^{-Ω(t)}$ for all $t>0$.

preprint, 2015, arXiv

Do semidefinite relaxations solve sparse PCA up to the information limit?

Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions can such algorithms recover the sparse principal components? We study this question for a single-spike model with an $\ell_0$-sparse eigenvector, in the asymptotic regime as dimension $p$ and sample size $n$ both tend to infinity. Amini and Wainwright [Ann. Statist. 37 (2009) 2877-2921] proved that for sparsity levels $k\geqΩ(n/\log p)$, no algorithm, efficient or not, can reliably recover the sparse eigenvector. In contrast, for $k\leq O(\sqrt{n/\log p})$, diagonal thresholding is consistent. It was further conjectured that an SDP approach may close this gap between computational and information limits. We prove that when $k\geqΩ(\sqrt{n})$, the proposed SDP approach, at least in its standard usage, cannot recover the sparse spike. In fact, we conjecture that in the single-spike model, no computationally-efficient algorithm can recover a spike of $\ell_0$-sparsity $k\geqΩ(\sqrt{n})$. Finally, we present empirical results suggesting that up to sparsity levels $k=O(\sqrt{n})$, recovery is possible by a simple covariance thresholding algorithm.
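
The "simple diagonal thresholding" baseline mentioned above can be sketched as follows. This is a rough illustration of the coordinate-selection step only, under the assumptions that the data are already centered and that the sparsity level `k` is known. It is not the SDP or covariance-thresholding method studied in the paper, and the function name and the tiny data set are hypothetical.

```python
def diagonal_thresholding(samples, k):
    """Return the indices of the k coordinates with the largest sample variance.

    samples is a list of equal-length rows, assumed to be centered, so the
    diagonal of the sample covariance is the mean of squared entries.
    """
    n = len(samples)
    p = len(samples[0])
    diag = [sum(row[j] ** 2 for row in samples) / n for j in range(p)]
    top = sorted(range(p), key=lambda j: diag[j], reverse=True)[:k]
    return sorted(top)


# Made-up data in which coordinates 0 and 2 carry most of the variance.
samples = [
    [2.0, 0.1, -1.9, 0.2],
    [-2.1, -0.2, 2.0, 0.1],
    [1.8, 0.3, -2.2, -0.1],
]
support_estimate = diagonal_thresholding(samples, 2)
```

In a full pipeline one would then run ordinary PCA restricted to the selected coordinates; that step is omitted here.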

preprint, 2015, arXiv

Sparsification of Two-Variable Valued CSPs

A valued constraint satisfaction problem (VCSP) instance $(V,Π,w)$ is a set of variables $V$ with a set of constraints $Π$ weighted by $w$. Given a VCSP instance, we are interested in a re-weighted sub-instance $(V,Π'\subset Π,w')$ that preserves the value of the given instance (under every assignment to the variables) within factor $1\pm ε$. A well-studied special case is cut sparsification in graphs, which has found various applications. We show that a VCSP instance consisting of a single boolean predicate $P(x,y)$ (e.g., for cut, $P=\mbox{XOR}$) can be sparsified into $O(|V|/ε^2)$ constraints if and only if the number of inputs that satisfy $P$ is anything but one (i.e., $|P^{-1}(1)| \neq 1$). Furthermore, this sparsity bound is tight unless $P$ is a relatively trivial predicate. We conclude that also systems of 2SAT (or 2LIN) constraints can be sparsified.
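
The stated criterion, $|P^{-1}(1)| \neq 1$, is easy to check by enumeration. The snippet below is a small sketch that counts satisfying inputs of a two-variable boolean predicate and applies the criterion as quoted in the abstract. The helper names are hypothetical, and the code does not perform any sparsification itself.

```python
from itertools import product

# All four inputs of a two-variable boolean predicate.
INPUTS = list(product([0, 1], repeat=2))


def satisfying_count(P):
    """Number of inputs (x, y) on which the predicate P evaluates to true."""
    return sum(1 for x, y in INPUTS if P(x, y))


def meets_criterion(P):
    """True when |P^{-1}(1)| != 1, the condition quoted in the abstract."""
    return satisfying_count(P) != 1


def xor(x, y):  # the cut predicate
    return x ^ y


def conj(x, y):  # AND, satisfied by exactly one input
    return x & y


def disj(x, y):  # OR, the shape of a 2SAT clause on positive literals
    return x | y


# Enumerate all 16 predicates by their truth tables and count those that pass.
passing = sum(
    1
    for table in product([0, 1], repeat=4)
    if meets_criterion(lambda x, y, t=table: t[2 * x + y])
)
```

Under this check XOR and OR pass while AND does not, which matches the abstract's remark that cut and 2SAT constraints can be sparsified.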

preprint, 2015, arXiv

The Traveling Salesman Problem: Low-Dimensionality Implies a Polynomial Time Approximation Scheme

The Traveling Salesman Problem (TSP) is among the most famous NP-hard optimization problems. We design for this problem a randomized polynomial-time algorithm that computes a (1+eps)-approximation to the optimal tour, for any fixed eps>0, in TSP instances that form an arbitrary metric space with bounded intrinsic dimension. The celebrated results of Arora (A-98) and Mitchell (M-99) prove that the above result holds in the special case of TSP in a fixed-dimensional Euclidean space. Thus, our algorithm demonstrates that the algorithmic tractability of metric TSP depends on the dimensionality of the space and not on its specific geometry. This result resolves a problem that has been open since the quasi-polynomial time algorithm of Talwar (T-04).

preprint, 2015, arXiv

Towards Resistance Sparsifiers

We study resistance sparsification of graphs, in which the goal is to find a sparse subgraph (with reweighted edges) that approximately preserves the effective resistances between every pair of nodes. We show that every dense regular expander admits a $(1+ε)$-resistance sparsifier of size $\tilde O(n/ε)$, and conjecture this bound holds for all graphs on $n$ nodes. In comparison, spectral sparsification is a strictly stronger notion and requires $Ω(n/ε^2)$ edges even on the complete graph. Our approach leads to the following structural question on graphs: Does every dense regular expander contain a sparse regular expander as a subgraph? Our main technical contribution, which may be of independent interest, is a positive answer to this question in a certain setting of parameters. Combining this with a recent result of von Luxburg, Radl, and Hein (JMLR, 2014) leads to the aforementioned resistance sparsifiers.

preprint, 2014, arXiv

Efficient Classification for Metric Data

Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose by von Luxburg and Bousquet [JMLR, 2004] left open the questions of computational efficiency and of providing direct bounds on generalization error. We design a new algorithm for classification in general metric spaces, whose runtime and accuracy depend on the doubling dimension of the data points, and can thus achieve superior classification performance in many common scenarios. The algorithmic core of our approach is an approximate (rather than exact) solution to the classical problems of Lipschitz extension and of Nearest Neighbor Search. The algorithm's generalization performance is guaranteed via the fat-shattering dimension of Lipschitz classifiers, and we present experimental evidence of its superiority to some common kernel methods. As a by-product, we offer a new perspective on the nearest neighbor classifier, which yields significantly sharper risk asymptotics than the classic analysis of Cover and Hart [IEEE Trans. Info. Theory, 1967].

preprint, 2014, arXiv

Spectral Approaches to Nearest Neighbor Search

We study spectral algorithms for the high-dimensional Nearest Neighbor Search problem (NNS). In particular, we consider a semi-random setting where a dataset $P$ in $\mathbb{R}^d$ is chosen arbitrarily from an unknown subspace of low dimension $k\ll d$, and then perturbed by fully $d$-dimensional Gaussian noise. We design spectral NNS algorithms whose query time depends polynomially on $d$ and $\log n$ (where $n=|P|$) for large ranges of $k$, $d$ and $n$. Our algorithms use a repeated computation of the top PCA vector/subspace, and are effective even when the random-noise magnitude is {\em much larger} than the interpoint distances in $P$. Our motivation is that in practice, a number of spectral NNS algorithms outperform the random-projection methods that seem otherwise theoretically optimal on worst case datasets. In this paper we aim to provide theoretical justification for this disparity.

preprint, 2014, arXiv

The Sketching Complexity of Graph Cuts

We study the problem of sketching an input graph, so that given the sketch, one can estimate the weight of any cut in the graph within factor $1+ε$. We present lower and upper bounds on the size of a randomized sketch, focusing on the dependence on the accuracy parameter $ε>0$. First, we prove that for every $ε> 1/\sqrt n$, every sketch that succeeds (with constant probability) in estimating the weight of all cuts $(S,\bar S)$ in an $n$-vertex graph (simultaneously), must be of size $Ω(n/ε^2)$ bits. In the special case where the sketch is itself a weighted graph (which may or may not be a subgraph) and the estimator is the sum of edge weights across the cut in the sketch, i.e., a cut sparsifier, we show the sketch must have $Ω(n/ε^2)$ edges, which is optimal. Despite the long sequence of work on graph sparsification, no such lower bound was known on the size of a cut sparsifier. We then design a randomized sketch that, given $ε\in(0,1)$ and an edge-weighted $n$-vertex graph, produces a sketch of size $\tilde O(n/ε)$ bits, from which the weight of any cut $(S,\bar S)$ can be reported, with high probability, within factor $1+ε$. The previous upper bound is $\tilde O(n/ε^2)$ bits, which follows by storing a cut sparsifier (Benczúr and Karger, 1996). To obtain this improvement, we critically use both that the sketch need only be correct on each fixed cut with high probability (rather than on all cuts), and that the estimation procedure of the data structure can be arbitrary (rather than a weighted subgraph). We also show a lower bound of $Ω(n/ε)$ bits for the space requirement of any data structure achieving this guarantee.

preprint2014arXiv

Vertex Sparsifiers: New Results from Old Techniques

Given a capacitated graph $G = (V,E)$ and a set of terminals $K \subseteq V$, how should we produce a graph $H$ only on the terminals $K$ so that every (multicommodity) flow between the terminals in $G$ could be supported in $H$ with low congestion, and vice versa? (Such a graph $H$ is called a flow-sparsifier for $G$.) What if we want $H$ to be a "simple" graph? What if we allow $H$ to be a convex combination of simple graphs? Improving on results of Moitra [FOCS 2009] and Leighton and Moitra [STOC 2010], we give efficient algorithms for constructing: (a) a flow-sparsifier $H$ that maintains congestion up to a factor of $O(\log k/\log \log k)$, where $k = |K|$, (b) a convex combination of trees over the terminals $K$ that maintains congestion up to a factor of $O(\log k)$, and (c) for a planar graph $G$, a convex combination of planar graphs that maintains congestion up to a constant factor. This requires us to give a new algorithm for the 0-extension problem, the first one in which the preimages of each terminal are connected in $G$. Moreover, this result extends to minor-closed families of graphs. Our improved bounds immediately imply improved approximation guarantees for several terminal-based cut and ordering problems.

preprint2013arXiv

Orienting Fully Dynamic Graphs with Worst-Case Time Bounds

In edge orientations, the goal is usually to orient (direct) the edges of an undirected $n$-vertex graph $G$ such that all out-degrees are bounded. When the graph $G$ is fully dynamic, i.e., admits edge insertions and deletions, we wish to maintain such an orientation while keeping a tab on the update time. Low out-degree orientations turned out to be a surprisingly useful tool, with several algorithmic applications involving static or dynamic graphs. Brodal and Fagerberg (1999) initiated the study of the edge orientation problem in terms of the graph's arboricity, which is very natural in this context. They provided a solution with constant out-degree and \emph{amortized} logarithmic update time for all graphs with constant arboricity, which include all planar and excluded-minor graphs. However, it remained an open question (first proposed by Brodal and Fagerberg, later by others) to obtain similar bounds with worst-case update time. We resolve this 15-year-old question in the affirmative, by providing a simple algorithm with worst-case bounds that nearly match the previous amortized bounds. Our algorithm is based on a new approach of a combinatorial invariant, and achieves a logarithmic out-degree with logarithmic worst-case update times. This result has applications in various dynamic graph problems such as maintaining a maximal matching, where we obtain $O(\log n)$ worst-case update time compared to the $O(\frac{\log n}{\log\log n})$ amortized update time of Neiman and Solomon (2013).

preprint2013arXiv

Towards (1+ε)-Approximate Flow Sparsifiers

A useful approach to "compress" a large network $G$ is to represent it with a \emph{flow-sparsifier}, i.e., a small network $H$ that supports the same flows as $G$, up to a factor $q \geq 1$ called the quality of sparsifier. Specifically, we assume the network $G$ contains a set of $k$ terminals $T$, shared with the network $H$, i.e., $T\subseteq V(G)\cap V(H)$, and we want $H$ to preserve all multicommodity flows that can be routed between the terminals $T$. The challenge is to construct $H$ that is small. These questions have received a lot of attention in recent years, leading to some known tradeoffs between the sparsifier's quality $q$ and its size $|V(H)|$. Nevertheless, it remains an outstanding question whether every $G$ admits a flow-sparsifier $H$ with quality $q=1+ε$, or even $q=O(1)$, and size $|V(H)|\leq f(k,ε)$ (in particular, independent of $|V(G)|$ and the edge capacities). Making a first step in this direction, we present new constructions for several scenarios:
* Our main result is that for quasi-bipartite networks $G$, one can construct a $(1+ε)$-flow-sparsifier of size $\mathrm{poly}(k/ε)$. In contrast, exact ($q=1$) sparsifiers for this family of networks are known to require size $2^{Ω(k)}$.
* For networks $G$ of bounded treewidth $w$, we construct a flow-sparsifier with quality $q=O(\log w / \log\log w)$ and size $O(w\cdot \mathrm{poly}(k))$.
* For general networks $G$, we construct a \emph{sketch} $sk(G)$, that stores all the feasible multicommodity flows up to factor $q=1+ε$, and its size (storage requirement) is $f(k,ε)$.

preprint2012arXiv

Everywhere-Sparse Spanners via Dense Subgraphs

The significant progress in constructing graph spanners that are sparse (small number of edges) or light (low total weight) has skipped spanners that are everywhere-sparse (small maximum degree). This disparity is in line with other network design problems, where the maximum-degree objective has been a notorious technical challenge. Our main result is for the Lowest Degree 2-Spanner (LD2S) problem, where the goal is to compute a 2-spanner of an input graph so as to minimize the maximum degree. We design a polynomial-time algorithm achieving approximation factor $\tilde O(Δ^{3-2\sqrt{2}}) \approx \tilde O(Δ^{0.172})$, where $Δ$ is the maximum degree of the input graph. The previous $\tilde O(Δ^{1/4})$-approximation was proved nearly two decades ago by Kortsarz and Peleg [SODA 1994, SICOMP 1998]. Our main conceptual contribution is to establish a formal connection between LD2S and a variant of the Densest k-Subgraph (DkS) problem. Specifically, we design for both problems strong relaxations based on the Sherali-Adams linear programming (LP) hierarchy, and show that "faithful" randomized rounding of the DkS-variant can be used to round LD2S solutions. Our notion of faithfulness intuitively means that all vertices and edges are chosen with probability proportional to their LP value, but the precise formulation is more subtle. Unfortunately, the best algorithms known for DkS use the Lovász-Schrijver LP hierarchy in a non-faithful way [Bhaskara, Charikar, Chlamtac, Feige, and Vijayaraghavan, STOC 2010]. Our main technical contribution is to overcome this shortcoming, while still matching the gap that arises in random graphs by planting a subgraph with the same log-density.

preprint2012arXiv

Faster Clustering via Preprocessing

We examine the efficiency of clustering a set of points, when the encompassing metric space may be preprocessed in advance. In computational problems of this genre, there is a first stage of preprocessing, whose input is a collection of points $M$; the next stage receives as input a query set $Q\subset M$, and should report a clustering of $Q$ according to some objective, such as 1-median, in which case the answer is a point $a\in M$ minimizing $\sum_{q\in Q} d_M(a,q)$. We design fast algorithms that approximately solve such problems under standard clustering objectives like $p$-center and $p$-median, when the metric $M$ has low doubling dimension. By leveraging the preprocessing stage, our algorithms achieve query time that is near-linear in the query size $n=|Q|$, and is (almost) independent of the total number of points $m=|M|$.

preprint2012arXiv

Mimicking Networks and Succinct Representations of Terminal Cuts

Given a large edge-weighted network $G$ with $k$ terminal vertices, we wish to compress it and store, using little memory, the value of the minimum cut (or equivalently, maximum flow) between every bipartition of terminals. One appealing methodology to implement a compression of $G$ is to construct a \emph{mimicking network}: a small network $G'$ with the same $k$ terminals, in which the minimum cut value between every bipartition of terminals is the same as in $G$. This notion was introduced by Hagerup, Katajainen, Nishimura, and Ragde [JCSS '98], who proved that such $G'$ of size at most $2^{2^k}$ always exists. Obviously, by having access to the smaller network $G'$, certain computations involving cuts can be carried out much more efficiently. We provide several new bounds, which together narrow the previously known gap from doubly-exponential to only singly-exponential, both for planar and for general graphs. Our first and main result is that every $k$-terminal planar network admits a mimicking network $G'$ of size $O(k^2 2^{2k})$, which is moreover a minor of $G$. On the other hand, some planar networks $G$ require $|E(G')| \ge Ω(k^2)$. For general networks, we show that certain bipartite graphs only admit mimicking networks of size $|V(G')| \geq 2^{Ω(k)}$, and moreover, every data structure that stores the minimum cut value between all bipartitions of the terminals must use $2^{Ω(k)}$ machine words.

preprint2012arXiv

Preserving Terminal Distances using Minors

We introduce the following notion of compressing an undirected graph $G$ with edge-lengths and terminal vertices $R\subseteq V(G)$. A distance-preserving minor is a minor $G'$ (of $G$) with possibly different edge-lengths, such that $R\subseteq V(G')$ and the shortest-path distance between every pair of terminals is exactly the same in $G$ and in $G'$. What is the smallest $f^*(k)$ such that every graph $G$ with $k=|R|$ terminals admits a distance-preserving minor $G'$ with at most $f^*(k)$ vertices? Simple analysis shows that $f^*(k)\leq O(k^4)$. Our main result proves that $f^*(k)\geq Ω(k^2)$, significantly improving over the trivial $f^*(k)\geq k$. Our lower bound holds even for planar graphs $G$, in contrast to graphs $G$ of constant treewidth, for which we prove that $O(k)$ vertices suffice.
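The definition of a distance-preserving minor can be checked mechanically on small inputs. The sketch below is a hypothetical illustration (the helper names and both toy graphs are assumptions, not taken from the paper): it compares terminal-to-terminal shortest-path distances before and after contracting an edge and re-assigning lengths. In the first example, contracting away a non-terminal on a path preserves all terminal distances; in the second, one particular contraction of a three-terminal star does not.

```python
import heapq

def shortest_paths(edges, source):
    """Dijkstra on an undirected graph given as (u, v, length) triples."""
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in adj.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def terminal_distances(edges, terminals):
    """Shortest-path distance for every unordered pair of terminals
    (assumes the graph is connected)."""
    terminals = sorted(terminals)
    return {(s, t): shortest_paths(edges, s)[t]
            for s in terminals for t in terminals if s < t}

terminals = {"a", "b", "c"}

# Path a - v - b - c, where v is a non-terminal. Contracting (a, v) and
# giving the new edge (a, b) length 1 + 2 = 3 keeps every terminal distance.
path_graph = [("a", "v", 1), ("v", "b", 2), ("b", "c", 4)]
path_minor = [("a", "b", 3), ("b", "c", 4)]

# Star with centre v. Contracting (a, v) forces every b-c path through a.
star_graph = [("a", "v", 1), ("b", "v", 2), ("c", "v", 3)]
star_minor = [("a", "b", 3), ("a", "c", 4)]

print(terminal_distances(path_graph, terminals) == terminal_distances(path_minor, terminals))
print(terminal_distances(star_graph, terminals) == terminal_distances(star_minor, terminals))
```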

preprint2011arXiv

Fault-Tolerant Spanners: Better and Simpler

A natural requirement of many distributed structures is fault-tolerance: after some failures, whatever remains from the structure should still be effective for whatever remains from the network. In this paper we examine spanners of general graphs that are tolerant to vertex failures, and significantly improve their dependence on the number of faults $r$, for all stretch bounds. For stretch $k \geq 3$ we design a simple transformation that converts every $k$-spanner construction with at most $f(n)$ edges into an $r$-fault-tolerant $k$-spanner construction with at most $O(r^3 \log n) \cdot f(2n/r)$ edges. Applying this to standard greedy spanner constructions gives $r$-fault tolerant $k$-spanners with $\tilde O(r^{2} n^{1+\frac{2}{k+1}})$ edges. The previous construction by Chechik, Langberg, Peleg, and Roditty [STOC 2009] depends similarly on $n$ but exponentially on $r$ (approximately like $k^r$). For the case $k=2$ and unit-length edges, an $O(r \log n)$-approximation algorithm is known from recent work of Dinitz and Krauthgamer [arXiv 2010], where several spanner results are obtained using a common approach of rounding a natural flow-based linear programming relaxation. Here we use a different (stronger) LP relaxation and improve the approximation ratio to $O(\log n)$, which is, notably, independent of the number of faults $r$. We further strengthen this bound in terms of the maximum degree by using the Lovász Local Lemma. Finally, we show that most of our constructions are inherently local by designing equivalent distributed algorithms in the LOCAL model of distributed computation.

preprint2011arXiv

Min-Max Graph Partitioning and Small Set Expansion

We study graph partitioning problems from a min-max perspective, in which an input graph on $n$ vertices should be partitioned into $k$ parts, and the objective is to minimize the maximum number of edges leaving a single part. The two main versions we consider are where the $k$ parts need to be of equal size, and where they must separate a set of $k$ given terminals. We consider a common generalization of these two problems, and design for it an $O(\sqrt{\log n\log k})$-approximation algorithm. This improves over an $O(\log^2 n)$ approximation for the second version, and roughly $O(k\log n)$ approximation for the first version that follows from other previous work. We also give an improved $O(1)$-approximation algorithm for graphs that exclude any fixed minor. Our algorithm uses a new procedure for solving the Small-Set Expansion problem. In this problem, we are given a graph $G$ and the goal is to find a non-empty set $S\subseteq V$ of size $|S| \leq ρn$ with minimum edge-expansion. We give an $O(\sqrt{\log{n}\log{(1/ρ)}})$ bicriteria approximation algorithm for the general case of Small-Set Expansion, and an $O(1)$-approximation algorithm for graphs that exclude any fixed minor.

preprint2011arXiv

Streaming Algorithms from Precision Sampling

A technique introduced by Indyk and Woodruff [STOC 2005] has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x=(x_1,\ldots,x_n)$, which is useful for the following applications. 1) Estimating the $F_k$-moment of $x$, for $k>2$. 2) Estimating the $\ell_p$-norm of $x$, for $p\in[1,2]$, with small update time. 3) Estimating cascaded norms $\ell_p(\ell_q)$ for all $p,q>0$. 4) $\ell_1$ sampling, where the goal is to produce an element $i$ with probability (approximately) $|x_i|/\|x\|_1$. It extends to similarly defined $\ell_p$-sampling, for $p\in [1,2]$. For all these applications the algorithm is essentially the same: scale the vector $x$ entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of $x$, thereby allowing general updates to the vector $x$. Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^n a_i$ from weak estimates of each real $a_i\in[0,1]$. More precisely, the estimator first chooses a desired precision $u_i\in(0,1]$ for each $i\in[n]$, and then it receives an estimate of every $a_i$ within additive $u_i$. Its goal is to provide a good approximation to $\sum a_i$ while keeping a tab on the "approximation cost" $\sum_i (1/u_i)$. Here we refine previous work [Andoni, Krauthgamer, and Onak, FOCS 2010] which shows that as long as $\sum a_i=Ω(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n\log n)$.
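The "scale entry-wise by a random vector" step can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that each scaling factor is $1/u_i$ with $u_i$ uniform on $(0,1]$; the abstract only says "well-chosen", so this distribution and the names `scaled_exceeds` and `crossing_frequency` are assumptions. Under that assumption, $\Pr[|x_i|/u_i \ge t] = |x_i|/t$ whenever $|x_i|\le t$, so the chance that a scaled entry becomes "heavy" is proportional to $|x_i|$. The heavy-hitter sketch itself is omitted here.

```python
import random

def scaled_exceeds(x_i, t, rng):
    """Scale one entry by 1/u with u uniform on (0, 1] (an assumed
    distribution) and report whether the scaled magnitude reaches t."""
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
    return abs(x_i) / u >= t

def crossing_frequency(x_i, t, trials, seed=0):
    """Empirical frequency with which the scaled entry reaches threshold t."""
    rng = random.Random(seed)
    hits = sum(scaled_exceeds(x_i, t, rng) for _ in range(trials))
    return hits / trials

# Hypothetical entries and threshold: the frequencies should be close to
# |x_i| / t, i.e. roughly 0.3 and 0.5 respectively.
print(crossing_frequency(3.0, 10.0, trials=20000))
print(crossing_frequency(-5.0, 10.0, trials=20000))
```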

preprint2010arXiv

Approximating Sparsest Cut in Graphs of Bounded Treewidth

We give the first constant-factor approximation algorithm for Sparsest Cut with general demands in bounded treewidth graphs. In contrast to previous algorithms, which rely on the flow-cut gap and/or metric embeddings, our approach exploits the Sherali-Adams hierarchy of linear programming relaxations.

preprint2010arXiv

Directed Spanners via Flow-Based Linear Programs

We examine directed spanners through flow-based linear programming relaxations. We design an $\tilde O(n^{2/3})$-approximation algorithm for the directed $k$-spanner problem that works for all $k\geq 1$, which is the first sublinear approximation for arbitrary edge-lengths. Even in the more restricted setting of unit edge-lengths, our algorithm improves over the previous $\tilde O(n^{1-1/k})$ approximation of Bhattacharyya et al. when $k\ge 4$. For the special case of $k=3$ we design a different algorithm achieving an $\tilde O(\sqrt{n})$-approximation, improving the previous $\tilde O(n^{2/3})$. Both of our algorithms easily extend to the fault-tolerant setting, which has recently attracted attention but not from an approximation viewpoint. We also prove a nearly matching integrality gap of $Ω(n^{\frac13 - ε})$ for any constant $ε> 0$. A virtue of all our algorithms is that they are relatively simple. Technically, we introduce a new yet natural flow-based relaxation, and show how to approximately solve it even when its size is not polynomial. The main challenge is to design a rounding scheme that "coordinates" the choices of flow-paths between the many demand pairs while using few edges overall. We achieve this, roughly speaking, by randomization at the level of vertices.

preprint2010arXiv

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor; specifically, for strings of length $n$ and every fixed $ε>0$, it can compute a $(\log n)^{O(1/ε)}$ approximation in $n^{1+ε}$ time. This is an exponential improvement over the previously known factor, $2^{O(\sqrt{\log n})}$, with a comparable running time (Ostrovsky and Rabani, J. ACM 2007; Andoni and Onak, STOC 2009). Previously, no efficient polylogarithmic approximation algorithm was known for any computational task involving edit distance (e.g., nearest neighbor search or sketching). This result arises naturally in the study of a new asymmetric query model. In this model, the input consists of two strings $x$ and $y$, and an algorithm can access $y$ in an unrestricted manner, while being charged for querying every symbol of $x$. Indeed, we obtain our main result by designing an algorithm that makes a small number of queries in this model. We then provide a nearly-matching lower bound on the number of queries. Our lower bound is the first to expose hardness of edit distance stemming from the input strings being "repetitive", which means that many of their substrings are approximately identical. Consequently, our lower bound provides the first rigorous separation between edit distance and Ulam distance, which is edit distance on non-repetitive strings, such as permutations.
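For reference, the quantity being approximated can be computed exactly by the textbook quadratic-time dynamic program. The sketch below is that standard baseline (insertions, deletions and substitutions at unit cost), not the near-linear approximation algorithm described above; the function name `edit_distance` is an illustrative choice.

```python
def edit_distance(x, y):
    """Exact edit (Levenshtein) distance via the classic O(|x|*|y|) dynamic
    program, keeping only two rows of the table at a time."""
    previous = list(range(len(y) + 1))  # distance from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        current = [i]  # deleting the first i symbols of x
        for j, cy in enumerate(y, start=1):
            substitute = previous[j - 1] + (cx != cy)
            delete = previous[j] + 1
            insert = current[j - 1] + 1
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]

print(edit_distance("kitten", "sitting"))
```

The quadratic running time of this baseline is what makes approximation algorithms running in $n^{1+ε}$ time of interest for long strings.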

37 published item(s)

Almost-Smooth Histograms and Sliding-Window Graph Algorithms

Breaking the Cubic Barrier for All-Pairs Max-Flow: Gomory-Hu Tree in Nearly Quadratic Time

Distributed Sparse Normal Means Estimation with Sublinear Communication

Exact Flow Sparsification Requires Unbounded Size

Faster Algorithms for Orienteering and $k$-TSP

Near-Linear $\varepsilon$-Emulators for Planar Graphs

Optimal Vertex-Cut Sparsification of Quasi-Bipartite Graphs

Coresets for Clustering in Excluded-minor Graphs and Beyond

Cut-Equivalent Trees are Optimal for Min-Cut Queries

Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension

Tight Recovery Guarantees for Orthogonal Matching Pursuit Under Gaussian Noise

Universal Streaming of Subset Norms

Cheeger-type approximation for sparsest $st$-cut

Metric Decompositions of Path-Separable Graphs

A Nonlinear Approach to Dimension Reduction

Adaptive Metric Dimensionality Reduction

Cutting corners cheaply, or how to remove Steiner points

Do semidefinite relaxations solve sparse PCA up to the information limit?

Sparsification of Two-Variable Valued CSPs

The Traveling Salesman Problem: Low-Dimensionality Implies a Polynomial Time Approximation Scheme

Towards Resistance Sparsifiers

Efficient Classification for Metric Data

Spectral Approaches to Nearest Neighbor Search

The Sketching Complexity of Graph Cuts

Vertex Sparsifiers: New Results from Old Techniques

Orienting Fully Dynamic Graphs with Worst-Case Time Bounds

Towards (1+ε)-Approximate Flow Sparsifiers

Everywhere-Sparse Spanners via Dense Subgraphs

Faster Clustering via Preprocessing

Mimicking Networks and Succinct Representations of Terminal Cuts

Preserving Terminal Distances using Minors

Fault-Tolerant Spanners: Better and Simpler

Min-Max Graph Partitioning and Small Set Expansion

Streaming Algorithms from Precision Sampling

Approximating Sparsest Cut in Graphs of Bounded Treewidth

Directed Spanners via Flow-Based Linear Programs

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity