Source author record

Ilya Razenshteyn

Ilya Razenshteyn appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms, Machine Learning, Computational Geometry, Information Theory, math.IT, math.CO, Computational Complexity, Cryptography and Security, Databases, Discrete Mathematics, math.NA, Computer Vision, Information Retrieval, math.LO, math.MG, math.PR, Neural and Evolutionary Computing

Catalog footprint

What is connected

23 works

17 topics

4 close collaborators


preprint, 2022, arXiv

Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm

We provide a function space characterization of the inductive bias resulting from minimizing the $\ell_2$ norm of the weights in multi-channel convolutional neural networks with linear activations and empirically test our resulting hypothesis on ReLU networks trained using gradient descent. We define an induced regularizer in the function space as the minimum $\ell_2$ norm of weights of a network required to realize a function. For two layer linear convolutional networks with $C$ output channels and kernel size $K$, we show the following: (a) If the inputs to the network are single-channel, the induced regularizer for any $K$ is independent of the number of output channels $C$. Furthermore, we show that the regularizer is a norm given by a semidefinite program (SDP). (b) In contrast, for multi-channel inputs, multiple output channels can be necessary to merely realize all matrix-valued linear functions and thus the inductive bias does depend on $C$. However, for sufficiently large $C$, the induced regularizer is again given by an SDP that is independent of $C$. In particular, the induced regularizer for $K=1$ and $K=D$ (input dimension) is given in closed form as the nuclear norm and the $\ell_{2,1}$ group-sparse norm, respectively, of the Fourier coefficients of the linear predictor. We investigate the broader applicability of our theoretical results to implicit regularization from gradient descent on linear and ReLU networks through experiments on MNIST and CIFAR-10 datasets.

preprint, 2020, arXiv

A Study of Performance of Optimal Transport

We investigate the problem of efficiently computing optimal transport (OT) distances, which is equivalent to the node-capacitated minimum cost maximum flow problem in a bipartite graph. We compare runtimes in computing OT distances on data from several domains, such as synthetic data of geometric shapes, embeddings of tokens in documents, and pixels in images. We show that in practice, combinatorial methods such as network simplex and augmenting path based algorithms can consistently outperform numerical matrix-scaling based methods such as Sinkhorn [Cuturi'13] and Greenkhorn [Altschuler et al'17], even in low accuracy regimes, with up to orders of magnitude speedups. Lastly, we present a new combinatorial algorithm that improves upon the classical Kuhn-Munkres algorithm.
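As a point of reference for the matrix-scaling methods named above, the following is a minimal sketch of Sinkhorn-style scaling for entropy-regularized OT on a tiny instance. It is not the implementation benchmarked in the paper; the function name `sinkhorn`, the regularization parameter `eps`, and the iteration count `iters` are illustrative assumptions.

```python
import math


def sinkhorn(cost, a, b, eps=0.1, iters=500):
    """Approximate an OT plan between histograms a and b by matrix scaling.

    cost[i][j] is the cost of moving mass from source i to target j.
    eps and iters are hypothetical defaults chosen for this toy example.
    """
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    value = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, value


# Two sources and two targets; moving mass "straight across" is free.
cost = [[0.0, 1.0], [1.0, 0.0]]
plan, value = sinkhorn(cost, [0.5, 0.5], [0.5, 0.5])
print(plan, value)
```

The returned plan is only approximately optimal: smaller `eps` tightens the approximation but slows convergence, which is the low-accuracy regime the abstract refers to.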

preprint, 2020, arXiv

Non-Adaptive Adaptive Sampling on Turnstile Streams

Adaptive sampling is a useful algorithmic tool for data summarization problems in the classical centralized setting, where the entire dataset is available to the single processor performing the computation. Adaptive sampling repeatedly selects rows of an underlying matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$, where $n\gg d$, with probabilities proportional to their distances to the subspace of the previously selected rows. Intuitively, adaptive sampling seems to be limited to trivial multi-pass algorithms in the streaming model of computation due to its inherently sequential nature of assigning sampling probabilities to each row only after the previous iteration is completed. Surprisingly, we show this is not the case by giving the first one-pass algorithms for adaptive sampling on turnstile streams and using space $\text{poly}(d,k,\log n)$, where $k$ is the number of adaptive sampling rounds to be performed. Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model. We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume maximization on turnstile streams that use space sublinear in $n$. We complement our volume maximization algorithmic results with lower bounds that are tight up to lower order terms, even for multi-pass algorithms. By a similar construction, we also obtain lower bounds for volume maximization in the row-arrival model, which we match with competitive upper bounds. See paper for full abstract.
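The centralized procedure described at the start of this abstract can be sketched as follows. This is the classical multi-round, offline version only, not the one-pass turnstile algorithm of the paper; the helper names and the tolerance `tol` are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(row, basis):
    """Component of `row` orthogonal to the span of the orthonormal `basis`."""
    r = list(row)
    for q in basis:
        coeff = dot(r, q)
        r = [ri - coeff * qi for ri, qi in zip(r, q)]
    return r


def adaptive_sample(rows, k, rng, tol=1e-9):
    """Pick up to k row indices, each with probability proportional to the
    row's distance from the subspace spanned by the rows picked so far."""
    basis, chosen = [], []
    for _ in range(k):
        residuals = [residual(row, basis) for row in rows]
        dists = [math.sqrt(dot(r, r)) for r in residuals]
        # Treat numerically negligible distances as exactly zero.
        dists = [d if d > tol else 0.0 for d in dists]
        if sum(dists) == 0.0:
            break  # every row already lies in the selected subspace
        idx = rng.choices(range(len(rows)), weights=dists, k=1)[0]
        chosen.append(idx)
        # Extend the orthonormal basis with the normalized residual.
        basis.append([x / dists[idx] for x in residuals[idx]])
    return chosen


rows = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
chosen = adaptive_sample(rows, 3, random.Random(0))
print(chosen)
```

Each round depends on the basis built in earlier rounds, which is the sequential dependence the abstract says makes a one-pass streaming version surprising.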

preprint, 2020, arXiv

Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering

Consider an instance of Euclidean $k$-means or $k$-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of $(1+\varepsilon)$ under a projection onto a random $O(\log(k / \varepsilon) / \varepsilon^2)$-dimensional subspace. Further, the cost of every clustering is preserved within $(1+\varepsilon)$. More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean $k$-clustering with the distances raised to the $p$-th power for any constant $p$. For $k$-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for $k$-medians, it answers a question raised by Kannan.
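A small experiment can illustrate the claim that the cost of a fixed clustering survives a random projection. The sketch below uses a scaled Gaussian matrix as the dimension reduction map, which is one assumed choice of map with sub-Gaussian tails; the toy data, the dimensions, and the function names are all illustrative and not taken from the paper.

```python
import math
import random


def kmeans_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    dim = len(points[0])
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        cost += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return cost


def gaussian_projection(points, target_dim, rng):
    """Map points through a random Gaussian matrix scaled by 1/sqrt(target_dim)."""
    dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)]
              for _ in range(target_dim)]
    return [[sum(g * x for g, x in zip(row, p)) for row in matrix] for p in points]


rng = random.Random(0)
k, n, dim, target_dim = 4, 40, 200, 60
labels = [i % k for i in range(n)]
# Toy data: each cluster is a shifted Gaussian blob (illustrative only).
points = [[5.0 * lab + rng.gauss(0.0, 1.0) for _ in range(dim)] for lab in labels]
projected = gaussian_projection(points, target_dim, rng)
ratio = kmeans_cost(projected, labels, k) / kmeans_cost(points, labels, k)
print(f"cost ratio after projection: {ratio:.3f}")
```

A ratio close to 1 is what the $(1+\varepsilon)$ guarantee predicts; the target dimension used here is an arbitrary choice and is not derived from the paper's bound.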

preprint, 2020, arXiv

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further. Our code and trained models are available at http://github.com/Hadisalman/smoothing-adversarial .
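For readers unfamiliar with randomized smoothing, the prediction step can be sketched as a majority vote of a base classifier over Gaussian-perturbed copies of the input. This is a minimal sketch under that standard formulation; it omits certification and the adversarial training that the paper contributes, and the names `smoothed_predict` and `toy_classifier` are hypothetical.

```python
import random
from collections import Counter


def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Return the class the base classifier predicts most often on
    inputs x + noise, with noise drawn i.i.d. from N(0, sigma^2)."""
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(x):
    """Hypothetical stand-in for a neural network: sign of the first coordinate."""
    return 1 if x[0] > 0 else 0


prediction = smoothed_predict(toy_classifier, [2.0, 0.0], sigma=0.5,
                              n_samples=1000, rng=random.Random(0))
print(prediction)
```

The vote margin is what robustness certificates are computed from, so a base classifier that stays accurate under noise yields a larger certified radius.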

preprint, 2020, arXiv

Randomized Smoothing of All Shapes and Sizes

Randomized smoothing is the current state-of-the-art defense with provable robustness against $\ell_2$ adversarial attacks. Many works have devised new randomized smoothing schemes for other metrics, such as $\ell_1$ or $\ell_\infty$; however, substantial effort was needed to derive such new guarantees. This begs the question: can we find a general theory for randomized smoothing? We propose a novel framework for devising and analyzing randomized smoothing schemes, and validate its effectiveness in practice. Our theoretical contributions are: (1) we show that for an appropriate notion of "optimal", the optimal smoothing distributions for any "nice" norms have level sets given by the norm's *Wulff Crystal*; (2) we propose two novel and complementary methods for deriving provably robust radii for any smoothing distribution; and, (3) we show fundamental limits to current randomized smoothing techniques via the theory of *Banach space cotypes*. By combining (1) and (2), we significantly improve the state-of-the-art certified accuracy in $\ell_1$ on standard datasets. Meanwhile, we show using (3) that with only label statistics under random input perturbations, randomized smoothing cannot achieve nontrivial certified accuracy against perturbations of $\ell_p$-norm $Ω(\min(1, d^{\frac{1}{p} - \frac{1}{2}}))$, when the input dimension $d$ is large. We provide code in github.com/tonyduan/rs4a.

preprint, 2020, arXiv

SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

The $k$-Nearest Neighbor Search ($k$-NNS) is the backbone of several cloud-based services such as recommender systems, face recognition, and database search on text and images. In these services, the client sends the query to the cloud server and receives the response, so both the query and the response are revealed to the service provider. Such data disclosures are unacceptable in several scenarios due to the sensitivity of data and/or privacy laws. In this paper, we introduce SANNS, a system for secure $k$-NNS that keeps the client's query and the search result confidential. SANNS comprises two protocols: an optimized linear scan and a protocol based on a novel sublinear time clustering-based algorithm. We prove the security of both protocols in the standard semi-honest model. The protocols are built upon several state-of-the-art cryptographic primitives such as lattice-based additively homomorphic encryption, distributed oblivious RAM, and garbled circuits. We provide several contributions to each of these primitives which are applicable to other secure computation tasks. Both of our protocols rely on a new circuit for the approximate top-$k$ selection from $n$ numbers that is built from $O(n + k^2)$ comparators. We have implemented our proposed system and performed extensive experiments on four datasets in two different computation environments, demonstrating more than $18$-$31\times$ faster response time compared to optimally implemented protocols from the prior work. Moreover, SANNS is the first work that scales to a database of 10 million entries, pushing the limit by more than two orders of magnitude.

preprint, 2020, arXiv

Scaling up Kernel Ridge Regression via Locality Sensitive Hashing

Random binning features, introduced in the seminal paper of Rahimi and Recht (2007), are an efficient method for approximating a kernel matrix using locality sensitive hashing. Random binning features provide a very simple and efficient way of approximating the Laplace kernel but unfortunately do not apply to many important classes of kernels, notably ones that generate smooth Gaussian processes, such as the Gaussian kernel and Matern kernel. In this paper, we introduce a simple weighted version of random binning features and show that the corresponding kernel function generates Gaussian processes of any desired smoothness. We show that our weighted random binning features provide a spectral approximation to the corresponding kernel matrix, leading to efficient algorithms for kernel ridge regression. Experiments on large scale regression datasets show that our method is more accurate than the random Fourier features method.
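The unweighted scheme for the Laplace kernel $k(x, y) = \exp(-\gamma \|x - y\|_1)$ can be sketched as below: each coordinate gets a randomly shifted grid whose pitch is drawn from a Gamma distribution with shape 2 and scale $1/\gamma$, and the kernel value is estimated by how often two points share a bin in every coordinate. This covers only the original Rahimi-Recht construction, not the weighted variant introduced in the paper, and the function name and parameters are illustrative assumptions.

```python
import math
import random


def random_binning_estimate(x, y, gamma, n_grids, rng):
    """Estimate exp(-gamma * ||x - y||_1) as the fraction of random grids
    in which x and y fall into the same bin along every coordinate."""
    hits = 0
    for _ in range(n_grids):
        same_bin = True
        for xi, yi in zip(x, y):
            pitch = rng.gammavariate(2.0, 1.0 / gamma)  # shape 2, scale 1/gamma
            shift = rng.uniform(0.0, pitch)
            if math.floor((xi - shift) / pitch) != math.floor((yi - shift) / pitch):
                same_bin = False
                break
        hits += same_bin
    return hits / n_grids


x, y, gamma = [0.0, 0.0], [0.3, 0.4], 1.0
estimate = random_binning_estimate(x, y, gamma, 20000, random.Random(0))
exact = math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

Because the bin index acts as a hash of the point, the resulting feature vectors are sparse indicators, which is where the connection to locality sensitive hashing comes from.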

preprint, 2016, arXiv

Lower Bounds on Time-Space Trade-Offs for Approximate Near Neighbors

We show tight lower bounds for the entire trade-off between space and query time for the Approximate Near Neighbor search problem. Our lower bounds hold in a restricted model of computation, which captures all hashing-based approaches. In particular, our lower bound matches the upper bound recently shown in [Laarhoven 2015] for the random instance on a Euclidean sphere (which we show in fact extends to the entire space $\mathbb{R}^d$ using the techniques from [Andoni, Razenshteyn 2015]). We also show tight, unconditional cell-probe lower bounds for one and two probes, improving upon the best known bounds from [Panigrahy, Talwar, Wieder 2010]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than for one probe. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.

preprint, 2016, arXiv

On the Complexity of Inner Product Similarity Join

A number of tasks in classification, information retrieval, recommendation systems, and record linkage reduce to the core problem of inner product similarity join (IPS join): identifying pairs of vectors in a collection that have a sufficiently large inner product. IPS join is well understood when vectors are normalized and some approximation of inner products is allowed. However, the general case where vectors may have any length appears much more challenging. Recently, new upper bounds based on asymmetric locality-sensitive hashing (ALSH) and asymmetric embeddings have emerged, but little has been known on the lower bound side. In this paper we initiate a systematic study of inner product similarity join, showing new lower and upper bounds. Our main results are:

* Approximation hardness of IPS join in subquadratic time, assuming the strong exponential time hypothesis.
* New upper and lower bounds for (A)LSH-based algorithms. In particular, we show that asymmetry can be avoided by relaxing the LSH definition to only consider the collision probability of distinct elements.
* A new indexing method for IPS based on linear sketches, implying that our hardness results are not far from being tight.

Our technical contributions include new asymmetric embeddings that may be of independent interest. At the conceptual level we strive to provide greater clarity, for example by distinguishing among signed and unsigned variants of IPS join and shedding new light on the effect of asymmetry.

preprint, 2015, arXiv

Nearly-optimal bounds for sparse recovery in generic norms, with applications to $k$-median sketching

We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically, for a norm $\|\cdot\|$, sparsity parameter $k$, approximation factor $K>0$, and probability of failure $P>0$, we ask: what is the minimal value of $m$ so that there is a distribution over $m \times n$ matrices $A$ with the property that for any $x$, given $Ax$, we can recover a $k$-sparse approximation to $x$ in the given norm with probability at least $1-P$? We give a partial answer to this problem, by showing that for norms that admit efficient linear sketches, the optimal number of measurements $m$ is closely related to the doubling dimension of the metric induced by the norm $\|\cdot\|$ on the set of all $k$-sparse vectors. By applying our result to specific norms, we cast known measurement bounds in our general framework (for the $\ell_p$ norms, $p \in [1,2]$) as well as provide new, measurement-efficient schemes (for the Earth-Mover Distance norm). The latter result directly implies more succinct linear sketches for the well-studied planar $k$-median clustering problem. Finally, our lower bound for the doubling dimension of the EMD norm enables us to address the open question of [Frahling-Sohler, STOC'05] about the space complexity of clustering problems in the dynamic streaming model.

preprint, 2015, arXiv

Optimal Data-Dependent Hashing for Approximate Near Neighbors

We show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an $n$-point data set in a $d$-dimensional space our data structure achieves query time $O(d n^{ρ+o(1)})$ and space $O(n^{1+ρ+o(1)} + dn)$, where $ρ=\tfrac{1}{2c^2-1}$ for the Euclidean space and approximation $c>1$. For the Hamming space, we obtain an exponent of $ρ=\tfrac{1}{2c-1}$. Our result completes the direction set forth in [AINR14] who gave a proof-of-concept that data-dependent hashing can outperform classical Locality Sensitive Hashing (LSH). In contrast to [AINR14], the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures [IM98,AI06] for all approximation factors $c>1$. From the technical perspective, we proceed by decomposing an arbitrary dataset into several subsets that are, in a certain sense, pseudo-random.
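To make the improvement concrete, the exponents can be compared numerically. The data-dependent formulas below are the ones stated in the abstract; the classical LSH exponents $1/c^2$ (Euclidean, [AI06]) and $1/c$ (Hamming, [IM98]) are the standard values, ignoring $o(1)$ terms, and are not quoted in this abstract. The function names are illustrative.

```python
def rho_data_dependent_euclidean(c):
    """Exponent from the abstract: 1 / (2c^2 - 1)."""
    return 1.0 / (2.0 * c * c - 1.0)


def rho_data_dependent_hamming(c):
    """Exponent from the abstract: 1 / (2c - 1)."""
    return 1.0 / (2.0 * c - 1.0)


def rho_lsh_euclidean(c):
    """Classical data-independent LSH exponent 1 / c^2 (assumed from [AI06])."""
    return 1.0 / (c * c)


def rho_lsh_hamming(c):
    """Classical data-independent LSH exponent 1 / c (assumed from [IM98])."""
    return 1.0 / c


n = 10 ** 6  # hypothetical dataset size
for c in (1.5, 2.0, 3.0):
    new, old = rho_data_dependent_euclidean(c), rho_lsh_euclidean(c)
    # n ** rho is the dominant factor in the query time, up to n^{o(1)} and d.
    print(f"c={c}: rho {old:.3f} -> {new:.3f}, n^rho {n ** old:.0f} -> {n ** new:.0f}")
```

Since $2c^2 - 1 > c^2$ exactly when $c > 1$, the data-dependent exponent is smaller for every approximation factor, matching the claim in the abstract.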

preprint, 2015, arXiv

Practical and Optimal LSH for Angular Distance

We show the existence of a Locality-Sensitive Hashing (LSH) family for the angular distance that yields an approximate Near Neighbor Search algorithm with the asymptotically optimal running time exponent. Unlike earlier algorithms with this property (e.g., Spherical LSH [Andoni, Indyk, Nguyen, Razenshteyn 2014], [Andoni, Razenshteyn 2015]), our algorithm is also practical, improving upon the well-studied hyperplane LSH [Charikar, 2002] in practice. We also introduce a multiprobe version of this algorithm, and conduct experimental evaluation on real and synthetic data sets. We complement the above positive results with a fine-grained lower bound for the quality of any LSH family for angular distance. Our lower bound implies that the above LSH family exhibits a trade-off between evaluation time and quality that is close to optimal for a natural class of LSH functions.
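The hyperplane LSH baseline [Charikar, 2002] mentioned above is simple enough to sketch: a point is hashed to the side of a random hyperplane through the origin, and two points at angle $\theta$ collide with probability $1 - \theta/\pi$. The code below checks that property empirically; it illustrates the baseline only, not the new LSH family of the paper, and the function names are assumptions.

```python
import math
import random


def hyperplane_bit(x, normal):
    """One hash bit: which side of the hyperplane with this normal x lies on."""
    return 1 if sum(a * b for a, b in zip(normal, x)) >= 0.0 else 0


def empirical_collision_rate(x, y, trials, rng):
    """Fraction of random Gaussian hyperplanes that give x and y the same bit."""
    same = 0
    for _ in range(trials):
        normal = [rng.gauss(0.0, 1.0) for _ in x]
        same += hyperplane_bit(x, normal) == hyperplane_bit(y, normal)
    return same / trials


angle = math.pi / 3
x = [1.0, 0.0]
y = [math.cos(angle), math.sin(angle)]
rate = empirical_collision_rate(x, y, 20000, random.Random(0))
expected = 1.0 - angle / math.pi
print(f"empirical={rate:.3f} expected={expected:.3f}")
```

In practice several such bits are concatenated into one hash key, trading a lower collision probability for far points against more tables for near ones.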

preprint, 2015, arXiv

Restricted Isometry Property for General p-Norms

The Restricted Isometry Property (RIP) is a fundamental property of a matrix which enables sparse recovery. Informally, an $m \times n$ matrix satisfies RIP of order $k$ for the $\ell_p$ norm, if $\|Ax\|_p \approx \|x\|_p$ for every vector $x$ with at most $k$ non-zero coordinates. For every $1 \leq p < \infty$ we obtain almost tight bounds on the minimum number of rows $m$ necessary for the RIP property to hold. Prior to this work, only the cases $p = 1$, $1 + 1 / \log k$, and $2$ were studied. Interestingly, our results show that the case $p = 2$ is a "singularity" point: the optimal number of rows $m$ is $\widetilde{Θ}(k^{p})$ for all $p\in [1,\infty)\setminus \{2\}$, as opposed to $\widetilde{Θ}(k)$ for $p=2$. We also obtain almost tight bounds for the column sparsity of RIP matrices and discuss implications of our results for the Stable Sparse Recovery problem.

preprint, 2015, arXiv

Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing

We prove a tight lower bound for the exponent $ρ$ for data-dependent Locality-Sensitive Hashing schemes, recently used to design efficient solutions for the $c$-approximate nearest neighbor search. In particular, our lower bound matches the bound of $ρ\le \frac{1}{2c-1}+o(1)$ for the $\ell_1$ space, obtained via the recent algorithm from [Andoni-Razenshteyn, STOC'15]. In recent years it emerged that data-dependent hashing is strictly superior to the classical Locality-Sensitive Hashing, when the hash function is data-independent. In the latter setting, the best exponent has been already known: for the $\ell_1$ space, the tight bound is $ρ=1/c$, with the upper bound from [Indyk-Motwani, STOC'98] and the matching lower bound from [O'Donnell-Wu-Zhou, ITCS'11]. We prove that, even if the hashing is data-dependent, it must hold that $ρ\ge \frac{1}{2c-1}-o(1)$. To prove the result, we need to formalize the exact notion of data-dependent hashing that also captures the complexity of the hash functions (in addition to their collision properties). Without restricting such complexity, we would allow for obviously infeasible solutions such as the Voronoi diagram of a dataset. To preclude such solutions, we require our hash functions to be succinct. This condition is satisfied by all the known algorithmic results.

preprint, 2014, arXiv

On Model-Based RIP-1 Matrices

The Restricted Isometry Property (RIP) is a fundamental property of a matrix enabling sparse recovery. Informally, an $m \times n$ matrix satisfies RIP of order $k$ in the $\ell_p$ norm if $\|Ax\|_p \approx \|x\|_p$ for any vector $x$ that is $k$-sparse, i.e., that has at most $k$ non-zeros. The minimal number of rows $m$ necessary for the property to hold has been extensively investigated, and tight bounds are known. Motivated by signal processing models, a recent work of Baraniuk et al. has generalized this notion to the case where the support of $x$ must belong to a given model, i.e., a given family of supports. This more general notion is much less understood, especially for norms other than $\ell_2$. In this paper we present tight bounds for the model-based RIP property in the $\ell_1$ norm. Our bounds hold for the two most frequently investigated models: tree-sparsity and block-sparsity. We also show implications of our results to sparse recovery problems.

preprint, 2013, arXiv

Beyond Locality-Sensitive Hashing

We present a new data structure for the $c$-approximate near neighbor problem (ANN) in the Euclidean space. For $n$ points in $\mathbb{R}^d$, our algorithm achieves $O(n^{ρ} + d \log n)$ query time and $O(n^{1 + ρ} + d \log n)$ space, where $ρ \le 7/(8c^2) + O(1 / c^3) + o(1)$. This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower bound proved by O'Donnell, Wu and Zhou (ICS 2011). By a standard reduction we obtain a data structure for the Hamming space and $\ell_1$ norm with $ρ \le 7/(8c) + O(1/c^{3/2}) + o(1)$, which is the first improvement over the result of Indyk and Motwani (STOC 1998).

preprint, 2013, arXiv

Separating Hierarchical and General Hub Labelings

In the context of distance oracles, a labeling algorithm computes vertex labels during preprocessing. An $s,t$ query computes the corresponding distance from the labels of $s$ and $t$ only, without looking at the input graph. Hub labels are a class of labels that has been extensively studied. Performance of the hub label query depends on the label size. Hierarchical labels are a natural special kind of hub labels. These labels are related to other problems and can be computed more efficiently. This brings up a natural question of the quality of hierarchical labels. We show that there is a gap: optimal hierarchical labels can be polynomially bigger than the general hub labels. To prove this result, we give tight upper and lower bounds on the size of hierarchical and general labels for hypercubes.
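A hub label query can be sketched in a few lines under the standard formulation, which this abstract does not spell out: each vertex stores distances to a set of hubs, and the $s,t$ distance is the minimum of $d(s,h) + d(h,t)$ over hubs $h$ common to both labels. The label layout (a dictionary per vertex) and the toy path graph are assumptions made for illustration.

```python
import math


def hub_query(label_s, label_t):
    """Distance from s to t using only their labels.

    Each label maps hub -> distance from the labeled vertex to that hub.
    Returns infinity when the labels share no hub.
    """
    common = label_s.keys() & label_t.keys()
    return min((label_s[h] + label_t[h] for h in common), default=math.inf)


# Toy labels for the unit-weight path a - b - c, using b as the shared hub.
labels = {
    "a": {"a": 0, "b": 1},
    "b": {"b": 0},
    "c": {"c": 0, "b": 1},
}
print(hub_query(labels["a"], labels["c"]))
```

The query cost grows with the number of hubs stored per vertex, which is why the abstract measures quality by label size.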

preprint, 2012, arXiv

Common information revisited

One of the main notions of information theory is the notion of mutual information in two messages (two random variables in Shannon information theory or two binary strings in algorithmic information theory). The mutual information in $x$ and $y$ measures how much the transmission of $x$ can be simplified if both the sender and the recipient know $y$ in advance. Gács and Körner gave an example where mutual information cannot be presented as common information (a third message easily extractable from both $x$ and $y$). Then this question was studied in the framework of algorithmic information theory by An. Muchnik and A. Romashchenko who found many other examples of this type. K. Makarychev and Yu. Makarychev found a new proof of Gács-Körner results by means of conditionally independent random variables. The question about the difference between mutual and common information can be studied quantitatively: for a given $x$ and $y$ we look for three messages $a$, $b$, $c$ such that $a$ and $c$ are enough to reconstruct $x$, while $b$ and $c$ are enough to reconstruct $y$. In this paper, we state and prove (using hypercontractivity of product spaces) a quantitative version of the Gács-Körner theorem; we study the tradeoff between $|a|, |b|, |c|$ for a random pair $(x, y)$ such that the Hamming distance between $x$ and $y$ is $\varepsilon n$ (our bounds are almost tight); and we construct "the worst possible" distribution on $(x, y)$ in terms of the tradeoff between $|a|, |b|, |c|$.

preprint, 2012, arXiv

On Epsilon-Nets, Distance Oracles, and Metric Embeddings

We give two new applications of an observation from [ADFGW11]. The first is an almost linear sized constant time data structure for reporting very large distances in undirected graphs. The second is a generic transformation of results about $\ell_1$-embeddability of metrics to a setting where we are interested in preservation of large distances only.

preprint, 2010, arXiv

A Linear Time Algorithm for Finding Three Edge-Disjoint Paths in Eulerian Networks

Consider an undirected graph $G = (VG, EG)$ and a set of six *terminals* $T = \{s_1, s_2, s_3, t_1, t_2, t_3\} \subseteq VG$. The goal is to find a collection $\mathcal{P}$ of three edge-disjoint paths $P_1$, $P_2$, and $P_3$, where $P_i$ connects nodes $s_i$ and $t_i$ ($i = 1, 2, 3$). Results obtained by Robertson and Seymour by graph minor techniques imply a polynomial time solvability of this problem. The time bound of their algorithm is $O(m^3)$ (hereinafter we assume $n := |VG|$, $m := |EG|$, $n = O(m)$). In this paper we consider a special, *Eulerian* case of $G$ and $T$. Namely, construct the *demand graph* $H = (VG, \{s_1t_1, s_2t_2, s_3t_3\})$. The edges of $H$ correspond to the desired paths in $\mathcal{P}$. In the Eulerian case the degrees of all nodes in the (multi-) graph $G + H$ ($ = (VG, EG \cup EH)$) are even. Schrijver showed that, under the assumption of Eulerianness, cut conditions provide a criterion for the existence of $\mathcal{P}$. This, in particular, implies that checking for existence of $\mathcal{P}$ can be done in $O(m)$ time. Our result is a combinatorial $O(m)$-time algorithm that constructs $\mathcal{P}$ (if the latter exists).
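The Eulerian condition defined above is easy to test directly: add the three demand edges to the supply graph and check that every node has even degree. The sketch below does only that precondition check, not the path-finding algorithm of the paper; the edge-list representation and the function name are assumptions.

```python
def is_eulerian_instance(n, supply_edges, demand_edges):
    """True if every node of the multigraph G + H has even degree.

    Nodes are numbered 0..n-1; each edge is a pair (u, v).
    """
    degree = [0] * n
    for u, v in list(supply_edges) + list(demand_edges):
        degree[u] += 1
        degree[v] += 1
    return all(d % 2 == 0 for d in degree)


# Terminals s1, s2, s3 are nodes 0, 1, 2 and t1, t2, t3 are nodes 3, 4, 5.
demands = [(0, 3), (1, 4), (2, 5)]
direct_links = [(0, 3), (1, 4), (2, 5)]          # each s_i joined to t_i
path_graph = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
print(is_eulerian_instance(6, direct_links, demands))
print(is_eulerian_instance(6, path_graph, demands))
```

The check runs in time linear in the number of edges, consistent with the $O(m)$ budget of the overall algorithm.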

preprint, 2010, arXiv

Not Every Domain of a Plain Decompressor Contains the Domain of a Prefix-Free One

C. Calude, A. Nies, L. Staiger, and F. Stephan posed the following question about the relation between plain and prefix Kolmogorov complexities (see their paper in DLT 2008 conference proceedings): does the domain of every optimal decompressor contain the domain of some optimal prefix-free decompressor? In this paper we provide a negative answer to this question.

preprint, 2010, arXiv

Triangle-Free 2-Matchings Revisited

A *2-matching* in an undirected graph $G = (VG, EG)$ is a function $f \colon EG \to \{0,1,2\}$ such that for each node $v \in VG$ the sum of values $f(e)$ on all edges $e$ incident to $v$ does not exceed 2. The *size* of $f$ is the sum $\sum_e f(e)$. If $\{e \in EG \mid f(e) \ne 0\}$ contains no triangles then $f$ is called *triangle-free*. Cornuéjols and Pulleyblank devised a combinatorial $O(mn)$-algorithm that finds a triangle-free 2-matching of maximum size (hereinafter $n := |VG|$, $m := |EG|$) and also established a min-max theorem. We claim that this approach is, in fact, superfluous by demonstrating how their results may be obtained directly from the Edmonds-Gallai decomposition. Applying the algorithm of Micali and Vazirani we are able to find a maximum triangle-free 2-matching in $O(m\sqrt{n})$-time. Also we give a short self-contained algorithmic proof of the min-max theorem. Next, we consider the case of regular graphs. It is well-known that every regular graph admits a perfect 2-matching. One can easily strengthen this result and prove that every $d$-regular graph (for $d \geq 3$) contains a perfect triangle-free 2-matching. We give the following algorithms for finding a perfect triangle-free 2-matching in a $d$-regular graph: an $O(n)$-algorithm for $d = 3$, an $O(m + n^{3/2})$-algorithm for $d = 2k$ ($k \ge 2$), and an $O(n^2)$-algorithm for $d = 2k + 1$ ($k \ge 2$). We also prove that there exists a constant $c > 1$ such that every 3-regular graph contains at least $c^n$ perfect triangle-free 2-matchings.
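The definitions at the start of this abstract translate directly into a validity checker. The sketch below verifies the three conditions (values in $\{0,1,2\}$, load at most 2 per node, no triangle in the support) by brute force; it does not compute a maximum matching, and the representation of $f$ as a dictionary keyed by node pairs is an assumption.

```python
from itertools import combinations


def is_triangle_free_2_matching(n, f):
    """Check that f, a dict mapping edges (u, v) to values, is a
    triangle-free 2-matching on nodes 0..n-1 of a simple graph."""
    if any(value not in (0, 1, 2) for value in f.values()):
        return False
    load = [0] * n
    for (u, v), value in f.items():
        load[u] += value
        load[v] += value
    if any(total > 2 for total in load):
        return False
    # Support: edges with a non-zero value, stored as unordered pairs.
    support = {frozenset(edge) for edge, value in f.items() if value != 0}
    for a, b, c in combinations(range(n), 3):
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= support:
            return False
    return True


def size(f):
    """Size of a 2-matching: the sum of its edge values."""
    return sum(f.values())


triangle = {(0, 1): 1, (1, 2): 1, (0, 2): 1}
four_cycle = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 1}
doubled_edges = {(0, 1): 2, (2, 3): 2}
print(is_triangle_free_2_matching(3, triangle))
print(is_triangle_free_2_matching(4, four_cycle), size(four_cycle))
print(is_triangle_free_2_matching(4, doubled_edges), size(doubled_edges))
```

The triangle example satisfies the degree constraint but fails the triangle-free condition, which is exactly the case that separates this problem from ordinary 2-matching.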
