Source author record

Ronitt Rubinfeld

Ronitt Rubinfeld appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Data Structures and Algorithms; math.PR; math.ST; Statistics Theory; Computational Complexity; Databases; Discrete Mathematics; Distributed, Parallel, and Cluster Computing; Machine Learning; math.CO

Catalog footprint

What is connected

21 works

10 topics

4 close collaborators


preprint 2022 arXiv

Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover

We present $O(\log\log n)$-round algorithms in the Massively Parallel Computation (MPC) model, with $\tilde{O}(n)$ memory per machine, that compute a maximal independent set, a $1+ε$ approximation of maximum matching, and a $2+ε$ approximation of minimum vertex cover, for any $n$-vertex graph and any constant $ε>0$. These improve the state of the art as follows:
- Our MIS algorithm leads to a simple $O(\log\log Δ)$-round MIS algorithm in the Congested Clique model of distributed computing, which improves on the $\tilde{O}(\sqrt{\log Δ})$-round algorithm of Ghaffari [PODC'17].
- Our $O(\log\log n)$-round $(1+ε)$-approximate maximum matching algorithm simplifies or improves on the following prior work: $O(\log^2\log n)$-round $(1+ε)$-approximation algorithm of Czumaj et al. [STOC'18] and $O(\log\log n)$-round $(1+ε)$-approximation algorithm of Assadi et al. [SODA'19].
- Our $O(\log\log n)$-round $(2+ε)$-approximate minimum vertex cover algorithm improves on an $O(\log\log n)$-round $O(1)$-approximation of Assadi et al. [arXiv'17].

preprint 2022 arXiv

Massively Parallel Algorithms for Small Subgraph Counting

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate. Given a graph $G=(V, E)$ with $n$ vertices, $m$ edges and $T$ triangles, our first result is an algorithm that outputs a $(1+\varepsilon)$-approximation to $T$, with asymptotically \emph{optimal round and total space complexity} provided any $S \geq \max{(\sqrt m, n^2/m)}$ space per machine and assuming $T=Ω(\sqrt{m/n})$. Our result gives a quadratic improvement on the bound on $T$ over previous works. We also provide a simple extension of our result to counting \emph{any} subgraph of $k$ size for constant $k \geq 1$. Our second result is an $O_{\varepsilon}(\log \log n)$-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the \emph{arboricity} $α$ of the input graph. We extend this result to exactly counting $k$-cliques for any constant $k$. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most $5$ can be implemented in the MPC model in total space.
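As a point of reference for the quantity $T$ in the abstract above, the following minimal sketch counts triangles exactly on a single machine. It is a sequential baseline and not the MPC algorithm from the paper; the function name `count_triangles` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in a simple undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Normalize edges so that each undirected edge is visited once.
    unique_edges = {(min(a, b), max(a, b)) for a, b in edges if a != b}

    total = 0
    for u, v in unique_edges:
        # Each common neighbor w closes a triangle {u, v, w}.
        total += len(adj[u] & adj[v])

    # Every triangle is found once for each of its three edges.
    return total // 3


# The complete graph on 4 vertices contains 4 triangles.
k4 = list(combinations(range(4), 2))
print(count_triangles(k4))
```

The per-edge neighbor intersection is the step that becomes expensive on massive graphs, which is where approximation and distributed space bounds matter.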

preprint 2022 arXiv

Triangle and Four Cycle Counting with Predictions in Graph Streams

We propose data-driven one-pass streaming algorithms for estimating the number of triangles and four cycles, two fundamental problems in graph analytics that are widely studied in the graph data stream literature. Recently, (Hsu 2018) and (Jiang 2020) applied machine learning techniques in other data stream problems, using a trained oracle that can predict certain properties of the stream elements to improve on prior "classical" algorithms that did not use oracles. In this paper, we explore the power of a "heavy edge" oracle in multiple graph edge streaming models. In the adjacency list model, we present a one-pass triangle counting algorithm improving upon the previous space upper bounds without such an oracle. In the arbitrary order model, we present algorithms for both triangle and four cycle estimation with fewer passes and the same space complexity as in previous algorithms, and we show several of these bounds are optimal. We analyze our algorithms under several noise models, showing that the algorithms perform well even when the oracle errs. Our methodology expands upon prior work on "classical" streaming algorithms, as previous multi-pass and random order streaming algorithms can be seen as special cases of our algorithms, where the first pass or random order was used to implement the heavy edge oracle. Lastly, our experiments demonstrate advantages of the proposed method compared to state-of-the-art streaming algorithms.

preprint 2021 arXiv

Local Access to Random Walks

For a graph $G$ on $n$ vertices, naively sampling the position of a random walk at time $t$ requires work $Ω(t)$. We desire local access algorithms supporting $\text{position}(G,s,t)$ queries, which return the position of a random walk from some start vertex $s$ at time $t$, where the joint distribution of returned positions is $1/\text{poly}(n)$ close to the uniform distribution over such walks in $\ell_1$ distance. We first give an algorithm for local access to walks on undirected regular graphs with $\widetilde{O}(\frac{1}{1-λ}\sqrt{n})$ runtime per query, where $λ$ is the second-largest eigenvalue in absolute value. Since random $d$-regular graphs are expanders with high probability, this gives an $\widetilde{O}(\sqrt{n})$ algorithm for $G(n,d)$, which improves on the naive method for small numbers of queries. We then prove that no algorithm with sub-constant error given probe access to random $d$-regular graphs can have runtime better than $Ω(\sqrt{n}/\log(n))$ per query in expectation, obtaining a nearly matching lower bound. We further show an $Ω(n^{1/4})$ runtime per query lower bound even with an oblivious adversary (i.e. when the query sequence is fixed in advance). We then show that for families of graphs with additional group theoretic structure, dramatically better results can be achieved. We give local access to walks on small-degree abelian Cayley graphs, including cycles and hypercubes, with runtime $\text{polylog}(n)$ per query. This also allows for efficient local access to walks on $\text{polylog}$ degree expanders. We extend our results to graphs constructed using the tensor product (giving local access to walks on degree $n^ε$ graphs for any $ε\in (0,1]$) and Cartesian product.
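To make the $Ω(t)$ baseline concrete, here is a minimal sketch of the naive method on a cycle, one of the graph families named in the abstract. The function `naive_position` and its signature are hypothetical; this is the step-by-step simulation the paper aims to avoid, not its local access algorithm.

```python
import random


def naive_position(n, s, t, rng):
    """Position at time t of a simple random walk on the n-cycle started at s.

    Simulates every step, so the work is proportional to t.
    """
    pos = s
    for _ in range(t):
        pos = (pos + rng.choice((-1, 1))) % n
    return pos


rng = random.Random(0)
print(naive_position(10, 3, 7, rng))
```

Note that two separate calls resample the walk from scratch, so their answers need not belong to a single walk. A local access algorithm has to keep answers to different queries jointly consistent, which is the harder part of the problem.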

preprint 2020 arXiv

Monotone probability distributions over the Boolean cube can be learned with sublinear samples

A probability distribution over the Boolean cube is monotone if flipping the value of a coordinate from zero to one can only increase the probability of an element. Given samples of an unknown monotone distribution over the Boolean cube, we give (to our knowledge) the first algorithm that learns an approximation of the distribution in statistical distance using a number of samples that is sublinear in the domain. To do this, we develop a structural lemma describing monotone probability distributions. The structural lemma has further implications to the sample complexity of basic testing tasks for analyzing monotone probability distributions over the Boolean cube: We use it to give nontrivial upper bounds on the tasks of estimating the distance of a monotone distribution to uniform and of estimating the support size of a monotone distribution. In the setting of monotone probability distributions over the Boolean cube, our algorithms are the first to have sample complexity lower than known lower bounds for the same testing tasks on arbitrary (not necessarily monotone) probability distributions. One further consequence of our learning algorithm is an improved sample complexity for the task of testing whether a distribution on the Boolean cube is monotone.
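The monotonicity condition in the first sentence of the abstract can be checked directly when the whole distribution is known. The sketch below does this by brute force over the cube; `is_monotone` and `product_distribution` are illustrative names, and the check reads all $2^n$ probabilities, so it says nothing about the sublinear-sample learning algorithm itself.

```python
from itertools import product
from math import prod


def is_monotone(prob, n):
    """prob maps each n-bit tuple to its probability.

    Returns True if flipping any coordinate from 0 to 1 never decreases
    the probability of the element.
    """
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if prob[y] < prob[x]:
                    return False
    return True


def product_distribution(n, p):
    """Each bit is 1 independently with probability p."""
    return {
        x: prod(p if bit else 1 - p for bit in x)
        for x in product((0, 1), repeat=n)
    }


print(is_monotone(product_distribution(3, 0.75), 3))
print(is_monotone(product_distribution(3, 0.25), 3))
```

A product distribution whose bits favor 1 is monotone, while one whose bits favor 0 is not, which gives two easy sanity cases.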

preprint 2020 arXiv

Online Page Migration with ML Advice

We consider online algorithms for the {\em page migration problem} that use predictions, potentially imperfect, to improve their performance. The best known online algorithms for this problem, due to Westbrook'94 and Bienkowski et al'17, have competitive ratios strictly bounded away from 1. In contrast, we show that if the algorithm is given a prediction of the input sequence, then it can achieve a competitive ratio that tends to $1$ as the prediction error rate tends to $0$. Specifically, the competitive ratio is equal to $1+O(q)$, where $q$ is the prediction error rate. We also design a ``fallback option'' that ensures that the competitive ratio of the algorithm for {\em any} input sequence is at most $O(1/q)$. Our result adds to the recent body of work that uses machine learning to improve the performance of ``classic'' algorithms.

preprint 2020 arXiv

Rapid Approximate Aggregation with Distribution-Sensitive Interval Guarantees

Aggregating data is fundamental to data analytics, data exploration, and OLAP. Approximate query processing (AQP) techniques are often used to accelerate computation of aggregates using samples, for which confidence intervals (CIs) are widely used to quantify the associated error. CIs used in practice fall into two categories: techniques that are tight but not correct, i.e., they yield tight intervals but only offer asymptotic guarantees, making them unreliable, or techniques that are correct but not tight, i.e., they offer rigorous guarantees, but are overly conservative, leading to confidence intervals that are too loose to be useful. In this paper, we develop a CI technique that is both correct and tighter than traditional approaches. Starting from conservative CIs, we identify two issues they often face: pessimistic mass allocation (PMA) and phantom outlier sensitivity (PHOS). By developing a novel range-trimming technique for eliminating PHOS and pairing it with known CI techniques without PMA, we develop a technique for computing CIs with strong guarantees that requires fewer samples for the same width. We implement our techniques underneath a sampling-optimized in-memory column store and show how to accelerate queries involving aggregates on a real dataset with speedups of up to 124x over traditional AQP-with-guarantees and more than 1000x over exact methods.
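For readers unfamiliar with the "correct but not tight" category, Hoeffding's inequality gives one standard example of such a confidence interval for the mean of bounded values. The sketch below assumes independent samples drawn with replacement from values known to lie in `[lo, hi]`; the abstract does not say which conservative techniques the paper starts from, so using Hoeffding here is an assumption.

```python
import math


def hoeffding_interval(samples, lo, hi, delta):
    """Two-sided (1 - delta) confidence interval for the mean.

    Valid for any distribution supported on [lo, hi], which is why the
    interval is often much wider than the data would suggest.
    """
    n = len(samples)
    mean = sum(samples) / n
    half_width = (hi - lo) * math.sqrt(math.log(2 / delta) / (2 * n))
    return mean - half_width, mean + half_width


low, high = hoeffding_interval([0.5] * 100, 0.0, 1.0, 0.05)
print(low, high)
```

The width depends only on the declared range and the sample count, not on the observed spread, so a single loose bound on the range inflates the interval even when every sample is identical.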

preprint 2016 arXiv

Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation

We study the problem of estimating the value of sums of the form $S_p \triangleq \sum \binom{x_i}{p}$ when one has the ability to sample $x_i \geq 0$ with probability proportional to its magnitude. When $p=2$, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when $\{x_i\}$ is the degree sequence of a graph, which corresponds to counting the number of $p$-stars in a graph when one has the ability to sample edges randomly. Our algorithm for a $(1 \pm \varepsilon)$-multiplicative approximation of $S_p$ has query and time complexities $O(\frac{m \log \log n}{ε^2 S_p^{1/p}})$. Here, $m=\sum x_i/2$ is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, $n$ is the number of vertices in the graph and the number of unique values in the database table. We also provide tight lower bounds (up to polylogarithmic factors) in almost all cases, even when $\{x_i\}$ is a degree sequence and one is allowed to use the structure of the graph to try to get a better estimate. We are not aware of any prior lower bounds on the problem of join selectivity estimation. For the graph problem, prior work which assumed the ability to sample only \emph{vertices} uniformly gave algorithms with matching lower bounds [Gonen, Ron, and Shavitt. \textit{SIAM J. Comput.}, 25 (2011), pp. 1365-1411]. With the ability to sample edges randomly, we show that one can achieve faster algorithms for approximating the number of star subgraphs, bypassing the lower bounds in this prior work. For example, in the regime where $S_p\leq n$, and $p=2$, our upper bound is $\tilde{O}(n/S_p^{1/2})$, in contrast to their $Ω(n/S_p^{1/3})$ lower bound when no random edge queries are available.
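The quantity $S_p$ is easy to compute exactly when all $x_i$ are available, which helps in reading the bounds above. The sketch below evaluates it for a small degree sequence; `star_sum` is an illustrative name, and the code reads every $x_i$, so it is not the sublinear estimator from the paper.

```python
from math import comb


def star_sum(xs, p):
    """Exact value of S_p = sum over i of C(x_i, p)."""
    return sum(comb(x, p) for x in xs)


# Degree sequence of a star with one center and three leaves.
degrees = [3, 1, 1, 1]
print(star_sum(degrees, 2))  # number of 2-stars (paths of length two)
print(star_sum(degrees, 1))  # sum of degrees, i.e. 2m
```

Read as a database table, each $x_i$ is the number of rows sharing one value, and $S_2$ counts the unordered pairs of distinct rows that agree on that value.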

preprint 2016 arXiv

Testing Shape Restrictions of Discrete Distributions

We study the question of testing structured properties (classes) of discrete distributions. Specifically, given sample access to an arbitrary distribution $D$ over $[n]$ and a property $\mathcal{P}$, the goal is to distinguish between $D\in\mathcal{P}$ and $\ell_1(D,\mathcal{P})>\varepsilon$. We develop a general algorithm for this question, which applies to a large range of "shape-constrained" properties, including monotone, log-concave, $t$-modal, piecewise-polynomial, and Poisson Binomial distributions. Moreover, for all cases considered, our algorithm has near-optimal sample complexity with regard to the domain size and is computationally efficient. For most of these classes, we provide the first non-trivial tester in the literature. In addition, we also describe a generic method to prove lower bounds for this problem, and use it to show our upper bounds are nearly tight. Finally, we extend some of our techniques to tolerant testing, deriving nearly-tight upper and lower bounds for the corresponding questions.

preprint 2015 arXiv

A Self-Tester for Linear Functions over the Integers with an Elementary Proof of Correctness

We present simple, self-contained proofs of correctness for algorithms for linearity testing and program checking of linear functions on finite subsets of integers represented as n-bit numbers. In addition we explore a generalization of self-testing to homomorphisms on a multidimensional vector space. We show that our self-testing algorithm for the univariate case can be directly generalized to vector space domains. The number of queries made by our algorithms is independent of domain size.
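A linearity test of the kind discussed above samples random pairs and checks the homomorphism identity on them. The sketch below works over the cyclic group of integers modulo `m`, which is a simplifying assumption: the paper treats finite subsets of the integers, where $x+y$ can leave the domain. The name `linearity_test` and its parameters are hypothetical.

```python
import random


def linearity_test(f, m, trials, rng):
    """Accept iff f(x) + f(y) == f(x + y) (mod m) on every sampled pair.

    The number of queries is 3 * trials, independent of m.
    """
    for _ in range(trials):
        x = rng.randrange(m)
        y = rng.randrange(m)
        if (f(x) + f(y)) % m != f((x + y) % m) % m:
            return False
    return True


m = 101
rng = random.Random(0)
print(linearity_test(lambda x: (3 * x) % m, m, 50, rng))
print(linearity_test(lambda x: (x * x) % m, m, 50, rng))
```

A linear function always passes. The squaring map satisfies the identity only when $2xy \equiv 0 \pmod{m}$, so with 50 random pairs it is rejected except with negligible probability.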

preprint 2015 arXiv

Constructing Near Spanning Trees with Few Local Inspections

Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. Motivated by several recent studies of local graph algorithms, we consider the following variant of this problem. Let G be a connected bounded-degree graph. Given an edge $e$ in $G$ we would like to decide whether $e$ belongs to a connected subgraph $G'$ consisting of $(1+ε)n$ edges (for a prespecified constant $ε>0$), where the decision for different edges should be consistent with the same subgraph $G'$. Can this task be performed by inspecting only a {\em constant} number of edges in $G$? Our main results are: (1) We show that if every $t$-vertex subgraph of $G$ has expansion $1/(\log t)^{1+o(1)}$ then one can (deterministically) construct a sparse spanning subgraph $G'$ of $G$ using few inspections. To this end we analyze a "local" version of a famous minimum-weight spanning tree algorithm. (2) We show that the above expansion requirement is sharp even when allowing randomization. To this end we construct a family of $3$-regular graphs of high girth, in which every $t$-vertex subgraph has expansion $1/(\log t)^{1-o(1)}$.

preprint 2015 arXiv

Local Computation Algorithms for Graphs of Non-Constant Degrees

In the model of \emph{local computation algorithms} (LCAs), we aim to compute the queried part of the output by examining only a small (sublinear) portion of the input. Many recently developed LCAs on graph problems achieve time and space complexities with very low dependence on $n$, the number of vertices. Nonetheless, these complexities are generally at least exponential in $d$, the upper bound on the degree of the input graph. Instead, we consider the case where parameter $d$ can be moderately dependent on $n$, and aim for complexities with subexponential dependence on $d$, while maintaining polylogarithmic dependence on $n$. We present: a randomized LCA for computing maximal independent sets whose time and space complexities are quasi-polynomial in $d$ and polylogarithmic in $n$; for constant $ε> 0$, a randomized LCA that provides a $(1-ε)$-approximation to maximum matching whose time and space complexities are polynomial in $d$ and polylogarithmic in $n$.

preprint 2014 arXiv

Rapid Sampling for Visualizations with Ordering Guarantees

Visualizations are frequently used as a means to understand trends and gather insights from datasets, but often take a long time to generate. In this paper, we focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. Our primary focus will be on sampling algorithms that preserve the visual property of ordering; our techniques will also apply to some other visual properties. For instance, our algorithms can be used to generate an approximate visualization of a bar chart very rapidly, where the comparisons between any two bars are correct. We formally show that our sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate the visualizations with ordering guarantees. They also work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes.

preprint 2014 arXiv

Testing probability distributions underlying aggregated data

In this paper, we analyze and study a hybrid model for testing and learning probability distributions. Here, in addition to samples, the testing algorithm is provided with one of two different types of oracles to the unknown distribution $D$ over $[n]$. More precisely, we define both the dual and cumulative dual access models, in which the algorithm $A$ can both sample from $D$ and respectively, for any $i\in[n]$,
- query the probability mass $D(i)$ (query access); or
- get the total mass of $\{1,\dots,i\}$, i.e. $\sum_{j=1}^i D(j)$ (cumulative access)
These two models, by generalizing the previously studied sampling and query oracle models, allow us to bypass the strong lower bounds established for a number of problems in these settings, while capturing several interesting aspects of these problems -- and providing new insight on the limitations of the models. Finally, we show that while the testing algorithms can be in most cases strictly more efficient, some tasks remain hard even with this additional power.
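The two access models described above can be pictured as a small interface over a known probability vector. The sketch below bundles sampling, query access and cumulative access into one class for brevity; the class name `DualAccessOracle` and its method names are assumptions, and in the paper's models an algorithm would get sampling plus only one of the two other operations.

```python
import bisect
import itertools
import random


class DualAccessOracle:
    """Oracle for a distribution D over {1, ..., n} given as a list of masses."""

    def __init__(self, probs, rng):
        self.probs = list(probs)
        self.cdf = list(itertools.accumulate(self.probs))
        self.rng = rng

    def sample(self):
        """Draw i with probability D(i) by inverting the cumulative sums."""
        u = self.rng.random() * self.cdf[-1]
        idx = bisect.bisect_right(self.cdf, u)
        # Guard against floating-point edge cases at the upper end.
        return min(idx, len(self.probs) - 1) + 1

    def query(self, i):
        """Query access: the probability mass D(i)."""
        return self.probs[i - 1]

    def cumulative(self, i):
        """Cumulative access: the total mass of {1, ..., i}."""
        return self.cdf[i - 1]


oracle = DualAccessOracle([0.5, 0.25, 0.25], random.Random(1))
print(oracle.sample(), oracle.query(2), oracle.cumulative(2))
```

Cumulative access is only meaningful when the domain has a natural order, which is why it is stated for $\{1,\dots,i\}$ rather than for arbitrary subsets.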

preprint 2013 arXiv

A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support

We present a simple adaptation of the Lempel-Ziv '78 (LZ78) compression scheme ({\em IEEE Transactions on Information Theory, 1978}) that supports efficient random access to the input string. Namely, given query access to the compressed string, it is possible to efficiently recover any symbol of the input string. The compression algorithm is given as input a parameter $\varepsilon >0$, and with very high probability increases the length of the compressed string by at most a factor of $(1+\varepsilon)$. The access time is $O(\log n + 1/\varepsilon^2)$ in expectation, and $O(\log n/\varepsilon^2)$ with high probability. The scheme relies on sparse transitive-closure spanners. Any (consecutive) substring of the input string can be retrieved at an additional additive cost in the running time of the length of the substring. We also formally establish the necessity of modifying LZ78 so as to allow efficient random access. Specifically, we construct a family of strings for which $Ω(n/\log n)$ queries to the LZ78-compressed string are required in order to recover a single symbol in the input string. The main benefit of the proposed scheme is that it preserves the online nature and simplicity of LZ78, and that for {\em every} input string, the length of the compressed string is only a small factor larger than that obtained by running LZ78.
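For context, the unmodified LZ78 scheme that the paper adapts can be sketched in a few lines. The code below is the textbook parse into (phrase index, next character) pairs, not the random-access variant from the paper; the function names and the use of an empty character to flush a trailing phrase are choices made for this illustration.

```python
def lz78_compress(text):
    """Standard LZ78 parse: a list of (phrase_index, next_char) pairs.

    Index 0 denotes the empty phrase; phrase k is the k-th pair's expansion.
    """
    dictionary = {"": 0}
    output = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch
        else:
            output.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Flush a trailing phrase that is already in the dictionary.
        output.append((dictionary[current], ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the text by expanding each pair against earlier phrases."""
    phrases = [""]
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)


pairs = lz78_compress("abababab")
print(pairs)
print(lz78_decompress(pairs))
```

Each pair points back to an earlier phrase, so finding the symbol at a given position in the original string means locating the right phrase and then following a chain of back-pointers. That chain is what makes random access costly in plain LZ78.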

preprint 2013 arXiv

Local reconstructors and tolerant testers for connectivity and diameter

A local property reconstructor for a graph property is an algorithm which, given oracle access to the adjacency list of a graph that is "close" to having the property, provides oracle access to the adjacency matrix of a "correction" of the graph, i.e. a graph which has the property and is close to the given graph. For this model, we achieve local property reconstructors for the properties of connectivity and $k$-connectivity in undirected graphs, and the property of strong connectivity in directed graphs. Along the way, we present a method of transforming a local reconstructor (which acts as an "adjacency matrix oracle" for the corrected graph) into an "adjacency list oracle". This allows us to recursively use our local reconstructor for $(k-1)$-connectivity to obtain a local reconstructor for $k$-connectivity. We also extend this notion of local property reconstruction to parametrized graph properties (for instance, having diameter at most $D$ for some parameter $D$) and require that the corrected graph has the property with parameter close to the original. We obtain a local reconstructor for the low diameter property, where if the original graph is close to having diameter $D$, then the corrected graph has diameter roughly $2D$. We also exploit a connection between local property reconstruction and property testing, observed by Brakerski, to obtain new tolerant property testers for all of the aforementioned properties. Except for the one for connectivity, these are the first tolerant property testers for these properties.

preprint 2011 arXiv

A Near-Optimal Sublinear-Time Algorithm for Approximating the Minimum Vertex Cover Size

We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate VC_estimate(G) such that VC_opt(G) <= VC_estimate(G) <= 2 * VC_opt(G) + epsilon*n, where epsilon is a given additive approximation parameter. We refer to such an estimate as a (2,epsilon)-estimate. The query complexity and running time of the algorithm are ~O(avg_deg * poly(1/epsilon)), where avg_deg denotes the average vertex degree in the graph. The best previously known sublinear algorithm, of Yoshida et al. (STOC 2009), has query complexity and running time O(d^4/epsilon^2), where d is the maximum degree in the graph. Given the lower bound of Omega(avg_deg) (for constant epsilon) for obtaining such an estimate (with any constant multiplicative factor) due to Parnas and Ron (TCS 2007), our result is nearly optimal. In the case that the graph is dense, that is, the number of edges is Theta(n^2), we consider another model, in which the algorithm may ask, for any pair of vertices u and v, whether there is an edge between u and v. We show how to adapt the algorithm that uses neighbor queries to this model and obtain an algorithm that outputs a (2,epsilon)-estimate of the size of a minimum vertex cover whose query complexity and running time are ~O(n) * poly(1/epsilon).
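The factor 2 in a (2,epsilon)-estimate has a classical explanation: the endpoints of any maximal matching form a vertex cover of size at most 2 * VC_opt(G). The sketch below builds such a cover greedily. It reads every edge, so it runs in linear rather than sublinear time and is not the paper's algorithm; the function name `matching_vertex_cover` is an assumption.

```python
def matching_vertex_cover(edges):
    """Vertex cover formed by the endpoints of a greedy maximal matching."""
    cover = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        if u not in cover and v not in cover:
            # Edge (u, v) joins the matching; take both endpoints.
            cover.update((u, v))
    return cover


# Path 0 - 1 - 2 - 3: an optimal cover is {1, 2}.
path = [(0, 1), (1, 2), (2, 3)]
print(sorted(matching_vertex_cover(path)))
```

Any cover must contain at least one endpoint of each matched edge, and matched edges are disjoint, so the optimum is at least the matching size while this cover is exactly twice that size.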

preprint 2011 arXiv

Approximating the Influence of a monotone Boolean function in O(\sqrt{n}) query complexity

The {\em Total Influence} ({\em Average Sensitivity}) of a discrete function is one of its fundamental measures. We study the problem of approximating the total influence of a monotone Boolean function $f: \{0,1\}^n \to \{0,1\}$, which we denote by $I[f]$. We present a randomized algorithm that approximates the influence of such functions to within a multiplicative factor of $(1\pm \varepsilon)$ by performing $O(\frac{\sqrt{n}\log n}{I[f]} \mathrm{poly}(1/\varepsilon))$ queries. We also prove a lower bound of $Ω(\frac{\sqrt{n}}{\log n \cdot I[f]})$ on the query complexity of any constant-factor approximation algorithm for this problem (which holds for $I[f] = Ω(1)$), hence showing that our algorithm is almost optimal in terms of its dependence on $n$. For general functions we give a lower bound of $Ω(\frac{n}{I[f]})$, which matches the complexity of a simple sampling algorithm.
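The total influence is the sum over coordinates of the probability that flipping that coordinate changes the function value. The sketch below computes it exactly by enumeration and also estimates it by sampling a random point and a random coordinate. The abstract does not spell out its "simple sampling algorithm", so presenting this estimator as that algorithm is an assumption, and the function names are hypothetical.

```python
import random
from itertools import product


def total_influence_exact(f, n):
    """Sum over i of Pr_x[f(x) != f(x with bit i flipped)], by enumeration."""
    flips = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = x[:i] + (1 - x[i],) + x[i + 1:]
            if f(x) != f(y):
                flips += 1
    return flips / 2 ** n


def total_influence_sampled(f, n, samples, rng):
    """Estimate I[f] as n times the fraction of sensitive (x, i) pairs."""
    hits = 0
    for _ in range(samples):
        x = tuple(rng.randrange(2) for _ in range(n))
        i = rng.randrange(n)
        y = x[:i] + (1 - x[i],) + x[i + 1:]
        if f(x) != f(y):
            hits += 1
    return n * hits / samples


def majority3(x):
    return int(sum(x) >= 2)


print(total_influence_exact(lambda x: x[0], 3))
print(total_influence_exact(majority3, 3))
print(total_influence_sampled(majority3, 3, 2000, random.Random(0)))
```

A uniformly random pair is sensitive with probability $I[f]/n$, so on the order of $n/I[f]$ samples are needed before any sensitive pair is seen at all, which is consistent with the general lower bound quoted above.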

preprint 2011 arXiv

Fast Local Computation Algorithms

For input $x$, let $F(x)$ denote the set of outputs that are the "legal" answers for a computational problem $F$. Suppose $x$ and members of $F(x)$ are so large that there is not time to read them in their entirety. We propose a model of {\em local computation algorithms} which for a given input $x$, support queries by a user to values of specified locations $y_i$ in a legal output $y \in F(x)$. When more than one legal output $y$ exists for a given $x$, the local computation algorithm should output in a way that is consistent with at least one such $y$. Local computation algorithms are intended to distill the common features of several concepts that have appeared in various algorithmic subfields, including local distributed computation, local algorithms, locally decodable codes, and local reconstruction. We develop a technique, based on known constructions of small sample spaces of $k$-wise independent random variables and Beck's analysis in his algorithmic approach to the Lovász Local Lemma, which under certain conditions can be applied to construct local computation algorithms that run in {\em polylogarithmic} time and space. We apply this technique to maximal independent set computations, scheduling radio network broadcasts, hypergraph coloring and satisfying $k$-SAT formulas.
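The query interface of a local computation algorithm can be illustrated with maximal independent set, one of the applications listed above. The sketch below answers "is vertex v in the set?" by simulating a greedy algorithm over a fixed pseudorandom vertex order and recursing only into lower-ranked neighbors. This random-ranking idea is a well-known simplification and is not the technique of the paper; using SHA-256 as a stand-in for a bounded-independence hash family, the memo table, and the name `LocalMIS` are all assumptions.

```python
import hashlib


class LocalMIS:
    """Answers membership queries for one fixed maximal independent set."""

    def __init__(self, adj, seed):
        self.adj = adj    # dict: vertex -> iterable of neighbors
        self.seed = seed  # fixes the vertex order, hence the output set
        self.memo = {}    # cache only; answers do not depend on query order

    def rank(self, v):
        digest = hashlib.sha256(f"{self.seed}:{v}".encode()).hexdigest()
        return int(digest, 16)

    def in_mis(self, v):
        """v is in the set iff no lower-ranked neighbor is in the set."""
        if v not in self.memo:
            r = self.rank(v)
            self.memo[v] = all(
                not self.in_mis(u) for u in self.adj[v] if self.rank(u) < r
            )
        return self.memo[v]


cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
lca = LocalMIS(cycle, seed=42)
print([v for v in cycle if lca.in_mis(v)])
```

Because every answer is a deterministic function of the graph and the seed, different queries agree with a single legal output, which is the consistency requirement stated in the abstract. The cache here uses extra space and is only a convenience for the demonstration.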

preprint 2011 arXiv

Space-efficient Local Computation Algorithms

Recently Rubinfeld et al. (ICS 2011, pp. 223--238) proposed a new model of sublinear algorithms called \emph{local computation algorithms}. In this model, a computation problem $F$ may have more than one legal solution and each of them consists of many bits. The local computation algorithm for $F$ should answer in an online fashion, for any index $i$, the $i^{\mathrm{th}}$ bit of some legal solution of $F$. Further, all the answers given by the algorithm should be consistent with at least one solution of $F$. In this work, we continue the study of local computation algorithms. In particular, we develop a technique which under certain conditions can be applied to construct local computation algorithms that run not only in polylogarithmic time but also in polylogarithmic \emph{space}. Moreover, these local computation algorithms are easily parallelizable and can answer all parallel queries consistently. Our main technical tools are pseudorandom numbers with bounded independence and the theory of branching processes.

preprint 2010 arXiv

Testing Closeness of Discrete Distributions

Given samples from two distributions over an $n$-element set, we wish to test whether these distributions are statistically close. We present an algorithm which uses sublinear in $n$, specifically, $O(n^{2/3}ε^{-8/3}\log n)$, independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than $\max\{ε^{4/3}n^{-1/3}/32, εn^{-1/2}/4\}$) or large (more than $ε$) in $\ell_1$ distance. This result can be compared to the lower bound of $Ω(n^{2/3}ε^{-2/3})$ for this problem given by Valiant. Our algorithm has applications to the problem of testing whether a given Markov process is rapidly mixing. We present sublinear algorithms for several variants of this problem as well.
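As a reference point for the $\ell_1$ distance used above, the sketch below computes the plug-in estimate: the $\ell_1$ distance between the two empirical distributions. This is a naive baseline rather than the tester from the paper, and it is generally accurate only when the number of samples is large relative to $n$. The function name `empirical_l1` is an assumption.

```python
from collections import Counter


def empirical_l1(samples_p, samples_q):
    """l1 distance between the empirical distributions of two sample lists."""
    counts_p = Counter(samples_p)
    counts_q = Counter(samples_q)
    support = set(counts_p) | set(counts_q)
    return sum(
        abs(counts_p[x] / len(samples_p) - counts_q[x] / len(samples_q))
        for x in support
    )


print(empirical_l1([1, 1, 2, 2], [1, 1, 2, 2]))
print(empirical_l1([1, 1], [2, 2]))
```

The value ranges from 0 for identical empirical distributions to 2 for distributions with disjoint supports, matching the range of the $\ell_1$ distance between probability vectors.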
