Researcher profile

Rafail Ostrovsky

Rafail Ostrovsky contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
4 topics
4 close collaborators

Published work

7 published item(s)

Preprint, 2013, arXiv

How Hard is Counting Triangles in the Streaming Model

The problem of (approximately) counting the number of triangles in a graph is one of the basic problems in graph theory. In this paper we study the problem in the streaming model. We study the amount of memory required by a randomized algorithm to solve this problem. In case the algorithm is allowed one pass over the stream, we present a best possible lower bound of $Ω(m)$ for graphs $G$ with $m$ edges on $n$ vertices. If a constant number of passes is allowed, we show a lower bound of $Ω(m/T)$, $T$ the number of triangles. We match, in some sense, this lower bound with a 2-pass $O(m/T^{1/3})$-memory algorithm that solves the problem of distinguishing graphs with no triangles from graphs with at least $T$ triangles. We present a new graph parameter $ρ(G)$ -- the triangle density, and conjecture that the space complexity of the triangles problem is $Ω(m/ρ(G))$. We match this by a second algorithm that solves the distinguishing problem using $O(m/ρ(G))$-memory.
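For orientation, the quantity $T$ in the abstract above can be computed exactly by a baseline that stores the whole graph, which costs memory linear in $m$. The sketch below is only that baseline, not either of the paper's streaming algorithms; the function name `count_triangles` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count T of an undirected simple graph.

    Stores every edge, so memory grows linearly with m. This is the
    naive baseline, not a streaming algorithm.
    """
    adj = {}
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        unique_edges.add((u, v) if u <= v else (v, u))

    # Each triangle is found once from each of its three edges.
    total = sum(len(adj[u] & adj[v]) for u, v in unique_edges)
    return total // 3


# The complete graph on 4 vertices has 4 triangles.
k4 = list(combinations(range(4), 2))
print(count_triangles(k4))  # 4
```

The lower bounds in the abstract concern how far below this linear memory cost a randomized streaming algorithm can go.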

Preprint, 2013, arXiv

Secure End-to-End Communication with Optimal Throughput in Unreliable Networks

We demonstrate the feasibility of end-to-end communication in highly unreliable networks. Modeling a network as a graph with vertices representing nodes and edges representing the links between them, we consider two forms of unreliability: unpredictable edge-failures, and deliberate deviation from protocol specifications by corrupt nodes. We present a robust routing protocol for end-to-end communication that is simultaneously resilient to both forms of unreliability. In particular, we prove rigorously that our protocol is SECURE against the actions of the corrupt nodes, achieves correctness (Receiver gets ALL of the messages from Sender, in order and without modification), and enjoys provably optimal throughput performance, as measured using competitive analysis. Furthermore, our protocol does not incur any asymptotic memory overhead as compared to other protocols that are unable to handle malicious interference of corrupt nodes. In particular, our protocol requires O(n^2) memory per processor, where n is the size of the network. This represents an O(n^2) improvement over all existing protocols that have been designed for this network model.

Preprint, 2012, arXiv

Approximating Large Frequency Moments with Pick-and-Drop Sampling

Given data stream $D = \{p_1,p_2,...,p_m\}$ of size $m$ of numbers from $\{1,..., n\}$, the frequency of $i$ is defined as $f_i = |\{j: p_j = i\}|$. The $k$-th \emph{frequency moment} of $D$ is defined as $F_k = \sum_{i=1}^n f_i^k$. We consider the problem of approximating frequency moments in insertion-only streams for $k\ge 3$. For any constant $c$ we show an $O(n^{1-2/k}\log(n)\log^{(c)}(n))$ upper bound on the space complexity of the problem. Here $\log^{(c)}(n)$ is the iterative $\log$ function. To simplify the presentation, we make the following assumptions: $n$ and $m$ are polynomially far; approximation error $ε$ and parameter $k$ are constants. We observe a natural bijection between streams and special matrices. Our main technical contribution is a non-uniform sampling method on matrices. We call our method a \emph{pick-and-drop sampling}; it samples a heavy element (i.e., element $i$ with frequency $Ω(F_k)$) with probability $Ω(1/n^{1-2/k})$ and gives approximation $\tilde{f_i} \ge (1-ε)f_i$. In addition, the estimations never exceed the real values, that is $ \tilde{f_j} \le f_j$ for all $j$. As a result, we reduce the space complexity of finding a heavy element to $O(n^{1-2/k}\log(n))$ bits. We apply our method of recursive sketches and resolve the problem with $O(n^{1-2/k}\log(n)\log^{(c)}(n))$ bits.
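The definitions of $f_i$ and $F_k$ in the abstract above can be made concrete with an exact computation that keeps one counter per distinct item. This is a reference calculation only, not pick-and-drop sampling; the name `frequency_moment` is an assumption chosen for this example.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Exact k-th frequency moment F_k = sum_i f_i^k.

    Uses one counter per distinct item, which is the memory cost
    that sublinear-space streaming algorithms aim to avoid.
    """
    counts = Counter(stream)  # f_i for every item i that appears
    return sum(f ** k for f in counts.values())


stream = [1, 2, 2, 3, 3, 3]  # f_1 = 1, f_2 = 2, f_3 = 3
print(frequency_moment(stream, 3))  # 1 + 8 + 27 = 36
```

For large $k$ the sum is dominated by the most frequent items, which is why locating a heavy element is the central subproblem.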

Preprint, 2012, arXiv

Near-Optimal Radio Use For Wireless Network Synchronization

We consider the model of communication where wireless devices can either switch their radios off to save energy, or switch their radios on and engage in communication. We distill a clean theoretical formulation of this problem of minimizing radio use and present near-optimal solutions. Our base model ignores issues of communication interference, although we also extend the model to handle this requirement. We assume that nodes intend to communicate periodically, or according to some time-based schedule. Clearly, perfectly synchronized devices could switch their radios on for exactly the minimum periods required by their joint schedules. The main challenge in the deployment of wireless networks is to synchronize the devices' schedules, given that their initial schedules may be offset relative to one another (even if their clocks run at the same speed). We significantly improve previous results, and show optimal use of the radio for two processors and near-optimal use of the radio for synchronization of an arbitrary number of processors. In particular, for two processors we prove deterministically matching $Θ(\sqrt{n})$ upper and lower bounds on the number of times the radio has to be on, where $n$ is the discretized uncertainty period of the clock shift between the two processors. (In contrast, all previous results for two processors are randomized.) For $m=n^β$ processors (for any $β< 1$) we prove $Ω(n^{(1-β)/2})$ is the lower bound on the number of times the radio has to be switched on (per processor), and show a nearly matching (in terms of the radio use) $\tilde{O}(n^{(1-β)/2})$ randomized upper bound per processor, with failure probability exponentially close to 0. For $β\geq 1$ our algorithm runs with at most $\mathrm{polylog}(n)$ radio invocations per processor. Our bounds also hold in a radio-broadcast model where interference must be taken into account.

Preprint, 2010, arXiv

AMS Without 4-Wise Independence on Product Domains

In their seminal work, Alon, Matias, and Szegedy introduced several sketching techniques, including showing that 4-wise independence is sufficient to obtain good approximations of the second frequency moment. In this work, we show that their sketching technique can be extended to product domains $[n]^k$ by using the product of 4-wise independent functions on $[n]$. Our work extends that of Indyk and McGregor, who showed the result for $k = 2$. Their primary motivation was the problem of identifying correlations in data streams. In their model, a stream of pairs $(i,j) \in [n]^2$ arrive, giving a joint distribution $(X,Y)$, and they find approximation algorithms for how close the joint distribution is to the product of the marginal distributions under various metrics, which naturally corresponds to how close $X$ and $Y$ are to being independent. By using our technique, we obtain a new result for the problem of approximating the $\ell_2$ distance between the joint distribution and the product of the marginal distributions for $k$-ary vectors, instead of just pairs, in a single pass. Our analysis gives a randomized algorithm that is a $(1 \pm ε)$ approximation (with probability $1-δ$) that requires space logarithmic in $n$ and $m$ and proportional to $3^k$.
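The idea of taking a product of 4-wise independent functions on $[n]$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it estimates only the second frequency moment of a stream of $k$-tuples, not the $\ell_2$ distance to the product of marginals. The names `FourWiseSign`, `ProductSign` and `ams_f2`, the degree-3 polynomial over the prime $2^{61}-1$, and the use of the low bit as the sign (only approximately unbiased) are all choices made for this example.

```python
import random
from statistics import mean, median

P = (1 << 61) - 1  # a Mersenne prime used as the field size


class FourWiseSign:
    """+/-1 values from a random degree-3 polynomial mod P.

    A random degree-3 polynomial over a prime field is 4-wise
    independent; taking the low bit as the sign is an approximation.
    """

    def __init__(self, rng):
        self.coeffs = [rng.randrange(P) for _ in range(4)]

    def __call__(self, x):
        acc = 0
        for c in self.coeffs:  # Horner evaluation
            acc = (acc * x + c) % P
        return 1 if acc & 1 else -1


class ProductSign:
    """Sign of a k-tuple: product of k independent per-coordinate signs."""

    def __init__(self, k, rng):
        self.parts = [FourWiseSign(rng) for _ in range(k)]

    def __call__(self, item):
        sign = 1
        for h, coord in zip(self.parts, item):
            sign *= h(coord)
        return sign


def ams_f2(stream, k, groups=5, per_group=20, seed=0):
    """Median-of-means estimate of F_2 for a stream of k-tuples."""
    rng = random.Random(seed)
    signs = [ProductSign(k, rng) for _ in range(groups * per_group)]
    z = [0] * len(signs)
    for item in stream:
        for j, s in enumerate(signs):
            z[j] += s(item)  # one signed counter per sketch copy
    squares = [v * v for v in z]
    means = [mean(squares[g * per_group:(g + 1) * per_group])
             for g in range(groups)]
    return median(means)


# A stream with one distinct tuple repeated 10 times has F_2 = 100,
# and every counter equals +/-10, so the estimate is exact here.
print(ams_f2([(3, 7, 2)] * 10, k=3))
```

Each counter stores a single integer, so the memory is governed by the number of sketch copies rather than by the size of the product domain.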

Preprint, 2010, arXiv

Rademacher Chaos, Random Eulerian Graphs and The Sparse Johnson-Lindenstrauss Transform

The celebrated dimension reduction lemma of Johnson and Lindenstrauss has numerous computational and other applications. Due to its application in practice, speeding up the computation of a Johnson-Lindenstrauss style dimension reduction is an important question. Recently, Dasgupta, Kumar, and Sarlos (STOC 2010) constructed such a transform that uses a sparse matrix. This is motivated by the desire to speed up the computation when applied to sparse input vectors, a scenario that comes up in applications. The sparsity of their construction was further improved by Kane and Nelson (ArXiv 2010). We improve the previous bound on the number of non-zero entries per column of Kane and Nelson from $O\left(\frac{1}{ε}\log(1/δ)\log(k/δ)\right)$ (where the target dimension is $k$, the distortion is $1\pm ε$, and the failure probability is $δ$) to $$ O\left(\frac{1}{ε} \left(\frac{\log(1/δ)\log\log\log(1/δ)}{\log\log(1/δ)}\right)^2\right). $$ We also improve the amount of randomness needed to generate the matrix. Our results are obtained by connecting the moments of an order 2 Rademacher chaos to the combinatorial properties of random Eulerian multigraphs. Estimating the chance that a random multigraph is composed of a given number of node-disjoint Eulerian components leads to a new tail bound on the chaos. Our estimates may be of independent interest, and as this part of the argument is decoupled from the analysis of the coefficients of the chaos, we believe that our methods can be useful in the analysis of other chaoses.
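To show what "non-zero entries per column" means in practice, here is a generic sparse sign embedding. It is a hedged sketch and not the construction analyzed in the paper or in the cited works: the names `sparse_sign_matrix` and `apply_sparse`, the parameter `s` (non-zeros per column), and the choice of `s` distinct uniformly random rows per column are assumptions made for illustration.

```python
import math
import random


def sparse_sign_matrix(k, d, s, seed=0):
    """Build a k x d matrix with exactly s non-zeros per column.

    Each column is stored as {row: value} with values +/- 1/sqrt(s),
    so every column has unit Euclidean norm.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(d):
        rows = rng.sample(range(k), s)  # s distinct rows
        columns.append({r: rng.choice((-1.0, 1.0)) * scale for r in rows})
    return columns


def apply_sparse(columns, k, x):
    """Compute y = A x, skipping zero coordinates of x."""
    y = [0.0] * k
    for j, xj in enumerate(x):
        if xj == 0:
            continue
        for r, a in columns[j].items():
            y[r] += a * xj
    return y


cols = sparse_sign_matrix(k=16, d=8, s=4)
x = [0, 0, 1, 0, 0, 0, 0, 0]  # a sparse input vector
y = apply_sparse(cols, 16, x)
print(sum(v * v for v in y))  # squared norm of the embedded vector
```

The work done by `apply_sparse` is `s` operations per non-zero coordinate of the input, which is why a smaller bound on the per-column sparsity translates directly into faster embedding of sparse vectors.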