Source author record

Riko Jacob

Riko Jacob appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Topics: Data Structures and Algorithms; Computational Complexity; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected: 6 works, 3 topics, 4 close collaborators

Preprint, 2021, arXiv

Fragile Complexity of Adaptive Algorithms

The fragile complexity of a comparison-based algorithm is $f(n)$ if each input element participates in $O(f(n))$ comparisons. In this paper, we explore the fragile complexity of algorithms adaptive to various restrictions on the input, i.e., algorithms with a fragile complexity parameterized by a quantity other than the input size $n$. We show that searching for the predecessor in a sorted array has fragile complexity $Θ(\log k)$, where $k$ is the rank of the query element, both in a randomized and a deterministic setting. For predecessor searches, we also show how to optimally reduce the amortized fragile complexity of the elements in the array. We also prove the following results: Selecting the $k$-th smallest element has expected fragile complexity $O(\log \log k)$ for the element selected. Deterministically finding the minimum element has fragile complexity $Θ(\log(Inv))$ and $Θ(\log(Runs))$, where $Inv$ is the number of inversions in a sequence and $Runs$ is the number of increasing runs in a sequence. Deterministically finding the median has fragile complexity $O(\log(Runs) + \log \log n)$ and $Θ(\log(Inv))$. Deterministic sorting has fragile complexity $Θ(\log(Inv))$ but it has fragile complexity $Θ(\log n)$ regardless of the number of runs.
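The definition in the first sentence can be made concrete by counting, per element, how many comparisons it takes part in. The following is a minimal sketch, not one of the adaptive algorithms from the paper: it contrasts a left-to-right scan for the minimum, where the running minimum may be compared up to $n-1$ times, with a balanced tournament, where every element is compared at most $\lceil \log_2 n \rceil$ times. The function names are hypothetical and both functions assume a non-empty input.

```python
def linear_scan_min(values):
    """Left-to-right scan; returns (minimum, comparisons per element)."""
    counts = [0] * len(values)
    best = 0  # index of the running minimum (input assumed non-empty)
    for i in range(1, len(values)):
        # One comparison involving the running minimum and element i.
        counts[best] += 1
        counts[i] += 1
        if values[i] < values[best]:
            best = i
    return values[best], counts


def tournament_min(values):
    """Balanced knockout tournament; returns (minimum, comparisons per element)."""
    counts = [0] * len(values)
    alive = list(range(len(values)))  # indices still in the tournament
    while len(alive) > 1:
        survivors = []
        for j in range(0, len(alive) - 1, 2):
            a, b = alive[j], alive[j + 1]
            counts[a] += 1
            counts[b] += 1
            survivors.append(a if values[a] <= values[b] else b)
        if len(alive) % 2 == 1:
            # The unpaired element advances without being compared.
            survivors.append(alive[-1])
        alive = survivors
    return values[alive[0]], counts


values = [0, 5, 3, 9, 7, 1, 8, 2, 6, 4, 11, 15, 13, 12, 10, 14]
scan_min, scan_counts = linear_scan_min(values)
tour_min, tour_counts = tournament_min(values)
print(scan_min, max(scan_counts))  # the minimum sits first, so it is compared 15 times
print(tour_min, max(tour_counts))  # 16 elements, 4 rounds, at most 4 comparisons each
```

Both functions perform $n-1$ comparisons in total; only the distribution over elements differs, which is the quantity fragile complexity measures.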

Preprint, 2020, arXiv

Cache-Oblivious Priority Queues with Decrease-Key and Applications to Graph Algorithms

We present priority queues in the cache-oblivious external memory model with block size $B$ and main memory size $M$ that support on $N$ elements, operation \textsc{UPDATE} (combination of \textsc{INSERT} and \textsc{DECREASEKEY}) in $O \left(\frac{1}{B}\log_{\frac{λ}{B}} \frac{N}{B}\right)$ amortized I/Os and operations \textsc{EXTRACT-MIN} and \textsc{DELETE} in $O \left(\lceil \frac{λ^{\varepsilon}}{B} \log_{\frac{λ}{B}} \frac{N}{B} \rceil \log_{\frac{λ}{B}} \frac{N}{B}\right)$ amortized I/Os, using $O \left(\frac{N}{B}\log_{\frac{λ}{B}} \frac{N}{B}\right)$ blocks, for a user-defined parameter $λ \in [2, N]$ and any real $\varepsilon \in (0,1)$. Our result improves upon previous I/O-efficient cache-oblivious and cache-aware priority queues [Chowdhury and Ramachandran, TALG 2018], [Brodal et al., SWAT 2004], [Kumar and Schwabe, SPDP 1996], [Arge et al., SICOMP 2007], [Fadel et al., TCS 1999]. We also present buffered repository trees that support on a multi-set of $N$ elements, operation \textsc{INSERT} in $O \left(\frac{1}{B}\log_{\frac{λ}{B}} \frac{N}{B}\right)$ I/Os and operation \textsc{EXTRACT} on $K$ extracted elements in $O \left(\frac{λ^{\varepsilon}}{B} \log_{\frac{λ}{B}} \frac{N}{B} + \frac{K}{B}\right)$ amortized I/Os, using $O \left(\frac{N}{B}\right)$ blocks, improving previous cache-aware and cache-oblivious results [Arge et al., SICOMP '07], [Buchsbaum et al., SODA '00]. In the cache-oblivious model, for $λ = O \left(E/V\right)$, we achieve $O \left(\frac{E}{B}\log_{\frac{E}{V B}} \frac{E}{B}\right)$ I/Os for single-source shortest paths, depth-first search and breadth-first search algorithms on massive directed dense graphs $(V,E)$. Our algorithms are I/O-optimal for $E/V = Ω(M)$ (and in the cache-aware setting for $λ = O(M)$).

Preprint, 2020, arXiv

On the I/O complexity of the k-nearest neighbor problem

We consider static, external memory indexes for exact and approximate versions of the $k$-nearest neighbor ($k$-NN) problem, and show new lower bounds under a standard indivisibility assumption:
- Polynomial space indexing schemes for high-dimensional $k$-NN in Hamming space cannot take advantage of block transfers: $Ω(k)$ block reads are needed to answer a query.
- For the $\ell_\infty$ metric the lower bound holds even if we allow $c$-approximate nearest neighbors to be returned, for $c \in (1, 3)$.
- The restriction to $c < 3$ is necessary: For every metric there exists an indexing scheme in the indexability model of Hellerstein et al. using space $O(kn)$, where $n$ is the number of points, that can retrieve $k$ 3-approximate nearest neighbors using $\lceil k/B\rceil$ I/Os, which is optimal.
- For specific metrics, data structures with better approximation factors are possible. For $k$-NN in Hamming space and every approximation factor $c>1$ there exists a polynomial space data structure that returns $k$ $c$-approximate nearest neighbors in $\lceil k/B\rceil$ I/Os.
To show these lower bounds we develop two new techniques: First, to handle that approximation algorithms have more freedom in deciding which result set to return, we develop a relaxed version of the $λ$-set workload technique of Hellerstein et al. This technique allows us to show lower bounds that hold in $d\geq n$ dimensions. To extend the lower bounds down to $d = O(k \log(n/k))$ dimensions, we develop a new deterministic dimension reduction technique that may be of independent interest.

Preprint, 2015, arXiv

Tight Bounds for Low Dimensional Star Stencils in the Parallel External Memory Model

Stencil computations on low dimensional grids are kernels of many scientific applications including finite difference methods used to solve partial differential equations. On typical modern computer architectures, such stencil computations are limited by the performance of the memory subsystem, namely by the bandwidth between main memory and the cache. This work considers the computation of star stencils, like the 5-point and 7-point stencil, in the external memory model and parallel external memory model and analyses the constant of the leading term of the non-compulsory I/Os. While optimizing stencil computations is an active field of research, there has been a significant gap between the lower bounds and the performance of the algorithms so far. In two dimensions, this work provides matching constants for lower and upper bounds closing a multiplicative gap of 4. In three dimensions, the bounds match up to a factor of $\sqrt{2}$ improving the known results by a factor of $2 \sqrt{3}\sqrt{B}$, where $B$ is the block (cache line) size of the external memory model. For dimensions $d\geq 4$, the lower bound is improved between a factor of $4$ and $6$. For arbitrary dimension $d$, the first analysis of the constant of the leading term of the non-compulsory I/Os is presented. For $d\geq 3$ the lower and upper bound match up to a factor of $\sqrt[d-1]{d!}\approx \frac{d}{e}$.
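For readers unfamiliar with the term, a 5-point star stencil on a two-dimensional grid updates each interior cell from itself and its four axis-aligned neighbours. The sketch below shows one such sweep in plain Python. The uniform averaging weights, the fixed boundary and the function name are assumptions made for illustration; the sketch does not model block transfers or the I/O bounds analysed in the paper.

```python
def five_point_sweep(grid):
    """Apply one 5-point star stencil sweep to a rectangular 2D grid.

    Each interior cell becomes the average of itself and its four
    axis-aligned neighbours (assumed weights). Boundary cells are copied
    unchanged. The input grid is not modified.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # separate output buffer (Jacobi-style update)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (
                grid[i][j]
                + grid[i - 1][j]  # north
                + grid[i + 1][j]  # south
                + grid[i][j - 1]  # west
                + grid[i][j + 1]  # east
            ) / 5.0
    return out


grid = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
]
print(five_point_sweep(grid)[1][1])  # 1.0
```

Each output cell reads values from three consecutive rows of the input, which is why the order in which a large grid is traversed determines how often data must be reloaded from memory.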

Preprint, 2014, arXiv

On the Complexity of List Ranking in the Parallel External Memory Model

We study the problem of list ranking in the parallel external memory (PEM) model. We observe an interesting dual nature for the hardness of the problem due to limited information exchange among the processors about the structure of the list, on the one hand, and its close relationship to the problem of permuting data, which is known to be hard for the external memory models, on the other hand. By carefully defining the power of the computational model, we prove a permuting lower bound in the PEM model. Furthermore, we present a stronger $Ω(\log^2 N)$ lower bound for a special variant of the problem and for a specific range of the model parameters, which takes us a step closer toward proving a non-trivial lower bound for the list ranking problem in the bulk-synchronous parallel (BSP) and MapReduce models. Finally, we also present an algorithm that is tight for a larger range of parameters of the model than in prior work.
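List ranking asks, for every node of a linked list stored in arbitrary order, for its distance to the end of the list. As a point of reference, here is a minimal sketch of the classic pointer-jumping technique, simulated sequentially in rounds. It is not the PEM algorithm from the paper, and the function name and the `succ` array representation are assumptions chosen for illustration.

```python
def list_rank_pointer_jumping(succ):
    """Return rank[i], the number of links from node i to the tail.

    succ[i] is the index of the successor of node i, or None for the tail.
    Every round, each node adds its successor's rank to its own and then
    skips over that successor, so the number of rounds is logarithmic in
    the list length.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if s is None else 1 for s in succ]
    changed = True
    while changed:
        changed = False
        # All nodes read the old arrays and write new ones, which mimics
        # one synchronous parallel round.
        new_rank = rank[:]
        new_nxt = nxt[:]
        for i in range(n):
            j = nxt[i]
            if j is not None:
                new_rank[i] = rank[i] + rank[j]
                new_nxt[i] = nxt[j]
                changed = True
        rank, nxt = new_rank, new_nxt
    return rank


# The list 3 -> 0 -> 2 -> 1, stored in arbitrary order.
succ = [2, None, 1, 0]
print(list_rank_pointer_jumping(succ))  # [2, 0, 1, 3]
```

The accesses `rank[j]` and `nxt[j]` follow pointers to arbitrary positions, which hints at the connection to permuting data that the abstract mentions.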

Preprint, 2011, arXiv

The Efficiency of MapReduce in Parallel External Memory

Since its introduction in 2004, the MapReduce framework has become one of the standard approaches in massive distributed and parallel computation. In contrast to its intensive use in practice, its theoretical footing is still limited, and little work has been done so far to put MapReduce on a par with the major computational models. Following pioneering work that relates the MapReduce framework to PRAM and BSP in their macroscopic structure, we focus on the functionality provided by the framework itself, considered in the parallel external memory (PEM) model. In this setting, we present upper and lower bounds on the parallel I/O complexity that match up to constant factors for the shuffle step. The shuffle step is the single communication phase in which all information of one MapReduce invocation is transferred from map workers to reduce workers. Hence, in contrast to previous work, we move the focus towards the internal communication step. The results we obtain further carry over to the BSP* model. On the one hand, this shows how much complexity can be "hidden" for an algorithm expressed in MapReduce compared to PEM. On the other hand, our results bound the worst-case performance loss of the MapReduce approach in terms of I/O efficiency.
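To make the role of the shuffle step concrete, the following is a minimal in-memory sketch of one MapReduce invocation: map workers emit key-value pairs, the shuffle routes every pair to a reduce worker and groups it by key, and the reducers combine the values. All function names are hypothetical, Python's built-in `hash` stands in for the partitioner, and nothing here models the parallel I/O costs analysed in the paper.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to every record and collect the emitted (key, value) pairs."""
    return [pair for record in records for pair in map_fn(record)]


def shuffle(pairs, num_reducers):
    """Route each pair to one reducer and group values by key there.

    This is the only phase in which data moves from map workers to
    reduce workers. The hash-based partitioning is an assumption.
    """
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[hash(key) % num_reducers][key].append(value)
    return buckets


def reduce_phase(buckets, reduce_fn):
    """Apply reduce_fn to each key and its grouped values on every reducer."""
    return {
        key: reduce_fn(key, values)
        for bucket in buckets
        for key, values in bucket.items()
    }


def word_count(lines, num_reducers=2):
    pairs = map_phase(lines, lambda line: [(word, 1) for word in line.split()])
    buckets = shuffle(pairs, num_reducers)
    return reduce_phase(buckets, lambda key, values: sum(values))


print(word_count(["a b a", "b c"]))
```

The final counts do not depend on how keys are partitioned; only the amount of data each reducer receives does.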