Source author record

Robert E. Tarjan

Robert E. Tarjan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

13 works

2 topics

4 close collaborators


preprint, 2022, arXiv

A Foundation for Proving Splay is Dynamically Optimal

Consider the task of performing a sequence of searches in a binary search tree. After each search, we allow an algorithm to arbitrarily restructure the tree. The cost of executing the task is the sum of the time spent searching and the time spent optimizing the searches with restructuring operations. Sleator and Tarjan introduced this notion in 1985, along with an algorithm and a conjecture. The algorithm, Splay, is an elegant procedure for performing adjustments that move searched items to the top of the tree. The conjecture, called dynamic optimality, is that the cost of splaying is always within a constant factor of the optimal algorithm for performing searches. We lay a foundation for proving the dynamic optimality conjecture. Central to our method is approximate monotonicity. Approximately monotone algorithms are those whose cost does not increase by more than a fixed multiple after removing searches from the sequence. As we shall see, Splay is dynamically optimal if and only if it is approximately monotone. This result extends to a weaker form of approximate monotonicity as well as insertion, deletion, and related algorithms. We prove that a lower bound on optimal execution cost is approximately monotone and outline how to adapt this proof from the lower bound to Splay, and how to overcome the remaining barriers to establishing dynamic optimality.
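To make the restructuring step concrete, here is a minimal sketch of a splay operation on a binary search tree. It is a simplified recursive variant that pairs rotations from the root downward, so the single leftover rotation lands at the bottom of the search path rather than at the top as in bottom-up splaying; it is not claimed to be the exact formulation the paper analyzes. The names `Node`, `insert`, `splay`, and `inorder` are illustrative and do not come from the source.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Move the node holding key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:            # left-left case: rotate the top edge first
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:          # left-right case
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:               # right-right case
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:             # right-left case
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def insert(root, key):
    """Plain BST insertion without rebalancing, used only to build test trees."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def inorder(root):
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)
```

After `root = splay(root, k)` the searched key sits at the root and the symmetric order of the keys is unchanged; the cost model in the abstract charges for both the search path and the rotations performed.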

preprint, 2022, arXiv

Finding Strong Components Using Depth-First Search

We survey three algorithms that use depth-first search to find the strong components of a directed graph in linear time: (1) Tarjan's algorithm; (2) a cycle-finding algorithm; and (3) a bidirectional search algorithm.
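As an illustration of the first of these, the following is a minimal recursive sketch of the classic low-link formulation of Tarjan's algorithm. The graph representation (a dict mapping each vertex to a list of successors) and the function name `strong_components` are assumptions made for this example, and the survey's own presentation may differ in detail.

```python
def strong_components(graph):
    """Return the strong components of a directed graph {vertex: [successors]}."""
    index = {}        # depth-first discovery number of each visited vertex
    low = {}          # smallest discovery number known to be reachable
    on_stack = set()  # vertices whose component is not yet finished
    stack = []
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the component off the stack
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components
```

Each vertex and arc is examined a constant number of times, which gives the linear running time. Because the sketch is recursive, very deep graphs would need an explicit stack or a raised recursion limit in Python.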

preprint, 2020, arXiv

Concurrent Disjoint Set Union

We develop and analyze concurrent algorithms for the disjoint set union (union-find) problem in the shared memory, asynchronous multiprocessor model of computation, with CAS (compare and swap) or DCAS (double compare and swap) as the synchronization primitive. We give a deterministic bounded wait-free algorithm that uses DCAS and has a total work bound of $O(m \cdot (\log(np/m + 1) + \alpha(n, m/(np))))$ for a problem with $n$ elements and $m$ operations solved by $p$ processes, where $\alpha$ is a functional inverse of Ackermann's function. We give two randomized algorithms that use only CAS and have the same work bound in expectation. The analysis of the second randomized algorithm is valid even if the scheduler is adversarial. Our DCAS and randomized algorithms take $O(\log n)$ steps per operation, worst-case for the DCAS algorithm, high-probability for the randomized algorithms. Our work and step bounds grow only logarithmically with $p$, making our algorithms truly scalable. We prove that for a class of symmetric algorithms that includes ours, no better step or work bound is possible.

preprint, 2020, arXiv

Simple Concurrent Labeling Algorithms for Connected Components

We study a class of simple algorithms for concurrently computing the connected components of an $n$-vertex, $m$-edge graph. Our algorithms are easy to implement in either the COMBINING CRCW PRAM or the MPC computing model. For two related algorithms in this class, we obtain $\Theta(\lg n)$ step and $\Theta(m \lg n)$ work bounds. For two others, we obtain $O(\lg^2 n)$ step and $O(m \lg^2 n)$ work bounds, which are tight for one of them. All our algorithms are simpler than related algorithms in the literature. We also point out some gaps and errors in the analysis of previous algorithms. Our results show that even a basic problem like connected components still has secrets to reveal.
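To give a feel for what a labeling algorithm of this kind looks like, here is a sequential simulation of one simple round-based scheme: every edge proposes the smaller of its endpoints' labels to the larger label, then every vertex shortcuts to its label's label. This is only a sketch in the spirit of the abstract. It is not claimed to be any of the four algorithms the paper analyzes, nor to meet their step or work bounds, and the function name and round structure are assumptions made for illustration.

```python
def connected_component_labels(n, edges):
    """Label vertices 0..n-1 so that two vertices share a label iff they are connected.

    The final label of each vertex is the smallest vertex in its component.
    """
    parent = list(range(n))  # parent[v] is the current label of v; always <= v
    while True:
        before = parent[:]
        proposed = parent[:]
        # Connect step: each edge offers the smaller label to the larger label.
        # In a concurrent setting, conflicting writes would be combined by minimum.
        for v, w in edges:
            a, b = parent[v], parent[w]
            if a < b:
                proposed[b] = min(proposed[b], a)
            elif b < a:
                proposed[a] = min(proposed[a], b)
        # Shortcut step: replace each label by that label's own label.
        parent = [proposed[proposed[v]] for v in range(n)]
        if parent == before:
            return parent
```

Labels only ever decrease, so the loop terminates; once a round changes nothing, every tree of labels is flat and both ends of every edge carry the same label.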

preprint, 2016, arXiv

A Randomized Concurrent Algorithm for Disjoint Set Union

The disjoint set union problem is a basic problem in data structures with a wide variety of applications. We extend a known efficient sequential algorithm for this problem to obtain a simple and efficient concurrent wait-free algorithm running on an asynchronous parallel random access machine (APRAM). Crucial to our result is the use of randomization. Under a certain independence assumption, for a problem instance in which there are $n$ elements, $m$ operations, and $p$ processes, our algorithm does $\Theta(m \cdot (\alpha(n, m/(np)) + \log(np/m + 1)))$ expected work, where the expectation is over the random choices made by the algorithm and $\alpha$ is a functional inverse of Ackermann's function. In addition, each operation takes $O(\log n)$ steps with high probability. Our algorithm is significantly simpler and more efficient than previous algorithms proposed by Anderson and Woll. Under our independence assumption, our algorithm achieves almost-linear speed-up for applications in which all or most of the processes can be kept busy.
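For orientation, the sketch below shows a sequential disjoint set union structure with randomized linking and path halving. The abstract does not say which sequential algorithm is extended, so treating this combination as the starting point is an assumption. The sketch is purely sequential: it contains no CAS operations and is not the concurrent algorithm of the paper. The class and method names are illustrative.

```python
import random


class DisjointSets:
    """Sequential union-find over elements 0..n-1 (illustrative sketch)."""

    def __init__(self, n, seed=None):
        rng = random.Random(seed)
        self.parent = list(range(n))
        # Randomized linking: give every element a distinct random priority
        # and always link the root of lower priority under the other root.
        self.priority = list(range(n))
        rng.shuffle(self.priority)

    def find(self, x):
        parent = self.parent
        # Path halving: point every other node on the path at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def same_set(self, x, y):
        return self.find(x) == self.find(y)

    def unite(self, x, y):
        """Merge the sets containing x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.priority[rx] < self.priority[ry]:
            self.parent[rx] = ry
        else:
            self.parent[ry] = rx
        return True
```

A fixed random order decides every link without per-node rank counters that would need updating, which is one reason a randomized rule is a plausible fit for a concurrent setting.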

preprint, 2015, arXiv

A Note on Fault Tolerant Reachability for Directed Graphs

In this note we describe an application of low-high orders in fault-tolerant network design. Baswana et al. [DISC 2015] study the following reachability problem. We are given a flow graph $G = (V, A)$ with start vertex $s$, and a spanning tree $T = (V, A_T)$ rooted at $s$. We call a set of arcs $A'$ valid if the subgraph $G' = (V, A_T \cup A')$ of $G$ has the same dominators as $G$. The goal is to find a valid set of minimum size. Baswana et al. gave an algorithm that computes a minimum-size valid set in $O(m \log{n})$ time, where $n = |V|$ and $m = |A|$. Here we provide a simple $O(m)$-time algorithm that uses the dominator tree $D$ of $G$ and a low-high order of it.

preprint, 2015, arXiv

Amortized Rotation Cost in AVL Trees

An AVL tree is the original type of balanced binary search tree. An insertion in an $n$-node AVL tree takes at most two rotations, but a deletion in an $n$-node AVL tree can take $\Theta(\log n)$ rotations. A natural question is whether deletions can take many rotations not only in the worst case but in the amortized case as well. A sequence of $n$ successive deletions in an $n$-node tree takes $O(n)$ rotations, but what happens when insertions are intermixed with deletions? Haeupler, Sen, and Tarjan conjectured that alternating insertions and deletions in an $n$-node AVL tree can cause each deletion to do $\Omega(\log n)$ rotations, but they provided no construction to justify their claim. We provide such a construction: we show that, for infinitely many $n$, there is a set $E$ of "expensive" $n$-node AVL trees with the property that, given any tree in $E$, deleting a certain leaf and then reinserting it produces a tree in $E$, with the deletion having done $\Theta(\log n)$ rotations. One can do an arbitrary number of such expensive deletion-insertion pairs. The difficulty in obtaining such a construction is that in general the tree produced by an expensive deletion-insertion pair is not the original tree. Indeed, if the trees in $E$ have even height $k$, $2^{k/2}$ deletion-insertion pairs are required to reproduce the original tree.

preprint, 2015, arXiv

Hollow Heaps

We introduce the hollow heap, a very simple data structure with the same amortized efficiency as the classical Fibonacci heap. All heap operations except delete and delete-min take $O(1)$ time, worst case as well as amortized; delete and delete-min take $O(\log n)$ amortized time on a heap of $n$ items. Hollow heaps are by far the simplest structure to achieve this. Hollow heaps combine two novel ideas: the use of lazy deletion and re-insertion to do decrease-key operations, and the use of a dag (directed acyclic graph) instead of a tree or set of trees to represent a heap. Lazy deletion produces hollow nodes (nodes without items), giving the data structure its name.

preprint, 2014, arXiv

A Back-to-Basics Empirical Study of Priority Queues

The theory community has proposed several new heap variants in the recent past which have remained largely untested experimentally. We take the field back to the drawing board, with straightforward implementations of both classic and novel structures using only standard, well-known optimizations. We study the behavior of each structure on a variety of inputs, including artificial workloads, workloads generated by running algorithms on real map data, and workloads from a discrete event simulator used in recent systems networking research. We provide observations about which characteristics are most correlated with performance. For example, we find that the L1 cache miss rate appears to be strongly correlated with wall-clock time. We also provide observations about how the input sequence affects the relative performance of the different heap variants. For example, we show (both theoretically and in practice) that certain random insertion-deletion sequences are degenerate and can lead to misleading results. Overall, our findings suggest that while the conventional wisdom holds in some cases, it is sorely mistaken in others.

preprint, 2014, arXiv

Fibonacci Heaps Revisited

The Fibonacci heap is a classic data structure that supports deletions in logarithmic amortized time and all other heap operations in O(1) amortized time. We explore the design space of this data structure. We propose a version with the following improvements over the original: (i) Each heap is represented by a single heap-ordered tree, instead of a set of trees. (ii) Each decrease-key operation does only one cut and a cascade of rank changes, instead of doing a cascade of cuts. (iii) The outcomes of all comparisons done by the algorithm are explicitly represented in the data structure, so none are wasted. We also give an example to show that without cascading cuts or rank changes, both the original data structure and the new version fail to have the desired efficiency, solving an open problem of Fredman. Finally, we illustrate the richness of the design space by proposing several alternative ways to do cascading rank changes, including a randomized strategy related to one previously proposed by Karger. We leave the analysis of these alternatives as intriguing open problems.

preprint, 2013, arXiv

Dominator Tree Certification and Independent Spanning Trees

How does one verify that the output of a complicated program is correct? One can formally prove that the program is correct, but this may be beyond the power of existing methods. Alternatively one can check that the output produced for a particular input satisfies the desired input-output relation, by running a checker on the input-output pair. Then one only needs to prove the correctness of the checker. But for some problems even such a checker may be too complicated to formally verify. There is a third alternative: augment the original program to produce not only an output but also a correctness certificate, with the property that a very simple program (whose correctness is easy to prove) can use the certificate to verify that the input-output pair satisfies the desired input-output relation. We consider the following important instance of this general question: How does one verify that the dominator tree of a flow graph is correct? Existing fast algorithms for finding dominators are complicated, and even verifying the correctness of a dominator tree in the absence of additional information seems complicated. We define a correctness certificate for a dominator tree, show how to use it to easily verify the correctness of the tree, and show how to augment fast dominator-finding algorithms so that they produce a correctness certificate. We also relate the dominator certificate problem to the problem of finding independent spanning trees in a flow graph, and we develop algorithms to find such trees. All our algorithms run in linear time. Previous algorithms apply just to the special case of only trivial dominators, and they take at least quadratic time.

preprint, 2013, arXiv

Finding Dominators via Disjoint Set Union

The problem of finding dominators in a directed graph has many important applications, notably in global optimization of computer code. Although linear and near-linear-time algorithms exist, they use sophisticated data structures. We develop an algorithm for finding dominators that uses only a "static tree" disjoint set data structure in addition to simple lists and maps. The algorithm runs in near-linear or linear time, depending on the implementation of the disjoint set data structure. We give several versions of the algorithm, including one that computes loop nesting information (needed in many kinds of global code optimization) and that can be made self-certifying, so that the correctness of the computed dominators is very easy to verify.
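For readers unfamiliar with the problem, the sketch below computes dominators with the textbook iterative set-intersection method. It is a slow baseline meant only to pin down the definition (a vertex $d$ dominates $v$ if every path from the start vertex to $v$ passes through $d$); it is not the disjoint-set-based algorithm of the paper and runs far slower than near-linear time. The graph representation and function names are assumptions made for this example.

```python
def dominator_sets(graph, start):
    """Map each vertex reachable from start to the set of its dominators."""
    # Collect the reachable vertices with an iterative depth-first search.
    reachable, seen, stack = [], {start}, [start]
    while stack:
        v = stack.pop()
        reachable.append(v)
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)

    # Predecessor lists restricted to reachable vertices.
    preds = {v: [] for v in reachable}
    for v in reachable:
        for w in graph.get(v, ()):
            preds[w].append(v)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {v: set(reachable) for v in reachable}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for v in reachable:
            if v == start:
                continue
            new = set.intersection(*(dom[p] for p in preds[v])) | {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


def immediate_dominators(graph, start):
    """Map each reachable non-start vertex to its immediate dominator."""
    dom = dominator_sets(graph, start)
    # The dominators of v form a chain, so the immediate dominator is the
    # strict dominator whose own dominator set is exactly one element smaller.
    return {
        v: next(d for d in dom[v] if d != v and len(dom[d]) == len(dom[v]) - 1)
        for v in dom
        if v != start
    }
```

A baseline like this can double as an independent check on the output of a faster implementation when testing on small graphs.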

preprint, 2011, arXiv

A New Approach to Incremental Cycle Detection and Related Problems

We consider the problem of detecting a cycle in a directed graph that grows by arc insertions, and the related problems of maintaining a topological order and the strong components of such a graph. For these problems, we give two algorithms, one suited to sparse graphs, and the other to dense graphs. The former takes the minimum of $O(m^{3/2})$ and $O(mn^{2/3})$ time to insert $m$ arcs into an $n$-vertex graph; the latter takes $O(n^2 \log n)$ time. Our sparse algorithm is considerably simpler than a previous $O(m^{3/2})$-time algorithm; it is also faster on graphs of sufficient density. The time bound of our dense algorithm beats the previously best time bound of $O(n^{5/2})$ for dense graphs. Our algorithms rely for their efficiency on topologically ordered vertex numberings; bounds on the size of the numbers give bounds on running times.
