Source author record

Michael A. Bender

Michael A. Bender appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Distributed, Parallel, and Cluster Computing Databases Networking and Internet Architecture

Catalog footprint

What is connected

13works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

GraphZeppelin: Storage-Friendly Sketching for Connected Components on Dynamic Graph Streams

Finding the connected components of a graph is a fundamental problem with uses throughout computer science and engineering. The task of computing connected components becomes more difficult when graphs are very large, or when they are dynamic, meaning the edge set changes over time subject to a stream of edge insertions and deletions. A natural approach to computing the connected components on a large, dynamic graph stream is to buy enough RAM to store the entire graph. However, the requirement that the graph fit in RAM is prohibitive for very large graphs. Thus, there is an unmet need for systems that can process dense dynamic graphs, especially when those graphs are larger than available RAM. We present a new high-performance streaming graph-processing system for computing the connected components of a graph. This system, which we call GraphZeppelin, uses new linear sketching data structures (CubeSketches) to solve the streaming connected components problem and as a result requires space asymptotically smaller than the space required for a lossless representation of the graph. GraphZeppelin is optimized for massive dense graphs: GraphZeppelin can process millions of edge updates (both insertions and deletions) per second, even when the underlying graph is far too large to fit in available RAM. As a result GraphZeppelin vastly increases the scale of graphs that can be processed.

preprint2022arXiv

Online List Labeling: Breaking the $\log^2n$ Barrier

The online list labeling problem is an algorithmic primitive with a large literature of upper bounds, lower bounds, and applications. The goal is to store a dynamically-changing set of $n$ items in an array of $m$ slots, while maintaining the invariant that the items appear in sorted order, and while minimizing the relabeling cost, defined to be the number of items that are moved per insertion/deletion. For the linear regime, where $m = (1 + Θ(1)) n$, an upper bound of $O(\log^2 n)$ on the relabeling cost has been known since 1981. A lower bound of $Ω(\log^2 n)$ is known for deterministic algorithms and for so-called smooth algorithms, but the best general lower bound remains $Ω(\log n)$. The central open question in the field is whether $O(\log^2 n)$ is optimal for all algorithms. In this paper, we give a randomized data structure that achieves an expected relabeling cost of $O(\log^{3/2} n)$ per operation. More generally, if $m = (1 + \varepsilon) n$ for $\varepsilon = O(1)$, the expected relabeling cost becomes $O(\varepsilon^{-1} \log^{3/2} n)$. Our solution is history independent, meaning that the state of the data structure is independent of the order in which items are inserted/deleted. For history-independent data structures, we also prove a matching lower bound: for all $ε$ between $1 / n^{1/3}$ and some sufficiently small positive constant, the optimal expected cost for history-independent list-labeling solutions is $Θ(\varepsilon^{-1}\log^{3/2} n)$.

preprint2022arXiv

What Does Dynamic Optimality Mean in External Memory?

In this paper, we revisit the question of how the dynamic optimality of search trees should be defined in external memory. A defining characteristic of external-memory data structures is that there is a stark asymmetry between queries and inserts/updates/deletes: by making the former slightly asymptotically slower, one can make the latter significantly asymptotically faster (even allowing for operations with sub-constant amortized I/Os). This asymmetry makes it so that rotation-based search trees are not optimal (or even close to optimal) in insert/update/delete-heavy external-memory workloads. To study dynamic optimality for such workloads, one must consider a different class of data structures. The natural class of data structures to consider are what we call buffered-propagation trees. Such trees can adapt dynamically to the locality properties of an input sequence in order to optimize the interactions between different inserts/updates/deletes and queries. We also present a new form of beyond-worst-case analysis that allows for us to formally study a continuum between static and dynamic optimality. Finally, we give a novel data structure, called the \jellotree, that is statically optimal and that achieves dynamic optimality for a large natural class of inputs defined by our beyond-worst-case analysis.

preprint2020arXiv

Batched Predecessor and Sorting with Size-Priced Information in External Memory

In the unit-cost comparison model, a black box takes an input two items and outputs the result of the comparison. Problems like sorting and searching have been studied in this model, and it has been generalized to include the concept of priced information, where different pairs of items (say database records) have different comparison costs. These comparison costs can be arbitrary (in which case no algorithm can be close to optimal (Charikar et al. STOC 2000)), structured (for example, the comparison cost may depend on the length of the databases (Gupta et al. FOCS 2001)), or stochastic (Angelov et al. LATIN 2008). Motivated by the database setting where the cost depends on the sizes of the items, we consider the problems of sorting and batched predecessor where two non-uniform sets of items $A$ and $B$ are given as input. (1) In the RAM setting, we consider the scenario where both sets have $n$ keys each. The cost to compare two items in $A$ is $a$, to compare an item of $A$ to an item of $B$ is $b$, and to compare two items in $B$ is $c$. We give upper and lower bounds for the case $a \le b \le c$. Notice that the case $b=1, a=c=\infty$ is the famous ``nuts and bolts'' problem. (2) In the Disk-Access Model (DAM), where transferring elements between disk and internal memory is the main bottleneck, we consider the scenario where elements in $B$ are larger than elements in $A$. The larger items take more I/Os to be brought into memory, consume more space in internal memory, and are required in their entirety for comparisons. We first give output-sensitive lower and upper bounds on the batched predecessor problem, and use these to derive bounds on the complexity of sorting in the two models. Our bounds are tight in most cases, and require novel generalizations of the classical lower bound techniques in external memory to accommodate the non-uniformity of keys.

preprint2020arXiv

Contention Resolution Without Collision Detection

This paper focuses on the contention resolution problem on a shared communication channel that does not support collision detection. A shared communication channel is a multiple access channel, which consists of a sequence of synchronized time slots. Players on the channel may attempt to broadcast a packet (message) in any time slot. A player's broadcast succeeds if no other player broadcasts during that slot. If two or more players broadcast in the same time slot, then the broadcasts collide and both broadcasts fail. The lack of collision detection means that a player monitoring the channel cannot differentiate between the case of two or more players broadcasting in the same slot (a collision) and zero players broadcasting. In the contention-resolution problem, players arrive on the channel over time, and each player has one packet to transmit. The goal is to coordinate the players so that each player is able to successfully transmit its packet within reasonable time. However, the players can only communicate via the shared channel by choosing to either broadcast or not. A contention-resolution protocol is measured in terms of its throughput (channel utilization). Previous work on contention resolution that achieved constant throughput assumed that either players could detect collisions, or the players' arrival pattern is generated by a memoryless (non-adversarial) process. The foundational question answered by this paper is whether collision detection is a luxury or necessity when the objective is to achieve constant throughput. We show that even without collision detection, one can solve contention resolution, achieving constant throughput, with high probability.

preprint2016arXiv

The Minimum Backlog Problem

We study the minimum backlog problem (MBP). This online problem arises, e.g., in the context of sensor networks. We focus on two main variants of MBP. The discrete MBP is a 2-person game played on a graph $G=(V,E)$. The player is initially located at a vertex of the graph. In each time step, the adversary pours a total of one unit of water into cups that are located on the vertices of the graph, arbitrarily distributing the water among the cups. The player then moves from her current vertex to an adjacent vertex and empties the cup at that vertex. The player's objective is to minimize the backlog, i.e., the maximum amount of water in any cup at any time. The geometric MBP is a continuous-time version of the MBP: the cups are points in the two-dimensional plane, the adversary pours water continuously at a constant rate, and the player moves in the plane with unit speed. Again, the player's objective is to minimize the backlog. We show that the competitive ratio of any algorithm for the MBP has a lower bound of $Ω(D)$, where $D$ is the diameter of the graph (for the discrete MBP) or the diameter of the point set (for the geometric MBP). Therefore we focus on determining a strategy for the player that guarantees a uniform upper bound on the absolute value of the backlog. For the absolute value of the backlog there is a trivial lower bound of $Ω(D)$, and the deamortization analysis of Dietz and Sleator gives an upper bound of $O(D\log N)$ for $N$ cups. Our main result is a tight upper bound for the geometric MBP: we show that there is a strategy for the player that guarantees a backlog of $O(D)$, independently of the number of cups.

preprint2015arXiv

How to Scale Exponential Backoff

Randomized exponential backoff is a widely deployed technique for coordinating access to a shared resource. A good backoff protocol should, arguably, satisfy three natural properties: (i) it should provide constant throughput, wasting as little time as possible; (ii) it should require few failed access attempts, minimizing the amount of wasted effort; and (iii) it should be robust, continuing to work efficiently even if some of the access attempts fail for spurious reasons. Unfortunately, exponential backoff has some well-known limitations in two of these areas: it provides poor (sub-constant) throughput (in the worst case), and is not robust (to resource acquisition failures). The goal of this paper is to "fix" exponential backoff by making it scalable, particularly focusing on the case where processes arrive in an on-line, worst-case fashion. We present a relatively simple backoff protocol~Re-Backoff~that has, at its heart, a version of exponential backoff. It guarantees expected constant throughput with dynamic process arrivals and requires only an expected polylogarithmic number of access attempts per process. Re-Backoff is also robust to periods where the shared resource is unavailable for a period of time. If it is unavailable for $D$ time slots, Re-Backoff provides the following guarantees. When the number of packets is a finite $n$, the average expected number of access attempts for successfully sending a packet is $O(\log^2( n + D))$. In the infinite case, the average expected number of access attempts for successfully sending a packet is $O( \log^2(η) + \log^2(D) )$ where $η$ is the maximum number of processes that are ever in the system concurrently.

preprint2015arXiv

Run Generation Revisited: What Goes Up May or May Not Come Down

In this paper, we revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M , and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs (or equivalently, maximizing the average run length) when the runs are allowed to be sorted or reverse sorted. We study the problem in the online setting, both with and without resource augmentation, and in the offline setting. (1) We analyze alternating-up-down replacement selection (runs alternate between sorted and reverse sorted), which was studied by Knuth as far back as 1963. We show that this simple policy is asymptotically optimal. Specifically, we show that alternating-up-down replacement selection is 2-competitive and no deterministic online algorithm can perform better. (2) We give online algorithms having smaller competitive ratios with resource augmentation. Specifically, we exhibit a deterministic algorithm that, when given a buffer of size 4M , is able to match or beat any optimal algorithm having a buffer of size M . Furthermore, we present a randomized online algorithm which is 7/4-competitive when given a buffer twice that of the optimal. (3) We demonstrate that performance can also be improved with a small amount of foresight. We give an algorithm, which is 3/2-competitive, with foreknowledge of the next 3M elements of the input stream. For the extreme case where all future elements are known, we design a PTAS for computing the optimal strategy a run generation algorithm must follow. (4) Finally, we present algorithms tailored for nearly sorted inputs which are guaranteed to have optimal solutions with sufficiently long runs.

preprint2013arXiv

Reallocation Problems in Scheduling

In traditional on-line problems, such as scheduling, requests arrive over time, demanding available resources. As each request arrives, some resources may have to be irrevocably committed to servicing that request. In many situations, however, it may be possible or even necessary to reallocate previously allocated resources in order to satisfy a new request. This reallocation has a cost. This paper shows how to service the requests while minimizing the reallocation cost. We focus on the classic problem of scheduling jobs on a multiprocessor system. Each unit-size job has a time window in which it can be executed. Jobs are dynamically added and removed from the system. We provide an algorithm that maintains a valid schedule, as long as a sufficiently feasible schedule exists. The algorithm reschedules only a total number of O(min{log^* n, log^* Delta}) jobs for each job that is inserted or deleted from the system, where n is the number of active jobs and Delta is the size of the largest window.

preprint2012arXiv

Don't Thrash: How to Cache Your Hash on Flash

This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks. Because the Bloom filter performs frequent random reads and writes, it is used almost exclusively in RAM, limiting the size of the sets it can represent. This paper first describes the quotient filter, which supports the basic operations of the Bloom filter, achieving roughly comparable performance in terms of space and time, but with better data locality. Operations on the quotient filter require only a small number of contiguous accesses. The quotient filter has other advantages over the Bloom filter: it supports deletions, it can be dynamically resized, and two quotient filters can be efficiently merged. The paper then gives two data structures, the buffered quotient filter and the cascade filter, which exploit the quotient filter advantages and thus serve as SSD-optimized alternatives to the Bloom filter. The cascade filter has better asymptotic I/O performance than the buffered quotient filter, but the buffered quotient filter outperforms the cascade filter on small to medium data sets. Both data structures significantly outperform recently-proposed SSD-optimized Bloom filter variants, such as the elevator Bloom filter, buffered Bloom filter, and forest-structured Bloom filter. In experiments, the cascade filter and buffered quotient filter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.

preprint2011arXiv

A New Approach to Incremental Cycle Detection and Related Problems

We consider the problem of detecting a cycle in a directed graph that grows by arc insertions, and the related problems of maintaining a topological order and the strong components of such a graph. For these problems, we give two algorithms, one suited to sparse graphs, and the other to dense graphs. The former takes the minimum of O(m^{3/2}) and O(mn^{2/3}) time to insert m arcs into an n-vertex graph; the latter takes O(n^2 log(n)) time. Our sparse algorithm is considerably simpler than a previous O(m^{3/2})-time algorithm; it is also faster on graphs of sufficient density. The time bound of our dense algorithm beats the previously best time bound of O(n^{5/2}) for dense graphs. Our algorithms rely for their efficiency on topologically ordered vertex numberings; bounds on the size of the numbers give bound on running times.

preprint2011arXiv

Maintaining Arrays of Contiguous Objects

In this paper we consider methods for dynamically storing a set of different objects ("modules") in a physical array. Each module requires one free contiguous subinterval in order to be placed. Items are inserted or removed, resulting in a fragmented layout that makes it harder to insert further modules. It is possible to relocate modules, one at a time, to another free subinterval that is contiguous and does not overlap with the current location of the module. These constraints clearly distinguish our problem from classical memory allocation. We present a number of algorithmic results, including a bound of Theta(n^2) on physical sorting if there is a sufficiently large free space and sum up NP-hardness results for arbitrary initial layouts. For online scenarios in which modules arrive one at a time, we present a method that requires O(1) moves per insertion or deletion and amortized cost O(m_i log M) per insertion or deletion, where m_i is the module's size, M is the size of the largest module and costs for moves are linear in the size of a module.

preprint2007arXiv

Scheduling Algorithms for Procrastinators

This paper presents scheduling algorithms for procrastinators, where the speed that a procrastinator executes a job increases as the due date approaches. We give optimal off-line scheduling policies for linearly increasing speed functions. We then explain the computational/numerical issues involved in implementing this policy. We next explore the online setting, showing that there exist adversaries that force any online scheduling policy to miss due dates. This impossibility result motivates the problem of minimizing the maximum interval stretch of any job; the interval stretch of a job is the job's flow time divided by the job's due date minus release time. We show that several common scheduling strategies, including the "hit-the-highest-nail" strategy beloved by procrastinators, have arbitrarily large maximum interval stretch. Then we give the "thrashing" scheduling policy and show that it is a Θ(1) approximation algorithm for the maximum interval stretch.

Michael A. Bender

What is connected

Connect this record

See the researcher in context

Building this map preview

13 published item(s)

GraphZeppelin: Storage-Friendly Sketching for Connected Components on Dynamic Graph Streams

Online List Labeling: Breaking the $\log^2n$ Barrier

What Does Dynamic Optimality Mean in External Memory?

Batched Predecessor and Sorting with Size-Priced Information in External Memory

Contention Resolution Without Collision Detection

The Minimum Backlog Problem

How to Scale Exponential Backoff

Run Generation Revisited: What Goes Up May or May Not Come Down

Reallocation Problems in Scheduling

Don't Thrash: How to Cache Your Hash on Flash

A New Approach to Incremental Cycle Detection and Related Problems

Maintaining Arrays of Contiguous Objects

Scheduling Algorithms for Procrastinators