Source author record

Michael Luby

Michael Luby appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory; math.IT; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; Information Retrieval; Networking and Internet Architecture

Catalog footprint

What is connected

3 works

6 topics

1 close collaborator


Preprint, 2021, arXiv

Distributed storage algorithms with optimal tradeoffs

One of the primary objectives of a distributed storage system is to reliably store large amounts of source data for long durations using a large number $N$ of unreliable storage nodes, each with $c$ bits of storage capacity. Storage nodes fail randomly over time and are replaced with nodes of equal capacity initialized to zeroes, and thus bits are erased at some rate $e$. To maintain recoverability of the source data, a repairer continually reads data over a network from nodes at an average rate $r$, and generates and writes data to nodes based on the read data. The distributed storage source capacity is the maximum amount of source data that can be reliably stored for long periods of time. Previous research shows that asymptotically the distributed storage source capacity is at most $\left(1-\frac{e}{2 \cdot r}\right) \cdot N \cdot c$ as $N$ and $r$ grow. In this work we introduce and analyze algorithms such that asymptotically the distributed storage source capacity is at least this expression. Thus, the expression captures a fundamental trade-off between network traffic and storage overhead when reliably storing source data.
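As a rough illustration of the capacity expression in this abstract, the sketch below evaluates $\left(1-\frac{e}{2 \cdot r}\right) \cdot N \cdot c$ for given inputs. The function name, the clamping to zero, and the sample numbers are hypothetical and not taken from the paper; it also assumes $e$ and $r$ are expressed in the same units (for example, bits per second) and ignores the asymptotic conditions under which the result holds.

```python
def source_capacity_bound(n_nodes: int, node_bits: float,
                          erase_rate: float, read_rate: float) -> float:
    """Evaluate (1 - e / (2 * r)) * N * c from the abstract.

    Assumption: erase_rate (e) and read_rate (r) use the same units.
    """
    if n_nodes <= 0 or node_bits <= 0:
        raise ValueError("n_nodes and node_bits must be positive")
    if erase_rate < 0 or read_rate <= 0:
        raise ValueError("erase_rate must be >= 0 and read_rate must be > 0")
    fraction = 1.0 - erase_rate / (2.0 * read_rate)
    # Assumption: a negative value is reported as 0, meaning the
    # expression leaves no room for source data at this read rate.
    return max(0.0, fraction) * n_nodes * node_bits


# Hypothetical numbers: 1000 nodes of 8e12 bits each, with the repairer
# reading five times faster than bits are erased.
capacity = source_capacity_bound(n_nodes=1000, node_bits=8e12,
                                 erase_rate=1e9, read_rate=5e9)
usable_fraction = capacity / (1000 * 8e12)
print(f"usable fraction of raw capacity: {usable_fraction:.2f}")
```

In this reading, raising the read rate $r$ relative to the erasure rate $e$ pushes the usable fraction toward 1, which is the traffic-versus-overhead trade-off the abstract describes.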

Preprint, 2020, arXiv

Repair rate lower bounds for distributed storage

One of the primary objectives of a distributed storage system is to reliably store a large amount $dsize$ of source data for a long duration using a large number $N$ of unreliable storage nodes, each with capacity $nsize$. The storage overhead $\beta$ is the fraction of system capacity available beyond $dsize$, i.e., $\beta = 1 - \frac{dsize}{N \cdot nsize}$. Storage nodes fail randomly over time and are replaced with initially empty nodes, and thus data is erased from the system at an average rate $erate = \lambda \cdot N \cdot nsize$, where $1/\lambda$ is the average lifetime of a node before failure. To maintain recoverability of the source data, a repairer continually reads data over a network from nodes at some average rate $rrate$, and generates and writes data to nodes based on the read data. The main result is that, for any repairer, if the source data is recoverable at each point in time then it must be the case that $rrate \ge \frac{erate}{2 \cdot \beta}$ asymptotically as $N$ goes to infinity and $\beta$ goes to zero. This inequality provides a fundamental lower bound on the average rate at which any repairer needs to read data from the system in order to maintain recoverability of the source data.
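The quantities defined in this abstract chain together directly: the overhead follows from $dsize$, $N$ and $nsize$, the erasure rate from the node failure rate, and the read-rate bound from both. A minimal sketch of that arithmetic follows; the function name and the sample values are hypothetical, and the bound is stated in the abstract only asymptotically, so the result is a rough guide for finite systems.

```python
def min_read_rate(n_nodes: int, node_size: float,
                  data_size: float, failure_rate: float) -> float:
    """Return erate / (2 * beta), the lower bound on rrate from the abstract.

    failure_rate is lambda, so 1 / failure_rate is the mean node lifetime.
    """
    total_capacity = n_nodes * node_size
    if not 0 < data_size < total_capacity:
        raise ValueError("data_size must be between 0 and N * nsize")
    if failure_rate <= 0:
        raise ValueError("failure_rate must be positive")
    beta = 1.0 - data_size / total_capacity   # storage overhead
    erate = failure_rate * total_capacity     # average erasure rate
    return erate / (2.0 * beta)


# Hypothetical system: 1000 nodes of 1 unit each, 900 units of source
# data, and a mean node lifetime of 2 time units (lambda = 0.5).
bound = min_read_rate(n_nodes=1000, node_size=1.0,
                      data_size=900.0, failure_rate=0.5)
print(f"minimum average read rate: about {bound:.0f} units per time unit")
```

Rearranging the same inequality for $dsize$ gives $dsize \le \left(1 - \frac{erate}{2 \cdot rrate}\right) \cdot N \cdot nsize$, which has the same form as the capacity expression in the 2021 abstract if $e$, $r$ and $c$ are read as $erate$, $rrate$ and $nsize$.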

Preprint, 2020, arXiv

SOPI design and analysis for LDN

Liquid Data Networking (LDN) is an ICN architecture designed to enable the benefits of erasure-code-enabled object delivery. A primary contribution of LDN is the introduction of SOPIs, which enable clients to concurrently download encoded data for the same object from multiple edge nodes, optimize caching efficiency, and enable seamless mobility. This paper provides an enhanced design and analysis of SOPIs.