Source author record

Lluis Pamies-Juarez

Lluis Pamies-Juarez appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Information Theory math.IT Networking and Internet Architecture

Catalog footprint

What is connected

5works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2013arXiv

In-Network Redundancy Generation for Opportunistic Speedup of Backup

Erasure coding is a storage-efficient alternative to replication for achieving reliable data backup in distributed storage systems. During the storage process, traditional erasure codes require a unique source node to create and upload all the redundant data to the different storage nodes. However, such a source node may have limited communication and computation capabilities, which constrain the storage process throughput. Moreover, the source node and the different storage nodes might not be able to send and receive data simultaneously -- e.g., nodes might be busy in a datacenter setting, or simply be offline in a peer-to-peer setting -- which can further threaten the efficacy of the overall storage process. In this paper we propose an "in-network" redundancy generation process which distributes the data insertion load among the source and storage nodes by allowing the storage nodes to generate new redundant data by exchanging partial information among themselves, improving the throughput of the storage process. The process is carried out asynchronously, utilizing spare bandwidth and computing resources from the storage nodes. The proposed approach leverages on the local repairability property of newly proposed erasure codes tailor made for the needs of distributed storage systems. We analytically show that the performance of this technique relies on an efficient usage of the spare node resources, and we derive a set of scheduling algorithms to maximize the same. We experimentally show, using availability traces from real peer-to-peer applications as well as Google data center availability and workload traces, that our algorithms can, depending on the environment characteristics, increase the throughput of the storage process significantly (up to 90% in data centers, and 60% in peer-to-peer settings) with respect to the classical naive data insertion approach.

preprint2013arXiv

Locally Repairable Codes with Multiple Repair Alternatives

Distributed storage systems need to store data redundantly in order to provide some fault-tolerance and guarantee system reliability. Different coding techniques have been proposed to provide the required redundancy more efficiently than traditional replication schemes. However, compared to replication, coding techniques are less efficient for repairing lost redundancy, as they require retrieval of larger amounts of data from larger subsets of storage nodes. To mitigate these problems, several recent works have presented locally repairable codes designed to minimize the repair traffic and the number of nodes involved per repair. Unfortunately, existing methods often lead to codes where there is only one subset of nodes able to repair a piece of lost data, limiting the local repairability to the availability of the nodes in this subset. In this paper, we present a new family of locally repairable codes that allows different trade-offs between the number of contacted nodes per repair, and the number of different subsets of nodes that enable this repair. We show that slightly increasing the number of contacted nodes per repair allows to have repair alternatives, which in turn increases the probability of being able to perform efficient repairs. Finally, we present pg-BLRC, an explicit construction of locally repairable codes with multiple repair alternatives, constructed from partial geometries, in particular from Generalized Quadrangles. We show how these codes can achieve practical lengths and high rates, while requiring a small number of nodes per repair, and providing multiple repair alternatives.

preprint2013arXiv

The CORE Storage Primitive: Cross-Object Redundancy for Efficient Data Repair & Access in Erasure Coded Storage

Erasure codes are an integral part of many distributed storage systems aimed at Big Data, since they provide high fault-tolerance for low overheads. However, traditional erasure codes are inefficient on reading stored data in degraded environments (when nodes might be unavailable), and on replenishing lost data (vital for long term resilience). Consequently, novel codes optimized to cope with distributed storage system nuances are vigorously being researched. In this paper, we take an engineering alternative, exploring the use of simple and mature techniques -juxtaposing a standard erasure code with RAID-4 like parity. We carry out an analytical study to determine the efficacy of this approach over traditional as well as some novel codes. We build upon this study to design CORE, a general storage primitive that we integrate into HDFS. We benchmark this implementation in a proprietary cluster and in EC2. Our experiments show that compared to traditional erasure codes, CORE uses 50% less bandwidth and is up to 75% faster while recovering a single failed node, while the gains are respectively 15% and 60% for double node failures.

preprint2012arXiv

RapidRAID: Pipelined Erasure Codes for Fast Data Archival in Distributed Storage Systems

To achieve reliability in distributed storage systems, data has usually been replicated across different nodes. However the increasing volume of data to be stored has motivated the introduction of erasure codes, a storage efficient alternative to replication, particularly suited for archival in data centers, where old datasets (rarely accessed) can be erasure encoded, while replicas are maintained only for the latest data. Many recent works consider the design of new storage-centric erasure codes for improved repairability. In contrast, this paper addresses the migration from replication to encoding: traditionally erasure coding is an atomic operation in that a single node with the whole object encodes and uploads all the encoded pieces. Although large datasets can be concurrently archived by distributing individual object encodings among different nodes, the network and computing capacity of individual nodes constrain the archival process due to such atomicity. We propose a new pipelined coding strategy that distributes the network and computing load of single-object encodings among different nodes, which also speeds up multiple object archival. We further present RapidRAID codes, an explicit family of pipelined erasure codes which provides fast archival without compromising either data reliability or storage overheads. Finally, we provide a real implementation of RapidRAID codes and benchmark its performance using both a cluster of 50 nodes and a set of Amazon EC2 instances. Experiments show that RapidRAID codes reduce a single object's coding time by up to 90%, while when multiple objects are encoded concurrently, the reduction is up to 20%.

preprint2011arXiv

Cost Analysis of Redundancy Schemes for Distributed Storage Systems

Distributed storage infrastructures require the use of data redundancy to achieve high data reliability. Unfortunately, the use of redundancy introduces storage and communication overheads, which can either reduce the overall storage capacity of the system or increase its costs. To mitigate the storage and communication overhead, different redundancy schemes have been proposed. However, due to the great variety of underlaying storage infrastructures and the different application needs, optimizing these redundancy schemes for each storage infrastructure is cumbersome. The lack of rules to determine the optimal level of redundancy for each storage configuration leads developers in industry to often choose simpler redundancy schemes, which are usually not the optimal ones. In this paper we analyze the cost of different redundancy schemes and derive a set of rules to determine which redundancy scheme minimizes the storage and the communication costs for a given system configuration. Additionally, we use simulation to show that theoretically-optimal schemes may not be viable in a realistic setting where nodes can go off-line and repairs may be delayed. In these cases, we identify which are the trade-offs between the storage and communication overheads of the redundancy scheme and its data reliability.

Lluis Pamies-Juarez

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

In-Network Redundancy Generation for Opportunistic Speedup of Backup

Locally Repairable Codes with Multiple Repair Alternatives

The CORE Storage Primitive: Cross-Object Redundancy for Efficient Data Repair & Access in Erasure Coded Storage

RapidRAID: Pipelined Erasure Codes for Fast Data Archival in Distributed Storage Systems

Cost Analysis of Redundancy Schemes for Distributed Storage Systems