Source author record

Guanfeng Liang

Guanfeng Liang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture; Cryptography and Security; Information Theory; math.IT; Performance

Catalog footprint

What is connected

14 works

6 topics

4 close collaborators

Preprint, 2014, arXiv

On Throughput-Delay Optimal Access to Storage Clouds via Load Adaptive Coding and Chunking

Recent literature, including our past work, provides analysis and solutions for using (i) erasure coding, (ii) parallelism, or (iii) variable slicing/chunking (i.e., dividing an object of a specific size into a variable number of smaller chunks) in speeding the I/O performance of storage clouds. However, a comprehensive approach that considers all three dimensions together to achieve the best throughput-delay trade-off curve has been lacking. This paper presents the first set of solutions that can pick the best combination of coding rate and object chunking/slicing options as the load dynamically changes. Our specific contributions are as follows: (1) We establish via measurement that combining variable coding rate and chunking is mostly feasible over a popular public cloud. (2) We relate the delay optimal values for chunking level and code rate to the queue backlogs via an approximate queueing analysis. (3) Based on this analysis, we propose TOFEC that adapts the chunking level and coding rate against the queue backlogs. Our trace-driven simulation results show that TOFEC's adaptation mechanism converges to an appropriate code that provides the optimal throughput-delay trade-off without reducing system capacity. Compared to a non-adaptive strategy optimized for throughput, TOFEC delivers $2.5\times$ lower latency under light workloads; compared to a non-adaptive strategy optimized for latency, TOFEC can scale to support over $3\times$ as many requests. (4) We propose a simpler greedy solution that performs on a par with TOFEC in average delay performance, but exhibits significantly more performance variation.

Preprint, 2014, arXiv

When Queueing Meets Coding: Optimal-Latency Data Retrieving Scheme in Storage Clouds

In this paper, we study the problem of reducing the delay of downloading data from cloud storage systems by leveraging multiple parallel threads, assuming that the data has been encoded and stored in the clouds using fixed rate forward error correction (FEC) codes with parameters (n, k). That is, each file is divided into k equal-sized chunks, which are then expanded into n chunks such that any k chunks out of the n are sufficient to successfully restore the original file. The model can be depicted as a multiple-server queue with arrivals of data retrieving requests and a server corresponding to a thread. However, this is not a typical queueing model because a server can terminate its operation, depending on when other servers complete their service (due to the redundancy that is spread across the threads). Hence, to the best of our knowledge, the analysis of this queueing model remains quite uncharted. Recent traces from Amazon S3 show that the time to retrieve a fixed size chunk is random and can be approximated as a constant delay plus an i.i.d. exponentially distributed random variable. For the tractability of the theoretical analysis, we assume that the chunk downloading time is i.i.d. exponentially distributed. Under this assumption, we show that any work-conserving scheme is delay-optimal among all on-line scheduling schemes when k = 1. When k > 1, we find that a simple greedy scheme, which allocates all available threads to the head of line request, is delay optimal among all on-line scheduling schemes. We also provide some numerical results that point to the limitations of the exponential assumption, and suggest further research directions.
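The exponential assumption in this abstract can be illustrated with a small worked calculation. The sketch below covers only a single request served by n parallel threads with no queueing, which is a simplification of the model in the abstract. It uses the standard order-statistics result that, with i.i.d. Exp(mu) chunk download times, the mean time until any k of the n downloads finish is the sum of 1/((n - i) * mu) for i = 0..k-1. The function name and the default rate are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def expected_kth_completion(n: int, k: int, mu: Fraction = Fraction(1)) -> Fraction:
    """Mean time until k of n parallel i.i.d. Exp(mu) chunk downloads finish.

    Assumption: one request, n threads started together, no queueing.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # While n - i downloads are still running, the time to the next
    # completion is exponential with rate (n - i) * mu (memorylessness).
    return sum((Fraction(1) / ((n - i) * mu) for i in range(k)), Fraction(0))


if __name__ == "__main__":
    # Compare an uncoded (3, 3) download with a (6, 3) code at unit rate.
    print(expected_kth_completion(3, 3))  # 1/3 + 1/2 + 1
    print(expected_kth_completion(6, 3))  # 1/6 + 1/5 + 1/4
```

Under these assumptions the redundant (6, 3) configuration finishes sooner on average than the uncoded (3, 3) one, at the cost of occupying twice as many threads.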

Preprint, 2013, arXiv

FAST CLOUD: Pushing the Envelope on Delay Performance of Cloud Storage with Coding

Our paper presents solutions that can significantly improve the delay performance of putting and retrieving data in and out of cloud storage. We first focus on measuring the delay performance of a very popular cloud storage service, Amazon S3. We establish that there is significant randomness in service times for reading and writing small and medium size objects when assigned distinct keys. We further demonstrate that using erasure coding, parallel connections to storage cloud and limited chunking (i.e., dividing the object into a few smaller objects) together pushes the envelope on service time distributions significantly (e.g., 76%, 80%, and 85% reductions in mean, 90th, and 99th percentiles for 2 Mbyte files) at the expense of additional storage (e.g., 1.75x). However, chunking and erasure coding increase the load and hence the queuing delays while reducing the supportable rate region in number of requests per second per node. Thus, in the second part of our paper we focus on analyzing the delay performance when chunking, FEC, and parallel connections are used together. Based on this analysis, we develop load adaptive algorithms that can pick the best code rate on a per request basis by using off-line computed queue backlog thresholds. The solutions work with homogeneous services with fixed object sizes, chunk sizes, and operation types (e.g., read or write) as well as heterogeneous services with a mixture of object sizes, chunk sizes, and operation types. We also present a simple greedy solution that opportunistically uses idle connections and picks the erasure coding rate accordingly on the fly. Both backlog and greedy solutions support the full rate region and provide the best mean delay performance when compared to the best fixed coding rate policy. Our evaluations show that backlog based solutions achieve better delay performance at higher percentile values than the greedy solution.
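The backlog-threshold idea described in this abstract can be sketched as a table lookup. The following is a minimal illustration, not the paper's algorithm: the threshold values, the (n, k) codes, and the names `BACKLOG_THRESHOLDS`, `CODES`, and `pick_code` are all hypothetical. The only property taken from the abstract is the direction of the trade-off: redundancy adds load, so a longer queue should map to a code with less redundancy.

```python
from bisect import bisect_right
from typing import Tuple

# Hypothetical offline-computed backlog thresholds (illustrative values only).
BACKLOG_THRESHOLDS = [2, 5, 10]

# Hypothetical (n, k) codes, ordered from most to least redundant.
# There is one more code than there are thresholds.
CODES = [(6, 3), (5, 3), (4, 3), (3, 3)]


def pick_code(backlog: int) -> Tuple[int, int]:
    """Return the (n, k) code for the current queue backlog."""
    if backlog < 0:
        raise ValueError("backlog must be non-negative")
    # bisect_right counts how many thresholds the backlog has reached,
    # which is the index of the code to use.
    return CODES[bisect_right(BACKLOG_THRESHOLDS, backlog)]


if __name__ == "__main__":
    for backlog in (0, 3, 7, 50):
        print(backlog, pick_code(backlog))
```

With these made-up numbers, an empty queue gets the (6, 3) code and a queue of 10 or more requests falls back to uncoded (3, 3) retrieval.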

Preprint, 2013, arXiv

On Diagnosis of Forwarding Plane via Static Forwarding Rules in Software Defined Networks

Software Defined Networks (SDN) decouple the forwarding and control planes from each other. The control plane is assumed to have global knowledge of the underlying physical and/or logical network topology so that it can monitor, abstract and control the forwarding plane. In our paper, we present solutions that install an optimal or near-optimal (i.e., within 14% of the optimal) number of static forwarding rules on switches/routers so that any controller can verify the topology connectivity and detect/locate link failures at data plane speeds without relying on state updates from other controllers. Our upper bounds on performance indicate that sub-second link failure localization is possible even at data-center scale networks. For networks with hundreds or a few thousand links, tens of milliseconds of latency is achievable.

Preprint, 2012, arXiv

Byzantine Broadcast in Point-to-Point Networks using Local Linear Coding

The goal of Byzantine Broadcast (BB) is to allow a set of fault-free nodes to agree on information that a source node wants to broadcast to them, in the presence of Byzantine faulty nodes. We consider design of efficient algorithms for BB in synchronous point-to-point networks, where the rate of transmission over each communication link is limited by its "link capacity". The throughput of a particular BB algorithm is defined as the average number of bits that can be reliably broadcast to all fault-free nodes per unit time using the algorithm without violating the link capacity constraints. The capacity of BB in a given network is then defined as the supremum of all achievable BB throughputs in the given network, over all possible BB algorithms. We develop NAB -- a Network-Aware Byzantine broadcast algorithm -- for arbitrary point-to-point networks consisting of $n$ nodes, wherein the number of faulty nodes is at most $f$, $f<n/3$, and the network connectivity is at least $2f+1$. We also prove an upper bound on the capacity of Byzantine broadcast, and conclude that NAB can achieve throughput at least 1/3 of the capacity. When the network satisfies an additional condition, NAB can achieve throughput at least 1/2 of the capacity. To the best of our knowledge, NAB is the first algorithm that can achieve a constant fraction of capacity of Byzantine Broadcast (BB) in arbitrary point-to-point networks.

Preprint, 2012, arXiv

Iterative Approximate Byzantine Consensus in Arbitrary Directed Graphs

In this paper, we explore the problem of iterative approximate Byzantine consensus in arbitrary directed graphs. In particular, we prove a necessary and sufficient condition for the existence of iterative Byzantine consensus algorithms. Additionally, we use our sufficient condition to examine whether such algorithms exist for some specific graphs.

Preprint, 2012, arXiv

Iterative Approximate Byzantine Consensus in Arbitrary Directed Graphs - Part II: Synchronous and Asynchronous Systems

This report contains two related sets of results with different assumptions on synchrony. The first part is about iterative algorithms in synchronous systems. Following our previous work on synchronous iterative approximate Byzantine consensus (IABC) algorithms, we provide a more intuitive tight necessary and sufficient condition for the existence of such algorithms in synchronous networks. We believe this condition and the previous results also hold in the partially asynchronous algorithmic model. In the second part of the report, we explore the problem in asynchronous networks. While the traditional Byzantine consensus is not solvable in asynchronous systems, approximate Byzantine consensus can be solved using iterative algorithms.

Preprint, 2011, arXiv

Capacity of Byzantine Consensus with Capacity-Limited Point-to-Point Links

We consider the problem of maximizing the throughput of Byzantine consensus when communication links have finite capacity. Byzantine consensus is a classical problem in distributed computing. In existing literature, the communication links are implicitly assumed to have infinite capacity. The problem changes significantly when the capacity of links is finite. We define the throughput and capacity of consensus, and identify an upper bound on achievable consensus throughput. We propose an algorithm that achieves consensus capacity in complete four-node networks with at most 1 failure and an arbitrary distribution of link capacities.

Preprint, 2011, arXiv

Error-Free Multi-Valued Consensus with Byzantine Failures

In this paper, we present an efficient deterministic algorithm for consensus in the presence of Byzantine failures. Our algorithm achieves consensus on an $L$-bit value with communication complexity $O(nL + n^4 L^{0.5} + n^6)$ bits, in a network consisting of $n$ processors with up to $t$ Byzantine failures, such that $t<n/3$. For large enough $L$, communication complexity of the proposed algorithm approaches $O(nL)$ bits. In other words, for large $L$, the communication complexity is linear in the number of processors in the network. This is an improvement over the work of Fitzi and Hirt (from PODC 2006), who proposed a probabilistically correct multi-valued Byzantine consensus algorithm with a similar complexity for large $L$. In contrast to the algorithm by Fitzi and Hirt, our algorithm is guaranteed to be always error-free. Our algorithm requires no cryptographic technique, such as authentication, nor any secret sharing mechanism. To the best of our knowledge, we are the first to show that, for large $L$, error-free multi-valued Byzantine consensus on an $L$-bit value is achievable with $O(nL)$ bits of communication.
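The claim that the complexity approaches $O(nL)$ for large $L$ can be checked numerically. The sketch below evaluates the expression $nL + n^4 L^{0.5} + n^6$ divided by $L$, assuming every constant hidden by the big-O notation equals 1. That assumption is purely illustrative, so the absolute numbers mean little; only the trend as $L$ grows is of interest. The function name is hypothetical.

```python
def bits_per_agreed_bit(n: int, L: int) -> float:
    """Evaluate (n*L + n^4 * sqrt(L) + n^6) / L.

    Assumption: all constants hidden by the O(.) notation are taken to be 1.
    """
    if n < 1 or L < 1:
        raise ValueError("n and L must be positive")
    return (n * L + n**4 * L**0.5 + n**6) / L


if __name__ == "__main__":
    n = 4
    for exponent in (3, 6, 9, 12):
        L = 10**exponent
        print(f"L = 10^{exponent}: {bits_per_agreed_bit(n, L):.6f} bits per agreed bit")
```

For fixed $n$, the $n^4 L^{0.5}$ and $n^6$ terms shrink relative to $L$, so the per-bit cost tends to $n$ under these assumed constants.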

Preprint, 2011, arXiv

New Efficient Error-Free Multi-Valued Consensus with Byzantine Failures

In this report, we investigate the multi-valued Byzantine consensus problem. We introduce two algorithms: the first one achieves the traditional validity requirement for consensus, and the second one achieves a stronger "q-validity" requirement. Both algorithms are more efficient than the ones introduced in our recent PODC 2011 paper titled "Error-Free Multi-Valued Consensus with Byzantine Failures".

Preprint, 2010, arXiv

Complexity of Multi-Value Byzantine Agreement

In this paper, we consider the problem of maximizing the throughput of Byzantine agreement, given that the sum capacity of all links between nodes in the system is finite. We propose a highly efficient Byzantine agreement algorithm on values of length $l>1$ bits. This algorithm uses error detecting network codes to ensure that fault-free nodes will never disagree, and a routing scheme that is adaptive to the result of error detection. Our algorithm has a bit complexity of $n(n-1)l/(n-t)$, which leads to a linear cost ($O(n)$) per bit agreed upon, and overcomes the quadratic lower bound ($\Omega(n^2)$) in the literature. Such linear per bit complexity has only been achieved in the literature by allowing a positive probability of error. Our algorithm achieves the linear per bit complexity while guaranteeing agreement is achieved correctly even in the worst case. We also conjecture that our algorithm can be used to achieve agreement throughput arbitrarily close to the agreement capacity of a network, when the sum capacity is given.
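A short worked calculation shows why the stated bit complexity is linear per agreed bit. The sketch below evaluates n(n-1)l/(n-t) exactly. The abstract does not state the fault bound; if one assumes the usual t < n/3 condition for Byzantine agreement without authentication, then n - t > 2n/3 and the per-bit cost n(n-1)/(n-t) stays below 1.5(n-1). The function name is hypothetical.

```python
from fractions import Fraction


def agreement_bits(n: int, t: int, l: int) -> Fraction:
    """Evaluate the bit complexity n(n-1)l/(n-t) quoted in the abstract."""
    if not (0 <= t < n) or l < 1:
        raise ValueError("need 0 <= t < n and l >= 1")
    return Fraction(n * (n - 1) * l, n - t)


if __name__ == "__main__":
    # Four nodes, one fault, 1000-bit value: 4 * 3 * 1000 / 3 bits in total.
    print(agreement_bits(4, 1, 1000))
    # Per-bit cost for n = 10, t = 3 compared with the 1.5 * (n - 1) bound
    # that holds under the assumed t < n/3 condition.
    print(float(agreement_bits(10, 3, 1)), 1.5 * (10 - 1))
```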

Preprint, 2010, arXiv

Deterministic Consensus Algorithm with Linear Per-Bit Complexity

In this report, building on the deterministic multi-valued one-to-many Byzantine agreement (broadcast) algorithm in our recent technical report [2], we introduce a deterministic multi-valued all-to-all Byzantine agreement algorithm (consensus), with linear complexity per bit agreed upon. The discussion in this note is not self-contained, and relies heavily on the material in [2] - please refer to [2] for the necessary background.

Preprint, 2010, arXiv

Multiparty Equality Function Computation in Networks with Point-to-Point Links

In this report, we study the multiparty communication complexity problem of the multiparty equality function (MEQ): EQ(x_1,...,x_n) = 1 if x_1=...=x_n, and 0 otherwise. The input vector (x_1,...,x_n) is distributed among n>=2 nodes, with x_i known to node i, where x_i is chosen from the set {1,...,M}, for some integer M>0. Instead of the "number on the forehead" model, we consider a point-to-point communication model (similar to the message passing model), which we believe is more realistic in networking settings. We assume a synchronous fully connected network of n nodes in which the node IDs (identifiers) are common knowledge. We assume that all point-to-point communication channels/links are private such that when a node transmits, only the designated recipient can receive the message. The identity of the sender is known to the recipient. We demonstrate that traditional techniques generalized from the two-party communication complexity problem are not sufficient to obtain tight bounds under the point-to-point communication model. We then introduce techniques which significantly reduce the space of protocols to study. These techniques are used to study some instances of the MEQ problem.
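The MEQ definition above can be written directly as a function. The sketch below also includes the cost of a naive baseline that is not taken from the report: every node except one sends its full input to a designated node, which then evaluates the function locally. That baseline uses (n - 1) * ceil(log2 M) bits and only lets the designated node learn the result. Both function names are hypothetical.

```python
from typing import Sequence


def meq(xs: Sequence[int]) -> int:
    """EQ(x_1, ..., x_n): 1 if all inputs are equal, 0 otherwise."""
    if len(xs) < 2:
        raise ValueError("MEQ is defined for n >= 2 inputs")
    return 1 if all(x == xs[0] for x in xs) else 0


def naive_cost_bits(n: int, M: int) -> int:
    """Bits sent when n - 1 nodes each send their whole input to one node.

    Assumption: a naive baseline for comparison, not a protocol from the report.
    """
    if n < 2 or M < 1:
        raise ValueError("need n >= 2 and M >= 1")
    # (M - 1).bit_length() equals ceil(log2(M)) for every integer M >= 1.
    return (n - 1) * (M - 1).bit_length()


if __name__ == "__main__":
    print(meq([3, 3, 3]), meq([1, 2, 1]))
    print(naive_cost_bits(3, 6))
```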

Preprint, 2010, arXiv

Short Note on Complexity of Multi-Value Byzantine Agreement

A randomized algorithm that achieves multi-valued Byzantine agreement with high probability and with optimal complexity.
