Source author record

Aditya Ramamoorthy

Aditya Ramamoorthy appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Information Theory; math.IT; Machine Learning; Networking and Internet Architecture; Distributed, Parallel, and Cluster Computing; Cryptography and Security; Data Structures and Algorithms; math.NA; Numerical Analysis

Catalog footprint

What is connected

31 works

9 topics

4 close collaborators

preprint 2022 arXiv

A Unified Treatment of Partial Stragglers and Sparse Matrices in Coded Matrix Computation

The overall execution time of distributed matrix computations is often dominated by slow worker nodes (stragglers) within the clusters. Recently, different coding techniques have been utilized to mitigate the effect of stragglers where worker nodes are assigned the job of processing encoded submatrices of the original matrices. In many machine learning or optimization problems the relevant matrices are often sparse. Several prior coded computation methods operate with dense linear combinations of the original submatrices; this can significantly increase the worker node computation times and consequently the overall job execution time. Moreover, several existing techniques treat the stragglers as failures (erasures) and discard their computations. In this work, we present a coding approach which operates with limited encoding of the original submatrices and utilizes the partial computations done by the slower workers. While our scheme can continue to have the optimal threshold of prior work, it also allows us to trade off the straggler resilience with the worker computation speed for sparse input matrices. Extensive numerical experiments on an AWS (Amazon Web Services) cluster confirm that the proposed approach enhances the speed of the worker computations (and thus the whole process) significantly.

preprint 2022 arXiv

Aspis: Robust Detection for Distributed Learning

State-of-the-art machine learning models are routinely trained on large-scale distributed clusters. Crucially, such systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior and return arbitrary results to the parameter server (PS). This behavior may be attributed to a plethora of reasons, including system failures and orchestrated attacks. Existing work suggests robust aggregation and/or computational redundancy to alleviate the effect of distorted gradients. However, most of these schemes are ineffective when an adversary knows the task assignment and can choose the attacked workers judiciously to induce maximal damage. Our proposed method Aspis assigns gradient computations to worker nodes using a subset-based assignment which allows for multiple consistency checks on the behavior of a worker node. Examination of the calculated gradients and post-processing (clique-finding in an appropriately constructed graph) by the central node allows for efficient detection and subsequent exclusion of adversaries from the training process. We prove the Byzantine resilience and detection guarantees of Aspis under weak and strong attacks and extensively evaluate the system on various large-scale training scenarios. The principal metric for our experiments is the test accuracy, for which we demonstrate a significant improvement of about 30% compared to many state-of-the-art approaches on the CIFAR-10 dataset. The corresponding reduction of the fraction of corrupted gradients ranges from 16% to 99%.

preprint 2022 arXiv

Federated Over-Air Subspace Tracking from Incomplete and Corrupted Data

In this work we study the problem of Subspace Tracking with missing data (ST-miss) and outliers (Robust ST-miss). We propose a novel algorithm, and provide a guarantee for both these problems. Unlike past work on this topic, the current work does not impose the piecewise constant subspace change assumption. Additionally, the proposed algorithm is much simpler (uses fewer parameters) than our previous work. Secondly, we extend our approach and its analysis to provably solving these problems when the data is federated and when the over-air data communication modality is used for information exchange between the $K$ peer nodes and the center. We validate our theoretical claims with extensive numerical experiments.

preprint 2021 arXiv

ByzShield: An Efficient and Robust System for Distributed Training

Training of large scale models on distributed clusters is a critical component of the machine learning pipeline. However, this training can easily be made to fail if some workers behave in an adversarial (Byzantine) fashion whereby they return arbitrary results to the parameter server (PS). A plethora of existing papers consider a variety of attack models and propose robust aggregation and/or computational redundancy to alleviate the effects of these attacks. In this work we consider an omniscient attack model where the adversary has full knowledge about the gradient computation assignments of the workers and can choose to attack (up to) any q out of K worker nodes to induce maximal damage. Our redundancy-based method ByzShield leverages the properties of bipartite expander graphs for the assignment of tasks to workers; this helps to effectively mitigate the effect of the Byzantine behavior. Specifically, we demonstrate an upper bound on the worst case fraction of corrupted gradients based on the eigenvalues of our constructions which are based on mutually orthogonal Latin squares and Ramanujan graphs. Our numerical experiments indicate over a 36% reduction on average in the fraction of corrupted gradients compared to the state of the art. Likewise, our experiments on training followed by image classification on the CIFAR-10 dataset show that ByzShield has on average a 20% advantage in accuracy under the most sophisticated attacks. ByzShield also tolerates a much larger fraction of adversarial nodes compared to prior work.

preprint 2020 arXiv

Asynchronous Coded Caching with Uncoded Prefetching

Coded caching is a technique that promises huge reductions in network traffic in content-delivery networks. However, the original formulation and several subsequent contributions in the area assume that the file requests from the users are synchronized, i.e., they arrive at the server at the same time. In this work we formulate and study the coded caching problem when the file requests from the users arrive at different times. We assume that each user also has a prescribed deadline by which they want their request to be completed. In the offline case, we assume that the server knows the arrival times before starting transmission and in the online case, the user requests are revealed to the server over time. We present a linear programming formulation for the offline case that minimizes the overall rate subject to the constraint that each user meets his/her deadline. While the online case is much harder, we introduce a novel heuristic for the online case and show that under certain conditions, with high probability the request of each user can be satisfied within her/his deadline. Our simulation results indicate that in the presence of mild asynchronism, much of the benefit of coded caching can still be leveraged.

preprint 2020 arXiv

Efficient and Robust Distributed Matrix Computations via Convolutional Coding

Distributed matrix computations -- matrix-matrix or matrix-vector multiplications -- are well-recognized to suffer from the problem of stragglers (slow or failed worker nodes). Much of the prior work in this area either (i) is sub-optimal in terms of its straggler resilience, or (ii) suffers from numerical problems, i.e., there is a blow-up of round-off errors in the decoded result owing to the high condition numbers of the corresponding decoding matrices. Our work presents a convolutional coding approach to this problem that removes these limitations. It is optimal in terms of its straggler resilience, and has excellent numerical robustness as long as the workers' storage capacity is slightly higher than the fundamental lower bound. Moreover, it can be decoded using a fast peeling decoder that only involves add/subtract operations. Our second approach has marginally higher decoding complexity than the first one, but allows us to operate arbitrarily close to the lower bound. Its numerical robustness can be theoretically quantified by deriving a computable upper bound on the worst case condition number over all possible decoding matrices by drawing connections with the properties of large Toeplitz matrices. All of the above claims are backed up by extensive experiments done on the AWS cloud platform.

preprint 2020 arXiv

Resolvable Designs for Speeding up Distributed Computing

Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate values to complete the computation. Experimental evidence suggests that this so-called Shuffle phase can be a significant part of the overall execution time for several classes of jobs. Prior work has demonstrated a natural tradeoff between computation and communication whereby running redundant copies of jobs can reduce the Shuffle traffic load, thereby leading to reduced overall execution times. For a single job, the main drawback of this approach is that it requires the original job to be split into a number of files that grows exponentially in the system parameters. When extended to multiple jobs (with specific function types), these techniques suffer from a limitation of a similar flavor, i.e., they require an exponentially large number of jobs to be executed. In practical scenarios, these requirements can significantly reduce the promised gains of the method. In this work, we show that a class of combinatorial structures called resolvable designs can be used to develop efficient coded distributed computing schemes for both the single and multiple job scenarios considered in prior work. We present both theoretical analysis and exhaustive experimental results (on Amazon EC2 clusters) that demonstrate the performance advantages of our method. For the single and multiple job cases, we obtain speed-ups of 4.69x (and 2.6x over prior work) and 4.31x over the baseline approach, respectively.

preprint 2020 arXiv

Straggler-resistant distributed matrix computation via coding theory

The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The overall speed of a computational job on these clusters is typically dominated by stragglers in the absence of a sophisticated assignment of tasks to the worker nodes. In recent years, approaches based on coding theory (referred to as "coded computation") have been effectively used for straggler mitigation. Coded computation offers significant benefits for specific classes of problems such as distributed matrix computations (which play a crucial role in several parts of the machine learning pipeline). The essential idea is to create redundant tasks so that the desired result can be recovered as long as a certain number of worker nodes complete their tasks. In this survey article, we overview recent developments in the field of coding for straggler-resilient distributed matrix computations.
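The redundancy idea described in this abstract can be illustrated with a toy sketch. The example below is not taken from the survey; it assumes a simple textbook-style code with three workers, where the matrix is split into two row blocks and a third worker receives their sum, so that the product is recoverable from any two of the three workers. The function names (`matvec`, `encode`, `decode`) are hypothetical.

```python
from itertools import combinations

def matvec(M, x):
    """Plain matrix-vector product on lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def encode(A):
    """Split A (even number of rows) into two row blocks and add a parity block A1 + A2."""
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]

def decode(results):
    """Recover A x from the outputs of any two workers, keyed by worker index 0, 1 or 2."""
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        # Have A1 x and (A1 + A2) x: subtract to obtain A2 x.
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:
        # Have A2 x and (A1 + A2) x: subtract to obtain A1 x.
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2

A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
expected = matvec(A, x)

# Each worker multiplies its (possibly coded) block by x.
worker_outputs = {i: matvec(block, x) for i, block in enumerate(encode(A))}

# Any single straggler can be ignored: every pair of workers suffices.
for survivors in combinations(range(3), 2):
    partial = {i: worker_outputs[i] for i in survivors}
    assert decode(partial) == expected
```

In this sketch each worker handles half of the rows, and the job tolerates one straggler at the cost of one extra worker.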

preprint 2016 arXiv

Coded Caching for Networks with the Resolvability Property

Coded caching is a recently proposed technique for dealing with large scale content distribution over the Internet. As in conventional caching, it leverages the presence of local caches at the end users. However, it considers coding in the caches and/or coded transmission from the central server and demonstrates that huge savings in transmission rate are possible when the server and the end users are connected via a single shared link. In this work, we consider a more general topology where there is a layer of relay nodes between the server and the users, e.g., combination networks studied in network coding are an instance of these networks. We propose novel schemes for a class of such networks that satisfy a so-called resolvability property and demonstrate that the performance of our scheme is strictly better than previously proposed schemes.

preprint 2016 arXiv

Coded Caching with Low Subpacketization Levels

Caching is a popular technique in content delivery networks that allows for reductions in transmission rates from the content-hosting server to the end users. Coded caching is a generalization of conventional caching that considers the possibility of coding in the caches and transmitting coded signals from the server. Prior results in this area demonstrate that huge reductions in transmission rates are possible and this makes coded caching an attractive option for the next generation of content-delivery networks. However, these results require that each file hosted in the server be partitioned into a large number (i.e., the subpacketization level) of non-overlapping subfiles. From a practical perspective, this is problematic as it means that prior schemes are only applicable when the size of the files is extremely large. In this work, we propose a novel coded caching scheme that enjoys a significantly lower subpacketization level than prior schemes, while only suffering a marginal increase in the transmission rate. In particular, for a fixed cache size, the scaling with the number of users is such that the increase in transmission rate is negligible, but the decrease in subpacketization level is exponential.
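As a rough illustration of coded caching and subpacketization, the sketch below uses the standard two-user, two-file toy example in which each user can cache the equivalent of one file. It is not the scheme proposed in this paper. Each file is split into two subfiles (a subpacketization level of 2), and a single XOR-coded broadcast serves both users. The file contents and variable names are made up for illustration.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(a, b))

def split(data):
    """Split a file into two equal-size subfiles."""
    half = len(data) // 2
    return data[:half], data[half:]

file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
A1, A2 = split(file_a)
B1, B2 = split(file_b)

# Placement phase: each user caches one subfile of every file.
cache_user1 = {"A1": A1, "B1": B1}
cache_user2 = {"A2": A2, "B2": B2}

# Delivery phase: user 1 requests file A, user 2 requests file B.
# One coded transmission of half a file is useful to both users.
broadcast = xor_bytes(A2, B1)

# User 1 cancels B1 from the broadcast to obtain the missing A2.
recovered_a = cache_user1["A1"] + xor_bytes(broadcast, cache_user1["B1"])
# User 2 cancels A2 from the broadcast to obtain the missing B1.
recovered_b = xor_bytes(broadcast, cache_user2["A2"]) + cache_user2["B2"]

assert recovered_a == file_a and recovered_b == file_b
```

Without coding, the server would send A2 and B1 separately, one full file in total. The coded broadcast is half a file. The number of subfiles required by schemes of this type grows quickly with the number of users, which is the subpacketization issue the abstract refers to.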

preprint 2016 arXiv

Fractional repetition codes with flexible repair from combinatorial designs

Fractional repetition (FR) codes are a class of regenerating codes for distributed storage systems with an exact (table-based) repair process that is also uncoded, i.e., upon failure, a node is regenerated by simply downloading packets from the surviving nodes. In our work, we present constructions of FR codes based on Steiner systems and resolvable combinatorial designs such as affine geometries, Hadamard designs and mutually orthogonal Latin squares. The failure resilience of our codes can be varied in a simple manner. We construct codes with normalized repair bandwidth ($β$) strictly larger than one; these cannot be obtained trivially from codes with $β= 1$. Furthermore, we present the Kronecker product technique for generating new codes from existing ones and elaborate on their properties. FR codes with locality are those where the repair degree is smaller than the number of nodes contacted for reconstructing the stored file. For these codes we establish a tradeoff between the local repair property and failure resilience and construct codes that meet this tradeoff. Much of prior work only provided lower bounds on the FR code rate. In our work, for most of our constructions we determine the code rate for certain parameter ranges.
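The table-based, uncoded repair described here can be sketched with a small example. The code below does not implement the constructions of this paper (which rely on Steiner systems and resolvable designs). It assumes the simpler, well-known placement in which coded symbols are indexed by the edges of a complete graph on four nodes, so every symbol is stored on exactly two nodes. The helper names are hypothetical.

```python
from itertools import combinations

def build_placement(num_nodes):
    """Assign one symbol per edge of the complete graph; a node stores its incident edges."""
    placement = {node: set() for node in range(num_nodes)}
    for symbol, (u, v) in enumerate(combinations(range(num_nodes), 2)):
        placement[u].add(symbol)
        placement[v].add(symbol)
    return placement

def repair_table(placement, failed):
    """For each symbol lost with the failed node, name the surviving node holding the other copy."""
    table = {}
    for symbol in placement[failed]:
        helpers = [node for node, stored in placement.items()
                   if node != failed and symbol in stored]
        table[symbol] = helpers[0]
    return table

placement = build_placement(4)      # 6 symbols, 3 per node, repetition degree 2
table = repair_table(placement, failed=0)
print(table)                        # maps each lost symbol to its helper node
```

In this toy placement the failed node downloads exactly one packet from each of the three surviving nodes, which corresponds to the case $β= 1$. No decoding is needed during repair because the packets are copied as stored.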

preprint 2016 arXiv

Improved Lower Bounds for Coded Caching

Content delivery networks often employ caching to reduce transmission rates from the central server to the end users. Recently, the technique of coded caching was introduced whereby coding in the caches and coded transmission signals from the central server are considered. Prior results in this area demonstrate that carefully designing the placement of content in the caches and designing appropriate coded delivery signals from the server allow for a system where the delivery rates can be significantly smaller than conventional schemes. However, matching upper and lower bounds on the transmission rate have not yet been obtained. In this work, we derive tighter lower bounds on the coded caching rate than were known previously. We demonstrate that this problem can equivalently be posed as a combinatorial problem of optimally labeling the leaves of a directed tree. Our proposed labeling algorithm allows for significantly improved lower bounds on the coded caching rate. Furthermore, we study certain structural properties of our algorithm that allow us to analytically quantify improvements on the rate lower bound for general values of the problem parameters. This allows us to obtain a multiplicative gap of at most four between the achievable rate and our lower bound.

preprint 2016 arXiv

On Computation Rates for Arithmetic Sum

For zero-error function computation over directed acyclic networks, existing upper and lower bounds on the computation capacity are known to be loose. In this work we consider the problem of computing the arithmetic sum over a specific directed acyclic network that is not a tree. We assume the sources to be i.i.d. Bernoulli with parameter $1/2$. Even in this simple setting, we demonstrate that upper bounding the computation rate is quite nontrivial. In particular, it requires us to consider variable length network codes and relate the upper bound to equivalently lower bounding the entropy of descriptions observed by the terminal conditioned on the function value. This lower bound is obtained by further lower bounding the entropy of a so-called clumpy distribution. We also demonstrate an achievable scheme that uses variable length network codes and in-network compression.

preprint 2016 arXiv

Sum-networks from undirected graphs: construction and capacity analysis

We consider a directed acyclic network with multiple sources and multiple terminals where each terminal is interested in decoding the sum of independent sources generated at the source nodes. We describe a procedure whereby a simple undirected graph can be used to construct such a sum-network and demonstrate an upper bound on its computation rate. Furthermore, we show sufficient conditions for the construction of a linear network code that achieves this upper bound. Our procedure allows us to construct sum-networks that have any arbitrary computation rate $\frac{p}{q}$ (where $p,q$ are non-negative integers). Our work significantly generalizes a previous approach for constructing sum-networks with arbitrary capacities. Specifically, we answer an open question in prior work by demonstrating sum-networks with significantly fewer sources and terminals.

preprint 2015 arXiv

Capacity of Sum-networks for Different Message Alphabets

A sum-network is a directed acyclic network in which all terminal nodes demand the `sum' of the independent information observed at the source nodes. Many characteristics of the well-studied multiple-unicast network communication problem also hold for sum-networks due to a known reduction between instances of these two problems. Our main result is that unlike a multiple unicast network, the coding capacity of a sum-network is dependent on the message alphabet. We demonstrate this using a construction procedure and show that the choice of a message alphabet can reduce the coding capacity of a sum-network from $1$ to close to $0$.

preprint 2013 arXiv

Communicating the sum of sources over a network

We consider the network communication scenario, over directed acyclic networks with unit capacity edges, in which a number of sources $s_i$ each holding independent unit-entropy information $X_i$ wish to communicate the sum $\sum{X_i}$ to a set of terminals $t_j$. We show that in the case in which there are only two sources or only two terminals, communication is possible if and only if each source terminal pair $s_i/t_j$ is connected by at least a single path. For the more general communication problem in which there are three sources and three terminals, we prove that a single path connecting the source terminal pairs does not suffice to communicate $\sum{X_i}$. We then present an efficient encoding scheme which enables the communication of $\sum{X_i}$ for the three sources, three terminals case, given that each source terminal pair is connected by two edge disjoint paths.

preprint 2013 arXiv

On the multiple unicast capacity of 3-source, 3-terminal directed acyclic networks

We consider the multiple unicast problem with three source-terminal pairs over directed acyclic networks with unit-capacity edges. The three $s_i-t_i$ pairs wish to communicate at unit-rate via network coding. The connectivity between the $s_i - t_i$ pairs is quantified by means of a connectivity level vector, $[k_1 k_2 k_3]$ such that there exist $k_i$ edge-disjoint paths between $s_i$ and $t_i$. In this work we attempt to classify networks based on the connectivity level. It can be observed that unit-rate transmission can be supported by routing if $k_i \geq 3$, for all $i = 1, \dots, 3$. In this work, we consider connectivity level vectors such that $\min_{i = 1, \dots, 3} k_i < 3$. We present either a constructive linear network coding scheme or an instance of a network that cannot support the desired unit-rate requirement, for all such connectivity level vectors except the vector $[1~2~4]$ (and its permutations). The benefits of our schemes extend to networks with higher and potentially different edge capacities. Specifically, our experimental results indicate that for networks where the different source-terminal paths have a significant overlap, our constructive unit-rate schemes can be packed along with routing to provide higher throughput as compared to a pure routing approach.

preprint 2013 arXiv

PREMIER - PRobabilistic Error-correction using Markov Inference in Errored Reads

In this work we present a flexible, probabilistic and reference-free method of error correction for high throughput DNA sequencing data. The key is to exploit the high coverage of sequencing data and model short sequence outputs as independent realizations of a Hidden Markov Model (HMM). We pose the problem of error correction of reads as one of maximum likelihood sequence detection over this HMM. While time and memory considerations rule out an implementation of the optimal Baum-Welch algorithm (for parameter estimation) and the optimal Viterbi algorithm (for error correction), we propose low-complexity approximate versions of both. Specifically, we propose an approximate Viterbi and a sequential decoding based algorithm for the error correction. Our results show that when compared with Reptile, a state-of-the-art error correction method, our methods consistently achieve superior performances on both simulated and real data sets.

preprint 2013 arXiv

Replication based storage systems with local repair

We consider the design of regenerating codes for distributed storage systems that enjoy the property of local, exact and uncoded repair, i.e., (a) upon failure, a node can be regenerated by simply downloading packets from the surviving nodes and (b) the number of surviving nodes contacted is strictly smaller than the number of nodes that need to be contacted for reconstructing the stored file. Our codes consist of an outer MDS code and an inner fractional repetition code that specifies the placement of the encoded symbols on the storage nodes. For our class of codes, we identify the tradeoff between the local repair property and the minimum distance. We present codes based on graphs of high girth, affine resolvable designs and projective planes that meet the minimum distance bound for specific choices of file sizes.

preprint 2012 arXiv

Repairable Replication-based Storage Systems Using Resolvable Designs

We consider the design of regenerating codes for distributed storage systems at the minimum bandwidth regeneration (MBR) point. The codes allow for a repair process that is exact and uncoded, but table-based. These codes were introduced in prior work and consist of an outer MDS code followed by an inner fractional repetition (FR) code where copies of the coded symbols are placed on the storage nodes. The main challenge in this domain is the design of the inner FR code. In our work, we consider generalizations of FR codes, by establishing their connection with a family of combinatorial structures known as resolvable designs. Our constructions based on affine geometries, Hadamard designs and mutually orthogonal Latin squares allow the design of systems where a new node can be exactly regenerated by downloading $β\geq 1$ packets from a subset of the surviving nodes (prior work only considered the case of $β= 1$). Our techniques allow the design of systems over a large range of parameters. Specifically, the repetition degree of a symbol, which dictates the resilience of the system can be varied over a large range in a simple manner. Moreover, the actual table needed for the repair can also be implemented in a rather straightforward way. Furthermore, we answer an open question posed in prior work by demonstrating the existence of codes with parameters that are not covered by Steiner systems.

preprint 2011 arXiv

A note on the multiple unicast capacity of directed acyclic networks

We consider the multiple unicast problem under network coding over directed acyclic networks with unit capacity edges. There is a set of n source-terminal (s_i - t_i) pairs that wish to communicate at unit rate over this network. The connectivity between the s_i - t_i pairs is quantified by means of a connectivity level vector, [k_1 k_2 ... k_n] such that there exist k_i edge-disjoint paths between s_i and t_i. Our main aim is to characterize the feasibility of achieving this for different values of n and [k_1 ... k_n]. For 3 unicast connections (n = 3), we characterize several achievable and unachievable values of the connectivity 3-tuple. In addition, in this work, we have found certain network topologies, and capacity characterizations that are useful in understanding the case of general n.

preprint 2011 arXiv

Algebraic codes for Slepian-Wolf code design

Practical constructions of lossless distributed source codes (for the Slepian-Wolf problem) have been the subject of much investigation in the past decade. In particular, near-capacity achieving code designs based on LDPC codes have been presented for the case of two binary sources, with a binary-symmetric correlation. However, constructing practical codes for the case of non-binary sources with arbitrary correlation remains by and large open. From a practical perspective it is also interesting to consider coding schemes whose performance remains robust to uncertainties in the joint distribution of the sources. In this work we propose the usage of Reed-Solomon (RS) codes for the asymmetric version of this problem. We show that algebraic soft-decision decoding of RS codes can be used effectively under certain correlation structures. In addition, RS codes offer natural rate adaptivity and performance that remains constant across a family of correlation structures with the same conditional entropy. The performance of RS codes is compared with dedicated and rate adaptive multistage LDPC codes (Varodayan et al. '06), where each LDPC code is used to compress the individual bit planes. Our simulations show that in the classical Slepian-Wolf scenario, RS codes outperform both dedicated and rate-adaptive LDPC codes under $q$-ary symmetric correlation, and are better than rate-adaptive LDPC codes in the case of sparse correlation models, where the conditional distribution of the sources has only a few dominant entries. In a feedback scenario, the performance of RS codes is comparable with both designs of LDPC codes. Our simulations also demonstrate that the performance of RS codes in the presence of inaccuracies in the joint distribution of the sources is much better than that of multistage LDPC codes.

preprint 2011 arXiv

An achievable region for the double unicast problem based on a minimum cut analysis

We consider the multiple unicast problem under network coding over directed acyclic networks when there are two source-terminal pairs, $s_1-t_1$ and $s_2-t_2$. Current characterizations of the multiple unicast capacity region in this setting have a large number of inequalities, which makes them hard to explicitly evaluate. In this work we consider a slightly different problem. We assume that we only know certain minimum cut values for the network, e.g., mincut$(S_i, T_j)$, where $S_i \subseteq \{s_1, s_2\}$ and $T_j \subseteq \{t_1, t_2\}$ for different subsets $S_i$ and $T_j$. Based on these values, we propose an achievable rate region for this problem based on linear codes. Towards this end, we begin by defining a base region where both sources are multicast to both the terminals. Following this we enlarge the region by appropriately encoding the information at the source nodes, such that terminal $t_i$ is only guaranteed to decode information from the intended source $s_i$, while decoding a linear function of the other source. The rate region takes different forms depending upon the relationship of the different cut values in the network.

preprint 2011 arXiv

Degrees of Freedom Region for an Interference Network with General Message Demands

We consider a single hop interference network with $K$ transmitters and $J$ receivers, all having $M$ antennas. Each transmitter emits an independent message and each receiver requests an arbitrary subset of the messages. This generalizes the well-known $K$-user $M$-antenna interference channel, where each message is requested by a unique receiver. For our setup, we derive the degrees of freedom (DoF) region. The achievability scheme generalizes the interference alignment schemes proposed by Cadambe and Jafar. In particular, we achieve general points in the DoF region by using multiple base vectors and aligning all interferers at a given receiver to the interferer with the largest DoF. As a byproduct, we obtain the DoF region for the original interference channel. We also discuss extensions of our approach where the same region can be achieved by considering a reduced set of interference alignment constraints, thus reducing the time-expansion duration needed. The DoF region for the considered system depends only on a subset of receivers whose demands meet certain characteristics. The geometric shape of the DoF region is also discussed.

preprint 2010 arXiv

Improved Combinatorial Algorithms for Wireless Information Flow

The work of Avestimehr et al. '07 has recently proposed a deterministic model for wireless networks and characterized the unicast capacity C of such networks as the minimum rank of the adjacency matrices describing all possible source-destination cuts. Amaudruz & Fragouli first proposed a polynomial-time algorithm for finding the unicast capacity of a linear deterministic wireless network in their 2009 paper. In this work, we improve upon Amaudruz & Fragouli's work and further reduce the computational complexity of the algorithm by fully exploring the useful combinatorial features intrinsic in the problem. Our improvement applies generally to finite fields of any size associated with the channel model. Compared with other algorithms for solving the same problem, our improved algorithm is very competitive in terms of complexity.

preprint 2010 arXiv

Maximum-Likelihood Sequence Detector for Dynamic Mode High Density Probe Storage

There is an increasing need for high density data storage devices driven by the increased demand of consumer electronics. In this work, we consider a data storage system that operates by encoding information as topographic profiles on a polymer medium. A cantilever probe with a sharp tip (a few nm radius) is used to create and sense the presence of topographic profiles, resulting in a density of a few Tb per in$^2$. The prevalent mode of using the cantilever probe is the static mode that is harsh on the probe and the media. In this article, the high quality factor dynamic mode operation, which is less harsh on the media and the probe, is analyzed. The read operation is modeled as a communication channel which incorporates system memory due to inter-symbol interference and the cantilever state. We demonstrate an appropriate level of abstraction of this complex nanoscale system that obviates the need for an involved physical model. Next, a solution to the maximum likelihood sequence detection problem based on the Viterbi algorithm is devised. Experimental and simulation results demonstrate that the performance of this detector is several orders of magnitude better than the performance of other existing schemes.

preprint 2010 arXiv

Minimum cost mirror sites using network coding: Replication vs. coding at the source nodes

Content distribution over networks is often achieved by using mirror sites that hold copies of files or portions thereof to avoid congestion and delay issues arising from excessive demands to a single location. Accordingly, there are distributed storage solutions that divide the file into pieces and place copies of the pieces (replication) or coded versions of the pieces (coding) at multiple source nodes. We consider a network which uses network coding for multicasting the file. There is a set of source nodes that contains either subsets or coded versions of the pieces of the file. The cost of a given storage solution is defined as the sum of the storage cost and the cost of the flows required to support the multicast. Our interest is in finding the storage capacities and flows at minimum combined cost. We formulate the corresponding optimization problems by using the theory of information measures. In particular, we show that when there are two source nodes, there is no loss in considering subset sources. For three source nodes, we derive a tight upper bound on the cost gap between the coded and uncoded cases. We also present algorithms for determining the content of the source nodes.

preprint 2010 arXiv

Overlay Protection Against Link Failures Using Network Coding

This paper introduces a network coding-based protection scheme against single and multiple link failures. The proposed strategy ensures that in a connection, each node receives two copies of the same data unit: one copy on the working circuit, and a second copy that can be extracted from linear combinations of data units transmitted on a shared protection path. This guarantees instantaneous recovery of data units upon the failure of a working circuit. The strategy can be implemented at an overlay layer, which makes its deployment simple and scalable. While the proposed strategy is similar in spirit to the work of Kamal '07 & '10, there are significant differences. In particular, it provides protection against multiple link failures. The new scheme is simpler, less expensive, and does not require the synchronization required by the original scheme. The sharing of the protection circuit by a number of connections is the key to the reduction of the cost of protection. The paper also conducts a comparison of the cost of the proposed scheme to the 1+1 and shared backup path protection (SBPP) strategies, and establishes the benefits of our strategy.
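
The idea of extracting a second copy from linear combinations on a shared protection path can be illustrated with a toy single-failure example over GF(2). This is only a sketch of the general principle, not the paper's protocol: it assumes equal-length data units, a protection path that carries the XOR of all units, and a receiver that already holds every unit except the lost one. All names are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length data units."""
    return bytes(x ^ y for x, y in zip(a, b))

def protection_signal(units):
    """Linear combination (XOR) of all data units sent on the shared path."""
    return reduce(xor_bytes, units)

def recover(protection, known_units):
    """Recover the one missing unit by cancelling the known units."""
    return reduce(xor_bytes, known_units, protection)

# Usage: three connections share one protection path; the second unit is lost.
units = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
shared = protection_signal(units)
print(recover(shared, [units[0], units[2]]) == units[1])
```

Because the combination is already in flight on the protection path, recovery needs no failure detection or rerouting step, which is the sense in which such schemes are described as instantaneous.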

preprint 2010 arXiv

Performance evaluation for ML sequence detection in ISI channels with Gauss Markov Noise

Inter-symbol interference (ISI) channels with data dependent Gauss-Markov noise have been used to model read channels in magnetic recording and other data storage systems. The Viterbi algorithm can be adapted for performing maximum likelihood sequence detection in such channels. However, the problem of finding an analytical upper bound on the bit error rate (BER) of the Viterbi detector in this case has not been fully investigated. Current techniques rely on an exhaustive enumeration of short error events and determine the BER using a union bound. In this work, we consider a subset of the class of ISI channels with data dependent Gauss-Markov noise. We derive an upper bound on the pairwise error probability (PEP) between the transmitted bit sequence and the decoded bit sequence that can be expressed as a product of functions depending on current and previous states in the (incorrect) decoded sequence and the (correct) transmitted sequence. In general, the PEP is asymmetric. The average BER over all possible bit sequences is then determined using a pairwise state diagram. Simulation results corroborate the analysis and demonstrate that the analytic upper bound on the BER is tight in the high SNR regime. In this regime, the proposed upper bound obviates the need for computationally expensive simulation.

preprint 2010 arXiv

Protection against link errors and failures using network coding

We propose a network-coding based scheme to protect multiple bidirectional unicast connections against adversarial errors and failures in a network. The network consists of a set of bidirectional primary path connections that carry the uncoded traffic. The end nodes of the bidirectional connections are connected by a set of shared protection paths that provide the redundancy required for protection. Such protection strategies are employed in the domain of optical networks for recovery from failures. In this work we consider the problem of simultaneous protection against adversarial errors and failures. Suppose that n_e paths are corrupted by the omniscient adversary. Under our proposed protocol, the errors can be corrected at all the end nodes with 4n_e protection paths. More generally, if there are n_e adversarial errors and n_f failures, 4n_e + 2n_f protection paths are sufficient. The number of protection paths only depends on the number of errors and failures being protected against and is independent of the number of unicast connections.
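
The path counts stated above reduce to simple arithmetic, sketched here as a small helper. The function name is hypothetical, and the value returned is the sufficient number of protection paths quoted in the abstract, not a claim about the minimum necessary.

```python
def sufficient_protection_paths(n_errors, n_failures=0):
    """Sufficient protection paths per the abstract: 4*n_e + 2*n_f."""
    if n_errors < 0 or n_failures < 0:
        raise ValueError("counts must be non-negative")
    return 4 * n_errors + 2 * n_failures

# Usage: two adversarial errors plus three failures.
print(sufficient_protection_paths(2, 3))
```

The helper takes no argument for the number of unicast connections, mirroring the abstract's point that the count is independent of it.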

preprint 2007 arXiv

The Design of Efficiently-Encodable Rate-Compatible LDPC Codes

We present a new class of irregular low-density parity-check (LDPC) codes for moderate block lengths (up to a few thousand bits) that are well-suited for rate-compatible puncturing. The proposed codes show good performance under puncturing over a wide range of rates and are suitable for use in incremental redundancy hybrid automatic repeat request (ARQ) systems. In addition, these codes are linear-time encodable with simple shift-register circuits. For a block length of 1200 bits, the codes outperform optimized irregular LDPC codes and extended irregular repeat-accumulate (eIRA) codes for all puncturing rates from 0.6 to 0.9 (base code performance is almost the same) and are particularly good at high puncturing rates, where good puncturing performance has previously been difficult to achieve.

31 published items

A Unified Treatment of Partial Stragglers and Sparse Matrices in Coded Matrix Computation

Aspis: Robust Detection for Distributed Learning

Federated Over-Air Subspace Tracking from Incomplete and Corrupted Data

ByzShield: An Efficient and Robust System for Distributed Training

Asynchronous Coded Caching with Uncoded Prefetching

Efficient and Robust Distributed Matrix Computations via Convolutional Coding

Resolvable Designs for Speeding up Distributed Computing

Straggler-resistant distributed matrix computation via coding theory

Coded Caching for Networks with the Resolvability Property

Coded Caching with Low Subpacketization Levels

Fractional repetition codes with flexible repair from combinatorial designs

Improved Lower Bounds for Coded Caching

On Computation Rates for Arithmetic Sum

Sum-networks from undirected graphs: construction and capacity analysis

Capacity of Sum-networks for Different Message Alphabets

Communicating the sum of sources over a network

On the multiple unicast capacity of 3-source, 3-terminal directed acyclic networks

PREMIER - PRobabilistic Error-correction using Markov Inference in Errored Reads

Replication based storage systems with local repair

Repairable Replication-based Storage Systems Using Resolvable Designs

A note on the multiple unicast capacity of directed acyclic networks

Algebraic codes for Slepian-Wolf code design

An achievable region for the double unicast problem based on a minimum cut analysis

Degrees of Freedom Region for an Interference Network with General Message Demands

Improved Combinatorial Algorithms for Wireless Information Flow

Maximum-Likelihood Sequence Detector for Dynamic Mode High Density Probe Storage

Minimum cost mirror sites using network coding: Replication vs. coding at the source nodes

Overlay Protection Against Link Failures Using Network Coding

Performance evaluation for ML sequence detection in ISI channels with Gauss Markov Noise

Protection against link errors and failures using network coding

The Design of Efficiently-Encodable Rate-Compatible LDPC Codes