Source author record

Emina Soljanin

Emina Soljanin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Information Theory; math.IT; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture; Performance; quant-ph; Data Structures and Algorithms; Cryptography and Security; Databases; Discrete Mathematics; math.CO; math.OC

Catalog footprint

What is connected

36 works

12 topics

4 close collaborators


preprint, 2026, arXiv

On the Service Rate Region of Reed-Muller Codes

We study the Service Rate Region of Reed-Muller codes in the context of distributed storage systems. The service rate region is a convex polytope comprising all achievable data access request rates under a given coding scheme. It represents a critical metric for evaluating system efficiency and scalability. Using the geometric properties of Reed-Muller codes, we characterize recovery sets for data objects, including their existence, uniqueness, and enumeration. This analysis reveals a connection between recovery sets and minimum-weight codewords in the dual Reed-Muller code, providing a framework for identifying those recovery sets. Leveraging these results, we derive explicit and tight bounds on the maximal achievable demand for individual data objects, thereby defining the maximal simplex within the service rate region and the smallest simplex containing it. These two provide a tight approximation of the service rate region of Reed-Muller codes.

preprint, 2026, arXiv

Optimum 1-Step Majority-Logic Decoding of Binary Reed-Muller Codes

The classical majority-logic decoder proposed by Reed for Reed-Muller codes RM(r, m) of order r and length 2^m unfolds in r+1 sequential steps, decoding message symbols from highest to lowest degree. Several follow-up decoding algorithms reduced the number of steps, but only for a limited set of parameters, at the expense of reduced performance, or by relying on the existence of certain combinatorial structures. We show that any one-step majority-logic decoder (that is, a decoder performing all majority votes in one step simultaneously, without sequential processing) can correct at most d_min/4 errors for all values of r and m, where d_min denotes the code's minimum distance. We then introduce a new hard-decision decoder that completes the decoding in a single step and attains this error-correction limit. It applies to all r and m, and can be viewed as a parallel realization of Reed's original algorithm, decoding all message symbols simultaneously. Remarkably, we also prove that the decoder is optimum in the erasure setting: it recovers the message from any erasure pattern of up to d_min-1 symbols, which is the theoretical limit. To our knowledge, this is the first 1-step decoder for RM codes that achieves both optimal erasure correction and the maximum one-step error correction capability.
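To make the quantities in this abstract concrete, the following minimal sketch computes the standard parameters of RM(r, m) (length 2^m, dimension equal to the sum of binomial coefficients C(m, i) for i up to r, and minimum distance 2^(m-r)) together with the two limits quoted above. The function name `rm_parameters` and the dictionary keys are hypothetical; the d_min/4 and d_min-1 limits are taken from the abstract, not derived here, and the decoder itself is not implemented.

```python
from math import comb


def rm_parameters(r: int, m: int) -> dict:
    """Standard parameters of the binary Reed-Muller code RM(r, m)."""
    if not 0 <= r <= m:
        raise ValueError("require 0 <= r <= m")
    d_min = 2 ** (m - r)
    return {
        "length": 2 ** m,
        "dimension": sum(comb(m, i) for i in range(r + 1)),
        "d_min": d_min,
        # Limits as stated in the abstract (assumed, not derived here).
        "one_step_error_limit": d_min / 4,
        "erasure_limit": d_min - 1,
    }


params = rm_parameters(2, 5)
print(params)
```

For RM(2, 5), for instance, the code has length 32, dimension 16 and minimum distance 8, so the stated one-step limit is 2 errors or 7 erasures.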

preprint, 2023, arXiv

Information Rates with Non Ideal Photon Detectors in Time-Entanglement Based QKD

We develop new methods of quantifying the impact of photon detector imperfections on achievable secret key rates in Time-Entanglement based Quantum Key Distribution (QKD). We address photon detection timing jitter, detector downtime, and photon dark counts and show how each may decrease the maximum achievable secret key rate in different ways. We begin with a standard Discrete Memoryless Channel (DMC) model to get a good bound on the mutual information lost due to the timing jitter, then introduce a novel Markov Chain (MC) based model to characterize the effect of detector downtime and show how it introduces memory to the key generation process. Finally, we propose a new method of including dark counts in the analysis that shows how dark counts can be especially detrimental when using the common Pulse Position Modulation (PPM) for key generation. Our results show that these three imperfections can significantly reduce the achievable secret key rate when using PPM for QKD. Additionally, one of our main results is providing tooling for experimentalists to predict their systems' achievable secret key rate given the detector specifications.

preprint, 2023, arXiv

Time-Entanglement QKD: Secret Key Rates and Information Reconciliation Coding

In time entanglement-based quantum key distribution (QKD), Alice and Bob extract the raw key bits from the (identical) arrival times of entangled photon pairs by time-binning. Each of them individually discretizes time into bins and groups them into frames. They retain only the frames with a single occupied bin. Thus, Alice and Bob can use the position of the occupied bin within a frame to generate random key bits, as in PPM modulation. Because of entanglement, their occupied bins and their keys should be identical. However, practical photon detectors suffer from time jitter errors. These errors cause discrepancies between Alice's and Bob's keys. Alice sends information to Bob through the public channel to reconcile the keys. The amount of information determines the secret key rate. This paper computes the secret key rates possible with detector jitter errors and constructs codes for information reconciliation to approach these rates.
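The time-binning step described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's procedure: the helper `extract_key_bits`, its parameters, and the sample arrival times are all hypothetical, the frame size is assumed to be a power of two, and the step in which Alice and Bob agree on which frames to keep is omitted.

```python
def extract_key_bits(arrival_times, bin_width, bins_per_frame):
    """Raw key bits from arrival times: keep frames with one occupied bin."""
    bits_per_frame = bins_per_frame.bit_length() - 1
    if bins_per_frame < 2 or (1 << bits_per_frame) != bins_per_frame:
        raise ValueError("bins_per_frame must be a power of two >= 2")
    occupied = {}  # frame index -> set of occupied bin positions
    for t in arrival_times:
        bin_index = int(t // bin_width)
        frame, position = divmod(bin_index, bins_per_frame)
        occupied.setdefault(frame, set()).add(position)
    key = []
    for frame in sorted(occupied):
        positions = occupied[frame]
        if len(positions) == 1:  # discard frames with several occupied bins
            (position,) = positions
            key.append(format(position, f"0{bits_per_frame}b"))
    return "".join(key)


# Hypothetical arrival times; Bob's second detection is shifted by jitter.
alice_key = extract_key_bits([0.2, 6.5, 9.1, 10.3], bin_width=1.0, bins_per_frame=4)
bob_key = extract_key_bits([0.2, 7.1, 9.1, 10.3], bin_width=1.0, bins_per_frame=4)
print(alice_key, bob_key)
```

In this toy run the third frame contains two occupied bins and is dropped by both parties, while the jittered detection moves Bob's photon into a neighboring bin, so the two raw keys differ in one bit. Discrepancies of this kind are what information reconciliation has to correct.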

preprint, 2022, arXiv

Balanced Nonadaptive Redundancy Scheduling

Distributed computing systems implement redundancy to reduce the job completion time and variability. Despite a large body of work about computing redundancy, the analytical performance evaluation of redundancy techniques in queuing systems is still an open problem. In this work, we take one step forward to analyze the performance of scheduling policies in systems with redundancy. In particular, we study the pattern of shared servers among replicas of different jobs. To this end, we employ combinatorics and graph theory and define and derive performance indicators using the statistics of the overlaps. We consider two classical nonadaptive scheduling policies: random and round-robin. We then propose a scheduling policy based on combinatorial block designs. Compared with conventional scheduling, the proposed scheduling improves the performance indicators. We study the expansion property of the graphs associated with round-robin and block design-based policies. It turns out the superior performance of the block design-based policy results from better expansion properties of its associated graph. As indicated by the performance indicators, the simulation results show that the block design-based policy outperforms random and round-robin scheduling in different scenarios. Specifically, it reduces the average waiting time in the queue to up to 25% compared to the random policy and up to 100% compared to the round-robin policy.

preprint, 2022, arXiv

Dual-Code Bounds on Multiple Concurrent (Local) Data Recovery

We are concerned with linear redundancy storage schemes regarding their ability to provide concurrent (local) recovery of multiple data objects. This paper initiates a study of such systems within the classical coding theory. We show how we can use the structural properties of the generator matrix defining the scheme to obtain a bounding polytope for the set of data access rates the system can support. We derive two dual distance outer bounds, which are sharp for some large classes of matrix families.

preprint, 2021, arXiv

Evaluating Load Balancing Performance in Distributed Storage with Redundancy

To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at $d$ different nodes, and each node stores the same number of objects. In our model, the load offered for the objects is sampled uniformly at random from all the load vectors with a fixed cumulative value. We find that the load balance in a system of $n$ nodes improves multiplicatively with $d$ as long as $d = o\left(\log(n)\right)$, and improves exponentially once $d = \Theta\left(\log(n)\right)$. We show that the load balance improves in the same way with $d$ when the service choices are created with XORs of $r$ objects rather than object replicas. In such redundancy schemes, storage overhead is reduced multiplicatively by $r$. However, recovery of an object requires downloading content from $r$ nodes. At the same time, the load balance increases additively by $r$. We express the system's load balance in terms of the maximal spacing or maximum of $d$ consecutive spacings between the order statistics of uniform random variables. Using this connection and the limit results on the maximal $d$-spacings, we derive our main results.

preprint, 2020, arXiv

A Combinatorial View of the Service Rates of Codes Problem, its Equivalence to Fractional Matching and its Connection with Batch Codes

We propose a novel technique for constructing a graph representation of a code through which we establish a significant connection between the service rate problem and the well-known fractional matching problem. Using this connection, we show that the service capacity of a coded storage system equals the fractional matching number in the graph representation of the code, and thus is lower bounded and upper bounded by the matching number and the vertex cover number, respectively. This is of great interest because if the graph representation of a code is bipartite, then the derived upper and lower bounds are equal, and we obtain the capacity. Leveraging this result, we characterize the service capacity of the binary simplex code whose graph representation, as we show, is bipartite. Moreover, we show that the service rate problem can be viewed as a generalization of the multiset primitive batch codes problem.
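The sandwich between the matching number and the vertex cover number can be checked by brute force on toy graphs. The sketch below does not construct the paper's graph representation of a code; the two example graphs, and the function names `matching_number` and `vertex_cover_number`, are hypothetical illustrations only.

```python
from itertools import combinations


def matching_number(edges):
    """Size of a largest set of pairwise vertex-disjoint edges (brute force)."""
    for size in range(len(edges), 0, -1):
        for subset in combinations(edges, size):
            endpoints = [v for edge in subset for v in edge]
            if len(endpoints) == len(set(endpoints)):
                return size
    return 0


def vertex_cover_number(edges):
    """Size of a smallest vertex set touching every edge (brute force)."""
    vertices = sorted({v for edge in edges for v in edge})
    for size in range(len(vertices) + 1):
        for cover in combinations(vertices, size):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return size
    return len(vertices)


path = [("a", "b"), ("b", "c"), ("c", "d")]      # bipartite
triangle = [("x", "y"), ("y", "z"), ("x", "z")]  # not bipartite

print(matching_number(path), vertex_cover_number(path))
print(matching_number(triangle), vertex_cover_number(triangle))
```

On the bipartite path the two bounds coincide at 2, so they pin down the fractional matching number. On the triangle they are 1 and 2, and the fractional matching number (3/2, obtained by putting weight 1/2 on every edge) lies strictly between them.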

preprint, 2020, arXiv

A Geometric View of the Service Rates of Codes Problem and its Application to the Service Rate of the First Order Reed-Muller Codes

Service rate is an important, recently introduced, performance metric associated with distributed coded storage systems. Among other interpretations, it measures the number of users that can be simultaneously served by the storage system. We introduce a geometric approach to address this problem. One of the most significant advantages of this approach over the existing approaches is that it allows one to derive bounds on the service rate of a code without explicitly knowing the list of all possible recovery sets. To illustrate the power of our geometric approach, we derive upper bounds on the service rates of the first order Reed-Muller codes and simplex codes. Then, we show how these upper bounds can be achieved. Furthermore, utilizing the proposed geometric technique, we show that given the service rate region of a code, a lower bound on the minimum distance of the code can be obtained.

preprint, 2020, arXiv

Data Freshness in Leader-Based Replicated Storage

Leader-based data replication improves consistency in highly available distributed storage systems via sequential writes to the leader nodes. After a write has been committed by the leaders, follower nodes are written by a multicast mechanism and are only guaranteed to be eventually consistent. With Age of Information (AoI) as the freshness metric, we characterize how the number of leaders affects the freshness of the data retrieved by an instantaneous read query. In particular, we derive the average age of a read query for a deterministic model for the leader writing time and a probabilistic model for the follower writing time. We obtain a closed-form expression for the average age for exponentially distributed follower writing time. Our numerical results show that, depending on the relative speed of the write operation to the two groups of nodes, there exists an optimal number of leaders which minimizes the average age of the retrieved data, and that this number increases as the relative speed of writing on leaders increases.

preprint, 2020, arXiv

Increasing the Raw Key Rate in Energy-Time Entanglement Based Quantum Key Distribution

A Quantum Key Distribution (QKD) protocol describes how two remote parties can establish a secret key by communicating over a quantum and a public classical channel that both can be accessed by an eavesdropper. QKD protocols using energy-time entangled photon pairs are of growing practical interest because of their potential to provide a higher secure key rate over long distances by carrying multiple bits per entangled photon pair. We consider a system where information can be extracted by measuring random times of a sequence of entangled photon arrivals. Our goal is to maximize the utility of each such pair. We propose a discrete time model for the photon arrival process, and establish a theoretical bound on the number of raw bits that can be generated under this model. We first analyse a well known simple binning encoding scheme, and show that it generates significantly lower information rate than what is theoretically possible. We then propose three adaptive schemes that increase the number of raw bits generated per photon, and compute and compare the information rates they offer. Moreover, the effect of public channel communication on the secret key rates of the proposed schemes is investigated.

preprint, 2020, arXiv

Quantum Information Processing: An Essential Primer

Quantum information science is an exciting, wide, rapidly progressing, cross-disciplinary field, and that very nature makes it both attractive and hard to enter. In this primer, we first provide answers to the three essential questions that any newcomer needs to know: How is quantum information represented? How is quantum information processed? How is classical information extracted from quantum states? We then introduce the most basic quantum information theoretic notions concerning entropy, sources, and channels, as well as secure communications and error correction. We conclude with examples that illustrate the power of quantum correlations. No prior knowledge of quantum mechanics is assumed.

preprint, 2019, arXiv

Data Replication for Reducing Computing Time in Distributed Systems with Stragglers

In distributed computing systems with stragglers, various forms of redundancy can improve the average delay performance. We study the optimal replication of data in systems where the job execution time is a stochastically decreasing and convex random variable. We show that in such systems, the optimum assignment policy is the balanced replication of disjoint batches of data. Furthermore, for Exponential and Shifted-Exponential service times, we derive the optimum redundancy levels for minimizing both the expected value and the variance of the job completion time. Our analysis shows that the optimum redundancy level may not be the same for the two metrics; thus, there is a trade-off between reducing the expected value of the completion time and reducing its variance.

preprint, 2019, arXiv

Scheduling in the Presence of Data Intensive Compute Jobs

We study the performance of non-adaptive scheduling policies in computing systems with multiple servers. Compute jobs are mostly regular, with modest service requirements. However, there are sporadic data intensive jobs, whose expected service time is much higher than that of the regular jobs. For this model, we are interested in the effect of scheduling policies on the average time a job spends in the system. To this end, we introduce two performance indicators in a simplified, only-arrival system. We believe that these performance indicators are good predictors of the relative performance of the policies in the queuing system, which is supported by simulation results.

preprint, 2016, arXiv

On Storage Allocation for Maximum Service Rate in Distributed Storage Systems

Storage allocation affects important performance measures of distributed storage systems. Most previous studies on storage allocation consider its effect separately, either on the success of the data recovery or on the service rate (time), where it is assumed that no access failure happens in the system. In this paper, we go one step further and incorporate the access model and the success of data recovery into the service rate analysis. In particular, we focus on quasi-uniform storage allocation and provide a service rate analysis for both fixed-size and probabilistic access models at the nodes. Using this analysis, we then show that for the case of exponential waiting time distribution at individual storage nodes, minimal spreading allocation results in the highest system service rate for both access models. This means that for a given storage budget, replication provides a better service rate than a coded storage solution.

preprint, 2015, arXiv

Efficient Replication of Queued Tasks for Latency Reduction in Cloud Systems

In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, thus increasing waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, e.g., the number of replicas and the time when they are issued and canceled, affect the latency and computing cost. We get the insight that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service distribution is log-convex, then adding maximum redundancy reduces both latency and cost. And if it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.
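A small worked example may help show why the shape of the service time distribution matters. The sketch assumes a single job replicated on r servers that all start at once, with the remaining replicas canceled when the first finishes, and it ignores queueing entirely (unlike the paper). Service times are assumed i.i.d. shifted exponential, shift + Exp(mu), for which the expected minimum of r copies is shift + 1/(r·mu). The function name `replication_latency_and_cost` is hypothetical.

```python
def replication_latency_and_cost(r, mu, shift=0.0):
    """Expected latency and total server time for r simultaneous replicas.

    Service times are assumed i.i.d. shift + Exp(mu); the job ends when the
    fastest replica finishes and all replicas are then canceled.
    """
    latency = shift + 1.0 / (r * mu)  # E[min of r shifted exponentials]
    cost = r * latency                # every replica runs until the first finishes
    return latency, cost


for r in (1, 2, 4):
    print(r, replication_latency_and_cost(r, mu=1.0),
          replication_latency_and_cost(r, mu=1.0, shift=1.0))
```

With a pure exponential (shift 0), the boundary case that is both log-concave and log-convex, latency falls as 1/r while the cost stays constant. With a positive shift (a log-concave distribution), latency still falls but the cost grows linearly in r, which is consistent with the abstract's point that fewer replicas are preferable in the log-concave case.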

preprint, 2015, arXiv

SEARS: Space Efficient And Reliable Storage System in the Cloud

Today's cloud storage services must offer storage reliability and fast data retrieval for large amounts of data without sacrificing storage cost. We present SEARS, a cloud-based storage system which integrates erasure coding and data deduplication to support efficient and reliable data storage with fast user response time. With proper association of data to storage server clusters, SEARS provides flexible mixing of different configurations, suitable for real-time and archival applications. Our prototype implementation of SEARS over Amazon EC2 shows that it outperforms existing storage systems in storage efficiency and file retrieval time. For 3 MB files, SEARS delivers retrieval time of $2.5$ s compared to $7$ s with existing systems.

preprint, 2014, arXiv

Isn't Hybrid ARQ Sufficient?

In practical systems, reliable communication is often accomplished by coding at different network layers. We question the necessity of this approach and examine when it can be beneficial. Through conceptually simple probabilistic models (based on coin tossing), we argue that multicast scenarios and protocol restrictions may make concatenated multi-layer coding preferable to physical layer coding alone, which is mostly not the case in point-to-point communications.

preprint, 2013, arXiv

On the Delay-Storage Trade-off in Content Download from Coded Distributed Storage Systems

In this paper we study how coding in distributed storage reduces expected download time, in addition to providing reliability against disk failures. The expected download time is reduced because when a content file is encoded to add redundancy and distributed across multiple disks, reading only a subset of the disks is sufficient to reconstruct the content. For the same total storage used, coding exploits the diversity in storage better than simple replication, and hence gives faster download. We use a novel fork-join queuing framework to model multiple users requesting the content simultaneously, and derive bounds on the expected download time. Our system model and results are a novel generalization of the fork-join system that is studied in queueing theory literature. Our results demonstrate the fundamental trade-off between the expected download time and the amount of storage space. This trade-off can be used for design of the amount of redundancy required to meet the delay constraints on content delivery.
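The diversity effect behind "reading only a subset of the disks is sufficient" can be illustrated with a standard order-statistics identity. The sketch assumes a single request with no queueing (so it is not the fork-join analysis of the paper) and i.i.d. Exp(mu) read times per disk; the expected time until any k of n disks respond is then (1/mu)·(1/(n-k+1) + ... + 1/n). The function name `expected_k_of_n_time` is hypothetical, and the sketch does not model how per-disk read time scales with chunk size.

```python
from fractions import Fraction


def expected_k_of_n_time(n, k, mu=1):
    """Expected time until k of n i.i.d. Exp(mu) disk reads have finished."""
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    harmonic_tail = sum(Fraction(1, i) for i in range(n - k + 1, n + 1))
    return harmonic_tail / Fraction(mu)


no_redundancy = expected_k_of_n_time(2, 2)  # must wait for both disks
coded = expected_k_of_n_time(4, 2)          # any 2 of 4 coded chunks suffice
print(no_redundancy, coded)
```

Under these assumptions, waiting for any 2 of 4 disks takes 7/12 of a mean read time on average, compared with 3/2 when both of 2 disks are needed.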

preprint, 2013, arXiv

Rate-Distortion-Based Physical Layer Secrecy with Applications to Multimode Fiber

Optical networks are vulnerable to physical layer attacks; wiretappers can improperly receive messages intended for legitimate recipients. Our work considers an aspect of this security problem within the domain of multimode fiber (MMF) transmission. MMF transmission can be modeled via a broadcast channel in which both the legitimate receiver's and wiretapper's channels are multiple-input-multiple-output complex Gaussian channels. Source-channel coding analyses based on the use of distortion as the metric for secrecy are developed. Alice has a source sequence to be encoded and transmitted over this broadcast channel so that the legitimate user Bob can reliably decode while forcing the distortion of wiretapper, or eavesdropper, Eve's estimate as high as possible. Tradeoffs between transmission rate and distortion under two extreme scenarios are examined: the best case where Eve has only her channel output and the worst case where she also knows the past realization of the source. It is shown that under the best case, an operationally separate source-channel coding scheme guarantees maximum distortion at the same rate as needed for reliable transmission. Theoretical bounds are given, and particularized for MMF. Numerical results showing the rate distortion tradeoff are presented and compared with corresponding results for the perfect secrecy case.

preprint, 2013, arXiv

Resolution-aware network coded storage

In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality of service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model which captures modern characteristics such as constrained I/O access bandwidth limitations. Using this model, we consider two important cases: single-resolution (SR) and multi-resolution (MR) systems. For SR systems, we use blocking probability as the quality of service metric and propose the network coded storage (NCS) scheme as a way to reduce blocking probability. The NCS scheme codes across file chunks in time, exploiting file striping and file duplication. Under our assumptions, we illustrate cases where SR NCS provides an order of magnitude savings in blocking probability. For MR systems, we introduce saturation probability as a quality of service metric to manage multiple user types, and we propose the uncoded resolution-aware storage (URS) and coded resolution-aware storage (CRS) schemes as ways to reduce saturation probability. In MR URS, we align our MR layout strategy with traffic requirements. In MR CRS, we code videos across MR layers. Under our assumptions, we illustrate that URS can in some cases provide an order of magnitude gain in saturation probability over classic non-resolution aware systems. Further, we illustrate that CRS provides additional saturation probability savings over URS.

preprint, 2012, arXiv

Coding for Fast Content Download

We study the fundamental trade-off between storage and content download time. We show that the download time can be significantly reduced by dividing the content into chunks, encoding it to add redundancy and then distributing it across multiple disks. We determine the download time for two content access models, the fountain and fork-join models, which involve simultaneous content access and individual access from enqueued user requests, respectively. For the fountain model we explicitly characterize the download time, while in the fork-join model we derive the upper and lower bounds. Our results show that coding reduces download time, through the diversity of distributing the data across more disks, even for the same total storage used.

preprint, 2012, arXiv

Low Complexity Differentiating Adaptive Erasure Codes for Multimedia Wireless Broadcast

Based on the erasure channel FEC model as defined in multimedia wireless broadcast standards, we illustrate how doping mechanisms included in the design of erasure coding and decoding may improve the scalability of the packet throughput, decrease overall latency and potentially differentiate among classes of multimedia subscribers regardless of their signal quality. We describe decoding mechanisms that allow for linear complexity and give complexity bounds when feedback is available. We show that elaborate coding schemes which include pre-coding stages are inferior to simple Ideal Soliton based rateless codes, combined with the proposed two-phase decoder. The simplicity of this scheme and the availability of tight bounds on latency given pre-allocated radio resources make it a practical and efficient design solution.
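For readers unfamiliar with the Ideal Soliton degree distribution mentioned above, here is a minimal sketch of its standard definition for k source symbols: probability 1/k for degree 1 and 1/(d(d-1)) for degrees d = 2, ..., k. The function name `ideal_soliton` is hypothetical, and the doping mechanism and two-phase decoder of the paper are not implemented.

```python
from fractions import Fraction


def ideal_soliton(k):
    """Ideal Soliton degree distribution over degrees 1..k as exact fractions."""
    if k < 1:
        raise ValueError("k must be at least 1")
    distribution = {1: Fraction(1, k)}
    for d in range(2, k + 1):
        distribution[d] = Fraction(1, d * (d - 1))
    return distribution


rho = ideal_soliton(10)
print(rho[1], rho[2], sum(rho.values()))
```

Exact fractions make it easy to confirm that the probabilities sum to 1, since the terms 1/(d(d-1)) telescope to 1 - 1/k.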

preprint, 2012, arXiv

Modeling Network Coded TCP: Analysis of Throughput and Energy Cost

We analyze the performance of TCP and TCP with network coding (TCP/NC) in lossy networks. We build upon the framework introduced by Padhye et al. and characterize the throughput behavior of classical TCP and TCP/NC as a function of erasure probability, round-trip time, maximum window size, and duration of the connection. Our analytical results show that network coding masks random erasures from TCP, thus preventing TCP's performance degradation in lossy networks. It is further seen that TCP/NC has significant throughput gains over TCP. In addition, we show that TCP/NC may lead to cost reduction for wireless network providers while maintaining a certain quality of service to their users. We measure the cost in terms of number of base stations, which is highly correlated to the energy, capital, and operational costs of a network provider. We show that increasing the available bandwidth may not necessarily lead to increase in throughput, particularly in lossy networks in which TCP does not perform well. We show that using protocols such as TCP/NC, which are more resilient to erasures, may lead to a throughput commensurate with the bandwidth dedicated to each user.

preprint, 2012, arXiv

Optimized IR-HARQ Schemes Based on Punctured LDPC Codes over the BEC

We study incremental redundancy hybrid ARQ (IR-HARQ) schemes based on punctured, finite-length LDPC codes. The transmission is assumed to take place over time-varying binary erasure channels, such as mobile wireless channels at the applications layer. We analyze and optimize the throughput and delay performance of these IR-HARQ protocols under iterative, message-passing decoding. We derive bounds on the performance that are achievable by such schemes, and show that, with a simple extension, the iteratively decoded, punctured LDPC code based IR-HARQ protocol can be made rateless and can operate close to the general theoretical optimum for a wide range of channel erasure rates.

preprint, 2012, arXiv

Round-Robin Streaming with Generations

We consider three types of application layer coding for streaming over lossy links: random linear coding, systematic random linear coding, and structured coding. The file being streamed is divided into sub-blocks (generations). Code symbols are formed by combining data belonging to the same generation, and transmitted in a round-robin fashion. We compare the schemes based on delivery packet count, net throughput, and energy consumption for a range of generation sizes. We determine these performance measures both analytically and in an experimental configuration. We find our analytical predictions to match the experimental results. We show that coding at the application layer brings about a significant increase in net data throughput, and thereby reduction in energy consumption due to reduced communication time. On the other hand, on devices with constrained computing resources, heavy coding operations cause packet drops in higher layers and negatively affect the net throughput. We find from our experimental results that low-rate MDS codes are best for small generation sizes, whereas systematic random linear coding has the best net throughput and lowest energy consumption for larger generation sizes due to its low decoding complexity.

preprint, 2012, arXiv

Three Schemes for Wireless Coded Broadcast to Heterogeneous Users

We study and compare three coded schemes for single-server wireless broadcast of multiple description coded content to heterogeneous users. The users (sink nodes) demand different numbers of descriptions over links with different packet loss rates. The three coded schemes are based on the LT codes, growth codes, and randomized chunked codes. The schemes are compared on the basis of the total number of transmissions required to deliver the demands of all users, which we refer to as the server (source) delivery time. We design the degree distributions of LT codes by solving suitably defined linear optimization problems, and numerically characterize the achievable delivery time for different coding schemes. We find that including a systematic phase (uncoded transmission) is significantly beneficial for scenarios with low demands, and that coding is necessary for efficiently delivering high demands. Different demand and error rate scenarios may require very different coding schemes. Growth codes and chunked codes do not perform as well as optimized LT codes in the heterogeneous communication scenario.

preprint, 2012, arXiv

Toward Sustainable Networking: Storage Area Networks with Network Coding

This manuscript provides a model to characterize the energy savings of network coded storage (NCS) in storage area networks (SANs). We consider blocking probability of drives as our measure of performance. A mapping technique to analyze SANs as independent M/G/K/K queues is presented, and blocking probabilities for uncoded storage schemes and NCS are derived and compared. We show that coding operates differently than the amalgamation of file chunks and energy savings are shown to scale well with striping number. We illustrate that for enterprise-level SANs energy savings of 20-50% can be realized.
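Blocking probability in an M/G/K/K loss queue is given by the classical Erlang B formula, which depends on the service time distribution only through its mean. The sketch below evaluates it with the usual numerically stable recursion. How the paper maps drives and file chunks to the number of servers and the offered load is not reproduced here; the function name `erlang_b` and the example values are hypothetical.

```python
def erlang_b(servers, offered_load):
    """Erlang B blocking probability for an M/G/K/K queue.

    offered_load is the arrival rate times the mean service time (in Erlangs).
    Uses the recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)).
    """
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


for servers in (1, 2, 4, 8):
    print(servers, erlang_b(servers, offered_load=2.0))
```

Comparing the value for an uncoded layout with that for a coded layout would then amount to evaluating the formula under the two sets of parameters implied by each scheme.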

preprint, 2012, arXiv

Trade-off between cost and goodput in wireless: Replacing transmitters with coding

We study the cost of improving the goodput, or the useful data rate, to users in a wireless network. We measure the cost in terms of number of base stations, which is highly correlated to the energy cost as well as capital and operational costs of a network provider. We show that increasing the available bandwidth, or throughput, may not necessarily lead to increase in goodput, particularly in lossy wireless networks in which TCP does not perform well. As a result, much of the resources dedicated to the user may not translate to high goodput, resulting in an inefficient use of the network resources. We show that using protocols such as TCP/NC, which are more resilient to erasures and failures in the network, may lead to a goodput commensurate with the throughput dedicated to each user. By increasing goodput, users' transactions are completed faster; thus, the resources dedicated to these users can be released to serve other requests or transactions. Consequently, we show that efficiently translating throughput into goodput may bring forth better connection to users while reducing the cost for the network providers.

preprint, 2010, arXiv

Collecting Coded Coupons over Overlapping Generations

Coding over subsets (known as generations) rather than over all content blocks in P2P distribution networks and other applications is necessary for a number of practical reasons such as computational complexity. A penalty for coding only within generations is an overall throughput reduction. It has been previously shown that allowing contiguous generations to overlap in a head-to-toe manner improves the throughput. We here propose and study a scheme, referred to as the random annex code, that creates shared packets between any two generations at random rather than only the neighboring ones. By optimizing very few design parameters, we obtain a simple scheme that outperforms both the non-overlapping and the head-to-toe overlapping schemes of comparable computational complexity, both in the expected throughput and in the rate of convergence of the probability of decoding failure to zero. We provide a practical algorithm for accurate analysis of the expected throughput of the random annex code for finite-length information. This algorithm enables us to quantify the throughput vs. computational complexity tradeoff, which is necessary for optimal selection of the scheme parameters.

preprint2010arXiv

Doped Fountain Coding for Minimum Delay Data Collection in Circular Networks

This paper studies decentralized, Fountain and network-coding based strategies for facilitating data collection in circular wireless sensor networks, which rely on the stochastic diversity of data storage. The goal is to allow for a reduced delay collection by a data collector who accesses the network at a random position and random time. Data dissemination is performed by a set of relays which form a circular route to exchange source packets. The storage nodes within the transmission range of the route's relays linearly combine and store overheard relay transmissions using random decentralized strategies. An intelligent data collector first collects a minimum set of coded packets from a subset of storage nodes in its proximity, which might be sufficient for recovering the original packets, and then attempts to recover all original source packets from this set using a message-passing decoder. Whenever the decoder stalls, the source packet which restarts decoding is polled/doped from its original source node. The random-walk-based analysis of the decoding/doping process furnishes the collection delay analysis with a prediction on the number of required doped packets. The number of doped packets can be surprisingly small when employed with an Ideal Soliton code degree distribution and, hence, the doping strategy may have the least collection delay when the density of source nodes is sufficiently large. Furthermore, we demonstrate that network coding makes dissemination more efficient at the expense of a larger collection delay. Not surprisingly, a circular network allows for significantly more tractable strategies (analytically and otherwise) relative to a network whose model is a random geometric graph.

preprint2010arXiv

Effects of the Generation Size and Overlap on Throughput and Complexity in Randomized Linear Network Coding

To reduce computational complexity and delay in randomized network coded content distribution, and for some other practical reasons, coding is not performed simultaneously over all content blocks, but over much smaller, possibly overlapping subsets of these blocks, known as generations. A penalty of this strategy is throughput reduction. To analyze the throughput loss, we model coding over generations with random generation scheduling as a coupon collector's brotherhood problem. This model enables us to derive the expected number of coded packets needed for successful decoding of the entire content as well as the probability of decoding failure (the latter only when generations do not overlap) and further, to quantify the tradeoff between computational complexity and throughput. Interestingly, with a moderate increase in the generation size, throughput quickly approaches link capacity. Overlaps between generations can further improve throughput substantially for relatively small generation sizes.
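The coupon-collector view described above can be illustrated with a small Monte Carlo sketch. It is not the paper's analysis: it assumes non-overlapping generations, uniformly random generation scheduling, and that every received coded packet is innovative (as is nearly true over a large field). The function names are hypothetical.

```python
import random

def packets_until_decodable(num_generations, generation_size, rng):
    """Count coded packets drawn until every generation has received
    generation_size packets (assumption: each packet is innovative)."""
    received = [0] * num_generations
    incomplete = num_generations
    draws = 0
    while incomplete > 0:
        g = rng.randrange(num_generations)  # uniformly random scheduling
        draws += 1
        received[g] += 1
        if received[g] == generation_size:  # this generation just became decodable
            incomplete -= 1
    return draws

def estimate_expected_packets(num_generations, generation_size, trials=2000, seed=0):
    """Average the packet count over independent simulated downloads."""
    rng = random.Random(seed)
    total = sum(
        packets_until_decodable(num_generations, generation_size, rng)
        for _ in range(trials)
    )
    return total / trials

if __name__ == "__main__":
    # 64 content blocks split into generations of increasing size.
    for size in (1, 4, 16, 64):
        n = 64 // size
        print(size, estimate_expected_packets(n, size) / 64)
```

Dividing the estimate by the number of content blocks gives the average number of packets needed per block, so values closer to 1 mean less throughput loss from coding within generations.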

preprint2010arXiv

Memory Allocation in Distributed Storage Networks

We consider the problem of distributing a file in a network of storage nodes whose storage budget is limited but at least equal to the file size. We first generate $T$ encoded symbols (from the file) which are then distributed among the nodes. We investigate the optimal allocation of $T$ encoded packets to the storage nodes such that the probability of reconstructing the file by using any $r$ out of $n$ nodes is maximized. Since the optimal allocation of encoded packets is difficult to find in general, we find another objective function which well approximates the original problem and yet is easier to optimize. We find the optimal symmetric allocation for all coding redundancy constraints using the equivalent approximate problem. We also investigate the optimal allocation in random graphs. Finally, we provide simulations to verify the theoretical results.
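To make the allocation question concrete, the following sketch evaluates one symmetric allocation exactly. It is an illustration under assumptions not stated in the abstract: the $T$ symbols are spread evenly over $m$ of the $n$ nodes, the collector reads a uniformly random set of $r$ nodes, and any `file_symbols` distinct encoded symbols suffice to reconstruct the file (an MDS-like code). The function name and parameters are hypothetical.

```python
from math import comb

def symmetric_recovery_probability(n, r, m, total_symbols, file_symbols):
    """Probability that r randomly chosen nodes out of n hold enough symbols,
    when total_symbols are spread evenly over m nonempty nodes."""
    if not (1 <= m <= n and 0 <= r <= n):
        raise ValueError("need 1 <= m <= n and 0 <= r <= n")
    per_node = total_symbols // m  # symbols stored on each nonempty node
    favorable = 0
    # j = number of nonempty nodes among the r accessed (hypergeometric count)
    for j in range(max(0, r - (n - m)), min(m, r) + 1):
        if j * per_node >= file_symbols:
            favorable += comb(m, j) * comb(n - m, r - j)
    return favorable / comb(n, r)

if __name__ == "__main__":
    # Compare spreading 12 symbols over different numbers of nodes.
    for m in (2, 3, 4, 6, 12):
        print(m, symmetric_recovery_probability(12, 4, m, 12, 4))
```

Sweeping `m` for fixed `n`, `r` and budget shows how concentrating or spreading the same budget changes the recovery probability under these assumptions.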

preprint2010arXiv

Rateless Codes for Single-Server Streaming to Diverse Users

We investigate the performance of rateless codes for single-server streaming to diverse users, assuming that diversity in users is present not only because they have different channel conditions, but also because they demand different amounts of information and have different decoding capabilities. The LT encoding scheme is employed. While some users accept output symbols of all degrees and decode using belief propagation, others only collect degree-1 output symbols and run no decoding algorithm. We propose several performance measures, and optimize the performance of the rateless code used at the server through the design of the code degree distribution. Optimization problems are formulated for the asymptotic regime and solved as linear programming problems. Optimized performance shows great improvement in total bandwidth consumption over using the conventional ideal soliton distribution, or simply sending separately encoded streams to different types of user nodes. Simulation experiments confirm the usability of the optimization results obtained for the asymptotic regime as a guideline for finite-length code design.
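For reference, the ideal soliton distribution used as the baseline above assigns probability 1/k to degree 1 and 1/(d(d-1)) to each degree d from 2 to k. The sketch below builds that distribution and produces one LT output symbol by XORing randomly chosen source symbols. It shows only the conventional baseline, not the optimized distributions from the paper; the function names and the use of integer symbols are assumptions made for illustration.

```python
import random

def ideal_soliton(k):
    """Return probs with probs[d] = P(degree = d) for d = 1..k (probs[0] unused)."""
    probs = [0.0] * (k + 1)
    probs[1] = 1.0 / k
    for d in range(2, k + 1):
        probs[d] = 1.0 / (d * (d - 1))
    return probs

def lt_encode_symbol(source, probs, rng):
    """Draw a degree, pick that many distinct source symbols, and XOR them."""
    k = len(source)
    degree = rng.choices(range(1, k + 1), weights=probs[1:])[0]
    neighbors = rng.sample(range(k), degree)
    value = 0
    for i in neighbors:
        value ^= source[i]
    return sorted(neighbors), value

if __name__ == "__main__":
    rng = random.Random(0)
    source = [rng.randrange(256) for _ in range(16)]
    probs = ideal_soliton(len(source))
    for _ in range(5):
        print(lt_encode_symbol(source, probs, rng))
```

Under this baseline only a 1/k fraction of output symbols have degree 1, which is why users who collect only degree-1 symbols are poorly served by it and why the degree distribution is the natural design variable.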

preprint2009arXiv

Fountain Codes Based Distributed Storage Algorithms for Large-scale Wireless Sensor Networks

We consider large-scale sensor networks with n nodes, out of which k are in possession of (e.g., have sensed or collected in some other way) k information packets. In the scenarios in which network nodes are vulnerable because of, for example, limited energy or a hostile environment, it is desirable to disseminate the acquired information throughout the network so that each of the n nodes stores one (possibly coded) packet and the original k source packets can be recovered later in a computationally simple way from any (1 + ε)k nodes for some small ε > 0. We developed two distributed algorithms for solving this problem based on simple random walks and Fountain codes. Unlike all previously developed schemes, our solution is truly distributed, that is, nodes do not know n, k or connectivity in the network, except in their own neighborhoods, and they do not maintain any routing tables. In the first algorithm, all the sensors have the knowledge of n and k. In the second algorithm, each sensor estimates these parameters through the random walk dissemination. We present analysis of the communication/transmission and encoding/decoding complexity of these two algorithms, and provide extensive simulation results as well.

preprint2009arXiv

Raptor Codes Based Distributed Storage Algorithms for Wireless Sensor Networks

We consider a distributed storage problem in a large-scale wireless sensor network with $n$ nodes among which $k$ acquire (sense) independent data. The goal is to disseminate the acquired information throughout the network so that each of the $n$ sensors stores one possibly coded packet and the original $k$ data packets can be recovered later in a computationally simple way from any $(1+ε)k$ nodes for some small $ε>0$. We propose two Raptor-code-based distributed storage algorithms for solving this problem. In the first algorithm, all the sensors have the knowledge of $n$ and $k$. In the second one, we assume that no sensor has such global information.

36 published item(s)

On the Service Rate Region of Reed-Muller Codes

Optimum 1-Step Majority-Logic Decoding of Binary Reed-Muller Codes

Information Rates with Non Ideal Photon Detectors in Time-Entanglement Based QKD

Time-Entanglement QKD: Secret Key Rates and Information Reconciliation Coding

Balanced Nonadaptive Redundancy Scheduling

Dual-Code Bounds on Multiple Concurrent (Local) Data Recovery

Evaluating Load Balancing Performance in Distributed Storage with Redundancy

A Combinatorial View of the Service Rates of Codes Problem, its Equivalence to Fractional Matching and its Connection with Batch Codes

A Geometric View of the Service Rates of Codes Problem and its Application to the Service Rate of the First Order Reed-Muller Codes

Data Freshness in Leader-Based Replicated Storage

Increasing the Raw Key Rate in Energy-Time Entanglement Based Quantum Key Distribution

Quantum Information Processing: An Essential Primer

Data Replication for Reducing Computing Time in Distributed Systems with Stragglers

Scheduling in the Presence of Data Intensive Compute Jobs

On Storage Allocation for Maximum Service Rate in Distributed Storage Systems

Efficient Replication of Queued Tasks for Latency Reduction in Cloud Systems

SEARS: Space Efficient And Reliable Storage System in the Cloud

Isn't Hybrid ARQ Sufficient?

On the Delay-Storage Trade-off in Content Download from Coded Distributed Storage Systems

Rate-Distortion-Based Physical Layer Secrecy with Applications to Multimode Fiber

Resolution-aware network coded storage

Coding for Fast Content Download

Low Complexity Differentiating Adaptive Erasure Codes for Multimedia Wireless Broadcast

Modeling Network Coded TCP: Analysis of Throughput and Energy Cost

Optimized IR-HARQ Schemes Based on Punctured LDPC Codes over the BEC

Round-Robin Streaming with Generations

Three Schemes for Wireless Coded Broadcast to Heterogeneous Users

Toward Sustainable Networking: Storage Area Networks with Network Coding

Trade-off between cost and goodput in wireless: Replacing transmitters with coding

Collecting Coded Coupons over Overlapping Generations

Doped Fountain Coding for Minimum Delay Data Collection in Circular Networks

Effects of the Generation Size and Overlap on Throughput and Complexity in Randomized Linear Network Coding

Memory Allocation in Distributed Storage Networks

Rateless Codes for Single-Server Streaming to Diverse Users

Fountain Codes Based Distributed Storage Algorithms for Large-scale Wireless Sensor Networks

Raptor Codes Based Distributed Storage Algorithms for Wireless Sensor Networks