Source author record

Mohammad Ali Maddah-Ali

Mohammad Ali Maddah-Ali appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory; math.IT; Distributed, Parallel, and Cluster Computing; Machine Learning; Cryptography and Security; Networking and Internet Architecture; Artificial Intelligence; eess.SP; math.OC

Catalog footprint

What is connected

38 works

9 topics

4 close collaborators

preprint 2026 arXiv

$\mathsf{VISTA}$: Decentralized Machine Learning in Adversary Dominated Environments

Decentralized machine learning often relies on outsourcing computations, such as gradient evaluations, to untrusted worker nodes. Existing robust aggregation methods can mitigate malicious behavior under honest-majority assumptions, but may fail when adversaries control a majority of the workers. We study this adversary-dominated setting through an incentive-oriented framework in which reports are accepted and rewarded only when they are mutually consistent up to a threshold. This turns the adversary from a pure saboteur into a rational agent that trades off increasing estimation error against the risk of rejection and loss of reward. We consider iterative optimization under this model. Unlike one-shot computation, iterative learning requires long-horizon decisions: permissive acceptance rules enable faster early progress but admit more adversarial corruption, while strict rules improve estimation accuracy but cause frequent rejections. We propose $\mathsf{VISTA}$, an adaptive algorithm that tunes the acceptance threshold using the optimization history. Numerical results show that $\mathsf{VISTA}$ improves convergence over static thresholds. We also provide a rigorous convergence analysis showing that, with suitable incentive-aware adaptation, adversary-dominated decentralized learning can retain the asymptotic convergence behavior of standard SGD without relying on an honest majority.
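
The acceptance rule described in this abstract can be illustrated with a minimal sketch. It is not the $\mathsf{VISTA}$ algorithm itself: it only shows a static consistency check for scalar reports. The function name `accept_reports`, the reading of "mutually consistent" as "range no larger than the threshold", and the midpoint estimate are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def accept_reports(reports: Sequence[float], threshold: float) -> Optional[float]:
    """Return an estimate if the reports are mutually consistent, else None.

    Assumptions (not taken from the abstract): reports are scalars,
    "consistent" means max - min <= threshold, and the estimate is the
    midpoint of the reported range.
    """
    if not reports:
        return None
    low, high = min(reports), max(reports)
    if high - low > threshold:
        return None  # rejected: no estimate is produced and no reward is paid
    return (low + high) / 2.0


# A permissive threshold accepts reports that a strict one rejects.
print(accept_reports([1.0, 1.5], threshold=1.0))
print(accept_reports([1.0, 3.0], threshold=1.0))
```

A larger `threshold` accepts more often but lets a strategic reporter shift the estimate further, which is the trade-off an adaptive threshold would have to balance.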

preprint 2026 arXiv

Game of Coding: Coding Theory in the Presence of Rational Adversaries, Motivated by Decentralized Machine Learning

Coding theory plays a crucial role in enabling reliable communication, storage, and computation. Classical approaches assume a worst-case adversarial model and ensure error correction and data recovery only when the number of honest nodes exceeds the number of adversarial ones by some margin. However, in some emerging decentralized applications, particularly in decentralized machine learning (DeML), participating nodes are rewarded for accepted contributions. This incentive structure naturally gives rise to rational adversaries who act strategically rather than behaving in purely malicious ways. In this paper, we first motivate the need for coding in the presence of rational adversaries, particularly in the context of outsourced computation in decentralized systems. We contrast this need with existing approaches and highlight their limitations. We then introduce the game of coding, a novel game-theoretic framework that extends coding theory to trust-minimized settings where honest nodes are not in the majority. Focusing on repetition coding, we highlight two key features of this framework: (1) the ability to achieve a non-zero probability of data recovery even when adversarial nodes are in the majority, and (2) Sybil resistance, i.e., the equilibrium remains unchanged even as the number of adversarial nodes increases. Finally, we explore scenarios in which the adversary's strategy is unknown and outline several open problems for future research.

preprint 2026 arXiv

Game of Coding: Sybil Resistant Decentralized Machine Learning with Minimal Trust Assumption

Coding theory plays a crucial role in ensuring data integrity and reliability across various domains, from communication to computation and storage systems. However, its reliance on trust assumptions for data recovery, which requires the number of honest nodes to exceed adversarial nodes by a certain margin, poses significant challenges, particularly in emerging decentralized systems where trust is a scarce resource. To address this, the game of coding framework was introduced, offering insights into strategies for data recovery within incentive-oriented environments. In such environments, participant nodes are rewarded as long as the system remains functional (live). This incentivizes adversaries to maximize their rewards (utility) by ensuring that the decoder, as the data collector (DC), successfully recovers the data, preferably with a high estimation error. This rational behavior is leveraged in a game-theoretic framework, where the equilibrium leads to a robust and resilient system, referred to as the game of coding. The focus of the earliest version of the game of coding was limited to scenarios involving only two nodes. In this paper, we generalize the game of coding framework to scenarios with $N \ge 2$ nodes, exploring critical aspects of system behavior. Specifically, we (i) demonstrate that the adversary's utility at equilibrium is non-increasing with additional adversarial nodes, ensuring no gain for the adversary and no pain for the DC, thus establishing the game of coding framework's Sybil resistance; (ii) show that increasing the number of honest nodes does not always enhance the DC's utility, providing examples and proposing an algorithm to identify and mitigate this counterintuitive effect; and (iii) outline the optimal strategies for both the DC and the adversary, demonstrating that the system achieves enhanced liveness at equilibrium.

preprint 2026 arXiv

Learning from Acceptance: Cumulative Regret in the Game of Coding

Classical coding-theoretic guarantees often rely on trust assumptions, such as requiring sufficiently many honest nodes compared with adversarial ones. These assumptions are difficult to enforce in open decentralized systems where participants are not centrally certified. At the same time, such environments often contain incentive mechanisms: participants may be rewarded only when their submitted data are accepted and the system remains functional. This changes the role of an adversary. Rather than acting as a pure saboteur, a strategic adversary may submit data that are consistent enough to be accepted while still degrading the quality of the final estimate. The game-of-coding framework models this strategic interaction between a data collector (DC) and an adversary. Existing works on the game of coding mostly consider the complete-information case, where the DC knows how the adversary trades off acceptance and estimation error. In this paper, we study an incomplete-information version of the game of coding in which the DC, acting as a Stackelberg leader, does not know the adversary's utility trade-off and must learn through repeated interaction. Prior work on the unknown-adversary setting considered an explore-then-commit objective, where only the final selected acceptance rule is evaluated. In contrast, we study the full learning trajectory: every acceptance rule used during the algorithm is executed and contributes to performance. We propose an algorithm that refines its search around promising acceptance rules, prove that it achieves sublinear cumulative regret, and evaluate its performance through numerical experiments.

preprint 2022 arXiv

Distributed Attribute-based Private Access Control

In attribute-based access control, users with certain verified attributes gain access to particular data. Concerned with the privacy of the users' attributes, we study the problem of distributed attribute-based private access control (DAPAC) with multiple authorities, where each authority learns and verifies only one of the attributes. To investigate its fundamental limits, we introduce an information-theoretic DAPAC framework, with $N \in \mathbb{N}$, $N\geq 2$, replicated non-colluding servers (authorities) and some users. Each user has an attribute vector $\mathbf{v}^*=(v_1^*, \ldots, v_N^*)$ of dimension $N$ and is eligible to retrieve a message $W^{\mathbf{v}^*}$, available in all servers. Each server $n\in [N]$ is able to observe and verify only the $n$-th attribute of a user. In response, it sends a function of its data to the user. The system must satisfy the following conditions: (1) Correctness: the user with attribute vector $\mathbf{v}^*$ is able to retrieve the intended message $W^{\mathbf{v}^*}$ from the servers' responses, (2) Data Secrecy: the user will not learn anything about the other messages, (3) Attribute Privacy: each Server $n$ learns nothing beyond attribute $n$ of the user. The capacity of the DAPAC is defined as the ratio of the file size to the aggregated size of the responses, maximized over all feasible schemes. We obtain a lower bound on the capacity of this problem by proposing an achievable algorithm with rate $\frac{1}{2K}$, where $K$ is the size of the alphabet of each attribute.

preprint 2022 arXiv

SwiftAgg: Communication-Efficient and Dropout-Resistant Secure Aggregation for Federated Learning with Worst-Case Security Guarantees

We propose SwiftAgg, a novel secure aggregation protocol for federated learning systems, where a central server aggregates local models of $N$ distributed users, each of size $L$, trained on their local data, in a privacy-preserving manner. Compared with state-of-the-art secure aggregation protocols, SwiftAgg significantly reduces the communication overheads without any compromise on security. Specifically, in the presence of at most $D$ dropout users, SwiftAgg achieves a users-to-server communication load of $(T+1)L$ and a users-to-users communication load of up to $(N-1)(T+D+1)L$, with a worst-case information-theoretic security guarantee, against any subset of up to $T$ semi-honest users who may also collude with the curious server. The key idea of SwiftAgg is to partition the users into groups of size $D+T+1$. In the first phase, secret sharing and aggregation of the individual models are performed within each group. In the second phase, model aggregation is performed on $D+T+1$ sequences of users across the groups. If a user in a sequence drops out in the second phase, the rest of the sequence remains silent. This design allows only a subset of users to communicate with each other, and only the users in a single group to directly communicate with the server, eliminating two requirements of other secure aggregation protocols: 1) an all-to-all communication network across users; and 2) all users communicating with the server. This substantially reduces the communication costs of the system.

preprint 2022 arXiv

SwiftAgg+: Achieving Asymptotically Optimal Communication Loads in Secure Aggregation for Federated Learning

We propose SwiftAgg+, a novel secure aggregation protocol for federated learning systems, where a central server aggregates local models of $N \in \mathbb{N}$ distributed users, each of size $L \in \mathbb{N}$, trained on their local data, in a privacy-preserving manner. SwiftAgg+ can significantly reduce the communication overheads without any compromise on security, and achieve optimal communication loads within diminishing gaps. Specifically, in the presence of at most $D=o(N)$ dropout users, SwiftAgg+ achieves a per-user communication load of $(1+\mathcal{O}(\frac{1}{N}))L$ symbols and a server communication load of $(1+\mathcal{O}(\frac{1}{N}))L$ symbols, with a worst-case information-theoretic security guarantee, against any subset of up to $T=o(N)$ semi-honest users who may also collude with the curious server. Moreover, the proposed SwiftAgg+ allows for a flexible trade-off between communication loads and the number of active communication links. In particular, for $T<N-D$ and for any $K\in\mathbb{N}$, SwiftAgg+ can achieve a server communication load of $(1+\frac{T}{K})L$ symbols and a per-user communication load of up to $(1+\frac{T+D}{K})L$ symbols, where the number of pairwise active connections in the network is $\frac{N}{2}(K+T+D+1)$.
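
The trade-off stated at the end of this abstract can be tabulated directly from the three closed-form expressions. The sketch below only evaluates those expressions; the function name `swiftagg_plus_loads` and the input checks are assumptions for illustration, and nothing here implements the protocol.

```python
from fractions import Fraction
from typing import Tuple


def swiftagg_plus_loads(N: int, T: int, D: int, K: int, L: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Evaluate the loads quoted in the abstract for T < N - D and K >= 1.

    Returns (server load, per-user load upper bound, pairwise active links).
    Loads are in symbols; exact fractions avoid rounding.
    """
    if K < 1:
        raise ValueError("K must be a positive integer")
    if not T < N - D:
        raise ValueError("the abstract states the result for T < N - D")
    server_load = (1 + Fraction(T, K)) * L
    per_user_load = (1 + Fraction(T + D, K)) * L
    active_links = Fraction(N, 2) * (K + T + D + 1)
    return server_load, per_user_load, active_links


# Increasing K lowers both loads but raises the number of active links.
for K in (1, 5, 15):
    print(K, swiftagg_plus_loads(N=100, T=10, D=5, K=K, L=1000))
```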

preprint 2021 arXiv

CodedSketch: A Coding Scheme for Distributed Computation of Approximated Matrix Multiplication

In this paper, we propose CodedSketch, a distributed straggler-resistant scheme to compute an approximation of the product of two massive matrices. The objective is to reduce the recovery threshold, defined as the total number of worker nodes that we need to wait for to be able to recover the final result. To exploit the fact that only an approximate result is required and thereby reduce the recovery threshold, some form of pre-compression is required. However, compression inherently involves some randomness that would destroy the structure of the matrices. On the other hand, considering the structure of the matrices is crucial to reduce the recovery threshold. In CodedSketch, we use count-sketch, a hash-based compression scheme, on the rows of the first and columns of the second matrix, and a structured polynomial code on the columns of the first and rows of the second matrix. This arrangement allows us to exploit the gain of both in reducing the recovery threshold. To increase the accuracy of computation, multiple independent count-sketches are needed. This independence allows us to theoretically characterize the accuracy of the result and establish the recovery threshold achieved by the proposed scheme. To guarantee the independence of the resulting count-sketches in the output, while keeping its cost on the recovery threshold minimal, we use another layer of structured codes.
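
For readers unfamiliar with count-sketch, the following is a minimal sketch of the standard technique applied to a plain vector. It is not the CodedSketch scheme, which applies count-sketch to matrix rows and columns and combines it with polynomial codes. The helper names and the use of seeded lookup tables in place of pairwise-independent hash functions are assumptions for illustration.

```python
import random
from typing import List, Tuple


def make_hashes(n: int, buckets: int, seed: int) -> Tuple[List[int], List[int]]:
    """Draw a bucket index and a random sign for each of the n coordinates.

    Assumption: seeded lookup tables stand in for proper hash functions.
    """
    rng = random.Random(seed)
    bucket_of = [rng.randrange(buckets) for _ in range(n)]
    sign_of = [rng.choice((-1, 1)) for _ in range(n)]
    return bucket_of, sign_of


def count_sketch(x: List[float], bucket_of: List[int], sign_of: List[int], buckets: int) -> List[float]:
    """Compress x into `buckets` counters by adding signed entries per bucket."""
    sketch = [0.0] * buckets
    for i, value in enumerate(x):
        sketch[bucket_of[i]] += sign_of[i] * value
    return sketch


def estimate(i: int, sketch: List[float], bucket_of: List[int], sign_of: List[int]) -> float:
    """Estimate coordinate i from its bucket; collisions add zero-mean noise."""
    return sign_of[i] * sketch[bucket_of[i]]


bucket_of, sign_of = make_hashes(n=8, buckets=4, seed=0)
x = [0.0] * 8
x[3] = 5.0
sketch = count_sketch(x, bucket_of, sign_of, buckets=4)
print(estimate(3, sketch, bucket_of, sign_of))
```

Averaging or taking the median over several independently seeded sketches reduces the collision noise, which is why the abstract calls for multiple independent count-sketches.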

preprint 2021 arXiv

Fundamental Limits of Distributed Encoding

In general coding theory, we often assume that error is observed in transferring or storing encoded symbols, while the process of encoding itself is error-free. Motivated by recent applications of coding theory, in this paper, we consider the case where the process of encoding is distributed and prone to error. We introduce the problem of distributed encoding, comprising $K\in\mathbb{N}$ isolated source nodes and $N\in\mathbb{N}$ encoding nodes. Each source node has one symbol from a finite field and sends it to all encoding nodes. Each encoding node stores an encoded symbol, as a function of the received symbols. However, some of the source nodes are controlled by the adversary and may send different symbols to different encoding nodes. Depending on the number of adversarial nodes, denoted by $\beta\in\mathbb{N}$, and the number of symbols that each one generates, denoted by $v\in\mathbb{N}$, the process of decoding from the encoded symbols could be impossible. Assume that a decoder connects to an arbitrary subset of $t \in\mathbb{N}$ encoding nodes and wants to decode the symbols of the honest nodes correctly, without necessarily identifying the sets of honest and adversarial nodes. In this paper, we study $t^*\in\mathbb{N}$, the minimum of $t$, which is a function of $K$, $N$, $\beta$, and $v$. We show that when the encoding nodes use linear coding, $t^*_{\textrm{linear}}=K+2\beta(v-1)$, if $N\ge K+2\beta(v-1)$, and $t^*_{\textrm{linear}}=N$, if $N\le K+2\beta(v-1)$. In order to achieve $t^*_{\textrm{linear}}$, we use random linear coding and show that in any feasible solution that the decoder finds, the messages of the honest nodes are decoded correctly. For the converse of the fundamental limit, we show that when the adversary behaves in a particular way, it can always confuse the decoder between two feasible solutions that differ in the message of at least one honest node.
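
The two cases for $t^*_{\textrm{linear}}$ quoted in this abstract collapse to a single minimum, since both branches agree when $N = K+2\beta(v-1)$. A small sketch that evaluates it follows; the function name `t_star_linear` and the input checks are assumptions for illustration.

```python
def t_star_linear(K: int, N: int, beta: int, v: int) -> int:
    """Evaluate t*_linear = min(N, K + 2*beta*(v - 1)) from the abstract.

    K: source nodes, N: encoding nodes, beta: adversarial source nodes,
    v: distinct symbols each adversarial node may send.
    """
    if min(K, N) < 1 or beta < 0 or v < 1:
        raise ValueError("expected K, N >= 1, beta >= 0, v >= 1")
    return min(N, K + 2 * beta * (v - 1))


# Enough encoding nodes: the threshold is K + 2*beta*(v - 1).
print(t_star_linear(K=10, N=20, beta=2, v=3))
# Too few encoding nodes: the decoder must contact all N of them.
print(t_star_linear(K=10, N=15, beta=2, v=3))
```

Note that with $v=1$ the adversarial nodes send the same symbol to every encoding node, and the expression reduces to $\min(N, K)$.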

preprint 2021 arXiv

Multi-Party Proof Generation in QAP-based zk-SNARKs

Zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) allows a party, known as the prover, to convince another party, known as the verifier, that he knows a private value $v$, without revealing it, such that $F(u,v)=y$ for some function $F$ and public values $u$ and $y$. There are various versions of zk-SNARK; among them, Quadratic Arithmetic Program (QAP)-based zk-SNARK has been widely used in practice, especially in blockchain technology. This is attributed to two desirable features: its fixed-size proof and the very light computation load of the verifier. However, the computation load of the prover in QAP-based zk-SNARKs is very heavy, even though it is designed to be very efficient. This load can be beyond the prover's computation power to handle, and has to be offloaded to some external servers. In the existing offloading solutions, either (i) the load of computation offloaded to each server is a fraction of the prover's primary computation (e.g., DIZK), but the servers need to be trusted, or (ii) the servers are not required to be trusted, but the computation complexity imposed on each one is the same as the prover's primary computation (e.g., Trinocchio). In this paper, we present a scheme that has the benefits of both solutions. In particular, we propose a secure multi-party proof generation algorithm where the prover can delegate its task to $N$ servers, where (i) even if a group of $T \in \mathbb{N}$ servers, $T\le N$, collude, they cannot gain any information about the secret value $v$, (ii) the computation complexity of each server is less than $1/(N-T)$ of the prover's primary computation. The design is such that we don't lose the efficiency of the prover's algorithm in the process of delegating the tasks to external servers.

preprint 2021 arXiv

Optimal Communication-Computation Trade-Off in Heterogeneous Gradient Coding

Gradient coding allows a master node to derive the aggregate of the partial gradients, calculated by some worker nodes over the local data sets, with minimum communication cost, and in the presence of stragglers. In this paper, for gradient coding with linear encoding, we characterize the optimum communication cost for heterogeneous distributed systems with \emph{arbitrary} data placement, with $s \in \mathbb{N}$ stragglers and $a \in \mathbb{N}$ adversarial nodes. In particular, we show that the optimum communication cost, normalized by the size of the gradient vectors, is equal to $(r-s-2a)^{-1}$, where $r \in \mathbb{N}$ is the minimum number of times that a data partition is replicated. In other words, the communication cost is determined by the data partition with the minimum replication, irrespective of the structure of the placement. The proposed achievable scheme also allows us to target the computation of a polynomial function of the aggregated gradient matrix. It also allows us to borrow some ideas from approximate computing and propose an approximate gradient coding scheme for the cases when the repetition in data placement is smaller than what is needed to meet the restriction imposed on communication cost or when the number of stragglers appears to be more than the presumed value in the system design.
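
The closed form in this abstract depends only on the least-replicated data partition. The sketch below evaluates it from a placement given as replication counts; the function name `normalized_comm_cost`, the list-of-counts input, and the requirement $r > s + 2a$ (needed for the expression to be positive and finite) are assumptions for illustration.

```python
from fractions import Fraction
from typing import Sequence


def normalized_comm_cost(replication_counts: Sequence[int], s: int, a: int) -> Fraction:
    """Evaluate 1 / (r - s - 2a), with r the minimum replication of any partition.

    replication_counts[i] is the number of workers holding data partition i.
    """
    r = min(replication_counts)
    if r - s - 2 * a <= 0:
        raise ValueError("the expression requires r > s + 2a")
    return Fraction(1, r - s - 2 * a)


# Only the least-replicated partition (5 copies) matters here.
print(normalized_comm_cost([5, 7, 9], s=1, a=1))
```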

preprint 2021 arXiv

The Capacity Region of Distributed Multi-User Secret Sharing

In this paper, we study the problem of distributed multi-user secret sharing, including a trusted master node, $N\in \mathbb{N}$ storage nodes, and $K$ users, where each user has access to the contents of a subset of storage nodes. Each user has an independent secret message with a certain rate, defined as the size of the message normalized by the size of a storage node. Having access to the secret messages, the trusted master node places encoded shares in the storage nodes, such that (i) each user can recover its own message from the content of the storage nodes that it has access to, (ii) each user cannot gain any information about the message of any other user. We characterize the capacity region of the distributed multi-user secret sharing, defined as the set of all achievable rate tuples, subject to the correctness and privacy constraints. In the achievable scheme, for each user, the master node forms a polynomial with degree equal to the number of its accessible storage nodes minus one, where the values of this polynomial at certain points are stored as the encoded shares. The message of that user is embedded in some of the coefficients of the polynomial. The remaining coefficients are determined such that the content of each storage node serves as the encoded shares for all users that have access to that storage node.

preprint 2020 arXiv

A Hybrid-Order Distributed SGD Method for Non-Convex Optimization to Balance Communication Overhead, Computational Complexity, and Convergence Rate

In this paper, we propose a method of distributed stochastic gradient descent (SGD), with low communication load and computational complexity, and still fast convergence. To reduce the communication load, at each iteration of the algorithm, the worker nodes calculate and communicate some scalars, which are the directional derivatives of the sample functions in some \emph{pre-shared directions}. However, to maintain accuracy, after every specific number of iterations, they communicate the vectors of stochastic gradients. To reduce the computational complexity in each iteration, the worker nodes approximate the directional derivatives with zeroth-order stochastic gradient estimation, by performing just two function evaluations rather than computing a first-order gradient vector. The proposed method substantially improves the convergence rate of the zeroth-order methods, guaranteeing order-wise faster convergence. Moreover, compared to the well-known communication-efficient methods of model averaging (that perform local model updates and periodic communication of the gradients to synchronize the local models), we prove that for the general class of non-convex stochastic problems and with a reasonable choice of parameters, the proposed method guarantees the same orders of communication load and convergence rate, while having order-wise less computational complexity. Experimental results on various learning problems in neural network applications demonstrate the effectiveness of the proposed approach compared to various state-of-the-art distributed SGD methods.
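
The two-evaluation estimate mentioned in this abstract can be illustrated with a generic finite-difference sketch. The abstract does not say which two points are evaluated, so the symmetric difference used here, the step size `mu`, and the function name `directional_derivative` are assumptions, not the paper's estimator.

```python
from typing import Callable, List, Sequence


def directional_derivative(
    f: Callable[[List[float]], float],
    x: Sequence[float],
    direction: Sequence[float],
    mu: float = 1e-3,
) -> float:
    """Approximate the derivative of f at x along `direction`.

    Uses exactly two evaluations of f (a symmetric difference) and returns a
    single scalar, so no gradient vector is computed or communicated.
    """
    forward = [xi + mu * di for xi, di in zip(x, direction)]
    backward = [xi - mu * di for xi, di in zip(x, direction)]
    return (f(forward) - f(backward)) / (2.0 * mu)


def squared_norm(p: List[float]) -> float:
    return sum(c * c for c in p)


# The gradient of the squared norm at (1, 2) is (2, 4); along (1, 0) the
# directional derivative is 2.
print(directional_derivative(squared_norm, [1.0, 2.0], [1.0, 0.0]))
```

If workers share the directions in advance, each one only needs to send the returned scalar per direction, which is where the communication saving described in the abstract comes from.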

preprint 2020 arXiv

BlockMarkchain: A Secure Decentralized Data Market with a Constant Load on the Blockchain

In this paper, we develop BlockMarkchain, a secure data marketplace where individual data sellers can exchange certified data with buyers in a secure environment, without any mutual trust among the parties and without trusting a third party as a mediator. To develop this platform, we rely on a smart contract deployed on a secure public blockchain. The main challenges here are to verify the validity of data and to prevent malicious behavior of the parties, while preserving the privacy of the data and taking into account the limited computing and storage resources available on the blockchain. In BlockMarkchain, the buyer has the option to dispute the honesty of the seller and prove the invalidity of the data to the smart contract. The smart contract evaluates the buyer's claim and punishes the dishonest party by forfeiting his/her deposit in favor of the honest party. BlockMarkchain enjoys several salient features including (i) the certified data is never revealed on the public blockchain, (ii) the size of data posted on the blockchain, the load of computation on the blockchain, and the cost of communication with the blockchain are constant and negligible, and (iii) the computation cost of verifications on the parties is not expensive.

preprint 2020 arXiv

Coded Secure Multi-Party Computation for Massive Matrices with Adversarial Nodes

In this work, we consider the problem of secure multi-party computation (MPC), consisting of $\Gamma$ sources, each with access to a large private matrix, $N$ processing nodes or workers, and one data collector or master. The master is interested in the result of a polynomial function of the input matrices. Each source sends a randomized function of its matrix, called its share, to each worker. The workers process their shares in interaction with each other, and send some results to the master such that it can derive the final result. There are several constraints: (1) each worker can store a function of each input matrix, with the size of a $\frac{1}{m}$ fraction of that input matrix, (2) up to $t$ of the workers, for some integer $t$, are adversarial and may collude to gain information about the private inputs or take malicious actions to make the final result incorrect. The objective is to design an MPC scheme with the minimum number of workers, called the recovery threshold, such that the final result is correct, workers learn no information about the input matrices, and the master learns nothing beyond the final result. In this paper, we propose an MPC scheme that achieves the recovery threshold of $3t+2m-1$ workers, which is order-wise less than the recovery threshold of the conventional methods. The challenge in dealing with this setup is that when nodes interact with each other, the malicious messages that adversarial nodes generate propagate through the system, and can mislead the honest nodes. To deal with this challenge, we design some subroutines that can detect erroneous messages, and correct or drop them.

preprint 2020 arXiv

Corella: A Private Multi Server Learning Approach based on Correlated Queries

The emerging applications of machine learning algorithms on mobile devices motivate us to offload the computation tasks of training a model or deploying a trained one to the cloud or at the edge of the network. One of the major challenges in this setup is to guarantee the privacy of the client data. Various methods have been proposed to protect privacy in the literature. These include (i) adding noise to the client data, which reduces the accuracy of the result, (ii) using secure multiparty computation (MPC), which requires significant communication among the computing nodes or with the client, and (iii) relying on homomorphic encryption (HE) methods, which significantly increase the computation load at the servers. In this paper, we propose $\textit{Corella}$ as an alternative approach to protect the privacy of data. The proposed scheme relies on a cluster of servers, where at most $T \in \mathbb{N}$ of them may collude, each running a learning model (e.g., a deep neural network). Each server is fed with the client data, perturbed by $\textit{strong}$ noise that is independent of the user data. The variance of the noise is set to be large enough to make the information leakage to any subset of up to $T$ servers information-theoretically negligible. On the other hand, the added noises for different servers are $\textit{correlated}$. This correlation among the queries allows the parameters of the models running on different servers to be $\textit{trained}$ such that the client can mitigate the contribution of the noises by combining the outputs of the servers, and recover the final result with high accuracy and with a minor computational effort. Simulation results for various datasets demonstrate the accuracy of the proposed approach for classification, using deep neural networks, and for the autoencoder, as supervised and unsupervised learning tasks, respectively.

preprint 2020 arXiv

DRL-Based QoS-Aware Resource Allocation Scheme for Coexistence of Licensed and Unlicensed Users in LTE and Beyond

In this paper, we employ deep reinforcement learning to develop a novel radio resource allocation and packet scheduling scheme for different Quality of Service (QoS) requirements applicable to LTE-Advanced and 5G networks. In addition, given the scarcity of spectrum in bands below 6 GHz, the proposed algorithm dynamically allocates the resource blocks (RBs) to licensed users in a way that mostly preserves the continuity of unallocated RBs. This would improve the efficiency of communication among the unlicensed entities by increasing the chance of uninterrupted communication and reducing the load of coordination overheads. The optimization problem is formulated as a Markov Decision Process (MDP), observing the entire queue of the demands, where failing to meet QoS constraints penalizes the goal with a multiplicative factor. Furthermore, a notion of continuity for unallocated resources is taken into account as an additive term in the objective function. Considering the variations in both channel coefficients and users' requests, we utilize a deep reinforcement learning algorithm as an online and numerically efficient approach to solve the MDP. Numerical results show that the proposed method achieves higher average spectral efficiency, while considering delay budget and packet loss ratio, compared to the conventional greedy min-delay and max-throughput schemes, in which a fixed part of the spectrum is forced to be vacant for unlicensed entities.

preprint 2020 arXiv

Private Sequential Function Computation

Consider a system including a user, $N$ servers, and $K$ basic functions which are known at all of the servers. Using combinations of those basic functions, it is possible to construct a wide class of functions. The user wishes to compute a particular combination of the basic functions by offloading the computation to the $N$ servers, while the servers should not obtain any information about which combination of the basic functions is to be computed. The objective is to minimize the total number of queries the user sends to the servers to achieve the desired result. As a first step toward this problem, in this paper, we consider the case where the user is interested in a class of functions which are compositions of the basic functions, where each basic function appears in the composition exactly once. This means that in this case, to ensure privacy, we only need to hide the order of the basic functions in the desired composition of the user. We further assume that the basic functions are linear and can be represented by (possibly large-scale) matrices. We call this problem private sequential function computation. We study the capacity $C$, defined as the supremum of the number of desired computations, normalized by the number of computations done at the servers, subject to the privacy constraint. We prove that $(1-\frac{1}{N})/ (1-\frac{1}{\max(K,N)}) \le C \le 1$. For the achievability, we show that the user can retrieve the desired order of composition, by choosing a proper order of inquiries among different servers, while keeping the order of computations for each server fixed, irrespective of the desired order of composition. In the end, we develop an information-theoretic converse which results in an upper bound on the capacity.
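
The lower bound on $C$ in this abstract is easy to evaluate for concrete $N$ and $K$. The sketch below does only that; the function name `capacity_lower_bound` and the requirement $N \ge 2$ (so that the denominator is nonzero) are assumptions for illustration.

```python
from fractions import Fraction


def capacity_lower_bound(N: int, K: int) -> Fraction:
    """Evaluate (1 - 1/N) / (1 - 1/max(K, N)), the lower bound on C.

    N: number of servers, K: number of basic functions.
    """
    if N < 2 or K < 1:
        raise ValueError("expected N >= 2 and K >= 1")
    largest = max(K, N)
    return (1 - Fraction(1, N)) / (1 - Fraction(1, largest))


# With at least as many servers as basic functions, the bound equals 1 and
# therefore matches the upper bound C <= 1.
print(capacity_lower_bound(N=3, K=2))
# With fewer servers than basic functions, a gap to 1 remains.
print(capacity_lower_bound(N=2, K=3))
```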

preprint 2020 arXiv

Secure Coded Multi-Party Computation for Massive Matrix Operations

In this paper, we consider a secure multi-party computation (MPC) problem, where the goal is to offload the computation of an arbitrary polynomial function of some massive private matrices (inputs) to a cluster of workers. The workers are not reliable. Some of them may collude to gain information about the input data (semi-honest workers). The system is initialized by sharing a (randomized) function of each input matrix with each server. Since the input matrices are massive, each share's size is assumed to be at most a $1/k$ fraction of the input matrix, for some $k \in \mathbb{N}$. The objective is to minimize the number of workers needed to perform the computation task correctly, such that even if an arbitrary subset of $t-1$ workers, for some $t \in \mathbb{N}$, collude, they cannot gain any information about the input matrices. We propose a sharing scheme, called \emph{polynomial sharing}, and show that it admits basic operations such as addition and multiplication of matrices and transposing a matrix. By concatenating the procedures for basic operations, we show that any polynomial function of the input matrices can be calculated, subject to the problem constraints. We show that the proposed scheme can offer order-wise gain in terms of the number of workers needed, compared to the approaches formed by the concatenation of job splitting and conventional MPC approaches.

preprint2020arXiv

Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding

We consider the problem of massive matrix multiplication, which underlies many data analytic applications, in a large-scale distributed system comprising a group of worker nodes. We target the stragglers' delay performance bottleneck, which is due to the unpredictable latency in waiting for the slowest nodes (or stragglers) to finish their tasks. We propose a novel coding strategy, named \emph{entangled polynomial code}, for designing the intermediate computations at the worker nodes in order to minimize the recovery threshold (i.e., the number of workers that we need to wait for in order to compute the final output). We demonstrate the optimality of the entangled polynomial code in several cases, and show that it provides an orderwise improvement over the conventional schemes for straggler mitigation. Furthermore, we characterize the optimal recovery threshold among all linear coding strategies within a factor of $2$ using \emph{bilinear complexity}, by developing an improved version of the entangled polynomial code. In particular, while evaluating bilinear complexity is a well-known challenging problem, we show that the optimal recovery threshold for linear coding strategies can be approximated within a factor of $2$ of this fundamental quantity. In addition, the improved version of the entangled polynomial code enables a further, orderwise reduction in the recovery threshold compared to its basic version. Finally, we show that the techniques developed in this paper can also be extended to several other problems such as coded convolution and fault-tolerant computing, leading to tight characterizations.
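
To make the notion of a recovery threshold concrete, here is a minimal sketch of the simpler polynomial-code idea from this line of work. It is not the entangled polynomial code of the paper. It assumes scalar blocks, six hypothetical workers with evaluation points 1 to 6, and exact rational arithmetic; all names are illustrative.

```python
from fractions import Fraction


def times_x_minus(coeffs, root):
    """Multiply a polynomial (coefficients low to high) by (x - root)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i + 1] += c
        out[i] -= root * c
    return out


def interpolate(points):
    """Lagrange interpolation: coefficients of the unique polynomial of
    degree < len(points) passing through the given (x, y) pairs."""
    n = len(points)
    result = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        basis, denom = [Fraction(1)], Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = times_x_minus(basis, xj)
                denom *= xi - xj
        for d in range(n):
            result[d] += yi * basis[d] / denom
    return result


# A is split into m = 2 row blocks, B into n = 2 column blocks.
# Blocks are scalars here, an assumption made to keep the sketch short.
a_blocks = [3, 5]
b_blocks = [2, 7]
m = len(a_blocks)


def worker(x):
    """Each worker multiplies one coded block of A by one coded block of B."""
    a_coded = sum(a * x**j for j, a in enumerate(a_blocks))
    b_coded = sum(b * x ** (k * m) for k, b in enumerate(b_blocks))
    return a_coded * b_coded


results = {x: worker(x) for x in range(1, 7)}

# The product polynomial has degree 3, so any 4 finished workers suffice;
# here workers 1 and 4 are treated as stragglers and ignored.
finished = [2, 3, 5, 6]
decoded = interpolate([(x, results[x]) for x in finished])
# decoded[j + k * m] is the product a_blocks[j] * b_blocks[k].
```

In this toy setting the recovery threshold is 4 out of 6 workers: the master never waits for the two slowest nodes, whichever they happen to be.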

preprint2016arXiv

A Unified Coding Framework for Distributed Computing with Straggling Servers

We propose a unified coded framework for distributed computing with straggling servers, by introducing a tradeoff between "latency of computation" and "load of communication" for some linear computation tasks. We show that the coded scheme of [1]-[3], which repeats the intermediate computations to create coded multicasting opportunities that reduce the communication load, and the coded scheme of [4], [5], which generates redundant intermediate computations to combat straggling servers, can be viewed as special instances of the proposed framework, obtained at the two extremes of this tradeoff: minimizing either the load of communication or the latency of computation individually. Furthermore, the latency-load tradeoff achieved by the proposed coded framework allows one to systematically operate at any point on that tradeoff when performing distributed computing tasks. We also prove an information-theoretic lower bound on the latency-load tradeoff, which is shown to be within a constant multiplicative gap from the achieved tradeoff at the two end points.

preprint2016arXiv

Approximate Capacity Region of the Two-User MISO Broadcast Channels with Delayed CSIT

We consider the problem of multiple-input single-output broadcast channels with Rayleigh fading where the transmitter has access to delayed knowledge of the channel state information. We first characterize the capacity region of this channel with two users to within a constant number of bits for all values of the transmit power. The proposed signaling strategy utilizes the delayed knowledge of the channel state information and the previously transmitted signals in order to create a signal of common interest for both receivers. This signal is the quantized version of the summation of the previously transmitted signals. A challenge that arises in deriving the result for finite signal-to-noise ratio regimes is the correlation that exists between the quantization noise and the signal. To guarantee the independence of quantization noise and signal, we extend the framework of lattice quantizers with dither together with an interleaving step. For the converse, we use the fact that the capacity region of this problem is upper-bounded by the capacity region of a physically degraded broadcast channel with no channel state information where one receiver has two antennas. Then, we derive an outer bound on the capacity region of this degraded broadcast channel. Finally, we show how to extend our results to obtain the approximate capacity of the $K$-user multiple-input single-output broadcast channel with delayed knowledge of the channel state information at the transmitter to within $2 \log_2 \left( K + 2 \right)$ bits/s/Hz.

preprint2016arXiv

Fundamental Limits of Cache-Aided Interference Management

We consider a system comprising a library of $N$ files (e.g., movies) and a wireless network with $K_T$ transmitters, each equipped with a local cache of size of $M_T$ files, and $K_R$ receivers, each equipped with a local cache of size of $M_R$ files. Each receiver will ask for one of the $N$ files in the library, which needs to be delivered. The objective is to design the cache placement (without prior knowledge of receivers' future requests) and the communication scheme to maximize the throughput of the delivery. In this setting, we show that the sum degrees-of-freedom (sum-DoF) of $\min\left\{\frac{K_T M_T+K_R M_R}{N},K_R\right\}$ is achievable, and this is within a factor of 2 of the optimum, under one-shot linear schemes. This result shows that (i) the one-shot sum-DoF scales linearly with the aggregate cache size in the network (i.e., the cumulative memory available at all nodes), (ii) the transmitters' and receivers' caches contribute equally in the one-shot sum-DoF, and (iii) caching can offer a throughput gain that scales linearly with the size of the network. To prove the result, we propose an achievable scheme that exploits the redundancy of the content at transmitters' caches to cooperatively zero-force some outgoing interference and availability of the unintended content at receivers' caches to cancel (subtract) some of the incoming interference. We develop a particular pattern for cache placement that maximizes the overall gains of cache-aided transmit and receive interference cancellations. For the converse, we present an integer optimization problem which minimizes the number of communication blocks needed to deliver any set of requested files to the receivers. We then provide a lower bound on the value of this optimization problem, hence leading to an upper bound on the linear one-shot sum-DoF of the network, which is within a factor of 2 of the achievable sum-DoF.
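
The achievable one-shot sum-DoF expression in this abstract can be evaluated directly. The sketch below does so with exact fractions; the function name `one_shot_sum_dof` and the sample parameter values are assumptions chosen for illustration and do not come from the paper.

```python
from fractions import Fraction


def one_shot_sum_dof(k_t: int, m_t: int, k_r: int, m_r: int, n_files: int) -> Fraction:
    """Evaluate min{(K_T * M_T + K_R * M_R) / N, K_R} from the abstract.

    Hypothetical helper; it only evaluates the stated expression and does
    not check any feasibility conditions on the cache sizes.
    """
    aggregate_cache = Fraction(k_t * m_t + k_r * m_r, n_files)
    return min(aggregate_cache, Fraction(k_r))


# Cache-limited regime: the sum-DoF grows with the aggregate cache size.
dof_small = one_shot_sum_dof(k_t=2, m_t=2, k_r=4, m_r=1, n_files=4)
# Saturated regime: the sum-DoF is capped by the number of receivers K_R.
dof_capped = one_shot_sum_dof(k_t=3, m_t=4, k_r=3, m_r=2, n_files=3)
```

Because $K_T M_T$ and $K_R M_R$ enter the expression symmetrically, moving a unit of aggregate memory from the transmitters to the receivers leaves the value unchanged, which is the sense in which the abstract says both caches contribute equally.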

preprint2015arXiv

Blind Index Coding

We introduce the blind index coding (BIC) problem, in which a single sender communicates distinct messages to multiple users over a shared channel. Each user has partial knowledge of each message as side information. However, unlike classic index coding, in BIC the sender is uncertain of what side information is available to each user. In particular, the sender only knows the number of bits in each user's side information but not its content. This problem can arise naturally in caching and wireless networks. In order to blindly exploit side information in the BIC problem, we develop a hybrid coding scheme that XORs uncoded bits of a subset of messages with random combinations of bits from other messages. This scheme allows us to strike the right balance between maximizing the transmission rate to each user and minimizing the interference leakage to others. We also develop a general outer bound, which relies on a strong data processing inequality to effectively capture the sender's uncertainty about the users' side information. Additionally, we consider the case where communication takes place over a shared wireless medium, modeled by an erasure broadcast channel, and show that, surprisingly, combining repetition coding with hybrid coding improves the achievable rate region and outperforms alternative strategies for coping with channel erasures while blindly exploiting side information.

preprint2015arXiv

Coded MapReduce

MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication load in the shuffling phase of MapReduce, and thus accelerate its execution. The proposed Coded MapReduce exploits the repetitive mapping of data blocks at different servers to create coding opportunities in the shuffling phase, so that (key, value) pairs are exchanged among servers much more efficiently. We demonstrate that Coded MapReduce can cut down the total inter-server communication load by a multiplicative factor that grows linearly with the number of servers in the system, and that it achieves the minimum communication load within a constant multiplicative factor. We also analyze the tradeoff between the "computation load" and the "communication load" of Coded MapReduce.

preprint2015arXiv

Cooperation Alignment for Distributed Interference Management

We consider a cooperative Gaussian interference channel in which each receiver must decode its intended message locally, with the help of cooperation either at the receiver side or at the transmitter side. In the case of receiver cooperation, the receivers can process and share information through limited-capacity backhaul links. In contrast to various previously considered distributed antenna architectures, where processing is performed entirely in a centralized fashion, the model considered in this paper aims to capture the essence of decentralized processing, allowing for a more general class of "interactive" interference management strategies. Focusing on the three-user case, we characterize the fundamental tradeoff between the achievable communication rates and the corresponding backhaul cooperation rate, in terms of degrees of freedom (DoF). Surprisingly, we show that the optimum communication-cooperation tradeoff per user remains the same when we move from two-user to three-user interference channels. In the absence of cooperation, this is due to interference alignment, which keeps the fraction of communication dimensions wasted on interference unchanged. When backhaul cooperation is available, we develop a new idea that we call cooperation alignment, which guarantees that the average (per user) backhaul load remains the same as we increase the number of users. In the case of transmitter cooperation, the transmitters can form their jointly precoded signals through an interactive protocol over the backhaul. In this case, we show that the optimal (per user) tradeoff between the achievable communication rates and the corresponding backhaul cooperation rate in the three-user case is the same as for receiver cooperation.

preprint2014arXiv

Capacity Results for Binary Fading Interference Channels with Delayed CSIT

To study the effect of the lack of up-to-date channel state information at the transmitters (CSIT), we consider two-user binary fading interference channels with Delayed-CSIT. We characterize the capacity region for such channels under a homogeneous assumption, where channel gains are independent and identically distributed across time and space, eliminating the possibility of exploiting time/space correlation. We introduce and discuss several novel coding opportunities created by outdated CSIT that can enlarge the achievable rate region. The capacity-achieving scheme relies on accurate combination, concatenation, and merging of these opportunities, depending on the channel statistics. The outer bounds are based on an extremal inequality we develop for a binary broadcast channel with Delayed-CSIT. We further extend the results and characterize the capacity region when output feedback links are available from the receivers to the transmitters in addition to the delayed knowledge of the channel state information. We also discuss the extension of our results to the non-homogeneous setting.

preprint2014arXiv

Cellular Interference Alignment

Interference alignment promises that, in Gaussian interference channels, each link can support half of a degree of freedom (DoF) per pair of transmit-receive antennas. However, in general, this result requires precoding the data-bearing signals over a signal space of asymptotically large diversity, e.g., over an infinite number of dimensions for time-frequency varying fading channels, or over an infinite number of rationally independent signal levels in the case of time-frequency invariant channels. In this work we consider a wireless cellular system scenario where the promised optimal DoFs are achieved with linear precoding in one shot (i.e., over a single time-frequency slot). We focus on the uplink of a symmetric cellular system, where each cell is split into three sectors with orthogonal intra-sector multiple access. In our model, interference is "local", i.e., it is due to transmitters in neighboring cells only. We consider a message-passing backhaul network architecture, in which nearby sectors can exchange already decoded messages, and propose an alignment solution that can achieve the optimal DoFs. To avoid signaling schemes relying on the strength of interference, we further introduce the notion of \emph{topologically robust} schemes, which are able to guarantee a minimum rate (or DoFs) irrespective of the strength of the interfering links. To this end, we design an alignment scheme which is topologically robust and still achieves the same optimum DoFs.

preprint2014arXiv

Cellular Interference Alignment: Omni-Directional Antennas and Asymmetric Configurations

Although interference alignment (IA) can theoretically achieve the optimal degrees of freedom (DoFs) in the $K$-user Gaussian interference channel, its direct application comes at the prohibitive cost of precoding over exponentially-many signaling dimensions. On the other hand, it is known that practical "one-shot" IA precoding (i.e., linear schemes without symbol expansion) provides a vanishing DoFs gain in large fully-connected networks with generic channel coefficients. In our previous work, we introduced the concept of "Cellular IA" for a network topology induced by hexagonal cells with sectors and nearest-neighbor interference. Assuming that neighboring sectors can exchange decoded messages (and not received signal samples) in the uplink, we showed that linear one-shot IA precoding over $M$ transmit/receive antennas can achieve the optimal $M/2$ DoFs per user. In this paper we extend this framework to networks with omni-directional (non-sectorized) cells and consider the practical scenario where users have $2$ antennas, and base-stations have $2$, $3$ or $4$ antennas. In particular, we provide linear one-shot IA schemes for the $2\times 2$, $2\times3$ and $2\times 4$ cases, and show the achievability of $3/4$, $1$ and $7/6$ DoFs per user, respectively. DoFs converses for one-shot schemes require the solution of a discrete optimization problem over a number of variables that grows with the network size. We develop a new approach to transform such challenging optimization problem into a tractable linear program (LP) with significantly fewer variables. This approach is used to show that the achievable $3/4$ DoFs per user are indeed optimal for a large (extended) cellular network with $2\times 2$ links.

preprint2014arXiv

Coded Caching for Delay-Sensitive Content

Coded caching is a recently proposed technique that achieves significant performance gains for cache networks compared to uncoded caching schemes. However, this substantial coding gain is attained at the cost of large delivery delay, which is not tolerable in delay-sensitive applications such as video streaming. In this paper, we identify and investigate the tradeoff between the performance gain of coded caching and the delivery delay. We propose a computationally efficient caching algorithm that provides the gains of coding and respects delay constraints. The proposed algorithm achieves the optimum performance for large delay, but still offers major gains for small delay. These gains are demonstrated in a practical setting with a video-streaming prototype.

preprint2014arXiv

Decentralized Coded Caching Attains Order-Optimal Memory-Rate Tradeoff

Replicating or caching popular content in memories distributed across the network is a technique to reduce peak network loads. Conventionally, the main performance gain of this caching was thought to result from making part of the requested data available closer to end users. Instead, we recently showed that a much more significant gain can be achieved by using caches to create coded-multicasting opportunities, even for users with different demands, through coding across data streams. These coded-multicasting opportunities are enabled by careful content overlap at the various caches in the network, created by a central coordinating server. In many scenarios, such a central coordinating server may not be available, raising the question if this multicasting gain can still be achieved in a more decentralized setting. In this paper, we propose an efficient caching scheme, in which the content placement is performed in a decentralized manner. In other words, no coordination is required for the content placement. Despite this lack of coordination, the proposed scheme is nevertheless able to create coded-multicasting opportunities and achieves a rate close to the optimal centralized scheme.

preprint2014arXiv

Fundamental Limits of Caching

Caching is a technique to reduce peak traffic rates by prefetching popular content into memories at the end users. Conventionally, these memories are used to deliver requested content in part from a locally cached copy rather than through the network. The gain offered by this approach, which we term local caching gain, depends on the local cache size (i.e., the memory available at each individual user). In this paper, we introduce and exploit a second, global, caching gain not utilized by conventional caching schemes. This gain depends on the aggregate global cache size (i.e., the cumulative memory available at all users), even though there is no cooperation among the users. To evaluate and isolate these two gains, we introduce an information-theoretic formulation of the caching problem focusing on its basic structure. For this setting, we propose a novel coded caching scheme that exploits both local and global caching gains, leading to a multiplicative improvement in the peak rate compared to previously known schemes. In particular, the improvement can be on the order of the number of users in the network. Moreover, we argue that the performance of the proposed scheme is within a constant factor of the information-theoretic optimum for all values of the problem parameters.
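
The abstract does not spell out the scheme, so the following is only a toy illustration of the global caching gain in the smallest case: two files, two users, and a cache of one file's worth of bits per user. The file contents, variable names and request pattern are assumptions made for the example.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


# Two equal-size files, each split into two halves (toy contents).
file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
half = len(file_a) // 2
a1, a2 = file_a[:half], file_a[half:]
b1, b2 = file_b[:half], file_b[half:]

# Placement phase, before requests are known: each user stores one half
# of every file, so the two caches hold different halves.
cache_user1 = {"A1": a1, "B1": b1}
cache_user2 = {"A2": a2, "B2": b2}

# Delivery phase: user 1 requests file A, user 2 requests file B.
# One coded transmission of half a file serves both users at once.
coded = xor_bytes(a2, b1)

# User 1 cancels B1 from the coded message to obtain its missing half A2.
recovered_a = cache_user1["A1"] + xor_bytes(coded, cache_user1["B1"])
# User 2 cancels A2 from the coded message to obtain its missing half B1.
recovered_b = xor_bytes(coded, cache_user2["A2"]) + cache_user2["B2"]

# Sending the two missing halves separately would cost a full file.
uncoded_length = len(a2) + len(b1)
```

Here the shared link carries half a file instead of a full file, even though the two users request different content and never cooperate.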

preprint2014arXiv

On Uplink-Downlink Duality for Cellular IA

In our previous work we considered the uplink of a hexagonal cellular network topology and showed that linear "one-shot" interference alignment (IA) schemes are able to achieve the optimal degrees of freedom (DoFs) per user, under a decoded-message passing framework that allows base-stations to exchange their own decoded messages over local backhaul links. In this work, we provide the dual framework for the downlink of cellular networks with the same backhaul architecture, and show that for every "one-shot" IA scheme that can achieve $d$ DoFs per user in the uplink, there exists a dual "one-shot" IA scheme that can achieve the same DoFs in the downlink. To enable "Cellular IA" for the downlink, base-stations will now use the same local backhaul links to exchange quantized versions of the dirty-paper precoded signals instead of user messages.

preprint2013arXiv

Communication Through Collisions: Opportunistic Utilization of Past Receptions

When several wireless users are sharing the spectrum, packet collision is a simple, yet widely used model for interference. Under this model, when transmitters cause interference at any of the receivers, their collided packets are discarded and need to be retransmitted. However, in reality, that receiver can still store its analog received signal and utilize it for decoding the packets in the future (for example, by successive interference cancellation techniques). In this work, we propose a physical layer model for wireless packet networks that allows for such flexibility at the receivers. We assume that the transmitters will be aware of the state of the channel (i.e. when and where collisions occur, or an unintended receiver overhears the signal) with some delay, and propose several coding opportunities that can be utilized by the transmitters to exploit the available signal at the receivers for interference management (as opposed to discarding them). We analyze the achievable throughput of our strategy in a canonical interference channel with two transmitter-receiver pairs, and demonstrate the gain over conventional schemes. By deriving an outer-bound, we also prove the optimality of our scheme for the corresponding model.

preprint2013arXiv

Online Coded Caching

We consider a basic content distribution scenario consisting of a single origin server connected through a shared bottleneck link to a number of users each equipped with a cache of finite memory. The users issue a sequence of content requests from a set of popular files, and the goal is to operate the caches as well as the server such that these requests are satisfied with the minimum number of bits sent over the shared link. Assuming a basic Markov model for renewing the set of popular files, we characterize approximately the optimal long-term average rate of the shared link. We further prove that the optimal online scheme has approximately the same performance as the optimal offline scheme, in which the cache contents can be updated based on the entire set of popular files before each new request. To support these theoretical results, we propose an online coded caching scheme termed coded least-recently sent (LRS) and simulate it for a demand time series derived from the dataset made available by Netflix for the Netflix Prize. For this time series, we show that the proposed coded LRS algorithm significantly outperforms the popular least-recently used (LRU) caching algorithm.

preprint2012arXiv

Completely Stale Transmitter Channel State Information is Still Very Useful

Transmitter channel state information (CSIT) is crucial for the multiplexing gains offered by advanced interference management techniques such as multiuser MIMO and interference alignment. Such CSIT is usually obtained by feedback from the receivers, but the feedback is subject to delays. The usual approach is to use the fed back information to predict the current channel state and then apply a scheme designed assuming perfect CSIT. When the feedback delay is large compared to the channel coherence time, such a prediction approach completely fails to achieve any multiplexing gain. In this paper, we show that even in this case, the completely stale CSI is still very useful. More concretely, we show that in a MIMO broadcast channel with $K$ transmit antennas and $K$ receivers each with 1 receive antenna, $\frac{K}{1+\frac{1}{2}+\dots+\frac{1}{K}}$ ($> 1$) degrees of freedom are achievable even when the fed back channel state is completely independent of the current channel state. Moreover, we establish that if all receivers have independent and identically distributed channels, then this is the optimal number of degrees of freedom achievable. In the optimal scheme, the transmitter uses the fed back CSI to learn the side information that the receivers receive from previous transmissions rather than to predict the current channel state. Our result can be viewed as the first example of feedback providing a degree-of-freedom gain in memoryless channels.
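
To see how the degrees-of-freedom expression in this abstract behaves, the sketch below evaluates it exactly for small $K$. The function name `stale_csit_dof` is an assumption made for illustration.

```python
from fractions import Fraction


def stale_csit_dof(k: int) -> Fraction:
    """Evaluate K / (1 + 1/2 + ... + 1/K), the DoF stated in the abstract.

    Hypothetical helper; it only evaluates the closed-form expression.
    """
    if k < 1:
        raise ValueError("K must be a positive integer")
    harmonic = sum(Fraction(1, j) for j in range(1, k + 1))
    return Fraction(k) / harmonic


dof_two_users = stale_csit_dof(2)
dof_three_users = stale_csit_dof(3)
```

For $K = 1$ the expression equals 1, and it exceeds 1 for every $K \ge 2$, which is the regime the "$> 1$" remark in the abstract refers to.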

preprint2010arXiv

A Fixed Precoding Approach to Achieve the Degrees of Freedom in X channel

This paper provides a fixed precoding scheme that achieves the degrees of freedom (DoF) of the generalized ergodic X channel, using the notion of ergodic interference alignment. In the proposed method, the transmitters do not need to know the full channel state information, an assumption that is an integral part of existing methods. Instead, a finite-rate feedback channel is adequate to achieve the DoF. In other words, it is demonstrated that quantized versions of the channel gains are adequate to achieve the DoF. To gain insight into how the proposed method works, we first rely on finite-field channel models, and then extend the approach to more realistic cases, including dispersive fading channels in the presence of a quantizer. In a Rayleigh fading environment, it is shown that a feedback rate of $2\log(p)+\Theta(\log\log p)$ can provide the DoF, where $p$ is the total transmit power.

preprint2007arXiv

Optimal Order of Decoding for Max-Min Fairness in $K$-User Memoryless Interference Channels

A $K$-user memoryless interference channel is considered where each receiver sequentially decodes the data of a subset of transmitters before it decodes the data of the designated transmitter. Therefore, the data rate of each transmitter depends on (i) the subset of receivers which decode the data of that transmitter, and (ii) the decoding order employed at each of these receivers. In this paper, a greedy algorithm is developed to find the users which are decoded at each receiver and the corresponding decoding order such that the minimum rate of the users is maximized. It is proven that the proposed algorithm is optimal.
