Researcher profile

Salim El Rouayheb

Salim El Rouayheb contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
11 works
0 followers
6 topics
4 close collaborators

Published work

11 published items

Preprint, 2026, arXiv

Self-Creating Random Walks for Decentralized Learning under Pac-Man Attacks

Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the "Pac-Man" attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the CREATE-IF-LATE (CIL) algorithm, which is a fully decentralized, resilient mechanism that enables self-creating RWs and prevents RW extinction in the presence of Pac-Man. Our theoretical analysis shows that the CIL algorithm guarantees several desirable properties, such as (i) non-extinction of the RW population, (ii) almost sure boundedness of the RW population, and (iii) convergence of RW-based stochastic gradient descent even in the presence of Pac-Man with a quantifiable deviation from the true optimum. Moreover, the learning process experiences at most a linear time delay due to Pac-Man interruptions and RW regeneration. Our extensive empirical results on both synthetic and public benchmark datasets validate our theoretical findings.
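The attack described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the ring topology, the uniform left/right step, and the names `simulate_pacman`, `pacman`, and `p_kill` are hypothetical choices made for illustration, and the CIL defense itself is not implemented because the abstract does not specify its rules.

```python
import random

def simulate_pacman(n_nodes, n_walks, pacman, p_kill, steps, seed=0):
    """Count surviving random walks after each step on a ring graph.

    Assumptions (not from the paper): nodes form a ring, each walk moves
    to a uniformly chosen neighbor, and the single malicious node
    `pacman` terminates an arriving walk with probability `p_kill`.
    """
    rng = random.Random(seed)
    walks = [rng.randrange(n_nodes) for _ in range(n_walks)]
    alive_history = []
    for _ in range(steps):
        survivors = []
        for pos in walks:
            pos = (pos + rng.choice((-1, 1))) % n_nodes  # one hop on the ring
            if pos == pacman and rng.random() < p_kill:
                continue  # the walk is silently terminated
            survivors.append(pos)
        walks = survivors
        alive_history.append(len(walks))
    return alive_history

history = simulate_pacman(n_nodes=10, n_walks=5, pacman=0, p_kill=0.5, steps=200)
print(history[-1])  # number of walks still alive after 200 steps
```

Without any regeneration rule the population can only shrink, which is the extinction risk that a self-creating mechanism would need to counter.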

Preprint, 2022, arXiv

Adaptive Stochastic Gradient Descent for Fast and Communication-Efficient Distributed Learning

We consider the setting where a master wants to run a distributed stochastic gradient descent (SGD) algorithm on $n$ workers, each having a subset of the data. Distributed SGD may suffer from the effect of stragglers, i.e., slow or unresponsive workers who cause delays. One solution studied in the literature is to wait at each iteration for the responses of the fastest $k<n$ workers before updating the model, where $k$ is a fixed parameter. The choice of the value of $k$ presents a trade-off between the runtime (i.e., convergence rate) of SGD and the error of the model. Towards optimizing the error-runtime trade-off, we investigate distributed SGD with adaptive $k$, i.e., varying $k$ throughout the runtime of the algorithm. We first design an adaptive policy for varying $k$ that optimizes this trade-off based on an upper bound on the error as a function of the wall-clock time that we derive. Then, we propose and implement an algorithm for adaptive distributed SGD that is based on a statistical heuristic. Our results show that the adaptive version of distributed SGD can reach lower error values in less time compared to non-adaptive implementations. Moreover, the results also show that the adaptive version is communication-efficient, where the amount of communication required between the master and the workers is less than that of non-adaptive versions.
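The fastest-$k$ rule mentioned in this abstract can be sketched as follows. This is a minimal illustration with a fixed $k$ and scalar gradients; the function names `fastest_k` and `sgd_step` are hypothetical, and the adaptive policy for varying $k$ is not reproduced because the abstract does not give it.

```python
def fastest_k(response_times, k):
    """Return the indices of the k fastest workers and the master's wait time.

    The master waits until the k-th fastest response arrives, so the wait
    equals the k-th smallest response time.
    """
    order = sorted(range(len(response_times)), key=lambda i: response_times[i])
    chosen = order[:k]
    wait = response_times[chosen[-1]]
    return chosen, wait

def sgd_step(w, worker_grads, response_times, k, lr):
    """One model update using only the gradients of the k fastest workers."""
    chosen, wait = fastest_k(response_times, k)
    avg_grad = sum(worker_grads[i] for i in chosen) / k
    return w - lr * avg_grad, wait

# Hypothetical iteration with 4 workers, waiting for the fastest 2.
w, wait = sgd_step(1.0, [0.5, 1.0, 3.0, 9.0], [3.0, 1.0, 2.0, 5.0], k=2, lr=0.1)
```

A smaller `k` shortens the wait per iteration but averages fewer gradients, which is the runtime versus error trade-off the abstract refers to.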

Preprint, 2022, arXiv

Field Trace Polynomial Codes for Secure Distributed Matrix Multiplication

We consider the problem of communication-efficient secure distributed matrix multiplication. The previous literature has focused on reducing the number of servers as a proxy for minimizing communication costs. The intuition is that the more servers are used, the higher the communication cost. We show that this is not the case. Our central technique relies on adapting results from the literature on repairing Reed-Solomon codes where, instead of downloading the whole of the computing task, a user downloads field traces of these computations. We present field trace polynomial codes, a family of codes that explore this technique, and characterize regimes for which our codes outperform the existing codes in the literature.

Preprint, 2022, arXiv

Intermittent Private Information Retrieval with Application to Location Privacy

We study the problem of intermittent private information retrieval with multiple servers, in which a user consecutively requests one of K messages from N replicated databases such that some of the requests need to be protected while others do not need privacy. Motivated by the location privacy application, the correlation between requests is modeled by a Markov chain. We propose an intermittent private information retrieval scheme that concatenates an obfuscation scheme and a private information retrieval scheme for the time period when privacy is not needed, to prevent leakage incurred by the correlation over time. Finally, we illustrate how the proposed scheme for intermittent private information retrieval with Markov-structured correlation can be applied to design a location privacy protection mechanism.

Preprint, 2022, arXiv

Walk for Learning: A Random Walk Approach for Federated Learning from Heterogeneous Data

We consider the problem of a Parameter Server (PS) that wishes to learn a model that fits data distributed on the nodes of a graph. We focus on Federated Learning (FL) as a canonical application. One of the main challenges of FL is the communication bottleneck between the nodes and the parameter server. A popular solution in the literature is to allow each node to do several local updates on the model in each iteration before sending it back to the PS. While this mitigates the communication bottleneck, the statistical heterogeneity of the data owned by the different nodes has proven to delay convergence and bias the model. In this work, we study random walk (RW) learning algorithms for tackling the communication and data heterogeneity problems. The main idea is to leverage available direct connections among the nodes themselves, which are typically "cheaper" than the communication to the PS. In a random walk, the model is thought of as a "baton" that is passed from a node to one of its neighbors after being updated in each iteration. The challenge in designing the RW is the data heterogeneity and the uncertainty about the data distributions. Ideally, we would want to visit more often nodes that hold more informative data. We cast this problem as a sleeping multi-armed bandit (MAB) to design a near-optimal node sampling strategy that achieves variance-reduced gradient estimates and approaches sub-linearly the optimal sampling strategy. Based on this framework, we present an adaptive random walk learning algorithm. We provide theoretical guarantees on its convergence. Our numerical results validate our theoretical findings and show that our algorithm outperforms existing random walk algorithms.
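The baton-passing idea in this abstract can be sketched with a toy example. The assumptions here are loud ones: each node holds a single scalar, the loss is a squared error, and the next node is chosen uniformly among neighbors. That uniform choice is a stand-in for illustration only and is not the bandit-based sampling strategy the abstract describes; the name `random_walk_sgd` is hypothetical.

```python
import random

def random_walk_sgd(neighbors, data, lr, steps, start=0, seed=0):
    """Pass a scalar model along a random walk, updating it at each node.

    Assumptions (not from the paper): node i holds one value data[i], the
    local loss is 0.5 * (w - data[i]) ** 2, and the baton moves to a
    uniformly random neighbor after every update.
    """
    rng = random.Random(seed)
    node, w = start, 0.0
    path = [node]
    for _ in range(steps):
        grad = w - data[node]               # gradient of the local loss
        w -= lr * grad                      # local update at the current node
        node = rng.choice(neighbors[node])  # hand the baton to a neighbor
        path.append(node)
    return w, path

# Hypothetical 4-node ring; every node holds the same value for simplicity.
ring = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}
w, path = random_walk_sgd(ring, [4.0, 4.0, 4.0, 4.0], lr=0.5, steps=60)
```

Only node-to-node messages are exchanged in this loop, which reflects the motivation of avoiding repeated communication with the parameter server.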

Preprint, 2021, arXiv

Advances and Open Problems in Federated Learning

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.

Preprint, 2021, arXiv

Codes for Correcting Localized Deletions

We consider the problem of constructing binary codes for correcting deletions that are localized within certain parts of the codeword that are unknown a priori. The model that we study is when $δ\leq w$ deletions are localized in a window of size $w$ bits. These $δ$ deletions do not necessarily occur in consecutive positions, but are restricted to the window of size $w$. The localized deletions model is a generalization of the bursty model, in which all the deleted bits are consecutive. In this paper, we construct new explicit codes for the localized model, based on the family of Guess & Check codes which was previously introduced by the authors. The codes that we construct can correct, with high probability, $δ\leq w$ deletions that are localized in a single window of size $w$, where $w$ grows with the block length. Moreover, these codes are systematic; have low redundancy; and have efficient deterministic encoding and decoding algorithms. We also generalize these codes to deletions that are localized within multiple windows in the codeword.

Preprint, 2020, arXiv

GASP Codes for Secure Distributed Matrix Multiplication

We consider the problem of secure distributed matrix multiplication (SDMM) in which a user wishes to compute the product of two matrices with the assistance of honest but curious servers. We construct polynomial codes for SDMM by studying a combinatorial problem on a special type of addition table, which we call the degree table. The codes are based on arithmetic progressions, and are thus named GASP (Gap Additive Secure Polynomial) Codes. GASP Codes are shown to outperform all previously known polynomial codes for secure distributed matrix multiplication in terms of download rate.
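The addition table mentioned in this abstract can be illustrated with a few lines of code. This sketch assumes the usual polynomial-code setup, in which two polynomials with exponent sets `alphas` and `betas` are multiplied, so every exponent in the product is a sum of one exponent from each set. The exponent choices below are hypothetical and are not the GASP construction.

```python
def degree_table(alphas, betas):
    """Addition table whose (i, j) entry is alphas[i] + betas[j]."""
    return [[a + b for b in betas] for a in alphas]

def distinct_degrees(alphas, betas):
    """Sorted list of distinct exponents that can appear in the product."""
    return sorted({a + b for a in alphas for b in betas})

# Two hypothetical exponent assignments for 3-term polynomials.
spread = distinct_degrees([0, 1, 2], [0, 3, 6])   # all nine sums differ
overlap = distinct_degrees([0, 1, 2], [0, 1, 2])  # many sums coincide
print(len(spread), len(overlap))  # 9 5
```

Under the stated assumption, fewer distinct sums means fewer coefficients of the product polynomial to interpolate, which is why the combinatorics of such a table matters for the design of the code.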

Preprint, 2020, arXiv

Notes on Communication and Computation in Secure Distributed Matrix Multiplication

We consider the problem of secure distributed matrix multiplication in which a user wishes to compute the product of two matrices with the assistance of honest but curious servers. In this paper, we answer the following question: Is it beneficial to offload the computations if security is a concern? We answer this question in the affirmative by showing that by adjusting the parameters in a polynomial code we can obtain a trade-off between the user's and the servers' computational time. Indeed, we show that if the computational time complexity of an operation in $\mathbb{F}_q$ is at most $\mathcal{Z}_q$ and the computational time complexity of multiplying two $n\times n$ matrices is $\mathcal{O}(n^ω\mathcal{Z}_q)$ then, by optimizing the trade-off, the user together with the servers can compute the multiplication in $\mathcal{O}(n^{4-\frac{6}{ω+1}} \mathcal{Z}_q)$ time. We also show that if the user is only concerned with optimizing the download rate, a common assumption in the literature, then the problem can be converted into a simple private information retrieval problem by means of a scheme we call Private Oracle Querying. However, this comes at large upload and computational costs for both the user and the servers.

Preprint, 2018, arXiv

Minimizing Latency for Secure Coded Computing Using Secret Sharing via Staircase Codes

We consider the setting of a Master server, M, who possesses confidential data (e.g., personal, genomic or medical data) and wants to run intensive computations on it, as part of a machine learning algorithm for example. The Master wants to distribute these computations to untrusted workers who have volunteered or are incentivized to help with this task. However, the data must be kept private and not revealed to the individual workers. Some of the workers may be stragglers, e.g., slow or busy, and will take a random time to finish the task assigned to them. We are interested in reducing the delays experienced by the Master. We focus on linear computations as an essential operation in many iterative algorithms such as principal component analysis, support vector machines and other gradient-descent based algorithms. A classical solution is to use a linear secret sharing scheme, such as Shamir's scheme, to divide the data into secret shares on which the workers can perform linear computations. However, classical codes can provide straggler mitigation assuming a worst-case scenario of a fixed number of stragglers. We propose a solution based on new secure codes, called Staircase codes, introduced previously by two of the authors. Staircase codes allow flexibility in the number of stragglers up to a given maximum, and universally achieve the information theoretic limit on the download cost by the Master, leading to latency reduction. Under the shifted exponential model, we find upper and lower bounds on the Master's mean waiting time. We derive the distribution of the Master's waiting time, and its mean, for systems with up to two stragglers. For systems with any number of stragglers, we derive an expression that can give the exact distribution, and the mean, of the waiting time of the Master. We show that Staircase codes always outperform classical secret sharing codes.
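The classical baseline named in this abstract, Shamir's scheme with linear computation on the shares, can be sketched as follows. This illustrates only the baseline, not Staircase codes, whose construction the abstract does not give. The prime modulus, the scalar secret, and the function names are assumptions chosen for illustration.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1, an arbitrary choice for this sketch

def make_shares(secret, threshold, n_workers, rng):
    """Shamir sharing: any `threshold` shares determine the secret."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(threshold - 1)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc
    return [(x, poly(x)) for x in range(1, n_workers + 1)]

def worker_compute(share, a):
    """A worker multiplies its share by a public scalar (a linear operation)."""
    x, y = share
    return (x, (a * y) % P)

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod P."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

shares = make_shares(secret=1234, threshold=3, n_workers=5, rng=random.Random(0))
results = [worker_compute(s, 7) for s in shares]
print(reconstruct(results[:3]))  # 8638, i.e. 7 * 1234, from the first 3 responders
```

Because scaling a polynomial scales its constant term, the Master recovers the scaled secret from any 3 of the 5 responses, so up to 2 stragglers can be ignored. The number of responses it must wait for is fixed in advance, which is the rigidity the abstract contrasts with Staircase codes.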

Preprint, 2017, arXiv

Minimizing Latency for Secure Distributed Computing

We consider the setting of a master server who possesses confidential data (genomic, medical data, etc.) and wants to run intensive computations on it, as part of a machine learning algorithm for example. The master wants to distribute these computations to untrusted workers who have volunteered or are incentivized to help with this task. However, the data must be kept private (in an information theoretic sense) and not revealed to the individual workers. The workers may be busy, or even unresponsive, and will take a random time to finish the task assigned to them. We are interested in reducing the aggregate delay experienced by the master. We focus on linear computations as an essential operation in many iterative algorithms. A known solution is to use a linear secret sharing scheme to divide the data into secret shares on which the workers can compute. We propose to use instead new secure codes, called Staircase codes, introduced previously by two of the authors. We study the delay induced by Staircase codes which is always less than that of secret sharing. The reason is that secret sharing schemes need to wait for the responses of a fixed fraction of the workers, whereas Staircase codes offer more flexibility in this respect. For instance, for codes with rate $R=1/2$ Staircase codes can lead to up to $40\%$ reduction in delay compared to secret sharing.