Researcher profile

Soheil Mohajer

Soheil Mohajer contributes to research discovery and scholarly infrastructure.

Researcher - Affiliation not imported - Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
8 topics
4 close collaborators

Published work

9 published items

preprint, 2026, arXiv

VISTA: Decentralized Machine Learning in Adversary Dominated Environments

Decentralized machine learning often relies on outsourcing computations, such as gradient evaluations, to untrusted worker nodes. Existing robust aggregation methods can mitigate malicious behavior under honest-majority assumptions, but may fail when adversaries control a majority of the workers. We study this adversary-dominated setting through an incentive-oriented framework in which reports are accepted and rewarded only when they are mutually consistent up to a threshold. This turns the adversary from a pure saboteur into a rational agent that trades off increasing estimation error against the risk of rejection and loss of reward. We consider iterative optimization under this model. Unlike one-shot computation, iterative learning requires long-horizon decisions: permissive acceptance rules enable faster early progress but admit more adversarial corruption, while strict rules improve estimation accuracy but cause frequent rejections. We propose VISTA, an adaptive algorithm that tunes the acceptance threshold using the optimization history. Numerical results show that VISTA improves convergence over static thresholds. We also provide a rigorous convergence analysis showing that, with suitable incentive-aware adaptation, adversary-dominated decentralized learning can retain the asymptotic convergence behavior of standard SGD without relying on an honest majority.
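
The acceptance rule mentioned in the abstract (reports are accepted only when they are mutually consistent up to a threshold) can be illustrated with a minimal Python sketch. The function name `accept_reports`, the use of the maximum pairwise Euclidean distance as the consistency measure, and the averaging of accepted reports are all assumptions made for illustration; the abstract does not specify these details, and the adaptive threshold tuning performed by VISTA is not shown.

```python
from itertools import combinations
from math import dist


def accept_reports(reports, threshold):
    """Return the mean report if all reports are mutually consistent, else None.

    Hypothetical rule (an assumption, not taken from the paper): reports are
    consistent when every pairwise Euclidean distance is at most `threshold`.
    """
    if not reports:
        return None
    # Reject the whole round if any two reports disagree by more than the threshold.
    for a, b in combinations(reports, 2):
        if dist(a, b) > threshold:
            return None
    # Accepted: use the coordinate-wise average as the estimate.
    n = len(reports)
    return [sum(coords) / n for coords in zip(*reports)]
```

Under this sketch, a permissive (large) threshold accepts more rounds but tolerates larger disagreement among reports, while a strict (small) threshold rejects more rounds, which mirrors the trade-off described in the abstract.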

preprint, 2022, arXiv

DIMIX: DIminishing MIXing for Sloppy Agents

We study non-convex distributed optimization problems where a set of agents collaboratively solve a separable optimization problem that is distributed over a time-varying network. The existing methods to solve these problems rely on (at most) one time-scale algorithms, where each agent performs a diminishing or constant step-size gradient descent at the average estimate of the agents in the network. However, exchanging the exact information required to evaluate these average estimates, if possible at all, potentially introduces a massive communication overhead. Therefore, a reasonable practical assumption is that agents only receive a rough approximation of the neighboring agents' information. To address this, we introduce and study a two time-scale decentralized algorithm with a broad class of lossy information sharing methods (including noisy, quantized, and/or compressed information sharing) over time-varying networks. In our method, one time-scale suppresses the (imperfect) incoming information from the neighboring agents, and one time-scale operates on local cost functions' gradients. We show that, with a proper choice of the step-size parameters, the algorithm achieves a convergence rate of $\mathcal{O}(T^{-1/3+\epsilon})$ for non-convex distributed optimization problems over time-varying networks, for any $\epsilon>0$. Our simulation results support the theoretical results of the paper.

preprint, 2022, arXiv

Distributed Optimization over Time-varying Graphs with Imperfect Sharing of Information

We study strongly convex distributed optimization problems where a set of agents are interested in solving a separable optimization problem collaboratively. In this paper, we propose and study a two time-scale decentralized gradient descent algorithm for a broad class of lossy sharing of information over time-varying graphs. One time-scale fades out the (lossy) incoming information from neighboring agents, and one time-scale regulates the local loss functions' gradients. For strongly convex loss functions, with a proper choice of step-sizes, we show that the agents' estimates converge to the global optimal state at a rate of $O(T^{-1/2})$. Another important contribution of this work is to provide novel tools to deal with diminishing average weights over time-varying graphs.

preprint, 2022, arXiv

Matrix Completion with Hierarchical Graph Side Information

We consider a matrix completion problem that exploits social or item similarity graphs as side information. We develop a universal, parameter-free, and computationally efficient algorithm that starts with hierarchical graph clustering and then iteratively refines estimates on both graph clustering and matrix ratings. Under a hierarchical stochastic block model that well respects practically relevant social graphs and a low-rank rating matrix model (to be detailed), we demonstrate that our algorithm achieves the information-theoretic limit on the number of observed matrix entries (i.e., optimal sample complexity) that is derived by maximum likelihood estimation together with a lower-bound impossibility result. One consequence of this result is that exploiting the hierarchical structure of social graphs yields a substantial gain in sample complexity relative to the one that simply identifies different groups without resorting to the relational structure across them. We conduct extensive experiments on both synthetic and real-world datasets to corroborate our theoretical results as well as to demonstrate significant performance improvements over other matrix completion algorithms that leverage graph side information.

preprint, 2022, arXiv

Secure Determinant Codes for Distributed Storage Systems

Information-theoretically secure exact-repair regenerating codes for distributed storage systems (DSSs) with parameters $(n,k=d,d,\ell)$ are studied in this paper. We consider distributed storage systems with $n$ nodes, in which the original data can be recovered from any subset of $k=d$ nodes, and the content of any node can be retrieved from those of any $d$ helper nodes. Moreover, we consider two secrecy constraints, namely, Type-I, where the message remains secure against an eavesdropper with access to the content of any subset of up to $\ell$ nodes, and Type-II, in which the message remains secure against an eavesdropper who can observe the incoming repair data from all possible nodes to a fixed but unknown subset of up to $\ell$ compromised nodes. Two classes of secure determinant codes are proposed for Type-I and Type-II secrecy constraints. Each proposed code can be designed for a range of per-node storage capacity and repair bandwidth for any system parameters. They lead to two achievable secrecy trade-offs, for Type-I and Type-II security.

preprint, 2020, arXiv

Best Relay Selection in Gaussian Half-Duplex Diamond Networks

This paper considers Gaussian half-duplex diamond $n$-relay networks, where a source communicates with a destination by hopping information through one layer of $n$ non-communicating relays that operate in half-duplex. The main focus consists of investigating the following question: What is the contribution of a single relay to the approximate capacity of the entire network? In particular, approximate capacity refers to a quantity that approximates the Shannon capacity within an additive gap which only depends on $n$, and is independent of the channel parameters. This paper answers the above question by providing a fundamental bound on the ratio between the approximate capacity of the highest-performing single relay and the approximate capacity of the entire network, for any number $n$. Surprisingly, it is shown that such a ratio guarantee is $f = 1/(2+2\cos(2\pi/(n+2)))$, which is a sinusoidal function of $n$ that decreases as $n$ increases. It is also shown that the aforementioned ratio guarantee is tight, i.e., there exist Gaussian half-duplex diamond $n$-relay networks, where the highest-performing relay has an approximate capacity equal to an $f$ fraction of the approximate capacity of the entire network.
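
To get a feel for the ratio guarantee stated in the abstract, the formula can be evaluated numerically. The short Python sketch below only computes the closed-form expression for a few values of $n$; the function name `best_relay_ratio` is a hypothetical label chosen for illustration.

```python
from math import cos, pi


def best_relay_ratio(n):
    """Evaluate f = 1 / (2 + 2*cos(2*pi/(n+2))) for n >= 1 relays."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1.0 / (2.0 + 2.0 * cos(2.0 * pi / (n + 2)))


# Print the ratio for a few network sizes.
for n in (1, 2, 3, 10, 100):
    print(n, round(best_relay_ratio(n), 4))
```

The expression equals 1 for $n=1$ and 1/2 for $n=2$, and it approaches 1/4 as $n$ grows, since the cosine term tends to 1.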

preprint, 2020, arXiv

Cascade Codes For Distributed Storage Systems

A novel coding scheme for exact-repair regenerating codes is presented in this paper. The codes proposed in this work can trade between the repair bandwidth of nodes (number of downloaded symbols from each surviving node in a repair process) and the required storage overhead of the system. These codes work for general system parameters $(n,k,d)$, which are the total number of nodes, the number of nodes sufficient for data recovery, and the number of helper nodes in a repair process, respectively. The proposed construction offers a unified scheme to develop exact-repair regenerating codes for the entire trade-off, including the MBR and MSR points. We conjecture that the new storage-vs.-bandwidth trade-off achieved by the proposed codes is optimum. Some other key features of this code include: the construction is linear; the required field size is only $\Theta(n)$; and the sub-packetization level is at most $(d-k+1)^k$, which is independent of the number of parity nodes. Moreover, the proposed repair mechanism is helper-independent, that is, the data sent from each helper depends only on the identities of the helper and the failed nodes, and is independent of the identities of the other helper nodes participating in the repair process.

preprint, 2020, arXiv

On the Fundamental Limits of Coded Data Shuffling for Distributed Machine Learning

We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files, assuming the cached files are uncoded. The caches of the worker nodes are updated every iteration, and they should be designed to satisfy any possible unknown permutation of the files in subsequent iterations. For this problem, we characterize the exact load-memory trade-off for worst-case shuffling by deriving the minimum communication load for a given storage capacity per worker node. As a byproduct, the exact load-memory trade-off for any shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic coded shuffling scheme, which improves the state of the art, by exploiting the cache memories to create coded functions that can be decoded by several worker nodes. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.

preprint, 2019, arXiv

Determinant Codes with Helper-Independent Repair for Single and Multiple Failures

Determinant codes are a class of exact-repair regenerating codes for distributed storage systems with parameters (n, k = d, d). These codes cover the entire trade-off between per-node storage and repair-bandwidth. In an earlier work of the authors, the repair data of the determinant code sent by a helper node to repair a failed node depends on the identity of the other helper nodes participating in the process, which is practically undesired. In this work, a new repair mechanism is proposed for determinant codes, which relaxes this dependency, while preserving all other properties of the code. Moreover, it is shown that the determinant codes are capable of repairing multiple failures, with a per-node repair-bandwidth which scales sub-linearly with the number of failures.