Researcher profile

Sadaf Salehkalaibar

Sadaf Salehkalaibar contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified · Verification L1 · Unclaimed author
5 works
0 followers
3 topics
4 close collaborators

Published work

5 published items

preprint · 2022 · arXiv

Lossy Gradient Compression: How Much Accuracy Can One Bit Buy?

In federated learning (FL), a global model is trained at a Parameter Server (PS) by aggregating model updates obtained from multiple remote learners. Generally, the communication from the remote users to the PS is rate-limited, while the transmission from the PS to the remote users is unconstrained. The FL setting gives rise to a distributed learning scenario in which the updates from the remote learners have to be compressed to meet communication rate constraints in the uplink transmission toward the PS. For this problem, one wishes to compress the model updates so as to minimize the loss in accuracy resulting from the compression error. In this paper, we take a rate-distortion approach to the compressor design problem for the distributed training of deep neural networks (DNNs). In particular, we define a measure of compression performance under communication-rate constraints, the per-bit accuracy, which captures the ultimate improvement in accuracy that a bit of communication brings to the centralized model. To maximize the per-bit accuracy, we model the DNN gradient updates at the remote learners as following a generalized normal distribution. Under this assumption on the DNN gradient distribution, we propose a class of distortion measures to aid the design of quantizers for the compression of the model updates. We argue that this family of distortion measures, which we refer to as the "$M$-magnitude weighted $L_2$" norm, captures the practitioner's intuition in the choice of gradient compressor. Numerical simulations on the CIFAR-10 dataset are provided to validate the proposed approach.
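The quantizer design idea in this abstract can be illustrated with a minimal sketch. The abstract does not define the "$M$-magnitude weighted $L_2$" measure, so the form used here, the mean of $|g|^M (g - \hat{g})^2$, is an assumption made for illustration, as are all function names. The sketch draws gradient entries from a generalized normal distribution and picks the reconstruction level of a one-bit (sign) quantizer that minimizes the assumed distortion. It does not compute the paper's per-bit accuracy.

```python
import random

def sample_gen_normal(n, shape, scale=1.0, seed=0):
    """Draw n samples with density proportional to exp(-(|x|/scale)**shape)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # If G ~ Gamma(1/shape, 1), then scale * G**(1/shape) follows the
        # magnitude law of a generalized normal variable.
        mag = scale * rng.gammavariate(1.0 / shape, 1.0) ** (1.0 / shape)
        samples.append(mag if rng.random() < 0.5 else -mag)
    return samples

def weighted_distortion(grads, recon, m):
    """ASSUMED form: mean of |g|**m * (g - g_hat)**2 (m = 0 is plain squared error)."""
    return sum(abs(g) ** m * (g - r) ** 2 for g, r in zip(grads, recon)) / len(grads)

def one_bit_level(grads, m):
    """Level c minimizing the assumed distortion when g_hat = c * sign(g)."""
    num = sum(abs(g) ** (m + 1) for g in grads)
    den = sum(abs(g) ** m for g in grads)
    return num / den

def one_bit_quantize(grads, level):
    """Keep only the sign of each entry and reconstruct it at +/- level."""
    return [level if g >= 0 else -level for g in grads]

grads = sample_gen_normal(10_000, shape=1.0)  # shape 1.0 is the Laplace case
for m in (0, 2):
    c = one_bit_level(grads, m)
    d = weighted_distortion(grads, one_bit_quantize(grads, c), m)
    print(f"M={m}: level={c:.3f}, distortion={d:.3f}")
```

Under this assumed weighting, a larger $M$ pushes the reconstruction level toward the large-magnitude entries, which is one way to read the "practitioner's intuition" the abstract mentions.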

preprint · 2022 · arXiv

On Distributed Lossy Coding of Symmetrically Correlated Gaussian Sources

A distributed lossy compression network with $L$ encoders and a decoder is considered. Each encoder observes a source and sends a compressed version to the decoder. The decoder produces a joint reconstruction of the target signals with the mean squared error distortion below a given threshold. It is assumed that the observed sources can be expressed as the sum of target signals and corruptive noises, which are independently generated from two symmetric multivariate Gaussian distributions. The minimum compression rate of this network as a function of the distortion threshold is referred to as the rate-distortion function, for which an explicit lower bound is established by solving a minimization problem. Our lower bound matches the well-known Berger-Tung upper bound for some values of the distortion threshold. The asymptotic gap between the upper and lower bounds is characterized in the large-$L$ limit.

preprint · 2020 · arXiv

Distributed Hypothesis Testing with Variable-Length Coding

The problem of distributed testing against independence with variable-length coding is considered when the average communication load is constrained, rather than the maximum load as in previous works. The paper characterizes the optimum type-II error exponent of a single-sensor, single-decision-center system given a maximum type-I error probability when communication is either over a noise-free rate-$R$ link or over a noisy discrete memoryless channel (DMC) with stop-feedback. Specifically, let $ε$ denote the maximum allowed type-I error probability. Then the optimum exponent of the system with a rate-$R$ link under a constraint on the average communication load coincides with the optimum exponent of such a system with a rate-$R/(1-ε)$ link under a maximum communication load constraint. A strong converse thus does not hold under an average communication load constraint. A similar observation also holds for testing against independence over DMCs. With variable-length coding and stop-feedback and under an average communication load constraint, the optimum type-II error exponent over a DMC of capacity $C$ equals the optimum exponent under fixed-length coding and a maximum communication load constraint when communication is over a DMC of capacity $C(1-ε)^{-1}$. In particular, under variable-length coding over a DMC with stop-feedback, a strong converse result does not hold and the optimum error exponent depends on the transition law of the DMC only through its capacity.
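The rate and capacity equivalences stated in this abstract reduce to a one-line scaling by $1/(1-ε)$. The sketch below only evaluates that scaling; the function names are hypothetical, and it does not compute any error exponent.

```python
def equivalent_link_rate(rate, eps):
    """Rate of the maximum-load link whose optimum exponent matches that of
    a rate-`rate` link under an average-load constraint: rate / (1 - eps)."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return rate / (1.0 - eps)

def equivalent_capacity(capacity, eps):
    """Same scaling for the DMC case: capacity * (1 - eps)**-1."""
    return equivalent_link_rate(capacity, eps)

# With eps = 0.5, an average-load link of rate 1.0 behaves like a
# maximum-load link of twice the rate.
print(equivalent_link_rate(1.0, 0.5))
```

Because the equivalent rate grows with $ε$, the optimum exponent depends on the allowed type-I error probability, which is why the abstract notes that a strong converse does not hold.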

preprint · 2020 · arXiv

Hypothesis Testing Over the Two-hop Relay Network

Coding and testing schemes and the corresponding achievable type-II error exponents are presented for binary hypothesis testing over two-hop relay networks. The schemes are based on cascade source coding techniques and unanimous decision-forwarding, the latter meaning that a terminal decides on the null hypothesis only if all previous terminals have decided on the null hypothesis. If the observations at the transmitter, the relay, and the receiver form a Markov chain in this order, then, without loss in performance, the proposed cascade source code can be replaced by two independent point-to-point source codes, one for each hop. The decoupled scheme (combined with decision-forwarding) is shown to attain the optimal type-II error exponents for various instances of "testing against conditional independence." The same decoupling is shown to be optimal also for some instances of "testing against independence," when the observations at the transmitter, the receiver, and the relay form a Markov chain in this order, and when the relay-to-receiver link is of sufficiently high rate. For completeness, the paper also presents an analysis of the Shimokawa-Han-Amari binning scheme for the point-to-point hypothesis testing setup.
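The unanimous decision-forwarding rule described in this abstract can be sketched as a simple pass along the chain of terminals. This is an illustration only: the function name is hypothetical, and the sketch assumes each terminal also runs a local test whose outcome is given as a boolean. It does not model the source coding or the error exponents.

```python
def unanimous_forwarding(local_accepts):
    """Return each terminal's final decision along the chain.

    local_accepts[i] is True when terminal i's own test favors the null
    hypothesis (an assumption of this sketch). A terminal declares the null
    only if its own test does and every earlier terminal declared the null.
    """
    decisions = []
    upstream_null = True
    for accepts in local_accepts:
        # One earlier rejection of the null propagates to all later terminals.
        upstream_null = upstream_null and accepts
        decisions.append(upstream_null)
    return decisions

# Terminal 1 rejects the null, so terminal 2 must reject it as well.
print(unanimous_forwarding([True, False, True]))
```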

preprint · 2020 · arXiv

On Hypothesis Testing Against Independence with Multiple Decision Centers

A distributed binary hypothesis testing problem is studied with one observer and two decision centers. Achievable type-II error exponents are derived for testing against conditional independence when the observer communicates with the two decision centers over one common and two individual noise-free bit pipes and when it communicates with them over a noisy broadcast channel (BC). The results are based on a coding and testing scheme that splits the observations into subblocks, so that the transmitter and receivers can independently apply to each subblock either Gray-Wyner coordination coding with side-information or hybrid joint source-channel coding with side-information, followed by a Neyman-Pearson test over the subblocks at the receivers. This approach avoids introducing the further error exponents that one would expect from the receivers' decoding operations related to binning or the noisy transmission channel. The derived exponents are shown to be optimal in some special cases when communication is over noise-free links. The results reveal a tradeoff between the type-II error exponents at the two decision centers.