Researcher profile

Andrea J. Goldsmith

Andrea J. Goldsmith contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2022arXiv

Cloud-Cluster Architecture for Detection in Intermittently Connected Sensor Networks

We consider a centralized detection problem where sensors experience noisy measurements and intermittent connectivity to a centralized fusion center. The sensors collaborate locally within predefined sensor clusters and fuse their noisy sensor data to reach a common local estimate of the detected event in each cluster. The connectivity of each sensor cluster is intermittent and depends on the available communication opportunities of the sensors to the fusion center. Upon receiving the estimates from all the connected sensor clusters the fusion center fuses the received estimates to make a final determination regarding the occurrence of the event across the deployment area. We refer to this hybrid communication scheme as a \emph{cloud-cluster} architecture. We propose a method for optimizing the decision rule for each cluster and analyzing the expected detection performance resulting from our hybrid scheme. Our method is tractable and addresses the high computational complexity caused by heterogeneous sensors' and clusters' detection quality, heterogeneity in their communication opportunities, and non-convexity of the loss function. Our analysis shows that clustering the sensors provides resilience to noise in the case of low sensor communication probability with the cloud. For larger clusters, a steep improvement in detection performance is possible even for a low communication probability by using our cloud-cluster architecture.

preprint2022arXiv

Efficient Randomized Subspace Embeddings for Distributed Optimization under a Communication Budget

We study first-order optimization algorithms under the constraint that the descent direction is quantized using a pre-specified budget of $R$-bits per dimension, where $R \in (0 ,\infty)$. We propose computationally efficient optimization algorithms with convergence rates matching the information-theoretic performance lower bounds for: (i) Smooth and Strongly-Convex objectives with access to an Exact Gradient oracle, as well as (ii) General Convex and Non-Smooth objectives with access to a Noisy Subgradient oracle. The crux of these algorithms is a polynomial complexity source coding scheme that embeds a vector into a random subspace before quantizing it. These embeddings are such that with high probability, their projection along any of the canonical directions of the transform space is small. As a consequence, quantizing these embeddings followed by an inverse transform to the original space yields a source coding method with optimal covering efficiency while utilizing just $R$-bits per dimension. Our algorithms guarantee optimality for arbitrary values of the bit-budget $R$, which includes both the sub-linear budget regime ($R < 1$), as well as the high-budget regime ($R \geq 1$), while requiring $O\left(n^2\right)$ multiplications, where $n$ is the dimension. We also propose an efficient relaxation of this coding scheme using Hadamard subspaces that requires a near-linear time, i.e., $O\left(n \log n\right)$ additions.Furthermore, we show that the utility of our proposed embeddings can be extended to significantly improve the performance of gradient sparsification schemes. Numerical simulations validate our theoretical claims. Our implementations are available at https://github.com/rajarshisaha95/DistOptConstrComm.

preprint2022arXiv

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

High-dimensional models often have a large memory footprint and must be quantized after training before being deployed on resource-constrained edge devices for inference tasks. In this work, we develop an information-theoretic framework for the problem of quantizing a linear regressor learned from training data $(\mathbf{X}, \mathbf{y})$, for some underlying statistical relationship $\mathbf{y} = \mathbf{X}\boldsymbolθ + \mathbf{v}$. The learned model, which is an estimate of the latent parameter $\boldsymbolθ \in \mathbb{R}^d$, is constrained to be representable using only $Bd$ bits, where $B \in (0, \infty)$ is a pre-specified budget and $d$ is the dimension. We derive an information-theoretic lower bound for the minimax risk under this setting and propose a matching upper bound using randomized embedding-based algorithms which is tight up to constant factors. The lower and upper bounds together characterize the minimum threshold bit-budget required to achieve a performance risk comparable to the unquantized setting. We also propose randomized Hadamard embeddings that are computationally efficient and are optimal up to a mild logarithmic factor of the lower bound. Our model quantization strategy can be generalized and we show its efficacy by extending the method and upper-bounds to two-layer ReLU neural networks for non-linear regression. Numerical simulations show the improved performance of our proposed scheme as well as its closeness to the lower bound.

preprint2022arXiv

Semi-Decentralized Federated Learning with Collaborative Relaying

We present a semi-decentralized federated learning algorithm wherein clients collaborate by relaying their neighbors&#39; local updates to a central parameter server (PS). At every communication round to the PS, each client computes a local consensus of the updates from its neighboring clients and eventually transmits a weighted average of its own update and those of its neighbors to the PS. We appropriately optimize these averaging weights to ensure that the global update at the PS is unbiased and to reduce the variance of the global update at the PS, consequently improving the rate of convergence. Numerical simulations substantiate our theoretical claims and demonstrate settings with intermittent connectivity between the clients and the PS, where our proposed algorithm shows an improved convergence rate and accuracy in comparison with the federated averaging algorithm.

preprint2021arXiv

Model-Based Machine Learning for Communications

We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms, can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications.

preprint2021arXiv

The Rate-Distortion Risk in Estimation from Compressed Data

Consider the problem of estimating a latent signal from a lossy compressed version of the data when the compressor is agnostic to the relation between the signal and the data. This situation arises in a host of modern applications when data is transmitted or stored prior to determining the downstream inference task. Given a bitrate constraint and a distortion measure between the data and its compressed version, let us consider the joint distribution achieving Shannon&#39;s rate-distortion (RD) function. Given an estimator and a loss function associated with the downstream inference task, define the rate-distortion risk as the expected loss under the RD-achieving distribution. We provide general conditions under which the operational risk in estimating from the compressed data is asymptotically equivalent to the RD risk. The main theoretical tools to prove this equivalence are transportation-cost inequalities in conjunction with properties of compression codes achieving Shannon&#39;s RD function. Whenever such equivalence holds, a recipe for designing estimators from datasets undergoing lossy compression without specifying the actual compression technique emerges: design the estimator to minimize the RD risk. Our conditions simplified in the special cases of discrete memoryless or multivariate normal data. For these scenarios, we derive explicit expressions for the RD risk of several estimators and compare them to the optimal source coding performance associated with full knowledge of the relation between the latent signal and the data.

preprint2020arXiv

Capacity scaling in a Non-coherent Wideband Massive SIMO Block Fading Channel

The scaling of coherent and non-coherent channel capacity is studied in a single-input multiple-output (SIMO) block Rayleigh fading channel as both the bandwidth and the number of receiver antennas go to infinity jointly with the transmit power fixed. The transmitter has no channel state information (CSI), while the receiver may have genie-provided CSI (coherent receiver), or the channel statistics only (non-coherent receiver). Our results show that if the available bandwidth is smaller than a threshold bandwidth which is proportional (up to leading order terms) to the square root of the number of antennas, there is no gap between the coherent capacity and the non-coherent capacity in terms of capacity scaling behavior. On the other hand, when the bandwidth is larger than this threshold, there is a capacity scaling gap. Since achievable rates using pilot symbols for channel estimation are subject to the non-coherent capacity bound, this work reveals that pilot-assisted coherent receivers in systems with a large number of receive antennas are unable to exploit excess spectrum above a given threshold for capacity gain.

preprint2020arXiv

Compressed Sensing Channel Estimation for OFDM with non-Gaussian Multipath Gains

This paper analyzes the impact of non-Gaussian multipath component (MPC) amplitude distributions on the performance of Compressed Sensing (CS) channel estimators for OFDM systems. The number of dominant MPCs that any CS algorithm needs to estimate in order to accurately represent the channel is characterized. This number relates to a Compressibility Index (CI) of the channel that depends on the fourth moment of the MPC amplitude distribution. A connection between the Mean Squared Error (MSE) of any CS estimation algorithm and the MPC amplitude distribution fourth moment is revealed that shows a smaller number of MPCs is needed to well-estimate channels when these components have large fourth moment amplitude gains. The analytical results are validated via simulations for channels with lognormal MPCs such as the NYU mmWave channel model. These simulations show that when the MPC amplitude distribution has a high fourth moment, the well known CS algorithm of Orthogonal Matching Pursuit performs almost identically to the Basis Pursuit De-Noising algorithm with a much lower computational cost.

preprint2020arXiv

Data-Driven Factor Graphs for Deep Symbol Detection

Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must have accurate knowledge of the statistical model of the considered signals. In this work we propose to implement factor graph methods in a data-driven manner. In particular, we propose to use machine learning (ML) tools to learn the factor graph, instead of the overall system task, which in turn is used for inference by message passing over the learned graph. We apply the proposed approach to learn the factor graph representing a finite-memory channel, demonstrating the resulting ability to implement BCJR detection in a data-driven fashion. We demonstrate that the proposed system, referred to as BCJRNet, learns to implement the BCJR algorithm from a small training set, and that the resulting receiver exhibits improved robustness to inaccurate training compared to the conventional channel-model-based receiver operating under the same level of uncertainty. Our results indicate that by utilizing ML tools to learn factor graphs from labeled data, one can implement a broad range of model-based algorithms, which traditionally require full knowledge of the underlying statistics, in a data-driven fashion.

preprint2020arXiv

Data-Driven Symbol Detection via Model-Based Machine Learning

The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework to symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not well-capture the underlying physics. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty.

preprint2017arXiv

The Fluctuating Two-Ray Fading Model: Statistical Characterization and Performance Analysis

We introduce the Fluctuating Two-Ray (FTR) fading model, a new statistical channel model that consists of two fluctuating specular components with random phases plus a diffuse component. The FTR model arises as the natural generalization of the two-wave with diffuse power (TWDP) fading model; this generalization allows its two specular components to exhibit a random amplitude fluctuation. Unlike the TWDP model, all the chief probability functions of the FTR fading model (PDF, CDF and MGF) are expressed in closed-form, having a functional form similar to other state-of-the-art fading models. We also provide approximate closed-form expressions for the PDF and CDF in terms of a finite number of elementary functions, which allow for a simple evaluation of these statistics to an arbitrary level of precision. We show that the FTR fading model provides a much better fit than Rician fading for recent small-scale fading measurements in 28 GHz outdoor millimeter-wave channels. Finally, the performance of wireless communication systems over FTR fading is evaluated in terms of the bit error rate and the outage capacity, and the interplay between the FTR fading model parameters and the system performance is discussed. Monte Carlo simulations have been carried out in order to validate the obtained theoretical expressions.