Researcher profile

Nir Shlezinger

Nir Shlezinger contributes to research discovery and scholarly infrastructure.

34 works
0 followers
12 topics
4 close collaborators

Published work

34 published item(s)

preprint, 2026, arXiv

SGD-Based Knowledge Distillation with Bayesian Teachers: Theory and Guidelines

Knowledge Distillation (KD) is a central paradigm for transferring knowledge from a large teacher network to a typically smaller student model, often by leveraging soft probabilistic outputs. While KD has shown strong empirical success in numerous applications, its theoretical underpinnings remain only partially understood. In this work, we adopt a Bayesian perspective on KD to rigorously analyze the convergence behavior of students trained with Stochastic Gradient Descent (SGD). We study two regimes: $(i)$ when the teacher provides the exact Bayes Class Probabilities (BCPs); and $(ii)$ supervision with noisy approximations of the BCPs. Our analysis shows that learning from BCPs yields variance reduction and removes neighborhood terms in the convergence bounds compared to one-hot supervision. We further characterize how the level of noise affects generalization and accuracy. Motivated by these insights, we advocate the use of Bayesian deep learning models, which typically provide improved estimates of the BCPs, as teachers in KD. Consistent with our analysis, we experimentally demonstrate that students distilled from Bayesian teachers not only achieve higher accuracies (up to +4.27%), but also exhibit more stable convergence (up to 30% less noise), compared to students distilled from deterministic teachers.
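As a rough illustration of the two supervision modes compared in the abstract above, the following sketch computes the cross-entropy of a student's prediction against a one-hot label and against a soft probability vector. The function names and all numeric values are made up for illustration and are not taken from the paper; the soft vector merely stands in for a teacher's estimate of the class probabilities.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, student_logits):
    """Cross-entropy between a target distribution and the student's softmax output."""
    q = softmax(student_logits)
    return -sum(p * math.log(qi) for p, qi in zip(target_probs, q) if p > 0.0)

# Hypothetical 3-class example.
student_logits = [2.0, 1.0, 0.1]
one_hot = [1.0, 0.0, 0.0]        # hard label: a single sampled class
soft_teacher = [0.7, 0.2, 0.1]   # assumed teacher estimate of class probabilities

loss_hard = cross_entropy(one_hot, student_logits)
loss_soft = cross_entropy(soft_teacher, student_logits)
```

With a soft target, every class contributes to the loss (and hence to the gradient) on each sample, whereas a one-hot label only exposes the single sampled class.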

preprint, 2024, arXiv

Adaptive KalmanNet: Data-Driven Kalman Filter with Fast Adaptation

Combining the classical Kalman filter (KF) with a deep neural network (DNN) enables tracking in partially known state space (SS) models. A major limitation of current DNN-aided designs stems from the need to train them to filter data originating from a specific distribution and underlying SS model. Consequently, changes in the model parameters may require lengthy retraining. While the KF adapts through parameter tuning, the black-box nature of DNNs makes identifying tunable components difficult. Hence, we propose Adaptive KalmanNet (AKNet), a DNN-aided KF that can adapt to changes in the SS model without retraining. Inspired by recent advances in large language model fine-tuning paradigms, AKNet uses a compact hypernetwork to generate context-dependent modulation weights. Numerical evaluation shows that AKNet provides consistent state estimation performance across a continuous range of noise distributions, even when trained using data from limited noise settings.

preprint, 2023, arXiv

Learn to Rapidly and Robustly Optimize Hybrid Precoding

Hybrid precoding plays a key role in realizing massive multiple-input multiple-output (MIMO) transmitters with controllable cost. MIMO precoders are required to frequently adapt based on the variations in the channel conditions. In hybrid MIMO, where precoding is comprised of digital and analog beamforming, such an adaptation involves lengthy optimization and depends on accurate channel state information (CSI). This affects the spectral efficiency when the channel varies rapidly and when operating with noisy CSI. In this work we employ deep learning techniques to learn how to rapidly and robustly optimize hybrid precoders, while being fully interpretable. We leverage data to learn iteration-dependent hyperparameter settings of projected gradient sum-rate optimization with a predefined number of iterations. The algorithm maps channel realizations into hybrid precoding settings while preserving the interpretable flow of the optimizer and improving its convergence speed. To cope with noisy CSI, we learn to optimize the minimal achievable sum-rate among all tolerable errors, proposing a robust hybrid precoding based on the projected conceptual mirror prox minimax optimizer. Numerical results demonstrate that our approach allows using over ten times fewer iterations than required by conventional optimization with shared hyperparameters, while achieving similar and even improved sum-rate performance.

preprint, 2022, arXiv

Anomaly Search over Composite Hypotheses in Hierarchical Statistical Models

Detection of anomalies among a large number of processes is a fundamental task that has been studied in multiple research areas, with diverse applications spanning from spectrum access to cyber-security. Anomalous events are characterized by deviations in data distributions, and thus can be inferred from noisy observations based on statistical methods. In some scenarios, one can obtain noisy observations aggregated from a chosen subset of processes. Such hierarchical search can further minimize the sample complexity while retaining accuracy. An anomaly search strategy should thus be designed based on multiple requirements: it should maximize the detection accuracy, be efficient in terms of sample complexity, and be able to cope with statistical models that are known only up to some missing parameters (i.e., composite hypotheses). In this paper, we consider anomaly detection with observations taken from a chosen subset of processes that conforms to a predetermined tree structure with partially known statistical model. We propose Hierarchical Dynamic Search (HDS), a sequential search strategy that uses two variations of the Generalized Log Likelihood Ratio (GLLR) statistic, and can be used for detection of multiple anomalies. HDS is shown to be order-optimal in terms of the size of the search space, and asymptotically optimal in terms of detection accuracy. An explicit upper bound on the error probability is established for the finite sample regime. In addition to extensive experiments on synthetic datasets, experiments have been conducted on the DARPA intrusion detection dataset, showing that HDS is superior to existing methods.

preprint, 2022, arXiv

Channel Estimation with Hybrid Reconfigurable Intelligent Metasurfaces

Reconfigurable Intelligent Surfaces (RISs) are envisioned to play a key role in future wireless communications, enabling programmable radio propagation environments. They are usually considered as almost passive planar structures that operate as adjustable reflectors, giving rise to a multitude of implementation challenges, including the inherent difficulty in estimating the underlying wireless channels. In this paper, we focus on the recently conceived concept of Hybrid Reconfigurable Intelligent Surfaces (HRISs), which do not solely reflect the impinging waveform in a controllable fashion, but are also capable of sensing and processing an adjustable portion of it. We first present implementation details for this metasurface architecture and propose a convenient mathematical model for characterizing its dual operation. As an indicative application of HRISs in wireless communications, we formulate the individual channel estimation problem for the uplink of a multi-user HRIS-empowered communication system. Considering first a noise-free setting, we theoretically quantify the advantage of HRISs in notably reducing the amount of pilots needed for channel estimation, as compared to the case of purely reflective RISs. We then present closed-form expressions for the MSE performance in estimating the individual channels at the HRISs and the base station for the noisy model. Based on these derivations, we propose an automatic differentiation-based first-order optimization approach to efficiently determine the HRIS phase and power splitting configurations for minimizing the weighted sum-MSE performance. Our numerical evaluations demonstrate that HRISs do not only enable the estimation of the individual channels in HRIS-empowered communication systems, but also improve the ability to recover the cascaded channel, as compared to existing methods using passive and reflective RISs.

preprint, 2022, arXiv

Channel Estimation with Simultaneous Reflecting and Sensing Reconfigurable Intelligent Metasurfaces

Reconfigurable Intelligent Surfaces (RISs) are envisioned to play a key role in future wireless communications, enabling programmable radio propagation environments. They are usually considered as nearly passive planar structures that operate as adjustable reflectors, giving rise to a multitude of implementation challenges, including an inherent difficulty in estimating the underlying wireless channels. In this paper, we propose the concept of Hybrid RISs (HRISs), which do not solely reflect the impinging waveform in a controllable fashion, but are also capable of sensing and processing a portion of it via some active reception elements. We first present implementation details for this novel metasurface architecture and propose a simple model for its operation, when considered for wireless communications. As an indicative application of HRISs, we formulate and solve the individual channels identification problem for the uplink of multi-user HRIS-empowered systems. Our numerical results showcase that, in the high signal-to-noise regime, HRISs enable individual channel estimation with notably reduced amounts of pilots, compared to those needed when using a purely reflective RIS that can only estimate the cascaded channel.

preprint, 2022, arXiv

CNN-Aided Factor Graphs with Estimated Mutual Information Features for Seizure Detection

We propose convolutional neural network (CNN)-aided factor graphs, assisted by mutual information features estimated by a neural network, for seizure detection. Specifically, we use neural mutual information estimation to evaluate the correlation between different electroencephalogram (EEG) channels as features. We then use a 1D-CNN to extract extra features from the EEG signals and use both features to estimate the probability of a seizure event. Finally, learned factor graphs are employed to capture the temporal correlation in the signal. Both sets of features from the neural mutual information estimation and the 1D-CNN are used to learn the factor nodes. We show that the proposed method achieves state-of-the-art performance using 6-fold leave-four-patients-out cross-validation.

preprint, 2022, arXiv

Decentralized Low-Latency Collaborative Inference via Ensembles on the Edge

The success of deep neural networks (DNNs) is heavily dependent on computational resources. While DNNs are often employed on cloud servers, there is a growing need to operate DNNs on edge devices. Edge devices are typically limited in their computational resources, yet often multiple edge devices are deployed in the same environment and can reliably communicate with each other. In this work we propose to facilitate the application of DNNs on the edge by allowing multiple users to collaborate during inference to improve their accuracy. Our mechanism, coined edge ensembles, is based on having diverse predictors at each device, which form an ensemble of models during inference. To mitigate the communication overhead, the users share quantized features, and we propose a method for aggregating multiple decisions into a single inference rule. We analyze the latency induced by edge ensembles, showing that its performance improvement comes at the cost of a minor additional delay under common assumptions on the communication network. Our experiments demonstrate that collaborative inference via edge ensembles equipped with compact DNNs substantially improves the accuracy over having each user infer locally, and can outperform using a single centralized DNN larger than all the networks in the ensemble together.

preprint, 2022, arXiv

Deep Learning Based Successive Interference Cancellation for the Non-Orthogonal Downlink

Non-orthogonal communications are expected to play a key role in future wireless systems. In downlink transmissions, the data symbols are broadcast from a base station to different users, which are superimposed with different power to facilitate high-integrity detection using successive interference cancellation (SIC). However, SIC requires accurate knowledge of both the channel model and channel state information (CSI), which may be difficult to acquire. We propose a deep learning-aided SIC detector termed SICNet, which replaces the interference cancellation blocks of SIC by deep neural networks (DNNs). Explicitly, SICNet jointly trains its internal DNN-aided blocks for inferring the soft information representing the interfering symbols in a data-driven fashion, rather than using hard-decision decoders as in classical SIC. As a result, SICNet reliably detects the superimposed symbols in the downlink of non-orthogonal systems without requiring any prior knowledge of the channel model, while being less sensitive to CSI uncertainty than its model-based counterpart. SICNet is also robust to changes in the number of users and to their power allocation. Furthermore, SICNet learns to produce accurate soft outputs, which facilitates improved soft-input error correction decoding compared to model-based SIC. Finally, we propose an online training method for SICNet under block fading, which exploits the channel decoding for accurately recovering online data labels for retraining, hence allowing it to smoothly track the fading envelope without requiring dedicated pilots. Our numerical results show that SICNet approaches the performance of classical SIC under perfect CSI, while outperforming it under realistic CSI uncertainty.

preprint, 2022, arXiv

Deep-Learning-Aided Distributed Clock Synchronization for Wireless Networks

The proliferation of wireless communications networks over the past decades, combined with the scarcity of the wireless spectrum, has motivated a significant effort towards increasing the throughput of wireless networks. One of the major factors which limits the throughput in wireless communications networks is the accuracy of the time synchronization between the nodes in the network, as a higher throughput requires higher synchronization accuracy. Existing time synchronization schemes, and particularly methods based on pulse-coupled oscillators (PCOs), which are the focus of the current work, have the advantage of simple implementation and achieve high accuracy when the nodes are closely located, yet tend to achieve poor synchronization performance for distant nodes. In this study, we propose a robust PCO-based time synchronization algorithm which retains the simple structure of existing approaches while operating reliably and converging quickly for both distant and closely located nodes. This is achieved by augmenting PCO-based synchronization with deep learning tools that are trainable in a distributed manner, thus allowing the nodes to train their neural network component of the synchronization algorithm without requiring additional exchange of information or central coordination. The numerical results show that our proposed deep learning-aided scheme is notably robust to propagation delays resulting from deployments over large areas, and to relative clock frequency offsets. It is also shown that the proposed approach rapidly attains full (i.e., clock frequency and phase) synchronization for all nodes in the wireless network, while the classic model-based implementation does not.

preprint, 2022, arXiv

Deep-Learning-Assisted Configuration of Reconfigurable Intelligent Surfaces in Dynamic rich-scattering Environments

The integration of Reconfigurable Intelligent Surfaces (RISs) into wireless environments endows channels with programmability, and is expected to play a key role in future communication standards. To date, most RIS-related efforts focus on quasi-free-space, where wireless channels are typically modeled analytically. Many realistic communication scenarios occur, however, in rich-scattering environments which, moreover, evolve dynamically. These conditions present a tremendous challenge in identifying an RIS configuration that optimizes the achievable communication rate. In this paper, we make a first step toward tackling this challenge. Based on a simulator that is faithful to the underlying wave physics, we train a deep neural network as surrogate forward model to capture the stochastic dependence of wireless channels on the RIS configuration under dynamic rich-scattering conditions. Subsequently, we use this model in combination with a genetic algorithm to identify RIS configurations optimizing the communication rate. We numerically demonstrate the ability of the proposed approach to tune RISs to improve the achievable rate in rich-scattering setups.

preprint, 2022, arXiv

Jointly Learned Symbol Detection and Signal Reflection in RIS-Aided Multi-user MIMO Systems

Reconfigurable Intelligent Surfaces (RISs) are regarded as a key technology for future wireless communications, enabling programmable radio propagation environments. However, the passive reflecting feature of RISs induces notable challenges for channel estimation, making coherent symbol detection a challenging task. In this paper, we consider the uplink of RIS-aided multi-user Multiple-Input Multiple-Output (MIMO) systems and propose a Machine Learning (ML) approach to jointly design the multi-antenna receiver and configure the RIS reflection coefficients, which does not require explicit full knowledge of the channel input-output relationship. Our approach devises an ML-based receiver, while the configurations of the RIS reflection patterns affecting the underlying propagation channel are treated as hyperparameters. Based on this system design formulation, we propose a Bayesian ML framework for optimizing the RIS hyperparameters, according to which the transmitted pilots are directly used to jointly tune the RIS and the multi-antenna receiver. Our simulation results demonstrate the capability of the proposed approach to provide reliable communications in non-linear channel conditions corrupted by Gaussian noise.

preprint, 2022, arXiv

KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics

State estimation of dynamical systems in real-time is a fundamental task in signal processing. For systems that are well-represented by a fully known linear Gaussian state space (SS) model, the celebrated Kalman filter (KF) is a low complexity optimal solution. However, both linearity of the underlying SS model and accurate knowledge of it are often not encountered in practice. Here, we present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics with partial information. By incorporating the structural SS model with a dedicated recurrent neural network module in the flow of the KF, we retain data efficiency and interpretability of the classic algorithm while implicitly learning complex dynamics from data. We demonstrate numerically that KalmanNet overcomes non-linearities and model mismatch, outperforming classic filtering methods operating with both mismatched and accurate domain knowledge.
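For reference, the classic KF mentioned in the abstract above can be sketched for a scalar linear Gaussian SS model as follows. This is a textbook predict/update step, not KalmanNet itself; the parameter names `f`, `h`, `q`, `r` and all numeric values are illustrative assumptions.

```python
def kf_step(x_prev, p_prev, y, f, h, q, r):
    """One predict/update step of a scalar Kalman filter.

    Model (assumed for illustration):
        x_t = f * x_{t-1} + process noise with variance q
        y_t = h * x_t     + observation noise with variance r
    """
    # Predict: propagate the previous estimate and its variance.
    x_pred = f * x_prev
    p_pred = f * p_prev * f + q

    # Update: weigh the innovation by the Kalman gain.
    s = h * p_pred * h + r   # innovation variance
    k = p_pred * h / s       # Kalman gain
    x_new = x_pred + k * (y - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new, k

# Hypothetical run on a handful of made-up observations.
x_est, p_est = 0.0, 1.0
for y in [1.1, 0.9, 1.05, 0.98]:
    x_est, p_est, gain = kf_step(x_est, p_est, y, f=1.0, h=1.0, q=0.01, r=0.1)
```

Every quantity in this step is computed from `f`, `h`, `q` and `r`, which is the kind of full model knowledge the abstract describes as often unavailable in practice.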

preprint, 2022, arXiv

MICAL: Mutual Information-Based CNN-Aided Learned Factor Graphs for Seizure Detection from EEG Signals

We develop a hybrid model-based data-driven seizure detection algorithm called Mutual Information-based CNN-Aided Learned factor graphs (MICAL) for detection of epileptic seizures from EEG signals. Our proposed method contains three main components: a neural mutual information (MI) estimator, a 1D convolutional neural network (CNN), and factor graph inference. Since during seizure the electrical activity in one or more regions in the brain becomes correlated, we use neural MI estimators to measure inter-channel statistical dependence. We also design a 1D CNN to extract additional features from raw EEG signals. Since the soft estimates obtained as the combined features from the neural MI estimator and the CNN do not capture the temporal correlation between different EEG blocks, we use them not as estimates of the seizure state, but to compute the function nodes of a factor graph. The resulting factor graphs allow structured inference which exploits the temporal correlation for further improving the detection performance. We conduct three evaluation approaches using the public CHB-MIT database: 6-fold leave-four-patients-out cross-validation, all patient training, and per patient training. Our evaluations systematically demonstrate the impact of each element in MICAL through a complete ablation study and measuring six performance metrics. It is shown that the proposed method obtains state-of-the-art performance specifically in 6-fold leave-four-patients-out cross-validation and all patient training, demonstrating a superior generalizability.

preprint, 2022, arXiv

Model-Based Deep Learning

Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Such model-based methods utilize mathematical formulations that represent the underlying physics, prior information and additional domain knowledge. Simple classical models are useful but sensitive to inaccuracies and may lead to poor performance when real systems display complex or dynamic behavior. On the other hand, purely data-driven approaches that are model-agnostic are becoming increasingly popular as datasets become abundant and the power of modern deep learning pipelines increases. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance, especially for supervised problems. However, DNNs typically require massive amounts of data and immense computational resources, limiting their applicability for some signal processing scenarios. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches. Such model-based deep learning methods exploit both partial domain knowledge, via mathematical structures designed for specific problems, as well as learning from limited data. In this article we survey the leading approaches for studying and designing model-based deep learning systems. We divide hybrid model-based/data-driven systems into categories based on their inference mechanism. We provide a comprehensive review of the leading approaches for combining model-based algorithms with deep learning in a systematic manner, along with concrete guidelines and detailed signal processing oriented examples from recent literature. Our aim is to facilitate the design and study of future systems on the intersection of signal processing and machine learning that incorporate the advantages of both domains.

preprint, 2022, arXiv

Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization

Decision making algorithms are used in a multitude of different applications. Conventional approaches for designing decision algorithms employ principled and simplified modelling, based on which one can determine decisions via tractable optimization. More recently, deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models, are becoming increasingly popular. Model-based optimization and data-centric deep learning are often considered to be distinct disciplines. Here, we characterize them as edges of a continuous spectrum varying in specificity and parameterization, and provide a tutorial-style presentation to the methodologies lying in the middle ground of this spectrum, referred to as model-based deep learning. We accompany our presentation with running examples in super-resolution and stochastic control, and show how they are expressed using the provided characterization and specialized in each of the detailed methodologies. The gains of combining model-based optimization and deep learning are demonstrated using experimental results in various applications, ranging from biomedical imaging to digital communications.

preprint, 2022, arXiv

Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests

A key requirement in containing contagious diseases, such as the Coronavirus disease 2019 (COVID-19) pandemic, is the ability to efficiently carry out mass diagnosis over large populations. Some of the leading testing procedures, such as those utilizing qualitative polymerase chain reaction, involve using dedicated machinery which can simultaneously process a limited number of samples. A candidate method to increase the test throughput is to examine pooled samples comprised of a mixture of samples from different patients. In this work we study pooling based tests which operate in a one-shot fashion, while providing an indication not solely on the presence of infection, but also on its level, without additional pool tests, as often required in COVID-19 testing. As these requirements limit the application of traditional group-testing (GT) methods, we propose a multi-level GT scheme, which builds upon GT principles to enable accurate recovery using far fewer tests than patients, while operating in a one-shot manner and providing multi-level indications. We provide a theoretical analysis of the proposed scheme and characterize conditions under which the algorithm operates reliably and at affordable computational complexity. Our numerical results demonstrate that multi-level GT accurately and efficiently detects infection levels, while achieving improved performance over previously proposed one-shot COVID-19 pooled-testing methods.
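To make the pooling idea concrete, the sketch below runs classic binary, noiseless group testing with a simple elimination decoder (often called COMP). It is background for the GT principles mentioned above, not the multi-level scheme proposed in the paper; the pool design and the infection pattern are made up for illustration.

```python
def run_pooled_tests(pools, infected):
    """Return one binary outcome per pool: positive if any member is infected."""
    return [any(patient in infected for patient in pool) for pool in pools]

def comp_decode(pools, outcomes, num_patients):
    """Clear every patient that appears in at least one negative pool;
    everyone else is flagged as possibly infected."""
    cleared = set()
    for pool, positive in zip(pools, outcomes):
        if not positive:
            cleared.update(pool)
    return sorted(set(range(num_patients)) - cleared)

# Hypothetical design: 6 patients, 4 pooled tests, all run in one shot.
pools = [[0, 1, 2], [2, 3, 4], [0, 3, 5], [1, 4, 5]]
infected = {3}

outcomes = run_pooled_tests(pools, infected)
suspects = comp_decode(pools, outcomes, num_patients=6)
```

All pools are fixed before any outcome is observed, which is what makes the procedure one-shot. The binary outcomes here only indicate presence of infection, not its level.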

preprint, 2022, arXiv

On the Acquisition of Stationary Signals Using Uniform ADCs

In this work, we consider the acquisition of stationary signals using uniform analog-to-digital converters (ADCs), i.e., employing uniform sampling and scalar uniform quantization. We jointly optimize the pre-sampling and reconstruction filters to minimize the time-averaged mean-squared error (TMSE) in recovering the continuous-time input signal for a fixed sampling rate and quantizer resolution and obtain closed-form expressions for the minimal achievable TMSE. We show that the TMSE-minimizing pre-sampling filter omits aliasing and discards weak frequency components to resolve the remaining ones with higher resolution when the rate budget is small. In our numerical study, we validate our results and show that sub-Nyquist sampling often minimizes the TMSE under tight rate budgets at the output of the ADC.
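The scalar uniform quantization step named above can be sketched as follows. This is a generic mid-rise quantizer with an assumed dynamic range and resolution; the pre-sampling and reconstruction filters optimized in the paper are not modeled, and the sample values are made up.

```python
def uniform_quantize(x, num_bits, dynamic_range):
    """Mid-rise uniform scalar quantizer over [-dynamic_range, dynamic_range].

    Inputs outside the range are clipped to the outermost level.
    """
    levels = 2 ** num_bits
    step = 2.0 * dynamic_range / levels
    # Index of the quantization cell containing x, clipped to the valid range.
    index = int((x + dynamic_range) // step)
    index = max(0, min(levels - 1, index))
    # Reconstruct at the midpoint of the cell.
    return -dynamic_range + (index + 0.5) * step

# Hypothetical samples of a signal, quantized with 2 bits over [-1, 1].
samples = [-1.2, -0.3, 0.0, 0.26, 0.99]
quantized = [uniform_quantize(s, num_bits=2, dynamic_range=1.0) for s in samples]
```

For inputs inside the dynamic range, the quantization error is at most half a step, so each extra bit halves the worst-case error at the cost of a higher output rate.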

preprint, 2022, arXiv

PhysFad: Physics-Based End-to-End Channel Modeling of RIS-Parametrized Environments with Adjustable Fading

Programmable radio environments parametrized by reconfigurable intelligent surfaces (RISs) are emerging as a new wireless communications paradigm, but currently used channel models for the design and analysis of signal-processing algorithms cannot include fading in a manner that is faithful to the underlying wave physics. To overcome this roadblock, we introduce a physics-based end-to-end model of RIS-parametrized wireless channels with adjustable fading (coined PhysFad) which is based on a first-principles coupled-dipole formalism. PhysFad naturally incorporates the notions of space and causality, dispersion (i.e., frequency selectivity) and the intertwinement of each RIS element's phase and amplitude response, as well as any arising mutual coupling effects including long-range mesoscopic correlations. PhysFad offers the to-date missing tuning knob for adjustable fading. We thoroughly characterize PhysFad and demonstrate its capabilities for a prototypical problem of RIS-enabled over-the-air channel equalization in rich-scattering wireless communications. We also share a user-friendly version of our code to help the community transition towards physics-based models with adjustable fading.

preprint, 2022, arXiv

Uncertainty in Data-Driven Kalman Filtering for Partially Known State-Space Models

Providing a metric of uncertainty alongside a state estimate is often crucial when tracking a dynamical system. Classic state estimators, such as the Kalman filter (KF), provide a time-dependent uncertainty measure from knowledge of the underlying statistics; however, deep learning based tracking systems struggle to reliably characterize uncertainty. In this paper, we investigate the ability of KalmanNet, a recently proposed hybrid model-based deep state tracking algorithm, to estimate an uncertainty measure. By exploiting the interpretable nature of KalmanNet, we show that the error covariance matrix can be computed based on its internal features, as an uncertainty measure. We demonstrate that when the system dynamics are known, KalmanNet, which learns its mapping from data without access to the statistics, provides uncertainty similar to that provided by the KF; and that in the presence of evolution model mismatch, KalmanNet provides a more accurate error estimation.

preprint, 2022, arXiv

Wideband Multi-User MIMO Communications with Frequency Selective RISs: Element Response Modeling and Sum-Rate Maximization

Reconfigurable Intelligent Surfaces (RISs) are an emerging technology for future wireless communication systems, enabling improved coverage in an energy efficient manner. RISs are usually metasurfaces, consisting of two-dimensional arrangements of metamaterial elements, whose individual response is commonly modeled in the literature as an adjustable phase shifter. However, this model holds only for narrowband communications, and when wideband transmissions are utilized, one has to account for the frequency selectivity of metamaterials, whose response usually follows a Lorentzian-like profile. In this paper, we consider the uplink of a wideband RIS-empowered multi-user Multiple-Input Multiple-Output (MIMO) wireless system with Orthogonal Frequency Division Multiplexing (OFDM) signaling, while accounting for the frequency selectivity of RISs. In particular, we focus on designing the controllable parameters dictating the Lorentzian response of each RIS metamaterial element, in order to maximize the achievable sum rate. We devise a scheme combining block coordinate descent with penalty dual decomposition to tackle the resulting challenging optimization framework. Our simulation results reveal the rates one can achieve using realistically frequency selective RISs in wideband settings, and quantify the performance loss that occurs when using state-of-the-art methods which assume that the RIS elements behave as frequency-flat phase shifters.

preprint, 2021, arXiv

Federated Learning: A Signal Processing Perspective

The dramatic success of deep learning is largely due to the availability of data. Data samples are often acquired on edge devices, such as smart phones, vehicles and sensors, and in some cases cannot be shared due to privacy considerations. Federated learning is an emerging machine learning paradigm for training models across multiple edge devices holding local datasets, without explicitly exchanging the data. Learning in a federated manner differs from conventional centralized machine learning, and poses several core unique challenges and requirements, which are closely related to classical problems studied in the areas of signal processing and communications. Consequently, dedicated schemes derived from these areas are expected to play an important role in the success of federated learning and the transition of deep learning from the domain of centralized servers to mobile edge devices. In this article, we provide a unified systematic framework for federated learning in a manner that encapsulates and highlights the main challenges that are natural to treat using signal processing tools. We present a formulation for the federated learning paradigm from a signal processing perspective, and survey a set of candidate approaches for tackling its unique challenges. We further provide guidelines for the design and adaptation of signal processing and communication methods to facilitate federated learning at large scale.

preprint2021arXiv

Model-Based Machine Learning for Communications

We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms, can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications.

preprint2020arXiv

Data-Driven Factor Graphs for Deep Symbol Detection

Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must have accurate knowledge of the statistical model of the considered signals. In this work we propose to implement factor graph methods in a data-driven manner. In particular, we propose to use machine learning (ML) tools to learn the factor graph, instead of the overall system task, which in turn is used for inference by message passing over the learned graph. We apply the proposed approach to learn the factor graph representing a finite-memory channel, demonstrating the resulting ability to implement BCJR detection in a data-driven fashion. We demonstrate that the proposed system, referred to as BCJRNet, learns to implement the BCJR algorithm from a small training set, and that the resulting receiver exhibits improved robustness to inaccurate training compared to the conventional channel-model-based receiver operating under the same level of uncertainty. Our results indicate that by utilizing ML tools to learn factor graphs from labeled data, one can implement a broad range of model-based algorithms, which traditionally require full knowledge of the underlying statistics, in a data-driven fashion.

preprint2020arXiv

Data-Driven Symbol Detection via Model-Based Machine Learning

The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework for symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not capture the underlying physics well. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty.

preprint2020arXiv

DeepSIC: Deep Soft Interference Cancellation for Multiuser MIMO Detection

Digital receivers are required to recover the transmitted symbols from their observed channel output. In multiuser multiple-input multiple-output (MIMO) setups, where multiple symbols are simultaneously transmitted, accurate symbol detection is challenging. A family of algorithms capable of reliably recovering multiple symbols is based on interference cancellation. However, these methods assume that the channel is linear, a model which does not reflect many relevant channels, as well as require accurate channel state information (CSI), which may not be available. In this work we propose a multiuser MIMO receiver which learns to jointly detect in a data-driven fashion, without assuming a specific channel model or requiring CSI. In particular, we propose a data-driven implementation of the iterative soft interference cancellation (SIC) algorithm which we refer to as DeepSIC. The resulting symbol detector is based on integrating dedicated machine-learning (ML) methods into the iterative SIC algorithm. DeepSIC learns to carry out joint detection from a limited set of training samples without requiring the channel to be linear and its parameters to be known. Our numerical evaluations demonstrate that for linear channels with full CSI, DeepSIC approaches the performance of iterative SIC, which is comparable to the optimal performance, and outperforms previously proposed ML-based MIMO receivers. Furthermore, in the presence of CSI uncertainty, DeepSIC significantly outperforms model-based approaches. Finally, we show that DeepSIC accurately detects symbols in non-linear channels, where conventional iterative SIC fails even when accurate CSI is available.

preprint2020arXiv

Dynamic Metasurface Antennas for 6G Extreme Massive MIMO Communications

Next generation wireless base stations and access points will transmit and receive using extremely massive numbers of antennas. A promising technology for realizing such massive arrays in a dynamically controllable and scalable manner with reduced cost and power consumption utilizes surfaces of radiating metamaterial elements, known as metasurfaces. To date, metasurfaces are mainly considered in the context of wireless communications as passive reflecting devices, aiding conventional transceivers in shaping the propagation environment. This article presents an alternative application of metasurfaces for wireless communications as active reconfigurable antennas with advanced analog signal processing capabilities for next generation transceivers. We review the main characteristics of metasurfaces used for radiation and reception, and analyze their main advantages as well as their effect on the ability to reliably communicate in wireless networks. As current studies unveil only a portion of the potential of metasurfaces, we detail a list of exciting research and implementation challenges which arise from the application of metasurface antennas for wireless transceivers.

preprint2020arXiv

eSampling: Energy Harvesting ADCs

Analog-to-digital converters (ADCs) allow physical signals to be processed using digital hardware. The power consumed in conversion grows with the sampling rate and quantization resolution, imposing a major challenge in power-limited systems. A common ADC architecture is based on sample-and-hold (S/H) circuits, where the analog signal is tracked only for a fraction of the sampling period. In this paper, we propose the concept of eSampling ADCs, which harvest energy from the analog signal during the time periods where the signal is not being tracked. This harvested energy can be used to supplement the ADC itself, paving the way to the possibility of zero-power consumption and power-saving ADCs. We analyze the tradeoff between the ability to recover the sampled signal and the energy harvested, and provide guidelines for setting the sampling rate in light of accuracy and energy constraints. Our analysis indicates that eSampling ADCs operating with up to 12 bits per sample can acquire bandlimited analog signals such that they can be perfectly recovered without requiring power from the external source. Furthermore, our theoretical results reveal that eSampling ADCs can in fact save power by harvesting more energy than they consume. To verify the feasibility of eSampling ADCs, we present a circuit-level design using standard complementary metal oxide semiconductor (CMOS) 65 nm technology. An eSampling 8-bit ADC which samples at 40 MHz is designed on a Cadence Virtuoso platform. Our experimental study involving Nyquist rate sampling of bandlimited signals demonstrates that such ADCs are indeed capable of harvesting more energy than that spent during analog-to-digital conversion, without affecting the accuracy.

preprint2020arXiv

Joint Transmit Beamforming for Multiuser MIMO Communication and MIMO Radar

Future wireless communication systems are expected to explore spectral bands typically used by radar systems, in order to overcome spectrum congestion of traditional communication bands. Since in many applications radar and communication share the same platform, spectrum sharing can be facilitated by joint design as a dual-function radar-communications system. In this paper, we propose a joint transmit beamforming model for a dual-function multiple-input-multiple-output (MIMO) radar and multiuser MIMO communication transmitter sharing the spectrum and an antenna array. The proposed dual-function system transmits the weighted sum of independent radar waveform and communication symbols, forming multiple beams towards the radar targets and the communication receivers, respectively. The design of the weighting coefficients is formulated as an optimization problem whose objective is the performance of the MIMO radar transmit beamforming, while guaranteeing that the signal-to-interference-plus-noise ratio (SINR) at each communication user is higher than a given threshold. Despite the non-convexity of the proposed optimization problem, it can be relaxed into a convex one, which can be solved in polynomial time, and we prove that the relaxation is tight. Then, we propose a reduced complexity design based on zero-forcing the inter-user interference and radar interference. Unlike previous works, which focused on the transmission of communication symbols to synthesize a radar transmit beam pattern, our method provides more degrees of freedom for MIMO radar and is thus able to obtain improved radar performance, as demonstrated in our simulation study. Furthermore, the proposed dual-function scheme approaches the radar performance of the radar-only scheme, i.e., without spectrum sharing, under reasonable communication quality constraints.

preprint2020arXiv

Task-Based Quantization with Application to MIMO Receivers

Multiple-input multiple-output (MIMO) systems are required to communicate reliably at high spectral bands using a large number of antennas, while operating under strict power and cost constraints. In order to meet these constraints, future MIMO receivers are expected to operate with low resolution quantizers, namely, utilize a limited number of bits for representing their observed measurements, inherently distorting the digital representation of the acquired signals. The fact that MIMO receivers use their measurements for some task, such as symbol detection and channel estimation, other than recovering the underlying analog signal, indicates that the distortion induced by bit-constrained quantization can be reduced by designing the acquisition scheme in light of the system task, i.e., by task-based quantization. In this work we survey the theory and design approaches to task-based quantization, presenting model-aware designs as well as data-driven implementations. Then, we show how one can implement a task-based bit-constrained MIMO receiver, presenting approaches ranging from conventional hybrid receiver architectures to structures exploiting the dynamic nature of metasurface antennas. This survey narrows the gap between theoretical task-based quantization and its implementation in practice, providing concrete algorithmic and hardware design principles for realizing task-based MIMO receivers.

preprint2020arXiv

The Rate Distortion Function of Asynchronously Sampled Memoryless Cyclostationary Gaussian Processes

Man-made communications signals are typically modelled as continuous-time (CT) wide-sense cyclostationary (WSCS) processes. As modern processing is digital, it operates on sampled versions of the CT signals. When sampling is applied to a CT WSCS process, the statistics of the resulting discrete-time (DT) process depend on the relationship between the sampling interval and the period of the statistics of the CT process: When these two parameters have a common integer factor, then the DT process is WSCS. This situation is referred to as synchronous sampling. When this is not the case, which is referred to as asynchronous sampling, the resulting DT process is wide-sense almost cyclostationary (WSACS). In this work, we study the fundamental tradeoff of source codes applied to sampled CT WSCS processes, namely, their rate-distortion function (RDF). We note that RDF characterization for the case of synchronous sampling directly follows from classic information-theoretic tools utilizing ergodicity and the law of large numbers; however, when sampling is asynchronous, the resulting process is not information stable. In such cases, commonly used information-theoretic tools are inapplicable to RDF analysis, which poses a major challenge. Using the information spectrum framework, we show that the RDF for asynchronous sampling in the low distortion regime can be expressed as the limit superior of a sequence of RDFs in which each element corresponds to the RDF of a synchronously sampled WSCS process (but their limit is not guaranteed to exist). The resulting characterization allows us to introduce novel insights on the relationship between sampling synchronization and RDF. For example, we demonstrate that, differently from stationary processes, small differences in the sampling rate and the sampling time offset can notably affect the RDF of sampled CT WSCS processes.

preprint2020arXiv

UVeQFed: Universal Vector Quantization for Federated Learning

Traditional deep learning models are trained at a centralized server using labeled data samples collected from end devices or users. Such data samples often include private information, which the users may not be willing to share. Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data. In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model. A major challenge that arises in this method is the need for each user to efficiently transmit its learned model over the throughput-limited uplink channel. In this work, we tackle this challenge using tools from quantization theory. In particular, we identify the unique characteristics associated with conveying trained models over rate-constrained channels, and propose a suitable quantization scheme for such settings, referred to as universal vector quantization for FL (UVeQFed). We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only minimal distortion. We then theoretically analyze the distortion, showing that it vanishes as the number of users grows. We also characterize the convergence of models trained with the traditional federated averaging method combined with UVeQFed to the model which minimizes the loss function. Our numerical results demonstrate the gains of UVeQFed over previously proposed methods in terms of both distortion induced in quantization and accuracy of the resulting aggregated model.

preprint2019arXiv

A Block Sparsity Based Estimator for mmWave Massive MIMO Channels with Beam Squint

Multiple-input multiple-output (MIMO) millimeter wave (mmWave) communication is a key technology for next generation wireless networks. One of the consequences of utilizing a large number of antennas with an increased bandwidth is that array steering vectors vary among different subcarriers. Due to this effect, known as beam squint, the conventional channel model is no longer applicable for mmWave massive MIMO systems. In this paper, we study channel estimation under the resulting non-standard model. To that aim, we first analyze the beam squint effect from an array signal processing perspective, resulting in a model which sheds light on the angle-delay sparsity of mmWave transmission. We next design a compressive sensing based channel estimation algorithm which utilizes the shift-invariant block-sparsity of this channel model. The proposed algorithm jointly computes the off-grid angles, the off-grid delays, and the complex gains of the multi-path channel. We show that the newly proposed scheme reflects the mmWave channel more accurately and results in improved performance compared to traditional approaches. We then demonstrate how this approach can be applied to recover both the uplink as well as the downlink channel in frequency division duplex (FDD) systems, by exploiting the angle-delay reciprocity of mmWave channels.

preprint2019arXiv

MAJoRCom: A Dual-Function Radar Communication System Using Index Modulation

Dual-function radar communication (DFRC) systems implement both sensing and communication using the same hardware. Such schemes are often more efficient in terms of size, power, and cost, over using distinct radar and communication systems. Since these functionalities share resources such as spectrum, power, and antennas, DFRC methods typically entail some degradation in both radar and communication performance. In this work we propose a DFRC scheme based on the carrier agile phased array radar (CAESAR), which combines frequency and spatial agility. The proposed DFRC system, referred to as multi-carrier agile joint radar communication (MAJoRCom), exploits the inherent spatial and spectral randomness of CAESAR to convey digital messages in the form of index modulation. The resulting communication scheme naturally coexists with the radar functionality, and thus does not come at the cost of reduced radar performance. We analyze the performance of MAJoRCom, quantifying its achievable bit rate. In addition, we develop a low complexity decoder and a codebook design approach, which simplify the recovery of the communicated bits. Our numerical results demonstrate that MAJoRCom is capable of achieving a bit rate which is comparable to utilizing independent communication modules without affecting the radar performance, and that our proposed low-complexity decoder allows the receiver to reliably recover the transmitted symbols with an affordable computational burden.