Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
33 works
0 followers
14 topics
4 close collaborators

Published work

33 published item(s)

preprint · 2026 · arXiv

Context-Aware Wireless Token Communication via Joint Token Masking and Detection

The increasing use of token-based representations in language-driven applications has motivated wireless token communication, where tokens are treated as fundamental units for transmission. However, conventional communication systems overlook dependencies among tokens and allocate transmission resources uniformly, leading to inefficient use of limited wireless resources under channel impairments. In this paper, we propose a context-aware token communication framework that leverages a masked language model (MLM) as a shared contextual model between the transmitter (Tx) and receiver (Rx). At the Rx, we develop a context-aware token detection method that integrates channel likelihoods with MLM-based contextual priors under a Bayesian formulation, enabling robust token inference over noisy channels. At the Tx, we propose a context-aware token masking strategy that selectively omits tokens that can be reliably inferred at the Rx, allowing the available power budget to be concentrated on more informative tokens. These components are jointly designed through a shared MLM, establishing a unified Tx-Rx framework for efficient token transmission and detection. Simulation results demonstrate that the proposed framework significantly improves reconstruction performance compared to conventional and existing token communication schemes, achieving up to 1.77X and 1.63X performance gains on the Europarl corpus and WikiText-103 datasets, respectively.
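As a rough illustration of the receiver-side idea in this abstract, the following minimal sketch combines a Gaussian channel likelihood with a contextual prior under Bayes' rule and picks the maximum a posteriori token. The scalar channel, the function name `detect_token`, the token-to-symbol mapping, and the hand-written prior (a stand-in for a masked language model output) are all assumptions made for illustration, not the authors' implementation.

```python
import math


def detect_token(received, constellation, prior, noise_var):
    """Return the MAP token and the posterior over candidate tokens.

    received      -- noisy real-valued observation (toy scalar AWGN channel, assumed)
    constellation -- dict: token -> transmitted symbol value (hypothetical mapping)
    prior         -- dict: token -> contextual prior probability, all > 0
                     (stand-in for what a shared MLM might output)
    noise_var     -- variance of the additive Gaussian noise
    """
    log_post = {}
    for token, symbol in constellation.items():
        # Gaussian log-likelihood up to a constant shared by all tokens.
        log_lik = -((received - symbol) ** 2) / (2.0 * noise_var)
        # Bayes' rule in the log domain: log posterior = log likelihood + log prior.
        log_post[token] = log_lik + math.log(prior[token])

    # Normalize with the log-sum-exp trick for numerical stability.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    posterior = {t: math.exp(v - peak) / norm for t, v in log_post.items()}

    best = max(posterior, key=posterior.get)
    return best, posterior
```

With symbols `{"cat": -1.0, "dog": 1.0}`, an observation of `0.2`, and unit noise variance, a uniform prior selects `"dog"` (the nearest symbol), whereas a prior of 0.9 on `"cat"` flips the decision to `"cat"`. This is the sense in which context can correct a channel error.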

preprint · 2026 · arXiv

Enabling Training-Free Semantic Communication Systems with Generative Diffusion Models

Semantic communication (SemCom) has recently emerged as a promising paradigm for next-generation wireless systems. Empowered by advanced artificial intelligence (AI) technologies, SemCom has achieved significant improvements in transmission quality and efficiency. However, existing SemCom systems either rely on training over large datasets and specific channel conditions or suffer from performance degradation under channel noise when operating in a training-free manner. To address these issues, we explore the use of generative diffusion models (GDMs) as training-free SemCom systems. Specifically, we design a semantic encoding and decoding method based on the inversion and sampling process of the denoising diffusion implicit model (DDIM), which introduces a two-stage forward diffusion process, split between the transmitter and receiver to enhance robustness against channel noise. Moreover, we optimize sampling steps to compensate for the increased noise level caused by channel noise. We also conduct a brief analysis to provide insights about this design. Simulations on the Kodak dataset validate that the proposed system outperforms the existing baseline SemCom systems across various metrics.

preprint · 2026 · arXiv

Rethinking Secure Semantic Communications in the Age of Generative and Agentic AI: Threats and Opportunities

Semantic communication (SemCom) improves communication efficiency by transmitting task-relevant information instead of raw bits and is expected to be a key technology for 6G networks. Recent advances in generative AI (GenAI) further enhance SemCom by enabling robust semantic encoding and decoding under limited channel conditions. However, these efficiency gains also introduce new security and privacy vulnerabilities. Due to the broadcast nature of wireless channels, eavesdroppers can also use powerful GenAI-based semantic decoders to recover private information from intercepted signals. Moreover, rapid advances in agentic AI enable eavesdroppers to perform long-term and adaptive inference through the integration of memory, external knowledge, and reasoning capabilities. This allows eavesdroppers to further infer user private behavior and intent beyond the transmitted content. Motivated by these emerging challenges, this paper comprehensively rethinks the security and privacy of SemCom systems in the age of generative and agentic AI. We first present a systematic taxonomy of eavesdropping threat models in SemCom systems. Then, we provide insights into how GenAI and agentic AI can enhance eavesdropping threats. Meanwhile, we also highlight potential opportunities for leveraging GenAI and agentic AI to design privacy-preserving SemCom systems.

preprint · 2026 · arXiv

Toward Scalable SDN for LEO Mega-Constellations: A Graph Learning Approach

Terrestrial network limitations drive the integration of non-terrestrial networks (NTNs), notably mega-constellations comprising thousands of low Earth orbit (LEO) satellites. While these satellites act as interconnected network switches via inter-satellite links (ISLs), their massive scale creates severe bottlenecks for network management. To address this, we propose a scalable, hierarchical software-defined networking (SDN) framework. Our architecture leverages graph neural networks (GNNs) to compactly represent the constellation topology, and Koopman theory to linearize nonlinear dynamics. Specifically, a Graph Koopman Autoencoder (GKAE) forecasts spatio-temporal behavior within a linear subspace for each orbital shell. A central SDN controller then aggregates these shell-level predictions for globally coordinated control. Simulations on the Starlink constellation demonstrate that our approach achieves at least a 42.8% improvement in spatial compression and a 10.81% improvement in temporal forecasting compared to established baselines, all while utilizing a significantly smaller model footprint.

preprint · 2022 · arXiv

A Unified View on Semantic Information and Communication: A Probabilistic Logic Approach

This article aims to provide a unified and technical approach to semantic information, communication, and their interplay through the lens of probabilistic logic. To this end, on top of the existing technical communication (TC) layer, we additionally introduce a semantic communication (SC) layer that exchanges logically meaningful clauses in knowledge bases. To make these SC and TC layers interact, we propose various measures based on the entropy of a clause in a knowledge base. These measures allow us to delineate various technical issues on SC such as a message selection problem for improving the knowledge at a receiver. Extending this, we showcase selected examples in which SC and TC layers interact with each other while taking into account constraints on physical channels.

preprint · 2022 · arXiv

Predictive Closed-Loop Remote Control over Wireless Two-Way Split Koopman Autoencoder

Real-time remote control over wireless is an important-yet-challenging application in 5G and beyond due to its mission-critical nature under limited communication resources. Current solutions hinge on not only utilizing ultra-reliable and low-latency communication (URLLC) links but also predicting future states, which may consume enormous communication resources and struggle with a short prediction time horizon. To fill this void, in this article we propose a novel two-way Koopman autoencoder (AE) approach wherein: 1) a sensing Koopman AE learns to understand the temporal state dynamics and predicts missing packets from a sensor to its remote controller; and 2) a controlling Koopman AE learns to understand the temporal action dynamics and predicts missing packets from the controller to an actuator co-located with the sensor. Specifically, each Koopman AE aims to learn the Koopman operator in the hidden layers while the encoder of the AE aims to project the non-linear dynamics onto a lifted subspace, which is reverted into the original non-linear dynamics by the decoder of the AE. The Koopman operator describes the linearized temporal dynamics, enabling long-term future prediction and coping with missing packets and closed-form optimal control in the lifted subspace. Simulation results corroborate that the proposed approach achieves a 38x lower mean squared control error at 0 dBm signal-to-noise ratio (SNR) than the non-predictive baseline.
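The encoder, Koopman operator, and decoder roles described in this abstract can be illustrated with a textbook toy system in which the lifting is known in closed form. The sketch below is an assumption-laden example: the nonlinear system, the hand-crafted `encode`/`decode` functions, and the parameters `a`, `b`, `c` are hypothetical and replace the learned autoencoder of the paper. It only shows why multi-step prediction reduces to repeated multiplication by one matrix in the lifted space.

```python
def step_nonlinear(x, a=0.9, b=0.5, c=1.0):
    """One step of a toy nonlinear system (hypothetical, chosen for illustration)."""
    x1, x2 = x
    return (a * x1, b * x2 + c * x1 * x1)


def encode(x):
    """Hand-crafted lifting z = (x1, x2, x1^2); a learned encoder would replace this."""
    x1, x2 = x
    return [x1, x2, x1 * x1]


def decode(z):
    """Map the lifted state back to the original coordinates."""
    return (z[0], z[1])


def koopman_matrix(a=0.9, b=0.5, c=1.0):
    """Matrix K with z[t+1] = K z[t]; exact for this toy system."""
    return [
        [a, 0.0, 0.0],
        [0.0, b, c],
        [0.0, 0.0, a * a],
    ]


def predict(x0, steps, K):
    """Predict `steps` ahead using only linear updates in the lifted space."""
    z = encode(x0)
    for _ in range(steps):
        z = [sum(K[i][j] * z[j] for j in range(3)) for i in range(3)]
    return decode(z)
```

Because the dynamics are linear in the lifted coordinates, a receiver that misses several packets could, in this toy setting, keep rolling the last known lifted state forward with `predict` instead of waiting for fresh measurements.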

preprint · 2022 · arXiv

Quantum Multi-Agent Reinforcement Learning via Variational Quantum Circuit Design

In recent years, quantum computing (QC) has been getting a lot of attention from industry and academia. In particular, among various QC research topics, the variational quantum circuit (VQC) enables quantum deep reinforcement learning (QRL). Many studies of QRL have shown that QRL is superior to classical reinforcement learning (RL) methods under constraints on the number of training parameters. This paper extends QRL to quantum multi-agent RL (QMARL) and demonstrates it. However, the extension of QRL to QMARL is not straightforward due to the challenge of the noisy intermediate-scale quantum (NISQ) era and the non-stationary properties in classical multi-agent RL (MARL). Therefore, this paper proposes the centralized training and decentralized execution (CTDE) QMARL framework by designing novel VQCs for the framework to cope with these issues. To corroborate the QMARL framework, this paper conducts the QMARL demonstration in a single-hop environment where edge agents offload packets to clouds. The extensive demonstration shows that the proposed QMARL framework achieves a 57.7% higher total reward than classical frameworks.

preprint · 2022 · arXiv

Semantic Communication as a Signaling Game with Correlated Knowledge Bases

Semantic communication (SC) goes beyond technical communication in which a given sequence of bits or symbols, often referred to as information, is to be transmitted reliably over a noisy channel, regardless of its meaning. In SC, conveying the meaning of information becomes important, which requires some sort of agreement between a sender and a receiver through their knowledge bases. In this sense, SC is closely related to a signaling game where a sender takes an action to send a signal that conveys information to a receiver, while the receiver can interpret the signal and choose a response accordingly. In this paper, based on the signaling game, we build an SC model and characterize its performance in terms of mutual information. In addition, we show that the conditional mutual information between the instances of the knowledge bases of communicating parties plays a crucial role in improving the performance of SC.
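Since this abstract measures performance by mutual information, a small helper that computes it from a joint distribution may make the quantity concrete. This is a generic sketch of the standard definition I(X;Y) = sum over (x, y) of p(x,y) log2[p(x,y) / (p(x) p(y))]. The function name and the dictionary-based representation of the joint distribution are assumptions, and the paper's signaling-game model and conditional mutual information are not reproduced here.

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in bits.

    joint -- dict mapping (x, y) pairs to probabilities that sum to 1
             (hypothetical representation chosen for this sketch)
    """
    px, py = {}, {}
    for (x, y), p in joint.items():
        # Accumulate the marginals p(x) and p(y).
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p

    # Zero-probability pairs contribute nothing and are skipped.
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )
```

For example, if a receiver always interprets a uniformly random binary meaning correctly, the joint distribution `{(0, 0): 0.5, (1, 1): 0.5}` yields one bit, while a receiver whose interpretation is independent of the sender's meaning yields zero bits.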

preprint · 2022 · arXiv

Slimmable Quantum Federated Learning

Quantum federated learning (QFL) has recently received increasing attention, where quantum neural networks (QNNs) are integrated into federated learning (FL). In contrast to the existing static QFL methods, we propose slimmable QFL (SlimQFL) in this article, which is a dynamic QFL framework that can cope with time-varying communication channels and computing energy limitations. This is made viable by leveraging the unique nature of a QNN where its angle parameters and pole parameters can be separately trained and dynamically exploited. Simulation results corroborate that SlimQFL achieves higher classification accuracy than Vanilla QFL on average, particularly under poor channel conditions.

preprint · 2022 · arXiv

Towards Semantic Communication Protocols: A Probabilistic Logic Perspective

Classical medium access control (MAC) protocols are interpretable, yet their task-agnostic control signaling messages (CMs) are ill-suited for emerging mission-critical applications. By contrast, neural network (NN) based protocol models (NPMs) learn to generate task-specific CMs, but their rationale and impact lack interpretability. To fill this void, in this article we propose, for the first time, a semantic protocol model (SPM) constructed by transforming an NPM into an interpretable symbolic graph written in the probabilistic logic programming language (ProbLog). This transformation is viable by extracting and merging common CMs and their connections while treating the NPM as a CM generator. By extensive simulations, we corroborate that the SPM tightly approximates its original NPM while occupying only 0.02% memory. By leveraging its interpretability and memory-efficiency, we demonstrate several SPM-enabled applications such as SPM reconfiguration for collision-avoidance, as well as comparing different SPMs via semantic entropy calculation and storing multiple SPMs to cope with non-stationary environments.

preprint · 2022 · arXiv

Visual Transformer Meets CutMix for Improved Accuracy, Communication Efficiency, and Data Privacy in Split Learning

This article seeks a distributed learning solution for the visual transformer (ViT) architectures. Compared to convolutional neural network (CNN) architectures, ViTs often have larger model sizes and are computationally expensive, making federated learning (FL) ill-suited. Split learning (SL) can circumvent this problem by splitting a model and communicating the hidden representations at the split-layer, also known as smashed data. Nevertheless, the smashed data of ViT are as large as, and as similar to, the input data, negating the communication efficiency of SL while violating data privacy. To resolve these issues, we propose a new form of CutSmashed data by randomly punching and compressing the original smashed data. Leveraging this, we develop a novel SL framework for ViT, coined CutMixSL, communicating CutSmashed data. CutMixSL not only reduces communication costs and privacy leakage, but also inherently involves the CutMix data augmentation, improving accuracy and scalability. Simulations corroborate that CutMixSL outperforms baselines such as parallelized SL and SplitFed, which integrates FL with SL.

preprint · 2021 · arXiv

Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM

In this paper, we propose a communication-efficient decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored and Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies, while incorporating link censoring for negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves the linear convergence rate when the local objective functions are strongly convex under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency in terms of the number of communication rounds and transmit energy consumption without compromising the accuracy and convergence speed, compared to the censored decentralized ADMM, and the worker grouping method of GADMM.
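The phrase "link censoring for negligible updates after quantization" can be pictured with a deliberately simplified sketch: a worker quantizes its model and transmits only when the quantized model differs enough from what it last sent. The deterministic uniform quantizer, the fixed `threshold`, and the function names below are assumptions for illustration. The actual quantizer, censoring rule, and ADMM updates of CQ-GGADMM are not given in the abstract and are not reproduced here.

```python
def quantize(vector, step):
    """Uniform deterministic quantizer (a stand-in; the paper's quantizer may differ)."""
    return [step * round(v / step) for v in vector]


def censored_update(new_model, last_sent, step, threshold):
    """Decide whether a worker transmits its quantized model.

    Returns (model_seen_by_neighbors, transmitted_flag).
    `threshold` is a hypothetical fixed censoring threshold.
    """
    q = quantize(new_model, step)
    # Largest coordinate-wise change relative to the last transmitted model.
    change = max(abs(a - b) for a, b in zip(q, last_sent))
    if change < threshold:
        # Negligible update: skip transmission, neighbors keep the old copy.
        return last_sent, False
    return q, True
```

In this toy rule, a worker whose model barely moves between iterations stays silent, which saves both a communication round and the associated transmit energy.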

preprint · 2021 · arXiv

Mean-Field Game-Theoretic Edge Caching

In this book chapter, we study a problem of distributed content caching in an ultra-dense edge caching network (UDCN), in which a large number of small base stations (SBSs) prefetch popular files to cope with the ever-growing user demand in 5G and beyond. In a UDCN, even a small misprediction of user demand may render a large amount of prefetched data obsolete. Furthermore, the interference variance is high due to the short inter-SBS distances, making it difficult to quantify data downloading rates. Lastly, since the caching decision of each SBS interacts with those of all other SBSs, the problem complexity increases exponentially with the number of SBSs, which is unfit for UDCNs. To resolve such challenging issues while reflecting time-varying and location-dependent user demand, we leverage mean-field game (MFG) theory through which each SBS interacts only with a single virtual SBS whose state is drawn from the state distribution of the entire SBS population, i.e., mean-field (MF) distribution. This MF approximation asymptotically guarantees achieving the epsilon Nash equilibrium as the number of SBSs approaches infinity. To describe such an MFG-theoretic caching framework, this chapter aims to provide a brief review of MFG, and demonstrate its effectiveness for UDCNs.

preprint · 2021 · arXiv

Predictive Control and Communication Co-Design via Two-Way Gaussian Process Regression and AoI-Aware Scheduling

This article studies the joint problem of uplink-downlink scheduling and power allocation for controlling a large number of actuators that upload their states to remote controllers and download control actions over wireless links. To overcome the lack of wireless resources, we propose a machine learning-based solution, where only a fraction of actuators is controlled, while the rest of the actuators are actuated by locally predicting the missing state and/or action information using the previous uplink and/or downlink receptions via a Gaussian process regression (GPR). This GPR prediction credibility is determined using the age-of-information (AoI) of the latest reception. Moreover, the successful reception is affected by the transmission power, mandating a co-design of the communication and control operations. To this end, we formulate a network-wide minimization problem of the average AoI and transmission power under communication reliability and control stability constraints. To solve the problem, we propose a dynamic control algorithm using the Lyapunov drift-plus-penalty optimization framework. Numerical results corroborate that the proposed algorithm can stably control 2x more actuators compared to an event-triggered scheduling baseline with Kalman filtering and frequency division multiple access, which is 18x larger than a round-robin scheduling baseline.

preprint · 2021 · arXiv

RIS-Assisted Coverage Enhancement in Millimeter-Wave Cellular Networks

The use of millimeter-wave (mmWave) bandwidth is one key enabler to achieve the high data rates in the fifth-generation (5G) cellular systems. However, mmWave signals suffer from significant path loss due to high directivity and sensitivity to blockages, limiting their adoption to small-scale deployments. To enhance the coverage of mmWave communication in 5G and beyond, it is promising to deploy a large number of reconfigurable intelligent surfaces (RISs) that passively reflect mmWave signals towards desired directions. With this motivation, in this work we study the coverage of an RIS-assisted large-scale mmWave cellular network using stochastic geometry, and derive the peak reflection power expression of an RIS and the downlink signal-to-interference ratio (SIR) coverage expression in closed forms. These analytic results clarify the effectiveness of deploying RISs in the mmWave SIR coverage enhancement, while unveiling the major role of the density ratio between active base stations (BSs) and passive RISs. Furthermore, the results show that deploying passive reflectors is as effective as equipping BSs with more active antennas in the mmWave coverage enhancement. Simulation results confirm the tightness of the closed-form expressions, corroborating our major findings based on the derived expressions.

preprint · 2021 · arXiv

Robust Blockchained Federated Learning with Model Validation and Proof-of-Stake Inspired Consensus

Federated learning (FL) is a promising distributed learning solution that only exchanges model parameters without revealing raw data. However, the centralized architecture of FL is vulnerable to the single point of failure. In addition, FL does not examine the legitimacy of local models, so even a small fraction of malicious devices can disrupt global training. To resolve these robustness issues of FL, in this paper, we propose a blockchain-based decentralized FL framework, termed VBFL, by exploiting two mechanisms in a blockchained architecture. First, we introduce a novel decentralized validation mechanism such that the legitimacy of local model updates is examined by individual validators. Second, we design a dedicated proof-of-stake consensus mechanism where stake is more frequently rewarded to honest devices, which protects the legitimate local model updates by increasing their chances of dictating the blocks appended to the blockchain. Together, these solutions promote more federation within legitimate devices, enabling robust FL. Our emulation results of the MNIST classification corroborate that with 15% of malicious devices, VBFL achieves 87% accuracy, which is 7.4x higher than Vanilla FL.

preprint · 2021 · arXiv

Robustness and Diversity Seeking Data-Free Knowledge Distillation

Knowledge distillation (KD) has enabled remarkable progress in model compression and knowledge transfer. However, KD requires a large volume of original data or their representation statistics that are not usually available in practice. Data-free KD has recently been proposed to resolve this problem, wherein teacher and student models are fed by a synthetic sample generator trained from the teacher. Nonetheless, existing data-free KD methods rely on fine-tuning of weights to balance multiple losses, and ignore the diversity of generated samples, resulting in limited accuracy and robustness. To overcome this challenge, we propose robustness and diversity seeking data-free KD (RDSKD) in this paper. The generator loss function is crafted to produce samples with high authenticity, class diversity, and inter-sample diversity. Without real data, the objectives of seeking high sample authenticity and class diversity often conflict with each other, causing frequent loss fluctuations. We mitigate this by exponentially penalizing loss increments. With MNIST, CIFAR-10, and SVHN datasets, our experiments show that RDSKD achieves higher accuracy with more robustness over different hyperparameter settings, compared to other data-free KD methods such as DAFL, MSKD, ZSKD, and DeepInversion.

preprint · 2020 · arXiv

Communication and Consensus Co-Design for Distributed, Low-Latency and Reliable Wireless Systems

Designing distributed, fast and reliable wireless consensus protocols is instrumental in enabling mission-critical decentralized systems, such as robotic networks in the industrial Internet of Things (IIoT), drone swarms in rescue missions, and so forth. However, chasing both low-latency and reliability of consensus protocols is a challenging task. The problem is aggravated under wireless connectivity that may be slower and less reliable, compared to wired connections. To tackle this issue, we investigate fundamental relationships between consensus latency and reliability through the lens of wireless connectivity, and co-design communication and consensus protocols for low-latency and reliable decentralized systems. Specifically, we propose a novel communication-efficient distributed consensus protocol, termed Random Representative Consensus (R2C), and show its effectiveness under gossip and broadcast communication protocols. To this end, we derive a closed-form end-to-end (E2E) latency expression of the R2C that guarantees a target reliability, and compare it with a baseline consensus protocol, referred to as Referendum Consensus (RC). The result shows that the R2C is faster than the RC, and that it is more reliable when co-designed with the broadcast protocol than with the gossip protocol.

preprint · 2020 · arXiv

Communication-Efficient and Distributed Learning Over Wireless Networks: Principles and Applications

Machine learning (ML) is a promising enabler for the fifth generation (5G) communication systems and beyond. By imbuing intelligence into the network edge, edge nodes can proactively carry out decision-making, and thereby react to local environmental changes and disturbances while experiencing zero communication latency. To achieve this goal, it is essential to cater for high ML inference accuracy at scale under time-varying channel and network dynamics, by continuously exchanging fresh data and ML model updates in a distributed way. Taming this new kind of data traffic boils down to improving the communication efficiency of distributed learning by optimizing communication payload types, transmission techniques, and scheduling, as well as ML architectures, algorithms, and data processing methods. To this end, this article aims to provide a holistic overview of relevant communication and ML principles, and thereby present communication-efficient and distributed learning frameworks with selected use cases.

preprint · 2020 · arXiv

Communication-Efficient Massive UAV Online Path Control: Federated Learning Meets Mean-Field Game Theory

This paper investigates the control of a massive population of UAVs such as drones. The straightforward method of controlling UAVs by considering the interactions among them to make a flock requires a huge amount of inter-UAV communication, which is impossible to implement in real-time applications. One method of control is to apply the mean-field game (MFG) framework, which substantially reduces communications among the UAVs. However, to realize this framework, powerful processors are required to obtain the control laws at different UAVs. This requirement limits the usage of the MFG framework for real-time applications such as massive UAV control. Thus, a function approximator based on neural networks (NN) is utilized to approximate the solutions of Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations. Nevertheless, using an approximate solution can violate the conditions for convergence of the MFG framework. Therefore, the federated learning (FL) approach, which can share the model parameters of NNs at drones, is proposed with NN-based MFG to satisfy the required conditions. The stability analysis of the NN-based MFG approach is presented, and the performance of the proposed FL-MFG is elaborated by the simulations.

preprint · 2020 · arXiv

Communication-Efficient Multimodal Split Learning for mmWave Received Power Prediction

The goal of this study is to improve the accuracy of millimeter wave received power prediction by utilizing camera images and radio frequency (RF) signals, while gathering image inputs in a communication-efficient and privacy-preserving manner. To this end, we propose a distributed multimodal machine learning (ML) framework, coined multimodal split learning (MultSL), in which a large neural network (NN) is split into two wirelessly connected segments. The upper segment combines images and received powers for future received power prediction, whereas the lower segment extracts features from camera images and compresses its output to reduce communication costs and privacy leakage. Experimental evaluation corroborates that MultSL achieves higher accuracy than the baselines utilizing either images or RF signals. Remarkably, without compromising accuracy, compressing the lower segment output by 16x yields 16x lower communication latency and 2.8% less privacy leakage compared to the case without compression.

preprint · 2020 · arXiv

Distributed Heteromodal Split Learning for Vision Aided mmWave Received Power Prediction

The goal of this work is the accurate prediction of millimeter-wave received power leveraging both radio frequency (RF) signals and heterogeneous visual data from multiple distributed cameras, in a communication and energy-efficient manner while preserving data privacy. To this end, firstly focusing on data privacy, we propose heteromodal split learning with feature aggregation (HetSLAgg) that splits neural network (NN) models into camera-side and base station (BS)-side segments. The BS-side NN segment fuses RF signals and uploaded image features without collecting raw images. However, the usage of multiple visual data leads to an increase in NN input dimensions, which gives rise to additional communication and energy costs. To overcome additional communication and energy costs due to image interpolation to blend different frame rates, we propose a novel BS-side manifold mixup technique that offloads the interpolation operations from cameras to a BS. Subsequently, we confront energy costs for operating a larger size of the BS-side NN segment due to concatenating image features across cameras and propose an energy-efficient aggregation method. This is done via a linear combination of image features instead of concatenating them, where the NN size is independent of the number of cameras. Comprehensive test-bed experiments with measured channels demonstrate that HetSLAgg reduces the prediction error by 44% compared to a baseline leveraging only RF received power. Moreover, the experiments show that the designed HetSLAgg achieves over 20% gains in terms of communication and energy cost reduction compared to several baseline designs within at most 1% of accuracy loss.
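The claim that a linear combination keeps the NN size independent of the number of cameras comes down to input dimensions, which the following minimal sketch makes explicit. The function names and the plain Python lists standing in for image feature vectors are assumptions for illustration, and the abstract does not say how the combination weights are chosen in HetSLAgg.

```python
def concatenate(features):
    """Concatenate per-camera feature vectors; output length grows with camera count."""
    return [value for feature in features for value in feature]


def linear_combination(features, weights):
    """Weighted sum of per-camera feature vectors; output length stays fixed.

    features -- list of equal-length feature vectors, one per camera
    weights  -- one scalar weight per camera (hypothetical; how the weights
                are obtained is not stated in the abstract)
    """
    dim = len(features[0])
    return [
        sum(w * feature[i] for w, feature in zip(weights, features))
        for i in range(dim)
    ]
```

With three cameras producing 2-dimensional features, `concatenate` yields a 6-dimensional input while `linear_combination` yields a 2-dimensional one. Adding a fourth camera grows the former to 8 dimensions and leaves the latter at 2, so the first BS-side layer would not need to be resized.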

preprint · 2020 · arXiv

Extreme URLLC: Vision, Challenges, and Key Enablers

Notwithstanding the significant traction gained by ultra-reliable and low-latency communication (URLLC) in both academia and 3GPP standardization, fundamentals of URLLC remain elusive. Meanwhile, new immersive and high-stakes control applications with much stricter reliability, latency and scalability requirements are posing unprecedented challenges in terms of system design and algorithmic solutions. This article aims to provide a fresh and in-depth look into URLLC by first examining the limitations of 5G URLLC and then putting forward key research directions for the next generation of URLLC, coined eXtreme ultra-reliable and low-latency communication (xURLLC). xURLLC is underpinned by three core concepts: (1) it leverages recent advances in machine learning (ML) for faster and reliable data-driven predictions; (2) it fuses both radio frequency (RF) and non-RF modalities for modeling and combating rare events without sacrificing spectral efficiency; and (3) it underscores the much-needed joint communication and control co-design, as opposed to the communication-centric 5G URLLC. The intent of this article is to spearhead beyond-5G/6G mission-critical applications by laying out a holistic vision of xURLLC, its research challenges and enabling technologies, while providing key insights grounded in selected use cases.

preprint · 2020 · arXiv

Federated Reinforcement Distillation with Proxy Experience Memory

In distributed reinforcement learning, it is common to exchange the experience memory of each agent and thereby collectively train their local models. The experience memory, however, contains all the preceding state observations and their corresponding policies of the host agent, which may violate the privacy of the agent. To avoid this problem, in this work, we propose a privacy-preserving distributed reinforcement learning (RL) framework, termed federated reinforcement distillation (FRD). The key idea is to exchange a proxy experience memory comprising a pre-arranged set of states and time-averaged policies, thereby preserving the privacy of actual experiences. Based on an advantage actor-critic RL architecture, we numerically evaluate the effectiveness of FRD and investigate how the performance of FRD is affected by the proxy memory structure and different memory exchanging rules.
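One way to picture a proxy experience memory "comprising a pre-arranged set of states and time-averaged policies" is sketched below. It rests on several assumptions: scalar states, nearest-neighbor assignment to proxy states, and the function name `build_proxy_memory` are all hypothetical choices made for illustration, and the clustering and exchange rules studied in the paper may differ.

```python
def build_proxy_memory(experiences, proxy_states):
    """Summarize raw experiences as time-averaged policies per proxy state.

    experiences  -- list of (state, policy) pairs, where state is a float and
                    policy is a list of action probabilities (toy representation)
    proxy_states -- pre-arranged list of representative states shared by all agents
    """
    sums = {p: None for p in proxy_states}
    counts = {p: 0 for p in proxy_states}

    for state, policy in experiences:
        # Assign each actual state to its nearest proxy state (assumed rule).
        nearest = min(proxy_states, key=lambda p: abs(p - state))
        if sums[nearest] is None:
            sums[nearest] = [0.0] * len(policy)
        sums[nearest] = [s + q for s, q in zip(sums[nearest], policy)]
        counts[nearest] += 1

    # Only proxy states and averaged policies leave the agent, not raw states.
    return {
        p: [s / counts[p] for s in sums[p]]
        for p in proxy_states
        if counts[p] > 0
    }
```

In this sketch the exchanged object contains no actual visited state, only the shared proxy states and averaged action distributions, which is the privacy argument the abstract makes.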

preprint · 2020 · arXiv

GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning

When the data is distributed across multiple servers, lowering the communication cost between the servers (or workers) while solving the distributed learning problem is an important challenge and is the focus of this paper. In particular, we propose a fast and communication-efficient decentralized framework to solve the distributed machine learning (DML) problem. The proposed algorithm, Group Alternating Direction Method of Multipliers (GADMM), is based on the Alternating Direction Method of Multipliers (ADMM) framework. The key novelty in GADMM is that it solves the problem in a decentralized topology where at most half of the workers are competing for the limited communication resources at any given time. Moreover, each worker exchanges the locally trained model only with two neighboring workers, thereby training a global model with a lower amount of communication overhead in each exchange. We prove that GADMM converges to the optimal solution for convex loss functions, and numerically show that it converges faster and is more communication-efficient than state-of-the-art communication-efficient algorithms such as the Lazily Aggregated Gradient (LAG) and dual averaging, in linear and logistic regression tasks on synthetic and real datasets. Furthermore, we propose Dynamic GADMM (D-GADMM), a variant of GADMM, and prove its convergence under the time-varying network topology of the workers.
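The communication pattern described in this abstract can be pictured with a small Python sketch. It places an even number of workers on a chain, splits them into two alternating groups, and lists which neighbors each worker would exchange models with. The chain layout, the "heads"/"tails" names, and all function names are assumptions made for illustration; the sketch covers only the schedule, not the ADMM updates themselves.

```python
def split_groups(num_workers):
    """Split workers 0..num_workers-1 on a chain into two alternating groups."""
    if num_workers % 2 != 0:
        raise ValueError("this sketch assumes an even number of workers")
    heads = [i for i in range(num_workers) if i % 2 == 0]
    tails = [i for i in range(num_workers) if i % 2 == 1]
    return heads, tails


def neighbors(worker, num_workers):
    """Return the (at most two) chain neighbors of a worker."""
    return [j for j in (worker - 1, worker + 1) if 0 <= j < num_workers]


def schedule(num_workers, num_rounds):
    """Alternate the transmitting group within every round (hypothetical schedule).

    Each entry maps the workers that transmit in that slot to the neighbors
    they send their local model to.
    """
    heads, tails = split_groups(num_workers)
    plan = []
    for _ in range(num_rounds):
        for group in (heads, tails):
            plan.append({w: neighbors(w, num_workers) for w in group})
    return plan


plan = schedule(6, 1)
```

In every slot only one group transmits, so no more than half of the workers contend for the channel, and every recipient belongs to the other group.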

preprint2020arXiv

Integrating LEO Satellite and UAV Relaying via Reinforcement Learning for Non-Terrestrial Networks

A mega-constellation of low-earth orbit (LEO) satellites has the potential to enable long-range communication with low latency. Integrating this with burgeoning unmanned aerial vehicle (UAV) assisted non-terrestrial networks will be a disruptive solution for beyond 5G systems provisioning large scale three-dimensional connectivity. In this article, we study the problem of forwarding packets between two faraway ground terminals, through an LEO satellite selected from an orbiting constellation and a mobile high-altitude platform (HAP) such as a fixed-wing UAV. To maximize the end-to-end data rate, the satellite association and HAP location should be optimized, which is challenging due to a huge number of orbiting satellites and the resulting time-varying network topology. We tackle this problem using deep reinforcement learning (DRL) with a novel action dimension reduction technique. Simulation results corroborate that our proposed method achieves up to 5.74x higher average data rate compared to a direct communication baseline without SAT and HAP.

preprint2020arXiv

L-FGADMM: Layer-Wise Federated Group ADMM for Communication Efficient Decentralized Deep Learning

This article proposes a communication-efficient decentralized deep learning algorithm, coined layer-wise federated group ADMM (L-FGADMM). To minimize an empirical risk, every worker in L-FGADMM periodically communicates with two neighbors, where the periods are separately adjusted for different layers of its deep neural network. A constrained optimization problem for this setting is formulated and solved using the stochastic version of GADMM proposed in our prior work. Numerical evaluations show that by less frequently exchanging the largest layer, L-FGADMM can significantly reduce the communication cost without compromising the convergence speed. Surprisingly, despite less exchanged information and decentralized operations, intermittently skipping the largest layer consensus in L-FGADMM creates a regularizing effect, thereby achieving test accuracy as high as that of federated learning (FL), a baseline method with the entire layer consensus with the aid of a central entity.

preprint2020arXiv

Mix2FLD: Downlink Federated Learning After Uplink Federated Distillation With Two-Way Mixup

This letter proposes a novel communication-efficient and privacy-preserving distributed machine learning framework, coined Mix2FLD. To address uplink-downlink capacity asymmetry, local model outputs are uploaded to a server in the uplink as in federated distillation (FD), whereas global model parameters are downloaded in the downlink as in federated learning (FL). This requires a model output-to-parameter conversion at the server, after collecting additional data samples from devices. To preserve privacy while not compromising accuracy, linearly mixed-up local samples are uploaded, and inversely mixed up across different devices at the server. Numerical evaluations show that Mix2FLD achieves up to 16.7% higher test accuracy while reducing convergence time by up to 18.8% under asymmetric uplink-downlink channels compared to FL.

preprint2020arXiv

Predictive Control and Communication Co-Design: A Gaussian Process Regression Approach

While remote control over wireless connections is a key enabler for scalable control systems consisting of multiple actuator-sensor pairs, i.e., control systems, it entails two technical challenges. Due to the lack of wireless resources, only a limited number of control systems can be served, making the state observations outdated. Further, even after scheduling, the state observations received through wireless channels are distorted, hampering control stability. To address these issues, in this article we propose a scheduling algorithm that guarantees the age-of-information (AoI) of the last received states. Meanwhile, for non-scheduled sensor-actuator pairs, we propose a machine learning (ML) aided predictive control algorithm, in which states are predicted using a Gaussian process regression (GPR). Since the GPR prediction credibility decreases with the AoI of the input data, both the predictive control and the AoI-based scheduler should be co-designed. Hence, we formulate a joint scheduling and transmission power optimization via the Lyapunov optimization framework. Numerical simulations corroborate that the proposed co-designed predictive control and AoI-based scheduling achieves lower control errors, compared to a benchmark scheme using a round-robin scheduler without state prediction.
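The statement that GPR prediction credibility decreases with the AoI can be illustrated with a minimal Python sketch. It computes the Gaussian process posterior variance of the current state given a single noisy observation that is `age` time units old. The squared-exponential kernel, the unit prior variance, and the `length_scale` and `noise_var` values are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def rbf_kernel(tau, length_scale=1.0):
    """Squared-exponential kernel between two time instants tau apart."""
    return math.exp(-(tau ** 2) / (2.0 * length_scale ** 2))


def posterior_variance(age, length_scale=1.0, noise_var=0.1):
    """GP posterior variance of the current state given one noisy
    observation that is `age` time units old (unit prior variance)."""
    k_prior = rbf_kernel(0.0, length_scale)   # prior variance of the state
    k_cross = rbf_kernel(age, length_scale)   # covariance with the old sample
    k_obs = k_prior + noise_var               # variance of the noisy sample
    return k_prior - k_cross ** 2 / k_obs


# Uncertainty of the prediction for observations aged 0, 1, 2, 3 and 4 units.
variances = [posterior_variance(age) for age in range(5)]
```

Under these assumptions the variance grows monotonically with the age of the observation and approaches the prior variance, which is why an AoI-aware scheduler and the predictor benefit from being designed together.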

preprint2020arXiv

Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning

Traditional distributed deep reinforcement learning (RL) commonly relies on exchanging the experience replay memory (RM) of each agent. Since the RM contains all state observations and action policy history, it may incur huge communication overhead while violating the privacy of each agent. Alternatively, this article presents a communication-efficient and privacy-preserving distributed RL framework, coined federated reinforcement distillation (FRD). In FRD, each agent exchanges its proxy experience replay memory (ProxRM), in which policies are locally averaged with respect to proxy states clustering actual states. To provide FRD design insights, we present ablation studies on the impact of ProxRM structures, neural network architectures, and communication intervals. Furthermore, we propose an improved version of FRD, coined mixup augmented FRD (MixFRD), in which ProxRM is interpolated using the mixup data augmentation algorithm. Simulations in a Cartpole environment validate the effectiveness of MixFRD in reducing the variance of mission completion time and communication cost, compared to the benchmark schemes, vanilla FRD, federated reinforcement learning (FRL), and policy distillation (PD).
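One way to picture the proxy experience replay memory is the Python sketch below. It maps each observed state to its nearest proxy state and averages the policies that land on the same proxy state. The nearest-neighbor assignment, the data layout, and the function names are assumptions for illustration; the abstract does not specify how actual states are clustered.

```python
from collections import defaultdict


def nearest_proxy(state, proxy_states):
    """Index of the proxy state closest to `state` (squared Euclidean distance)."""
    return min(
        range(len(proxy_states)),
        key=lambda i: sum((s - p) ** 2 for s, p in zip(state, proxy_states[i])),
    )


def build_proxy_memory(experiences, proxy_states):
    """Average the policies of all experiences that map to the same proxy state.

    `experiences` is a list of (state, policy) pairs, where a policy is a list
    of action probabilities. Returns {proxy_index: averaged_policy}.
    """
    buckets = defaultdict(list)
    for state, policy in experiences:
        buckets[nearest_proxy(state, proxy_states)].append(policy)
    return {
        idx: [sum(col) / len(policies) for col in zip(*policies)]
        for idx, policies in buckets.items()
    }


# Hypothetical one-dimensional states with two actions.
proxy_states = [(0.0,), (1.0,)]
experiences = [
    ((0.1,), [0.8, 0.2]),
    ((0.2,), [0.6, 0.4]),
    ((0.9,), [0.1, 0.9]),
]
memory = build_proxy_memory(experiences, proxy_states)
```

Only `memory` would be exchanged in this sketch, so the raw states stay on the agent and the payload size depends on the number of proxy states rather than on the length of the experience history.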

preprint2020arXiv

Towards Enabling Critical mMTC: A Review of URLLC within mMTC

Massive machine-type communication (mMTC) and ultra-reliable and low-latency communication (URLLC) are two key service types in the fifth-generation (5G) communication systems, pursuing scalability and reliability with low latency, respectively. These two extreme services are envisaged to agglomerate together into critical mMTC shortly with emerging use cases (e.g., wide-area disaster monitoring, wireless factory automation), creating new challenges to designing wireless systems beyond 5G. While conventional network slicing is effective in supporting a simple mixture of mMTC and URLLC, it is difficult to simultaneously guarantee the reliability, latency, and scalability requirements of critical mMTC (e.g., < 4ms latency, $10^6$ devices/km$^2$ for factory automation) with limited radio resources. Furthermore, recently proposed solutions to scalable URLLC (e.g., machine learning aided URLLC for driverless vehicles) are ill-suited to critical mMTC, whose machine-type users have minimal energy budget and computing capability that should be (tightly) optimized for given tasks. To this end, our paper aims to characterize promising use cases of critical mMTC and search for their possible solutions. We first review the state-of-the-art (SOTA) technologies for separate mMTC and URLLC services and then identify key challenges from conflicting SOTA requirements, followed by potential approaches to prospective critical mMTC solutions at different layers.

preprint2020arXiv

When Wireless Communications Meet Computer Vision in Beyond 5G

This article articulates the emerging paradigm, sitting at the confluence of computer vision and wireless communication, to enable beyond-5G/6G mission-critical applications (autonomous/remote-controlled vehicles, visuo-haptic VR, and other cyber-physical applications). First, drawing on recent advances in machine learning and the availability of non-RF data, vision-aided wireless networks are shown to significantly enhance the reliability of wireless communication without sacrificing spectral efficiency. In particular, we demonstrate how computer vision enables look-ahead prediction in a millimeter-wave channel blockage scenario, before the blockage actually happens. From a computer vision perspective, we highlight how radio frequency (RF) based sensing and imaging are instrumental in robustifying computer vision applications against occlusion and failure. This is corroborated via an RF-based image reconstruction use case, showcasing a receiver-side image failure correction resulting in reduced retransmission and latency. Taken together, this article sheds light on the much-needed convergence of RF and non-RF modalities to enable ultra-reliable communication and truly intelligent 6G networks.

preprint2020arXiv

XOR Mixup: Privacy-Preserving Data Augmentation for One-Shot Federated Learning

User-generated data distributions are often imbalanced across devices and labels, hampering the performance of federated learning (FL). To remedy this non-independent and identically distributed (non-IID) data problem, in this work we develop a privacy-preserving XOR based mixup data augmentation technique, coined XorMixup, and thereby propose a novel one-shot FL framework, termed XorMixFL. The core idea is to collect other devices' encoded data samples that are decoded only using each device's own data samples. The decoding provides synthetic-but-realistic samples until inducing an IID dataset, used for model training. Both encoding and decoding procedures follow the bit-wise XOR operations that intentionally distort raw samples, thereby preserving data privacy. Simulation results corroborate that XorMixFL achieves up to 17.6% higher accuracy than Vanilla FL under a non-IID MNIST dataset.
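The bit-wise XOR encoding and decoding mentioned in this abstract can be illustrated with a minimal Python sketch. A sender XORs one sample with a second one before sharing it, and a receiver decodes with a similar sample of its own. The 4-byte "samples", the variable names, and the choice of which samples are paired are hypothetical; the sketch shows only the XOR property, not the full XorMixFL procedure.

```python
def xor_bytes(a, b):
    """Bit-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical 4-pixel samples held by two devices.
sender_sample = bytes([12, 200, 7, 255])
sender_key_sample = bytes([90, 17, 33, 4])   # second sender sample used to encode
receiver_sample = bytes([91, 17, 32, 4])     # receiver's own, similar sample

encoded = xor_bytes(sender_sample, sender_key_sample)  # what leaves the device
decoded = xor_bytes(encoded, receiver_sample)          # synthetic sample
```

Decoding with the exact encoding sample would restore `sender_sample`. Because the receiver only holds a similar sample, `decoded` differs from the original wherever the two decoding samples differ, which is the intentional distortion the abstract refers to.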