Researcher profile

Mikael Skoglund

Mikael Skoglund contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
32works
0followers
14topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

32 published item(s)

preprint2026arXiv

A Hierarchical Sampling Framework for bounding the Generalization Error of Federated Learning

We study expected generalization bounds for the Hierarchical Federated Learning (HFL) setup using Wasserstein distance. We introduce a generalized framework in which data is sampled hierarchically, and we model it with a multi-layered tree structure that induces dependencies among the clients' datasets. We derive generalization bounds in terms of Wasserstein distance under the Lipschitz assumption on the loss function, by applying a supersample construction that allows us to measure the sensitivity of the algorithm to the change of a single node in the sampling tree. By leveraging the FL structure, we recover and strictly imply existing state-of-the-art conditional mutual information (CMI) bounds in the case of bounded losses. We also show that our bound can be applied together with Differential Privacy assumptions, to recover generalization bounds based on algorithmic privacy. To assess the tightness of our bounds, we study the Gaussian Location Model (GLM) and show that we recover the actual asymptotic rate of the generalization error.

preprint2026arXiv

Dobrushin Coefficients of Private Mechanisms Beyond Local Differential Privacy

We investigate Dobrushin coefficients of discrete Markov kernels that have bounded pointwise maximal leakage (PML) with respect to all distributions with a minimum probability mass bounded away from zero by a constant $c>0$. This definition recovers local differential privacy (LDP) for $c\to 0$. We derive achievable bounds on contraction in terms of a kernels PML guarantees, and provide mechanism constructions that achieve the presented bounds. Further, we extend the results to general $f$-divergences by an application of Binette's inequality. Our analysis yields tighter bounds for mechanisms satisfying LDP and extends beyond the LDP regime to any discrete kernel.

preprint2026arXiv

Generalizing the Fano inequality further

Interactive statistical decision making (ISDM) features algorithm-dependent data generated through interaction. Existing information-theoretic lower bounds in ISDM largely target expected risk, while tail-sensitive objectives are less developed. We generalize the interactive Fano framework of Chen et al. by replacing the hard success event with a randomized one-bit statistic representing an arbitrary bounded transform of the loss. This yields a Bernoulli f-divergence inequality, which we invert to obtain a two-sided interval for the transform, recovering the previous result as a special case. Instantiating the transform with a bounded hinge and using the Rockafellar-Uryasev representation, we derive lower bounds on the prior-predictive (Bayesian) CVaR of bounded losses. For KL divergence with the mixture reference distribution, the bound becomes explicit in terms of mutual information via Pinsker's inequality.

preprint2026arXiv

Land-then-transport: A Flow Matching-Based Generative Decoder for Wireless Image Transmission

Due to strict rate and reliability demands, wireless image transmission remains difficult for both classical layered designs and joint source-channel coding (JSCC), especially under low latency. Diffusion-based generative decoders can deliver strong perceptual quality by leveraging learned image priors, but iterative stochastic denoising leads to high decoding delay. To enable low-latency decoding, we propose a flow-matching (FM) generative decoder under a new land-then-transport (LTT) paradigm that tightly integrates the physical wireless channel into a continuous-time probability flow. For AWGN channels, we build a Gaussian smoothing path whose noise schedule indexes effective noise levels, and derive a closed-form teacher velocity field along this path. A neural-network student vector field is trained by conditional flow matching, yielding a deterministic, channel-aware ODE decoder with complexity linear in the number of ODE steps. At inference, it only needs an estimate of the effective noise variance to set the ODE starting time. We further show that Rayleigh fading and MIMO channels can be mapped, via linear MMSE equalization and singular-value-domain processing, to AWGN-equivalent channels with calibrated starting times. Therefore, the same probability path and trained velocity field can be reused for Rayleigh and MIMO without retraining. Experiments on MNIST, Fashion-MNIST, and DIV2K over AWGN, Rayleigh, and MIMO demonstrate consistent gains over JPEG2000+LDPC, DeepJSCC, and diffusion-based baselines, while achieving good perceptual quality with only a few ODE steps. Overall, LTT provides a deterministic, physically interpretable, and computation-efficient framework for generative wireless image decoding across diverse channels.

preprint2026arXiv

On the Impact of Channel Aging and Doppler-Affected Clutter on OFDM ISAC Systems

The temporal evolution of the propagation environment plays a central role in integrated sensing and communication (ISAC) systems. A slow-time evolution manifests as channel aging in communication links, while a fast-time one is associated with structured clutter with non-zero Doppler. Nevertheless, the joint impact of these two phenomena on ISAC performance has been largely overlooked. This addresses this research gap in a network utilizing orthogonal frequency division multiplexing waveforms. Here, a base station simultaneously serves multiple user equipment (UE) devices and performs monostatic sensing. Channel aging is captured through an autoregressive model with exponential correlation decay. In contrast, clutter is modeled as a collection of uncorrelated, coherent patches with non-zero Doppler, resulting in a Kronecker-separable covariance structure. We propose an aging-aware channel estimator that uses prior pilot observations to estimate the time-varying UE channels, characterized by a non-isotropic multipath fading structure. The clutter's structure enables a novel low-complexity sensing pipeline: clutter statistics are estimated from raw data and subsequently used to suppress the clutter's action, after which target parameters are extracted through range-angle and range-velocity maps. We evaluate the influence of frame length and pilot history on channel estimation accuracy and demonstrate substantial performance gains over block fading in low-to-moderate mobility regimes. The sensing pipeline is implemented in a clutter-dominated environment, demonstrating that effective clutter suppression can be achieved under practical configurations. Furthermore, our results show that dedicated sensing streams are required, as communication beams provide insufficient range resolution.

preprint2026arXiv

Privacy-Utility Trade-offs Under Multi-Level Point-Wise Leakage Constraints

An information-theoretic privacy mechanism design is studied, where an agent observes useful data $Y$ which is correlated with the private data $X$. The agent wants to reveal the information to a user, hence, the agent utilizes a privacy mechanism to produce disclosed data $U$ that can be revealed. We assume that the agent has no direct access to $X$, i.e., the private data is hidden. We study privacy mechanism design that maximizes the disclosed information about $Y$, measured by the mutual information between $Y$ and $U$, while satisfying a point-wise constraint with different privacy leakage budgets. We introduce a new measure, called the \emph{multi-level point-wise leakage}, which allows us to impose different leakage levels for different realizations of $U$. In contrast to previous studies on point-wise measures, which use the same leakage level for each realization, we consider a more general scenario in which each data point can leak information up to a different threshold. As a result, this concept also covers cases in which some data points should not leak any information about the private data, i.e., they must satisfy perfect privacy. In other words, a combination of perfect privacy and non-zero leakage can be considered. When the leakage is sufficiently small, concepts from information geometry allow us to locally approximate the mutual information. We show that when the leakage matrix $P_{X|Y}$ is invertible, utilizing this approximation leads to a quadratic optimization problem that has closed-form solution under some constraints. In particular, we show that it is sufficient to consider only binary $U$ to attain the optimal utility. This leads to simple privacy designs with low complexity which are based on finding the maximum singular value and singular vector of a matrix.

preprint2026arXiv

Sparse Point-wise Privacy Leakage: Mechanism Design and Fundamental Limits

We study an information-theoretic privacy mechanism design problem, where an agent observes useful data $Y$ that is arbitrarily correlated with sensitive data $X$, and design disclosed data $U$ generated from $Y$ (the agent has no direct access to $X$). We introduce \emph{sparse point-wise privacy leakage}, a worst-case privacy criterion that enforces two simultaneous constraints for every disclosed symbol $u\in\mathcal{U}$: (i) $u$ may be correlated with at most $N$ realizations of $X$, and (ii) the total leakage toward those realizations is bounded. In the high-privacy regime, we use concepts from information geometry to obtain a local quadratic approximation of mutual information which measures utility between $U$ and $Y$. When the leakage matrix $P_{X|Y}$ is invertible, this approximation reduces the design problem to a sparse quadratic maximization, known as the Rayleigh-quotient problem, with an $\ell_0$ constraint. We further show that, for the approximated problem, one can without loss of optimality restrict attention to a binary released variable $U$ with a uniform distribution. For small alphabet sizes, the exact sparsity-constrained optimum can be computed via combinatorial support enumeration, which quickly becomes intractable as the dimension grows. For general dimensions, the resulting sparse Rayleigh-quotient maximization is NP-hard and closely related to sparse principal component analysis (PCA). We propose a convex semidefinite programming (SDP) relaxation that is solvable in polynomial time and provides a tractable surrogate for the NP-hard design, together with a simple rounding procedure to recover a feasible leakage direction. We also identify a sparsity threshold beyond which the sparse optimum saturates at the unconstrained spectral value and the SDP relaxation becomes tight.

preprint2023arXiv

Bounds for Privacy-Utility Trade-off with Non-zero Leakage

The design of privacy mechanisms for two scenarios is studied where the private data is hidden or observable. In the first scenario, an agent observes useful data $Y$, which is correlated with private data $X$, and wants to disclose the useful information to a user. A privacy mechanism is employed to generate data $U$ that maximizes the revealed information about $Y$ while satisfying a privacy criterion. In the second scenario, the agent has additionally access to the private data. To this end, the Functional Representation Lemma and Strong Functional Representation Lemma are extended relaxing the independence condition and thereby allowing a certain leakage. Lower bounds on privacy-utility trade-off are derived for the second scenario as well as upper bounds for both scenarios. In particular, for the case where no leakage is allowed, our upper and lower bounds improve previous bounds.

preprint2022arXiv

An Information-Theoretic Analysis of Bayesian Reinforcement Learning

Building on the framework introduced by Xu and Raginksy [1] for supervised learning problems, we study the best achievable performance for model-based Bayesian reinforcement learning problems. With this purpose, we define minimum Bayesian regret (MBR) as the difference between the maximum expected cumulative reward obtainable either by learning from the collected data or by knowing the environment and its dynamics. We specialize this definition to reinforcement learning problems modeled as Markov decision processes (MDPs) whose kernel parameters are unknown to the agent and whose uncertainty is expressed by a prior distribution. One method for deriving upper bounds on the MBR is presented and specific bounds based on the relative entropy and the Wasserstein distance are given. We then focus on two particular cases of MDPs, the multi-armed bandit problem (MAB) and the online optimization with partial feedback problem. For the latter problem, we show that our bounds can recover from below the current information-theoretic bounds by Russo and Van Roy [2].

preprint2022arXiv

Asynchronous Parallel Incremental Block-Coordinate Descent for Decentralized Machine Learning

Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing. For fast-increasing applications and data amounts, distributed learning is a promising emerging paradigm since it is often impractical or inefficient to share/aggregate data to a centralized location from distinct ones. This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices and the learning algorithm run on-device, with the aim of relaxing the burden at a central entity/server. Although gossip-based approaches have been used for this purpose in different use cases, they suffer from high communication costs, especially when the number of devices is large. To mitigate this, incremental-based methods are proposed. We first introduce incremental block-coordinate descent (I-BCD) for the decentralized ML, which can reduce communication costs at the expense of running time. To accelerate the convergence speed, an asynchronous parallel incremental BCD (API-BCD) method is proposed, where multiple devices/agents are active in an asynchronous fashion. We derive convergence properties for the proposed methods. Simulation results also show that our API-BCD method outperforms state of the art in terms of running time and communication costs.

preprint2022arXiv

Bounds for Privacy-Utility Trade-off with Per-letter Privacy Constraints and Non-zero Leakage

An information theoretic privacy mechanism design problem for two scenarios is studied where the private data is either hidden or observable. In each scenario, privacy leakage constraints are considered using two different measures. In these scenarios the private data is hidden or observable. In the first scenario, an agent observes useful data $Y$ that is correlated with private data $X$, and wishes to disclose the useful information to a user. A privacy mechanism is designed to generate disclosed data $U$ which maximizes the revealed information about $Y$ while satisfying a per-letter privacy constraint. In the second scenario, the agent has additionally access to the private data. First, the Functional Representation Lemma and Strong Functional Representation Lemma are extended by relaxing the independence condition to find a lower bound considering the second scenario. Next, lower bounds as well as upper bounds on privacy-utility trade-off are derived for both scenarios. In particular, for the case where $X$ is deterministic function of $Y$, we show that our upper and lower bounds are asymptotically optimal considering the first scenario.

preprint2022arXiv

Continuous-Time Channel Gain Control for Minimum-Information Kalman-Bucy Filtering

We consider the problem of estimating a continuous-time Gauss-Markov source process observed through a vector Gaussian channel with an adjustable channel gain matrix. For a given (generally time-varying) channel gain matrix, we provide formulas to compute (i) the mean-square estimation error attainable by the classical Kalman-Bucy filter, and (ii) the mutual information between the source process and its Kalman-Bucy estimate. We then formulate a novel "optimal channel gain control problem" where the objective is to control the channel gain matrix strategically to minimize the weighted sum of these two performance metrics. To develop insights into the optimal solution, we first consider the problem of controlling a time-varying channel gain over a finite time interval. A necessary optimality condition is derived based on Pontryagin's minimum principle. For a scalar system, we show that the optimal channel gain is a piece-wise constant signal with at most two switches. We also consider the problem of designing the optimal time-invariant gain to minimize the average cost over an infinite time horizon. A novel semidefinite programming (SDP) heuristic is proposed and the exactness of the solution is discussed.

preprint2022arXiv

Goodput Maximization with Quantized Feedback in the Finite Blocklength Regime for Quasi-Static Channels

In this paper, we study a quantized feedback scheme to maximize the goodput of a finite blocklength communication scenario over a quasi-static fading channel. It is assumed that the receiver has perfect channel state information (CSI) and sends back the CSI to the transmitter over a resolution-limited error-free feedback channel. With this partial CSI, the transmitter is supposed to select the optimum transmission rate, such that it maximizes the overall goodput of the communication system. This problem has been studied for the asymptotic blocklength regime, however, no solution has so far been presented for finite blocklength. Here, we study this problem in two cases: with and without constraint on reliability. We first formulate the optimization problems and analytically solve them. Iterative algorithms that successfully exploit the system parameters for both cases are presented. It is shown that although the achievable maximum goodput decreases with shorter blocklengths and higher reliability requirements, significant improvement can be achieved even with coarsely quantized feedback schemes.

preprint2022arXiv

Privacy signaling games with binary alphabets

In this paper, we consider a privacy signaling game problem for binary alphabets and single-bit transmission where a transmitter has a pair of messages, one of which is a casual message that needs to be conveyed, whereas the other message contains sensitive data and needs to be protected. The receiver wishes to estimate both messages to acquire as much information as possible. For this setup, we study the interactions between the transmitter and the receiver with non-aligned information-theoretic objectives (modeled by mutual information and hamming distance) due to the privacy concerns of the transmitter. We derive conditions under which Nash and/or Stackelberg equilibria exist and identify the optimal responses of the encoder and decoders strategies for each type of game. One particularly surprising result is that when both types of equilibria exist, they admit the same encoding and decoding strategies. We corroborate our analysis with simulation studies.

preprint2022arXiv

Secure Source Coding with Side-information at Decoder and Shared Key at Encoder and Decoder

We study the problem of rate-distortion-equivocation with side-information only available at the decoder when an independent private random key is shared between the sender and the receiver. The sender compresses the sequence, and the receiver reconstructs it such that the average distortion between the source and the output is limited. The equivocation is measured at an eavesdropper that intercepts the source encoded message, utilizing side-information correlated with the source and the side-information at the decoder. We have derived the entire achievable rate-distortion-equivocation region for this problem.

preprint2022arXiv

Tighter expected generalization error bounds via Wasserstein distance

This work presents several expected generalization error bounds based on the Wasserstein distance. More specifically, it introduces full-dataset, single-letter, and random-subset bounds, and their analogues in the randomized subsample setting from Steinke and Zakynthinou [1]. Moreover, when the loss function is bounded and the geometry of the space is ignored by the choice of the metric in the Wasserstein distance, these bounds recover from below (and thus, are tighter than) current bounds based on the relative entropy. In particular, they generate new, non-vacuous bounds based on the relative entropy. Therefore, these results can be seen as a bridge between works that account for the geometry of the hypothesis space and those based on the relative entropy, which is agnostic to such geometry. Furthermore, it is shown how to produce various new bounds based on different information measures (e.g., the lautum information or several $f$-divergences) based on these bounds and how to derive similar bounds with respect to the backward channel using the presented proof techniques.

preprint2021arXiv

A Learning-Based Approach to Address Complexity-Reliability Tradeoff in OS Decoders

In this paper, we study the tradeoffs between complexity and reliability for decoding large linear block codes. We show that using artificial neural networks to predict the required order of an ordered statistics based decoder helps in reducing the average complexity and hence the latency of the decoder. We numerically validate the approach through Monte Carlo simulations.

preprint2021arXiv

Latency and Reliability Trade-off with Computational Complexity Constraints: OS Decoders and Generalizations

In this paper, we study the problem of latency and reliability trade-off in ultra-reliable low-latency communication (URLLC) in the presence of decoding complexity constraints. We consider linear block encoded codewords transmitted over a binary-input AWGN channel and decoded with order-statistic (OS) decoder. We first investigate the performance of OS decoders as a function of decoding complexity and propose an empirical model that accurately quantifies the corresponding trade-off. Next, a consistent way to compute the aggregate latency for complexity constrained receivers is presented, where the latency due to decoding is also included. It is shown that, with strict latency requirements, decoding latency cannot be neglected in complexity constrained receivers. Next, based on the proposed model, several optimization problems, relevant to the design of URLLC systems, are introduced and solved. It is shown that the decoding time has a drastic effect on the design of URLLC systems when constraints on decoding complexity are considered. Finally, it is also illustrated that the proposed model can closely describe the performance versus complexity trade-off for other candidate coding solutions for URLLC such as tail-biting convolutional codes, polar codes, and low-density parity-check codes.

preprint2021arXiv

Quadratic Signaling Games with Channel Combining Ratio

In this study, Nash and Stackelberg equilibria of single-stage and multi-stage quadratic signaling games between an encoder and a decoder are investigated. In the considered setup, the objective functions of the encoder and the decoder are misaligned, there is a noisy channel between the encoder and the decoder, the encoder has a soft power constraint, and the decoder has also noisy observation of the source to be estimated. We show that there exist only linear encoding and decoding strategies at the Stackelberg equilibrium, and derive the equilibrium strategies and costs. Regarding the Nash equilibrium, we explicitly characterize affine equilibria for the single-stage setup and show that the optimal encoder (resp. decoder) is affine for an affine decoder (resp. encoder) for the multi-stage setup. For the decoder side, between the information coming from the encoder and noisy observation of the source, our results describe what should be the combining ratio of these two channels. Regarding the encoder, we derive the conditions under which it is meaningful to transmit a message.

preprint2020arXiv

A Hybrid Model-based and Data-driven Approach to Spectrum Sharing in mmWave Cellular Networks

Inter-operator spectrum sharing in millimeter-wave bands has the potential of substantially increasing the spectrum utilization and providing a larger bandwidth to individual user equipment at the expense of increasing inter-operator interference. Unfortunately, traditional model-based spectrum sharing schemes make idealistic assumptions about inter-operator coordination mechanisms in terms of latency and protocol overhead, while being sensitive to missing channel state information. In this paper, we propose hybrid model-based and data-driven multi-operator spectrum sharing mechanisms, which incorporate model-based beamforming and user association complemented by data-driven model refinements. Our solution has the same computational complexity as a model-based approach but has the major advantage of having substantially less signaling overhead. We discuss how limited channel state information and quantized codebook-based beamforming affect the learning and the spectrum sharing performance. We show that the proposed hybrid sharing scheme significantly improves spectrum utilization under realistic assumptions on inter-operator coordination and channel state information acquisition.

preprint2020arXiv

Asynchronous Decentralized Learning of a Neural Network

In this work, we exploit an asynchronous computing framework namely ARock to learn a deep neural network called self-size estimating feedforward neural network (SSFN) in a decentralized scenario. Using this algorithm namely asynchronous decentralized SSFN (dSSFN), we provide the centralized equivalent solution under certain technical assumptions. Asynchronous dSSFN relaxes the communication bottleneck by allowing one node activation and one side communication, which reduces the communication overhead significantly, consequently increasing the learning speed. We compare asynchronous dSSFN with traditional synchronous dSSFN in the experimental results, which shows the competitive performance of asynchronous dSSFN, especially when the communication network is sparse.

preprint2020arXiv

Conditional Mutual Information Neural Estimator

Several recent works in communication systems have proposed to leverage the power of neural networks in the design of encoders and decoders. In this approach, these blocks can be tailored to maximize the transmission rate based on aggregated samples from the channel. Motivated by the fact that, in many communication schemes, the achievable transmission rate is determined by a conditional mutual information term, this paper focuses on neural-based estimators for this information-theoretic quantity. Our results are based on variational bounds for the KL-divergence and, in contrast to some previous works, we provide a mathematically rigorous lower bound. However, additional challenges with respect to the unconditional mutual information emerge due to the presence of a conditional density function which we address here.

preprint2020arXiv

Decentralized Beamforming Design for Intelligent Reflecting Surface-enhanced Cell-free Networks

Cell-free networks are considered as a promising distributed network architecture to satisfy the increasing number of users and high rate expectations in beyond-5G systems. However, to further enhance network capacity, an increasing number of high-cost base stations (BSs) are required. To address this problem and inspired by the cost-effective intelligent reflecting surface (IRS) technique, we propose a fully decentralized design framework for cooperative beamforming in IRS-aided cell-free networks. We first transform the centralized weighted sum-rate maximization problem into a tractable consensus optimization problem, and then an incremental alternating direction method of multipliers (ADMM) algorithm is proposed to locally update the beamformer. The complexity and convergence of the proposed method are analyzed, and these results show that the performance of the new scheme can asymptotically approach that of the centralized one as the number of iterations increases. Results also show that IRSs can significantly increase the system sum-rate of cell-free networks and the proposed method outperforms existing decentralized methods.

preprint2020arXiv

Detecting State Transitions of a Markov Source: Sampling Frequency and Age Trade-off

We consider a finite-state Discrete-Time Markov Chain (DTMC) source that can be sampled for detecting the events when the DTMC transits to a new state. Our goal is to study the trade-off between sampling frequency and staleness in detecting the events. We argue that, for the problem at hand, using Age of Information (AoI) for quantifying the staleness of a sample is conservative and therefore, introduce \textit{age penalty} for this purpose. We study two optimization problems: minimize average age penalty subject to an average sampling frequency constraint, and minimize average sampling frequency subject to an average age penalty constraint; both are Constrained Markov Decision Problems. We solve them using linear programming approach and compute Markov policies that are optimal among all causal policies. Our numerical results demonstrate that the computed Markov policies not only outperform optimal periodic sampling policies, but also achieve sampling frequencies close to or lower than that of an optimal clairvoyant (non-causal) sampling policy, if a small age penalty is allowed.

preprint2020arXiv

Empirical Coordination with Multiple Descriptions

We extend the framework of empirical coordination to a distributed setup where for a given action by nature, multiple descriptions of the action of the decoder are available. We adopt the coding strategy applied by El Gamal and Cover in \cite{gamal:1982} to get a lower bound of the coordination region. Then, we improve this region by applying the coding scheme applied by Zhang and Berger in \cite{zhang:1987}.

preprint2020arXiv

Remote Empirical Coordination

We apply the framework of imperfect empirical coordination to a two-node setup where the action $X$ of the first node is not observed directly but via $L$ agents who observe independently impaired measurements $\hat X$ of the action. These $L$ agents, using a rate-limited communication that is available to all of them, help the second node to generate the action $Y$ in order to establish the desired coordinated behaviour. When $L<\infty$, we prove that it suffices $R_i\geq I\left(\hat X;\hat{Y}\right)$ for at least one agent whereas for $L\longrightarrow\infty$, we show that it suffices $R_i\geq I\left(\hat X;\hat Y|X\right)$ for all agents where $\hat Y$ is a random variable such that $X-\hat X-\hat Y$ and $\|p_{X,\hat Y}\left(x,y\right)-p_{X,Y}\left(x,y\right)\|_{TV}\leq Δ$ ( $Δ$ is the pre-specified fidelity).

preprint2020arXiv

Secure Strong Coordination

We consider a network of two nodes separated by a noisy channel, in which the source and its reconstruction have to be strongly coordinated, while simultaneously satisfying the strong secrecy condition with respect to an outside observer of the noisy channel. In the case of non-causal encoding and decoding, we propose a joint source-channel coding scheme for the secure strong coordination region. Furthermore, we provide a complete characterization of the secure strong coordination region when the decoder has to reliably reconstruct the source sequence and the legitimate channel is more capable than the channel of the eavesdropper.

preprint2020arXiv

Sequential Source Coding for Stochastic Systems Subject to Finite Rate Constraints

In this paper, we revisit the sequential source coding framework to analyze fundamental performance limitations of discrete-time stochastic control systems subject to feedback data-rate constraints in finite-time horizon. The basis of our results is a new characterization of the lower bound on the minimum total-rate achieved by sequential codes subject to a total (across time) distortion constraint and a computational algorithm that allocates optimally the rate-distortion for any fixed finite-time horizon. This characterization facilitates the derivation of analytical, non-asymptotic, and finite-dimensional lower and upper bounds in two control-related scenarios. (a) A parallel time-varying Gauss-Markov process with identically distributed spatial components that is quantized and transmitted through a noiseless channel to a minimum mean-squared error (MMSE) decoder. (b) A time-varying quantized LQG closed-loop control system, with identically distributed spatial components and with a random data-rate allocation. Our non-asymptotic lower bound on the quantized LQG control problem, reveals the absolute minimum data-rates for (mean square) stability of our time-varying plant for any fixed finite time horizon. We supplement our framework with illustrative simulation experiments.

preprint2020arXiv

SSFN -- Self Size-estimating Feed-forward Network with Low Complexity, Limited Need for Human Intervention, and Consistent Behaviour across Trials

We design a self size-estimating feed-forward network (SSFN) using a joint optimization approach for estimation of number of layers, number of nodes and learning of weight matrices. The learning algorithm has a low computational complexity, preferably within few minutes using a laptop. In addition the algorithm has a limited need for human intervention to tune parameters. SSFN grows from a small-size network to a large-size network, guaranteeing a monotonically non-increasing cost with addition of nodes and layers. The learning approach uses judicious a combination of `lossless flow property&#39; of some activation functions, convex optimization and instance of random matrix. Consistent performance -- low variation across Monte-Carlo trials -- is found for inference performance (classification accuracy) and estimation of network size.

preprint2020arXiv

The Convex Information Bottleneck Lagrangian

The information bottleneck (IB) problem tackles the issue of obtaining relevant compressed representations $T$ of some random variable $X$ for the task of predicting $Y$. It is defined as a constrained optimization problem which maximizes the information the representation has about the task, $I(T;Y)$, while ensuring that a certain level of compression $r$ is achieved (i.e., $ I(X;T) \leq r$). For practical reasons, the problem is usually solved by maximizing the IB Lagrangian (i.e., $\mathcal{L}_{\text{IB}}(T;β) = I(T;Y) - βI(X;T)$) for many values of $β\in [0,1]$. Then, the curve of maximal $I(T;Y)$ for a given $I(X;T)$ is drawn and a representation with the desired predictability and compression is selected. It is known when $Y$ is a deterministic function of $X$, the IB curve cannot be explored and another Lagrangian has been proposed to tackle this problem: the squared IB Lagrangian: $\mathcal{L}_{\text{sq-IB}}(T;β_{\text{sq}})=I(T;Y)-β_{\text{sq}}I(X;T)^2$. In this paper, we (i) present a general family of Lagrangians which allow for the exploration of the IB curve in all scenarios; (ii) provide the exact one-to-one mapping between the Lagrange multiplier and the desired compression rate $r$ for known IB curve shapes; and (iii) show we can approximately obtain a specific compression level with the convex IB Lagrangian for both known and unknown IB curve shapes. This eliminates the burden of solving the optimization problem for many values of the Lagrange multiplier. That is, we prove that we can solve the original constrained problem with a single optimization.

preprint2020arXiv

Transfer-Entropy-Regularized Markov Decision Processes

We consider the framework of transfer-entropy-regularized Markov Decision Process (TERMDP) in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized. Although TERMDPs are generally formulated as nonconvex optimization problems, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations, based on which an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm is proposed. It is shown that every limit point of the sequence generated by the proposed algorithm is a stationary point of the TERMDP. Applications of TERMDPs are discussed in the context of networked control systems theory and non-equilibrium thermodynamics. The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision polices.