Source author record

Michael Gastpar

Michael Gastpar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Information Theory, math.IT, Machine Learning, math.PR, math.ST, Statistics Theory, Cryptography and Security, math.CO, math.FA, Neural and Evolutionary Computing

Catalog footprint

49 works, 10 topics, 4 close collaborators


preprint, 2026, arXiv

Contraction of Rényi Divergences for Discrete Channels: Properties and Applications

This work explores properties of Strong Data-Processing constants for Rényi Divergences. Parallels are made with the well-studied $φ$-Divergences, and it is shown that the order $α$ of Rényi Divergences dictates whether certain properties of the contraction of $φ$-Divergences are mirrored or not. In particular, we demonstrate that when $α>1$, the contraction properties can deviate quite strikingly from those of $φ$-Divergences. We also uncover specific characteristics of contraction for the $\infty$-Rényi Divergence and relate it to $\varepsilon$-Local Differential Privacy. The results are then applied to bound the speed of convergence of Markov chains, where we argue that the contraction of Rényi Divergences offers a new perspective on the contraction of $L^α$-norms commonly studied in the literature.
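
As a minimal sketch of what contraction means here (not the paper's analysis), the Python below computes the Rényi divergence of order $α$ between two finite distributions and checks the data-processing inequality after both pass through a discrete channel. The function names, the example channel (a binary symmetric channel with crossover 0.2) and the input distributions are illustrative assumptions, and the second distribution is assumed to have full support.

```python
import math


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) in nats for finite distributions.

    Assumes alpha > 0, alpha != 1, and q[i] > 0 wherever p[i] > 0.
    alpha = math.inf gives the max-divergence log max_i p[i] / q[i].
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    if alpha == math.inf:
        return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)


def push_forward(p, channel):
    """Output distribution when input p passes through row-stochastic channel[x][y]."""
    n_out = len(channel[0])
    return [sum(p[x] * channel[x][y] for x in range(len(p))) for y in range(n_out)]


if __name__ == "__main__":
    bsc = [[0.8, 0.2], [0.2, 0.8]]  # hypothetical example channel
    p, q = [0.9, 0.1], [0.5, 0.5]   # hypothetical input distributions
    for alpha in (0.5, 2.0, math.inf):
        before = renyi_divergence(p, q, alpha)
        after = renyi_divergence(push_forward(p, bsc), push_forward(q, bsc), alpha)
        print(alpha, before, after, after / before)
```

The ratio printed for a single pair of inputs is only a lower bound on the contraction constant of the channel, which is defined through a supremum over input pairs.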

preprint, 2026, arXiv

Model non-collapse: Minimax bounds for recursive discrete distribution estimation

Learning discrete distributions from i.i.d. samples is a well-understood problem. However, advances in generative machine learning prompt an interesting new, non-i.i.d. setting: after receiving a certain number of samples, an estimated distribution is fixed, and samples from this estimate are drawn and introduced into the sample corpus, undifferentiated from real samples. Subsequent generations of estimators now face contaminated environments, a scenario referred to in the machine learning literature as self-consumption. Empirically, it has been observed that models in fully synthetic self-consuming loops collapse -- their performance deteriorates with each batch of training -- but accumulating data has been shown to prevent complete degeneration. This, in turn, raises the question: What happens when fresh real samples are added at every stage? In this paper, we study the minimax loss of self-consuming discrete distribution estimation in such loops. We show that even when model collapse is consciously averted, the ratios between the minimax losses with and without source information can grow unbounded as the batch size increases. In the data accumulation setting, where all batches of samples are available for estimation, we provide minimax lower bounds and upper bounds that are order-optimal under mild conditions for the expected $\ell_2^2$ and $\ell_1$ losses at every stage. We provide conditions for regimes where there is a strict gap in the convergence rates compared to the corresponding oracle-assisted minimax loss where real and synthetic samples are differentiated, and provide examples where this gap is easily observed. We also provide a lower bound on the minimax loss in the data replacement setting, where only the latest batch of samples is available, and use it to find a lower bound for the worst-case loss for bounded estimate trajectories.
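
The data accumulation loop described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the plain empirical-frequency estimator, the batch sizes, the function names and the seed are all hypothetical choices, and the paper's minimax estimators and exact sampling model may differ.

```python
import random
from collections import Counter


def empirical(samples, alphabet_size):
    """Empirical frequency estimate over symbols 0..alphabet_size-1."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[i] / n for i in range(alphabet_size)]


def accumulate_loop(true_p, batch_real, batch_synth, stages, seed=0):
    """Self-consuming loop with data accumulation (illustrative assumption).

    At every stage the estimator sees the whole corpus, cannot tell real
    from synthetic samples, and the corpus then grows by synthetic samples
    drawn from the current estimate plus fresh real samples.
    """
    rng = random.Random(seed)
    k = len(true_p)
    symbols = list(range(k))
    corpus = rng.choices(symbols, weights=true_p, k=batch_real)
    estimates = []
    for _ in range(stages):
        estimate = empirical(corpus, k)
        estimates.append(estimate)
        corpus += rng.choices(symbols, weights=estimate, k=batch_synth)
        corpus += rng.choices(symbols, weights=true_p, k=batch_real)
    return estimates


if __name__ == "__main__":
    truth = [0.5, 0.3, 0.2]
    for stage, est in enumerate(accumulate_loop(truth, 100, 100, 5)):
        l1 = sum(abs(a - b) for a, b in zip(est, truth))
        print(stage, [round(x, 3) for x in est], round(l1, 3))
```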

preprint, 2022, arXiv

A Johnson--Lindenstrauss Framework for Randomly Initialized CNNs

How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson--Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stating that the geometry is essentially preserved. For FNNs with the ReLU activation, the angle between two inputs contracts according to a known mapping. The question for non-linear convolutional neural networks (CNNs) becomes much more intricate. To answer this question, we introduce a geometric framework. For linear CNNs, we show that the Johnson--Lindenstrauss lemma continues to hold, namely, that the angle between two inputs is preserved. For CNNs with ReLU activation, on the other hand, the behavior is richer: The angle between the outputs contracts, where the level of contraction depends on the nature of the inputs. In particular, after one layer, the geometry of natural images is essentially preserved, whereas for Gaussian correlated inputs, CNNs exhibit the same contracting behavior as FNNs with ReLU activation.

preprint, 2022, arXiv

Finite Littlestone Dimension Implies Finite Information Complexity

We prove that every online learnable class of functions of Littlestone dimension $d$ admits a learning algorithm with finite information complexity. Towards this end, we use the notion of a globally stable algorithm. Generally, the information complexity of such a globally stable algorithm is large yet finite, roughly exponential in $d$. We also show there is room for improvement; for a canonical online learnable class, indicator functions of affine subspaces of dimension $d$, the information complexity can be upper bounded logarithmically in $d$.

preprint, 2022, arXiv

From Generalisation Error to Transportation-cost Inequalities and Back

In this work, we connect the problem of bounding the expected generalisation error with transportation-cost inequalities. Exposing the underlying pattern behind both approaches we are able to generalise them and go beyond Kullback-Leibler Divergences/Mutual Information and sub-Gaussian measures. In particular, we are able to provide a result showing the equivalence between two families of inequalities: one involving functionals and one involving measures. This result generalises the one proposed by Bobkov and Götze that connects transportation-cost inequalities with concentration of measure. Moreover, it allows us to recover all standard generalisation error bounds involving mutual information and to introduce new, more general bounds, that involve arbitrary divergence measures.
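
For orientation, the two classical statements that this abstract refers to can be written as follows. This is a sketch of the standard forms from the literature, not of the paper's generalised results; the symbols $\mu$, $\nu$, $\sigma$, $W_1$, $S$, $W$ and $n$ are notation assumed here.

```latex
% Bobkov--Goetze: for a probability measure \mu on a metric space,
% the following two statements are equivalent.
% (i) A functional inequality (sub-Gaussian concentration of Lipschitz functions):
\mathbb{E}_{\mu}\!\left[ e^{\lambda (f - \mathbb{E}_{\mu} f)} \right]
  \le e^{\lambda^2 \sigma^2 / 2}
  \quad \text{for all 1-Lipschitz } f \text{ and all } \lambda \in \mathbb{R},
% (ii) A transportation-cost inequality on measures:
W_1(\nu, \mu) \le \sqrt{2 \sigma^2 \, D(\nu \,\|\, \mu)}
  \quad \text{for all } \nu \ll \mu .

% Standard mutual-information bound on the expected generalisation error
% of an algorithm with output W trained on n i.i.d. samples S,
% when the loss is \sigma-sub-Gaussian:
\left| \mathbb{E}\!\left[ \mathrm{gen}(S, W) \right] \right|
  \le \sqrt{ \frac{2 \sigma^2}{n} \, I(S; W) } .
```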

preprint, 2022, arXiv

Lower-bounds on the Bayesian Risk in Estimation Procedures via $f$-Divergences

We consider the problem of parameter estimation in a Bayesian setting and propose a general lower-bound that includes part of the family of $f$-Divergences. The results are then applied to specific settings of interest and compared to other notable results in the literature. In particular, we show that the known bounds using Mutual Information can be improved by using, for example, Maximal Leakage, Hellinger divergence, or generalizations of the Hockey-Stick divergence.

preprint, 2022, arXiv

On Sibson's $α$-Mutual Information

We explore a family of information measures that stems from Rényi's $α$-Divergences with $α<0$. In particular, we extend the definition of Sibson's $α$-Mutual Information to negative values of $α$ and show several properties of these objects. Moreover, we highlight how this family of information measures is related to functional inequalities that can be employed in a variety of fields, including lower-bounds on the Risk in Bayesian Estimation Procedures.

preprint, 2022, arXiv

Shannon Bounds on Lossy Gray-Wyner Networks

The Gray-Wyner network subject to a fidelity criterion is studied. Upper and lower bounds for the trade-offs between the private sum-rate and the common rate are obtained for arbitrary sources subject to mean-squared error distortion. The bounds meet exactly, leading to the computation of the rate region, when the source is jointly Gaussian. They meet partially when the sources are modeled via an additive Gaussian "channel". The bounds are inspired from the Shannon bounds on the rate-distortion problem.
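
The classical Shannon lower bound that inspires these bounds can be recalled in its single-source form. This is the textbook statement for mean-squared error distortion, given only as background; the notation $R(D)$, $h(X)$ and $\sigma^2$ is assumed here and the paper's network bounds are not reproduced.

```latex
% Shannon lower bound for a real-valued source X with differential
% entropy h(X) under mean-squared error distortion D:
R(D) \ge h(X) - \tfrac{1}{2} \log\!\left( 2 \pi e D \right).

% For a Gaussian source with variance \sigma^2 and 0 < D \le \sigma^2
% the bound holds with equality:
R(D) = \tfrac{1}{2} \log \frac{\sigma^2}{D}.
```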

preprint, 2022, arXiv

The Price of Distributed: Rate Loss in the CEO Problem

In the distributed remote (CEO) source coding problem, many separate encoders observe independently noisy copies of an underlying source. The rate loss is the difference between the rate required in this distributed setting and the rate that would be required in a setting where the encoders can fully cooperate. In this sense, the rate loss characterizes the price of distributed processing. We survey and extend the known results on the rate loss in various settings, with a particular emphasis on the case where the noise in the observations is Gaussian, but the underlying source is general.

preprint, 2021, arXiv

Learning, compression, and leakage: Minimising classification error via meta-universal compression principles

Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding, which provides strong guarantees for compression of small datasets - in contrast with more popular estimators whose guarantees hold only in the asymptotic limit. Here we consider an NML-based decision strategy for supervised classification problems, and show that it attains heuristic PAC learning when applied to a wide variety of models. Furthermore, we show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.

preprint, 2021, arXiv

Lower bound on Wyner's Common Information

An important notion of common information between two random variables is due to Wyner. In this paper, we derive a lower bound on Wyner's common information for continuous random variables. The new bound improves on the only other general lower bound on Wyner's common information, which is the mutual information. We also show that the new lower bound is tight for the so-called "Gaussian channels" case, namely, when the joint distribution of the random variables can be written as the sum of a single underlying random variable and Gaussian noises. We motivate this work from the recent variations of Wyner's common information and applications to network data compression problems such as the Gray-Wyner network.
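
To make the comparison with mutual information concrete, the sketch below evaluates both quantities for a pair of jointly Gaussian variables with correlation coefficient `rho`, using the standard closed forms from the literature. It does not implement the paper's new bound; the function names are illustrative assumptions and all values are in nats.

```python
import math


def gaussian_mutual_information(rho):
    """I(X;Y) in nats for jointly Gaussian X, Y with correlation rho, |rho| < 1."""
    return -0.5 * math.log(1.0 - rho * rho)


def gaussian_wyner_common_information(rho):
    """Wyner's common information in nats for the same bivariate Gaussian pair."""
    r = abs(rho)
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.9):
        mi = gaussian_mutual_information(rho)
        ci = gaussian_wyner_common_information(rho)
        # For these closed forms the gap equals log(1 + |rho|).
        print(rho, round(mi, 4), round(ci, 4), round(ci - mi, 4))
```

In this special case the gap between the two quantities is $\log(1+|\rho|)$, which shows how loose the mutual-information lower bound can be.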

preprint, 2020, arXiv

Common Information Components Analysis

We give an information-theoretic interpretation of Canonical Correlation Analysis (CCA) via (relaxed) Wyner's common information. CCA extracts from two high-dimensional data sets low-dimensional descriptions (features) that capture the commonalities between the data sets, using a framework of correlations and linear transforms. Our interpretation first extracts the common information up to a pre-selected resolution level, and then projects this back onto each of the data sets. In the case of Gaussian statistics, this procedure precisely reduces to CCA, where the resolution level specifies the number of CCA components that are extracted. This also suggests a novel algorithm, Common Information Components Analysis (CICA), with several desirable features, including a natural extension beyond just two data sets.

preprint, 2020, arXiv

Robust Generalization via $α$-Mutual Information

The aim of this work is to provide bounds connecting two probability measures of the same event using Rényi $α$-Divergences and Sibson's $α$-Mutual Information, generalizations of, respectively, the Kullback-Leibler Divergence and Shannon's Mutual Information. A particular case of interest can be found when the two probability measures considered are a joint distribution and the corresponding product of marginals (representing the statistically independent scenario). In this case, a bound using Sibson's $α$-Mutual Information is retrieved, extending a result involving Maximal Leakage to general alphabets. These results have broad applications, from bounding the generalization error of learning algorithms to the more general framework of adaptive data analysis, provided that the divergences and/or information measures used are amenable to such an analysis (i.e., are robust to post-processing and compose adaptively). The generalization error bounds are derived with respect to high-probability events but a corresponding bound on expected generalization error is also retrieved.
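
For finite alphabets, the two information measures named above have simple closed forms, sketched below in Python. The code uses the standard definitions of Sibson's $α$-Mutual Information (for $α>0$, $α\neq 1$) and of Maximal Leakage, its $α\to\infty$ limit; the function names and the example channel are illustrative assumptions and the paper's bounds themselves are not implemented.

```python
import math


def sibson_mutual_information(px, channel, alpha):
    """Sibson's alpha-mutual information in nats for finite alphabets.

    px is the input distribution, channel[x][y] = P(y | x).
    Uses I_alpha = alpha/(alpha-1) * log sum_y (sum_x px[x] * P(y|x)^alpha)^(1/alpha),
    valid here for alpha > 0 and alpha != 1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = 0.0
    for y in range(len(channel[0])):
        inner = sum(px[x] * channel[x][y] ** alpha for x in range(len(px)))
        total += inner ** (1.0 / alpha)
    return alpha / (alpha - 1.0) * math.log(total)


def maximal_leakage(px, channel):
    """Maximal leakage in nats: log sum_y max over x in the support of P(y | x)."""
    support = [x for x in range(len(px)) if px[x] > 0]
    return math.log(sum(max(channel[x][y] for x in support)
                        for y in range(len(channel[0]))))


if __name__ == "__main__":
    bsc = [[0.9, 0.1], [0.1, 0.9]]  # hypothetical example channel
    px = [0.5, 0.5]
    for alpha in (2.0, 10.0, 200.0):
        print(alpha, sibson_mutual_information(px, bsc, alpha))
    print("maximal leakage", maximal_leakage(px, bsc))
```

As $α$ grows, the printed values approach the maximal leakage from below.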

preprint, 2016, arXiv

A Joint Typicality Approach to Algebraic Network Information Theory

This paper presents a joint typicality framework for encoding and decoding nested linear codes for multi-user networks. This framework provides a new perspective on compute-forward within the context of discrete memoryless networks. In particular, it establishes an achievable rate region for computing the weighted sum of nested linear codewords over a discrete memoryless multiple-access channel (MAC). When specialized to the Gaussian MAC, this rate region recovers and improves upon the lattice-based compute-forward rate region of Nazer and Gastpar, thus providing a unified approach for discrete memoryless and Gaussian networks. Furthermore, this framework can be used to shed light on the joint decoding rate region for compute-forward, which is considered an open problem. Specifically, this work establishes an achievable rate region for simultaneously decoding two linear combinations of nested linear codewords from K senders.

preprint, 2016, arXiv

A New Converse Bound for Coded Caching

An information-theoretic lower bound is developed for the caching system studied by Maddah-Ali and Niesen. By comparing the proposed lower bound with the decentralized coded caching scheme of Maddah-Ali and Niesen, the optimal memory--rate tradeoff is characterized to within a multiplicative gap of $4.7$ for the worst case, improving the previous analytical gap of $12$. Furthermore, for the case when users' requests follow the uniform distribution, the multiplicative gap is tightened to $4.7$, improving the previous analytical gap of $72$. As an independent result of interest, for the single-user average case in which the user requests multiple files, it is proved that caching the most requested files is optimal.

preprint, 2016, arXiv

Computation in Multicast Networks: Function Alignment and Converse Theorems

The classical problem in network coding theory considers communication over multicast networks. Multiple transmitters send independent messages to multiple receivers which decode the same set of messages. In this work, computation over multicast networks is considered: each receiver decodes an identical function of the original messages. For a countably infinite class of two-transmitter two-receiver single-hop linear deterministic networks, the computing capacity is characterized for a linear function (modulo-2 sum) of Bernoulli sources. Inspired by the geometric concept of interference alignment in networks, a new achievable coding scheme called function alignment is introduced. A new converse theorem is established that is tighter than cut-set based and genie-aided bounds. Computation (vs. communication) over multicast networks requires additional analysis to account for multiple receivers sharing a network's computational resources. We also develop a network decomposition theorem which identifies elementary parallel subnetworks that can constitute an original network without loss of optimality. The decomposition theorem provides a conceptually-simpler algebraic proof of achievability that generalizes to $L$-transmitter $L$-receiver networks.

preprint, 2016, arXiv

Gaussian Multiple Access via Compute-and-Forward

Lattice codes used under the Compute-and-Forward paradigm suggest an alternative strategy for the standard Gaussian multiple-access channel (MAC): The receiver successively decodes integer linear combinations of the messages until it can invert and recover all messages. In this paper, a multiple-access technique called CFMA (Compute-Forward Multiple Access) is proposed and analyzed. For the two-user MAC, it is shown that without time-sharing, the entire capacity region can be attained using CFMA with a single-user decoder as soon as the signal-to-noise ratios are above $1+\sqrt{2}$. A partial analysis is given for more than two users. Lastly, the strategy is extended to the so-called dirty MAC where two interfering signals are known non-causally to the two transmitters in a distributed fashion. Our scheme extends the previously known results and gives new achievable rate regions.

preprint, 2016, arXiv

Information Theoretic Caching: The Multi-User Case

In this paper, we consider a cache aided network in which each user is assumed to have an individual cache, while upon users' requests, an update message is sent through a common link to all users. First, we formulate a general information theoretic setting that represents the database as a discrete memoryless source, and the users' requests as side information that is available everywhere except at the cache encoder. The decoders' objective is to recover a function of the source and the side information. By viewing cache aided networks in terms of a general distributed source coding problem and through information theoretic arguments, we present inner and outer bounds on the fundamental tradeoff of cache memory size and update rate. Then, we specialize our general inner and outer bounds to a specific model of content delivery networks: File selection networks, in which the database is a collection of independent equal-size files and each user requests one of the files independently. For file selection networks, we provide an outer bound and two inner bounds (for centralized and decentralized caching strategies). For the case when the user request information is uniformly distributed, we characterize the rate vs. cache size tradeoff to within a multiplicative gap of 4. By further extending our arguments to the framework of Maddah-Ali and Niesen, we also establish a new outer bound and two new inner bounds, which are shown to recover the centralized and decentralized strategies previously established by Maddah-Ali and Niesen. Finally, in terms of rate vs. cache size tradeoff, we improve the previous multiplicative gap of 72 to 4.7 for the average case with uniform requests.

preprint, 2016, arXiv

Information-Theoretic Caching: Sequential Coding for Computing

Under the paradigm of caching, partial data is delivered before the actual requests of users are known. In this paper, this problem is modeled as a canonical distributed source coding problem with side information, where the side information represents the users' requests. For the single-user case, a single-letter characterization of the optimal rate region is established, and for several important special cases, closed-form solutions are given, including the scenario of uniformly distributed user requests. In this case, it is shown that the optimal caching strategy is closely related to total correlation and Wyner's common information. Using the insight gained from the single-user case, three two-user scenarios admitting single-letter characterization are considered, which draw connections to existing source coding problems in the literature: the Gray--Wyner system and distributed successive refinement. Finally, the model studied by Maddah-Ali and Niesen is rephrased to make a comparison with the considered information-theoretic model. Although the two caching models have a similar behavior for the single-user case, it is shown through a two-user example that the two caching models behave differently in general.

preprint, 2016, arXiv

Multi-Library Coded Caching

We study the problem of coded caching when the server has access to several libraries and each user makes independent requests from every library. The single-library scenario has been well studied and it has been proved that coded caching can significantly improve the delivery rate compared to uncoded caching. In this work we show that when all the libraries have the same number of files, memory-sharing is optimal and the delivery rate cannot be improved via coding across files from different libraries. In this setting, the optimal memory-sharing strategy is one that divides the cache of each user in proportion to the size of the files in different libraries. As for the general case, when the numbers of files in different libraries are arbitrary, we propose an inner-bound based on memory-sharing and an outer-bound based on concatenation of files from different libraries.

preprint, 2016, arXiv

Typical sumsets of linear codes

Given two identical linear codes $\mathcal C$ over $\mathbb F_q$ of length $n$, we independently pick one codeword from each codebook uniformly at random. A $\textit{sumset}$ is formed by adding these two codewords entry-wise as integer vectors and a sumset is called $\textit{typical}$, if the sum falls inside this set with high probability. We ask the question: how large is the typical sumset for most codes? In this paper we characterize the asymptotic size of such typical sumsets. We show that when the rate $R$ of the linear code is below a certain threshold $D$, the typical sumset size is roughly $|\mathcal C|^2=2^{2nR}$ for most codes while when $R$ is above this threshold, most codes have a typical sumset whose size is roughly $|\mathcal C|\cdot 2^{nD}=2^{n(R+D)}$ due to the linear structure of the codes. The threshold $D$ depends solely on the alphabet size $q$ and takes value in $[1/2, \log \sqrt{e})$. More generally, we completely characterize the asymptotic size of typical sumsets of two nested linear codes $\mathcal C_1, \mathcal C_2$ with different rates. As an application of the result, we study the communication problem where the integer sum of two codewords is to be decoded through a general two-user multiple-access channel.

preprint, 2015, arXiv

$K$ Users Caching Two Files: An Improved Achievable Rate

Caching is an approach to smoothen the variability of traffic over time. Recently it has been proved that the local memories at the users can be exploited for reducing the peak traffic in a much more efficient way than previously believed. In this work we improve upon the existing results and introduce a novel caching strategy that takes advantage of simultaneous coded placement and coded delivery in order to decrease the worst case achievable rate with $2$ files and $K$ users. We will show that for any cache size $\frac{1}{K}<M<1$ our scheme outperforms the state of the art.
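
As a point of reference for the rates discussed here, the sketch below evaluates the well-known centralized coded caching rate of Maddah-Ali and Niesen at its corner points $M = tN/K$, with $N$ files and $K$ users. This is a baseline from the earlier literature, not the improved scheme of this paper; the function names and the integer parameter `t` are notation assumed for the example.

```python
from fractions import Fraction


def man_cache_size(num_users, num_files, t):
    """Cache size M = t * N / K (in files) at the corner point indexed by t."""
    return Fraction(t * num_files, num_users)


def man_centralized_rate(num_users, num_files, t):
    """Worst-case delivery rate (in files) of the centralized baseline scheme.

    Evaluates K * (1 - M/N) * min(1 / (1 + K*M/N), N/K) at M = t * N / K,
    which simplifies to (K - t) * min(1 / (1 + t), N / K).
    """
    if not 0 <= t <= num_users:
        raise ValueError("t must be an integer between 0 and the number of users")
    coded_gain = Fraction(1, 1 + t)
    return (num_users - t) * min(coded_gain, Fraction(num_files, num_users))


if __name__ == "__main__":
    K, N = 4, 2  # hypothetical example: four users, two files
    for t in range(K + 1):
        print(man_cache_size(K, N, t), man_centralized_rate(K, N, t))
```

Rates for cache sizes between corner points follow from memory-sharing, that is, from linear interpolation between neighbouring corner points.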

preprint, 2015, arXiv

Asymmetric Compute-and-Forward with CSIT

We present a modified compute-and-forward scheme which utilizes Channel State Information at the Transmitters (CSIT) in a natural way. The modified scheme allows different users to have different coding rates, and uses CSIT to achieve a larger rate region. This idea is applicable to all systems which use the compute-and-forward technique and can be arbitrarily better than the regular scheme in some settings.

preprint, 2015, arXiv

Efficient Algorithms for the Data Exchange Problem

In this paper we study the data exchange problem where a set of users is interested in gaining access to a common file, but where each has only partial knowledge about it as side-information. Assuming that the file is broken into packets, the side-information considered is in the form of linear combinations of the file packets. Given that the collective information of all the users is sufficient to allow recovery of the entire file, the goal is for each user to gain access to the file while minimizing some communication cost. We assume that users can communicate over a noiseless broadcast channel, and that the communication cost is a sum of each user's cost function over the number of bits it transmits. For instance, the communication cost could simply be the total number of bits that needs to be transmitted. In the most general case studied in this paper, each user can have any arbitrary convex cost function. We provide deterministic, polynomial-time algorithms (in the number of users and packets) which find an optimal communication scheme that minimizes the communication cost. To further lower the complexity, we also propose a simple randomized algorithm inspired by our deterministic algorithm which is based on a random linear network coding scheme.
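
The precondition stated above, that the users' collective side-information suffices to recover the whole file, amounts to a rank condition on the stacked coefficient vectors. The sketch below checks it under the assumption that coefficients are binary (arithmetic over GF(2)), which the abstract does not specify; each linear combination is encoded as an integer bitmask with bit `i` standing for packet `i`, and the function names are hypothetical. It does not implement the paper's cost-minimizing algorithms.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = []  # each kept vector has a leading bit no other kept vector leads with
    for v in vectors:
        for b in basis:
            # XOR with b only if that clears b's leading bit in v.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def collectively_sufficient(side_information, num_packets):
    """True if all users together hold full-rank knowledge of the packets."""
    stacked = [v for user in side_information for v in user]
    return gf2_rank(stacked) == num_packets


if __name__ == "__main__":
    # Hypothetical example with three packets p0, p1, p2:
    # user A holds p0 and p0+p1, user B holds p1+p2.
    users = [[0b001, 0b011], [0b110]]
    print(collectively_sufficient(users, 3))
```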

preprint, 2015, arXiv

Lattice Codes for Many-to-One Interference Channels With and Without Cognitive Messages

A new achievable rate region is given for the Gaussian cognitive many-to-one interference channel. The proposed novel coding scheme is based on the compute-and-forward approach with lattice codes. Using the idea of decoding sums of codewords, our scheme improves considerably upon the conventional coding schemes which treat interference as noise or decode messages simultaneously. Our strategy also extends directly to the usual many-to-one interference channels without cognitive messages. Compared to the usual compute-and-forward scheme where a fixed lattice is used for the code construction, the novel scheme employs scaled lattices and also encompasses key ingredients of the existing schemes for the cognitive interference channel. With this new component, our scheme achieves a larger rate region in general. For some symmetric channel settings, new constant gap or capacity results are established, which are independent of the number of users in the system.

preprint, 2015, arXiv

Secure Transmission on the Two-hop Relay Channel with Scaled Compute-and-Forward

In this paper, we consider communication on a two-hop channel, in which a source wants to send information reliably and securely to the destination via a relay. We consider both the untrusted relay case and the external eavesdropper case. In the untrusted relay case, the relay behaves as an eavesdropper and there is a cooperative node which sends a jamming signal to confuse the relay when it is receiving from the source. We propose two secure transmission schemes using the scaled compute-and-forward technique. One of the schemes is based on a random binning code and the other one is based on a lattice chain code. It is proved that in the high Signal-to-Noise-Ratio (SNR) scenario and/or the restricted relay power scenario, if the destination is used as the jammer, both schemes outperform all existing schemes and achieve the upper bound. In particular, if the SNR is large and the source, the relay, and the cooperative jammer have identical power and channels, both schemes achieve the upper bound for secrecy rate, which is merely $1/2$ bit per channel use lower than the channel capacity without secrecy constraints. We also prove that one of our schemes achieves a positive secrecy rate in the external eavesdropper case in which the relay is trusted and there exists an external eavesdropper.

preprint, 2014, arXiv

Compute-and-Forward: Finding the Best Equation

Compute-and-Forward is an emerging technique to deal with interference. It allows the receiver to decode a suitably chosen integer linear combination of the transmitted messages. The integer coefficients should be adapted to the channel fading state. Optimizing these coefficients is a Shortest Lattice Vector (SLV) problem. In general, the SLV problem is known to be prohibitively complex. In this paper, we show that the particular SLV instance resulting from the Compute-and-Forward problem can be solved in low polynomial complexity and give an explicit deterministic algorithm that is guaranteed to find the optimal solution.
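
To illustrate the optimization problem (not the paper's polynomial-time algorithm), the sketch below evaluates the standard compute-and-forward computation rate for a real-valued channel vector `h` and integer coefficient vector `a`, and finds the best coefficients by exhaustive search over a small box. The function names, the search bound `max_coeff` and the example channel are assumptions made for illustration; exhaustive search scales exponentially in the number of users.

```python
import itertools
import math


def computation_rate(h, a, power):
    """Compute-and-forward rate in bits per channel use for coefficients a.

    Uses R = 1/2 * log2^+( 1 / (||a||^2 - P (h.a)^2 / (1 + P ||h||^2)) )
    for a real-valued channel h, transmit power P and nonzero integer a.
    """
    norm_h2 = sum(x * x for x in h)
    norm_a2 = sum(x * x for x in a)
    dot = sum(x * y for x, y in zip(h, a))
    denom = norm_a2 - power * dot * dot / (1.0 + power * norm_h2)
    return max(0.0, 0.5 * math.log2(1.0 / denom))


def best_equation_brute_force(h, power, max_coeff=3):
    """Exhaustive search over nonzero integer vectors with entries in [-max_coeff, max_coeff]."""
    best_a, best_rate = None, -1.0
    for a in itertools.product(range(-max_coeff, max_coeff + 1), repeat=len(h)):
        if not any(a):
            continue  # the all-zero combination carries no information
        rate = computation_rate(h, a, power)
        if rate > best_rate:
            best_a, best_rate = a, rate
    return best_a, best_rate


if __name__ == "__main__":
    print(best_equation_brute_force([1.0, 1.0], power=10.0))
    print(best_equation_brute_force([1.0, 0.3], power=10.0))
```

A coefficient vector and its negation give the same rate, so the search returns whichever of the two it meets first.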

preprint, 2014, arXiv

Integer-Forcing Linear Receivers

Linear receivers are often used to reduce the implementation complexity of multiple-antenna systems. In a traditional linear receiver architecture, the receive antennas are used to separate out the codewords sent by each transmit antenna, which can then be decoded individually. Although easy to implement, this approach can be highly suboptimal when the channel matrix is near singular. This paper develops a new linear receiver architecture that uses the receive antennas to create an effective channel matrix with integer-valued entries. Rather than attempting to recover transmitted codewords directly, the decoder recovers integer combinations of the codewords according to the entries of the effective channel matrix. The codewords are all generated using the same linear code which guarantees that these integer combinations are themselves codewords. Provided that the effective channel is full rank, these integer combinations can then be digitally solved for the original codewords. This paper focuses on the special case where there is no coding across transmit antennas and no channel state information at the transmitter(s), which corresponds either to a multi-user uplink scenario or to single-user V-BLAST encoding. In this setting, the proposed integer-forcing linear receiver significantly outperforms conventional linear architectures such as the zero-forcing and linear MMSE receiver. In the high SNR regime, the proposed receiver attains the optimal diversity-multiplexing tradeoff for the standard MIMO channel with no coding across transmit antennas. It is further shown that in an extended MIMO model with interference, the integer-forcing linear receiver achieves the optimal generalized degrees-of-freedom.

preprint, 2013, arXiv

Approximate Sparsity Pattern Recovery: Information-Theoretic Lower Bounds

Recovery of the sparsity pattern (or support) of an unknown sparse vector from a small number of noisy linear measurements is an important problem in compressed sensing. In this paper, the high-dimensional setting is considered. It is shown that if the measurement rate and per-sample signal-to-noise ratio (SNR) are finite constants independent of the length of the vector, then the optimal sparsity pattern estimate will have a constant fraction of errors. Lower bounds on the measurement rate needed to attain a desired fraction of errors are given in terms of the SNR and various key parameters of the unknown vector. The tightness of the bounds in a scaling sense, as a function of the SNR and the fraction of errors, is established by comparison with existing achievable bounds. Near optimality is shown for a wide variety of practically motivated signal models.

preprint, 2013, arXiv

Coding Schemes and Asymptotic Capacity of the Gaussian Broadcast and Interference Channels with Feedback

A coding scheme is proposed for the memoryless Gaussian broadcast channel with correlated noises and feedback. For all noise correlations other than -1, the gap between the sum-rate the scheme achieves and the full-cooperation bound vanishes as the signal-to-noise ratio tends to infinity. When the correlation coefficient is -1, the gains afforded by feedback are unbounded and the prelog is doubled. When the correlation coefficient is +1 we demonstrate a dichotomy: If the noise variances are equal, then feedback is useless, and otherwise, feedback affords unbounded rate gains and doubles the prelog. The unbounded feedback gains, however, require perfect (noiseless) feedback. When the feedback links are noisy the feedback gains are bounded, unless the feedback noise decays to zero sufficiently fast with the signal-to-noise ratio. Extensions to more receivers are also discussed as is the memoryless Gaussian interference channel with feedback.

preprint, 2013, arXiv

Computation Over Gaussian Networks With Orthogonal Components

Function computation of arbitrarily correlated discrete sources over Gaussian networks with orthogonal components is studied. Two classes of functions are considered: the arithmetic sum function and the type function. The arithmetic sum function in this paper is defined as a set of multiple weighted arithmetic sums, which includes averaging of the sources and estimating each of the sources as special cases. The type or frequency histogram function counts the number of occurrences of each argument, which yields many important statistics such as mean, variance, maximum, minimum, median, and so on. The proposed computation coding first abstracts Gaussian networks into the corresponding modulo sum multiple-access channels via nested lattice codes and linear network coding and then computes the desired function by using linear Slepian-Wolf source coding. For orthogonal Gaussian networks (with no broadcast and multiple-access components), the computation capacity is characterized for a class of networks. For Gaussian networks with multiple-access components (but no broadcast), an approximate computation capacity is characterized for a class of networks.
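
The remark that the type function yields many common statistics can be illustrated with a short sketch: once the frequency histogram of the source values is known, the statistics follow without access to the individual values. The function names are hypothetical, the median returned is the lower median, and nothing here models the network coding scheme itself.

```python
from collections import Counter


def type_function(values):
    """Type (frequency histogram): number of occurrences of each value."""
    return Counter(values)


def stats_from_type(hist):
    """Mean, variance, extremes and lower median computed from the histogram alone."""
    n = sum(hist.values())
    mean = sum(v * c for v, c in hist.items()) / n
    variance = sum(c * (v - mean) ** 2 for v, c in hist.items()) / n
    target = (n + 1) // 2  # rank of the lower median
    running, median = 0, None
    for v in sorted(hist):
        running += hist[v]
        if running >= target:
            median = v
            break
    return {"mean": mean, "variance": variance,
            "max": max(hist), "min": min(hist), "median": median}


if __name__ == "__main__":
    readings = [3, 1, 2, 3, 3]  # hypothetical source values
    print(stats_from_type(type_function(readings)))
```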

preprint2013arXiv

Interactive Computation of Type-Threshold Functions in Collocated Broadcast-Superposition Networks

In wireless sensor networks, various applications involve learning one or multiple functions of the measurements observed by sensors, rather than the measurements themselves. This paper focuses on type-threshold functions, e.g., the maximum and indicator functions. Previous work studied this problem under the collocated collision network model and showed that under many probabilistic models for the measurements, the achievable computation rates converge to zero as the number of sensors increases. This paper considers two network models reflecting both the broadcast and superposition properties of wireless channels: the collocated linear finite field network and the collocated Gaussian network. A general multi-round coding scheme exploiting not only the broadcast property but particularly also the superposition property of the networks is developed. Through careful scheduling of concurrent transmissions to reduce redundancy, it is shown that given any independent measurement distribution, all type-threshold functions can be computed reliably with a non-vanishing rate in the collocated Gaussian network, even if the number of sensors tends to infinity.

preprint2013arXiv

Polar Codes For Broadcast Channels

Polar codes are introduced for discrete memoryless broadcast channels. For $m$-user deterministic broadcast channels, polarization is applied to map uniformly random message bits from $m$ independent messages to one codeword while satisfying broadcast constraints. The polarization-based codes achieve rates on the boundary of the private-message capacity region. For two-user noisy broadcast channels, polar implementations are presented for two information-theoretic schemes: i) Cover's superposition codes; ii) Marton's codes. Due to the structure of polarization, constraints on the auxiliary and channel-input distributions are identified to ensure proper alignment of polarization indices in the multi-user setting. The codes achieve rates on the capacity boundary of a few classes of broadcast channels (e.g., binary-input stochastically degraded). The complexity of encoding and decoding is $O(n \log n)$ where $n$ is the block length. In addition, polar code sequences obtain a stretched-exponential decay of $O(2^{-n^β})$ of the average block error probability where $0 < β < 0.5$.

preprint2012arXiv

Approximate Ergodic Capacity of a Class of Fading 2-user 2-hop Networks

We consider a fading AWGN 2-user 2-hop network where the channel coefficients are independent and identically distributed (i.i.d.) drawn from a continuous distribution and vary over time. For a broad class of channel distributions, we characterize the ergodic sum capacity to within a constant number of bits/sec/Hz, independent of signal-to-noise ratio. The achievability follows from the analysis of an interference neutralization scheme where the relays are partitioned into $M$ pairs, and interference is neutralized separately by each pair of relays. When $M=1$, the proposed ergodic interference neutralization characterizes the ergodic sum capacity to within $4$ bits/sec/Hz for i.i.d. uniform phase fading and approximately $4.7$ bits/sec/Hz for i.i.d. Rayleigh fading. We further show that this gap can be tightened to $4\log π-4$ bits/sec/Hz (approximately $2.6$) for i.i.d. uniform phase fading and $4-4\log( \frac{3π}{8})$ bits/sec/Hz (approximately $3.1$) for i.i.d. Rayleigh fading in the limit of large $M$.
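
The two large-$M$ gap expressions can be checked numerically. The sketch below assumes that $\log$ denotes the base-2 logarithm, which is consistent with gaps measured in bits/sec/Hz but is not stated explicitly in the abstract; the variable names are illustrative.

```python
import math

# Assumption: "log" in the abstract is base 2, since the gaps are in bits/sec/Hz.
gap_uniform_phase = 4 * math.log2(math.pi) - 4
gap_rayleigh = 4 - 4 * math.log2(3 * math.pi / 8)

print(f"uniform phase fading, large M: {gap_uniform_phase:.2f} bits/sec/Hz")
print(f"Rayleigh fading, large M:      {gap_rayleigh:.2f} bits/sec/Hz")
```

Under that assumption the values come out near 2.61 and 3.05, matching the approximations of $2.6$ and $3.1$ quoted in the abstract.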

preprint2012arXiv

Approximate Feedback Capacity of the Gaussian Multicast Channel

We characterize the capacity region to within $\log\{2(M-1)\}$ bits/s/Hz for the $M$-transmitter $K$-receiver Gaussian multicast channel with feedback where each receiver wishes to decode every message from the $M$ transmitters. Extending Cover-Leung's achievable scheme intended for $(M,K)=(2,1)$, we show that this generalized scheme achieves the cutset-based outer bound within $\log\{2(M-1)\}$ bits per transmitter for all channel parameters. In contrast to the capacity in the non-feedback case, the feedback capacity improves upon the naive intersection of the feedback capacities of $K$ individual multiple access channels. We find that feedback provides unbounded multiplicative gain at high signal-to-noise ratios as was shown in the Gaussian interference channel. To complement the results, we establish the exact feedback capacity of the Avestimehr-Diggavi-Tse (ADT) deterministic model, from which we make the observation that feedback can also be beneficial for function computation.

preprint2012arXiv

Data Exchange Problem with Helpers

In this paper we construct a deterministic polynomial time algorithm for the problem where a set of users is interested in gaining access to a common file, but where each has only partial knowledge of the file. We further assume the existence of another set of terminals in the system, called helpers, who are not interested in the common file, but who are willing to help the users. Given that the collective information of all the terminals is sufficient to allow recovery of the entire file, the goal is to minimize the (weighted) sum of bits that these terminals need to exchange over a noiseless public channel in order to achieve this goal. Based on established connections to the multi-terminal secrecy problem, our algorithm also implies a polynomial-time method for constructing the largest shared secret key in the presence of an eavesdropper. We consider the following side-information settings: (i) side-information in the form of uncoded packets of the file, where the terminals' side-information consists of subsets of the file; (ii) side-information in the form of linearly correlated packets, where the terminals have access to linear combinations of the file packets; and (iii) the general setting where the terminals' side-information has an arbitrary (i.i.d.) correlation structure. We provide a polynomial-time algorithm (in the number of terminals) that finds the optimal rate allocations for these terminals, and then determines an explicit optimal transmission scheme for cases (i) and (ii).

preprint2012arXiv

Ergodic Interference Alignment

This paper develops a new communication strategy, ergodic interference alignment, for the K-user interference channel with time-varying fading. At any particular time, each receiver will see a superposition of the transmitted signals plus noise. The standard approach to such a scenario results in each transmitter-receiver pair achieving a rate proportional to 1/K its interference-free ergodic capacity. However, given two well-chosen time indices, the channel coefficients from interfering users can be made to exactly cancel. By adding up these two observations, each receiver can obtain its desired signal without any interference. If the channel gains have independent, uniform phases, this technique allows each user to achieve at least 1/2 its interference-free ergodic capacity at any signal-to-noise ratio. Prior interference alignment techniques were only able to attain this performance as the signal-to-noise ratio tended to infinity. Extensions are given for the case where each receiver wants a message from more than one transmitter as well as the "X channel" case (with two receivers) where each transmitter has an independent message for each receiver. Finally, it is shown how to generalize this strategy beyond Gaussian channel models. For a class of finite field interference channels, this approach yields the ergodic capacity region.
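
The cancellation step can be sketched numerically. This is a minimal illustration, not the paper's scheme: it assumes unit-magnitude channel gains with independent uniform phases, and it constructs the complementary channel matrix directly instead of waiting for a time index at which it (approximately) occurs. The names `H1`, `H2` and the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3  # number of transmitter-receiver pairs (illustrative)

# Channel matrix at the first time index: unit gains, independent uniform phases.
H1 = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(K, K)))

# Complementary matrix at the second time index:
# same direct (diagonal) gains, negated cross (off-diagonal) gains.
H2 = -H1.copy()
np.fill_diagonal(H2, np.diag(H1))

# Each transmitter sends the same symbol at both time indices.
x = rng.standard_normal(K) + 1j * rng.standard_normal(K)
noise_std = 0.1
z1 = noise_std * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
z2 = noise_std * (rng.standard_normal(K) + 1j * rng.standard_normal(K))

y1 = H1 @ x + z1
y2 = H2 @ x + z2

# Adding the two observations cancels every interfering term.
y_sum = y1 + y2
interference_free = 2 * np.diag(H1) * x + z1 + z2
print(np.allclose(y_sum, interference_free))
```

Each receiver ends up with its own symbol scaled by twice its direct gain plus the sum of two noise terms, at the cost of two channel uses per symbol. That trade is the intuition behind the factor 1/2 in the abstract.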

preprint2012arXiv

Minimum Cost Multicast with Decentralized Sources

In this paper we study the multisource multicast problem where every sink in a given directed acyclic graph is a client and is interested in a common file. We consider the case where each node can have partial knowledge about the file as side information. Assuming that nodes can communicate over the capacity-constrained links of the graph, the goal is for each client to gain access to the file, while minimizing some linear cost function of the number of bits transmitted in the network. We consider three types of side-information settings: (ii) side information in the form of linearly correlated packets; and (iii) the general setting where the side information at the nodes has an arbitrary (i.i.d.) correlation structure. In this work we 1) provide a polynomial-time feasibility test, i.e., whether or not all the clients can recover the file, and 2) provide a polynomial-time algorithm that finds the optimal rate allocation among the links of the graph, and then determines an explicit transmission scheme for cases (i) and (ii).

preprint2012arXiv

Random Access with Physical-layer Network Coding

Leveraging recent progress in physical-layer network coding, we propose a new approach to random access: When packets collide, it is possible to recover a linear combination of the packets at the receiver. Over many rounds of transmission, the receiver can thus obtain many linear combinations and eventually recover all original packets. This is in contrast to slotted ALOHA, where packet collisions lead to complete erasures. The throughput of the proposed strategy is derived and shown to be significantly superior to the best known strategies, including multipacket reception.
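
The recovery step can be sketched under strong simplifying assumptions that are not from the paper: each collision is idealized as a noiseless XOR (a linear combination over GF(2)) of the colliding packets, and the receiver knows which users took part in each slot. The function names `collide` and `recover_packets` and the example packets are hypothetical.

```python
def collide(packets, users):
    """Idealized collision: the receiver obtains the XOR of the colliding packets.

    Returns (mask, payload), where bit i of mask is set if user i transmitted.
    """
    mask, payload = 0, 0
    for u in users:
        mask |= 1 << u
        payload ^= packets[u]
    return mask, payload

def recover_packets(equations, num_packets):
    """Solve the XOR equations over GF(2) by Gauss-Jordan elimination.

    Returns the list of packets, or None if the equations are not yet full rank.
    """
    rows = [list(eq) for eq in equations]
    used = 0  # number of pivot rows found so far
    for col in range(num_packets):
        bit = 1 << col
        pivot = next((r for r in range(used, len(rows)) if rows[r][0] & bit), None)
        if pivot is None:
            return None  # packet `col` cannot be isolated yet
        rows[used], rows[pivot] = rows[pivot], rows[used]
        for r in range(len(rows)):
            if r != used and rows[r][0] & bit:
                rows[r][0] ^= rows[used][0]
                rows[r][1] ^= rows[used][1]
        used += 1
    # Row i now has mask 1 << i, so its payload is packet i.
    return [rows[i][1] for i in range(num_packets)]

packets = [0b1010, 0b0111, 0b1100]  # three users, 4-bit packets (illustrative)
slots = [[0, 1], [1, 2], [0, 1, 2]]  # which users collide in each slot

received = [collide(packets, users) for users in slots]
after_two = recover_packets(received[:2], len(packets))
after_three = recover_packets(received, len(packets))
print(after_two, after_three)
```

Every slot here is a collision, which slotted ALOHA would treat as an erasure, yet three slots suffice to recover three packets once the combinations are linearly independent.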

preprint2012arXiv

Reduced-Dimension Linear Transform Coding of Correlated Signals in Networks

A model, called the linear transform network (LTN), is proposed to analyze the compression and estimation of correlated signals transmitted over directed acyclic graphs (DAGs). An LTN is a DAG network with multiple source and receiver nodes. Source nodes transmit subspace projections of random correlated signals by applying reduced-dimension linear transforms. The subspace projections are linearly processed by multiple relays and routed to intended receivers. Each receiver applies a linear estimator to approximate a subset of the sources with minimum mean squared error (MSE) distortion. The model is extended to include noisy networks with power constraints on transmitters. A key task is to compute all local compression matrices and linear estimators in the network to minimize end-to-end distortion. The non-convex problem is solved iteratively within an optimization framework using constrained quadratic programs (QPs). The proposed algorithm recovers as special cases the regular and distributed Karhunen-Loeve transforms (KLTs). Cut-set lower bounds on the distortion region of multi-source, multi-receiver networks are given for linear coding based on convex relaxations. Cut-set lower bounds are also given for any coding strategy based on information theory. The distortion region and compression-estimation tradeoffs are illustrated for different communication demands (e.g., multiple unicast) and graph structures.

preprint2012arXiv

Relaxing the Gaussian AVC

The arbitrarily varying channel (AVC) is a conservative way of modeling an unknown interference, and the corresponding capacity results are pessimistic. We reconsider the Gaussian AVC by relaxing the classical model and thereby weakening the adversarial nature of the interference. We examine three different relaxations. First, we show how a very small amount of common randomness between transmitter and receiver is sufficient to achieve the rates of fully randomized codes. Second, akin to the dirty paper coding problem, we study the impact of an additional interference known to the transmitter. We provide partial capacity results that differ significantly from the standard AVC. Third, we revisit a Gaussian MIMO AVC in which the interference is arbitrary but of limited dimension.

preprint2012arXiv

The Sampling Rate-Distortion Tradeoff for Sparsity Pattern Recovery in Compressed Sensing

Recovery of the sparsity pattern (or support) of an unknown sparse vector from a limited number of noisy linear measurements is an important problem in compressed sensing. In the high-dimensional setting, it is known that recovery with a vanishing fraction of errors is impossible if the measurement rate and the per-sample signal-to-noise ratio (SNR) are finite constants, independent of the vector length. In this paper, it is shown that recovery with an arbitrarily small but constant fraction of errors is, however, possible, and that in some cases computationally simple estimators are near-optimal. Bounds on the measurement rate needed to attain a desired fraction of errors are given in terms of the SNR and various key parameters of the unknown vector for several different recovery algorithms. The tightness of the bounds, in a scaling sense, as a function of the SNR and the fraction of errors, is established by comparison with existing information-theoretic necessary bounds. Near optimality is shown for a wide variety of practically motivated signal models.

preprint2011arXiv

A Compressed Sensing Wire-Tap Channel

A multiplicative Gaussian wire-tap channel inspired by compressed sensing is studied. Lower and upper bounds on the secrecy capacity are derived, and shown to be relatively tight in the large system limit for a large class of compressed sensing matrices. Surprisingly, it is shown that the secrecy capacity of this channel is nearly equal to the capacity without any secrecy constraint provided that the channel of the eavesdropper is strictly worse than the channel of the intended receiver. In other words, the eavesdropper can see almost everything and yet learn almost nothing. This behavior, which contrasts sharply with that of many commonly studied wiretap channels, is made possible by the fact that a small number of linear projections can make a crucial difference in the ability to estimate sparse vectors.

preprint2011arXiv

Compute-and-Forward: Harnessing Interference through Structured Codes

Interference is usually viewed as an obstacle to communication in wireless networks. This paper proposes a new strategy, compute-and-forward, that exploits interference to obtain significantly higher rates between users in a network. The key idea is that relays should decode linear functions of transmitted messages according to their observed channel coefficients rather than ignoring the interference as noise. After decoding these linear equations, the relays simply send them towards the destinations, which given enough equations, can recover their desired messages. The underlying codes are based on nested lattices whose algebraic structure ensures that integer combinations of codewords can be decoded reliably. Encoders map messages from a finite field to a lattice and decoders recover equations of lattice points which are then mapped back to equations over the finite field. This scheme is applicable even if the transmitters lack channel state information.
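
The final algebraic step, where a destination recovers its messages from enough equations, can be sketched as follows. This illustration omits everything that makes the scheme work over a real channel (nested lattice codes, noise, and the choice of coefficients from the observed channel gains). It assumes two messages over a prime field of size `P = 7` and hand-picked integer coefficients; all names are hypothetical.

```python
P = 7  # field size, an illustrative prime

def relay_equation(coeffs, messages):
    """Linear combination of the message vectors over F_P, as a relay would forward it."""
    length = len(messages[0])
    return [sum(a * w[i] for a, w in zip(coeffs, messages)) % P for i in range(length)]

def solve_two_equations(A, U):
    """Invert a 2x2 coefficient matrix over F_P and recover both messages."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % P
    if det == 0:
        raise ValueError("equations are linearly dependent over F_P")
    det_inv = pow(det, -1, P)  # modular inverse (Python 3.8+)
    u1, u2 = U
    w1 = [(det_inv * (d * x - b * y)) % P for x, y in zip(u1, u2)]
    w2 = [(det_inv * (-c * x + a * y)) % P for x, y in zip(u1, u2)]
    return w1, w2

messages = [[3, 0, 6, 1], [5, 2, 2, 4]]  # two messages with symbols in F_7
A = [(2, 1), (1, 1)]                     # coefficients decoded by relay 1 and relay 2

U = [relay_equation(row, messages) for row in A]  # what the relays forward
recovered = solve_two_equations(A, U)
print(U)
print(recovered)
```

Neither relay ever decodes an individual message. The destination needs only that the collected coefficient vectors be linearly independent over the field.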

preprint2011arXiv

On the Role of Diversity in Sparsity Estimation

A major challenge in sparsity pattern estimation is that small modes are difficult to detect in the presence of noise. This problem is alleviated if one can observe samples from multiple realizations of the nonzero values for the same sparsity pattern. We will refer to this as "diversity". Diversity comes at a price, however, since each new realization adds new unknown nonzero values, thus increasing uncertainty. In this paper, upper and lower bounds on joint sparsity pattern estimation are derived. These bounds, which improve upon existing results even in the absence of diversity, illustrate key tradeoffs between the number of measurements, the accuracy of estimation, and the diversity. It is shown, for instance, that diversity introduces a tradeoff between the uncertainty in the noise and the uncertainty in the nonzero values. Moreover, it is shown that the optimal amount of diversity significantly improves the behavior of the estimation problem for both optimal and computationally efficient estimators.

preprint2011arXiv

Optimal Deterministic Polynomial-Time Data Exchange for Omniscience

We study the problem of constructing a deterministic polynomial time algorithm that achieves omniscience, in a rate-optimal manner, among a set of users that are interested in a common file but each has only partial knowledge about it as side-information. Assuming that the collective information among all the users is sufficient to allow the reconstruction of the entire file, the goal is to minimize the (possibly weighted) number of bits that these users need to exchange over a noiseless public channel in order for all of them to learn the entire file. Using established connections to the multi-terminal secrecy problem, our algorithm also implies a polynomial-time method for constructing a maximum size secret shared key in the presence of an eavesdropper. We consider the following types of side-information settings: (i) side information in the form of uncoded fragments/packets of the file, where the users' side-information consists of subsets of the file; (ii) side information in the form of linearly correlated packets, where the users have access to linear combinations of the file packets; and (iii) the general setting where the users' side-information has an arbitrary (i.i.d.) correlation structure. Building on results from combinatorial optimization, we provide a polynomial-time algorithm (in the number of users) that first finds the optimal rate allocations among these users, then determines an explicit transmission scheme (i.e., a description of which user should transmit what information) for cases (i) and (ii).

preprint2011arXiv

Reliable Physical Layer Network Coding

When two or more users in a wireless network transmit simultaneously, their electromagnetic signals are linearly superimposed on the channel. As a result, a receiver that is interested in one of these signals sees the others as unwanted interference. This property of the wireless medium is typically viewed as a hindrance to reliable communication over a network. However, using a recently developed coding strategy, interference can in fact be harnessed for network coding. In a wired network, (linear) network coding refers to each intermediate node taking its received packets, computing a linear combination over a finite field, and forwarding the outcome towards the destinations. Then, given an appropriate set of linear combinations, a destination can solve for its desired packets. For certain topologies, this strategy can attain significantly higher throughputs over routing-based strategies. Reliable physical layer network coding takes this idea one step further: using judiciously chosen linear error-correcting codes, intermediate nodes in a wireless network can directly recover linear combinations of the packets from the observed noisy superpositions of transmitted signals. Starting with some simple examples, this survey explores the core ideas behind this new technique and the possibilities it offers for communication over interference-limited wireless networks.

preprint2010arXiv

"Compressed" Compressed Sensing

The field of compressed sensing has shown that a sparse but otherwise arbitrary vector can be recovered exactly from a small number of randomly constructed linear projections (or samples). The question addressed in this paper is whether an even smaller number of samples is sufficient when there exists prior knowledge about the distribution of the unknown vector, or when only partial recovery is needed. An information-theoretic lower bound with connections to free probability theory and an upper bound corresponding to a computationally simple thresholding estimator are derived. It is shown that in certain cases (e.g. discrete valued vectors or large distortions) the number of samples can be decreased. Interestingly though, it is also shown that in many cases no reduction is possible.

preprint2009arXiv

Zero-rate feedback can achieve the empirical capacity

The utility of limited feedback for coding over an individual sequence of DMCs is investigated. This study complements recent results showing how limited or noisy feedback can boost the reliability of communication. A strategy with fixed input distribution $P$ is given that asymptotically achieves rates arbitrarily close to the mutual information induced by $P$ and the state-averaged channel. When the capacity achieving input distribution is the same over all channel states, this achieves rates at least as large as the capacity of the state averaged channel, sometimes called the empirical capacity.
