Source author record

Neri Merhav

Neri Merhav appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory, math.IT, cond-mat.stat-mech, cond-mat.dis-nn, Cryptography and Security, math.PR, cond-mat.other

Catalog footprint

What is connected

72 works

7 topics

4 close collaborators

preprint · 2024 · arXiv

Lossy Compression of Individual Sequences Revisited: Fundamental Limits of Finite-State Encoders

We extend Ziv and Lempel's model of finite-state encoders to the realm of lossy compression of individual sequences. In particular, the model of the encoder includes a finite-state reconstruction codebook followed by an information lossless finite-state encoder that compresses the reconstruction codeword with no additional distortion. We first derive two different lower bounds to the compression ratio that depend on the number of states of the lossless encoder. Both bounds are asymptotically achievable by conceptually simple coding schemes. We then show that when the number of states of the lossless encoder is large enough in terms of the reconstruction block-length, the performance can be improved, sometimes significantly so. In particular, the improved performance is achievable using a random-coding ensemble that is universal, not only in terms of the source sequence, but also in terms of the distortion measure.

preprint · 2022 · arXiv

$D$-semifaithful codes that are universal over both memoryless sources and distortion measures

We prove the existence of codebooks for d-semifaithful lossy compression that are simultaneously universal with respect to both the class of finite-alphabet memoryless sources and the class of all bounded additive distortion measures. By applying independent random selection of the codewords according to a mixture of all memoryless sources, we achieve redundancy rates that are within O(log n/n) close to the empirical rate-distortion function of every given source vector with respect to every bounded distortion measure. As outlined in the last section, the principal ideas can also be extended significantly beyond the class of memoryless sources, namely, to the setting of individual sequences encoded by finite-state machines.

preprint · 2022 · arXiv

Codebook Mismatch Can Be Fully Compensated by Mismatched Decoding

We consider an ensemble of constant composition codes that are subsets of linear codes: while the encoder uses only the constant-composition subcode, the decoder operates as if the full linear code was used, with the motivation of simultaneously benefiting both from the probabilistic shaping of the channel input and from the linear structure of the code. We prove that the codebook mismatch can be fully compensated by using a mismatched additive decoding metric that achieves the random coding error exponent of (non-linear) constant composition codes. As the coding rate tends to the mutual information, the optimal mismatched metric approaches the maximum a posteriori probability (MAP) metric, showing that codebook mismatch with mismatched MAP metric is capacity-achieving for the optimal input assignment.

preprint · 2022 · arXiv

Error Exponents of the Dirty-Paper and Gel'fand-Pinsker Channels

We derive various error exponents for communication channels with random states, which are available non-causally at the encoder only. For both the finite-alphabet Gel'fand-Pinsker channel and its Gaussian counterpart, the dirty-paper channel, we derive random coding exponents, error exponents of the typical random codes (TRCs), and error exponents of expurgated codes. For the two channel models, we analyze some sub-optimal bin-index decoders, which turn out to be asymptotically optimal, at least for the random coding error exponent. For the dirty-paper channel, we show explicitly via a numerical example, that both the error exponent of the TRC and the expurgated exponent strictly improve upon the random coding exponent, at relatively low coding rates, which is a known fact for discrete memoryless channels without random states. We also show that at rates below capacity, the optimal values of the dirty-paper design parameter $α$ in the random coding sense and in the TRC exponent sense are different from one another, and they are both different from the optimal $α$ that is required for attaining the channel capacity. For the Gel'fand-Pinsker channel, we allow for a variable-rate random binning code construction, and prove that the previously proposed maximum penalized mutual information decoder is asymptotically optimal within a given class of decoders, at least for the random coding error exponent.

preprint · 2022 · arXiv

Optimal Correlators and Waveforms for Mismatched Detection

We consider the classical Neyman-Pearson hypothesis testing problem of signal detection, where under the null hypothesis ($\mathcal{H}_0$), the received signal is white Gaussian noise, and under the alternative hypothesis ($\mathcal{H}_1$), the received signal also includes an additional non-Gaussian random signal, which in turn can be viewed as a deterministic waveform plus zero-mean, non-Gaussian noise. However, instead of the classical likelihood ratio test detector, which might be difficult to implement in general, we impose a (mismatched) correlation detector, which is relatively easy to implement, and we characterize the optimal correlator weights in the sense of the best trade-off between the false-alarm error exponent and the missed-detection error exponent. Those optimal correlator weights depend (non-linearly, in general) on the underlying deterministic waveform under $\mathcal{H}_1$. We then assume that the deterministic waveform may also be free to be optimized (subject to a power constraint), jointly with the correlator, and show that both the optimal waveform and the optimal correlator weights may take on values in a small finite set of typically no more than two to four levels, depending on the distribution of the non-Gaussian noise component. Finally, we outline an extension of the scope to a wider class of detectors that are based on linear combinations of the correlation and the energy of the received signal.

preprint · 2022 · arXiv

The DNA Storage Channel: Capacity and Error Probability

The DNA storage channel is considered, in which the $M$ deoxyribonucleic acid (DNA) molecules comprising each codeword are stored without order, sampled $N$ times with replacement, and then sequenced over a discrete memoryless channel. For a constant coverage depth $M/N$ and molecule length scaling $Θ(\log M)$, lower (achievability) and upper (converse) bounds on the capacity of the channel, as well as a lower (achievability) bound on the reliability function of the channel, are provided. Both the lower and upper bounds on the capacity generalize a bound which was previously known to hold only for the binary symmetric sequencing channel, and only under certain restrictions on the molecule length scaling and the crossover probability parameters. When specialized to the binary symmetric sequencing channel, these restrictions are completely removed for the lower bound and are significantly relaxed for the upper bound in the high-noise regime. The lower bound on the reliability function is achieved under a universal decoder, and reveals that the dominant error event is that of outage, that is, the event in which the capacity of the channel induced by the DNA molecule sampling operation does not support the target rate.
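The channel described in this abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name `dna_storage_channel`, the binary alphabet, and the choice of a binary symmetric channel as the sequencing channel are illustrative and are not taken from the paper.

```python
import random

def dna_storage_channel(molecules, num_reads, crossover, rng):
    """Toy model (hypothetical helper): draw `num_reads` molecules with
    replacement and pass each one through a binary symmetric channel.
    The reads carry no index, so the original order is lost."""
    reads = []
    for _ in range(num_reads):
        molecule = rng.choice(molecules)  # sampling with replacement
        # Flip each bit independently with probability `crossover`.
        noisy = [bit ^ (rng.random() < crossover) for bit in molecule]
        reads.append(noisy)
    return reads
```

With `crossover = 0.0` every read is an exact copy of some stored molecule, but some molecules may be read several times and others not at all, which is the sampling effect the abstract refers to.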

preprint · 2021 · arXiv

Encoding Individual Source Sequences for the Wiretap Channel

We consider the problem of encoding a deterministic source sequence (a.k.a. individual sequence) for the degraded wiretap channel by means of an encoder and decoder that can both be implemented as finite-state machines. Our first main result is a necessary condition for both reliable and secure transmission in terms of the given source sequence, the bandwidth expansion factor, the secrecy capacity, the number of states of the encoder and the number of states of the decoder. Equivalently, this necessary condition can be presented as a converse bound (i.e., a lower bound) on the smallest achievable bandwidth expansion factor. The bound is asymptotically achievable by Lempel-Ziv compression followed by good channel coding for the wiretap channel. Given that the lower bound is saturated, we also derive a lower bound on the minimum necessary rate of purely random bits needed for local randomness at the encoder in order to meet the security constraint. This bound too is achieved by the same achievability scheme. Finally, we extend the main results to the case where the legitimate decoder has access to a side information sequence, which is another individual sequence that may be related to the source sequence, and a noisy version of the side information sequence leaks to the wiretapper.

preprint · 2021 · arXiv

Trade-offs Between Error Exponents and Excess-Rate Exponents of Typical Slepian-Wolf Codes

Typical random codes (TRCs) in a communication scenario of source coding with side information at the decoder are the main subject of this work. We study the semi-deterministic code ensemble, which is a certain variant of the ordinary random binning code ensemble. In this code ensemble, the relatively small type classes of the source are deterministically partitioned into the available bins in a one-to-one manner. As a consequence, the error probability decreases dramatically. The random binning error exponent and the error exponent of the TRC are derived and proved to be equal to one another in a few important special cases. We show that the performance under optimal decoding can be attained also by certain universal decoders, e.g., the stochastic likelihood decoder with an empirical entropy metric. Moreover, we discuss the trade-offs between the error exponent and the excess-rate exponent for the typical random semi-deterministic code and characterize its optimal rate function. We show that for any pair of correlated information sources, both error and excess-rate probabilities are exponentially vanishing when the blocklength tends to infinity.

preprint · 2020 · arXiv

An Integral Representation of the Logarithmic Function with Applications in Information Theory

We explore a well-known integral representation of the logarithmic function, and demonstrate its usefulness in obtaining compact, easily-computable exact formulas for quantities that involve expectations and higher moments of the logarithm of a positive random variable (or the logarithm of a sum of positive random variables). The integral representation of the logarithm is proved useful in a variety of information-theoretic applications, including universal lossless data compression, entropy and differential entropy evaluations, and the calculation of the ergodic capacity of the single-input, multiple-output (SIMO) Gaussian channel with random parameters (known to both transmitter and receiver). This integral representation and its variants are anticipated to serve as a useful tool in additional applications, as a rigorous alternative to the popular (but non-rigorous) replica method (at least in some situations).
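A minimal numerical sketch of the idea follows. It assumes that the representation in question is the classical identity ln x = ∫₀^∞ (e^{-u} - e^{-ux})/u du, which gives E[ln X] = ∫₀^∞ (e^{-u} - E[e^{-uX}])/u du. The helper names and the integration grid are illustrative choices, not the paper's.

```python
import math

def integrate_over_log_u(f, t_min=-40.0, t_max=40.0, step=0.01):
    """Trapezoidal rule for the integral of f(u)/u over u in (0, inf),
    evaluated after the substitution u = exp(t), so that du/u = dt."""
    n = int(round((t_max - t_min) / step))
    total = 0.5 * (f(math.exp(t_min)) + f(math.exp(t_max)))
    for k in range(1, n):
        total += f(math.exp(t_min + k * step))
    return total * step

def log_via_integral(x):
    """ln(x) for x > 0, computed from the integral representation."""
    return integrate_over_log_u(lambda u: math.exp(-u) - math.exp(-u * x))

def expected_log_exponential():
    """E[ln X] for X ~ Exp(1), using E[exp(-uX)] = 1/(1+u)."""
    return integrate_over_log_u(lambda u: math.exp(-u) - 1.0 / (1.0 + u))
```

The point of the second function is that the expectation of the logarithm is obtained from the Laplace transform of X alone, through a one-dimensional integral; for the unit-rate exponential the value should agree with minus the Euler-Mascheroni constant.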

preprint · 2020 · arXiv

On More General Distributions of Random Binning for Slepian-Wolf Encoding

Traditionally, ensembles of Slepian-Wolf (SW) codes are defined such that every bin of each $n$-vector of each source is randomly drawn under the uniform distribution across the sets $\{0,1,\ldots,2^{nR_X}-1\}$ and $\{0,1,\ldots,2^{nR_Y}-1\}$, where $R_X$ and $R_Y$ are the coding rates of the two sources, $X$ and $Y$, respectively. In a few more recent works, where only one source, say, $X$, is compressed and the other one, $Y$, serves as side information available at the decoder, the scope is extended to variable-rate SW (VRSW) codes, where the rate is allowed to depend on the type class of the source string, but still, the random-binning distribution is assumed uniform within the corresponding, type-dependent, bin index set. In this expository work, we investigate the role of the uniformity of the random binning distribution from the perspective of the trade-off between the reliability (defined in terms of the error exponent) and the compression performance (measured from the viewpoint of the source coding exponent). To this end, we study a much wider class of random-binning distributions, which includes the ensemble of VRSW codes as a special case, but also goes considerably beyond it. We first show that, with the exception of some pathological cases, the smaller ensemble, of VRSW codes, is as good as the larger ensemble in terms of the trade-off between the error exponent and the source coding exponent. Notwithstanding this finding, the wider class of ensembles proposed is motivated in two ways. The first is that it outperforms VRSW codes in the above-mentioned pathological cases, and the second is that it allows robustness: in the event of a system failure that causes unavailability of the compressed bit-stream from one of the sources, it still allows reconstruction of the other source within some controllable distortion.

preprint · 2020 · arXiv

Optimal Work Extraction and the Minimum Description Length Principle

We discuss work extraction from classical information engines (e.g., Szilárd) with $N$ particles, $q$ partitions, and initial arbitrary non-equilibrium states. In particular, we focus on their optimal behaviour, which includes the measurement of a set of quantities $Φ$ with a feedback protocol that extracts the maximal average amount of work. We show that the optimal non-equilibrium state to which the engine should be driven before the measurement is given by the normalised maximum-likelihood probability distribution of a statistical model that admits $Φ$ as sufficient statistics. Furthermore, we show that the minimax universal code redundancy $\mathcal{R}^*$ associated with this model provides an upper bound to the work that the demon can extract on average from the cycle, in units of $k_{\mathrm{B}}T$. We also find that, in the limit of $N$ large, the maximum average extracted work cannot exceed $H[Φ]/2$, i.e. one half times the Shannon entropy of the measurement. Our results establish a connection between optimal work extraction in stochastic thermodynamics and optimal universal data compression, providing design principles for optimal information engines. In particular, they suggest that: (i) optimal coding is thermodynamically efficient, and (ii) it is essential to drive the system into a critical state in order to achieve optimal performance.

preprint · 2020 · arXiv

Some Useful Integral Representations for Information-Theoretic Analyses

This work is an extension of our earlier article, where a well-known integral representation of the logarithmic function was explored, and was accompanied with demonstrations of its usefulness in obtaining compact, easily-calculable, exact formulas for quantities that involve expectations of the logarithm of a positive random variable. Here, in the same spirit, we derive an exact integral representation (in one or two dimensions) of the moment of a nonnegative random variable, or the sum of such independent random variables, where the moment order is a general positive noninteger real (also known as fractional moments). The proposed formula is applied to a variety of examples with an information-theoretic motivation, and it is shown how it facilitates their numerical evaluations. In particular, when applied to the calculation of a moment of the sum of a large number, $n$, of nonnegative random variables, it is clear that integration over one or two dimensions, as suggested by our proposed integral representation, is significantly easier than the alternative of integrating over $n$ dimensions, as needed in the direct calculation of the desired moment.
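To make the flavour of such a representation concrete, here is one standard identity of this type, valid for moment orders $0<\rho<1$. It is given as an illustration only; the formulas actually derived in the paper, which cover general noninteger orders, are not reproduced here. The symbols $\rho$, $M_X$ and $X_i$ are notation assumed for this sketch.

```latex
% Pointwise identity for x >= 0 and 0 < rho < 1 (integrate by parts to verify):
x^{\rho} \;=\; \frac{\rho}{\Gamma(1-\rho)} \int_0^{\infty} \frac{1 - e^{-ux}}{u^{1+\rho}} \,\mathrm{d}u .

% Taking expectations, with M_X(-u) = E[e^{-uX}]:
\mathbb{E}\!\left[X^{\rho}\right] \;=\; \frac{\rho}{\Gamma(1-\rho)} \int_0^{\infty} \frac{1 - M_X(-u)}{u^{1+\rho}} \,\mathrm{d}u .

% For a sum of n independent nonnegative variables the transform factorizes,
% so the integral stays one-dimensional regardless of n:
\mathbb{E}\!\left[\Big(\sum_{i=1}^{n} X_i\Big)^{\rho}\right]
\;=\; \frac{\rho}{\Gamma(1-\rho)} \int_0^{\infty} \frac{1 - \prod_{i=1}^{n} M_{X_i}(-u)}{u^{1+\rho}} \,\mathrm{d}u .
```

The last line shows why the approach helps numerically: the $n$-dimensional expectation is replaced by a single integral over $u$.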

preprint · 2020 · arXiv

The MMI Decoder is Asymptotically Optimal for the Typical Random Code and for the Expurgated Code

We provide two results concerning the optimality of the maximum mutual information (MMI) decoder. First, we prove that the error exponents of the typical random codes under the optimal maximum likelihood (ML) decoder and the MMI decoder are equal. As a corollary to this result, we also show that the error exponents of the expurgated codes under the ML and the MMI decoders are equal. These results strengthen the well known result due to Csiszár and Körner, according to which, these decoders achieve equal random coding error exponents, since the error exponents of the typical random code and the expurgated code are strictly higher than the random coding error exponents, at least at low coding rates. While the universal optimality of the MMI decoder, in the random-coding error exponent sense, is easily proven by commuting the expectation over the channel noise and the expectation over the ensemble, when it comes to typical and expurgated exponents, this commutation can no longer be carried out. Therefore, the proof of the universal optimality of the MMI decoder must be completely different and it turns out to be highly non-trivial.
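For readers unfamiliar with the MMI decoder, the following sketch shows the decoding rule itself: pick the codeword whose joint empirical distribution with the channel output has the largest mutual information. The function names and the string-based toy codebook are assumptions made for illustration; nothing here reproduces the paper's analysis.

```python
import math
from collections import Counter

def empirical_mutual_information(x, y):
    """Mutual information (in bits) of the joint empirical distribution
    of two equal-length sequences x and y."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    # sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )

def mmi_decode(codebook, y):
    """Index of the codeword maximizing the empirical mutual information with y."""
    return max(
        range(len(codebook)),
        key=lambda m: empirical_mutual_information(codebook[m], y),
    )
```

Note that the rule uses no channel parameters at all, which is what makes it universal. One consequence of that is visible in the toy setting: a codeword and its bitwise complement score identically.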

preprint · 2020 · arXiv

Universal Decoding for Asynchronous Slepian-Wolf Encoding

We consider the problem of (almost) lossless source coding of two correlated memoryless sources using separate encoders and a joint decoder, that is, Slepian-Wolf (S-W) coding. In our setting, the encoding and decoding are asynchronous, i.e., there is a certain relative delay between the two sources. Neither the source parameters nor the relative delay are known to the encoders and the decoder. Since we assume that both encoders implement standard random binning, which does not require such knowledge anyway, the focus of this work is on the decoder. Our main contribution is in proposing a universal decoder that is independent of the unknown source parameters and the relative delay, and at the same time is asymptotically as good as the optimal maximum a posteriori probability (MAP) decoder in the sense of the random coding error exponent achieved. Consequently, the achievable rate region is also the same as if the source parameters and the delay were known to the decoder.

preprint · 2016 · arXiv

Asymptotic MMSE Analysis Under Sparse Representation Modeling

Compressed sensing is a signal processing technique in which data is acquired directly in a compressed form. There are two modeling approaches that can be considered: the worst-case (Hamming) approach and a statistical mechanism, in which the signals are modeled as random processes rather than as individual sequences. In this paper, the second approach is studied. In particular, we consider a model of the form $\boldsymbol{Y} = \boldsymbol{H}\boldsymbol{X}+\boldsymbol{W}$, where each component of $\boldsymbol{X}$ is given by $X_i = S_iU_i$, where $\left\{U_i\right\}$ are i.i.d. Gaussian random variables, and $\left\{S_i\right\}$ are binary random variables independent of $\left\{U_i\right\}$, and not necessarily independent and identically distributed (i.i.d.), $\boldsymbol{H}\in\mathbb{R}^{k\times n}$ is a random matrix with i.i.d. entries, and $\boldsymbol{W}$ is white Gaussian noise. Using a direct relationship between optimum estimation and certain partition functions, and by invoking methods from statistical mechanics and from random matrix theory (RMT), we derive an asymptotic formula for the minimum mean-square error (MMSE) of estimating the input vector $\boldsymbol{X}$ given $\boldsymbol{Y}$ and $\boldsymbol{H}$, as $k,n\to\infty$, keeping the measurement rate, $R = k/n$, fixed. In contrast to previous derivations, which are based on the replica method, the analysis carried out in this paper is rigorous.

preprint · 2016 · arXiv

Converse Bounds on Modulation-Estimation Performance for the Gaussian Multiple-Access Channel

This paper focuses on the problem of separately modulating and jointly estimating two independent continuous-valued parameters sent over a Gaussian multiple-access channel (MAC) under the mean square error (MSE) criterion. To this end, we first improve an existing lower bound on the MSE that is obtained using the parameter modulation-estimation techniques for the single-user additive white Gaussian noise (AWGN) channel. As for the main contribution of this work, this improved modulation-estimation analysis is generalized to the model of the two-user Gaussian MAC, which will likely become an important mathematical framework for the analysis of remote sensing problems in wireless networks. We present outer bounds to the achievable region in the plane of the MSEs of the two user parameters, which provide a trade-off between the MSEs, in addition to the upper bounds on the achievable region of the MSE exponents, namely, the exponential decay rates of these MSEs in the asymptotic regime of long blocks.

preprint · 2016 · arXiv

Exact Random Coding Secrecy Exponents for the Wiretap Channel

We analyze the exact exponential decay rate of the expected amount of information leaked to the wiretapper in Wyner's wiretap channel setting using wiretap channel codes constructed from both i.i.d. and constant-composition random codes. Our analysis for those sampled from i.i.d. random coding ensemble shows that the previously-known achievable secrecy exponent using this ensemble is indeed the exact exponent for an average code in the ensemble. Furthermore, our analysis on wiretap channel codes constructed from the ensemble of constant-composition random codes leads to an exponent which, in addition to being the exact exponent for an average code, is larger than the achievable secrecy exponent that has been established so far in the literature for this ensemble (which in turn was known to be smaller than that achievable by wiretap channel codes sampled from i.i.d. random coding ensemble). We show examples where the exact secrecy exponent for the wiretap channel codes constructed from random constant-composition codes is larger than that of those constructed from i.i.d. random codes and examples where the exact secrecy exponent for the wiretap channel codes constructed from i.i.d. random codes is larger than that of those constructed from constant-composition random codes. We, hence, conclude that, unlike the error correction problem, there is no general ordering between the two random coding ensembles in terms of their secrecy exponent.

preprint · 2016 · arXiv

Lower Bounds on Parameter Modulation-Estimation Under Bandwidth Constraints

We consider the problem of modulating the value of a parameter onto a band-limited signal to be transmitted over a continuous-time, additive white Gaussian noise (AWGN) channel, and estimating this parameter at the receiver. The performance is measured by the mean power-$α$ error (MP$α$E), which is defined as the worst-case $α$-th order moment of the absolute estimation error. The optimal exponential decay rate of the MP$α$E as a function of the transmission time, is investigated. Two upper (converse) bounds on the MP$α$E exponent are derived, on the basis of known bounds for the AWGN channel of inputs with unlimited bandwidth. The bounds are computed for typical values of the error moment and the signal-to-noise ratio (SNR), and the SNR asymptotics of the different bounds are analyzed. The new bounds are compared to known converse and achievability bounds, which were derived from channel coding considerations.

preprint · 2016 · arXiv

On empirical cumulant generating functions of code lengths for individual sequences

We consider the problem of lossless compression of individual sequences using finite-state (FS) machines, from the perspective of the best achievable empirical cumulant generating function (CGF) of the code length, i.e., the normalized logarithm of the empirical average of the exponentiated code length. Since the probabilistic CGF is minimized in terms of the Rényi entropy of the source, one of the motivations of this study is to derive an individual-sequence analogue of the Rényi entropy, in the same way that the FS compressibility is the individual-sequence counterpart of the Shannon entropy. We consider the CGF of the code-length both from the perspective of fixed-to-variable (F-V) length coding and the perspective of variable-to-variable (V-V) length coding, where the latter turns out to yield a better result, that coincides with the FS compressibility. We also extend our results to compression with side information, available at both the encoder and decoder. In this case, the V-V version no longer coincides with the FS compressibility, but results in a different complexity measure.
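The quantity described in words above, the normalized logarithm of the empirical average of the exponentiated code length, can be sketched as follows. The exact normalization is an assumption here: the sketch divides by the exponent parameter `lam` and by the block length `block_len`, and uses base-2 logarithms. The function name is hypothetical.

```python
import math

def empirical_cgf(lengths, lam, block_len):
    """Assumed form: (1 / (lam * block_len)) * log2( mean_i 2**(lam * L_i) ),
    where L_i is the code length (in bits) of the i-th block and lam > 0."""
    peak = max(lengths)
    # Factor out the largest term so the powers of two cannot overflow.
    avg = sum(2.0 ** (lam * (L - peak)) for L in lengths) / len(lengths)
    return (lam * peak + math.log2(avg)) / (lam * block_len)
```

Under this normalization the value always lies between the average and the maximum per-symbol code length, moving from the former towards the latter as `lam` grows, which is why it penalizes occasional long codewords more heavily than the plain average does.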

preprint · 2016 · arXiv

Reliability of universal decoding based on vector-quantized codewords

Motivated by applications of biometric identification and content identification systems, we consider the problem of random coding for channels, where each codeword undergoes lossy compression (vector quantization), and where the decoder bases its decision only on the compressed codewords and the channel output, which is in turn, the channel's response to the transmission of an original codeword, before compression. For memoryless sources and memoryless channels with finite alphabets, we propose a new universal decoder and analyze its error exponent, which improves on an earlier result by Dasarathy and Draper (2011), who used the classic maximum mutual information (MMI) universal decoder. Further, we show that our universal decoder provides the same error exponent as that of the optimal, maximum likelihood (ML) decoder, at least as long as all single-letter transition probabilities of the channel are positive. We conjecture that the same argument remains true even without this positivity condition.

preprint · 2016 · arXiv

Universal Decoding for Source-Channel Coding with Side Information

We consider a setting of Slepian-Wolf coding, where the random bin of the source vector undergoes channel coding, and is then decoded at the receiver, based on additional side information, correlated to the source. For a given distribution of the randomly selected channel codewords, we propose a universal decoder that depends on the statistics of neither the correlated sources nor the channel, assuming first that they are both memoryless. Exact analysis of the random-binning/random-coding error exponent of this universal decoder shows that it is the same as the one achieved by the optimal maximum a-posteriori (MAP) decoder. Previously known results on universal Slepian-Wolf source decoding, universal channel decoding, and universal source-channel decoding, are all obtained as special cases of this result. Subsequently, we further generalize the results in several directions, including: (i) finite-state sources and finite-state channels, along with a universal decoding metric that is based on Lempel-Ziv parsing, (ii) arbitrary sources and channels, where the universal decoding is with respect to a given class of decoding metrics, and (iii) full (symmetric) Slepian-Wolf coding, where both source streams are separately fed into random-binning source encoders, followed by random channel encoders, which are then jointly decoded by a universal decoder.

preprint · 2016 · arXiv

Universal decoding using a noisy codebook

We consider the topic of universal decoding with a decoder that does not have direct access to the codebook, but only to noisy versions of the various randomly generated codewords, a problem motivated by biometrical identification systems. Both the source that generates the original (clean) codewords, and the channel that corrupts them in generating the noisy codewords, as well as the main channel for communicating the messages, are all modeled by non-unifilar, finite-state systems (hidden Markov models). As in previous works on universal decoding, here too, the average error probability of our proposed universal decoder is shown to be as small as that of the optimal maximum likelihood (ML) decoder, up to a multiplicative factor that is a sub-exponential function of the block length. It therefore has the same error exponent, whenever the ML decoder has a positive error exponent. The universal decoding metric is based on Lempel-Ziv (LZ) incremental parsing of each noisy codeword jointly with the given channel output vector, but this metric is somewhat different from the one proposed in earlier works on universal decoding for finite-state channels, by Ziv (1985) and by Lapidoth and Ziv (1998). The reason for the difference is that here, unlike in those earlier works, the probability distribution that governs the (noisy) codewords is, in general, not uniform across its support. This non-uniformity of the codeword distribution also makes our derivation more challenging. Another reason for the more challenging analysis is the fact that the effective induced channel between the noisy codeword of the transmitted message and the main channel output is not a finite-state channel in general.

preprint · 2015 · arXiv

A Large Deviations Approach to Secure Lossy Compression

We consider a Shannon cipher system for memoryless sources, in which distortion is allowed at the legitimate decoder. The source is compressed using a rate distortion code secured by a shared key, which satisfies a constraint on the compression rate, as well as a constraint on the exponential rate of the excess-distortion probability at the legitimate decoder. Secrecy is measured by the exponential rate of the exiguous-distortion probability at the eavesdropper, rather than by the traditional measure of equivocation. We define the perfect secrecy exponent as the maximal exiguous-distortion exponent achievable when the key rate is unlimited. Under limited key rate, we prove that the maximal achievable exiguous-distortion exponent is equal to the minimum between the average key rate and the perfect secrecy exponent, for a fairly general class of variable key rate codes.

preprint · 2015 · arXiv

Channel Detection in Coded Communication

We consider the problem of block-coded communication, where in each block, the channel law belongs to one of two disjoint sets. The decoder is aimed to decode only messages that have undergone a channel from one of the sets, and thus has to detect the set which contains the prevailing channel. We begin with the simplified case where each of the sets is a singleton. For any given code, we derive the optimum detection/decoding rule in the sense of the best trade-off among the probabilities of decoding error, false alarm, and misdetection, and also introduce sub-optimal detection/decoding rules which are simpler to implement. Then, various achievable bounds on the error exponents are derived, including the exact single-letter characterization of the random coding exponents for the optimal detector/decoder. We then extend the random coding analysis to general sets of channels, and show that there exists a universal detector/decoder which performs asymptotically as well as the optimal detector/decoder, when tuned to detect a channel from a specific pair of channels. The case of a pair of binary symmetric channels is discussed in detail.

preprint · 2015 · arXiv

Comments on "Identifying Functional Thermodynamics in Autonomous Maxwellian Ratchets" (arXiv:1507.01537v2)

We make a few comments on some misleading statements in the above paper.

preprint · 2015 · arXiv

Sequence complexity and work extraction

We consider a simplified version of a solvable model by Mandal and Jarzynski, which constructively demonstrates the interplay between work extraction and the increase of the Shannon entropy of an information reservoir which is in contact with the physical system. We extend Mandal and Jarzynski's main findings in several directions: First, we allow sequences of correlated bits rather than just independent bits. Secondly, at least for the case of binary information, we show that, in fact, the Shannon entropy is only one measure of complexity of the information that must increase in order for work to be extracted. The extracted work can also be upper bounded in terms of the increase in other quantities that measure complexity, like the predictability of future bits from past ones. Third, we provide an extension to the case of non-binary information (i.e., a larger alphabet), and finally, we extend the scope to the case where the incoming bits (before the interaction) form an individual sequence, rather than a random one. In this case, the entropy before the interaction can be replaced by the Lempel-Ziv (LZ) complexity of the incoming sequence, a fact that gives rise to an entropic meaning of the LZ complexity, not only in information theory, but also in physics.
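The LZ complexity referred to above is commonly tied to the number of phrases produced by Lempel-Ziv incremental parsing, in which each new phrase is the shortest prefix of the remaining sequence that has not appeared as a phrase before. The sketch below assumes that LZ78-style parsing and the usual normalized phrase-count estimate c·log2(c)/n; the function names are illustrative, and the paper's precise definition may differ in detail.

```python
import math

def lz_incremental_parse(seq):
    """Split `seq` into phrases, each being the shortest string
    not previously seen as a phrase (LZ78-style parsing)."""
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # The tail may repeat an earlier phrase; it is kept as a final phrase.
        phrases.append(current)
    return phrases

def lz_complexity_estimate(seq):
    """Phrase-count estimate c * log2(c) / n, in bits per symbol."""
    c = len(lz_incremental_parse(seq))
    return c * math.log2(c) / len(seq)
```

For example, `"aababcabcd"` parses into `a`, `ab`, `abc`, `abcd`. Highly repetitive sequences produce few, long phrases and hence a small estimate.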

preprint2015arXiv

The generalized likelihood decoder: random coding and expurgated bounds

The likelihood decoder is a stochastic decoder that selects the decoded message at random, using the posterior distribution of the true underlying message given the channel output. In this work, we study a generalized version of this decoder where the posterior is proportional to a general function that depends only on the joint empirical distribution of the output vector and the codeword. This framework allows both mismatched versions and universal (MMI) versions of the likelihood decoder, as well as the corresponding ordinary deterministic decoders, among many others. We provide a direct analysis method that yields the exact random coding exponent (as opposed to separate upper bounds and lower bounds that turn out to be compatible, which were derived earlier by Scarlett et al.). We also extend the result from pure channel coding to combined source and channel coding (random binning followed by random channel coding) with side information available to the decoder. Finally, returning to pure channel coding, we also derive an expurgated exponent for the stochastic likelihood decoder, which turns out to be at least as tight (and in some cases, strictly so) as the classical expurgated exponent of the maximum likelihood decoder, even though the stochastic likelihood decoder is suboptimal.
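A minimal sketch of the ordinary (matched) likelihood decoder described above, assuming a binary symmetric channel and equiprobable messages. The function names, the toy codebook and the crossover probability are illustrative assumptions, not taken from the paper; the generalized decoder would replace the log-likelihood with another function of the joint empirical distribution.

```python
import math
import random


def bsc_log_likelihood(codeword, received, crossover):
    """Log-probability of `received` given `codeword` over a BSC."""
    flips = sum(c != r for c, r in zip(codeword, received))
    n = len(codeword)
    return flips * math.log(crossover) + (n - flips) * math.log(1.0 - crossover)


def likelihood_decoder_posterior(codebook, received, crossover):
    """Posterior over message indices, assuming equiprobable messages."""
    scores = [bsc_log_likelihood(c, received, crossover) for c in codebook]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def likelihood_decode(codebook, received, crossover, rng=random):
    """Draw the decoded message index at random from the posterior."""
    posterior = likelihood_decoder_posterior(codebook, received, crossover)
    return rng.choices(range(len(codebook)), weights=posterior, k=1)[0]


# Toy codebook and channel output (assumed for illustration only).
codebook = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0)]
posterior = likelihood_decoder_posterior(codebook, (0, 0, 0, 1), 0.1)
decoded = likelihood_decode(codebook, (0, 0, 0, 1), 0.1, random.Random(0))
```

Unlike a maximum likelihood decoder, which would always return the index with the largest posterior, this decoder returns each index with probability equal to its posterior weight.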

preprint2014arXiv

Data Processing Bounds for Scalar Lossy Source Codes with Side Information at the Decoder

In this paper, we introduce new lower bounds on the distortion of scalar fixed-rate codes for lossy compression with side information available at the receiver. These bounds are derived by presenting the relevant random variables as a Markov chain and applying generalized data processing inequalities a la Ziv and Zakai. We show that by replacing the logarithmic function with other functions, in the data processing theorem we formulate, we obtain new lower bounds on the distortion of scalar coding with side information at the decoder. The usefulness of these results is demonstrated for uniform sources and the convex function $Q(t)=t^{1-α}$, $α>1$. The bounds in this case are shown to be better than one can obtain from the Wyner-Ziv rate-distortion function.

preprint2014arXiv

Exact correct-decoding exponent of the wiretap channel decoder

The security level of the achievability scheme for Wyner's wiretap channel model is examined from the perspective of the probability of correct decoding, $P_c$, at the wiretap channel decoder. In particular, for finite-alphabet memoryless channels, the exact random coding exponent of $P_c$ is derived as a function of the total coding rate $R_1$ and the rate of each sub-code $R_2$. Two different representations are given for this function and its basic properties are provided. We also characterize the region of pairs of rates $(R_1,R_2)$ of full security in the sense of the random coding exponent of $P_c$, in other words, the region where the exponent of this achievability scheme is the same as that of blind guessing at the eavesdropper side. Finally, an analogous derivation of the correct-decoding exponent is outlined for the case of the Gaussian channel.

preprint2014arXiv

Exact random coding error exponents of optimal bin index decoding

We consider ensembles of channel codes that are partitioned into bins, and focus on analysis of exact random coding error exponents associated with optimum decoding of the index of the bin to which the transmitted codeword belongs. Two main conclusions arise from this analysis: (i) for independent random selection of codewords within a given type class, the random coding exponent of optimal bin index decoding is given by the ordinary random coding exponent function, computed at the rate of the entire code, independently of the exponential rate of the size of the bin. (ii) for this ensemble of codes, sub-optimal bin index decoding, that is based on ordinary maximum likelihood (ML) decoding, is as good as the optimal bin index decoding in terms of the random coding error exponent achieved. Finally, for the sake of completeness, we also outline how our analysis of exact random coding exponents extends to the hierarchical ensemble that corresponds to superposition coding and optimal decoding, where for each bin, first, a cloud center is drawn at random, and then the codewords of this bin are drawn conditionally independently given the cloud center. For this ensemble, conclusions (i) and (ii), mentioned above, no longer necessarily hold in general.

preprint2014arXiv

Expurgated Random-Coding Ensembles: Exponents, Refinements and Connections

This paper studies expurgated random-coding bounds and exponents for channel coding with a given (possibly suboptimal) decoding rule. Variations of Gallager's analysis are presented, yielding several asymptotic and non-asymptotic bounds on the error probability for an arbitrary codeword distribution. A simple non-asymptotic bound is shown to attain an exponent of Csiszár and Körner under constant-composition coding. Using Lagrange duality, this exponent is expressed in several forms, one of which is shown to permit a direct derivation via cost-constrained coding which extends to infinite and continuous alphabets. The method of type class enumeration is studied, and it is shown that this approach can yield improved exponents and better tightness guarantees for some codeword distributions. A generalization of this approach is shown to provide a multi-letter exponent which extends immediately to channels with memory. Finally, a refined analysis of expurgated i.i.d. random coding is shown to yield an $O\big(\frac{1}{\sqrt{n}}\big)$ prefactor, thus improving on the standard $O(1)$ prefactor. Moreover, the implied constant is explicitly characterized.

preprint2014arXiv

Information-theoretic applications of the logarithmic probability comparison bound

A well-known technique in estimating probabilities of rare events in general and in information theory in particular (used, e.g., in the sphere-packing bound), is that of finding a reference probability measure under which the event of interest has probability of order one and estimating the probability in question by means of the Kullback-Leibler divergence. A method has recently been proposed in [2], that can be viewed as an extension of this idea in which the probability under the reference measure may itself be decaying exponentially, and the Renyi divergence is used instead. The purpose of this paper is to demonstrate the usefulness of this approach in various information-theoretic settings. For the problem of channel coding, we provide a general methodology for obtaining matched, mismatched and robust error exponent bounds, as well as new results in a variety of particular channel models. Other applications we address include rate-distortion coding and the problem of guessing.

preprint2014arXiv

On Compressive Sensing in Coding Problems: A Rigorous Approach

We take an information-theoretic perspective on a classical sparse-sampling noisy linear model and present an analytical expression for the mutual information, which plays a central role in a variety of communications/processing problems. Such an expression was addressed previously either by bounds, by simulations, or by the (non-rigorous) replica method. The expression of the mutual information is based on techniques used in [1], addressing the minimum mean square error (MMSE) analysis. Using these expressions, we study specifically a variety of sparse linear communications models which include coding in different settings, accounting also for multiple access channels and different wiretap problems. For those, we provide single-letter expressions and derive achievable rates, capturing the communications/processing features of these timely models.

preprint2014arXiv

On zero-rate error exponents of finite-state channels with input-dependent states

We derive a single-letter formula for the zero-rate reliability (error exponent) of a finite-state channel whose state variable depends deterministically (and recursively) on past channel inputs, where the code complies with a given channel input constraint. Special attention is then devoted to the important special case of the Gaussian channel with inter-symbol interference (ISI), where more explicit results are obtained.

preprint2014arXiv

Optimum Trade-offs Between the Error Exponent and the Excess-Rate Exponent of Variable-Rate Slepian-Wolf Coding

We analyze the optimal trade-off between the error exponent and the excess-rate exponent for variable-rate Slepian-Wolf codes. In particular, we first derive upper (converse) bounds on the optimal error and excess-rate exponents, and then lower (achievable) bounds, via a simple class of variable-rate codes which assign the same rate to all source blocks of the same type class. Then, using the exponent bounds, we derive bounds on the optimal rate functions, namely, the minimal rate assigned to each type class, needed in order to achieve a given target error exponent. The resulting excess-rate exponent is then evaluated. Iterative algorithms are provided for the computation of both bounds on the optimal rate functions and their excess-rate exponents. The resulting Slepian-Wolf codes bridge between the two extremes of fixed-rate coding, which has minimal error exponent and maximal excess-rate exponent, and average-rate coding, which has maximal error exponent and minimal excess-rate exponent.

preprint2014arXiv

Simplified Erasure/List Decoding

We consider the problem of erasure/list decoding using certain classes of simplified decoders. Specifically, we assume a class of erasure/list decoders, such that a codeword is in the list if its likelihood is larger than a threshold. This class of decoders both approximates the optimal decoder of Forney, and also includes the following simplified subclasses of decoding rules: The first is a function of the output vector only, but not the codebook (which is most suitable for high rates), and the second is a scaled version of the maximum likelihood decoder (which is most suitable for low rates). We provide single-letter expressions for the exact random coding exponents of any decoder in these classes, operating over a discrete memoryless channel. For each class of decoders, we find the optimal decoder within the class, in the sense that it maximizes the erasure/list exponent, under a given constraint on the error exponent. We establish the optimality of the simplified decoders of the first and second kind for low and high rates, respectively.
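The basic rule described above, placing a codeword in the list when its likelihood exceeds a threshold, can be sketched as follows. This assumes a binary symmetric channel and a fixed log-domain threshold supplied by the caller; the function names and the toy values are hypothetical, and the paper's thresholds for the simplified subclasses are not reproduced here.

```python
import math


def bsc_log_likelihood(codeword, received, crossover):
    """Log-probability of `received` given `codeword` over a BSC."""
    flips = sum(c != r for c, r in zip(codeword, received))
    n = len(codeword)
    return flips * math.log(crossover) + (n - flips) * math.log(1.0 - crossover)


def threshold_list_decode(codebook, received, crossover, log_threshold):
    """Return indices of codewords whose log-likelihood exceeds the threshold.

    An empty list plays the role of an erasure; a list with more than
    one index corresponds to list decoding.
    """
    return [
        index
        for index, codeword in enumerate(codebook)
        if bsc_log_likelihood(codeword, received, crossover) > log_threshold
    ]


# Toy codebook and channel output (assumed for illustration only).
codebook = [(0, 0, 0, 0), (1, 1, 1, 1)]
received = (0, 0, 0, 1)
single = threshold_list_decode(codebook, received, 0.1, -3.0)
erasure = threshold_list_decode(codebook, received, 0.1, -1.0)
both = threshold_list_decode(codebook, received, 0.1, -10.0)
```

Raising the threshold moves the decoder toward erasures, and lowering it moves the decoder toward longer lists.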

preprint2014arXiv

Statistical physics of random binning

We consider the model of random binning and finite-temperature decoding for Slepian-Wolf codes, from a statistical-mechanical perspective. While ordinary random channel coding is intimately related to the random energy model (REM), a statistical-mechanical model of disordered magnetic materials, it turns out that random binning (for Slepian-Wolf coding) is analogous to another, related statistical-mechanical model of strong disorder, which we call the random dilution model (RDM). We use the latter analogy to characterize phase transitions pertaining to finite-temperature Slepian-Wolf decoding, which are somewhat similar, but not identical, to those of finite-temperature channel decoding. We then provide the exact random coding exponent of the bit error rate (BER) as a function of the coding rate and the decoding temperature, and discuss its properties. Finally, a few modifications and extensions of our results are outlined and discussed.

preprint2014arXiv

Universal Decoding for Gaussian Intersymbol Interference Channels

A universal decoding procedure is proposed for the intersymbol interference (ISI) Gaussian channels. The universality of the proposed decoder is in the sense of being independent of the various channel parameters, and at the same time, attaining the same random coding error exponent as the optimal maximum-likelihood (ML) decoder, which utilizes full knowledge of these unknown parameters. The proposed decoding rule can be regarded as a frequency domain version of the universal maximum mutual information (MMI) decoder. Contrary to previously suggested universal decoders for ISI channels, our proposed decoding metric can easily be evaluated.

preprint2014arXiv

Universal Quantization for Separate Encodings and Joint Decoding of Correlated Sources

We consider the multi-user lossy source-coding problem for continuous alphabet sources. In a previous work, Ziv proposed a single-user universal coding scheme which uses uniform quantization with dither, followed by a lossless source encoder (entropy coder). In this paper, we generalize Ziv's scheme to the multi-user setting. For this generalized universal scheme, upper bounds are derived on the redundancies, defined as the differences between the actual rates and the closest corresponding rates on the boundary of the rate region. It is shown that this scheme can achieve redundancies of no more than 0.754 bits per sample for each user. These bounds are obtained without knowledge of the multi-user rate region, which is an open problem in general. As a direct consequence of these results, inner and outer bounds on the rate-distortion achievable region are obtained.
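A minimal sketch of uniform quantization with dither, the first stage of the single-user scheme mentioned above. It assumes subtractive dither, that is, a dither value known to both encoder and decoder and subtracted at reconstruction; the function names, step size and test source are illustrative assumptions, and the lossless entropy-coding stage is omitted.

```python
import math
import random


def dithered_quantize(sample, step, dither):
    """Index of the nearest uniform-quantizer level for sample + dither."""
    return math.floor((sample + dither) / step + 0.5)


def dithered_reconstruct(index, step, dither):
    """Map the index back to a level and subtract the shared dither."""
    return index * step - dither


# Assumed setup: Gaussian samples, dither uniform over one quantizer cell.
rng = random.Random(0)
step = 0.5
samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]
dithers = [rng.uniform(-step / 2, step / 2) for _ in samples]
errors = [
    dithered_reconstruct(dithered_quantize(x, step, d), step, d) - x
    for x, d in zip(samples, dithers)
]
max_error = max(abs(e) for e in errors)
```

By construction the reconstruction error never exceeds half the step size, regardless of the source distribution, which is part of what makes this kind of scheme attractive for universal coding.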

preprint2013arXiv

Analysis of Mismatched Estimation Errors Using Gradients of Partition Functions

We consider the problem of signal estimation (denoising) from a statistical-mechanical perspective, in continuation to a recent work on the analysis of mean-square error (MSE) estimation using a direct relationship between optimum estimation and certain partition functions. The paper consists of essentially two parts. In the first part, using the aforementioned relationship, we derive single-letter expressions of the mismatched MSE of a codeword (from a randomly selected code), corrupted by a Gaussian vector channel. In the second part, we provide several examples to demonstrate phase transitions in the behavior of the MSE. These examples enable us to understand more deeply and to gather intuition regarding the roles of the real and the mismatched probability measures in creating these phase transitions.

preprint2013arXiv

Another look at expurgated bounds and their statistical-mechanical interpretation

We revisit the derivation of expurgated error exponents using a method of type class enumeration, which is inspired by statistical-mechanical methods, and which has already been used in the derivation of random coding exponents in several other scenarios. We compare our version of the expurgated bound to both the one by Gallager and the one by Csiszár, Körner and Marton (CKM). For expurgated ensembles of fixed composition codes over finite alphabets, our basic expurgated bound coincides with the CKM expurgated bound, which is in general tighter than Gallager's bound, but with equality for the optimum type class of codewords. Our method, however, extends beyond fixed composition codes and beyond finite alphabets, where it is natural to impose input constraints (e.g., power limitation). In such cases, the CKM expurgated bound may not apply directly, and our bound is in general tighter than Gallager's bound. In addition, while both the CKM and the Gallager expurgated bounds are based on the Bhattacharyya bound for bounding the pairwise error probabilities, our bound allows the more general Chernoff distance measure, thus giving rise to additional improvement using the Chernoff parameter as a degree of freedom to be optimized.

preprint2013arXiv

Asymptotically optimal decision rules for joint detection and source coding

The problem of joint detection and lossless source coding is considered. We derive asymptotically optimal decision rules for deciding whether or not a sequence of observations has emerged from a desired information source, and to compress it if it has. In particular, our decision rules asymptotically minimize the cost of compression in the case that the data has been classified as 'desirable', subject to given constraints on the two kinds of the probability of error. In another version of this performance criterion, the constraint on the false alarm probability is replaced by a constraint on the cost of compression in the false alarm event. We then analyze the asymptotic performance of these decision rules and demonstrate that they may exhibit certain phase transitions. We also derive universal decision rules for the case where the underlying sources (under either hypothesis or both) are unknown, and training sequences from each source may or may not be available. Finally, we discuss how our framework can be extended in several directions.

preprint2013arXiv

Codeword or noise? Exact random coding exponents for slotted asynchronism

We consider the problem of slotted asynchronous coded communication, where in each time frame (slot), the transmitter is either silent or transmits a codeword from a given (randomly selected) codebook. The task of the decoder is to decide whether transmission has taken place, and if so, to decode the message. We derive the optimum detection/decoding rule in the sense of the best trade-off among the probabilities of decoding error, false alarm, and misdetection. For this detection/decoding rule, we then derive single-letter characterizations of the exact exponential rates of these three probabilities for the average code in the ensemble.

preprint2013arXiv

Erasure/list exponents for Slepian-Wolf decoding

We analyze random coding error exponents associated with erasure/list Slepian-Wolf decoding using two different methods and then compare the resulting bounds. The first method follows the well known techniques of Gallager and Forney and the second method is based on a technique of distance enumeration, or more generally, type class enumeration, which is rooted in the statistical mechanics of a disordered system that is related to the random energy model (REM). The second method is guaranteed to yield exponent functions which are at least as tight as those of the first method, and it is demonstrated that for certain combinations of coding rates and thresholds, the bounds of the second method are strictly tighter than those of the first method, by an arbitrarily large factor. In fact, the second method may even yield an infinite exponent at regions where the first method gives finite values. We also discuss the option of variable-rate Slepian-Wolf encoding and demonstrate how it can improve on the resulting exponents.

preprint2013arXiv

List decoding - random coding exponents and expurgated exponents

Some new results are derived concerning random coding error exponents and expurgated exponents for list decoding with a deterministic list size $L$. Two asymptotic regimes are considered, the fixed list-size regime, where $L$ is fixed independently of the block length $n$, and the exponential list-size regime, where $L$ grows exponentially with $n$. We first derive a general upper bound on the list-decoding average error probability, which is suitable for both regimes. This bound leads to more specific bounds in the two regimes. In the fixed list-size regime, the bound is related to known bounds and we establish its exponential tightness. In the exponential list-size regime, we establish the achievability of the well known sphere packing lower bound. Relations to guessing exponents are also provided. An immediate byproduct of our analysis in both regimes is the universality of the maximum mutual information (MMI) list decoder in the error exponent sense. Finally, we consider expurgated bounds at low rates, both using Gallager's approach and the Csiszár-Körner-Marton approach, which is, in general, better (at least for $L=1$). The latter expurgated bound, which involves the notion of multi-information, is also modified to apply to continuous alphabet channels, and in particular, to the Gaussian memoryless channel, where the expression of the expurgated bound becomes quite explicit.

preprint2013arXiv

On the data processing theorem in the semi-deterministic setting

Data processing lower bounds on the expected distortion are derived in the finite-alphabet semi-deterministic setting, where the source produces a deterministic, individual sequence, but the channel model is probabilistic, and the decoder is subjected to various kinds of limitations, e.g., decoders implementable by finite-state machines, with or without counters, and with or without a restriction of common reconstruction with high probability. Some of our bounds are given in terms of the Lempel-Ziv complexity of the source sequence or the reproduction sequence. We also demonstrate how some analogous results can be obtained for classes of linear encoders and linear decoders in the continuous alphabet case.

preprint2013arXiv

Statistical Physics: a Short Course for Electrical Engineering Students

This is a set of lecture notes of a course on statistical physics and thermodynamics, which is oriented, to a certain extent, towards electrical engineering students. The main body of the lectures is devoted to statistical physics, whereas much less emphasis is given to the thermodynamics part. In particular, the idea is to let the most important results of thermodynamics (most notably, the laws of thermodynamics) be obtained as conclusions from the derivations in statistical physics. Beyond the variety of central topics in statistical physics that are important to the general scientific education of the EE student, special emphasis is devoted to subjects that are vital to the engineering education concretely. These include, first of all, quantum statistics, like the Fermi-Dirac distribution, as well as diffusion processes, which are both fundamental for deep understanding of semiconductor devices. Another important issue for the EE student is to understand mechanisms of noise generation and stochastic dynamics in physical systems, most notably, in electric circuitry. Accordingly, the fluctuation-dissipation theorem of statistical mechanics, which is the theoretical basis for understanding thermal noise processes in systems, is presented from a signals-and-systems point of view, in a way that would hopefully be understandable and useful for an engineering student, and well connected to other courses in the electrical engineering curriculum like courses on random processes. The quantum regime, in this context, is important too and hence provided as well. Finally, we touch very briefly upon some relationships between statistical mechanics and information theory, which is the theoretical basis for communications engineering, and demonstrate how the statistical-mechanical approach can be useful in the study of information-theoretic problems.

preprint2013arXiv

Zero-Delay and Causal Secure Source Coding

We investigate the combination between causal/zero-delay source coding and information-theoretic secrecy. Two source coding models with secrecy constraints are considered. We start by considering zero-delay perfectly secret lossless transmission of a memoryless source. We derive bounds on the key rate and coding rate needed for perfect zero-delay secrecy. In this setting, we consider two models which differ by the ability of the eavesdropper to parse the bit-stream passing from the encoder to the legitimate decoder into separate messages. We also consider causal source coding with a fidelity criterion and side information at the decoder and the eavesdropper. Unlike the zero-delay setting where variable-length coding is traditionally used but might leak information on the source through the length of the codewords, in this setting, since delay is allowed, block coding is possible. We show that in this setting, separation of encryption and causal source coding is optimal.

preprint2013arXiv

Zero-Delay and Causal Single-User and Multi-User Lossy Source Coding with Decoder Side Information

We consider zero-delay single-user and multi-user source coding with average distortion constraint and decoder side information. The zero-delay constraint translates into causal (sequential) encoder and decoder pairs as well as the use of instantaneous codes. For the single-user setting, we show that optimal performance is attained by time sharing at most two scalar encoder-decoder pairs that use zero-error side information codes. Side information lookahead is shown to be useless in this setting. We show that the restriction to causal encoding functions is the one that causes the performance degradation, compared to unrestricted systems, and not the sequential decoders or instantaneous codes. Furthermore, we show that even without delay constraints, if either the encoder or the decoder is restricted a priori to be scalar, the performance loss cannot be compensated by the other component, which can be scalar as well without further loss. Finally, we show that the multi-terminal source coding problem can be solved in the zero-delay regime and the rate-distortion region is given.

preprint2012arXiv

Average redundancy of the Shannon code for Markov sources

It is known that for memoryless sources, the average and maximal redundancy of fixed-to-variable length codes, such as the Shannon and Huffman codes, exhibit two modes of behavior for long blocks. It either converges to a limit or it has an oscillatory pattern, depending on the irrationality or rationality, respectively, of certain parameters that depend on the source. In this paper, we extend these findings, concerning the Shannon code, to the case of a Markov source, which is considerably more involved. While this dichotomy, of convergent vs. oscillatory behavior, is well known in other contexts (including renewal theory, ergodic theory, local limit theorems and large deviations of discrete distributions), in information theory (e.g., in redundancy analysis) it was recognized relatively recently. To the best of our knowledge, no results of this type were reported thus far for Markov sources. We provide a precise characterization of the convergent vs. oscillatory behavior of the Shannon code redundancy for a class of irreducible, periodic and aperiodic, Markov sources. These findings are obtained by analytic methods, such as Fourier/Fejer series analysis and spectral analysis of matrices.
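For the memoryless case mentioned at the start of the abstract, the redundancy of the Shannon code can be computed directly for small block lengths. The sketch below assumes a binary memoryless source with symbol probability `p` and measures redundancy per block in bits; the function name is hypothetical, and the Markov-source analysis of the paper is not reproduced.

```python
import math


def shannon_code_redundancy(p, n):
    """Average redundancy (bits per block) of the Shannon code.

    The source is binary and memoryless with P(1) = p, 0 < p < 1.
    A block with probability P gets a codeword of length
    ceil(-log2 P), so its redundancy is ceil(-log2 P) + log2 P.
    """
    total = 0.0
    for k in range(n + 1):
        # All blocks with k ones share the same probability.
        log_prob = k * math.log2(p) + (n - k) * math.log2(1.0 - p)
        length = math.ceil(-log_prob)
        total += math.comb(n, k) * 2.0 ** log_prob * (length + log_prob)
    return total


# Redundancy as a function of block length for an assumed p = 0.3.
redundancy_by_length = [shannon_code_redundancy(0.3, n) for n in range(1, 21)]
```

Plotting `redundancy_by_length` against the block length for different values of `p` is one way to observe the convergent versus oscillatory behavior that the abstract refers to.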

preprint2012arXiv

Exponential error bounds on parameter modulation-estimation for discrete memoryless channels

We consider the problem of modulation and estimation of a random parameter $U$ to be conveyed across a discrete memoryless channel. Upper and lower bounds are derived for the best achievable exponential decay rate of a general moment of the estimation error, $\mathbb{E}|\hat{U}-U|^ρ$, $ρ\ge 0$, when both the modulator and the estimator are subjected to optimization. These exponential error bounds turn out to be intimately related to error exponents of channel coding and to channel capacity. While in general, there is some gap between the upper and the lower bound, they asymptotically coincide both for very small and for very large values of the moment power $ρ$. This means that our achievability scheme, which is based on simple quantization of $U$ followed by channel coding, is nearly optimum in both limits. Some additional properties of the bounds are discussed and demonstrated, and finally, an extension to the case of a multidimensional parameter vector is outlined, with the principal conclusion that our upper and lower bounds asymptotically coincide also for a high dimensionality.

preprint2012arXiv

On optimum parameter modulation-estimation from a large deviations perspective

We consider the problem of jointly optimum modulation and estimation of a real-valued random parameter, conveyed over an additive white Gaussian noise (AWGN) channel, where the performance metric is the large deviations behavior of the estimator, namely, the exponential decay rate (as a function of the observation time) of the probability that the estimation error would exceed a certain threshold. Our basic result is in providing an exact characterization of the fastest achievable exponential decay rate, among all possible modulator-estimator (transmitter-receiver) pairs, where the modulator is limited only in the signal power, but not in bandwidth. This exponential rate turns out to be given by the reliability function of the AWGN channel. We also discuss several ways to achieve this optimum performance, and one of them is based on quantization of the parameter, followed by optimum channel coding and modulation, which gives rise to a separation-based transmitter, if one views this setting from the perspective of joint source-channel coding. This is in spite of the fact that, in general, when error exponents are considered, the source-channel separation theorem does not hold true. We also discuss several observations, modifications and extensions of this result in several directions, including other channels, and the case of multidimensional parameter vectors. One of our findings concerning the latter, is that there is an abrupt threshold effect in the dimensionality of the parameter vector: below a certain critical dimension, the probability of excess estimation error may still decay exponentially, but beyond this value, it must converge to unity.

preprint2012arXiv

On Real-Time and Causal Secure Source Coding

We investigate two source coding problems with secrecy constraints. In the first problem we consider real-time fully secure transmission of a memoryless source. We show that although classical variable-rate coding is not an option since the lengths of the codewords leak information on the source, the key rate can be as low as the average Huffman codeword length of the source. In the second problem we consider causal source coding with a fidelity criterion and side information at the decoder and the eavesdropper. We show that when the eavesdropper has degraded side information, it is optimal to first use a causal rate distortion code and then encrypt its output with a key.

preprint2012arXiv

Perfectly secure encryption of individual sequences

In analogy to the well-known notion of finite-state compressibility of individual sequences, due to Lempel and Ziv, we define a similar notion of "finite-state encryptability" of an individual plaintext sequence, as the minimum asymptotic key rate that must be consumed by finite-state encrypters so as to guarantee perfect secrecy in a well-defined sense. Our main basic result is that the finite-state encryptability is equal to the finite-state compressibility for every individual sequence. This parallels Shannon's classical probabilistic counterpart result, asserting that the minimum required key rate is equal to the entropy rate of the source. However, the redundancy, defined as the gap between the upper bound (direct part) and the lower bound (converse part) in the encryption problem, turns out to decay at a different rate (in fact, much slower) than the analogous redundancy associated with the compression problem. We also extend our main theorem in several directions, allowing: (i) availability of side information (SI) at the encrypter/decrypter/eavesdropper, (ii) lossy reconstruction at the decrypter, and (iii) the combination of both lossy reconstruction and SI, in the spirit of the Wyner-Ziv problem.

preprint2012arXiv

Universal decoding for arbitrary channels relative to a given class of decoding metrics

We consider the problem of universal decoding for arbitrary unknown channels in the random coding regime. For a given random coding distribution and a given class of metric decoders, we propose a generic universal decoder whose average error probability is, within a sub-exponential multiplicative factor, no larger than that of the best decoder within this class of decoders. Since the optimum, maximum likelihood (ML) decoder of the underlying channel is not necessarily assumed to belong to the given class of decoders, this setting suggests a common generalized framework for: (i) mismatched decoding, (ii) universal decoding for a given family of channels, and (iii) universal coding and decoding for deterministic channels using the individual-sequence approach. The proof of our universality result is fairly simple, and it is demonstrated how some earlier results on universal decoding are obtained as special cases. We also demonstrate how our method extends to more complicated scenarios, like incorporation of noiseless feedback, and the multiple access channel.

preprint2011arXiv

Data processing inequalities based on a certain structured class of information measures with application to estimation theory

We study data processing inequalities that are derived from a certain class of generalized information measures, where a series of convex functions and multiplicative likelihood ratios are nested alternately. While these information measures can be viewed as a special case of the most general Zakai-Ziv generalized information measure, this special nested structure calls for attention and motivates our study. Specifically, a certain choice of the convex functions leads to an information measure that extends the notion of the Bhattacharyya distance (or the Chernoff divergence): While the ordinary Bhattacharyya distance is based on the (weighted) geometric mean of two replicas of the channel's conditional distribution, the more general information measure allows an arbitrary number of such replicas. We apply the data processing inequality induced by this information measure to a detailed study of lower bounds of parameter estimation under additive white Gaussian noise (AWGN) and show that in certain cases, tighter bounds can be obtained by using more than two replicas. While the resulting lower bound may not compete favorably with the best bounds available for the ordinary AWGN channel, the advantage of the new lower bound, relative to the other bounds, becomes significant in the presence of channel uncertainty, like unknown fading. This different behavior in the presence of channel uncertainty is explained by the convexity property of the information measure.

preprint2011arXiv

On optimum strategies for minimizing the exponential moments of a given cost function

We consider a general problem of finding a strategy that minimizes the exponential moment of a given cost function, with an emphasis on its relation to the more common criterion of minimizing the expectation (the first moment) of the same cost function. In particular, our main result is a theorem that gives simple sufficient conditions for a strategy to be optimum in the exponential moment sense. This theorem may be useful in various situations, and application examples are given. We also examine the asymptotic regime and investigate universal asymptotically optimum strategies in light of the aforementioned sufficient conditions, as well as phenomena of irregularities, or phase transitions, in the behavior of the asymptotic performance, which can be viewed and understood from a statistical-mechanical perspective. Finally, we propose a new route for deriving lower bounds on exponential moments of certain cost functions (like the square error in estimation problems) on the basis of well known lower bounds on their expectations.

preprint2011arXiv

Relations between redundancy patterns of the Shannon code and wave diffraction patterns of partially disordered media

The average redundancy of the Shannon code, $R_n$, as a function of the block length $n$, is known to exhibit two very different types of behavior, depending on the rationality or irrationality of certain parameters of the source: It either converges to 1/2 as $n$ grows without bound, or it may have a non-vanishing, oscillatory, (quasi-) periodic pattern around the value 1/2 for all large $n$. In this paper, we make an attempt to shed some light on this erratic behavior of $R_n$, by drawing an analogy with the realm of physics of wave propagation, in particular, the elementary theory of scattering and diffraction. It turns out that there are two types of behavior of wave diffraction patterns formed by crystals, which are correspondingly analogous to the two types of patterns of $R_n$. When the crystal is perfect, the diffraction intensity spectrum exhibits very sharp peaks, a.k.a. Bragg peaks, at wavelengths of full constructive interference. These wavelengths correspond to the frequencies of the harmonic waves of the oscillatory mode of $R_n$. On the other hand, when the crystal is imperfect and there is a considerable degree of disorder in its structure, the Bragg peaks disappear, and the behavior of this mode is analogous to the one where $R_n$ is convergent.
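To make $R_n$ concrete, here is a minimal sketch that evaluates the average redundancy of the Shannon code for one simple case, assuming a binary memoryless source with P(1) = p and codeword lengths ceil(-log2 P(x)). The function name and the choice of source are assumptions made for illustration; the abstract does not restrict itself to this source.

```python
import math

def shannon_code_redundancy(p: float, n: int) -> float:
    """Average redundancy R_n (bits) of the Shannon code for length-n blocks
    of a binary memoryless source with P(1) = p, where 0 < p < 1."""
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    expected_length = 0.0
    for k in range(n + 1):
        # log2-probability of any single sequence containing k ones
        log2_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        # log2 of the binomial coefficient C(n, k), via lgamma to avoid overflow
        log2_count = (math.lgamma(n + 1) - math.lgamma(k + 1)
                      - math.lgamma(n - k + 1)) / math.log(2)
        # probability that the block contains exactly k ones
        weight = 2.0 ** (log2_count + log2_prob)
        # Shannon code length for each such sequence
        expected_length += weight * math.ceil(-log2_prob)
    return expected_length - n * entropy
```

Tabulating `shannon_code_redundancy(p, n)` over a range of `n` for different values of `p` gives a direct way to look at the patterns of $R_n$ that the abstract describes.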

preprint2011arXiv

Structure Theorems for Real-Time Variable-Rate Coding With and Without Side Information

The output of a discrete Markov source is to be encoded instantaneously by a variable-rate encoder and decoded by a finite-state decoder. Our performance measure is a linear combination of the distortion and the instantaneous rate. Structure theorems pertaining to the encoder and next-state functions are derived for every given finite-state decoder, which can have access to side information.

preprint2011arXiv

Subset sum phase transitions and data compression

We propose a rigorous analysis approach for the subset sum problem in the context of lossless data compression, where the phase transition of the subset sum problem is directly related to the passage between ambiguous and non-ambiguous decompression, for a compression scheme that is based on specifying the sequence composition. The proposed analysis lends itself to straightforward extensions in several directions of interest, including non-binary alphabets, incorporation of side information at the decoder (Slepian-Wolf coding), and coding schemes based on multiple subset sums. It is also demonstrated that the proposed technique can be used to analyze the critical behavior in a more involved situation where the sequence composition is not specified by the encoder.

preprint2010arXiv

A statistical-mechanical view on source coding: physical compression and data compression

We draw a certain analogy between the classical information-theoretic problem of lossy data compression (source coding) of memoryless information sources and the statistical mechanical behavior of a certain model of a chain of connected particles (e.g., a polymer) that is subjected to a contracting force. The free energy difference pertaining to such a contraction turns out to be proportional to the rate-distortion function in the analogous data compression model, and the contracting force is proportional to the derivative of this function. Beyond the fact that this analogy may be interesting in its own right, it may provide a physical perspective on the behavior of optimum schemes for lossy data compression (and perhaps also, an information-theoretic perspective on certain physical system models). Moreover, it triggers the derivation of lossy compression performance for systems with memory, using analysis tools and insights from statistical mechanics.

preprint2010arXiv

Data processing theorems and the second law of thermodynamics

We draw relationships between the generalized data processing theorems of Zakai and Ziv (1973 and 1975) and the dynamical version of the second law of thermodynamics, a.k.a. the Boltzmann H-Theorem, which asserts that the Shannon entropy, $H(X_t)$, pertaining to a finite-state Markov process $\{X_t\}$, is monotonically non-decreasing as a function of time $t$, provided that the steady-state distribution of this process is uniform across the state space (which is the case when the process designates an isolated system). It turns out that both the generalized data processing theorems and the Boltzmann H-Theorem can be viewed as special cases of a more general principle concerning the monotonicity (in time) of a certain generalized information measure applied to a Markov process. This gives rise to a new look at the generalized data processing theorem, which suggests exploiting certain degrees of freedom that may lead to better bounds, for a given choice of the convex function that defines the generalized mutual information.
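The monotonicity of $H(X_t)$ stated above can be checked numerically on a small example. The sketch below assumes a discrete-time chain with a hypothetical 3-state doubly stochastic transition matrix (so its steady-state distribution is uniform); the matrix, the initial distribution and the function names are illustrative choices, not taken from the paper.

```python
import math

def entropy_bits(dist):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

def step(dist, P):
    """One step of the chain: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def entropy_trajectory(dist, P, steps):
    """Entropies H(X_0), H(X_1), ..., H(X_steps) along the chain."""
    out = [entropy_bits(dist)]
    for _ in range(steps):
        dist = step(dist, P)
        out.append(entropy_bits(dist))
    return out

# Hypothetical doubly stochastic matrix: every row and every column sums to 1,
# so the uniform distribution is stationary.
P = [[0.7, 0.2, 0.1],
     [0.1, 0.7, 0.2],
     [0.2, 0.1, 0.7]]

trajectory = entropy_trajectory([0.9, 0.05, 0.05], P, steps=20)
```

The entries of `trajectory` never decrease and stay below log2(3), the entropy of the uniform distribution on three states.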

preprint2010arXiv

Information Theory and Statistical Physics - Lecture Notes

This document consists of lecture notes for a graduate course, which focuses on the relations between Information Theory and Statistical Physics. The course is aimed at EE graduate students in the area of Communications and Information Theory, as well as at graduate students in Physics who have a basic background in Information Theory. Strong emphasis is given to the analogy and parallelism between Information Theory and Statistical Physics, as well as to the insights, the analysis tools and techniques that can be borrowed from Statistical Physics and 'imported' to certain problem areas in Information Theory. This is a research trend that has been very active in the last few decades, and the hope is that by exposing the student to the meeting points between these two disciplines, we will enhance his/her background and perspective to carry out research in the field. A short outline of the course is as follows: Introduction; Elementary Statistical Physics and its Relation to Information Theory; Analysis Tools in Statistical Physics; Systems of Interacting Particles and Phase Transitions; The Random Energy Model (REM) and Random Channel Coding; Additional Topics (optional).

preprint2010arXiv

Rate-distortion function via minimum mean square error estimation

We derive a simple general parametric representation of the rate-distortion function of a memoryless source, where both the rate and the distortion are given by integrals whose integrands include the minimum mean square error (MMSE) of the distortion $\Delta=d(X,Y)$ based on the source symbol $X$, with respect to a certain joint distribution of these two random variables. At first glance, these relations may seem somewhat similar to the I-MMSE relations due to Guo, Shamai and Verdú, but they are, in fact, quite different. The new relations among rate, distortion, and MMSE are discussed from several aspects, and more importantly, it is demonstrated that they can sometimes be rather useful for obtaining non-trivial upper and lower bounds on the rate-distortion function, as well as for determining the exact asymptotic behavior for very low and for very large distortion. Analogous MMSE relations hold for channel capacity as well.

preprint2010arXiv

Statistical properties of entropy production derived from fluctuation theorems

Several implications of well-known fluctuation theorems, on the statistical properties of the entropy production, are studied using various approaches. We begin by deriving a tight lower bound on the variance of the entropy production for a given mean of this random variable. It is shown that the Evans-Searles fluctuation theorem alone imposes a significant lower bound on the variance only when the mean entropy production is very small. It is then nonetheless demonstrated that upon incorporating additional information concerning the entropy production, this lower bound can be significantly improved, so as to capture extensivity properties. Another important aspect of the fluctuation properties of the entropy production is the relationship between the mean and the variance, on the one hand, and the probability of the event where the entropy production is negative, on the other hand. Accordingly, we derive upper and lower bounds on this probability in terms of the mean and the variance. These bounds are tighter than previous bounds that can be found in the literature. Moreover, they are tight in the sense that there exist probability distributions, satisfying the Evans-Searles fluctuation theorem, that achieve them with equality. Finally, we present a general method for generating a wide class of inequalities that must be satisfied by the entropy production. We use this method to derive several new inequalities which go beyond the standard derivation of the second law.

preprint2010arXiv

Threshold effects in parameter estimation as phase transitions in statistical mechanics

Threshold effects in the estimation of parameters of non-linearly modulated, continuous-time, wide-band waveforms are examined from a statistical physics perspective. These threshold effects are shown to be analogous to phase transitions of certain disordered physical systems in thermal equilibrium. The main message of this work is that this physical point of view may be insightful for understanding the interactions between two or more parameters to be estimated, from the aspects of the threshold effect.

preprint2009arXiv

Bose-Einstein Condensation in the Large Deviations Regime with Applications to Information System Models

We study the large deviations behavior of systems that admit a certain form of a product distribution, which is frequently encountered both in Physics and in various information system models. First, to fix ideas, we demonstrate a simple calculation of the large deviations rate function for a single constraint (event). Under certain conditions, the behavior of this function is shown to exhibit an analogue of Bose-Einstein condensation (BEC). More interestingly, we also study the large deviations rate function associated with two constraints (and the extension to any number of constraints is conceptually straightforward). The phase diagram of this rate function is shown to exhibit as many as seven phases, and it suggests a two-dimensional generalization of the notion of BEC (or more generally, a multi-dimensional BEC). While the results are illustrated for a simple model, the underlying principles are actually rather general. We also discuss several applications and implications pertaining to information system models.

preprint2009arXiv

Optimum estimation via gradients of partition functions and information measures: a statistical-mechanical perspective

In continuation of a recent work on the statistical-mechanical analysis of minimum mean square error (MMSE) estimation in Gaussian noise via its relation to the mutual information (the I-MMSE relation), here we propose a simple and more direct relationship between optimum estimation and certain information measures (e.g., the information density and the Fisher information), which can be viewed as partition functions and hence are amenable to analysis using statistical-mechanical techniques. The proposed approach has several advantages, most notably, its applicability to general sources and channels, as opposed to the I-MMSE relation and its variants which hold only for certain classes of channels (e.g., additive white Gaussian noise channels). We then demonstrate the derivation of the conditional mean estimator and the MMSE in a few examples. Two of these examples turn out to be generalizable to a fairly wide class of sources and channels. For this class, the proposed approach is shown to yield an approximate conditional mean estimator and an MMSE formula that has the flavor of a single-letter expression. We also show how our approach can easily be generalized to situations of mismatched estimation.

preprint2008arXiv

Statistical Physics of Signal Estimation in Gaussian Noise: Theory and Examples of Phase Transitions

We consider the problem of signal estimation (denoising) from a statistical mechanical perspective, using a relationship between the minimum mean square error (MMSE) of estimating a signal and the mutual information between this signal and its noisy version. The paper consists of essentially two parts. In the first, we derive several statistical-mechanical relationships between a few important quantities in this problem area, such as the MMSE, the differential entropy, the Fisher information, the free energy, and a generalized notion of temperature. We also draw analogies and differences between certain relations pertaining to the estimation problem and the parallel relations in thermodynamics and statistical physics. In the second part of the paper, we provide several application examples, where we demonstrate how certain analysis tools that are customary in statistical physics prove useful in the analysis of the MMSE. In most of these examples, the corresponding statistical-mechanical systems turn out to consist of strong interactions that cause phase transitions, which in turn are reflected as irregularities and discontinuities (similar to threshold effects) in the behavior of the MMSE.

preprint2007arXiv

Error Exponents of Erasure/List Decoding Revisited via Moments of Distance Enumerators

The analysis of random coding error exponents pertaining to erasure/list decoding, due to Forney, is revisited. Instead of using Jensen's inequality as well as some other inequalities in the derivation, we demonstrate that an exponentially tight analysis can be carried out by assessing the relevant moments of a certain distance enumerator. The resulting bound has the following advantages: (i) it is at least as tight as Forney's bound, (ii) under certain symmetry conditions associated with the channel and the random coding distribution, it is simpler than Forney's bound in the sense that it involves an optimization over one parameter only (rather than two), and (iii) in certain special cases, like the binary symmetric channel (BSC), the optimum value of this parameter can be found in closed form, and so, there is no need to conduct a numerical search. We have not yet found, however, a numerical example where this new bound is strictly better than Forney's bound. This may provide additional evidence to support Forney's conjecture that his bound is tight for the average code. We believe that the technique we suggest in this paper can be useful in simplifying, and hopefully also improving, exponential error bounds in other problem settings as well.

preprint2007arXiv

Optimal Watermark Embedding and Detection Strategies Under Limited Detection Resources

An information-theoretic approach is proposed to watermark embedding and detection under limited detector resources. First, we consider the attack-free scenario under which asymptotically optimal decision regions in the Neyman-Pearson sense are proposed, along with the optimal embedding rule. Later, we explore the case of zero-mean i.i.d. Gaussian covertext distribution with unknown variance under the attack-free scenario. For this case, we propose a lower bound on the exponential decay rate of the false-negative probability and prove that the optimal embedding and detecting strategy is superior to the customary linear, additive embedding strategy in the exponential sense. Finally, these results are extended to the case of memoryless attacks and general worst case attacks. Optimal decision regions and embedding rules are offered, and the worst attack channel is identified.

preprint2007arXiv

The Generalized Random Energy Model and its Application to the Statistical Physics of Ensembles of Hierarchical Codes

In an earlier work, the statistical physics associated with finite-temperature decoding of code ensembles, along with the relation to their random coding error exponents, was explored in a framework that is analogous to Derrida's random energy model (REM) of spin glasses, according to which the energy levels of the various spin configurations are independent random variables. The generalized REM (GREM) extends the REM in that it introduces correlations between energy levels in a hierarchical structure. In this paper, we explore some analogies between the behavior of the GREM and that of code ensembles which have parallel hierarchical structures. In particular, in analogy to the fact that the GREM may have different types of phase transition effects, depending on the parameters of the model, the above-mentioned hierarchical code ensembles behave substantially differently in the various domains of the design parameters of these codes. We make an attempt to explore the insights that can be imported from the statistical mechanics of the GREM and be harnessed to serve for code design considerations and guidelines.

72 published item(s)

Lossy Compression of Individual Sequences Revisited: Fundamental Limits of Finite-State Encoders

$D$-semifaithful codes that are universal over both memoryless sources and distortion measures

Codebook Mismatch Can Be Fully Compensated by Mismatched Decoding

Error Exponents of the Dirty-Paper and Gel'fand-Pinsker Channels

Optimal Correlators and Waveforms for Mismatched Detection

The DNA Storage Channel: Capacity and Error Probability

Encoding Individual Source Sequences for the Wiretap Channel

Trade-offs Between Error Exponents and Excess-Rate Exponents of Typical Slepian-Wolf Codes

An Integral Representation of the Logarithmic Function with Applications in Information Theory

On More General Distributions of Random Binning for Slepian-Wolf Encoding

Optimal Work Extraction and the Minimum Description Length Principle

Some Useful Integral Representations for Information-Theoretic Analyses

The MMI Decoder is Asymptotically Optimal for the Typical Random Code and for the Expurgated Code

Universal Decoding for Asynchronous Slepian-Wolf Encoding

Asymptotic MMSE Analysis Under Sparse Representation Modeling

Converse Bounds on Modulation-Estimation Performance for the Gaussian Multiple-Access Channel

Exact Random Coding Secrecy Exponents for the Wiretap Channel

Lower Bounds on Parameter Modulation-Estimation Under Bandwidth Constraints

On empirical cumulant generating functions of code lengths for individual sequences

Reliability of universal decoding based on vector-quantized codewords

Universal Decoding for Source-Channel Coding with Side Information

Universal decoding using a noisy codebook

A Large Deviations Approach to Secure Lossy Compression

Channel Detection in Coded Communication

Comments on "Identifying Functional Thermodynamics in Autonomous Maxwellian Ratchets" (arXiv:1507.01537v2)

Sequence complexity and work extraction

The generalized likelihood decoder: random coding and expurgated bounds

Data Processing Bounds for Scalar Lossy Source Codes with Side Information at the Decoder

Exact correct-decoding exponent of the wiretap channel decoder

Exact random coding error exponents of optimal bin index decoding

Expurgated Random-Coding Ensembles: Exponents, Refinements and Connections

Information-theoretic applications of the logarithmic probability comparison bound

On Compressive Sensing in Coding Problems: A Rigorous Approach

On zero-rate error exponents of finite-state channels with input-dependent states

Optimum Trade-offs Between the Error Exponent and the Excess-Rate Exponent of Variable-Rate Slepian-Wolf Coding

Simplified Erasure/List Decoding

Statistical physics of random binning

Universal Decoding for Gaussian Intersymbol Interference Channels

Universal Quantization for Separate Encodings and Joint Decoding of Correlated Sources

Analysis of Mismatched Estimation Errors Using Gradients of Partition Functions

Another look at expurgated bounds and their statistical-mechanical interpretation

Asymptotically optimal decision rules for joint detection and source coding

Codeword or noise? Exact random coding exponents for slotted asynchronism

Erasure/list exponents for Slepian-Wolf decoding

List decoding - random coding exponents and expurgated exponents

On the data processing theorem in the semi-deterministic setting

Statistical Physics: a Short Course for Electrical Engineering Students

Zero-Delay and Causal Secure Source Coding

Zero-Delay and Causal Single-User and Multi-User Lossy Source Coding with Decoder Side Information

Average redundancy of the Shannon code for Markov sources

Exponential error bounds on parameter modulation-estimation for discrete memoryless channels

On optimum parameter modulation-estimation from a large deviations perspective

On Real-Time and Causal Secure Source Coding

Perfectly secure encryption of individual sequences

Universal decoding for arbitrary channels relative to a given class of decoding metrics

Data processing inequalities based on a certain structured class of information measures with application to estimation theory

On optimum strategies for minimizing the exponential moments of a given cost function

Relations between redundancy patterns of the Shannon code and wave diffraction patterns of partially disordered media

Structure Theorems for Real-Time Variable-Rate Coding With and Without Side Information

Subset sum phase transitions and data compression

A statistical-mechanical view on source coding: physical compression and data compression

Data processing theorems and the second law of thermodynamics

Information Theory and Statistical Physics - Lecture Notes

Rate-distortion function via minimum mean square error estimation

Statistical properties of entropy production derived from fluctuation theorems

Threshold effects in parameter estimation as phase transitions in statistical mechanics

Bose-Einstein Condensation in the Large Deviations Regime with Applications to Information System Models

Optimum estimation via gradients of partition functions and information measures: a statistical-mechanical perspective

Statistical Physics of Signal Estimation in Gaussian Noise: Theory and Examples of Phase Transitions

Error Exponents of Erasure/List Decoding Revisited via Moments of Distance Enumerators

Optimal Watermark Embedding and Detection Strategies Under Limited Detection Resources

The Generalized Random Energy Model and its Application to the Statistical Physics of Ensembles of Hierarchical Codes