Source author record

Meir Feder

Meir Feder appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory, math.IT, physics.optics, Machine Learning, math.ST, Statistics Theory

Catalog footprint

What is connected

40 works

6 topics

4 close collaborators

preprint, 2022, arXiv

Beyond Ridge Regression for Distribution-Free Data

In supervised batch learning, the predictive normalized maximum likelihood (pNML) has been proposed as the min-max regret solution for the distribution-free setting, where no distributional assumptions are made on the data. However, the pNML is not defined for a large capacity hypothesis class such as over-parameterized linear regression. For a large class, a common approach is to use regularization or a model prior. In the context of online prediction where the min-max solution is the Normalized Maximum Likelihood (NML), it has been suggested to use NML with "luckiness": a prior-like function is applied to the hypothesis class, which reduces its effective size. Motivated by the luckiness concept, for linear regression we incorporate a luckiness function that penalizes the hypothesis proportionally to its l2 norm. This leads to the ridge regression solution. The associated pNML with luckiness (LpNML) prediction deviates from the ridge regression empirical risk minimizer (Ridge ERM): when the test data reside in the subspace corresponding to the small eigenvalues of the empirical correlation matrix of the training data, the prediction is shifted toward 0. Our LpNML reduces the Ridge ERM error by up to 20% for the PMLB sets, and is up to 4.9% more robust in the presence of distribution shift compared to recent leading methods for UCI sets.
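For orientation, the Ridge ERM baseline that this abstract compares against can be written in closed form. The notation below is an assumption and is not taken from the paper: $X$ is the $N \times M$ training matrix with one sample per row, $y$ is the label vector, $\lambda > 0$ is the regularization weight, and $(\eta_i, v_i)$ are the eigenpairs of $X^\top X$. The LpNML correction itself is not reproduced here.

```latex
\hat{\theta}_\lambda
  = \arg\min_{\theta} \; \|y - X\theta\|_2^2 + \lambda \|\theta\|_2^2
  = (X^\top X + \lambda I)^{-1} X^\top y,
\qquad
x^\top \hat{\theta}_\lambda
  = \sum_{i=1}^{M} \frac{(x^\top v_i)\,(v_i^\top X^\top y)}{\eta_i + \lambda}.
```

Written this way, the components of a test point $x$ along eigenvectors with small $\eta_i$ are the ones whose contribution depends most strongly on $\lambda$. This is the subspace in which the abstract reports that LpNML departs from Ridge ERM.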

preprint, 2020, arXiv

Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks

The Predictive Normalized Maximum Likelihood (pNML) scheme has been recently suggested for universal learning in the individual setting, where both the training and test samples are individual data. The goal of universal learning is to compete with a "genie" or reference learner that knows the data values, but is restricted to use a learner from a given model class. The pNML minimizes the associated regret for any possible value of the unknown label. Furthermore, its min-max regret can serve as a pointwise measure of learnability for the specific training and data sample. In this work we examine the pNML and its associated learnability measure for the Deep Neural Network (DNN) model class. As shown, the pNML outperforms the commonly used Empirical Risk Minimization (ERM) approach and provides robustness against adversarial attacks. Together with its learnability measure it can detect out-of-distribution test examples, be tolerant to noisy labels and serve as a confidence measure for the ERM. Finally, we extend the pNML to a "twice universal" solution, that provides universality for model class selection and generates a learner competing with the best one from all model classes.

preprint, 2020, arXiv

Non-linear Canonical Correlation Analysis: A Compressed Representation Approach

Canonical Correlation Analysis (CCA) is a linear representation learning method that seeks maximally correlated variables in multi-view data. Non-linear CCA extends this notion to a broader family of transformations, which are more powerful in many real-world applications. Given the joint probability, the Alternating Conditional Expectation (ACE) algorithm provides an optimal solution to the non-linear CCA problem. However, it suffers from limited performance and an increasing computational burden when only a finite number of samples is available. In this work we introduce an information-theoretic compressed representation framework for the non-linear CCA problem (CRCCA), which extends the classical ACE approach. Our suggested framework seeks compact representations of the data that allow a maximal level of correlation. This way we control the trade-off between the flexibility and the complexity of the model. CRCCA provides theoretical bounds and optimality conditions, as we establish fundamental connections to rate-distortion theory, the information bottleneck and remote source coding. In addition, it allows a soft dimensionality reduction, as the compression level is determined by the mutual information between the original noisy data and the extracted signals. Finally, we introduce a simple implementation of the CRCCA framework, based on lattice quantization.

preprint, 2020, arXiv

One shot approach to lossy source coding under average distortion constraints

This paper presents a one shot analysis of the lossy compression problem under average distortion constraints. We calculate the exact expected distortion of a random code. The result is given as an integral formula using a newly defined functional $\tilde{D}(z,Q_Y)$ where $Q_Y$ is the random coding distribution and $z\in [0,1]$. When we plug in the code distribution as $Q_Y$, this functional produces the average distortion of the code, thus providing a converse result utilizing the same functional. Two alternative formulas are provided for $\tilde{D}(z,Q_Y)$: the first involves a supremum over some auxiliary distribution $Q_X$ and resembles the channel coding meta-converse, and the other involves an infimum over channels and resembles the well-known Shannon distortion-rate function.

preprint, 2016, arXiv

Large Alphabet Source Coding using Independent Component Analysis

Large alphabet source coding is a basic and well-studied problem in data compression. It has many applications such as compression of natural language text, speech and images. The classic perception of most commonly used methods is that a source is best described over an alphabet which is at least as large as the observed alphabet. In this work we challenge this approach and introduce a conceptual framework in which a large alphabet source is decomposed into "as statistically independent as possible" components. This decomposition allows us to apply entropy encoding to each component separately, while benefiting from their reduced alphabet size. We show that in many cases, such decomposition results in a sum of marginal entropies which is only slightly greater than the entropy of the source. Our suggested algorithm, based on a generalization of the Binary Independent Component Analysis, is applicable for a variety of large alphabet source coding setups. This includes the classical lossless compression, universal compression and high-dimensional vector quantization. In each of these setups, our suggested approach outperforms most commonly used methods. Moreover, our proposed framework is significantly easier to implement in most of these cases.
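The quantity this abstract turns on, the gap between the sum of marginal entropies and the entropy of the source, can be sketched in a few lines. This is only an illustration of the measure, not the authors' decomposition algorithm: the function names and the toy distributions below are hypothetical, and the search for a good decomposition is not attempted.

```python
from collections import defaultdict
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def joint_and_marginal_entropy(pmf):
    """Return (joint entropy, sum of per-component marginal entropies).

    `pmf` maps equal-length tuples of component symbols to probabilities.
    """
    joint = entropy(pmf.values())
    dimension = len(next(iter(pmf)))
    marginal_sum = 0.0
    for i in range(dimension):
        # Accumulate the marginal distribution of component i.
        marginal = defaultdict(float)
        for symbols, p in pmf.items():
            marginal[symbols[i]] += p
        marginal_sum += entropy(marginal.values())
    return joint, marginal_sum


# Toy source over pairs of bits in which the two bits are fully dependent.
dependent = {(0, 0): 0.5, (1, 1): 0.5}
# Toy source over pairs of bits in which the two bits are independent.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

for name, pmf in (("dependent", dependent), ("independent", independent)):
    joint, marginal_sum = joint_and_marginal_entropy(pmf)
    print(name, "gap in bits:", marginal_sum - joint)
```

The gap is never negative, and it is zero exactly when the components are independent. Coding each component separately costs the marginal sum, so a representation with a small gap loses little relative to coding the source jointly.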

preprint, 2016, arXiv

On the calculation of the minimax-converse of the channel coding problem

A minimax-converse has been suggested for the general channel coding problem by Polyanskiy et al. This converse comes in two flavors. The first flavor is generally used for the analysis of the coding problem with non-vanishing error probability and provides an upper bound on the rate given the error probability. The second flavor fixes the rate and provides a lower bound on the error probability. Both converses are given as a min-max optimization problem of an appropriate binary hypothesis testing problem. The properties of the first converse were studied by Polyanskiy and a saddle point was proved. In this paper we study the properties of the second form and prove that it also admits a saddle point. Moreover, an algorithm for the computation of the saddle point, and hence the bound, is developed. In the DMC case, the algorithm runs in polynomial time.

preprint, 2016, arXiv

Variational formulas for the power of the binary hypothesis testing problem with applications

Two variational formulas for the power of the binary hypothesis testing problem are derived. The first is given as the Legendre transform of a certain function and the second, induced from the first, is given in terms of the Cumulative Distribution Function (CDF) of the log-likelihood ratio. One application of the first formula is an upper bound on the power of the binary hypothesis testing problem in terms of the Rényi divergence. The second formula provides a general framework for proving asymptotic and non-asymptotic expressions for the power of the test utilizing corresponding expressions for the CDF of the log-likelihood. The framework is demonstrated in the central limit regime (i.e., for non-vanishing type I error) and in the large deviations regime.

preprint, 2015, arXiv

A Simple Proof for the Optimality of Randomized Posterior Matching

Posterior matching (PM) is a sequential horizon-free feedback communication scheme introduced by the authors, who also provided a rather involved optimality proof showing it achieves capacity for a large class of memoryless channels. Naghshvar et al. considered a non-sequential variation of PM with a fixed number of messages and a random decision-time, and gave a simpler proof establishing its optimality via a novel Shannon-Jensen divergence argument. Another simpler optimality proof was given by Li and El Gamal, who considered a fixed-rate fixed block-length variation of PM with an additional randomization. Both these works also provided error exponent bounds. However, their simpler achievability proofs apply only to discrete memoryless channels, and are restricted to a non-sequential setup with a fixed number of messages. In this paper, we provide a short and transparent proof for the optimality of the fully sequential horizon-free PM scheme over general memoryless channels. Borrowing the key randomization idea of Li and El Gamal, our proof is based on analyzing the random walk behavior of the shrinking posterior intervals induced by a reversed iterated function system (RIFS) decoder.

preprint, 2015, arXiv

Achievable and Converse bounds over a general channel and general decoding metric

Achievable and converse bounds for general channels and mismatched decoding are derived. The direct (achievable) bound is derived using random coding and the analysis is tight up to a factor of 2. The converse is given in terms of the achievable bound and the factor between them is given. This gives the performance of the best rate-R code with a possibly mismatched decoding metric over a general channel, up to the factor that is identified. In the matched case we show that the converse equals the minimax meta-converse of Polyanskiy et al.

preprint, 2015, arXiv

Generalized Independent Component Analysis Over Finite Alphabets

Independent component analysis (ICA) is a statistical method for transforming an observable multidimensional random vector into components that are as statistically independent as possible from each other. Usually the ICA framework assumes a model according to which the observations are generated (such as a linear transformation with additive noise). ICA over finite fields is a special case of ICA in which both the observations and the independent components are over a finite alphabet. In this work we consider a generalization of this framework in which an observation vector is decomposed to its independent components (as much as possible) with no prior assumption on the way it was generated. This generalization is also known as Barlow's minimal redundancy representation problem and is considered an open problem. We propose several theorems and show that this NP-hard problem can be accurately solved with a branch and bound search tree algorithm, or tightly approximated with a series of linear problems. Our contribution provides the first efficient and constructive set of solutions to Barlow's problem. The minimal redundancy representation (also known as factorial code) has many applications, mainly in the fields of Neural Networks and Deep Learning. The Binary ICA (BICA) is also shown to have applications in several domains including medical diagnosis, multi-cluster assignment, network tomography and internet resource management. In this work we show this formulation further applies to multiple disciplines in source coding such as predictive coding, distributed source coding and coding of large alphabet sources.

preprint, 2015, arXiv

On the Diversity-Multiplexing Tradeoff of Unconstrained Multiple-Access Channels

In this work the optimal diversity-multiplexing tradeoff (DMT) is investigated for the multiple-input multiple-output fading multiple-access channels with no power constraints (infinite constellations). For K users (K>1), M transmit antennas for each user, and N receive antennas, infinite constellations in general and lattices in particular are shown to attain the optimal DMT of finite constellations when N is greater than or equal to (K+1)M-1, i.e., in the user-limited regime. On the other hand, for N<(K+1)M-1 it is shown that infinite constellations cannot attain the optimal DMT. This is in contrast to the point-to-point case in which infinite constellations are DMT optimal for any M and N. In general, this work shows that when the network is heavily loaded, i.e. K>max(1,(N-M+1)/M), taking into account the shaping region in the decoding process plays a crucial role in pursuing the optimal DMT. By investigating the cases where infinite constellations are optimal and suboptimal, this work also gives a geometrical interpretation to the DMT of infinite constellations in multiple-access channels.
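The two antenna-count conditions quoted in this abstract describe the same boundary from opposite sides, which a short check makes concrete. The sketch below only encodes the inequalities as stated in the abstract; the function and parameter names are hypothetical.

```python
def infinite_constellations_attain_optimal_dmt(num_users, tx_antennas, rx_antennas):
    """True when N >= (K+1)M - 1, the condition quoted in the abstract."""
    if num_users <= 1:
        raise ValueError("the abstract states the condition for K > 1 users")
    return rx_antennas >= (num_users + 1) * tx_antennas - 1


def heavily_loaded(num_users, tx_antennas, rx_antennas):
    """True when K > max(1, (N - M + 1) / M), the abstract's loading condition."""
    return num_users > max(1, (rx_antennas - tx_antennas + 1) / tx_antennas)


# K = 2 users with M = 2 transmit antennas each: the threshold is N = 5.
for rx in (4, 5, 6):
    print(rx,
          infinite_constellations_attain_optimal_dmt(2, 2, rx),
          heavily_loaded(2, 2, rx))
```

Rearranging N < (K+1)M-1 gives K > (N-M+1)/M, so for K > 1 the network is heavily loaded exactly when the first condition fails.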

preprint, 2015, arXiv

Pulse collision picture of inter-channel nonlinear interference noise in fiber-optic communications

We model the build-up of inter-channel nonlinear interference noise (NLIN) in wavelength division multiplexed systems by considering the pulse collision dynamics in the time domain. The fundamental interactions can be classified as two-pulse, three-pulse, or four-pulse collisions and they can be either complete, or incomplete. Each type of collision is shown to have its unique signature and the overall nature of NLIN is determined by the relative importance of the various classes of pulse collisions in a given WDM system. The pulse-collision picture provides qualitative and quantitative insight into the character of NLIN, offering a simple and intuitive explanation to all of the reported and previously unexplained phenomena. In particular, we show that the most important contributions to NLIN follow from two-pulse and four-pulse collisions. While the contribution of two-pulse collisions is in the form of phase-noise and polarization-state-rotation with strong dependence on modulation format, four-pulse collisions generate complex circular noise whose variance is independent of modulation format. In addition, two-pulse collisions are strongest when the collision is complete, whereas four-pulse collisions are strongest when the collision is incomplete. We show that two-pulse collisions dominate the formation of NLIN in short links with lumped amplification, or in links with distributed amplification extending over arbitrary length. In long links using lumped amplification the relative significance of four-pulse collisions increases, emphasizing the circularity of the NLIN while reducing its dependence on modulation format.

preprint, 2014, arXiv

A Universal Decoder Relative to a Given Family of Metrics

Consider the following framework of universal decoding suggested in [MerhavUniversal]. Given a family of decoding metrics and random coding distribution (prior), a single, universal, decoder is optimal if for any possible channel the average error probability when using this decoder is better than the error probability attained by the best decoder in the family up to a subexponential multiplicative factor. We describe a general universal decoder in this framework. The penalty for using this universal decoder is computed. The universal metric is constructed as follows. For each metric, a canonical metric is defined and conditions for the given prior to be normal are given. A sub-exponential set of canonical metrics of normal prior can be merged to a single universal optimal metric. We provide an example where this decoder is optimal while the decoder of [MerhavUniversal] is not.

preprint, 2014, arXiv

Accumulation of nonlinear interference noise in fiber-optic systems

Through a series of extensive system simulations we show that all of the previously not understood discrepancies between the Gaussian noise (GN) model and simulations can be attributed to the omission of an important, recently reported, fourth-order noise (FON) term, that accounts for the statistical dependencies within the spectrum of the interfering channel. We examine the importance of the FON term as well as the dependence of NLIN on modulation format with respect to link-length and number of spans. A computationally efficient method for evaluating the FON contribution, as well as the overall NLIN power is provided.

preprint, 2014, arXiv

Delay and Redundancy in Lossless Source Coding

The penalty incurred by imposing a finite delay constraint in lossless source coding of a memoryless source is investigated. It is well known that for the so-called block-to-variable and variable-to-variable codes, the redundancy decays at best polynomially with the delay, where in this case the delay is identified with the source block length or maximal source phrase length, respectively. In stark contrast, it is shown that for sequential codes (e.g., a delay-limited arithmetic code) the redundancy can be made to decay exponentially with the delay constraint. The corresponding redundancy-delay exponent is shown to be at least as good as the Rényi entropy of order 2 of the source, but (for almost all sources) not better than a quantity depending on the minimal source symbol probability and the alphabet size.
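The Rényi entropy of order 2, which the abstract uses as a lower bound on the redundancy-delay exponent, is simple to evaluate for a given memoryless source. The sketch below computes it in bits. The choice of base 2 and the example distributions are assumptions made for illustration and are not taken from the paper.

```python
from math import log2


def renyi_entropy_order2(probabilities):
    """Rényi entropy of order 2 in bits: -log2(sum_i p_i^2)."""
    total = sum(probabilities)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -log2(sum(p * p for p in probabilities))


print(renyi_entropy_order2([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 symbols
print(renyi_entropy_order2([0.9, 0.1]))                # skewed binary source
```

For a uniform source this coincides with the Shannon entropy. For any other source it is strictly smaller, so skewed sources get a weaker guaranteed exponent.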

preprint, 2014, arXiv

Dispersion of Infinite Constellations in Fast Fading Channels

In this work we extend the setting of communication without power constraint, proposed by Poltyrev, to fast fading channels with channel state information (CSI) at the receiver. The optimal codewords density, or actually the optimal normalized log density (NLD), is considered. Poltyrev's capacity for this channel is the highest achievable NLD, at possibly large block length, that guarantees a vanishing error probability. For a given finite block length n and a fixed error probability, there is a gap between the highest achievable NLD and Poltyrev's capacity. As in other channels, this gap asymptotically vanishes as the square root of the channel dispersion V over n, multiplied by the inverse Q-function of the allowed error probability. This dispersion, derived in the paper, equals the dispersion of the power constrained fast fading channel at the high SNR regime. Connections to the error exponent of the peak power constrained fading channel are also discussed.
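The finite block length gap described here, the square root of V over n multiplied by the inverse Q-function of the error probability, can be evaluated as follows. This is a minimal sketch of that leading term only. The dispersion value passed in is a placeholder, since the expression for V derived in the paper is not reproduced on this page, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def q_inverse(error_probability):
    """Inverse of the Gaussian tail function Q(x) = P(Z > x) for standard normal Z."""
    return NormalDist().inv_cdf(1.0 - error_probability)


def nld_gap(dispersion, block_length, error_probability):
    """Leading-order gap to Poltyrev's capacity: sqrt(V / n) * Q^{-1}(epsilon)."""
    return sqrt(dispersion / block_length) * q_inverse(error_probability)


# Placeholder dispersion V = 1.0; the gap shrinks like 1 / sqrt(n).
for n in (100, 400, 1600):
    print(n, nld_gap(1.0, n, 1e-3))
```

Quadrupling the block length halves the gap, which is the square-root behaviour the abstract refers to.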

preprint, 2014, arXiv

Mitigation of inter-channel nonlinear interference in WDM systems

We demonstrate mitigation of inter-channel nonlinear interference noise (NLIN) in WDM systems for several amplification schemes. Using a practical decision directed recursive least-squares algorithm, we take advantage of the temporal correlations of NLIN to achieve a notable improvement in system performance.
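To make the recursive least-squares idea concrete, here is a textbook single-tap RLS tracker with a forgetting factor. It is a generic sketch and not the receiver described in the paper: the scalar real-valued model `y = h * x`, the parameter defaults and the function name are all assumptions, and the decision-directed part (feeding back the receiver's own symbol decisions as `inputs`) is not modeled.

```python
def rls_track(inputs, outputs, forgetting=0.99, initial_gain=1e6):
    """Single-tap RLS: track h in outputs[t] approximately equal to h * inputs[t].

    Returns the list of coefficient estimates after each sample.
    """
    estimate = 0.0
    gain = initial_gain  # scalar inverse of the exponentially weighted input energy
    history = []
    for x, y in zip(inputs, outputs):
        k = gain * x / (forgetting + x * gain * x)  # update gain for this sample
        error = y - estimate * x                    # a-priori prediction error
        estimate += k * error
        gain = (gain - k * x * gain) / forgetting   # discount old samples
        history.append(estimate)
    return history


# Noiseless toy data with a fixed coefficient h = 2.
xs = [1.0, -1.0] * 25
ys = [2.0 * x for x in xs]
print(rls_track(xs, ys)[-1])
```

A forgetting factor below 1 discounts old samples, which is what allows such a filter to follow a coefficient that drifts slowly over time.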

preprint, 2014, arXiv

Source Broadcasting to the Masses: Separation has a Bounded Loss

This work discusses the source broadcasting problem, i.e. transmitting a source to many receivers via a broadcast channel. The optimal rate-distortion region for this problem is unknown. The separation approach divides the problem into two complementary problems: source successive refinement and broadcast channel transmission. We provide bounds on the loss incorporated by applying time-sharing and separation in source broadcasting. If the broadcast channel is degraded, it turns out that separation-based time-sharing achieves at least a factor of the joint source-channel optimal rate, and this factor has a positive limit even if the number of receivers increases to infinity. For the AWGN broadcast channel a better bound is introduced, implying that all achievable joint source-channel schemes have a rate within one bit of the separation-based achievable rate region for two receivers, or within $\log_2 T$ bits for $T$ receivers.

preprint, 2013, arXiv

A Universal Probability Assignment for Prediction of Individual Sequences

Is it a good idea to use the frequency of events in the past, as a guide to their frequency in the future (as we all do anyway)? In this paper the question is attacked from the perspective of universal prediction of individual sequences. It is shown that there is a universal sequential probability assignment, such that for a large class of loss functions (optimization goals), the predictor minimizing the expected loss under this probability, is a good universal predictor. The proposed probability assignment is based on randomly dithering the empirical frequencies of states in the past, and it is easy to show that randomization is essential. This yields a very simple universal prediction scheme which is similar to Follow-the-Perturbed-Leader (FPL) and works for a large class of loss functions, as well as a partial justification for using probabilistic assumptions.

preprint, 2013, arXiv

Finite-Memory Prediction as Well as the Empirical Mean

The problem of universally predicting an individual continuous sequence using a deterministic finite-state machine (FSM) is considered. The empirical mean is used as a reference as it is the constant that fits a given sequence within a minimal square error. With this reference, a reasonable prediction performance is the regret, namely the excess square-error over the reference loss, the empirical variance. The paper analyzes the tradeoff between the number of states of the universal FSM and the attainable regret. It first studies the case of a small number of states. A class of machines, denoted Degenerated Tracking Memory (DTM), is defined and the optimal machine in this class is shown to be the optimal among all machines for small enough number of states. Unfortunately, DTM machines become suboptimal as the number of available states increases. Next, the Exponential Decaying Memory (EDM) machine, previously used for predicting binary sequences, is considered. While this machine has poorer performance for small number of states, it achieves a vanishing regret for large number of states. Following that, an asymptotic lower bound of O(k^{-2/3}) on the achievable regret of any k-state machine is derived. This bound is attained asymptotically by the EDM machine. Furthermore, a new machine, denoted the Enhanced Exponential Decaying Memory machine, is shown to outperform the EDM machine for any number of states.
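The regret used in this abstract, the excess square error of a predictor over the empirical variance, can be illustrated with an exponentially decaying memory predictor. The sketch below is a simplification under stated assumptions: it keeps the estimate at full floating-point precision, whereas the machines in the paper have a finite number k of states, and the update rule, parameter names and toy sequence are hypothetical.

```python
def empirical_variance(sequence):
    """Mean square error of the best constant predictor, the empirical mean."""
    mean = sum(sequence) / len(sequence)
    return sum((x - mean) ** 2 for x in sequence) / len(sequence)


def exponential_decay_regret(sequence, alpha, initial=0.5):
    """Per-sample regret of the predictor e <- (1 - alpha) * e + alpha * x.

    The prediction for each sample is made before that sample is seen.
    """
    estimate = initial
    total_loss = 0.0
    for x in sequence:
        total_loss += (x - estimate) ** 2
        estimate = (1.0 - alpha) * estimate + alpha * x
    return total_loss / len(sequence) - empirical_variance(sequence)


# Alternating sequence in [0, 1]: a short memory chases the last sample and pays for it.
alternating = [0.0, 1.0] * 50
for alpha in (0.5, 0.1, 0.01):
    print(alpha, exponential_decay_regret(alternating, alpha))
```

A smaller `alpha` averages over a longer window and moves the predictor closer to the empirical mean. In a finite-state machine that resolution is limited by the number of states, which is the tradeoff the paper analyzes.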

preprint, 2013, arXiv

Fundamental Limits of Infinite Constellations in MIMO Fading Channels

The fundamental and natural connection between the infinite constellation (IC) dimension and the best diversity order it can achieve is investigated in this paper. In the first part of this work we develop an upper bound on the diversity order of IC's for any dimension and any number of transmit and receive antennas. By choosing the right dimensions, we prove in the second part of this work that IC's in general and lattices in particular can achieve the optimal diversity-multiplexing tradeoff of finite constellations. This work gives a framework for designing lattices for multiple-antenna channels using lattice decoding.

preprint, 2013, arXiv

Information Spectrum Approach to the Source Channel Separation Theorem

A source-channel separation theorem for a general channel has recently been shown by Aggrawal et al. This theorem states that if there exists a coding scheme that achieves a maximum distortion level d_{max} over a general channel W, then reliable communication can be accomplished over this channel at rates less than R(d_{max}), where R(.) is the rate distortion function of the source. The source, however, is essentially constrained to be discrete and memoryless (DMS). In this work we prove a stronger claim where the source is general, satisfying only a "sphere packing optimality" feature, and the channel is completely general. Furthermore, we show that if the channel satisfies the strong converse property as defined by Han & Verdú, then the same statement can be made with d_{avg}, the average distortion level, replacing d_{max}. Unlike the proofs there, we use information spectrum methods to prove the statements and the results can be quite easily extended to other situations.

preprint, 2013, arXiv

New Bounds on the Capacity of Fiber-Optics Communications

By taking advantage of the temporal correlations of the nonlinear phase noise in WDM systems we show that the capacity of a nonlinear fiber link is notably higher than what is currently assumed. This advantage is translated into the doubling of the link distance for a fixed transmission rate.

preprint, 2013, arXiv

Non-Random Coding Error Exponent for Lattices

An upper bound on the error probability of specific lattices, based on their distance-spectrum, is constructed. The derivation is accomplished using a simple alternative to the Minkowski-Hlawka mean-value theorem of the geometry of numbers. In many ways, the new bound greatly resembles the Shulman-Feder bound for linear codes. Based on the new bound, an error-exponent is derived for specific lattice sequences (of increasing dimension) over the AWGN channel. Measuring the sequence's gap to capacity, using the new exponent, is demonstrated.

preprint, 2013, arXiv

Properties of nonlinear noise in long, dispersion-uncompensated fiber links

We study the properties of nonlinear interference noise (NLIN) in fiber-optic communications systems with large accumulated dispersion. Our focus is on settling the discrepancy between the results of the Gaussian noise (GN) model (according to which NLIN is additive Gaussian) and a recently published time-domain analysis, which attributes drastically different properties to the NLIN. Upon reviewing the two approaches we identify several unjustified assumptions that are key in the derivation of the GN model, and that are responsible for the discrepancy. We derive the true NLIN power and verify that the NLIN is not additive Gaussian, but rather it depends strongly on the data transmitted in the channel of interest. In addition we validate the time-domain model numerically and demonstrate the strong dependence of the NLIN on the interfering channels' modulation format.

preprint, 2013, arXiv

The Random Coding Bound Is Tight for the Average Linear Code or Lattice

In 1973, Gallager proved that the random-coding bound is exponentially tight for the random code ensemble at all rates, even below expurgation. This result explained that the random-coding exponent does not achieve the expurgation exponent due to the properties of the random ensemble, irrespective of the utilized bounding technique. It has been conjectured that this same behavior holds true for a random ensemble of linear codes. This conjecture is proved in this paper. Additionally, it is shown that this property extends to Poltyrev's random-coding exponent for a random ensemble of lattices.

preprint, 2013, arXiv

Time varying ISI model for nonlinear interference noise

We show that the effect of nonlinear interference in WDM systems is equivalent to slowly varying inter-symbol-interference (ISI), and hence its cancellation can be carried out by means of adaptive linear filtering. We characterize the ISI coefficients and discuss the potential gain following from their cancellation.

preprint, 2013, arXiv

Universal communication part I: modulo additive channels

Which communication rates can be attained over a channel whose output is an unknown (possibly stochastic) function of the input that may vary arbitrarily in time with no a-priori model? Following the spirit of the finite-state compressibility of a sequence, defined by Lempel and Ziv, a "capacity" is defined for such a channel as the highest rate achievable by a designer knowing the particular relation that indeed exists between the input and output for all times, yet is constrained to use a fixed finite-length block communication scheme without feedback, i.e. use the same encoder and decoder over each block. In the case of the modulo additive channel, where the output sequence is obtained by modulo addition of an unknown individual sequence to the input sequence, this capacity is upper bounded by a function of the finite state compressibility of the noise sequence. A universal communication scheme with feedback that attains this capacity universally, without prior knowledge of the noise sequence, is presented.

preprint, 2013, arXiv

Universal communication part II: channels with memory

Consider communication over a channel whose probabilistic model is completely unknown vector-wise and is not assumed to be stationary. Communication over such channels is challenging because knowing the past does not indicate anything about the future. The existence of reliable feedback and common randomness is assumed. In a previous paper it was shown that the Shannon capacity cannot be attained, in general, if the channel is not known. An alternative notion of "capacity" was defined, as the maximum rate of reliable communication by any block-coding system used over consecutive blocks. This rate was shown to be achievable for the modulo-additive channel with an individual, unknown noise sequence, and not achievable for some channels with memory. In this paper this "capacity" is shown to be achievable for general channel models possibly including memory, as long as this memory fades with time. In other words, there exists a system with feedback and common randomness that, without knowledge of the channel, asymptotically performs as well as any block code, which may be designed knowing the channel. For non-fading memory channels a weaker type of "capacity" is shown to be achievable.

preprint, 2012, arXiv

A simpler derivation of the coding theorem

A simple proof for the Shannon coding theorem, using only the Markov inequality, is presented. The technique is useful for didactic purposes, since it does not require many preliminaries and the information density and mutual information follow naturally in the proof. It may also be applicable to situations where typicality is not natural.

preprint, 2012, arXiv

Communication over Individual Channels -- a general framework

We consider the problem of communicating over a channel for which no mathematical model is specified, and the achievable rates are determined as a function of the channel input and output sequences known a-posteriori, without assuming any a-priori relation between them. In a previous paper we have shown that the empirical mutual information between the input and output sequences is achievable without specifying the channel model, by using feedback and common randomness, and a similar result for real-valued input and output alphabets. In this paper, we present a unifying framework which includes the two previous results as particular cases. We characterize the region of rate functions which are achievable, and show that asymptotically the rate function is equivalent to a conditional distribution of the channel input given the output. We present a scheme that achieves these rates with asymptotically vanishing overheads.

preprint, 2012, arXiv

On the Achievable Communication Rates of Generalized Soliton Transmission Systems

We analyze the achievable communication rates of a generalized soliton-based transmission system for the optical fiber channel. This method is based on modulation of parameters of the scattering domain, via the inverse scattering transform, by the information bits. The decoder uses the direct spectral transform to estimate these parameters and decode the information message. Unlike ordinary On-Off Keying (OOK) soliton systems, the solitons' amplitude may take values in a continuous interval. A considerable rate gain is shown in the case where the waveforms are 2-bound soliton states. Using traditional information theory and inverse scattering perturbation theory, we analyze the influence of the amplitude fluctuations as well as soliton arrival time jitter, on the achievable rates. Using this approach we show that the time of arrival jitter (Gordon-Haus) limits the information rate in a continuous manner, as opposed to a strict threshold in OOK systems.

preprint 2012 arXiv

The Jacobi MIMO Channel

This paper presents a new fading model for MIMO channels, the Jacobi fading model. It asserts that $H$, the transfer matrix which couples the $m_t$ inputs into $m_r$ outputs, is a sub-matrix of an $m\times m$ random (Haar-distributed) unitary matrix. The (squared) singular values of $H$ follow the law of the classical Jacobi ensemble of random matrices; hence the name of the channel. One motivation to define such a channel comes from multimode/multicore optical fiber communication. It turns out that this model can be qualitatively different from the Rayleigh model, leading to interesting practical and theoretical results. This work first evaluates the ergodic capacity of the channel. Then, it considers the non-ergodic case, where it analyzes the outage probability and the diversity-multiplexing tradeoff. In the case where $k=m_t+m_r-m > 0$ it is shown that at least $k$ degrees of freedom are guaranteed not to fade for any channel realization, enabling a zero outage probability or infinite diversity order at the corresponding rates. A simple scheme utilizing (possibly outdated) channel state feedback is provided, attaining the no-outage guarantee. Finally, noting that as $m$ increases, the Jacobi model approaches the Rayleigh model, the paper discusses the applicability of the model in other communication scenarios.
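
The model in this abstract can be sketched numerically. The snippet below, which assumes NumPy is available, draws a Haar-distributed unitary matrix by QR decomposition of a complex Gaussian matrix (with the usual phase correction), takes its top-left $m_r \times m_t$ block as $H$, and counts singular values equal to 1. Because the columns of a unitary matrix are orthonormal, at least $k = m_t + m_r - m$ singular values of such a block equal 1 whenever $k > 0$, which is how the "degrees of freedom that do not fade" appear in this toy setting. The function names and the choice $m=6$, $m_t=m_r=4$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def haar_unitary(m, rng):
    """Draw an m x m Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    # Rescale each column by the phase of the matching diagonal entry of R;
    # without this correction the QR output is not exactly Haar distributed.
    d = np.diagonal(r)
    return q * (d / np.abs(d))


def jacobi_channel(m, m_t, m_r, rng):
    """Return H, the top-left m_r x m_t block of an m x m Haar unitary (assumed layout)."""
    return haar_unitary(m, rng)[:m_r, :m_t]


rng = np.random.default_rng(0)
m, m_t, m_r = 6, 4, 4          # illustrative sizes, chosen so that k > 0
k = m_t + m_r - m              # number of guaranteed unfaded modes (here 2)
H = jacobi_channel(m, m_t, m_r, rng)
s = np.linalg.svd(H, compute_uv=False)
# Count singular values numerically equal to 1.
n_unit = int(np.sum(np.isclose(s, 1.0, rtol=0.0, atol=1e-9)))
print(f"k = {k}, singular values = {np.round(s, 6)}, unit singular values = {n_unit}")
```

All singular values lie in $[0, 1]$, and the count of unit singular values should be at least $k$ for every draw, regardless of the seed.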

preprint 2012 arXiv

Universal Communication over Arbitrarily Varying Channels

We consider the problem of universally communicating over an unknown and arbitrarily varying channel, using feedback. The focus of this paper is on determining the input behavior, and specifically, a prior distribution which is used to randomly generate the codebook. We pose the problem of setting the prior as a sequential universal prediction problem, that attempts to approach a given target rate, which depends on the unknown channel sequence. The main result is that, for a channel comprised of an unknown, arbitrary sequence of memoryless channels, there is a system using feedback and common randomness that asymptotically attains, with high probability, the capacity of the time-averaged channel, universally for every sequence of channels. While no prior knowledge of the channel sequence is assumed, the rate achieved meets or exceeds the traditional arbitrarily varying channel (AVC) capacity for every memoryless AVC defined over the same alphabets, and therefore the system universally attains the random code AVC capacity, without knowledge of the AVC parameters. The system we present combines rateless coding with a universal prediction scheme for the prior. We present rough upper bounds on the rates that can be achieved in this setting and lower bounds for the redundancies.

preprint 2011 arXiv

Finite Dimensional Infinite Constellations

In the setting of a Gaussian channel without power constraints, proposed by Poltyrev, the codewords are points in an n-dimensional Euclidean space (an infinite constellation) and the tradeoff between their density and the error probability is considered. The capacity in this setting is the highest achievable normalized log density (NLD) with vanishing error probability. This capacity as well as error exponent bounds for this setting are known. In this work we consider the optimal performance achievable in the fixed blocklength (dimension) regime. We provide two new achievability bounds, and extend the validity of the sphere bound to finite dimensional infinite constellations. We also provide asymptotic analysis of the bounds: When the NLD is fixed, we provide asymptotic expansions for the bounds that are significantly tighter than the previously known error exponent results. When the error probability is fixed, we show that as n grows, the gap to capacity is inversely proportional (up to the first order) to the square root of n, where the proportionality constant is given by the inverse Q-function of the allowed error probability, times the square root of 1/2. In analogy to a similar result in channel coding, the dispersion of infinite constellations is 1/2 nat^2 per channel use. All our achievability results use lattices and therefore hold for the maximal error probability as well. Connections to the error exponent of the power constrained Gaussian channel and to the volume-to-noise ratio as a figure of merit are discussed. In addition, we demonstrate the tightness of the results numerically and compare to state-of-the-art coding schemes.
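
The first-order statement in this abstract can be evaluated directly. The sketch below computes the gap to capacity as $\sqrt{1/(2n)}\,Q^{-1}(\varepsilon)$, assuming the NLD is measured in nats and dropping every higher-order term; the function names are hypothetical and the code is not a reproduction of the paper's bounds.

```python
from math import sqrt
from statistics import NormalDist


def q_inverse(eps):
    """Inverse of the Gaussian tail function Q(x) = P(N(0, 1) > x)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - eps)


def nld_gap_first_order(n, eps):
    """First-order gap to capacity (assumed in nats): sqrt(1/(2n)) * Q^{-1}(eps).

    Uses the dispersion value 1/2 nat^2 per channel use quoted in the
    abstract; higher-order terms are deliberately omitted.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sqrt(1.0 / (2.0 * n)) * q_inverse(eps)


for n in (100, 400, 1600):
    print(n, nld_gap_first_order(n, 1e-3))
```

Quadrupling the dimension halves the first-order gap, which is the inverse square-root behavior the abstract describes.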

preprint 2010 arXiv

An Achievable Rate for the MIMO Individual Channel

We consider the problem of communicating over a multiple-input multiple-output (MIMO) real-valued channel for which no mathematical model is specified, and achievable rates are given as a function of the channel input and output sequences known a-posteriori. This paper extends previous results regarding individual channels by presenting a rate function for the MIMO individual channel, and showing its achievability in a fixed transmission rate communication scenario.

preprint 2010 arXiv

Optimal Feedback Communication via Posterior Matching

In this paper we introduce a fundamental principle for optimal communication over general memoryless channels in the presence of noiseless feedback, termed posterior matching. Using this principle, we devise a (simple, sequential) generic feedback transmission scheme suitable for a large class of memoryless channels and input distributions, achieving any rate below the corresponding mutual information. This provides a unified framework for optimal feedback communication in which the Horstein scheme (BSC) and the Schalkwijk-Kailath scheme (AWGN channel) are special cases. Thus, as a corollary, we prove that the Horstein scheme indeed attains the BSC capacity, settling a longstanding conjecture. We further provide closed form expressions for the error probability of the scheme over a range of rates, and derive the achievable rates in a mismatch setting where the scheme is designed according to the wrong channel model. Several illustrative examples of the posterior matching scheme for specific channels are given, and the corresponding error probability expressions are evaluated. The proof techniques employed utilize novel relations between information rates and contraction properties of iterated function systems.
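
To make the BSC special case mentioned in this abstract concrete, here is a discretized sketch in the spirit of the Horstein scheme: the message is one of `n_bins` bins, the transmitter (which knows the receiver's posterior through feedback) sends 1 if the message lies at or above a split point near the posterior median, and the receiver applies a Bayes update. Splitting at bin boundaries rather than at the exact median, the function name, and the `flips` argument that scripts the channel noise are all simplifying assumptions made for illustration, not the paper's construction.

```python
def horstein_posterior(message_bin, n_bins, p, n_uses, flips):
    """Posterior over message bins after n_uses of a BSC(p) with feedback.

    flips[t] is True when the channel inverts the t-th transmitted bit
    (a hypothetical way to script the noise so the run is reproducible).
    """
    posterior = [1.0 / n_bins] * n_bins
    for t in range(n_uses):
        # Split point: smallest j such that bins [0, j) hold mass >= 1/2.
        mass, j = 0.0, 0
        while j < n_bins and mass < 0.5:
            mass += posterior[j]
            j += 1
        # Transmitter sends 1 if the message is at or above the split point.
        x = 1 if message_bin >= j else 0
        y = x ^ 1 if flips[t] else x
        # Bayes update: bins consistent with y get weight 1-p, the rest p.
        for b in range(n_bins):
            bit_b = 1 if b >= j else 0
            posterior[b] *= (1.0 - p) if bit_b == y else p
        total = sum(posterior)
        posterior = [w / total for w in posterior]
    return posterior


# Noiseless sanity check: the scheme reduces to bisection over 16 bins.
post = horstein_posterior(message_bin=11, n_bins=16, p=0.0, n_uses=4, flips=[False] * 4)
print(post.index(max(post)))
```

With `p = 0` and no flips, four channel uses isolate one of 16 bins, as plain bisection would. With `p > 0` the posterior concentrates more slowly, and a finer grid with more channel uses is needed for it to settle on the transmitted bin.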

preprint 2010 arXiv

Parallel Bit Interleaved Coded Modulation

A new variant of bit interleaved coded modulation (BICM) is proposed. In the new scheme, called Parallel BICM, L identical binary codes are used in parallel using a mapper, a newly proposed finite-length interleaver and a binary dither signal. As opposed to previous approaches, the scheme does not rely on any assumptions of an ideal, infinite-length interleaver. Over a memoryless channel, the new scheme is proven to be equivalent to a binary memoryless channel. Therefore the scheme enables one to easily design coded modulation schemes using a simple binary code that was designed for that binary channel. The overall performance of the coded modulation scheme is analytically evaluated based on the performance of the binary code over the binary channel. The new scheme is analyzed from an information theoretic viewpoint, where the capacity, error exponent and channel dispersion are considered. The capacity of the scheme is identical to the BICM capacity. The error exponent of the scheme is numerically compared to a recently proposed mismatched-decoding exponent analysis of BICM.

preprint 2009 arXiv

Communication over Individual Channels

We consider the problem of communicating over a channel for which no mathematical model is specified. We present achievable rates as a function of the channel input and output known a-posteriori for discrete and continuous channels, as well as a rate-adaptive scheme employing feedback which achieves these rates asymptotically without prior knowledge of the channel behavior.

preprint 2008 arXiv

Signal Codes

Motivated by signal processing, we present a new class of channel codes, called signal codes, for continuous-alphabet channels. Signal codes are lattice codes whose encoding is done by convolving an integer information sequence with a fixed filter pattern. Decoding is based on the bidirectional sequential stack decoder, which can be implemented efficiently using the heap data structure. Error analysis and simulation results indicate that signal codes can achieve a low error rate at approximately 1 dB from channel capacity.
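
The encoding step described in this abstract, convolving an integer information sequence with a fixed filter, can be sketched in a few lines. The filter taps below are made-up values for illustration only; the paper's filter design, any shaping operation, and the stack decoder are not reproduced here, and the function name is hypothetical.

```python
def signal_code_encode(info, taps):
    """Convolve an integer information sequence with a fixed filter pattern.

    Returns the full linear convolution, of length len(info) + len(taps) - 1.
    """
    out = [0.0] * (len(info) + len(taps) - 1)
    for i, a in enumerate(info):
        for j, h in enumerate(taps):
            out[i + j] += a * h
    return out


taps = [1.0, -0.5]                             # hypothetical filter pattern
print(signal_code_encode([1, 2, 3], taps))     # → [1.0, 1.5, 2.0, -1.5]
```

Since convolution is linear, the sum of two codewords is the codeword of the sum of their integer sequences, which is why the resulting code forms a lattice.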

40 published item(s)

Beyond Ridge Regression for Distribution-Free Data

Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks

Non-linear Canonical Correlation Analysis: A Compressed Representation Approach

One shot approach to lossy source coding under average distortion constraints

Large Alphabet Source Coding using Independent Component Analysis

On the calculation of the minimax-converse of the channel coding problem

Variational formulas for the power of the binary hypothesis testing problem with applications

A Simple Proof for the Optimality of Randomized Posterior Matching

Achievable and Converse bounds over a general channel and general decoding metric

Generalized Independent Component Analysis Over Finite Alphabets

On the Diversity-Multiplexing Tradeoff of Unconstrained Multiple-Access Channels

Pulse collision picture of inter-channel nonlinear interference noise in fiber-optic communications

A Universal Decoder Relative to a Given Family of Metrics

Accumulation of nonlinear interference noise in fiber-optic systems

Delay and Redundancy in Lossless Source Coding

Dispersion of Infinite Constellations in Fast Fading Channels

Mitigation of inter-channel nonlinear interference in WDM systems

Source Broadcasting to the Masses: Separation has a Bounded Loss

A Universal Probability Assignment for Prediction of Individual Sequences

Finite-Memory Prediction as Well as the Empirical Mean

Fundamental Limits of Infinite Constellations in MIMO Fading Channels

Information Spectrum Approach to the Source Channel Separation Theorem

New Bounds on the Capacity of Fiber-Optics Communications

Non-Random Coding Error Exponent for Lattices

Properties of nonlinear noise in long, dispersion-uncompensated fiber links

The Random Coding Bound Is Tight for the Average Linear Code or Lattice

Time varying ISI model for nonlinear interference noise

Universal communication part I: modulo additive channels

Universal communication part II: channels with memory

A simpler derivation of the coding theorem

Communication over Individual Channels -- a general framework

On the Achievable Communication Rates of Generalized Soliton Transmission Systems

The Jacobi MIMO Channel

Universal Communication over Arbitrarily Varying Channels

Finite Dimensional Infinite Constellations

An Achievable Rate for the MIMO Individual Channel

Optimal Feedback Communication via Posterior Matching

Parallel Bit Interleaved Coded Modulation

Communication over Individual Channels

Signal Codes