Source author record

Or Ordentlich

Or Ordentlich appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory, math.IT, Machine Learning, math.CO, math.ST, Statistics Theory, Artificial Intelligence, Cryptography and Security, Discrete Mathematics, math.MG, math.NT

Catalog footprint

What is connected: 23 works, 11 topics, 4 close collaborators

preprint, 2026, arXiv

High-Rate Quantized Matrix Multiplication II

This is the second part of the work investigating quantized matrix multiplication (MatMul). In part I we considered the case of calibration-free quantization, whereas here we discuss the setting where the covariance matrix $Σ_X$ of the columns of the second factor is available. This setting arises in the ubiquitous task of weight-only post-training quantization of LLMs. Weight-only quantization is related to the problem of weighted mean squared error (WMSE) source coding, whose classical (reverse) waterfilling solution dictates how one should distribute rate between coordinates of the vector. We show how waterfilling can be used to improve practical LLM quantization algorithms (GPTQ), which at present allocate rate equally. A recent scheme (known as "WaterSIC") that only uses scalar INT quantizers is analyzed and its high-rate performance is shown to be (a) basis free (i.e., characterized by the determinant of $Σ_X$ and thus, unlike existing schemes, immune to applying random rotations); and (b) within a multiplicative factor of $\frac{2πe}{12}$ (or 0.25 bit/entry) of the information-theoretic distortion limit. GPTQ's performance, in turn, is affected by the choice of basis, but for a random rotation and actual $Σ_X$ from Llama-3-8B we find it to be within 0.1 bit (depending on the layer type) of WaterSIC, suggesting that GPTQ with random rotation is also near optimal, at least in the high-rate regime.
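As a rough illustration of the classical reverse waterfilling allocation mentioned in this abstract, the sketch below distributes a total distortion budget across independent Gaussian components with known positive variances. It is the textbook allocation only, not the paper's WaterSIC or GPTQ procedure; the function names and the bisection search are assumptions made for this example. The second helper converts the $\frac{2πe}{12}$ distortion factor into bits per entry, using the high-rate rule that distortion scales as $2^{-2R}$.

```python
import math

def reverse_waterfill(variances, total_distortion, iters=200):
    """Textbook reverse waterfilling for independent Gaussian components.

    Finds a water level t with sum(min(t, v)) == total_distortion and
    returns (t, per-component distortions, per-component rates in bits).
    Assumes all variances are strictly positive.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")
    lo, hi = 0.0, max(variances)
    for _ in range(iters):  # bisection on the water level
        t = 0.5 * (lo + hi)
        if sum(min(t, v) for v in variances) < total_distortion:
            lo = t
        else:
            hi = t
    t = 0.5 * (lo + hi)
    dists = [min(t, v) for v in variances]
    # Components with variance below the water level get zero rate.
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, dists)]
    return t, dists, rates

def shaping_gap_bits():
    """Rate penalty in bits per entry of a 2*pi*e/12 distortion factor."""
    return 0.5 * math.log2(2 * math.pi * math.e / 12)

if __name__ == "__main__":
    level, dists, rates = reverse_waterfill([4.0, 1.0, 0.25], 1.0)
    print(level, dists, rates)
    print(shaping_gap_bits())
```

In this toy run the weakest component falls below the water level and receives no rate, which is the behavior an equal-rate allocation cannot reproduce.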

preprint, 2022, arXiv

Deterministic Finite-Memory Bias Estimation

In this paper we consider the problem of estimating a Bernoulli parameter using finite memory. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables with expectation $θ$, where $θ\in [0,1]$. Consider a finite-memory deterministic machine with $S$ states, that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that the machine outputs an estimate at each time point according to some fixed mapping from the state space to the unit interval. The quality of the estimation procedure is measured by the asymptotic risk, which is the long-term average of the instantaneous quadratic risk. The main contribution of this paper is an upper bound on the smallest worst-case asymptotic risk any such machine can attain. This bound coincides with a lower bound derived by Leighton and Rivest, to imply that $Θ(1/S)$ is the minimax asymptotic risk for deterministic $S$-state machines. In particular, our result disproves a longstanding $Θ(\log S/S)$ conjecture for this quantity, also posed by Leighton and Rivest.
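The setup in this abstract can be sketched as a small simulation: a deterministic $S$-state machine updated by $M_n = f(M_{n-1},X_n)$, a fixed map from states to estimates, and a time average of the quadratic error. The update rule used here (a saturating counter) and the linear state-to-estimate map are hypothetical choices for illustration only; they are not the construction from the paper and are not expected to achieve the $Θ(1/S)$ risk.

```python
import random

def saturating_counter(state, x, num_states):
    """Hypothetical rule f: step up on a 1, down on a 0, clipped to [1, S]."""
    step = 1 if x == 1 else -1
    return min(num_states, max(1, state + step))

def average_quadratic_risk(theta, num_states, n_steps, seed=0):
    """Run M_n = f(M_{n-1}, X_n) and average (estimate - theta)^2 over time.

    Requires num_states >= 2. The estimate map m -> (m - 1) / (S - 1) is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    state = (num_states + 1) // 2  # start in the middle state
    total = 0.0
    for _ in range(n_steps):
        x = 1 if rng.random() < theta else 0
        state = saturating_counter(state, x, num_states)
        estimate = (state - 1) / (num_states - 1)
        total += (estimate - theta) ** 2
    return total / n_steps

if __name__ == "__main__":
    for theta in (0.1, 0.5, 0.9):
        print(theta, average_quadratic_risk(theta, 16, 100_000))
```

Swapping in a different `f` and estimate map is enough to compare candidate machines empirically under the same risk measure.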

preprint, 2022, arXiv

On The Memory Complexity of Uniformity Testing

In this paper we consider the problem of uniformity testing with limited memory. We observe a sequence of independent identically distributed random variables drawn from a distribution $p$ over $[n]$, which is either uniform or is $\varepsilon$-far from uniform under the total variation distance, and our goal is to determine the correct hypothesis. At each time point we are allowed to update the state of a finite-memory machine with $S$ states, where each state of the machine is assigned one of the hypotheses, and we are interested in obtaining an asymptotic probability of error at most $0<δ<1/2$ uniformly under both hypotheses. The main contribution of this paper is deriving upper and lower bounds on the number of states $S$ needed in order to achieve a constant error probability $δ$, as a function of $n$ and $\varepsilon$, where our upper bound is $O(\frac{n\log n}{\varepsilon})$ and our lower bound is $Ω(n+\frac{1}{\varepsilon})$. Prior works in the field have almost exclusively used collision counting for upper bounds, and the Paninski mixture for lower bounds. Somewhat surprisingly, in the limited memory with unlimited samples setup, the optimal solution does not involve counting collisions, and the Paninski prior is not hard. Thus, different proof techniques are needed in order to attain our bounds.

preprint, 2022, arXiv

Spiked Covariance Estimation from Modulo-Reduced Measurements

Consider the rank-1 spiked model: $\mathbf{X}=\sqrt{ν}\,ξ\mathbf{u}+ \mathbf{Z}$, where $ν$ is the spike intensity, $\mathbf{u}\in\mathbb{S}^{k-1}$ is an unknown direction and $ξ\sim \mathcal{N}(0,1)$, $\mathbf{Z}\sim \mathcal{N}(\mathbf{0},\mathbf{I})$. Motivated by recent advances in analog-to-digital conversion, we study the problem of recovering $\mathbf{u}\in \mathbb{S}^{k-1}$ from $n$ i.i.d. modulo-reduced measurements $\mathbf{Y}=[\mathbf{X}] \bmod Δ$, focusing on the high-dimensional regime ($k\gg 1$). We develop and analyze an algorithm that, for most directions $\mathbf{u}$ and $ν=\mathrm{poly}(k)$, estimates $\mathbf{u}$ to high accuracy using $n=\mathrm{poly}(k)$ measurements, provided that $Δ\gtrsim \sqrt{\log k}$. Up to constants, our algorithm accurately estimates $\mathbf{u}$ at the smallest possible $Δ$ that allows (in an information-theoretic sense) to recover $\mathbf{X}$ from $\mathbf{Y}$. A key step in our analysis involves estimating the probability that a line segment of length $\approx\sqrt{ν}$ in a random direction $\mathbf{u}$ passes near a point in the lattice $Δ\mathbb{Z}^k$. Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.
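To make the measurement model concrete, the following sketch draws one sample from the rank-1 spiked model and reduces it modulo $Δ$. It only generates data; it does not implement the recovery algorithm from the paper. Reducing into the centered interval $[-Δ/2, Δ/2)$ is an assumption, since the abstract does not fix the representative interval, and the function names are made up for this example.

```python
import math
import random

def centered_mod(x, delta):
    """Reduce x modulo delta into [-delta/2, delta/2) (assumed convention)."""
    return (x + delta / 2) % delta - delta / 2

def sample_modulo_measurement(u, nu, delta, rng):
    """Draw X = sqrt(nu) * xi * u + Z and return (X, Y) with Y = X mod delta."""
    xi = rng.gauss(0.0, 1.0)  # scalar spike coefficient
    x = [math.sqrt(nu) * xi * ui + rng.gauss(0.0, 1.0) for ui in u]
    y = [centered_mod(xj, delta) for xj in x]
    return x, y

if __name__ == "__main__":
    rng = random.Random(0)
    k = 8
    u = [1 / math.sqrt(k)] * k  # an arbitrary unit-norm direction
    x, y = sample_modulo_measurement(u, nu=100.0, delta=4.0, rng=rng)
    print(x)
    print(y)
```

Each coordinate of `x - y` is an integer multiple of `delta`, which is exactly the information the modulo operation discards.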

preprint, 2020, arXiv

A Note on the Probability of Rectangles for Correlated Binary Strings

Consider two sequences of $n$ independent and identically distributed fair coin tosses, $X=(X_1,\ldots,X_n)$ and $Y=(Y_1,\ldots,Y_n)$, which are $ρ$-correlated for each $j$, i.e. $\mathbb{P}[X_j=Y_j] = {1+ρ\over 2}$. We study the question of how large (small) the probability $\mathbb{P}[X \in A, Y\in B]$ can be among all sets $A,B\subset\{0,1\}^n$ of a given cardinality. For sets $|A|,|B| = Θ(2^n)$ it is well known that the largest (smallest) probability is approximately attained by concentric (anti-concentric) Hamming balls, and this can be proved via the hypercontractive inequality (reverse hypercontractivity). Here we consider the case of $|A|,|B| = 2^{Θ(n)}$. By applying a recent extension of the hypercontractive inequality of Polyanskiy-Samorodnitsky (J. Functional Analysis, 2019), we show that Hamming balls of the same size approximately maximize $\mathbb{P}[X \in A, Y\in B]$ in the regime of $ρ\to 1$. We also prove a similar tight lower bound, i.e. show that for $ρ\to 0$ the pair of opposite Hamming balls approximately minimizes the probability $\mathbb{P}[X \in A, Y\in B]$.
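For small $n$ the quantity $\mathbb{P}[X \in A, Y\in B]$ can be computed exactly by enumeration, which gives a quick sanity check of the Hamming-ball intuition. The sketch below is a brute-force calculator only, with hypothetical helper names; it does not reproduce the hypercontractivity arguments of the paper. It uses the fact that for a fixed pair at Hamming distance $d$, the joint probability is $2^{-n}\left(\frac{1+ρ}{2}\right)^{n-d}\left(\frac{1-ρ}{2}\right)^{d}$.

```python
from itertools import product

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def hamming_ball(n, center, radius):
    """All binary n-tuples within the given Hamming radius of center."""
    return [x for x in product((0, 1), repeat=n)
            if hamming_distance(x, center) <= radius]

def rectangle_probability(A, B, rho):
    """Exact P[X in A, Y in B] for rho-correlated uniform binary strings."""
    n = len(A[0])
    p_same, p_diff = (1 + rho) / 2, (1 - rho) / 2
    total = 0.0
    for x in A:
        for y in B:
            d = hamming_distance(x, y)
            total += p_same ** (n - d) * p_diff ** d
    return total / 2 ** n

if __name__ == "__main__":
    n, r, rho = 8, 2, 0.5
    zeros, ones = (0,) * n, (1,) * n
    same = rectangle_probability(hamming_ball(n, zeros, r),
                                 hamming_ball(n, zeros, r), rho)
    opposite = rectangle_probability(hamming_ball(n, zeros, r),
                                     hamming_ball(n, ones, r), rho)
    print(same, opposite)
```

The enumeration costs $|A|\cdot|B|\cdot n$ operations, so it is only practical for very small $n$.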

preprint, 2020, arXiv

An Information-Theoretic Proof of the Streaming Switching Lemma for Symmetric Encryption

Motivated by a fundamental paradigm in cryptography, we consider a recent variant of the classic problem of bounding the distinguishing advantage between a random function and a random permutation. Specifically, we consider the problem of deciding whether a sequence of $q$ values was sampled uniformly with or without replacement from $[N]$, where the decision is made by a streaming algorithm restricted to using at most $s$ bits of internal memory. In this work, the distinguishing advantage of such an algorithm is measured by the KL divergence between the distributions of its output as induced under the two cases. We show that for any $s=Ω(\log N)$ the distinguishing advantage is upper bounded by $O(q \cdot s / N)$, and even by $O(q \cdot s / (N \log N))$ when $q \leq N^{1 - ε}$ for any constant $ε> 0$, where it is nearly tight with respect to the KL divergence.

preprint, 2020, arXiv

Binary Hypothesis Testing with Deterministic Finite-Memory Decision Rules

In this paper we consider the problem of binary hypothesis testing with finite memory systems. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables, with expectation $p$ under $\mathcal{H}_0$ and $q$ under $\mathcal{H}_1$. Consider a finite-memory deterministic machine with $S$ states that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that we let the process run for a very long time ($n\rightarrow \infty$), and then make our decision according to some mapping from the state space to the hypothesis space. The main contribution of this paper is a lower bound on the Bayes error probability $P_e$ of any such machine. In particular, our findings show that the ratio between the maximal exponential decay rate of $P_e$ with $S$ for a deterministic machine and for a randomized one can become unbounded, complementing a result by Hellman.

preprint, 2020, arXiv

New bounds on the density of lattice coverings

We obtain new upper bounds on the minimal density of lattice coverings of Euclidean space by dilates of a convex body K. We also obtain bounds on the probability (with respect to the natural Haar-Siegel measure on the space of lattices) that a randomly chosen lattice L satisfies that L+K is all of space. As a step in the proof, we utilize and strengthen results on the discrete Kakeya problem.

preprint, 2016, arXiv

Mutual Information Bounds via Adjacency Events

The mutual information between two jointly distributed random variables $X$ and $Y$ is a functional of the joint distribution $P_{XY},$ which is sometimes difficult to handle or estimate. A coarser description of the statistical behavior of $(X,Y)$ is given by the marginal distributions $P_X, P_Y$ and the adjacency relation induced by the joint distribution, where $x$ and $y$ are adjacent if $P(x,y)>0$. We derive a lower bound on the mutual information in terms of these entities. The bound is obtained by viewing the channel from $X$ to $Y$ as a probability distribution on a set of possible actions, where an action determines the output for any possible input, and is independently drawn. We also provide an alternative proof based on convex optimization, that yields a generally tighter bound. Finally, we derive an upper bound on the mutual information in terms of adjacency events between the action and the pair $(X,Y)$, where in this case an action $a$ and a pair $(x,y)$ are adjacent if $y=a(x)$. As an example, we apply our bounds to the binary deletion channel and show that for the special case of an i.i.d. input distribution and a range of deletion probabilities, our lower and upper bounds both outperform the best known bounds for the mutual information.

preprint, 2016, arXiv

Novel Lower Bounds on the Entropy Rate of Binary Hidden Markov Processes

Recently, Samorodnitsky proved a strengthened version of Mrs. Gerber's Lemma, where the output entropy of a binary symmetric channel is bounded in terms of the average entropy of the input projected on a random subset of coordinates. Here, this result is applied for deriving novel lower bounds on the entropy rate of binary hidden Markov processes. For symmetric underlying Markov processes, our bound improves upon the best known bound in the very noisy regime. The nonsymmetric case is also considered, and explicit bounds are derived for Markov processes that satisfy the $(1,\infty)$-RLL constraint.

preprint, 2015, arXiv

A Simple Proof for the Existence of "Good" Pairs of Nested Lattices

This paper provides a simplified proof for the existence of nested lattice codebooks that achieve the capacity of the additive white Gaussian noise channel, as well as the optimal rate-distortion trade-off for a Gaussian source. The proof is self-contained and relies only on basic probabilistic and geometrical arguments. An ensemble of nested lattices that is different from, and more elementary than, the one used in previous proofs is introduced. This ensemble is based on lifting different subcodes of a linear code to the Euclidean space using Construction A. In addition to being simpler, our analysis is less sensitive to the assumption that the additive noise is Gaussian. In particular, for additive ergodic noise channels it is shown that the achievable rates of the nested lattice coding scheme depend on the noise distribution only via its power. Similarly, the nested lattice source coding scheme attains the same rate-distortion trade-off for all ergodic sources with the same second moment.
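Construction A, which the abstract uses to lift subcodes of a linear code, can be illustrated on a toy scale: the lattice is $\mathcal{C} + p\mathbb{Z}^n$, so an integer point belongs to it exactly when its reduction modulo $p$ is a codeword. The sketch below uses the unscaled integer form with a prime $p$ and tiny hand-picked generator matrices; these choices and the function names are assumptions for illustration and say nothing about the random ensemble analyzed in the paper.

```python
from itertools import product

def linear_code(generator_rows, p):
    """All codewords of the linear code over Z_p spanned by generator_rows."""
    n = len(generator_rows[0])
    k = len(generator_rows)
    return {
        tuple(sum(c * row[j] for c, row in zip(coeffs, generator_rows)) % p
              for j in range(n))
        for coeffs in product(range(p), repeat=k)
    }

def in_construction_a_lattice(point, code, p):
    """An integer point lies in C + p*Z^n iff (point mod p) is in C."""
    return tuple(x % p for x in point) in code

if __name__ == "__main__":
    p = 3
    fine_code = linear_code([[1, 2], [0, 1]], p)  # larger code, finer lattice
    coarse_code = linear_code([[1, 2]], p)        # subcode, coarser lattice
    for point in [(4, -1), (1, 1), (3, 6)]:
        print(point,
              in_construction_a_lattice(point, coarse_code, p),
              in_construction_a_lattice(point, fine_code, p))
```

Because the coarse code is a subcode of the fine code, every point of the coarse lattice is also a point of the fine lattice, which is the nesting property the abstract relies on.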

preprint, 2015, arXiv

An Improved Upper Bound for the Most Informative Boolean Function Conjecture

Suppose $X$ is a uniformly distributed $n$-dimensional binary vector and $Y$ is obtained by passing $X$ through a binary symmetric channel with crossover probability $α$. A recent conjecture by Courtade and Kumar postulates that $I(f(X);Y)\leq 1-h(α)$ for any Boolean function $f$. So far, the best known upper bound was $I(f(X);Y)\leq (1-2α)^2$. In this paper, we derive a new upper bound that holds for all balanced functions, and improves upon the best known bound for all $\tfrac{1}{3}<α<\tfrac{1}{2}$.
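For small $n$, the quantity $I(f(X);Y)$ in this abstract can be evaluated exactly by enumeration, which makes the conjectured bound $1-h(α)$ and the earlier bound $(1-2α)^2$ easy to compare on concrete functions. The sketch below is a brute-force calculator with hypothetical function names; it does not implement the new bound derived in the paper. It relies on the fact that, with $X$ uniform, $Y$ is also uniform and $P(x\mid y)=α^{d}(1-α)^{n-d}$, where $d$ is the Hamming distance between $x$ and $y$.

```python
import math
from itertools import product

def binary_entropy(p):
    """Binary entropy h(p) in bits, clamped against rounding outside [0, 1]."""
    p = min(1.0, max(0.0, p))
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def boolean_function_information(f, n, alpha):
    """Exact I(f(X); Y) in bits, X uniform on {0,1}^n, Y = X through BSC(alpha)."""
    points = list(product((0, 1), repeat=n))
    prob_one = sum(f(x) for x in points) / len(points)
    cond_entropy = 0.0
    for y in points:
        # P(f(X) = 1 | Y = y), summing the posterior over inputs with f(x) = 1
        q = 0.0
        for x in points:
            if f(x):
                d = sum(a != b for a, b in zip(x, y))
                q += alpha ** d * (1 - alpha) ** (n - d)
        cond_entropy += binary_entropy(q) / len(points)
    return binary_entropy(prob_one) - cond_entropy

if __name__ == "__main__":
    alpha = 0.1
    dictator = lambda x: x[0]
    majority = lambda x: int(sum(x) >= 2)
    print(boolean_function_information(dictator, 3, alpha))
    print(boolean_function_information(majority, 3, alpha))
    print(1 - binary_entropy(alpha), (1 - 2 * alpha) ** 2)
```

The dictator function $f(x)=x_1$ meets the conjectured bound with equality, so it serves as a reference value when trying other functions.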

preprint, 2015, arXiv

Minimum MS. E. Gerber's Lemma

Mrs. Gerber's Lemma lower bounds the entropy at the output of a binary symmetric channel in terms of the entropy of the input process. In this paper, we lower bound the output entropy via a different measure of input uncertainty, pertaining to the minimum mean squared error (MMSE) prediction cost of the input process. We show that in many cases our bound is tighter than the one obtained from Mrs. Gerber's Lemma. As an application, we evaluate the bound for binary hidden Markov processes, and obtain new estimates for the entropy rate.
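For reference, the classical Mrs. Gerber's Lemma that this abstract compares against states that the per-symbol output entropy is at least $h\big(α * h^{-1}(H)\big)$, where $H$ is the per-symbol input entropy, $α$ is the crossover probability, and $a*b=a(1-b)+b(1-a)$. The sketch below evaluates only this classical bound; the MMSE-based bound of the paper is not reproduced, and the numerical inversion of $h$ by bisection is an implementation choice made for this example.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def inverse_binary_entropy(value, iters=100):
    """Return p in [0, 1/2] with h(p) = value, found by bisection."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("entropy value must lie in [0, 1]")
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mrs_gerber_bound(input_entropy_rate, alpha):
    """Classical lower bound h(alpha * h^{-1}(H)) on the output entropy rate."""
    p = inverse_binary_entropy(input_entropy_rate)
    return binary_entropy(alpha * (1 - p) + p * (1 - alpha))

if __name__ == "__main__":
    for rate in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(rate, mrs_gerber_bound(rate, alpha=0.11))
```

At the two endpoints the bound behaves as expected: a deterministic input gives $h(α)$ and a maximally random input gives 1 bit.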

preprint, 2015, arXiv

Performance Analysis and Optimal Filter Design for Sigma-Delta Modulation via Duality with DPCM

Sampling above the Nyquist rate is at the heart of sigma-delta modulation, where the increase in sampling rate is translated to a reduction in the overall (mean-squared-error) reconstruction distortion. This is attained by using a feedback filter at the encoder, in conjunction with a low-pass filter at the decoder. The goal of this work is to characterize the optimal trade-off between the per-sample quantization rate and the resulting mean-squared-error distortion, under various restrictions on the feedback filter. To this end, we establish a duality relation between the performance of sigma-delta modulation, and that of differential pulse-code modulation when applied to (discrete-time) band-limited inputs. As the optimal trade-off for the latter scheme is fully understood, the full characterization for sigma-delta modulation, as well as the optimal feedback filters, immediately follow.

preprint, 2015, arXiv

Subset-Universal Lossy Compression

A lossy source code $\mathcal{C}$ with rate $R$ for a discrete memoryless source $S$ is called subset-universal if for every $0<R'< R$, almost every subset of $2^{nR'}$ of its codewords achieves average distortion close to the source's distortion-rate function $D(R')$. In this paper we prove the asymptotic existence of such codes. Moreover, we show the asymptotic existence of a code that is subset-universal with respect to all sources with the same alphabet.

preprint, 2014, arXiv

A VC-dimension-based Outer Bound on the Zero-Error Capacity of the Binary Adder Channel

The binary adder is a two-user multiple access channel whose inputs are binary and whose output is the real sum of the inputs. While the Shannon capacity region of this channel is well known, little is known regarding its zero-error capacity region, and a large gap remains between the best inner and outer bounds. In this paper, we provide an improved outer bound for this problem. To that end, we introduce a soft variation of the Sauer-Perles-Shelah Lemma, which is then used in conjunction with an outer bound for the Shannon capacity region with an additional common message.

preprint, 2014, arXiv

An Upper Bound on the Sizes of Multiset-Union-Free Families

Let $\mathcal{F}_1$ and $\mathcal{F}_2$ be two families of subsets of an $n$-element set. We say that $\mathcal{F}_1$ and $\mathcal{F}_2$ are multiset-union-free if for any $A,B\in \mathcal{F}_1$ and $C,D\in \mathcal{F}_2$ the multisets $A\uplus C$ and $B\uplus D$ are different, unless both $A = B$ and $C= D$. We derive a new upper bound on the maximal sizes of multiset-union-free pairs, improving a result of Urbanke and Li.
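The definition above can be checked mechanically for small families: a pair is multiset-union-free exactly when the map $(A,C)\mapsto A\uplus C$ is injective on $\mathcal{F}_1\times\mathcal{F}_2$. The sketch below represents each multiset union by its vector of element multiplicities; the function name and the representation of subsets as Python sets over $\{0,\ldots,n-1\}$ are assumptions for this example, and each family is assumed to contain no repeated subsets.

```python
from itertools import product

def is_multiset_union_free(family1, family2, n):
    """True iff all multiset unions A (+) C, A in family1, C in family2, differ.

    Subsets are given as sets over {0, ..., n-1}; the multiset union is
    encoded as the tuple of per-element multiplicities (each 0, 1, or 2).
    """
    seen = set()
    for a, c in product(family1, family2):
        counts = tuple((i in a) + (i in c) for i in range(n))
        if counts in seen:
            return False  # two different pairs give the same multiset
        seen.add(counts)
    return True

if __name__ == "__main__":
    print(is_multiset_union_free([set(), {0}], [set(), {1}], n=2))
    print(is_multiset_union_free([set(), {0}], [set(), {0}], n=2))
```

In the second call the pairs $(\emptyset,\{0\})$ and $(\{0\},\emptyset)$ produce the same multiset, so the check fails.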

preprint, 2014, arXiv

Precoded Integer-Forcing Universally Achieves the MIMO Capacity to Within a Constant Gap

An open-loop single-user multiple-input multiple-output communication scheme is considered where a transmitter, equipped with multiple antennas, encodes the data into independent streams all taken from the same linear code. The coded streams are then linearly precoded using the encoding matrix of a perfect linear dispersion space-time code. At the receiver side, integer-forcing equalization is applied, followed by standard single-stream decoding. It is shown that this communication architecture achieves the capacity of any Gaussian multiple-input multiple-output channel up to a gap that depends only on the number of transmit antennas.

preprint, 2014, arXiv

The Approximate Sum Capacity of the Symmetric Gaussian K-User Interference Channel

Interference alignment has emerged as a powerful tool in the analysis of multi-user networks. Despite considerable recent progress, the capacity region of the Gaussian K-user interference channel is still unknown in general, in part due to the challenges associated with alignment on the signal scale using lattice codes. This paper develops a new framework for lattice interference alignment, based on the compute-and-forward approach. Within this framework, each receiver decodes by first recovering two or more linear combinations of the transmitted codewords with integer-valued coefficients and then solving these equations for its desired codeword. For the special case of symmetric channel gains, this framework is used to derive the approximate sum capacity of the Gaussian interference channel, up to an explicitly defined outage set of the channel gains. The key contributions are the capacity lower bounds for the weak through strong interference regimes, where each receiver should jointly decode its own codeword along with part of the interfering codewords. As part of the analysis, it is shown that decoding K linear combinations of the codewords can approach the sum capacity of the K-user Gaussian multiple-access channel up to a gap of no more than K log(K)/2 bits.

preprint, 2013, arXiv

Integer-Forcing Source Coding

Integer-Forcing (IF) is a new framework, based on compute-and-forward, for decoding multiple integer linear combinations from the output of a Gaussian multiple-input multiple-output channel. This work applies the IF approach to arrive at a new low-complexity scheme, IF source coding, for distributed lossy compression of correlated Gaussian sources under a minimum mean squared error distortion measure. All encoders use the same nested lattice codebook. Each encoder quantizes its observation using the fine lattice as a quantizer and reduces the result modulo the coarse lattice, which plays the role of binning. Rather than directly recovering the individual quantized signals, the decoder first recovers a full-rank set of judiciously chosen integer linear combinations of the quantized signals, and then inverts it. In general, the linear combinations have smaller average powers than the original signals. This allows the density of the coarse lattice to be increased, which in turn translates to smaller compression rates. We also propose and analyze a one-shot version of IF source coding that is simple enough to potentially lead to a new design principle for analog-to-digital converters that can exploit spatial correlations between the sampled signals.

preprint, 2013, arXiv

Successive Integer-Forcing and its Sum-Rate Optimality

Integer-forcing receivers generalize traditional linear receivers for the multiple-input multiple-output channel by decoding integer-linear combinations of the transmitted streams, rather than the streams themselves. Previous works have shown that the additional degree of freedom in choosing the integer coefficients enables this receiver to approach the performance of maximum-likelihood decoding in various scenarios. Nonetheless, even for the optimal choice of integer coefficients, the additive noise at the equalizer's output is still correlated. In this work we study a variant of integer-forcing, termed successive integer-forcing, that exploits these noise correlations to improve performance. This scheme is the integer-forcing counterpart of successive interference cancellation for traditional linear receivers. Similarly to the latter, we show that successive integer-forcing is capacity achieving when it is possible to optimize the rate allocation to the different streams. In comparison to standard successive interference cancellation receivers, the successive integer-forcing receiver offers more possibilities for capacity achieving rate tuples, and in particular, ones that are more balanced.

preprint, 2011, arXiv

Cyclic-Coded Integer-Forcing Equalization

A discrete-time intersymbol interference channel with additive Gaussian noise is considered, where only the receiver has knowledge of the channel impulse response. An approach for combining decision-feedback equalization with channel coding is proposed, where decoding precedes the removal of intersymbol interference. This is accomplished by combining the recently proposed integer-forcing equalization approach with cyclic block codes. The channel impulse response is linearly equalized to an integer-valued response. This is then utilized by leveraging the property that a cyclic code is closed under (cyclic) integer-valued convolution. Explicit bounds on the performance of the proposed scheme are also derived.

preprint, 2011, arXiv

Interference Alignment at Finite SNR for Time-Invariant Channels

An achievable rate region, based on lattice interference alignment, is derived for a class of time-invariant Gaussian interference channels with more than two users. The result is established via a new coding theorem for the two-user Gaussian multiple-access channel where both users use a single linear code. The class of interference channels treated is such that all interference channel gains are rational. For this class of interference channels, beyond recovering the known results on the degrees of freedom, an explicit rate region is derived for finite signal-to-noise ratios, shedding light on the nature of previously established asymptotic results.
