Source author record

Galen Reeves

Galen Reeves appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory · math.IT · Machine Learning · math.ST · Statistics Theory · Computation · cond-mat.dis-nn · Computer Vision · cond-mat.stat-mech · math.PR · Methodology

Catalog footprint

What is connected

16 works

11 topics

4 close collaborators

preprint · 2025 · arXiv

Linear Operator Approximate Message Passing (OpAMP)

This paper introduces a framework for approximate message passing (AMP) in dynamic settings where the data at each iteration is passed through a linear operator. This framework is motivated in part by applications in large-scale, distributed computing where only a subset of the data is available at each iteration. An autoregressive memory term is used to mitigate information loss across iterations and a specialized algorithm, called projection AMP, is designed for the case where each linear operator is an orthogonal projection. Precise theoretical guarantees are provided for a class of Gaussian matrices and non-separable denoising functions. Specifically, it is shown that the iterates can be well-approximated in the high-dimensional limit by a Gaussian process whose second-order statistics are defined recursively via state evolution. These results are applied to the problem of estimating a rank-one spike corrupted by additive Gaussian noise using partial row updates, and the theory is validated by numerical simulations.

preprint · 2022 · arXiv

Approximating posteriors with high-dimensional nuisance parameters via integrated rotated Gaussian approximation

Posterior computation for high-dimensional data with many parameters can be challenging. This article focuses on a new method for approximating posterior distributions of a low- to moderate-dimensional parameter in the presence of a high-dimensional or otherwise computationally challenging nuisance parameter. The focus is on regression models and the key idea is to separate the likelihood into two components through a rotation. One component involves only the nuisance parameters, which can then be integrated out using a novel type of Gaussian approximation. We provide theory on approximation accuracy that holds for a broad class of forms of the nuisance component and priors. Applying our method to simulated and real data sets shows that it can outperform state-of-the-art posterior approximation approaches.

preprint · 2022 · arXiv

Fundamental limits for rank-one matrix estimation with groupwise heteroskedasticity

Low-rank matrix recovery problems involving high-dimensional and heterogeneous data appear in applications throughout statistics and machine learning. The contribution of this paper is to establish the fundamental limits of recovery for a broad class of these problems. In particular, we study the problem of estimating a rank-one matrix from Gaussian observations where different blocks of the matrix are observed under different noise levels. In the setting where the number of blocks is fixed while the number of variables tends to infinity, we prove asymptotically exact formulas for the minimum mean-squared error in estimating both the matrix and underlying factors. These results are based on a novel reduction from the low-rank matrix tensor product model (with homogeneous noise) to a rank-one model with heteroskedastic noise. As an application of our main result, we show that recently proposed methods based on applying principal component analysis (PCA) to weighted combinations of the data are optimal in some settings but sub-optimal in others. We also provide numerical results comparing our asymptotic formulas with the performance of methods based on weighted PCA, gradient descent, and approximate message passing.

preprint · 2021 · arXiv

Convergence of Gaussian-smoothed optimal transport distance with sub-gamma distributions and dependent samples

The Gaussian-smoothed optimal transport (GOT) framework, recently proposed by Goldfeld et al., scales to high dimensions in estimation and provides an alternative to entropy regularization. This paper provides convergence guarantees for estimating the GOT distance under more general settings. For the Gaussian-smoothed $p$-Wasserstein distance in $d$ dimensions, our results require only the existence of a moment greater than $d + 2p$. For the special case of sub-gamma distributions, we quantify the dependence on the dimension $d$ and establish a phase transition with respect to the scale parameter. We also prove convergence for dependent samples, only requiring a condition on the pairwise dependence of the samples measured by the covariance of the feature map of a kernel space. A key step in our analysis is to show that the GOT distance is dominated by a family of kernel maximum mean discrepancy (MMD) distances with a kernel that depends on the cost function as well as the amount of Gaussian smoothing. This insight provides further interpretability for the GOT framework and also introduces a class of kernel MMD distances with desirable properties. The theoretical results are supported by numerical experiments.

preprint · 2021 · arXiv

The Gaussian equivalence of generative models for learning with shallow neural networks

Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models. This is possible due to a Gaussian equivalence stating that the key metrics of interest, such as the training and test errors, can be fully captured by an appropriately chosen Gaussian model. We provide three strands of rigorous, analytical and numerical evidence corroborating this equivalence. First, we establish rigorous conditions for the Gaussian equivalence to hold in the case of single-layer generative models, as well as deterministic rates for convergence in distribution. Second, we leverage this equivalence to derive a closed set of equations describing the generalisation performance of two widely studied machine learning problems: two-layer neural networks trained using one-pass stochastic gradient descent, and full-batch pre-learned features or kernel methods. Finally, we perform experiments demonstrating how our theory applies to deep, pre-trained generative models. These results open a viable path to the theoretical study of machine learning models with realistic data.

preprint · 2020 · arXiv

Information-theoretic limits of a multiview low-rank symmetric spiked matrix model

We consider a generalization of an important class of high-dimensional inference problems, namely spiked symmetric matrix models, often used as probabilistic models for principal component analysis. Such paradigmatic models have recently attracted a lot of attention from a number of communities due to their phenomenological richness with statistical-to-computational gaps, while remaining tractable. We rigorously establish the information-theoretic limits through the proof of single-letter formulas for the mutual information and minimum mean-square error. On a technical side we improve the recently introduced adaptive interpolation method, so that it can be used to study low-rank models (i.e., estimation problems of "tall matrices") in full generality, an important step towards the rigorous analysis of more complicated inference and learning models.

preprint · 2019 · arXiv

All-or-Nothing Phenomena: From Single-Letter to High Dimensions

We consider the linear regression problem of estimating a $p$-dimensional vector $\beta$ from $n$ observations $Y = X\beta + W$, where $\beta_j \stackrel{\text{i.i.d.}}{\sim} \pi$ for a real-valued distribution $\pi$ with zero mean and unit variance, $X_{ij} \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)$, and $W_i \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)$. In the asymptotic regime where $n/p \to \delta$ and $p/\sigma^2 \to \mathsf{snr}$ for two fixed constants $\delta, \mathsf{snr} \in (0, \infty)$ as $p \to \infty$, the limiting (normalized) minimum mean-squared error (MMSE) has been characterized by the MMSE of an associated single-letter (additive Gaussian scalar) channel. In this paper, we show that if the MMSE function of the single-letter channel converges to a step function, then the limiting MMSE of estimating $\beta$ in the linear regression problem converges to a step function which jumps from $1$ to $0$ at a critical threshold. Moreover, we establish that the limiting mean-squared error of the (MSE-optimal) approximate message passing algorithm also converges to a step function with a larger threshold, providing evidence for the presence of a computational-statistical gap between the two thresholds.
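As a rough illustration of the single-letter (additive Gaussian scalar) channel mentioned in this abstract, the following sketch estimates its MMSE by Monte Carlo. It assumes a uniform prior on {-1, +1}, which has zero mean and unit variance but is chosen here only for convenience and is not taken from the paper; the function name `rademacher_mmse` and its parameters are likewise hypothetical.

```python
import math
import random


def rademacher_mmse(snr, num_samples=20000, seed=0):
    """Monte Carlo estimate of the MMSE of beta from y = sqrt(snr) * beta + z.

    Assumption (not from the paper): beta is uniform on {-1, +1} and
    z is standard Gaussian noise, independent of beta.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        beta = rng.choice((-1.0, 1.0))
        y = math.sqrt(snr) * beta + rng.gauss(0.0, 1.0)
        # For a uniform +/-1 prior the posterior mean is tanh(sqrt(snr) * y).
        beta_hat = math.tanh(math.sqrt(snr) * y)
        total += (beta - beta_hat) ** 2
    return total / num_samples


if __name__ == "__main__":
    for snr in (0.0, 0.1, 1.0, 25.0):
        print(snr, rademacher_mmse(snr))
```

Under this assumed prior the estimate starts at 1 when `snr` is zero and decays smoothly as `snr` grows, so it only illustrates the scalar channel and not the step-function limit studied in the paper.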

preprint · 2016 · arXiv

Classification and Reconstruction of High-Dimensional Signals from Low-Dimensional Features in the Presence of Side Information

This paper offers a characterization of fundamental limits on the classification and reconstruction of high-dimensional signals from low-dimensional features, in the presence of side information. We consider a scenario where a decoder has access both to linear features of the signal of interest and to linear features of the side information signal; while the side information may be in a compressed form, the objective is recovery or classification of the primary signal, not the side information. The signal of interest and the side information are each assumed to have (distinct) latent discrete labels; conditioned on these two labels, the signal of interest and side information are drawn from a multivariate Gaussian distribution. With joint probabilities on the latent labels, the overall signal-(side information) representation is defined by a Gaussian mixture model. We then provide sharp sufficient and/or necessary conditions for the misclassification probability and the reconstruction error to approach zero when the covariance matrices of the Gaussians are nearly low-rank. These conditions, which are reminiscent of the well-known Slepian-Wolf and Wyner-Ziv conditions, are a function of the number of linear features extracted from the signal of interest, the number of linear features extracted from the side information signal, and the geometry of these signals and their interplay. Moreover, on assuming that the signal of interest and the side information obey such an approximately low-rank model, we derive expansions of the reconstruction error as a function of the deviation from an exactly low-rank model; such expansions also allow identification of operational regimes where the impact of side information on signal reconstruction is most relevant. Our framework, which offers a principled mechanism to integrate side information in high-dimensional data problems, is also tested in the context of imaging applications.

preprint · 2016 · arXiv

Conditional Central Limit Theorems for Gaussian Projections

This paper addresses the question of when projections of a high-dimensional random vector are approximately Gaussian. This problem has been studied previously in the context of high-dimensional data analysis, where the focus is on low-dimensional projections of high-dimensional point clouds. The focus of this paper is on the typical behavior when the projections are generated by an i.i.d. Gaussian projection matrix. The main results are bounds on the deviation between the conditional distribution of the projections and a Gaussian approximation, where the conditioning is on the projection matrix. The bounds are given in terms of the quadratic Wasserstein distance and relative entropy and are stated explicitly as a function of the number of projections and certain key properties of the random vector. The proof uses Talagrand's transportation inequality and a general integral-moment inequality for mutual information. Applications to random linear estimation and compressed sensing are discussed.

preprint · 2016 · arXiv

The Replica-Symmetric Prediction for Compressed Sensing with Gaussian Matrices is Exact

This paper considers the fundamental limit of compressed sensing for i.i.d. signal distributions and i.i.d. Gaussian measurement matrices. Its main contribution is a rigorous characterization of the asymptotic mutual information (MI) and minimum mean-square error (MMSE) in this setting. Under mild technical conditions, our results show that the limiting MI and MMSE are equal to the values predicted by the replica method from statistical physics. This resolves a well-known problem that has remained open for over a decade.

preprint · 2015 · arXiv

Scalable Approximations of Marginal Posteriors in Variable Selection

In many contexts, there is interest in selecting the most important variables from a very large collection, commonly referred to as support recovery or variable, feature or subset selection. There is an enormous literature proposing a rich variety of algorithms. In scientific applications, it is of crucial importance to quantify uncertainty in variable selection, providing measures of statistical significance for each variable. The overwhelming majority of algorithms fail to produce such measures. This has led to a focus in the scientific literature on independent screening methods, which examine each variable in isolation, obtaining p-values measuring the significance of marginal associations. Bayesian methods provide an alternative, with marginal inclusion probabilities used in place of p-values. Bayesian variable selection has advantages, but is impractical computationally beyond small problems. In this article, we show that approximate message passing (AMP) and Bayesian compressed regression (BCR) can be used to rapidly obtain accurate approximations to marginal inclusion probabilities in high-dimensional variable selection. Theoretical support is provided, simulation studies are conducted to assess performance, and the method is applied to a study relating brain networks to creative reasoning.

preprint · 2013 · arXiv

Approximate Sparsity Pattern Recovery: Information-Theoretic Lower Bounds

Recovery of the sparsity pattern (or support) of an unknown sparse vector from a small number of noisy linear measurements is an important problem in compressed sensing. In this paper, the high-dimensional setting is considered. It is shown that if the measurement rate and per-sample signal-to-noise ratio (SNR) are finite constants independent of the length of the vector, then the optimal sparsity pattern estimate will have a constant fraction of errors. Lower bounds on the measurement rate needed to attain a desired fraction of errors are given in terms of the SNR and various key parameters of the unknown vector. The tightness of the bounds in a scaling sense, as a function of the SNR and the fraction of errors, is established by comparison with existing achievable bounds. Near optimality is shown for a wide variety of practically motivated signal models.

preprint · 2012 · arXiv

The Sampling Rate-Distortion Tradeoff for Sparsity Pattern Recovery in Compressed Sensing

Recovery of the sparsity pattern (or support) of an unknown sparse vector from a limited number of noisy linear measurements is an important problem in compressed sensing. In the high-dimensional setting, it is known that recovery with a vanishing fraction of errors is impossible if the measurement rate and the per-sample signal-to-noise ratio (SNR) are finite constants, independent of the vector length. In this paper, it is shown that recovery with an arbitrarily small but constant fraction of errors is, however, possible, and that in some cases computationally simple estimators are near-optimal. Bounds on the measurement rate needed to attain a desired fraction of errors are given in terms of the SNR and various key parameters of the unknown vector for several different recovery algorithms. The tightness of the bounds, in a scaling sense, as a function of the SNR and the fraction of errors, is established by comparison with existing information-theoretic necessary bounds. Near optimality is shown for a wide variety of practically motivated signal models.

preprint · 2011 · arXiv

A Compressed Sensing Wire-Tap Channel

A multiplicative Gaussian wire-tap channel inspired by compressed sensing is studied. Lower and upper bounds on the secrecy capacity are derived, and shown to be relatively tight in the large system limit for a large class of compressed sensing matrices. Surprisingly, it is shown that the secrecy capacity of this channel is nearly equal to the capacity without any secrecy constraint provided that the channel of the eavesdropper is strictly worse than the channel of the intended receiver. In other words, the eavesdropper can see almost everything and yet learn almost nothing. This behavior, which contrasts sharply with that of many commonly studied wiretap channels, is made possible by the fact that a small number of linear projections can make a crucial difference in the ability to estimate sparse vectors.

preprint · 2011 · arXiv

On the Role of Diversity in Sparsity Estimation

A major challenge in sparsity pattern estimation is that small modes are difficult to detect in the presence of noise. This problem is alleviated if one can observe samples from multiple realizations of the nonzero values for the same sparsity pattern. We will refer to this as "diversity". Diversity comes at a price, however, since each new realization adds new unknown nonzero values, thus increasing uncertainty. In this paper, upper and lower bounds on joint sparsity pattern estimation are derived. These bounds, which improve upon existing results even in the absence of diversity, illustrate key tradeoffs between the number of measurements, the accuracy of estimation, and the diversity. It is shown, for instance, that diversity introduces a tradeoff between the uncertainty in the noise and the uncertainty in the nonzero values. Moreover, it is shown that the optimal amount of diversity significantly improves the behavior of the estimation problem for both optimal and computationally efficient estimators.

preprint · 2010 · arXiv

"Compressed" Compressed Sensing

The field of compressed sensing has shown that a sparse but otherwise arbitrary vector can be recovered exactly from a small number of randomly constructed linear projections (or samples). The question addressed in this paper is whether an even smaller number of samples is sufficient when there exists prior knowledge about the distribution of the unknown vector, or when only partial recovery is needed. An information-theoretic lower bound with connections to free probability theory and an upper bound corresponding to a computationally simple thresholding estimator are derived. It is shown that in certain cases (e.g., discrete-valued vectors or large distortions) the number of samples can be decreased. Interestingly, though, it is also shown that in many cases no reduction is possible.
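To make the idea of a computationally simple thresholding estimator concrete, here is a minimal sketch that correlates each column of a Gaussian measurement matrix with the samples and keeps the k largest magnitudes. This is a generic correlation-thresholding rule and is not necessarily the estimator analyzed in the paper; the function names, the problem sizes, and the choice of unit-valued nonzero entries are all illustrative assumptions.

```python
import random


def top_k_support(A, y, k):
    """Return (sorted) indices of the k columns of A most correlated with y."""
    n, p = len(A), len(A[0])
    scores = [abs(sum(A[i][j] * y[i] for i in range(n))) for j in range(p)]
    # Keeping the k largest scores is thresholding at the k-th largest magnitude.
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def simulate(n=400, p=50, k=3, noise_std=0.1, seed=0):
    """Draw y = A x + w with a k-sparse x and estimate its support.

    Assumptions (illustrative only): i.i.d. standard Gaussian A,
    nonzero entries of x equal to 1, Gaussian noise with std noise_std.
    """
    rng = random.Random(seed)
    support = sorted(rng.sample(range(p), k))
    x = [1.0 if j in support else 0.0 for j in range(p)]
    A = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [
        sum(A[i][j] * x[j] for j in range(p)) + rng.gauss(0.0, noise_std)
        for i in range(n)
    ]
    return support, top_k_support(A, y, k)


if __name__ == "__main__":
    true_support, estimated_support = simulate()
    print(true_support, estimated_support)
```

With many more samples than nonzero entries, as in the default settings, the on-support correlations are far larger than the off-support ones; the regime of interest in the abstract is the opposite one, where samples are scarce.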
