Source author record

Michael Celentano

Michael Celentano appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.ST Statistics Theory Machine Learning math.PR

Catalog footprint

What is connected

3works

4topics

2close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Fundamental Barriers to High-Dimensional Regression with Convex Penalties

In high-dimensional regression, we attempt to estimate a parameter vector $β_0\in\mathbb{R}^p$ from $n\lesssim p$ observations $\{(y_i,x_i)\}_{i\leq n}$ where $x_i\in\mathbb{R}^p$ is a vector of predictors and $y_i$ is a response variable. A well-established approach uses convex regularizers to promote specific structures (e.g. sparsity) of the estimate $\widehatβ$, while allowing for practical algorithms. Theoretical analysis implies that convex penalization schemes have nearly optimal estimation properties in certain settings. However, in general the gaps between statistically optimal estimation (with unbounded computational resources) and convex methods are poorly understood. We show that when the statistican has very simple structural information about the distribution of the entries of $β_0$, a large gap frequently exists between the best performance achieved by any convex regularizer satisfying a mild technical condition and either (i) the optimal statistical error or (ii) the statistical error achieved by optimal approximate message passing algorithms. Remarkably, a gap occurs at high enough signal-to-noise ratio if and only if the distribution of the coordinates of $β_0$ is not log-concave. These conclusions follow from an analysis of standard Gaussian designs. Our lower bounds are expected to be generally tight, and we prove tightness under certain conditions.

preprint2022arXiv

Sudakov-Fernique post-AMP, and a new proof of the local convexity of the TAP free energy

In many problems in modern statistics and machine learning, it is often of interest to establish that a first order method on a non-convex risk function eventually enters a region of parameter space in which the risk is locally convex. We derive an asymptotic comparison inequality, which we call the Sudakov-Fernique post-AMP inequality, which, in a certain class of problems involving a GOE matrix, is able to probe properties of an optimization landscape locally around the iterates of an approximate message passing (AMP) algorithm. As an example of its use, we provide a new, and arguably simpler, proof of some of the results of Celentano et al. (2021), which establishes that the so-called TAP free energy in the $\mathbb{Z}_2$-synchronization problem is locally convex in the region to which AMP converges. We further prove a conjecture of El Alaoui et al. (2022) involving the local convexity of a related but distinct TAP free energy, which, as a consequence, confirms that their algorithm efficiently samples from the Sherrington-Kirkpatrick Gibbs measure throughout the "easy" regime.

preprint2020arXiv

The estimation error of general first order methods

Modern large-scale statistical models require to estimate thousands to millions of parameters. This is often accomplished by iterative algorithms such as gradient descent, projected gradient descent or their accelerated versions. What are the fundamental limits to these approaches? This question is well understood from an optimization viewpoint when the underlying objective is convex. Work in this area characterizes the gap to global optimality as a function of the number of iterations. However, these results have only indirect implications in terms of the gap to statistical optimality. Here we consider two families of high-dimensional estimation problems: high-dimensional regression and low-rank matrix estimation, and introduce a class of `general first order methods' that aim at efficiently estimating the underlying parameters. This class of algorithms is broad enough to include classical first order optimization (for convex and non-convex objectives), but also other types of algorithms. Under a random design assumption, we derive lower bounds on the estimation error that hold in the high-dimensional asymptotics in which both the number of observations and the number of parameters diverge. These lower bounds are optimal in the sense that there exist algorithms whose estimation error matches the lower bounds up to asymptotically negligible terms. We illustrate our general results through applications to sparse phase retrieval and sparse principal component analysis.