Source author record

Pierre C Bellec

Pierre C Bellec appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: math.ST, Statistics Theory, Machine Learning, Methodology

Catalog footprint: 3 works, 4 topics, 4 close collaborators

Preprint, 2022, arXiv

Noise Covariance Estimation in Multi-Task High-dimensional Linear Models

This paper studies multi-task high-dimensional linear regression models in which the noise is correlated across tasks, in the moderately high-dimensional regime where the sample size $n$ and the dimension $p$ are of the same order. Our goal is to estimate the covariance matrix of the noise random vectors, or equivalently the correlation of the noise variables for any pair of tasks. Treating the regression coefficients as a nuisance parameter, we leverage the multi-task elastic-net and multi-task lasso estimators to estimate the nuisance. By precisely understanding the bias of the squared residual matrix and by correcting this bias, we develop a novel estimator of the noise covariance that converges in Frobenius norm at the rate $n^{-1/2}$ when the covariates are Gaussian. This novel estimator is efficiently computable. Under suitable conditions, the proposed estimator of the noise covariance attains the same rate of convergence as the "oracle" estimator that knows the regression coefficients of the multi-task model in advance. The Frobenius error bounds obtained in this paper also illustrate the advantage of this new estimator over a method-of-moments estimator that does not attempt to estimate the nuisance. As a byproduct of our techniques, we obtain an estimate of the generalization error of the multi-task elastic-net and multi-task lasso estimators. Extensive simulation studies are carried out to illustrate the numerical performance of the proposed method.
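To make the "oracle" benchmark in this abstract concrete, here is a minimal sketch, not the paper's estimator. It assumes the oracle is the sample second-moment matrix $\frac{1}{n} E^\top E$ of the true noise matrix $E$, which is what knowing the regression coefficients would give access to. The two-task setup, the correlation value `rho`, and all function names are illustrative assumptions. The simulation only illustrates that the oracle's Frobenius error shrinks roughly like $n^{-1/2}$.

```python
import math
import random


def oracle_noise_covariance(noise):
    """Sample second-moment matrix (1/n) E^T E of an n x T noise matrix."""
    n = len(noise)
    num_tasks = len(noise[0])
    return [
        [sum(row[s] * row[t] for row in noise) / n for t in range(num_tasks)]
        for s in range(num_tasks)
    ]


def frobenius_error(a, b):
    """Frobenius norm of the difference between two square matrices."""
    size = len(a)
    return math.sqrt(
        sum((a[s][t] - b[s][t]) ** 2 for s in range(size) for t in range(size))
    )


def simulate_noise(n, rho, rng):
    """Draw n mean-zero Gaussian noise vectors for two tasks with correlation rho."""
    rows = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        rows.append([g1, rho * g1 + math.sqrt(1.0 - rho ** 2) * g2])
    return rows


rng = random.Random(0)
rho = 0.6  # hypothetical noise correlation between the two tasks
true_cov = [[1.0, rho], [rho, 1.0]]

# Average Frobenius error of the oracle over 20 repetitions per sample size.
errors = {}
for n in (100, 10000):
    reps = [
        frobenius_error(oracle_noise_covariance(simulate_noise(n, rho, rng)), true_cov)
        for _ in range(20)
    ]
    errors[n] = sum(reps) / len(reps)

print(errors)
```

Increasing $n$ by a factor of 100 should reduce the average error by roughly a factor of 10, consistent with an $n^{-1/2}$ rate. In practice the noise matrix is unobserved, which is why the abstract estimates the nuisance coefficients and corrects the bias of the squared residual matrix.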

Preprint, 2020, arXiv

First order expansion of convex regularized estimators

We consider first order expansions of convex penalized estimators in high-dimensional regression problems with random designs. Our setting includes linear regression and logistic regression as special cases. For a given penalty function $h$ and the corresponding penalized estimator $\hat{\beta}$, we construct a quantity $\eta$, the first order expansion of $\hat{\beta}$, such that the distance between $\hat{\beta}$ and $\eta$ is an order of magnitude smaller than the estimation error $\|\hat{\beta} - \beta^*\|$. In this sense, the first order expansion $\eta$ can be thought of as a generalization of influence functions from the mathematical statistics literature to regularized estimators in high dimensions. Such a first order expansion implies that the risk of $\hat{\beta}$ is asymptotically the same as the risk of $\eta$, which leads to a precise characterization of the MSE of $\hat{\beta}$; this characterization takes a particularly simple form for isotropic design. Such a first order expansion also leads to inference results based on $\hat{\beta}$. We provide sufficient conditions for the existence of such a first order expansion for three regularizers: the Lasso in its constrained form, the Lasso in its penalized form, and the Group-Lasso. The results apply to general loss functions under some conditions, and those conditions are satisfied for the squared loss in linear regression and for the logistic loss in the logistic model.

Preprint, 2020, arXiv

Second order Stein: SURE for SURE and other applications in high-dimensional inference

Stein's formula states that a random variable of the form $z^\top f(z) - \text{div} f(z)$ is mean-zero for functions $f$ with integrable gradient. Here, $\text{div} f$ is the divergence of the function $f$ and $z$ is a standard normal vector. This paper proposes a Second Order Stein formula that characterizes the variance of such random variables for all functions $f(z)$ with square integrable gradient, and demonstrates the usefulness of this formula in various applications. In the Gaussian sequence model, a consequence of Stein's formula is Stein's Unbiased Risk Estimate (SURE), an unbiased estimate of the mean squared risk for almost any estimator $\hat{\mu}$ of the unknown mean. A first application of the Second Order Stein formula is an Unbiased Risk Estimate for SURE itself (SURE for SURE): an unbiased estimate providing information about the squared distance between SURE and the squared estimation error of $\hat{\mu}$. SURE for SURE has a simple form as a function of the data and is applicable to all $\hat{\mu}$ with square integrable gradient, e.g. the Lasso and the Elastic Net. In addition to SURE for SURE, the following applications are developed: (1) Upper bounds on the risk of SURE when the estimation target is the mean squared error; (2) Confidence regions based on SURE; (3) Oracle inequalities satisfied by SURE-tuned estimates; (4) An upper bound on the variance of the size of the model selected by the Lasso; (5) Explicit expressions of SURE for SURE for the Lasso and the Elastic Net; (6) In the linear model, a general semi-parametric scheme to de-bias a differentiable initial estimator for inference of a low-dimensional projection of the unknown $\beta$, with a characterization of the variance after de-biasing; and (7) An accuracy analysis of a Gaussian Monte Carlo scheme to approximate the divergence of functions $f: \mathbb{R}^n \to \mathbb{R}^n$.
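As a small illustration of the first-order statement in this abstract (SURE is unbiased for the squared estimation error), here is a hedged sketch in the Gaussian sequence model with unit noise variance. It uses soft-thresholding, which coincides with the Lasso when the design is the identity. It does not implement the paper's Second Order Stein formula or SURE for SURE. The threshold `lam`, the sparse mean vector, and the function names are illustrative assumptions.

```python
import random


def soft_threshold(y, lam):
    """Coordinate-wise soft-thresholding of the observation vector y."""
    return [(abs(v) - lam) * (1.0 if v > 0 else -1.0) if abs(v) > lam else 0.0 for v in y]


def sure_soft_threshold(y, lam):
    """SURE with unit noise variance: ||y - mu_hat||^2 + 2 * div - n.

    For soft-thresholding, the divergence equals the number of
    coordinates with |y_i| > lam (almost everywhere).
    """
    mu_hat = soft_threshold(y, lam)
    divergence = sum(1 for v in y if abs(v) > lam)
    residual = sum((a - b) ** 2 for a, b in zip(y, mu_hat))
    return residual + 2.0 * divergence - len(y)


rng = random.Random(0)
n, lam, reps = 200, 1.0, 500      # hypothetical problem size and threshold
mu = [3.0] * 20 + [0.0] * (n - 20)  # hypothetical sparse mean vector

sure_vals, loss_vals = [], []
for _ in range(reps):
    y = [m + rng.gauss(0.0, 1.0) for m in mu]
    mu_hat = soft_threshold(y, lam)
    sure_vals.append(sure_soft_threshold(y, lam))
    loss_vals.append(sum((a - b) ** 2 for a, b in zip(mu_hat, mu)))

avg_sure = sum(sure_vals) / reps
avg_loss = sum(loss_vals) / reps
print(avg_sure, avg_loss)
```

The two averages should be close, reflecting unbiasedness. On any single draw, however, SURE and the actual squared error can differ noticeably, and the size of that gap is the quantity the abstract's SURE for SURE is designed to estimate.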
