Source author record

Matthieu Lerasle

Matthieu Lerasle appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.ST, Statistics Theory, Machine Learning, math.PR, Methodology

Catalog footprint

What is connected

13 works

5 topics

4 close collaborators

Preprint, 2021, arXiv

On the robustness of the minimum $\ell_2$ interpolator

We analyse the interpolator with minimal $\ell_2$-norm $\hat{β}$ in a general high-dimensional linear regression framework $\mathbb Y=\mathbb Xβ^*+ξ$, where $\mathbb X$ is a random $n\times p$ matrix with independent $\mathcal N(0,Σ)$ rows, without any assumption on the noise vector $ξ\in \mathbb R^n$. We prove that, with high probability, the prediction loss of this estimator is bounded from above by $(\|β^*\|^2_2r_{cn}(Σ)\vee \|ξ\|_2^2)/n$, where $r_{k}(Σ)=\sum_{i\geq k}λ_i(Σ)$ are the tail sums of the eigenvalues of $Σ$. These bounds show a transition in the rates. For high signal-to-noise ratios, the rates $\|β^*\|^2_2r_{cn}(Σ)/n$ broadly improve the existing ones. For low signal-to-noise ratios, we also provide a lower bound holding with large probability. Under assumptions on the spectrum of $Σ$, this lower bound is of order $\|ξ\|_2^2/n$, matching the upper bound. Consequently, in the large noise regime, we are able to precisely track the prediction error with large probability. These results give new insight into when interpolation can be harmless in high dimensions.
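
For orientation, the estimator named in this abstract has a standard closed form. The display below is a sketch that assumes $\mathbb X\mathbb X^\top$ is invertible, an assumption that is not stated in the abstract itself. It also restates the quantity $r_k(Σ)$ defined there.

```latex
\hat{β} \in \operatorname*{argmin}\bigl\{\|β\|_2 : \mathbb X β = \mathbb Y\bigr\},
\qquad
\hat{β} = \mathbb X^\top(\mathbb X\mathbb X^\top)^{-1}\mathbb Y,
\qquad
r_k(Σ) = \sum_{i\geq k} λ_i(Σ).
```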

Preprint, 2021, arXiv

Robust high dimensional learning for Lipschitz and convex losses

We establish risk bounds for Regularized Empirical Risk Minimizers (RERM) when the loss is Lipschitz and convex and the regularization function is a norm. In the first part, we obtain these results in the i.i.d. setup under subgaussian assumptions on the design. In the second part, we consider a more general framework where the design might have heavier tails and the data may be corrupted by outliers in both the design and the response variables. In this situation, RERM performs poorly in general. We analyse an alternative procedure based on median-of-means principles, called minmax MOM. We show optimal subgaussian deviation rates for these estimators in the relaxed setting. The main results are meta-theorems allowing a wide range of applications to various problems in learning theory. To show a non-exhaustive sample of these potential applications, they are applied to classification problems with logistic loss functions regularized by LASSO and SLOPE, and to regression problems with Huber loss regularized by Group LASSO and Total Variation. Another advantage of the minmax MOM formulation is that it suggests a systematic way to slightly modify descent-based algorithms used in high-dimensional statistics to make them robust to outliers. We illustrate this principle in a Simulations section, where classical proximal descent algorithms are turned into outlier-robust algorithms through their minmax MOM versions.
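
The median-of-means principle mentioned in this abstract can be illustrated on the simplest task, estimating a mean. The sketch below is not the minmax MOM estimator of the paper. The function name `median_of_means`, the block count and the toy data are all assumptions made for illustration.

```python
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Split the data into blocks, average each block, return the median of the averages."""
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    # Block b takes items b, b + n_blocks, ... so block sizes differ by at most one.
    blocks = [shuffled[b::n_blocks] for b in range(n_blocks)]
    return statistics.median(statistics.fmean(block) for block in blocks)


# Hypothetical toy data: 99 clean observations around 1.0 plus one gross outlier.
rng = random.Random(1)
sample = [rng.gauss(1.0, 0.1) for _ in range(99)] + [1e6]

mom_estimate = median_of_means(sample, n_blocks=11)
mean_estimate = statistics.fmean(sample)
print(f"median of means: {mom_estimate:.3f}, empirical mean: {mean_estimate:.1f}")
```

A single outlier can contaminate only one block, so the median of the block means stays close to 1.0 while the empirical mean is dragged far away.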

Preprint, 2020, arXiv

Learning the distribution of latent variables in paired comparison models with round-robin scheduling

The paired comparison data considered in this paper originate from the comparison of a large number N of individuals in couples. The dataset is a collection of results of contests between two individuals when each of them has faced n opponents, where n is much smaller than N. Individuals are represented by independent and identically distributed random parameters characterizing their abilities. The paper studies the maximum likelihood estimator of the distribution of these parameters. The analysis relies on the construction of a graphical model encoding the conditional dependencies of the observations, which are the outcomes of the first n contests each individual is involved in. This graphical model allows us to prove geometric loss-of-memory properties and to deduce the asymptotic behavior of the likelihood function. This paper focuses on graphical models obtained from round-robin scheduling of these contests. Following a classical construction in learning theory, the asymptotic likelihood is used to measure the performance of the maximum likelihood estimator. Risk bounds for this estimator are finally obtained by applying sub-Gaussian deviation results for Markov chains to the graphical model.

Preprint, 2019, arXiv

Aggregated Hold-Out

Aggregated hold-out (Agghoo) is a method which averages learning rules selected by hold-out (that is, cross-validation with a single split). We provide the first theoretical guarantees on Agghoo, ensuring that it can be used safely: Agghoo performs at worst like the hold-out when the risk is convex. The same holds true in classification with the 0-1 risk, with an additional constant factor. For the hold-out, oracle inequalities are known for bounded losses, as in binary classification. We show that similar results can be proved, under appropriate assumptions, for other risk-minimization problems. In particular, we obtain an oracle inequality for regularized kernel regression with a Lipschitz loss, without requiring that the Y variable or the regressors be bounded. Numerical experiments show that aggregation brings a significant improvement over the hold-out and that Agghoo is competitive with cross-validation.
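
The averaging step described in this abstract can be sketched on a toy problem. The example below assumes a one-dimensional ridge family through the origin and squared loss, which is not the setting of the paper's experiments. The names `fit_ridge_1d`, `hold_out_select` and `agghoo`, the split sizes and the data are hypothetical.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """One-dimensional ridge through the origin: slope = <x, y> / (<x, x> + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def hold_out_select(xs, ys, lams, train_idx, val_idx):
    """Fit each candidate on the training part, keep the one with lowest validation MSE."""
    best_slope, best_err = None, float("inf")
    for lam in lams:
        slope = fit_ridge_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], lam)
        err = sum((ys[i] - slope * xs[i]) ** 2 for i in val_idx) / len(val_idx)
        if err < best_err:
            best_slope, best_err = slope, err
    return best_slope


def agghoo(xs, ys, lams, n_splits=5, train_frac=0.8, seed=0):
    """Average the hold-out-selected predictors over several random splits."""
    rng = random.Random(seed)
    n = len(xs)
    n_train = int(train_frac * n)
    slopes = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        slopes.append(hold_out_select(xs, ys, lams, idx[:n_train], idx[n_train:]))
    # The predictors x -> slope * x are linear, so averaging them is averaging slopes.
    return sum(slopes) / len(slopes)


# Hypothetical data: y = 2 x + noise.
data_rng = random.Random(42)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

agg_slope = agghoo(xs, ys, lams=[0.0, 1.0, 10.0, 100.0])
print(f"aggregated slope: {agg_slope:.3f}")
```

With `n_splits=1` the procedure reduces to the plain hold-out, which is the baseline the abstract compares against.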

Preprint, 2016, arXiv

Sharp oracle inequalities and slope heuristic for specification probabilities estimation in discrete random fields

We study the problem of estimating the one-point specification probabilities in not necessarily finite discrete random fields from partially observed independent samples. Our procedures are based on model selection by minimization of a penalized empirical criterion. The selected estimators satisfy sharp oracle inequalities in $L_2$-risk. We also obtain theoretical results on the slope heuristic for this problem, justifying the slope algorithm to calibrate the leading constant in the penalty. The practical performance of our methods is investigated in two simulation studies. We illustrate the usefulness of our approach by applying the methods to multi-unit neuronal data from a rat hippocampus.

Preprint, 2015, arXiv

Choice of V for V-Fold Cross-Validation in Least-Squares Density Estimation

This paper studies V-fold cross-validation for model selection in least-squares density estimation. The goal is to provide theoretical grounds for choosing V in order to minimize the least-squares loss of the selected estimator. We first prove a non-asymptotic oracle inequality for V-fold cross-validation and its bias-corrected version (V-fold penalization). In particular, this result implies that V-fold penalization is asymptotically optimal in the nonparametric case. Then, we compute the variance of V-fold cross-validation and related criteria, as well as the variance of key quantities for model selection performance. We show that these variances depend on V like 1+4/(V-1), at least in some particular cases, suggesting that performance improves substantially from V=2 to V=5 or 10 and is almost constant thereafter. Overall, this can explain the common advice to take V=5 (at least in our setting and when the computational power is limited), as supported by some simulation experiments. An oracle inequality and exact formulas for the variance are also proved for Monte-Carlo cross-validation, also known as repeated cross-validation, where the parameter V is replaced by the number B of random splits of the data.
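
The dependence on V quoted in this abstract is easy to tabulate. The snippet below only evaluates the factor 1+4/(V-1) for a few values of V. The function name `vfold_variance_factor` is an assumption, and the factor itself holds, per the abstract, "at least in some particular cases".

```python
def vfold_variance_factor(v):
    """Evaluate 1 + 4/(V - 1), the V-dependence of the variance quoted in the abstract."""
    if v < 2:
        raise ValueError("V-fold cross-validation needs V >= 2")
    return 1 + 4 / (v - 1)


factors = {v: vfold_variance_factor(v) for v in (2, 5, 10, 20)}
for v, factor in factors.items():
    print(f"V = {v:2d}: factor = {factor:.3f}")
# V =  2: factor = 5.000
# V =  5: factor = 2.000
# V = 10: factor = 1.444
# V = 20: factor = 1.211
```

Most of the reduction happens between V=2 and V=5, which is consistent with the advice reported in the abstract.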

Preprint, 2015, arXiv

Sub-Gaussian mean estimators

We discuss the possibilities and limitations of estimating the mean of a real-valued random variable from independent and identically distributed observations from a non-asymptotic point of view. In particular, we define estimators with a sub-Gaussian behavior even for certain heavy-tailed distributions. We also prove various impossibility results for mean estimators.

Preprint, 2015, arXiv

The number of potential winners in Bradley-Terry model in random environment

We consider a Bradley-Terry model in random environment where each player faces every other player once. More precisely, the strengths of the players are assumed to be random, and we study the influence of their distributions on the asymptotic number of potential winners. First we prove that under mild assumptions, mainly on their moments, if the strengths are unbounded, the asymptotic probability that the best player wins is 1. We also exhibit a sufficient convexity condition to obtain the same result when the strengths are bounded. When this last condition fails, the number of potential winners grows at a rate depending on the tail of the distribution of strengths. We also study the minimal strength required for an additional player to win in this last case.
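
The setting of this abstract can be simulated in a few lines. The sketch below uses the standard Bradley-Terry rule, in which player i beats player j with probability v_i / (v_i + v_j). The exponential strength distribution, the number of players and the function name `play_round_robin` are assumptions chosen for illustration, and the snippet does not compute the paper's notion of potential winners.

```python
import random


def play_round_robin(strengths, rng):
    """Each pair meets once; player i beats player j with probability v_i / (v_i + v_j)."""
    n = len(strengths)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            p_i_wins = strengths[i] / (strengths[i] + strengths[j])
            if rng.random() < p_i_wins:
                wins[i] += 1
            else:
                wins[j] += 1
    return wins


rng = random.Random(0)
# Assumed unbounded strength distribution: i.i.d. exponential with rate 1.
strengths = [rng.expovariate(1.0) for _ in range(50)]
wins = play_round_robin(strengths, rng)

top_player = max(range(len(wins)), key=wins.__getitem__)
strongest_player = max(range(len(strengths)), key=strengths.__getitem__)
print(f"most wins: player {top_player}, strongest: player {strongest_player}")
```

Repeating the tournament over many seeds gives an empirical frequency with which the strongest player also has the most wins.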

Preprint, 2012, arXiv

Markov Approximations of chains of infinite order in the $\bar{d}$-metric

We derive explicit upper bounds for the $\bar{d}$-distance between a chain of infinite order and its canonical $k$-step Markov approximation. Our proof is entirely constructive and involves a "coupling from the past" argument. The new method covers probability kernels that are not necessarily continuous, as well as chains with null transition probabilities. These results imply in particular the Bernoulli property for these processes.

Preprint, 2011, arXiv

Optimal model selection for density estimation of stationary data under various mixing conditions

We propose a block-resampling penalization method for marginal density estimation with not necessarily independent observations. When the data are $β$- or $τ$-mixing, the selected estimator satisfies oracle inequalities with leading constant asymptotically equal to 1. We also prove in this setting the slope heuristic, which is a data-driven method to optimize the leading constant in the penalty.

Preprint, 2010, arXiv

Adaptive non-asymptotic confidence balls in density estimation

We build confidence balls for the common density $s$ of a real-valued sample $X_1,\dots,X_n$. We use resampling methods to estimate the projection of $s$ onto finite-dimensional linear spaces and a model selection procedure to choose an optimal approximation space. The covering property is ensured for all $n\geq2$ and the balls are adaptive over a collection of linear spaces.

Preprint, 2010, arXiv

An Oracle Approach for Interaction Neighborhood Estimation in Random Fields

We consider the problem of interaction neighborhood estimation from the partial observation of a finite number of realizations of a random field. We introduce a model selection rule to choose estimators of conditional probabilities among natural candidates. Our main result is an oracle inequality satisfied by the resulting estimator. We then use this selection rule in a two-step procedure to evaluate the interacting neighborhoods. The selection rule selects a small prior set of possible interacting points, and a cutting step removes the irrelevant points from this prior set. We also prove that the Ising models satisfy the assumptions of the main theorems, without restrictions on the temperature, on the structure of the interacting graph or on the range of the interactions. This therefore provides a large class of applications for our results. We give a computationally efficient procedure in these models. We finally show the practical efficiency of our approach in a simulation study.

Preprint, 2010, arXiv

Optimal model selection in density estimation

We build penalized least-squares estimators using the slope heuristic and resampling penalties. We prove oracle inequalities for the selected estimator with leading constant asymptotically equal to 1. We compare the practical performance of these methods in a short simulation study.