Researcher profile

Alexandre B. Tsybakov

Alexandre B. Tsybakov contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2021arXiv

Variable selection, monotone likelihood ratio and group sparsity

In the pivotal variable selection problem, we derive the exact non-asymptotic minimax selector over the class of all $s$-sparse vectors, which is also the Bayes selector with respect to the uniform prior. While this optimal selector is, in general, not realizable in polynomial time, we show that its tractable counterpart (the scan selector) attains the minimax expected Hamming risk to within factor 2, and is also exact minimax with respect to the probability of wrong recovery. As a consequence, we establish explicit lower bounds under the monotone likelihood ratio property and we obtain a tight characterization of the minimax risk in terms of the best separable selector risk. We apply these general results to derive necessary and sufficient conditions of exact and almost full recovery in the location model with light tail distributions and in the problem of group variable selection under Gaussian noise.

preprint2020arXiv

Adaptive robust estimation in sparse vector model

For the sparse vector model, we consider estimation of the target vector, of its L2-norm and of the noise variance. We construct adaptive estimators and establish the optimal rates of adaptive estimation when adaptation is considered with respect to the triplet "noise level - noise distribution - sparsity". We consider classes of noise distributions with polynomially and exponentially decreasing tails as well as the case of Gaussian noise. The obtained rates turn out to be different from the minimax non-adaptive rates when the triplet is known. A crucial issue is the ignorance of the noise variance. Moreover, knowing or not knowing the noise distribution can also influence the rate. For example, the rates of estimation of the noise variance can differ depending on whether the noise is Gaussian or sub-Gaussian without a precise knowledge of the distribution. Estimation of noise variance in our setting can be viewed as an adaptive variant of robust estimation of scale in the contamination model, where instead of fixing the "nominal" distribution in advance, we assume that it belongs to some class of distributions.

preprint2011arXiv

Estimation of high-dimensional low-rank matrices

Suppose that we observe entries or, more generally, linear combinations of entries of an unknown $m\times T$-matrix $A$ corrupted by noise. We are particularly interested in the high-dimensional setting where the number $mT$ of unknown entries can be much larger than the sample size $N$. Motivated by several applications, we consider estimation of matrix $A$ under the assumption that it has small rank. This can be viewed as dimension reduction or sparsity assumption. In order to shrink toward a low-rank representation, we investigate penalized least squares estimators with a Schatten-$p$ quasi-norm penalty term, $p\leq1$. We study these estimators under two possible assumptions---a modified version of the restricted isometry condition and a uniform bound on the ratio &#34;empirical norm induced by the sampling operator/Frobenius norm.&#34; The main results are stated as nonasymptotic upper bounds on the prediction risk and on the Schatten-$q$ risk of the estimators, where $q\in[p,2]$. The rates that we obtain for the prediction risk are of the form $rm/N$ (for $m=T$), up to logarithmic factors, where $r$ is the rank of $A$. The particular examples of multi-task learning and matrix completion are worked out in detail. The proofs are based on tools from the theory of empirical processes. As a by-product, we derive bounds for the $k$th entropy numbers of the quasi-convex Schatten class embeddings $S_p^M\hookrightarrow S_2^M$, $p<1$, which are of independent interest.

preprint2011arXiv

Fast learning rates for plug-in classifiers under the margin condition

It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, i.e., the rates faster than $n^{-1/2}$. The works on this subject suggested the following two conjectures: (i) the best achievable fast rate is of the order $n^{-1}$, and (ii) the plug-in classifiers generally converge slower than the classifiers based on empirical risk minimization. We show that both conjectures are not correct. In particular, we construct plug-in classifiers that can achieve not only the fast, but also the {\it super-fast} rates, i.e., the rates faster than $n^{-1}$. We establish minimax lower bounds showing that the obtained rates cannot be improved.

preprint2010arXiv

Detection boundary in sparse regression

We study the problem of detection of a p-dimensional sparse vector of parameters in the linear regression model with Gaussian noise. We establish the detection boundary, i.e., the necessary and sufficient conditions for the possibility of successful detection as both the sample size n and the dimension p tend to the infinity. Testing procedures that achieve this boundary are also exhibited. Our results encompass the high-dimensional setting (p>> n). The main message is that, under some conditions, the detection boundary phenomenon that has been proved for the Gaussian sequence model, extends to high-dimensional linear regression. Finally, we establish the detection boundaries when the variance of the noise is unknown. Interestingly, the detection boundaries sometimes depend on the knowledge of the variance in a high-dimensional setting.

preprint2010arXiv

SPADES and mixture models

This paper studies sparse density estimation via $\ell_1$ penalization (SPADES). We focus on estimation in high-dimensional mixture models and nonparametric adaptive density estimation. We show, respectively, that SPADES can recover, with high probability, the unknown components of a mixture of probability densities and that it yields minimax adaptive density estimates. These results are based on a general sparsity oracle inequality that the SPADES estimates satisfy. We offer a data driven method for the choice of the tuning parameter used in the construction of SPADES. The method uses the generalized bisection method first introduced in \citebb09. The suggested procedure bypasses the need for a grid search and offers substantial computational savings. We complement our theoretical results with a simulation study that employs this method for approximations of one and two-dimensional densities with mixtures. The numerical results strongly support our theoretical findings.

preprint2010arXiv

Sparse recovery under matrix uncertainty

We consider the model {eqnarray*}y=Xθ^*+ξ, Z=X+Ξ,{eqnarray*} where the random vector $y\in\mathbb{R}^n$ and the random $n\times p$ matrix $Z$ are observed, the $n\times p$ matrix $X$ is unknown, $Ξ$ is an $n\times p$ random noise matrix, $ξ\in\mathbb{R}^n$ is a noise independent of $Ξ$, and $θ^*$ is a vector of unknown parameters to be estimated. The matrix uncertainty is in the fact that $X$ is observed with additive error. For dimensions $p$ that can be much larger than the sample size $n$, we consider the estimation of sparse vectors $θ^*$. Under matrix uncertainty, the Lasso and Dantzig selector turn out to be extremely unstable in recovering the sparsity pattern (i.e., of the set of nonzero components of $θ^*$), even if the noise level is very small. We suggest new estimators called matrix uncertainty selectors (or, shortly, the MU-selectors) which are close to $θ^*$ in different norms and in the prediction risk if the restricted eigenvalue assumption on $X$ is satisfied. We also show that under somewhat stronger assumptions, these estimators recover correctly the sparsity pattern.