Researcher profile

Victor J. Yohai

Victor J. Yohai contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
2 topics
4 close collaborators

Published work

3 published items

Preprint, 2014, arXiv

Composite Robust Estimators for Linear Mixed Models

The Classical Tukey-Huber Contamination Model (CCM) is the usual framework for describing the mechanism of outlier generation in robust statistics. In a data set with $n$ observations and $p$ variables, under the CCM, an outlier is a unit, even if only one or a few of its values are corrupted. Classical robust procedures were designed to cope with this setting, and the impact of observations was limited whenever necessary. Recently, a different mechanism of outlier generation, namely the Independent Contamination Model (ICM), was introduced. In this new setting each cell of the data matrix might be corrupted or not, with a probability independent of the status of the other cells. The ICM poses a new challenge to robust statistics, since the percentage of contaminated rows increases dramatically with $p$, often reaching more than $50\%$. When this situation appears, classical affine equivariant robust procedures do not work, since their breakdown point is $50\%$. For this contamination model we propose a new type of robust method, namely composite robust procedures, which are inspired by the idea of composite likelihood, where low-dimensional likelihoods, very often the likelihoods of pairs, are aggregated in order to obtain a more tractable approximation of the full likelihood. Our composite robust procedures are built on pairs of observations in order to gain robustness in the independent contamination model. We propose composite S- and $\tau$-estimators for linear mixed models. Composite $\tau$-estimators are proved to have a high breakdown point in both the CCM and the ICM. A Monte Carlo study shows that our estimators compare favorably with classical S-estimators under the CCM and outperform them under the ICM. One example based on a real data set illustrates the new robust procedure.
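
The claim that the share of contaminated rows grows quickly with $p$ under the ICM can be illustrated with a short calculation. The sketch below is not taken from the paper; it assumes every cell is corrupted independently with the same probability `eps`, so a row of `p` cells contains at least one corrupted cell with probability 1 - (1 - eps)^p. The function names are hypothetical and chosen only for this illustration.

```python
def contaminated_row_fraction(eps: float, p: int) -> float:
    """Probability that a row of p independent cells has at least one
    corrupted cell, when each cell is corrupted with probability eps.
    (Assumption: identical, independent per-cell contamination.)"""
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    if p < 1:
        raise ValueError("p must be a positive integer")
    return 1.0 - (1.0 - eps) ** p


def min_dimension_exceeding(eps: float, threshold: float = 0.5) -> int:
    """Smallest p for which the expected fraction of contaminated rows
    is strictly greater than threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    p = 1
    while contaminated_row_fraction(eps, p) <= threshold:
        p += 1
    return p


if __name__ == "__main__":
    eps = 0.05  # illustrative per-cell contamination rate
    for p in (1, 5, 10, 20, 50):
        print(p, round(contaminated_row_fraction(eps, p), 3))
    print("rows exceed 50% contaminated from p =", min_dimension_exceeding(eps))
```

Under these assumptions, even a modest per-cell rate pushes the majority of rows past the $50\%$ mark at moderate dimension, which is the situation in which casewise procedures with a $50\%$ breakdown point stop being informative.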

Preprint, 2014, arXiv

Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination

Multivariate location and scatter matrix estimation is a cornerstone of multivariate data analysis. We consider this problem when the data may contain independent cellwise and casewise outliers. Flat data sets with a large number of variables and a relatively small number of cases are commonplace in modern statistical applications. In these cases, global down-weighting of an entire case, as performed by traditional robust procedures, may lead to poor results. We highlight the need for a new generation of robust estimators that can deal efficiently with cellwise outliers and at the same time perform well under casewise outliers.

Preprint, 2010, arXiv

Robust location estimation with missing data

In a missing-data setting, we have a sample in which a vector of explanatory variables x_i is observed for every subject i, while the scalar outcomes y_i are missing by happenstance for some individuals. In this work we propose robust estimates of the distribution of the responses, assuming missing at random (MAR) data, under a semiparametric regression model. Our approach allows the consistent estimation of any weakly continuous functional of the response's distribution. In particular, strongly consistent estimates of any continuous location functional, such as the median or MM functionals, are proposed. A robust fit for the regression model, combined with the robust properties of the location functional, gives rise to a robust recipe for estimating the location parameter. Robustness is quantified through the breakdown point of the proposed procedure. The asymptotic distribution of the location estimates is also derived.