Researcher profile

Etienne Roquain

Etienne Roquain contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified; Verification L1; Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

preprint, 2023, arXiv

Online multiple testing with super-uniformity reward

Valid online inference is an important problem in contemporary multiple testing research, to which various solutions have been proposed recently. It is well known that these existing methods can suffer from a significant loss of power if the null $p$-values are conservative. In this work, we extend the previously introduced methodology to obtain more powerful procedures for the case of super-uniformly distributed $p$-values. These types of $p$-values arise in important settings, e.g. when discrete hypothesis tests are performed or when the $p$-values are weighted. To this end, we introduce the method of super-uniformity reward (SUR) that incorporates information about the individual null cumulative distribution functions. Our approach yields several new 'rewarded' procedures that offer uniform power improvements over known procedures and come with mathematical guarantees for controlling online error criteria based either on the family-wise error rate (FWER) or the marginal false discovery rate (mFDR). We illustrate the benefit of super-uniform rewarding in real-data analyses and simulation studies. While discrete tests serve as our leading example, we also show how our method can be applied to weighted $p$-values.
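To make the notion of a super-uniform $p$-value concrete, the sketch below computes the null cumulative distribution function $F(u) = P(p \le u)$ of a one-sided binomial test and checks that $F(u) \le u$. It is only an illustration of the slack $u - F(u)$ that discrete tests leave unused; it does not implement the SUR procedures themselves, and the function names and the choice of a binomial test with $n = 5$ are assumptions made for this example.

```python
from math import comb


def binom_upper_pvalue(k: int, n: int, p0: float = 0.5) -> float:
    """One-sided p-value P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, j) * p0**j * (1 - p0) ** (n - j) for j in range(k, n + 1))


def null_cdf(u: float, n: int, p0: float = 0.5) -> float:
    """Null CDF F(u) = P(p-value <= u) of the discrete p-value above."""
    total = 0.0
    for k in range(n + 1):
        prob_k = comb(n, k) * p0**k * (1 - p0) ** (n - k)
        # Outcome k contributes its null probability if its p-value is <= u.
        if binom_upper_pvalue(k, n, p0) <= u:
            total += prob_k
    return total


if __name__ == "__main__":
    n = 5  # hypothetical sample size, chosen only for illustration
    for u in (0.01, 0.05, 0.2, 0.5):
        f_u = null_cdf(u, n)
        print(f"u={u:.2f}  F(u)={f_u:.5f}  slack={u - f_u:.5f}")
```

With $n = 5$ and $p_0 = 0.5$, the smallest attainable $p$-value is $1/32$, so a test at level $0.05$ has actual null rejection probability $0.03125$. The difference is the kind of conservativeness that, according to the abstract, the rewarded procedures aim to exploit.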

preprint, 2022, arXiv

Empirical Bayes cumulative $\ell$-value multiple testing procedure for sparse sequences

In the sparse sequence model, we consider a popular Bayesian multiple testing procedure and investigate for the first time its behaviour from the frequentist point of view. Given a spike-and-slab prior on the high-dimensional sparse unknown parameter, one can easily compute posterior probabilities of coming from the spike, which correspond to the well-known local-fdr values, also called $\ell$-values. The spike-and-slab weight parameter is calibrated in an empirical Bayes fashion, using marginal maximum likelihood. The multiple testing procedure under study, called here the cumulative $\ell$-value procedure, ranks coordinates according to their empirical $\ell$-values and thresholds so that the cumulative ranked sum does not exceed a user-specified level $t$. We validate the use of this method from the multiple testing perspective: for alternatives of appropriately large signal strength, the false discovery rate (FDR) of the procedure is shown to converge to the target level $t$, while its false negative rate (FNR) goes to $0$. We complement this study by providing convergence rates for the method. Additionally, we prove that the $q$-value multiple testing procedure shares similar convergence rates in this model.
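The ranking-and-thresholding step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it takes the $\ell$-values as given inputs (the empirical Bayes calibration is not shown), and it assumes the cumulative sum is normalized by the number of rejections, i.e. the running average of the smallest $\ell$-values is compared with $t$, as is common for local-fdr based procedures. The abstract itself only says "cumulative ranked sum", so that normalization and the function name are assumptions.

```python
from typing import List, Sequence


def cumulative_l_value_rejections(l_values: Sequence[float], t: float) -> List[int]:
    """Return indices rejected by a cumulative l-value rule at level t.

    Assumption: reject the k smallest l-values, with k the largest integer
    such that the average of the k smallest l-values is at most t.
    """
    # Rank coordinates from smallest to largest l-value.
    order = sorted(range(len(l_values)), key=lambda i: l_values[i])
    running_sum = 0.0
    n_reject = 0
    for k, idx in enumerate(order, start=1):
        running_sum += l_values[idx]
        if running_sum / k <= t:
            n_reject = k
    return sorted(order[:n_reject])


if __name__ == "__main__":
    # Hypothetical l-values, chosen only for illustration.
    example = [0.9, 0.01, 0.3, 0.05, 0.6]
    print(cumulative_l_value_rejections(example, t=0.1))
```

In the example, the running averages of the sorted values are 0.01, 0.03 and 0.12, so the rule stops after two rejections at level $t = 0.1$.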

preprint, 2010, arXiv

On the false discovery proportion convergence under Gaussian equi-correlation

We study the convergence of the false discovery proportion (FDP) of the Benjamini-Hochberg procedure in the Gaussian equi-correlated model, when the correlation $\rho_m$ converges to zero as the number of hypotheses $m$ grows to infinity. In contrast to the standard convergence rate $m^{1/2}$ holding under independence, this study shows that the FDP converges to the false discovery rate (FDR) at rate $\{\min(m,1/\rho_m)\}^{1/2}$ in this equi-correlated model.
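For readers unfamiliar with the quantities in this abstract, the sketch below implements the standard Benjamini-Hochberg step-up rule and computes the false discovery proportion of its rejection set. It does not simulate the Gaussian equi-correlated model or reproduce the convergence result; the function names and the example $p$-values are assumptions made for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float) -> List[int]:
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for k, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * k / m:
            k_max = k
    return sorted(order[:k_max])


def false_discovery_proportion(rejected: Sequence[int], is_null: Sequence[bool]) -> float:
    """Fraction of rejected hypotheses that are true nulls (0 if none rejected)."""
    if not rejected:
        return 0.0
    return sum(1 for i in rejected if is_null[i]) / len(rejected)


if __name__ == "__main__":
    # Hypothetical p-values and null indicators, for illustration only.
    p = [0.001, 0.015, 0.025, 0.5, 0.035]
    nulls = [False, False, True, True, False]
    rej = benjamini_hochberg(p, alpha=0.05)
    print(rej, false_discovery_proportion(rej, nulls))
```

The FDP is a random quantity that depends on the realized $p$-values, while the FDR is its expectation; the abstract concerns how fast the former concentrates around the latter.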

preprint, 2010, arXiv

Some nonasymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests

We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is supposed to be either Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can possibly be much larger than the number of observations and we focus on a nonasymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches, the first based on a concentration principle (valid for a large class of resampling weights) and the second on a resampled quantile, specifically using Rademacher weights. Several intermediate results established in the approach based on concentration principles are of interest in their own right. We also discuss the question of accuracy when using Monte Carlo approximations of the resampled quantities.

preprint, 2010, arXiv

Spatial clustering of array CGH features in combination with hierarchical multiple testing

We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing (joining contiguous DNA clones or probes with extremely similar data into regions) from clustering (joining contiguous, correlated regions based on a maximum likelihood principle). The model-based clustering algorithm accounts for the apparent spatial patterns in the data. We evaluate the randomness of the clustering result by a cluster stability score in combination with cross-validation. Moreover, we argue that the clustering really captures spatial genomic dependency by showing that coincidental clustering of independent regions is very unlikely. Using the region and cluster information, we combine testing of these for association with a clinical variable in a hierarchical multiple testing approach. This allows for interpreting the significance of both regions and clusters while controlling the Family-Wise Error Rate simultaneously. We prove that in the context of permutation tests and permutation-invariant clusters it is valid to perform clustering and testing on the same data set. Our procedures are illustrated on two cancer data sets.