Researcher profile

Stephan Morgenthaler

Stephan Morgenthaler contributes to research discovery and scholarly infrastructure.

Published work

7 published items

Preprint, 2020, arXiv

Statistical applications of random matrix theory: comparison of two populations I

This paper investigates a statistical procedure for testing the equality of two independent estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significance. To avoid false rejections we must guard against residual spikes and need a sufficiently precise description of the behaviour of the largest eigenvalues under the null hypothesis. In this paper, we lay a foundation by treating alternatives based on perturbations of order $1$, that is, a single large eigenvalue. Our statistic allows the user to test the equality of two populations. Future work will extend the result to perturbations of order $k$ and demonstrate conservativeness of the procedure for more general matrices.

Preprint, 2020, arXiv

Statistical applications of Random matrix theory: comparison of two populations II

This paper investigates a statistical procedure for testing the equality of two independent estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significance. To avoid false rejections we must guard against residual spikes and need a sufficiently precise description of the behaviour of the largest eigenvalues under the null hypothesis. In this paper we propose some "invariant" theorems that allow us to extend the test of arXiv:2002.12741 for perturbations of order $1$ to some general tests for order $k$. The statistics introduced in this paper allow the user to test the equality of two populations based on high-dimensional multivariate data. Simulations show that these tests have more power of detection than standard multivariate approaches.

Preprint, 2020, arXiv

Statistical applications of Random matrix theory: comparison of two populations III

This paper investigates a statistical procedure for testing the equality of two independently estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significant differences. To avoid false rejections we must guard against residual spikes and need a sufficiently precise description of the properties of the largest eigenvalues under the null hypothesis. In this paper, we extend arXiv:2002.12741 for perturbations of order $1$ and arXiv:2002.12703 by studying a simpler statistic. The residual spike introduced in the first paper is investigated and leads to a statistic that results in a good test of equality of two populations. Simulations show that this new test does not rely on some hypotheses that were necessary for the proofs and in the second paper.

Preprint, 2013, arXiv

Identifying Graphical Models

The ability to identify reliably a positive or negative partial correlation between the expression levels of two genes is influenced by the number $p$ of genes, the number $n$ of analyzed samples, and the statistical properties of the measurements. Classical statistical theory teaches that the product of the root sample size multiplied by the size of the partial correlation is the crucial quantity. But this has to be combined with some adjustment for multiplicity depending on $p$, which makes the classical analysis somewhat arbitrary. We investigate this problem through the lens of the Kullback-Leibler divergence, which is a measure of the average information for detecting an effect. We conclude that commonly sized studies in genetical epidemiology are not able to reliably detect moderately strong links.

Preprint, 2013, arXiv

Optimality in multiple comparison procedures

When many (m) null hypotheses are tested with a single dataset, the control of the number of false rejections is often the principal consideration. Two popular controlling rates are the probability of making at least one false discovery (FWER) and the expected fraction of false discoveries among all rejections (FDR). Scaled multiple comparison error rates form a new family that bridges the gap between these two extremes. For example, the Scaled Expected Value (SEV) limits the number of false positives relative to an arbitrary increasing function of the number of rejections, that is, E(FP/s(R)). We discuss the problem of how to choose in practice which procedure to use, with elements of an optimality theory, by considering the number of false rejections FP separately from the number of correct rejections TP. Using this framework we will show how to choose an element in the new family mentioned above.
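The two extremes named in this abstract can be illustrated with their best-known procedures. The following is a minimal sketch, not the method of the paper: it uses the Bonferroni correction as a representative FWER-controlling procedure and the Benjamini-Hochberg step-up procedure as a representative FDR-controlling one (valid under independence). The p-values are made up for illustration, and the function names are hypothetical.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m; this controls the FWER at level alpha."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up procedure; controls the FDR at level alpha for independent tests."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Illustrative p-values (an assumption, not data from the paper).
pvalues = [0.001, 0.008, 0.012, 0.03, 0.2, 0.6]

fwer_rejections = bonferroni(pvalues)
fdr_rejections = benjamini_hochberg(pvalues)
print("Bonferroni rejections:", sum(fwer_rejections))
print("Benjamini-Hochberg rejections:", sum(fdr_rejections))
```

On the same p-values, the FDR-controlling procedure rejects at least as many hypotheses as Bonferroni, which is the gap that a family of intermediate error rates is meant to bridge.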

Preprint, 2013, arXiv

Two step multiple comparisons procedures for positively dependent data with application to detecting differences in human brain network topologies

We consider the problem of testing positively dependent multiple hypotheses, assuming that prior information about the dependence structure is available. We propose two-step multiple comparisons procedures that exploit this prior information without relying on strong assumptions. In the first step, we group the tests into subsets where tests are supposed to be positively dependent and in each of which we compute the standardized mean of the test scores. Given the subset mean scores, or equivalently the subset p-values, we apply a first screening at a predefined threshold, which results in two types of subsets. Based on this typing, the original single-test p-values are modified such that they can be used in conjunction with any multiple comparison procedure. We show by means of different simulations that power is gained with the proposed two-step methods, and compare them with traditional multiple comparison procedures. As an illustration, our method is applied to real data comparing topological differences between two groups of human brain networks.

Preprint, 2010, arXiv

Efficient statistical analysis of large correlated multivariate datasets: a case study on brain connectivity matrices

In neuroimaging, a large number of correlated tests are routinely performed to detect active voxels in single-subject experiments or to detect regions that differ between individuals belonging to different groups. In order to bound the probability of a false discovery of pair-wise differences, a Bonferroni or other correction for multiplicity is necessary. These corrections greatly reduce the power of the comparisons, which means that small signals (differences) remain hidden; they have therefore been more or less successful depending on the application. We introduce a method that improves the power of a family of correlated statistical tests by reducing their number in an orderly fashion using our a-priori understanding of the problem. The tests are grouped by blocks that respect the data structure and only one or a few tests per group are performed. For each block we construct an appropriate summary statistic that characterizes a meaningful feature of the block. The comparisons are based on these summary statistics by a block-wise approach. We contrast this method with the one based on the individual measures in terms of power. Finally, we apply the method to compare brain connectivity matrices. Although the method is used in this study on the particular case of imaging, the proposed strategy can be applied to a large variety of problems that involve multiple comparisons when the tests can be grouped according to attributes that depend on the specific problem.

Keywords and phrases: Multiple comparisons; Family-wise error rate; False discovery rate; Bonferroni procedure; Human brain connectivity; Brain connectivity matrices.
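The block-wise idea in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the summary statistic is taken to be the block mean, the two groups are compared with a Welch-type statistic using a normal approximation, and the correction is Bonferroni over the number of blocks. The data are synthetic, and all function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def block_means(subject, blocks):
    """Summarize one subject's measurements by the mean of each block."""
    return [mean(subject[i] for i in block) for block in blocks]


def two_sample_pvalue(x, y):
    """Two-sided p-value for a difference in means.

    Uses a Welch-type statistic with a normal approximation (an assumption
    made here to stay within the standard library).
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def blockwise_test(group_a, group_b, blocks, alpha=0.05):
    """Run one test per block and apply Bonferroni over the blocks."""
    a = [block_means(s, blocks) for s in group_a]
    b = [block_means(s, blocks) for s in group_b]
    pvals = [
        two_sample_pvalue([s[j] for s in a], [s[j] for s in b])
        for j in range(len(blocks))
    ]
    # The threshold is alpha / (number of blocks), not alpha / (number of measurements).
    rejected = [p <= alpha / len(blocks) for p in pvals]
    return pvals, rejected


# Synthetic data: 20 subjects per group, 30 measurements in 3 blocks of 10.
rng = random.Random(0)
blocks = [range(0, 10), range(10, 20), range(20, 30)]
group_a = [[rng.gauss(0.0, 1.0) for _ in range(30)] for _ in range(20)]
# Only the first block differs between the groups (mean shift of 0.8).
group_b = [
    [rng.gauss(0.8 if i < 10 else 0.0, 1.0) for i in range(30)]
    for _ in range(20)
]

pvals, rejected = blockwise_test(group_a, group_b, blocks)
for j, (p, r) in enumerate(zip(pvals, rejected)):
    print(f"block {j}: p = {p:.4g}, rejected = {r}")
```

With 30 measurements grouped into 3 blocks, the Bonferroni threshold becomes alpha / 3 instead of alpha / 30, and averaging within a block reduces noise when the block shares a common signal. Both effects are how reducing the number of tests can raise power.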