Source author record

Mette Langaas

Mette Langaas appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Applications, Methodology

Catalog footprint

What is connected

4 works

2 topics

4 close collaborators

Preprint, 2020, arXiv

Powerful extreme phenotype sampling designs and score tests for genetic association studies

We consider cross-sectional genetic association studies (common and rare variants) where non-genetic information is available, or feasible to obtain, for $N$ individuals, but where it is infeasible to genotype all $N$ individuals. We consider continuously measurable Gaussian traits (phenotypes). Genotyping $n<N$ extreme phenotype individuals can yield better power to detect phenotype-genotype associations than randomly selecting $n$ individuals. We define a person as having an extreme phenotype if the observed phenotype is above a specified upper threshold or below a specified lower threshold. We consider a model where these thresholds can be tailored to each individual. The classical extreme sampling design is to set equal thresholds for all individuals. We introduce a design ($z$-extreme sampling) where personalized thresholds are defined based on the residuals of a regression model including only non-genetic (fully available) information. We derive score tests for the situation where only $n$ extremes are analyzed (complete case analysis), and for the situation where the non-genetic information on $N-n$ non-extremes is included in the analysis (all case analysis). For the classical design, all case analysis is generally more powerful than complete case analysis. For the $z$-extreme sample, we show that all case and complete case tests are equally powerful. Simulations and data analysis also show that $z$-extreme sampling is at least as powerful as the classical extreme sampling design, and the classical design is shown to be at times less powerful than random sampling. The method of dichotomizing extreme phenotypes is also discussed.
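The difference between the two sampling designs in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of ordinary least squares, and the choice of equal-sized lower and upper tails are all assumptions made for illustration.

```python
import numpy as np

def classical_extremes(y, n):
    """Select n individuals with the most extreme phenotypes.

    Assumption: n // 2 individuals come from the lower tail and the rest
    from the upper tail, which amounts to one common pair of thresholds.
    """
    order = np.argsort(y)
    n_low = n // 2
    n_high = n - n_low
    return np.concatenate([order[:n_low], order[len(y) - n_high:]])

def z_extremes(y, X, n):
    """Select n individuals with the most extreme regression residuals.

    The phenotype y is regressed on the non-genetic covariates X only.
    Thresholding the residual is the same as thresholding y at a cutoff
    that shifts with each individual's fitted value.
    """
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return classical_extremes(residuals, n)

# Hypothetical usage on simulated data (all numbers are made up).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)                 # non-genetic covariate
y = 1.0 + 2.0 * x + rng.normal(size=1000) # Gaussian phenotype
to_genotype_classical = classical_extremes(y, 200)
to_genotype_z = z_extremes(y, x, 200)
```

With a strong covariate effect, the classical selection is driven largely by the covariate, whereas the residual-based selection is not; this is only an intuition for the design, not a reproduction of the power results reported in the abstract.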

Preprint, 2016, arXiv

Is the familywise error rate in genomics controlled by methods based on the effective number of independent tests?

In genome-wide association (GWA) studies the goal is to detect association between one or more genetic markers and a given phenotype. The number of genetic markers in a GWA study can be on the order of hundreds of thousands, and therefore multiple testing methods are needed. This paper presents a set of popular methods used to correct for multiple testing in GWA studies. All are based on the concept of estimating an effective number of independent tests. We compare these methods using simulated data and data from the TOP study, and show that the effective number of independent tests is not additive over blocks of independent genetic markers unless we assume a common value for the local significance level. We also show that the reviewed methods based on estimating the effective number of independent tests in general do not control the familywise error rate.
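To make the idea of an effective number of independent tests concrete, here is a minimal sketch. It uses Nyholt's eigenvalue-based estimator and a Šidák-type local significance level; whether this particular estimator is among the methods reviewed in the paper is an assumption, since the abstract does not name them, and the function names are hypothetical.

```python
import numpy as np

def nyholt_effective_tests(corr):
    """Nyholt-style estimate of the effective number of independent tests.

    corr is the M x M correlation matrix of the markers (M >= 2). The
    estimate is 1 + (M - 1) * (1 - Var(eigenvalues) / M), using the
    sample variance of the eigenvalues.
    """
    corr = np.asarray(corr, dtype=float)
    m = corr.shape[0]
    eigenvalues = np.linalg.eigvalsh(corr)
    return 1.0 + (m - 1.0) * (1.0 - np.var(eigenvalues, ddof=1) / m)

def sidak_local_level(alpha, m_eff):
    """Local significance level for m_eff independent tests at FWER alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)

# Hypothetical usage: three markers, two of them strongly correlated.
corr = np.array([[1.0, 0.9, 0.0],
                 [0.9, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
m_eff = nyholt_effective_tests(corr)
local_alpha = sidak_local_level(0.05, m_eff)
```

Uncorrelated markers give an estimate equal to the number of markers, and perfectly correlated markers give an estimate of one. As the abstract notes, corrections of this kind do not in general control the familywise error rate, so the sketch shows the concept rather than a recommended procedure.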

Preprint, 2013, arXiv

Exact conditional p-values from arbitrary ranking of a sample space: An application to genome-wide association studies

We introduce a method for computation of exact conditional efficiency robust enumeration p-values for detection of genotype-phenotype associations at a single bi-allelic genetic locus. Our method can be based on any ranking test statistic, such as efficiency robust test statistics or asymptotic p-values. The resulting p-values are exact conditional enumeration p-values and satisfy the basic statistical validity property. Practically, the method allows performing statistically valid significance testing in genomic analyses with unknown modes of inheritance at individual bi-allelic genetic loci, the situation typical in genome-wide association studies. We provide open-source R code implementing the method.

Preprint, 2013, arXiv

Robust Methods for Disease-Genotype Association in Genetic Association Studies: Calculate P-values Using Exact Conditional Enumeration instead of Asymptotic Approximations

In genetic association studies, detecting disease-genotype associations is a primary goal. For most diseases, the underlying genetic model is unknown, and we study seven robust test statistics for monotone association. For a given test statistic, there are many ways to calculate a p-value, but in genetic association studies, calculations have predominantly been based on asymptotic approximations or on simulated permutations. We show that when the number of permutations tends to infinity, the permutation p-value approaches the exact conditional enumeration p-value, and further that calculating the latter p-value is much more efficient than performing simulated permutations. We then answer two research questions. (i) Which of the test statistics under study are the most powerful for monotone genetic models? (ii) Based on test size, power, and computational considerations, should asymptotic approximations or exact conditional enumeration be used for calculating p-values? We have studied case-control sample sizes with 500-5000 cases and 500-15000 controls, and significance levels from 5e-8 to 0.05, so our results are applicable to genetic association studies with only one genetic marker under study, intermediate follow-up studies, and genome-wide association studies. We find that if all monotone genetic models are of interest, the best performance is achieved for a test statistic based on the maximum over a range of Cochran-Armitage trend tests with different scores and for a constrained likelihood ratio test. For significance levels below 0.05, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. Further, calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.
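As a rough illustration of what an exact conditional enumeration p-value looks like for a 2 x 3 case-control genotype table, consider the following sketch. It is not the authors' code: it assumes conditioning on both the case-control totals and the genotype totals (so the null distribution is multivariate hypergeometric), and it ranks tables by the absolute numerator of a trend statistic with scores (0, 1, 2). The function name is hypothetical.

```python
from math import comb

def exact_conditional_pvalue(cases, controls, scores=(0, 1, 2)):
    """Exact conditional p-value for a 2 x 3 case-control genotype table.

    cases and controls are counts per genotype (e.g. aa, aA, AA). All
    tables with the observed margins are enumerated, and the null
    probabilities of those at least as extreme as the observed table
    are summed.
    """
    totals = [a + b for a, b in zip(cases, controls)]
    n_cases, n_all = sum(cases), sum(totals)
    # Expected score sum among cases under the null, given the margins.
    expected = n_cases * sum(s * m for s, m in zip(scores, totals)) / n_all

    def stat(x):
        # Absolute trend numerator; its null variance is fixed by the
        # margins, so it ranks tables like the standardized statistic.
        return abs(sum(s * xi for s, xi in zip(scores, x)) - expected)

    observed = stat(cases)
    denom = comb(n_all, n_cases)
    p_value = 0.0
    for x0 in range(min(totals[0], n_cases) + 1):
        for x1 in range(min(totals[1], n_cases - x0) + 1):
            x2 = n_cases - x0 - x1
            if x2 > totals[2]:
                continue  # table not compatible with the genotype totals
            if stat((x0, x1, x2)) >= observed - 1e-12:
                weight = comb(totals[0], x0) * comb(totals[1], x1) * comb(totals[2], x2)
                p_value += weight / denom
    return p_value

# Hypothetical usage with made-up counts.
p = exact_conditional_pvalue(cases=(10, 20, 15), controls=(25, 18, 7))
```

Any other ranking of the tables could be substituted for `stat`; the enumeration and the hypergeometric weights stay the same. The double loop visits a number of tables that grows roughly quadratically with the number of cases.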
