Source author record

Fred A. Wright

Fred A. Wright appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

math.ST, Methodology, Statistics Theory, Applications, Computation, Quantitative Methods

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

Preprint, 2020, arXiv

Fast Multivariate Probit Estimation via a Two-Stage Composite Likelihood

The multivariate probit is popular for modeling correlated binary data, with an attractive balance of flexibility and simplicity. However, considerable challenges remain in computation and in devising a clear statistical framework. Interest in the multivariate probit has increased in recent years. Current applications include genomics and precision medicine, where simultaneous modeling of multiple traits may be of interest, and computational efficiency is an important consideration. We propose a fast method for multivariate probit estimation via a two-stage composite likelihood. We explore computational and statistical efficiency, and note that the approach sets the stage for extensions beyond the purely binary setting.

Preprint, 2014, arXiv

A procedure to detect general association based on concentration of ranks

In modern high-throughput applications, it is important to identify pairwise associations between variables, and desirable to use methods that are powerful and sensitive to a variety of association relationships. We describe RankCover, a new nonparametric test for association between two variables that measures the concentration of paired ranked points. Here "concentration" is quantified using a disk-covering statistic that is similar to those employed in spatial data analysis. Analysis of simulated datasets demonstrates that the method is robust and often powerful in comparison to competing general association tests. We illustrate RankCover in the analysis of several real datasets.

Preprint, 2014, arXiv

Consistent Testing for Recurrent Genomic Aberrations

Genomic aberrations, such as somatic copy number alterations, are frequently observed in tumor tissue. Recurrent aberrations, occurring in the same region across multiple subjects, are of interest because they may highlight genes associated with tumor development or progression. A number of tools have been proposed to assess the statistical significance of recurrent DNA copy number aberrations, but their statistical properties have not been carefully studied. Cyclic shift testing, a permutation procedure using independent random shifts of genomic marker observations on the genome, has been proposed to identify recurrent aberrations, and is potentially useful for a wider variety of purposes, including identifying regions with methylation aberrations or overrepresented in disease association studies. For data following a countable-state Markov model, we prove the asymptotic validity of cyclic shift $p$-values under a fixed sample size regime as the number of observed markers tends to infinity. We illustrate cyclic shift testing for a variety of data types, producing biologically relevant findings for three publicly available datasets.
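The cyclic shift idea described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `cyclic_shift_pvalues`, the use of the per-marker sum across subjects as the recurrence statistic, and the comparison against the null maximum over markers are all assumptions made for illustration. Each subject's marker vector is rotated by an independent random offset, which preserves within-subject structure while breaking alignment across subjects.

```python
import random

def cyclic_shift_pvalues(data, n_shifts=200, seed=0):
    """Hypothetical sketch of cyclic shift testing.

    data: list of per-subject lists of marker values, all of length m.
    Returns one p-value per marker, comparing the observed column sum
    with the null distribution of the maximum column sum.
    """
    rng = random.Random(seed)
    m = len(data[0])
    # Observed recurrence statistic: sum over subjects at each marker.
    observed = [sum(row[j] for row in data) for j in range(m)]
    null_max = []
    for _ in range(n_shifts):
        sums = [0.0] * m
        for row in data:
            k = rng.randrange(m)  # independent random shift per subject
            for j in range(m):
                sums[j] += row[(j + k) % m]  # wrap around the genome
        null_max.append(max(sums))
    # Add-one correction keeps every p-value strictly positive.
    return [
        (1 + sum(nm >= obs for nm in null_max)) / (1 + n_shifts)
        for obs in observed
    ]

# Toy data (assumed): 20 subjects, 50 markers, a shared aberration at
# marker 10 plus one subject-specific aberration elsewhere.
subjects = []
for i in range(20):
    row = [0] * 50
    row[10] = 1
    row[(3 * i) % 50] = 1
    subjects.append(row)

pvals = cyclic_shift_pvalues(subjects)
```

In this toy example the shared aberration at marker 10 should receive the smallest p-value, since random shifts rarely realign all subjects at one position.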

Preprint, 2012, arXiv

Convergence and prediction of principal component scores in high-dimensional settings

A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased toward 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results, and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the improved bias and numerical properties of our estimators.

Preprint, 2010, arXiv

A geometric interpretation of the permutation $p$-value and its application in eQTL studies

Permutation $p$-values have been widely used to assess the significance of linkage or association in genetic studies. However, the application in large-scale studies is hindered by a heavy computational burden. We propose a geometric interpretation of permutation $p$-values, and based on this geometric interpretation, we develop an efficient permutation $p$-value estimation method in the context of regression with binary predictors. An application to a study of gene expression quantitative trait loci (eQTL) shows that our method provides reliable estimates of permutation $p$-values while requiring less than 5% of the computational time compared with direct permutations. In fact, our method takes a constant time to estimate permutation $p$-values, no matter how small the $p$-value. Our method enables a study of the relationship between nominal $p$-values and permutation $p$-values in a wide range, and provides a geometric perspective on the effective number of independent tests.
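For context, the direct permutation approach that the abstract uses as its baseline can be sketched as follows. This is not the geometric method proposed in the paper; the function name `permutation_pvalue` and the choice of statistic are assumptions made for illustration. The statistic is the absolute difference in group means, which equals the absolute least-squares slope when the predictor is coded 0/1.

```python
import random

def permutation_pvalue(y, x, n_perm=2000, seed=0):
    """Direct permutation p-value for a response y and a 0/1 predictor x.

    Hypothetical baseline sketch: shuffles the predictor labels and counts
    how often the permuted statistic reaches the observed one.
    """
    rng = random.Random(seed)

    def stat(labels):
        g1 = [v for v, l in zip(y, labels) if l == 1]
        g0 = [v for v, l in zip(y, labels) if l == 0]
        # Absolute difference in group means (= |slope| for a 0/1 predictor).
        return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

    observed = stat(x)
    labels = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # group sizes are preserved by shuffling
        if stat(labels) >= observed:
            hits += 1
    # Add-one correction: the estimate is never exactly zero.
    return (1 + hits) / (1 + n_perm)

# Toy data (assumed): two clearly separated groups of ten observations.
y = list(range(10)) + list(range(100, 110))
x = [0] * 10 + [1] * 10
p = permutation_pvalue(y, x)
```

The smallest value this estimator can return is 1 / (n_perm + 1), so resolving a $p$-value near $10^{-6}$ requires on the order of a million permutations. That cost is the computational burden the abstract refers to.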