Source author record

James M. Flegal

James M. Flegal appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computation, math.ST, Methodology, Statistics Theory, Applications, stat.OT

Catalog footprint

What is connected

11 works

6 topics

4 close collaborators

preprint, 2020, arXiv

Assessing and Visualizing Simultaneous Simulation Error

Monte Carlo experiments produce samples in order to estimate features of a given distribution. However, simultaneous estimation of means and quantiles has received little attention, despite being common practice. In this setting we establish a multivariate central limit theorem for any finite combination of sample means and quantiles under the assumption of a strongly mixing process, which includes the standard Monte Carlo and Markov chain Monte Carlo settings. We build on this to provide a fast algorithm for constructing hyperrectangular confidence regions having the desired simultaneous coverage probability and a convenient marginal interpretation. The methods are incorporated into standard ways of visualizing the results of Monte Carlo experiments, enabling the practitioner to more easily assess the reliability of the results. We demonstrate the utility of this approach in various Monte Carlo settings, including simulation studies based on independent and identically distributed samples and Bayesian analyses using Markov chain Monte Carlo sampling.
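For orientation, a hyperrectangular region is a product of per-quantity intervals that share one critical value. The sketch below is not the algorithm from this abstract. It uses the Bonferroni correction, a conservative baseline that ignores correlation between the estimates, and assumes each estimate is approximately normal with a known Monte Carlo standard error. The function name and arguments are hypothetical.

```python
from statistics import NormalDist


def bonferroni_box(estimates, std_errors, alpha=0.05):
    """Return one (lower, upper) interval per estimate.

    Assumption: a Bonferroni correction stands in for the paper's
    algorithm, so joint coverage is at least 1 - alpha but usually
    higher (the box is wider than necessary).
    """
    p = len(estimates)
    if p == 0 or p != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # Split the error budget alpha evenly over the p two-sided intervals.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * p))
    return [(e - z * s, e + z * s) for e, s in zip(estimates, std_errors)]
```

Because every interval uses the same multiplier, each side of the box can still be read as an ordinary marginal interval at a stricter level.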

preprint, 2016, arXiv

Estimating standard errors for importance sampling estimators with multiple Markov chains

The naive importance sampling estimator, based on samples from a single importance density, can be numerically unstable. Instead, we consider generalized importance sampling estimators where samples from more than one probability distribution are combined. We study this problem in the Markov chain Monte Carlo context, where independent samples are replaced with Markov chain samples. If the chains converge to their respective target distributions at a polynomial rate, then under two finite moment conditions, we show a central limit theorem holds for the generalized estimators. Further, we develop an easy-to-implement method to calculate valid asymptotic standard errors based on batch means. We also provide a batch means estimator for calculating asymptotically valid standard errors of Geyer's (1994) reverse logistic estimator. We illustrate the method using a Bayesian variable selection procedure in linear regression. In particular, the generalized importance sampling estimator is used to perform empirical Bayes variable selection and the batch means estimator is used to obtain standard errors in a high-dimensional setting where current methods are not applicable.
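The idea of combining samples from more than one distribution can be illustrated with a deliberately simplified sketch. It assumes independent draws (not Markov chains), normal proposal densities whose normalizing constants are known, and an equal number of draws per proposal. The paper's setting is harder on all three counts. Each draw is weighted against the pooled mixture of all proposals, and the estimate is self-normalized so the target may be unnormalized. All names are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_is_estimate(f, target_unnorm, proposals, draws_per_proposal, seed=0):
    """Self-normalized estimate of E_target[f(X)] from pooled proposal draws.

    proposals: list of (mu, sd) pairs describing normal proposals
    (a toy assumption, chosen so the densities are known exactly).
    """
    rng = random.Random(seed)
    total = draws_per_proposal * len(proposals)
    num = 0.0
    den = 0.0
    for mu, sd in proposals:
        for _ in range(draws_per_proposal):
            x = rng.gauss(mu, sd)
            # Denominator is the mixture of ALL proposals, weighted by
            # their share of the pooled sample, not only the one x came from.
            mix = sum(draws_per_proposal * normal_pdf(x, m, s)
                      for m, s in proposals) / total
            w = target_unnorm(x) / mix
            num += w * f(x)
            den += w
    return num / den
```

For example, `mixture_is_estimate(lambda x: x * x, lambda x: math.exp(-0.5 * x * x), [(-2.0, 1.5), (2.0, 1.5)], 20000)` estimates the second moment of a standard normal from two off-center proposals. Weighting against the mixture keeps the weights bounded wherever at least one proposal covers the target.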

preprint, 2016, arXiv

Strong Consistency of Multivariate Spectral Variance Estimators

Markov chain Monte Carlo (MCMC) algorithms are used to estimate features of interest of a distribution. The Monte Carlo error in estimation has an asymptotic normal distribution whose multivariate nature has so far been ignored in the MCMC community. We present a class of multivariate spectral variance estimators for the asymptotic covariance matrix in the Markov chain central limit theorem and provide conditions for strong consistency. We examine the finite sample properties of the multivariate spectral variance estimators and their eigenvalues in the context of a vector autoregressive process of order 1.
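A multivariate spectral variance estimator is a weighted sum of sample lag-covariance matrices. The sketch below shows one member of that family under assumptions that are not taken from the abstract: a Bartlett lag window, truncation point `b` defaulting to the square root of the chain length, and plain Python lists instead of a matrix library. The VAR(1) generator is a toy with identity innovation covariance.

```python
import math
import random


def bartlett_spectral_variance(chain, b=None):
    """Estimate the asymptotic covariance matrix of the sample mean.

    chain: list of n draws, each a list of p floats.
    b: truncation point of the lag window (assumed default: floor(sqrt(n))).
    """
    n = len(chain)
    p = len(chain[0])
    if b is None:
        b = max(1, int(math.sqrt(n)))
    mean = [sum(row[j] for row in chain) / n for j in range(p)]
    centered = [[row[j] - mean[j] for j in range(p)] for row in chain]

    def gamma(s):
        # Sample lag-s autocovariance matrix with divisor n.
        g = [[0.0] * p for _ in range(p)]
        for t in range(n - s):
            u, v = centered[t], centered[t + s]
            for i in range(p):
                for j in range(p):
                    g[i][j] += u[i] * v[j]
        return [[g[i][j] / n for j in range(p)] for i in range(p)]

    sigma = gamma(0)
    for s in range(1, b):
        w = 1.0 - s / b          # Bartlett weight for lags +s and -s
        g = gamma(s)
        for i in range(p):
            for j in range(p):
                # gamma(-s) is the transpose of gamma(s).
                sigma[i][j] += w * (g[i][j] + g[j][i])
    return sigma


def var1_chain(n, phi, seed=0):
    """Toy VAR(1): x_t = phi x_{t-1} + e_t with e_t ~ N(0, I)."""
    rng = random.Random(seed)
    p = len(phi)
    x = [0.0] * p
    out = []
    for _ in range(n):
        x = [sum(phi[i][j] * x[j] for j in range(p)) + rng.gauss(0.0, 1.0)
             for i in range(p)]
        out.append(x)
    return out
```

For a diagonal `phi` such as `[[0.5, 0.0], [0.0, 0.0]]`, the true asymptotic covariance is diagonal with entries 1/(1 - 0.5)^2 = 4 and 1, which gives a quick sanity check on the output.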

preprint, 2015, arXiv

Bayesian inference for a flexible class of bivariate beta distributions

Several bivariate beta distributions have been proposed in the literature. In particular, Olkin and Liu (2003) proposed a 3-parameter bivariate beta model, which Arnold and Ng (2011) extend to 5- and 8-parameter models. The 3-parameter model allows for only positive correlation, while the latter models can accommodate both positive and negative correlation. However, these come at the expense of a density that is mathematically intractable. The focus of this research is on Bayesian estimation for the 5- and 8-parameter models. Since the likelihood does not exist in closed form, we apply approximate Bayesian computation, a likelihood-free approach. Simulation studies have been carried out for the 5- and 8-parameter cases under various priors and tolerance levels. We apply the 5-parameter model to a real data set by allowing the model to serve as a prior to correlated proportions of a bivariate beta binomial model. Results and comparisons are then discussed.
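Approximate Bayesian computation replaces likelihood evaluation with simulation: draw a parameter from the prior, simulate data, and keep the draw when the simulated data fall within a tolerance of the observed data. The sketch below shows the basic rejection version on a toy binomial model with a uniform prior. It is not the bivariate beta model from this abstract and was chosen only because its exact posterior is known. Function and argument names are hypothetical.

```python
import random


def abc_rejection(y_obs, n_trials, n_proposals, tol=0, seed=0):
    """Rejection ABC for a binomial success probability.

    Assumptions (toy setup): Uniform(0, 1) prior, the summary statistic
    is the success count, and distance is the absolute difference.
    Returns the accepted parameter draws.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        theta = rng.random()                      # draw from the prior
        y_sim = sum(1 for _ in range(n_trials)    # simulate data given theta
                    if rng.random() < theta)
        if abs(y_sim - y_obs) <= tol:             # keep near-matches only
            accepted.append(theta)
    return accepted
```

With `tol=0` the accepted draws follow the exact Beta(y_obs + 1, n_trials - y_obs + 1) posterior. Larger tolerances accept more draws at the cost of a posterior that is only approximate, which is the trade-off a tolerance study explores.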

preprint, 2015, arXiv

Bayesian model selection on linear mixed-effects models for comparisons between multiple treatments and a control

We propose a novel Bayesian model selection technique on linear mixed-effects models to compare multiple treatments with a control. A fully Bayesian approach is implemented to estimate the marginal inclusion probabilities that provide a direct measure of the difference between treatments and the control, along with the model-averaged posterior distributions. Default priors are proposed for model selection incorporating domain knowledge and a component-wise Gibbs sampler is developed for efficient posterior computation. We demonstrate the proposed method based on simulated data and an experimental dataset from a longitudinal study of mouse lifespan and weight trajectories.

preprint, 2014, arXiv

A practical sequential stopping rule for high-dimensional MCMC and its application to spatial-temporal Bayesian models

A current challenge for many Bayesian analyses is determining when to terminate high-dimensional Markov chain Monte Carlo simulations. To this end, we propose using an automated sequential stopping procedure that terminates the simulation when the computational uncertainty is small relative to the posterior uncertainty. Such a stopping rule has previously been shown to work well in settings with posteriors of moderate dimension. In this paper, we illustrate its utility in high-dimensional simulations while overcoming some current computational issues. Further, we investigate the relationship between the stopping rule and effective sample size. As examples, we consider two complex Bayesian analyses on spatially and temporally correlated datasets. The first involves a dynamic space-time model on weather station data and the second a spatial variable selection model on fMRI brain imaging data. Our results show the sequential stopping rule is easy to implement, provides uncertainty estimates, and performs well in high-dimensional settings.
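One way to see the link between such a stopping rule and effective sample size is a short rearrangement for a single scalar quantity. The notation below is assumed for illustration and may not match the paper's: the rule is taken to stop when the confidence-interval half-width is at most a fraction of the estimated posterior standard deviation, and effective sample size is taken to be the usual ratio-of-variances estimate.

```latex
% Assumed notation: n = number of draws so far,
% \hat\sigma_n^2 = estimated asymptotic variance in the Markov chain CLT,
% \hat\lambda_n^2 = estimated posterior variance,
% z = normal quantile, \epsilon = relative tolerance.
\[
  z\,\frac{\hat\sigma_n}{\sqrt{n}} \le \epsilon\,\hat\lambda_n
  \quad\Longleftrightarrow\quad
  \widehat{\mathrm{ESS}} := n\,\frac{\hat\lambda_n^{2}}{\hat\sigma_n^{2}}
  \ge \frac{z^{2}}{\epsilon^{2}} .
\]
% Example: z = 1.96 and \epsilon = 0.05 give z^2/\epsilon^2 = 1536.64.
```

Under these assumptions, stopping on relative precision is the same as stopping once the estimated effective sample size passes a fixed threshold that depends only on the confidence level and the tolerance.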

preprint, 2013, arXiv

A Modified Gibbs Sampler on General State Spaces

We present a modified Gibbs sampler for general state spaces. We establish that this modification can lead to substantial gains in statistical efficiency while maintaining the overall quality of convergence. We illustrate our results in two examples including a toy Normal-Normal model and a Bayesian version of the random effects model.

preprint, 2013, arXiv

Relative fixed-width stopping rules for Markov chain Monte Carlo simulations

Markov chain Monte Carlo (MCMC) simulations are commonly employed for estimating features of a target distribution, particularly for Bayesian inference. A fundamental challenge is determining when these simulations should stop. We consider a sequential stopping rule that terminates the simulation when the width of a confidence interval is sufficiently small relative to the size of the target parameter. Specifically, we propose relative magnitude and relative standard deviation stopping rules in the context of MCMC. In each setting, we develop sufficient conditions for asymptotic validity, that is, conditions to ensure the simulation will terminate with probability one and the resulting confidence intervals will have the proper coverage probability. Our results are applicable in a wide variety of MCMC estimation settings, such as expectation, quantile, or simultaneous multivariate estimation. Finally, we investigate the finite sample properties through a variety of examples and provide some recommendations to practitioners.
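A relative standard deviation rule can be sketched as a loop that extends the chain, re-estimates the Monte Carlo standard error, and stops once the interval half-width is small compared with the sample standard deviation. The version below is a simplified illustration for the mean of a scalar chain, not the procedure analyzed in the paper. The batch means helper, the square-root batch size, the minimum run length used in place of a formal safeguard term, and all names and defaults are assumptions.

```python
import math
import random


def bm_mcse(chain):
    """Batch means standard error of the mean (batch size floor(sqrt(n)))."""
    n = len(chain)
    b = max(1, int(math.sqrt(n)))
    a = n // b
    if a < 2:
        return math.inf
    used = chain[n - a * b:]
    grand = sum(used) / len(used)
    means = [sum(used[k * b:(k + 1) * b]) / b for k in range(a)]
    var_hat = b * sum((m - grand) ** 2 for m in means) / (a - 1)
    return math.sqrt(var_hat / len(used))


def run_until_relative_sd(draw_next, eps=0.05, z=1.96, n_min=1000,
                          check_every=1000, n_max=200_000):
    """Draw from draw_next() until z * MCSE <= eps * sample sd.

    draw_next: zero-argument callable returning the next chain value.
    Stops at n_max regardless, and reports which condition ended the run.
    """
    if check_every < 2:
        raise ValueError("check_every must be at least 2")
    chain = []
    while True:
        chain.extend(draw_next() for _ in range(check_every))
        n = len(chain)
        mean = sum(chain) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in chain) / (n - 1))
        half_width = z * bm_mcse(chain)
        stopped = n >= n_min and half_width <= eps * sd
        if stopped or n >= n_max:
            return {"n": n, "mean": mean, "sd": sd,
                    "half_width": half_width, "stopped_by_rule": stopped}
```

Checking only every `check_every` draws keeps the cost of re-estimating the standard error manageable, and the `stopped_by_rule` flag distinguishes a run that met the precision target from one that hit the draw budget.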

preprint, 2012, arXiv

Exact sampling for intractable probability distributions via a Bernoulli factory

Many applications in the field of statistics require Markov chain Monte Carlo methods. Determining appropriate starting values and run lengths can be both analytically and empirically challenging. A desire to overcome these problems has led to the development of exact, or perfect, sampling algorithms which convert a Markov chain into an algorithm that produces i.i.d. samples from the stationary distribution. Unfortunately, very few of these algorithms have been developed for the distributions that arise in statistical applications, which typically have uncountable support. Here we study an exact sampling algorithm using a geometrically ergodic Markov chain on a general state space. Our work provides a significant reduction in the number of input draws necessary for the Bernoulli factory, which enables exact sampling via a rejection sampling approach. We illustrate the algorithm on a univariate Metropolis-Hastings sampler and a bivariate Gibbs sampler, which provide a proof of concept and insight into hyper-parameter selection. Finally, we illustrate the algorithm on a Bayesian version of the one-way random effects model with data from a styrene exposure study.

preprint, 2010, arXiv

Batch means and spectral variance estimators in Markov chain Monte Carlo

Calculating a Monte Carlo standard error (MCSE) is an important step in the statistical analysis of the simulation output obtained from a Markov chain Monte Carlo experiment. An MCSE is usually based on an estimate of the variance of the asymptotic normal distribution. We consider spectral and batch means methods for estimating this variance. In particular, we establish conditions which guarantee that these estimators are strongly consistent as the simulation effort increases. In addition, for the batch means and overlapping batch means methods, we establish conditions ensuring consistency in the mean-square sense, which in turn allows us to calculate the optimal batch size up to a constant of proportionality. Finally, we examine the empirical finite-sample properties of spectral variance and batch means estimators and provide recommendations for practitioners.
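The batch means and overlapping batch means estimators mentioned here can be written in a few lines for a scalar chain. The sketch below is a minimal illustration rather than a reference implementation. The default batch size of the square root of the chain length is an assumption (the abstract only says the optimal size is known up to a constant), and the function names and the AR(1) test chain are hypothetical.

```python
import math
import random


def bm_variance(chain, b):
    """Non-overlapping batch means estimate of the asymptotic variance."""
    n = len(chain)
    a = n // b                      # number of full batches
    if a < 2:
        raise ValueError("need at least two batches")
    used = chain[n - a * b:]        # drop leftover draws from the start
    grand = sum(used) / (a * b)
    means = [sum(used[k * b:(k + 1) * b]) / b for k in range(a)]
    return b * sum((m - grand) ** 2 for m in means) / (a - 1)


def obm_variance(chain, b):
    """Overlapping batch means estimate using all n - b + 1 windows."""
    n = len(chain)
    if not 1 <= b < n:
        raise ValueError("batch size must satisfy 1 <= b < n")
    grand = sum(chain) / n
    window = sum(chain[:b])         # running sum of the current window
    total = (window / b - grand) ** 2
    for j in range(1, n - b + 1):
        window += chain[j + b - 1] - chain[j - 1]
        total += (window / b - grand) ** 2
    return n * b * total / ((n - b) * (n - b + 1))


def mcse(chain, b=None, overlapping=False):
    """Monte Carlo standard error of the sample mean of a scalar chain."""
    n = len(chain)
    if b is None:
        b = max(1, int(math.sqrt(n)))   # assumed default batch size
    var_hat = obm_variance(chain, b) if overlapping else bm_variance(chain, b)
    return math.sqrt(var_hat / n)


def ar1_chain(n, rho, seed=0):
    """Toy correlated chain: x_t = rho * x_{t-1} + N(0, 1) noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

For `ar1_chain(40000, 0.5)` the true asymptotic variance is 1/(1 - 0.5)^2 = 4, so both `mcse(chain)` and `mcse(chain, overlapping=True)` should land near 0.01. The naive standard error that ignores autocorrelation would be noticeably smaller.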