Source author record

Amy H. Herring

Amy H. Herring appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher Unclaimed source record

Applications Methodology Machine Learning stat.OT

Catalog footprint

What is connected

7 works

4 topics

4 close collaborators


preprint 2020 arXiv

A generalized Bayes framework for probabilistic clustering

Loss-based clustering methods, such as k-means and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative, but such methods face computational problems and large sensitivity to the choice of kernel. This article proposes a generalized Bayes framework that bridges between these two paradigms through the use of Gibbs posteriors. In conducting Bayesian updating, the log likelihood is replaced by a loss function for clustering, leading to a rich family of clustering methods. The Gibbs posterior represents a coherent updating of Bayesian beliefs without needing to specify a likelihood for the data, and can be used for characterizing uncertainty in clustering. We consider losses based on Bregman divergence and pairwise similarities, and develop efficient deterministic algorithms for point estimation along with sampling algorithms for uncertainty quantification. Several existing clustering algorithms, including k-means, can be interpreted as generalized Bayes estimators under our framework, and hence we provide a method of uncertainty quantification for these approaches.
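As a rough illustration of the idea in this abstract, the sketch below builds a Gibbs posterior over cluster labels by weighting each labeling with exp(-lam * loss), where the loss is the k-means within-cluster sum of squares. It is not the authors' algorithm: the brute-force enumeration, the uniform prior over label vectors, the toy 1-D data, and the names `kmeans_loss`, `gibbs_posterior`, `coclustering_prob` and `lam` are all assumptions made for this example.

```python
import itertools
import math


def kmeans_loss(points, labels):
    """Within-cluster sum of squared distances to each cluster mean (1-D data)."""
    loss = 0.0
    for k in set(labels):
        members = [x for x, z in zip(points, labels) if z == k]
        center = sum(members) / len(members)
        loss += sum((x - center) ** 2 for x in members)
    return loss


def gibbs_posterior(points, n_clusters=2, lam=1.0):
    """Weight every label vector by exp(-lam * loss) and normalize.

    Assumes a uniform prior over label vectors; feasible only for tiny data
    because it enumerates all n_clusters ** len(points) labelings.
    """
    labelings = list(itertools.product(range(n_clusters), repeat=len(points)))
    weights = [math.exp(-lam * kmeans_loss(points, z)) for z in labelings]
    total = sum(weights)
    return {z: w / total for z, w in zip(labelings, weights)}


def coclustering_prob(posterior, i, j):
    """Posterior probability that items i and j share a cluster."""
    return sum(p for z, p in posterior.items() if z[i] == z[j])


# Toy data (hypothetical): two well-separated groups on the real line.
points = [0.0, 0.2, 3.0, 3.1]
posterior = gibbs_posterior(points, n_clusters=2, lam=2.0)
best = max(posterior, key=posterior.get)
```

The labeling with the highest Gibbs posterior mass is also the one that minimizes the k-means loss, which matches the abstract's remark that k-means can be read as a generalized Bayes estimator. The co-clustering probabilities give a label-switching-invariant summary of uncertainty.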

preprint 2020 arXiv

Bayesian Hierarchical Factor Regression Models to Infer Cause of Death From Verbal Autopsy Data

In low-resource settings where vital registration of death is not routine it is often of critical interest to determine and study the cause of death (COD) for individuals and the cause-specific mortality fraction (CSMF) for populations. Post-mortem autopsies, considered the gold standard for COD assignment, are often difficult or impossible to implement due to deaths occurring outside the hospital, expense, and/or cultural norms. For this reason, Verbal Autopsies (VAs) are commonly conducted, consisting of a questionnaire administered to next of kin recording demographic information, known medical conditions, symptoms, and other factors for the decedent. This article proposes a novel class of hierarchical factor regression models that avoid restrictive assumptions of standard methods, allow both the mean and covariance to vary with COD category, and can include covariate information on the decedent, region, or events surrounding death. Taking a Bayesian approach to inference, this work develops an MCMC algorithm and validates the FActor Regression for Verbal Autopsy (FARVA) model in simulation experiments. An application of FARVA to real VA data shows improved goodness-of-fit and better predictive performance in inferring COD and CSMF over competing methods. Code and a user manual are made available at https://github.com/kelrenmor/farva.

preprint 2020 arXiv

Bayesian joint modeling of chemical structure and dose response curves

Today there are approximately 85,000 chemicals regulated under the Toxic Substances Control Act, with around 2,000 new chemicals introduced each year. It is impossible to screen all of these chemicals for potential toxic effects either via full organism in vivo studies or in vitro high-throughput screening (HTS) programs. Toxicologists face the challenge of choosing which chemicals to screen, and predicting the toxicity of as-yet-unscreened chemicals. Our goal is to describe how variation in chemical structure relates to variation in toxicological response to enable in silico toxicity characterization designed to meet both of these challenges. With our Bayesian partially Supervised Sparse and Smooth Factor Analysis ($\text{BS}^3\text{FA}$) model, we learn a distance between chemicals targeted to toxicity, rather than one based on molecular structure alone. Our model also enables the prediction of chemical dose-response profiles based on chemical structure (that is, without in vivo or in vitro testing) by taking advantage of a large database of chemicals that have already been tested for toxicity in HTS programs. We show superior simulation performance in distance learning and modest to large gains in predictive ability compared to existing methods. Results from the high-throughput screening data application elucidate the relationship between chemical structure and a toxicity-relevant high-throughput assay. An R package for $\text{BS}^3\text{FA}$ is available online at https://github.com/kelrenmor/bs3fa.

preprint 2019 arXiv

Centered Partition Process: Informative Priors for Clustering

There is a very rich literature proposing Bayesian approaches for clustering starting with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable Partition Probability Functions (EPPF). Gibbs-type priors encompass a broad class of such cases, including Dirichlet and Pitman-Yor processes. Even though there have been some proposals to relax the exchangeability assumption, allowing covariate-dependence and partial exchangeability, limited consideration has been given on how to include concrete prior knowledge on the partition. For example, we are motivated by an epidemiological application, in which we wish to cluster birth defects into groups and we have prior knowledge of an initial clustering provided by experts. As a general approach for including such prior knowledge, we propose a Centered Partition (CP) process that modifies the EPPF to favor partitions close to an initial one. Some properties of the CP prior are described, a general algorithm for posterior computation is developed, and we illustrate the methodology through simulation examples and an application to the motivating epidemiology study of birth defects.
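To make "modifies the EPPF to favor partitions close to an initial one" concrete, the sketch below multiplies a baseline EPPF by an exponential penalty on the distance to a centering partition and renormalizes. The abstract does not state the functional form, so the penalty exp(-psi * d), the pair-counting distance, the Dirichlet process (Chinese restaurant process) baseline, and the names `partitions`, `crp_eppf`, `pair_distance`, `centered_prior`, `psi` and `alpha` are all assumptions of this example rather than the paper's definitions.

```python
import math
from itertools import combinations


def partitions(n):
    """Yield all set partitions of n items as restricted-growth label tuples."""
    def extend(prefix, n_blocks):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # An item may join any existing block or open exactly one new block.
        for k in range(n_blocks + 1):
            yield from extend(prefix + [k], max(n_blocks, k + 1))
    yield from extend([], 0)


def crp_eppf(labels, alpha=1.0):
    """EPPF of a Dirichlet process (CRP) with concentration alpha."""
    n = len(labels)
    sizes = [labels.count(k) for k in set(labels)]
    numer = alpha ** len(sizes) * math.prod(math.factorial(s - 1) for s in sizes)
    denom = math.prod(alpha + i for i in range(n))
    return numer / denom


def pair_distance(a, b):
    """Count item pairs that are co-clustered in one partition but not the other."""
    return sum((a[i] == a[j]) != (b[i] == b[j])
               for i, j in combinations(range(len(a)), 2))


def centered_prior(n, center, psi, alpha=1.0):
    """Baseline EPPF times exp(-psi * distance to center), renormalized."""
    parts = list(partitions(n))
    weights = [crp_eppf(c, alpha) * math.exp(-psi * pair_distance(c, center))
               for c in parts]
    total = sum(weights)
    return {c: w / total for c, w in zip(parts, weights)}


center = (0, 0, 1, 1)  # hypothetical expert-provided initial clustering
baseline = centered_prior(4, center, psi=0.0)  # psi = 0 recovers the CRP prior
centered = centered_prior(4, center, psi=2.0)  # larger psi concentrates on center
```

With `psi=0.0` the prior is the exchangeable baseline. Increasing `psi` shifts mass toward the centering partition, which is how prior knowledge of an initial clustering would enter under these assumptions.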

preprint 2016 arXiv

Bayesian Local Extrema Splines

We consider the problem of shape restricted nonparametric regression on a closed set $\mathcal{X} \subset \mathbb{R}$, where it is reasonable to assume the function has no more than $H$ local extrema interior to $\mathcal{X}$. Following a Bayesian approach we develop a nonparametric prior over a novel class of local extrema splines. This approach is shown to be consistent when modeling any continuously differentiable function within the class of functions considered, and is used to develop methods for hypothesis testing on the shape of the curve. Sampling algorithms are developed, and the method is applied in simulation studies and data examples where the shape of the curve is of interest.

preprint 2016 arXiv

Nonparametric Bayes models for mixed-scale longitudinal surveys

Modeling and computation for multivariate longitudinal surveys have proven challenging, particularly when data are not all continuous and Gaussian but contain discrete measurements. In many social science surveys, study participants are selected via complex survey designs such as stratified random sampling, leading to discrepancies between the sample and population, which are further compounded by missing data and loss to follow up. Survey weights are typically constructed to address these issues, but it is not clear how to include them in models. Motivated by data on sexual development, we propose a novel nonparametric approach for mixed-scale longitudinal data in surveys. In the proposed approach, the mixed-scale multivariate response is expressed through an underlying continuous variable with dynamic latent factors inducing time-varying associations. Bias from the survey design is adjusted for in posterior computation relying on a Markov chain Monte Carlo algorithm. The approach is assessed in simulation studies, and applied to the National Longitudinal Study of Adolescent to Adult Health.

preprint 2015 arXiv

A Conversation with Alan Gelfand

Alan E. Gelfand was born April 17, 1945, in the Bronx, New York. He attended public grade schools and did his undergraduate work at what was then called City College of New York (CCNY, now CUNY), excelling at mathematics. He then surprised and saddened his mother by going all the way across the country to Stanford to graduate school, where he completed his dissertation in 1969 under the direction of Professor Herbert Solomon, making him an academic grandson of Herman Rubin and Harold Hotelling. Alan then accepted a faculty position at the University of Connecticut (UConn) where he was promoted to tenured associate professor in 1975 and to full professor in 1980. A few years later he became interested in decision theory, then empirical Bayes, which eventually led to the publication of Gelfand and Smith [J. Amer. Statist. Assoc. 85 (1990) 398-409], the paper that introduced the Gibbs sampler to most statisticians and revolutionized Bayesian computing. In the mid-1990s, Alan's interests turned strongly to spatial statistics, leading to fundamental contributions in spatially-varying coefficient models, coregionalization, and spatial boundary analysis (wombling). He spent 33 years on the faculty at UConn, retiring in 2002 to become the James B. Duke Professor of Statistics and Decision Sciences at Duke University, serving as chair from 2007-2012. At Duke, he has continued his work in spatial methodology while increasing his impact in the environmental sciences. To date, he has published over 260 papers and 6 books; he has also supervised 36 Ph.D. dissertations and 10 postdocs. This interview was done just prior to a conference of his family, academic descendants, and colleagues to celebrate his 70th birthday and his contributions to statistics which took place on April 19-22, 2015 at Duke University.
