Source author record

Michael Betancourt

Michael Betancourt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, Computation, Applications, astro-ph.HE, astro-ph.IM, Mathematical Software, physics.data-an

Catalog footprint

What is connected

10 works

7 topics

4 close collaborators

preprint, 2022, arXiv

Efficient Automatic Differentiation of Implicit Functions

Derivative-based algorithms are ubiquitous in statistics, machine learning, and applied mathematics. Automatic differentiation offers an algorithmic way to efficiently evaluate these derivatives from computer programs that execute relevant functions. Implementing automatic differentiation for programs that incorporate implicit functions, such as the solution to an algebraic or differential equation, however, requires particular care. Contemporary applications typically appeal to either the application of the implicit function theorem or, in certain circumstances, specialized adjoint methods. In this paper we show that both of these approaches can be generalized to any implicit function, although the generalized adjoint method is typically more effective for automatic differentiation. To showcase the relative advantages and limitations of the two methods we demonstrate their application on a suite of common implicit functions.
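To make the implicit function theorem approach named in this abstract concrete, here is a minimal sketch. It is not the paper's implementation. The equation `f(y, theta) = y**3 + y - theta = 0` and the function names are assumptions chosen for illustration. The solution `y(theta)` is found by Newton iteration, and its derivative follows from `dy/dtheta = -(df/dy)^(-1) * df/dtheta`, without differentiating through the solver iterations.

```python
def solve_root(theta, y0=0.0, tol=1e-12, max_iter=100):
    """Newton iteration for the hypothetical equation y**3 + y - theta = 0."""
    y = y0
    for _ in range(max_iter):
        f = y**3 + y - theta
        df_dy = 3.0 * y**2 + 1.0  # always >= 1, so the step is well defined
        step = f / df_dy
        y -= step
        if abs(step) < tol:
            break
    return y


def implicit_derivative(theta):
    """dy/dtheta from the implicit function theorem, evaluated at the root."""
    y = solve_root(theta)
    df_dy = 3.0 * y**2 + 1.0
    df_dtheta = -1.0
    return -df_dtheta / df_dy


if __name__ == "__main__":
    theta = 2.0
    eps = 1e-6
    finite_diff = (solve_root(theta + eps) - solve_root(theta - eps)) / (2 * eps)
    print(solve_root(theta), implicit_derivative(theta), finite_diff)
```

The derivative costs one extra linear solve (here a scalar division) at the converged solution, which is the property that makes this approach attractive when the solver itself takes many iterations.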

preprint, 2020, arXiv

Bayesian aggregation of average data: An application in drug development

Throughout the different phases of a drug development program, randomized trials are used to establish the tolerability, safety, and efficacy of a candidate drug. At each stage one aims to optimize the design of future studies by extrapolation from the available evidence at the time. This includes collected trial data and relevant external data. However, relevant external data are typically available as averages only, for example from trials on alternative treatments reported in the literature. Here we report on such an example from a drug development for wet age-related macular degeneration. This disease is the leading cause of severe vision loss in the elderly. While current treatment options are efficacious, they are also a substantial burden for the patient. Hence, new treatments are under development which need to be compared against existing treatments. The general statistical problem this leads to is meta-analysis, which addresses the question of how we can combine datasets collected under different conditions. Bayesian methods have long been used to achieve partial pooling. Here we consider the challenge when the model of interest is complex (hierarchical and nonlinear) and one dataset is given as raw data while the second dataset is given as averages only. In such a situation, common meta-analytic methods can only be applied when the model is sufficiently simple for analytic approaches. When the model is too complex, for example nonlinear, an analytic approach is not possible. We provide a Bayesian solution by using simulation to approximately reconstruct the likelihood of the external summary and allowing the parameters in the model to vary under the different conditions. We first evaluate our approach using fake-data simulations and then report results for the drug development program that motivated this research.

preprint, 2020, arXiv

The Discrete Adjoint Method: Efficient Derivatives for Functions of Discrete Sequences

Gradient-based techniques are becoming increasingly critical in quantitative fields, notably in statistics and computer science. The utility of these techniques, however, ultimately depends on how efficiently we can evaluate the derivatives of the complex mathematical functions that arise in applications. In this paper we introduce a discrete adjoint method that efficiently evaluates derivatives for functions of discrete sequences.
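The general idea of an adjoint pass over a discrete sequence can be sketched as follows. This is a generic reverse accumulation over a recurrence, not the formulation derived in the paper. The recurrence `u[n+1] = u[n] + h * theta * u[n] * (1 - u[n])`, the objective `J = u[N]`, and all names are assumptions made for illustration. One forward sweep stores the states and one backward sweep propagates the adjoint variable `lam`, accumulating `dJ/dtheta`.

```python
def forward(theta, u0=0.1, h=0.1, n_steps=50):
    """Run the hypothetical recurrence and return every state u[0..N]."""
    us = [u0]
    for _ in range(n_steps):
        u = us[-1]
        us.append(u + h * theta * u * (1.0 - u))
    return us


def adjoint_gradient(theta, u0=0.1, h=0.1, n_steps=50):
    """Return (J, dJ/dtheta) for J = u[N] using a single backward sweep."""
    us = forward(theta, u0, h, n_steps)
    lam = 1.0   # adjoint of the final state: dJ/du[N]
    grad = 0.0
    for n in reversed(range(n_steps)):
        u = us[n]
        # Here lam holds dJ/du[n+1]; add the direct dependence of step n on theta.
        grad += lam * h * u * (1.0 - u)
        # Pull the adjoint back through the step: lam becomes dJ/du[n].
        lam *= 1.0 + h * theta * (1.0 - 2.0 * u)
    return us[-1], grad


if __name__ == "__main__":
    theta, eps = 1.5, 1e-6
    value, grad = adjoint_gradient(theta)
    finite_diff = (forward(theta + eps)[-1] - forward(theta - eps)[-1]) / (2 * eps)
    print(value, grad, finite_diff)
```

The backward sweep costs about as much as the forward sweep regardless of how many parameters enter the recurrence, which is the usual motivation for adjoint methods over forward sensitivity propagation.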

preprint, 2020, arXiv

Toward a principled Bayesian workflow in cognitive science

Experiments in research on memory, language, and in other areas of cognitive science are increasingly being analyzed using Bayesian methods. This has been facilitated by the development of probabilistic programming languages such as Stan, and easily accessible front-end packages such as brms. The utility of Bayesian methods, however, ultimately depends on the relevance of the Bayesian model, in particular whether or not it accurately captures the structure of the data and the data analyst's domain expertise. Even with powerful software, the analyst is responsible for verifying the utility of their model. To demonstrate this point, we introduce a principled Bayesian workflow (Betancourt, 2018) to cognitive science. Using a concrete working example, we describe basic questions one should ask about the model: prior predictive checks, computational faithfulness, model sensitivity, and posterior predictive checks. The running example for demonstrating the workflow is data on reading times with a linguistic manipulation of object versus subject relative clause sentences. This principled Bayesian workflow also demonstrates how to use domain knowledge to inform prior distributions. It provides guidelines and checks for valid data analysis, avoiding overfitting complex models to noise, and capturing relevant data structure in a probabilistic model. Given the increasing use of Bayesian methods, we aim to discuss how these methods can be properly employed to obtain robust answers to scientific questions. All data and code accompanying this paper are available from https://osf.io/b2vx9/.

preprint, 2016, arXiv

Diagnosing Suboptimal Cotangent Disintegrations in Hamiltonian Monte Carlo

When properly tuned, Hamiltonian Monte Carlo scales to some of the most challenging high-dimensional problems at the frontiers of applied statistics, but when that tuning is suboptimal the performance leaves much to be desired. In this paper I show how suboptimal choices of one critical degree of freedom, the cotangent disintegration, manifest in readily observed diagnostics that facilitate the robust application of the algorithm.

preprint, 2016, arXiv

Identifying the Optimal Integration Time in Hamiltonian Monte Carlo

By leveraging the natural geometry of a smooth probabilistic system, Hamiltonian Monte Carlo yields computationally efficient Markov Chain Monte Carlo estimation, at least provided that the algorithm is sufficiently well tuned. In this paper I show how the geometric foundations of Hamiltonian Monte Carlo implicitly identify the optimal choice of these parameters, especially the integration time. I then consider the practical consequences of these principles in both existing algorithms and a new implementation called Exhaustive Hamiltonian Monte Carlo before demonstrating the utility of these ideas in some illustrative examples.
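For readers unfamiliar with where the integration time enters, the sketch below shows a basic Hamiltonian Monte Carlo transition with a fixed integration time. It is not the Exhaustive Hamiltonian Monte Carlo algorithm from the paper. The one-dimensional standard normal target, the unit mass, and all function names are assumptions for illustration. The integration time is the product `step_size * n_steps`, and the paper concerns how that quantity should be chosen rather than fixed by hand as it is here.

```python
import math
import random


def leapfrog(q, p, step_size, n_steps, grad_potential):
    """Leapfrog integration; total integration time is step_size * n_steps."""
    p = p - 0.5 * step_size * grad_potential(q)       # initial half step in momentum
    for i in range(n_steps):
        q = q + step_size * p                         # full step in position
        if i < n_steps - 1:
            p = p - step_size * grad_potential(q)     # full step in momentum
    p = p - 0.5 * step_size * grad_potential(q)       # final half step in momentum
    return q, p


def hmc_step(q, step_size, n_steps, rng):
    """One HMC transition for a standard normal target (assumed for illustration)."""
    potential = lambda x: 0.5 * x * x
    grad_potential = lambda x: x
    p0 = rng.gauss(0.0, 1.0)                          # resample the momentum
    q_new, p_new = leapfrog(q, p0, step_size, n_steps, grad_potential)
    h_old = potential(q) + 0.5 * p0 * p0
    h_new = potential(q_new) + 0.5 * p_new * p_new
    # Metropolis correction for the integrator's energy error.
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new
    return q


if __name__ == "__main__":
    rng = random.Random(0)
    q, draws = 0.0, []
    for _ in range(2000):
        q = hmc_step(q, step_size=0.2, n_steps=8, rng=rng)
        draws.append(q)
    print(sum(draws) / len(draws))
```

With `step_size=0.2` and `n_steps=8` the integration time is 1.6, roughly a quarter period of the harmonic oscillator induced by this target. Much shorter times make the chain behave like a random walk, while much longer times spend computation retracing the same orbit.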

preprint, 2015, arXiv

A Unified Treatment of Predictive Model Comparison

The predictive performance of any inferential model is critical to its practical success, but quantifying predictive performance is a subtle statistical problem. In this paper I show how the natural structure of any inferential problem defines a canonical measure of relative predictive performance and then demonstrate how approximations of this measure yield many of the model comparison techniques popular in statistics and machine learning.

preprint, 2015, arXiv

The Stan Math Library: Reverse-Mode Automatic Differentiation in C++

As computational challenges in optimization and statistical inference grow ever harder, algorithms that utilize derivatives are becoming increasingly more important. The implementation of the derivatives that make these algorithms so powerful, however, is a substantial user burden and the practicality of these algorithms depends critically on tools like automatic differentiation that remove the implementation burden entirely. The Stan Math Library is a C++, reverse-mode automatic differentiation library designed to be usable, extensive and extensible, efficient, scalable, stable, portable, and redistributable in order to facilitate the construction and utilization of such algorithms. Usability is achieved through a simple direct interface and a cleanly abstracted functional interface. The extensive built-in library includes functions for matrix operations, linear algebra, differential equation solving, and most common probability functions. Extensibility derives from a straightforward object-oriented framework for expressions, allowing users to easily create custom functions. Efficiency is achieved through a combination of custom memory management, subexpression caching, traits-based metaprogramming, and expression templates. Partial derivatives for compound functions are evaluated lazily for improved scalability. Stability is achieved by taking care with arithmetic precision in algebraic expressions and providing stable, compound functions where possible. For portability, the library is standards-compliant C++ (03) and has been tested for all major compilers for Windows, Mac OS X, and Linux.

preprint, 2014, arXiv

Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference

Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9,176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.

preprint, 2011, arXiv

The Geometry of Hamiltonian Monte Carlo

With its systematic exploration of probability distributions, Hamiltonian Monte Carlo is a potent Markov Chain Monte Carlo technique; it is an approach, however, ultimately contingent on the choice of a suitable Hamiltonian function. By examining both the symplectic geometry underlying Hamiltonian dynamics and the requirements of Markov Chain Monte Carlo, we construct the general form of admissible Hamiltonians and propose a particular choice with potential application in Bayesian inference.
