Source author record

Babak Shahbaba

Babak Shahbaba appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation Machine Learning Applications Methodology Quantitative Methods math.DS math.ST Neurons and Cognition physics.comp-ph Statistics Theory

Catalog footprint

What is connected

18works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Scaling Up Bayesian Uncertainty Quantification for Inverse Problems using Deep Neural Networks

Due to the importance of uncertainty quantification (UQ), Bayesian approach to inverse problems has recently gained popularity in applied mathematics, physics, and engineering. However, traditional Bayesian inference methods based on Markov Chain Monte Carlo (MCMC) tend to be computationally intensive and inefficient for such high dimensional problems. To address this issue, several methods based on surrogate models have been proposed to speed up the inference process. More specifically, the calibration-emulation-sampling (CES) scheme has been proven to be successful in large dimensional UQ problems. In this work, we propose a novel CES approach for Bayesian inference based on deep neural network models for the emulation phase. The resulting algorithm is computationally more efficient and more robust against variations in the training set. Further, by using an autoencoder (AE) for dimension reduction, we have been able to speed up our Bayesian inference method up to three orders of magnitude. Overall, our method, henceforth called \emph{Dimension-Reduced Emulative Autoencoder Monte Carlo (DREAMC)} algorithm, is able to scale Bayesian UQ up to thousands of dimensions for inverse problems. Using two low-dimensional (linear and nonlinear) inverse problems we illustrate the validity of this approach. Next, we apply our method to two high-dimensional numerical examples (elliptic and advection-diffussion) to demonstrate its computational advantages over existing algorithms.

preprint2020arXiv

Conjoined Dirichlet Process

Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer from the pre-specification of the number of biclusters or place constraints on the model structure. To address these issues, we develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns. The proposed method utilizes dual Dirichlet process mixture models to learn row and column clusters, with the number of resulting clusters determined by the data rather than pre-specified. Probabilistic biclusters are identified by modeling the mutual dependence between the row and column clusters. We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.

preprint2020arXiv

Optimal Experimental Design for Mathematical Models of Hematopoiesis

The hematopoietic system has a highly regulated and complex structure in which cells are organized to successfully create and maintain new blood cells. Feedback regulation is crucial to tightly control this system, but the specific mechanisms by which control is exerted are not completely understood. In this work, we aim to uncover the underlying mechanisms in hematopoiesis by conducting perturbation experiments, where animal subjects are exposed to an external agent in order to observe the system response and evolution. Developing a proper experimental design for these studies is an extremely challenging task. To address this issue, we have developed a novel Bayesian framework for optimal design of perturbation experiments. We model the numbers of hematopoietic stem and progenitor cells in mice that are exposed to a low dose of radiation. We use a differential equations model that accounts for feedback and feedforward regulation. A significant obstacle is that the experimental data are not longitudinal, rather each data point corresponds to a different animal. This model is embedded in a hierarchical framework with latent variables that capture unobserved cellular population levels. We select the optimum design based on the amount of information gain, measured by the Kullback-Leibler divergence between the probability distributions before and after observing the data. We evaluate our approach using synthetic and experimental data. We show that a proper design can lead to better estimates of model parameters even with relatively few subjects. Additionally, we demonstrate that the model parameters show a wide range of sensitivities to design options. Our method should allow scientists to find the optimal design by focusing on their specific parameters of interest and provide insight to hematopoiesis. Our approach can be extended to more complex models where latent components are used.

preprint2016arXiv

Bayesian Inference on Matrix Manifolds for Linear Dimensionality Reduction

We reframe linear dimensionality reduction as a problem of Bayesian inference on matrix manifolds. This natural paradigm extends the Bayesian framework to dimensionality reduction tasks in higher dimensions with simpler models at greater speeds. Here an orthogonal basis is treated as a single point on a manifold and is associated with a linear subspace on which observations vary maximally. Throughout this paper, we employ the Grassmann and Stiefel manifolds for various dimensionality reduction problems, explore the connection between the two manifolds, and use Hybrid Monte Carlo for posterior sampling on the Grassmannian for the first time. We delineate in which situations either manifold should be considered. Further, matrix manifold models are used to yield scientific insight in the context of cognitive neuroscience, and we conclude that our methods are suitable for basic inference as well as accurate prediction.

preprint2016arXiv

Geodesic Lagrangian Monte Carlo over the space of positive definite matrices: with application to Bayesian spectral density estimation

We extend the application of Hamiltonian Monte Carlo to allow for sampling from probability distributions defined over symmetric or Hermitian positive definite matrices. To do so, we exploit the Riemannian structure induced by Cartan's century-old canonical metric. The geodesics that correspond to this metric are available in closed-form and---within the context of Lagrangian Monte Carlo---provide a principled way to travel around the space of positive definite matrices. Our method improves Bayesian inference on such matrices by allowing for a broad range of priors, so we are not limited to conjugate priors only. In the context of spectral density estimation, we use the (non-conjugate) complex reference prior as an example modeling option made available by the algorithm. Results based on simulated and real-world multivariate time series are presented in this context, and future directions are outlined.

preprint2015arXiv

A Geometric View of Posterior Approximation

Although Bayesian methods are robust and principled, their application in practice could be limited since they typically rely on computationally intensive Markov Chain Monte Carlo algorithms for their implementation. One possible solution is to find a fast approximation of posterior distribution and use it for statistical inference. For commonly used approximation methods, such as Laplace and variational free energy, the objective is mainly defined in terms of computational convenience as opposed to a true distance measure between the target and approximating distributions. In this paper, we provide a geometric view of posterior approximation based on a valid distance measure derived from ambient Fisher geometry. Our proposed framework is easily generalizable and can inspire a new class of methods for approximate Bayesian inference.

preprint2015arXiv

Dependent Matérn Processes for Multivariate Time Series

For the challenging task of modeling multivariate time series, we propose a new class of models that use dependent Matérn processes to capture the underlying structure of data, explain their interdependencies, and predict their unknown values. Although similar models have been proposed in the econometric, statistics, and machine learning literature, our approach has several advantages that distinguish it from existing methods: 1) it is flexible to provide high prediction accuracy, yet its complexity is controlled to avoid overfitting; 2) its interpretability separates it from black-box methods; 3) finally, its computational efficiency makes it scalable for high-dimensional time series. In this paper, we use several simulated and real data sets to illustrate these advantages. We will also briefly discuss some extensions of our model.

preprint2015arXiv

Sampling constrained probability distributions using Spherical Augmentation

Statistical models with constrained probability distributions are abundant in machine learning. Some examples include regression models with norm constraints (e.g., Lasso), probit, many copula models, and latent Dirichlet allocation (LDA). Bayesian inference involving probability distributions confined to constrained domains could be quite challenging for commonly used sampling algorithms. In this paper, we propose a novel augmentation technique that handles a wide range of constraints by mapping the constrained domain to a sphere in the augmented space. By moving freely on the surface of this sphere, sampling algorithms handle constraints implicitly and generate proposals that remain within boundaries when mapped back to the original space. Our proposed method, called {Spherical Augmentation}, provides a mathematically natural and computationally efficient framework for sampling from constrained probability distributions. We show the advantages of our method over state-of-the-art sampling algorithms, such as exact Hamiltonian Monte Carlo, using several examples including truncated Gaussian distributions, Bayesian Lasso, Bayesian bridge regression, reconstruction of quantized stationary Gaussian process, and LDA for topic modeling.

preprint2014arXiv

A Semiparametric Bayesian Model for Detecting Synchrony Among Multiple Neurons

We propose a scalable semiparametric Bayesian model to capture dependencies among multiple neurons by detecting their co-firing (possibly with some lag time) patterns over time. After discretizing time so there is at most one spike at each interval, the resulting sequence of 1's (spike) and 0's (silence) for each neuron is modeled using the logistic function of a continuous latent variable with a Gaussian process prior. For multiple neurons, the corresponding marginal distributions are coupled to their joint probability distribution using a parametric copula model. The advantages of our approach are as follows: the nonparametric component (i.e., the Gaussian process model) provides a flexible framework for modeling the underlying firing rates; the parametric component (i.e., the copula model) allows us to make inference regarding both contemporaneous and lagged relationships among neurons; using the copula model, we construct multivariate probabilistic models by separating the modeling of univariate marginal distributions from the modeling of dependence structure among variables; our method is easy to implement using a computationally efficient sampling algorithm that can be easily extended to high dimensional problems. Using simulated data, we show that our approach could correctly capture temporal dependencies in firing rates and identify synchronous neurons. We also apply our model to spike train data obtained from prefrontal cortical areas in rat's brain.

preprint2014arXiv

An Efficient Bayesian Inference Framework for Coalescent-Based Nonparametric Phylodynamics

Phylodynamics focuses on the problem of reconstructing past population size dynamics from current genetic samples taken from the population of interest. This technique has been extensively used in many areas of biology, but is particularly useful for studying the spread of quickly evolving infectious diseases agents, e.g.,\ influenza virus. Phylodynamics inference uses a coalescent model that defines a probability density for the genealogy of randomly sampled individuals from the population. When we assume that such a genealogy is known, the coalescent model, equipped with a Gaussian process prior on population size trajectory, allows for nonparametric Bayesian estimation of population size dynamics. While this approach is quite powerful, large data sets collected during infectious disease surveillance challenge the state-of-the-art of Bayesian phylodynamics and demand computationally more efficient inference framework. To satisfy this demand, we provide a computationally efficient Bayesian inference framework based on Hamiltonian Monte Carlo for coalescent process models. Moreover, we show that by splitting the Hamiltonian function we can further improve the efficiency of this approach. Using several simulated and real datasets, we show that our method provides accurate estimates of population size dynamics and is substantially faster than alternative methods based on elliptical slice sampler and Metropolis-adjusted Langevin algorithm.

preprint2014arXiv

Wormhole Hamiltonian Monte Carlo

In machine learning and statistics, probabilistic inference involving multimodal distributions is quite difficult. This is especially true in high dimensional problems, where most existing algorithms cannot easily move from one mode to another. To address this issue, we propose a novel Bayesian inference approach based on Markov Chain Monte Carlo. Our method can effectively sample from multimodal distributions, especially when the dimension is high and the modes are isolated. To this end, it exploits and modifies the Riemannian geometric properties of the target distribution to create \emph{wormholes} connecting modes in order to facilitate moving between them. Further, our proposed method uses the regeneration technique in order to adapt the algorithm by identifying new modes and updating the network of wormholes without affecting the stationary distribution. To find new modes, as opposed to rediscovering those previously identified, we employ a novel mode searching algorithm that explores a \emph{residual energy} function obtained by subtracting an approximate Gaussian mixture density (based on previously discovered modes) from the target density function.

preprint2013arXiv

Discussion of "Geodesic Monte Carlo on Embedded Manifolds"

Contributed discussion and rejoinder to "Geodesic Monte Carlo on Embedded Manifolds" (arXiv:1301.6064)

preprint2013arXiv

Spherical Hamiltonian Monte Carlo for Constrained Target Distributions

We propose a new Markov Chain Monte Carlo (MCMC) method for constrained target distributions. Our method first maps the $D$-dimensional constrained domain of parameters to the unit ball ${\bf B}_0^D(1)$. Then, it augments the resulting parameter space to the $D$-dimensional sphere, ${\bf S}^D$. The boundary of ${\bf B}_0^D(1)$ corresponds to the equator of ${\bf S}^D$. This change of domains enables us to implicitly handle the original constraints because while the sampler moves freely on the sphere, it proposes states that are within the constraints imposed on the original parameter space. To improve the computational efficiency of our algorithm, we split the Lagrangian dynamics into several parts such that a part of the dynamics can be handled analytically by finding the geodesic flow on the sphere. We apply our method to several examples including truncated Gaussian, Bayesian Lasso, Bayesian bridge regression, and a copula model for identifying synchrony among multiple neurons. Our results show that the proposed method can provide a natural and efficient framework for handling several types of constraints on target distributions.

preprint2012arXiv

Bayesian Nonparametric Variable Selection as an Exploratory Tool for Finding Genes that Matter

High-throughput scientific studies involving no clear a'priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g. genes) in order to identify a small number that will be considered in follow-up studies that tend to be more thorough and on smaller scales. For large-scale studies, we propose a nonparametric Bayesian approach based on random partition models. Our model thus divides the set of candidate factors into several subgroups according to their degrees of relevance, or potential effect, in relation to the outcome of interest. The model allows for a latent rank to be assigned to each factor according to the overall potential importance of its corresponding group. The posterior expectation or mode of these ranks is used to set up a threshold for selecting potentially relevant factors. Using simulated data, we demonstrate that our approach could be quite effective in finding relevant genes compared to several alternative methods. We apply our model to two large-scale studies. The first study involves transcriptome analysis of infection by human cytomegalovirus (HCMV). The objective of the second study is to identify differentially expressed genes between two types of leukemia.

preprint2012arXiv

Lagrangian Dynamical Monte Carlo

Hamiltonian Monte Carlo (HMC) improves the computational efficiency of the Metropolis algorithm by reducing its random walk behavior. Riemannian Manifold HMC (RMHMC) further improves HMC's performance by exploiting the geometric properties of the parameter space. However, the geometric integrator used for RMHMC involves implicit equations that require costly numerical analysis (e.g., fixed-point iteration). In some cases, the computational overhead for solving implicit equations undermines RMHMC's benefits. To avoid this problem, we propose an explicit geometric integrator that replaces the momentum variable in RMHMC by velocity. We show that the resulting transformation is equivalent to transforming Riemannian Hamilton dynamics to Lagrangian dynamics. Experimental results show that our method improves RMHMC's overall computational efficiency. All computer programs and data sets are available online (http://www.ics.uci.edu/~babaks/Site/Codes.html) in order to allow replications of the results reported in this paper.

preprint2012arXiv

Split Hamiltonian Monte Carlo

We show how the Hamiltonian Monte Carlo algorithm can sometimes be speeded up by "splitting" the Hamiltonian in a way that allows much of the movement around the state space to be done at low computational cost. One context where this is possible is when the log density of the distribution of interest (the potential energy function) can be written as the log of a Gaussian density, which is a quadratic function, plus a slowly varying function. Hamiltonian dynamics for quadratic energy functions can be analytically solved. With the splitting technique, only the slowly-varying part of the energy needs to be handled numerically, and this can be done with a larger stepsize (and hence fewer steps) than would be necessary with a direct simulation of the dynamics. Another context where splitting helps is when the most important terms of the potential energy function and its gradient can be evaluated quickly, with only a slowly-varying part requiring costly computations. With splitting, the quick portion can be handled with a small stepsize, while the costly portion uses a larger stepsize. We show that both of these splitting approaches can reduce the computational cost of sampling from the posterior distribution for a logistic regression model, using either a Gaussian approximation centered on the posterior mode, or a Hamiltonian split into a term that depends on only a small number of critical cases, and another term that involves the larger number of cases whose influence on the posterior distribution is small. Supplemental materials for this paper are available online.

preprint2012arXiv

Split HMC for Gaussian Process Models

In this paper, we discuss an extension of the Split Hamiltonian Monte Carlo (Split HMC) method for Gaussian process model (GPM). This method is based on splitting the Hamiltonian in a way that allows much of the movement around the state space to be done at low computational cost. To this end, we approximate the negative log density (i.e., the energy function) of the distribution of interest by a quadratic function U0 for which Hamiltonian dynamics can be solved analytically. The overall energy function U is then written as U0 + U1, where U1 is the approximation error. The Hamiltonian is then split into two parts; one part is based on U0 is handled analytically, the other part is based on U1 for which we approximate Hamiltonian's equations by discretizing time. We use simulated and real data to compare the performance of our method to the standard HMC. We find that splitting the Hamiltonian for GP models could lead to substantial improvement (up to 10 folds) of sampling efficiency, which is measured in terms of the amount of time required for producing an independent sample with high acceptance probability from posterior distributions.

preprint2010arXiv

Bayesian Gene Set Analysis

Gene expression microarray technologies provide the simultaneous measurements of a large number of genes. Typical analyses of such data focus on the individual genes, but recent work has demonstrated that evaluating changes in expression across predefined sets of genes often increases statistical power and produces more robust results. We introduce a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions. Our approach uses a hierarchical Bayesian framework where a hyperparameter measures the significance of each gene set. Using simulated data, we compare our proposed method to alternative approaches, such as Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). Our approach provides the best overall performance. We also discuss the application of our method to experimental data based on p53 mutation status.

Babak Shahbaba

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

Scaling Up Bayesian Uncertainty Quantification for Inverse Problems using Deep Neural Networks

Conjoined Dirichlet Process

Optimal Experimental Design for Mathematical Models of Hematopoiesis

Bayesian Inference on Matrix Manifolds for Linear Dimensionality Reduction

Geodesic Lagrangian Monte Carlo over the space of positive definite matrices: with application to Bayesian spectral density estimation

A Geometric View of Posterior Approximation

Dependent Matérn Processes for Multivariate Time Series

Sampling constrained probability distributions using Spherical Augmentation

A Semiparametric Bayesian Model for Detecting Synchrony Among Multiple Neurons

An Efficient Bayesian Inference Framework for Coalescent-Based Nonparametric Phylodynamics

Wormhole Hamiltonian Monte Carlo

Discussion of "Geodesic Monte Carlo on Embedded Manifolds"

Spherical Hamiltonian Monte Carlo for Constrained Target Distributions

Bayesian Nonparametric Variable Selection as an Exploratory Tool for Finding Genes that Matter

Lagrangian Dynamical Monte Carlo

Split Hamiltonian Monte Carlo

Split HMC for Gaussian Process Models

Bayesian Gene Set Analysis