Source author record

Anirban Bhattacharya

Anirban Bhattacharya appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.ST, Statistics Theory, Methodology, Computation, cond-mat.mtrl-sci, Applications

Catalog footprint

What is connected

23 works

6 topics

4 close collaborators


preprint · 2021 · arXiv

A Hybrid Approximation to the Marginal Likelihood

Computing the marginal likelihood or evidence is one of the core challenges in Bayesian analysis. While there are many established methods for estimating this quantity, they predominantly rely on using a large number of posterior samples obtained from a Markov Chain Monte Carlo (MCMC) algorithm. As the dimension of the parameter space increases, however, many of these methods become prohibitively slow and potentially inaccurate. In this paper, we propose a novel method in which we use the MCMC samples to learn a high probability partition of the parameter space and then form a deterministic approximation over each of these partition sets. This two-step procedure, which combines a probabilistic and a deterministic component, is termed a Hybrid approximation to the marginal likelihood. We demonstrate its versatility in a range of examples with varying dimension and sample size, and we also highlight the Hybrid approximation's effectiveness in situations where only a limited number of MCMC samples, or only approximate MCMC samples, are available.

preprint · 2020 · arXiv

Bayesian Hierarchical Modeling on Covariance Valued Data

Analysis of structural and functional connectivity (FC) of human brains is of pivotal importance for diagnosis of cognitive ability. The Human Connectome Project (HCP) provides an excellent source of neural data across different regions of interest (ROIs) of the living human brain. Individual specific data were available from an existing analysis (Dai et al., 2017) in the form of time varying covariance matrices representing the brain activity as the subjects perform a specific task. As a preliminary objective of studying the heterogeneity of brain connectomics across the population, we develop a probabilistic model for a sample of covariance matrices using a scaled Wishart distribution. We stress here that our data units are available in the form of covariance matrices, and we use the Wishart distribution to create our likelihood function rather than its more common usage as a prior on covariance matrices. Based on empirical explorations suggesting the data matrices to have low effective rank, we further model the center of the Wishart distribution using an orthogonal factor model type decomposition. We encourage shrinkage towards a low rank structure through a novel shrinkage prior and discuss strategies to sample from the posterior distribution using a combination of Gibbs and slice sampling. We extend our modeling framework to a dynamic setting to detect change points. The efficacy of the approach is explored in various simulation settings and exemplified on several case studies including our motivating HCP data.

preprint · 2020 · arXiv

Evidence bounds in singular models: probabilistic and variational perspectives

The marginal likelihood or evidence in Bayesian statistics contains an intrinsic penalty for larger model sizes and is a fundamental quantity in Bayesian model comparison. Over the past two decades, there has been steadily increasing activity to understand the nature of this penalty in singular statistical models, building on pioneering work by Sumio Watanabe. Unlike regular models where the Bayesian information criterion (BIC) encapsulates a first-order expansion of the logarithm of the marginal likelihood, parameter counting gets trickier in singular models where a quantity called the real log canonical threshold (RLCT) summarizes the effective model dimensionality. In this article, we offer a probabilistic treatment to recover non-asymptotic versions of established evidence bounds as well as prove a new result based on the Gibbs variational inequality. In particular, we show that mean-field variational inference correctly recovers the RLCT for any singular model in its canonical or normal form. We additionally exhibit sharpness of our bound by analyzing the dynamics of a general purpose coordinate ascent algorithm (CAVI) popularly employed in variational inference.
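The contrast between regular and singular models drawn in this abstract can be written out as a pair of asymptotic expansions. This is a sketch of the standard statements from the singular learning theory literature, not a result taken from the abstract itself; the symbols $d$ (parameter dimension), $\lambda$ (the RLCT), $m$ (its multiplicity), $\hat\theta$ (maximum likelihood estimate) and $\theta_0$ (true parameter) are notation assumed here for illustration.

```latex
% Regular model (BIC-type expansion), d = parameter dimension:
\log m(X^{(n)}) = \log p(X^{(n)} \mid \hat\theta) - \frac{d}{2}\,\log n + O_p(1)

% Singular model, \lambda = RLCT and m = its multiplicity:
\log m(X^{(n)}) = \log p(X^{(n)} \mid \theta_0) - \lambda\,\log n + (m-1)\,\log\log n + O_p(1)
```

In a regular model $\lambda = d/2$ and $m = 1$, so the second expansion reduces to the first up to the $O_p(1)$ term.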

preprint · 2020 · arXiv

Mass-shifting phenomenon of truncated multivariate normal priors

We show that lower-dimensional marginal densities of dependent zero-mean normal distributions truncated to the positive orthant exhibit a mass-shifting phenomenon. Despite the truncated multivariate normal density having a mode at the origin, the marginal density assigns increasingly small mass near the origin as the dimension increases. The phenomenon accentuates with stronger correlation between the random variables. A precise quantification characterizing the role of the dimension as well as the dependence is provided. This surprising behavior has serious implications towards Bayesian constrained estimation and inference, where the prior, in addition to having a full support, is required to assign a substantial probability near the origin to capture flat parts of the true function of interest. Without further modification, we show that truncated normal priors are not suitable for modeling flat regions and propose a novel alternative strategy based on shrinking the coordinates using a multiplicative scale parameter. The proposed shrinkage prior is empirically shown to guard against the mass shifting phenomenon while retaining computational efficiency.

preprint · 2020 · arXiv

Nonasymptotic Laplace approximation under model misspecification

We present non-asymptotic two-sided bounds to the log-marginal likelihood in Bayesian inference. The classical Laplace approximation is recovered as the leading term. Our derivation permits model misspecification and allows the parameter dimension to grow with the sample size. We do not make any assumptions about the asymptotic shape of the posterior, and instead require certain regularity conditions on the likelihood ratio and that the posterior be sufficiently concentrated.
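For readers unfamiliar with the classical Laplace approximation mentioned above, the following minimal sketch compares it with the exact log-marginal likelihood in a toy case. The Bernoulli model with a uniform prior and the function names are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not reproduce its non-asymptotic bounds.

```python
import math


def log_evidence_exact(s, n):
    """Exact log-evidence for s successes in n Bernoulli trials under a
    uniform prior on p: the Beta function B(s + 1, n - s + 1)."""
    return math.lgamma(s + 1) + math.lgamma(n - s + 1) - math.lgamma(n + 2)


def log_evidence_laplace(s, n):
    """Classical Laplace approximation in one dimension (requires 0 < s < n).

    log m ~= log-likelihood at the mode + log-prior at the mode
             + (1/2) log(2 pi) - (1/2) log(Hessian of the negative log-posterior).
    The uniform prior contributes a log-prior of zero.
    """
    p_hat = s / n  # posterior mode under the uniform prior
    log_lik = s * math.log(p_hat) + (n - s) * math.log(1.0 - p_hat)
    hessian = s / p_hat**2 + (n - s) / (1.0 - p_hat) ** 2
    return log_lik + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(hessian)


exact = log_evidence_exact(80, 200)
approx = log_evidence_laplace(80, 200)
print(exact, approx, abs(exact - approx))
```

In this toy case the gap between the two values shrinks at rate roughly 1/n, which is the regime where treating the Laplace term as the leading term is reasonable.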

preprint · 2020 · arXiv

Nonparametric Bayesian Deconvolution of a Symmetric Unimodal Density

We consider nonparametric measurement error density deconvolution subject to heteroscedastic measurement errors as well as symmetry about zero and shape constraints, in particular unimodality. The problem is motivated by applications where the observed data are estimated effect sizes from regressions on multiple factors, where the target is the distribution of the true effect sizes. We exploit the fact that any symmetric and unimodal density can be expressed as a mixture of symmetric uniform densities, and model the mixing density in a new way using a Dirichlet process location-mixture of Gamma distributions. We do the computations within a Bayesian context, describe a simple scalable implementation that is linear in the sample size, and show that the estimate of the unknown target density is consistent. Within our application context of regression effect sizes, the target density is likely to have a large probability near zero (the near null effects) coupled with a heavy-tailed distribution (the actual effects). Simulations show that unlike standard deconvolution methods, our Constrained Bayesian Deconvolution method does a much better job of reconstruction of the target density. Applications to a genome-wide association study (GWAS) and microarray data reveal similar results.

preprint · 2016 · arXiv

Bayesian fractional posteriors

We consider the fractional posterior distribution that is obtained by updating a prior distribution via Bayes theorem with a fractional likelihood function, a usual likelihood function raised to a fractional power. First, we analyze the contraction property of the fractional posterior in a general misspecified framework. Our contraction results only require a prior mass condition on certain Kullback-Leibler (KL) neighborhood of the true parameter (or the KL divergence minimizer in the misspecified case), and obviate constructions of test functions and sieves commonly used in the literature for analyzing the contraction property of a regular posterior. We show through a counterexample that some condition controlling the complexity of the parameter space is necessary for the regular posterior to contract, rendering additional flexibility on the choice of the prior for the fractional posterior. Second, we derive a novel Bayesian oracle inequality based on a PAC-Bayes inequality in misspecified models. Our derivation reveals several advantages of averaging based Bayesian procedures over optimization based frequentist procedures. As an application of the Bayesian oracle inequality, we derive a sharp oracle inequality in the convex regression problem under an arbitrary dimension. We also illustrate the theory in Gaussian process regression and density estimation problems.
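The definition of the fractional posterior given above (prior times likelihood raised to a fractional power) has a closed form in conjugate models. The sketch below uses a Beta prior with Bernoulli data, a toy setting assumed here for illustration and not taken from the paper; the function names and the power `alpha` are likewise hypothetical.

```python
def fractional_beta_posterior(a, b, successes, failures, alpha):
    """Fractional posterior for a Beta(a, b) prior and Bernoulli data.

    The likelihood p^s (1 - p)^f raised to the power alpha is
    p^(alpha * s) (1 - p)^(alpha * f), so the update stays in the Beta family.
    alpha = 1 gives the regular posterior.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return a + alpha * successes, b + alpha * failures


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1.0))


regular = fractional_beta_posterior(1.0, 1.0, 30, 70, alpha=1.0)
tempered = fractional_beta_posterior(1.0, 1.0, 30, 70, alpha=0.5)
print(regular, beta_variance(*regular))
print(tempered, beta_variance(*tempered))
```

Raising the likelihood to a power below one discounts the data, so the fractional posterior in this example is more spread out than the regular one.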

preprint · 2016 · arXiv

Fast sampling with Gaussian scale-mixture priors in high-dimensional regression

We propose an efficient way to sample from a class of structured multivariate Gaussian distributions which routinely arise as conditional posteriors of model parameters that are assigned a conditionally Gaussian prior. The proposed algorithm only requires matrix operations in the form of matrix multiplications and linear system solutions. We exhibit that the computational complexity of the proposed algorithm grows linearly with the dimension unlike existing algorithms relying on Cholesky factorizations with cubic orders of complexity. The algorithm should be broadly applicable in settings where Gaussian scale mixture priors are used on high dimensional model parameters. We provide an illustration through posterior sampling in a high dimensional regression setting with a horseshoe prior on the vector of regression coefficients.
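A sampler of the kind described above can be sketched in a few lines. The version below assumes the target is N(Sigma Phi^T alpha, Sigma) with Sigma = (Phi^T Phi + D^{-1})^{-1}, where Phi is n x p and D is diagonal; this parameterization, the function name and the argument names are assumptions made for illustration, and the abstract itself does not spell out the steps.

```python
import numpy as np


def fast_gaussian_sample(Phi, d, alpha, rng):
    """Draw from N(Sigma Phi^T alpha, Sigma), Sigma = (Phi^T Phi + D^{-1})^{-1}.

    Phi   : (n, p) matrix
    d     : (p,) positive diagonal of D
    alpha : (n,) vector
    Only an n x n linear system is solved, so the cost is linear in p.
    """
    n, p = Phi.shape
    u = rng.standard_normal(p) * np.sqrt(d)   # u ~ N(0, D)
    delta = rng.standard_normal(n)            # delta ~ N(0, I_n)
    v = Phi @ u + delta
    M = (Phi * d) @ Phi.T + np.eye(n)         # Phi D Phi^T + I_n, an n x n matrix
    w = np.linalg.solve(M, alpha - v)
    return u + d * (Phi.T @ w)                # u + D Phi^T w


rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 500))          # n = 20 observations, p = 500 coefficients
d = rng.uniform(0.1, 2.0, size=500)
alpha = rng.standard_normal(20)
theta = fast_gaussian_sample(Phi, d, alpha, rng)
print(theta.shape)
```

The design choice is to work with the n x n matrix Phi D Phi^T + I_n instead of factorizing the p x p precision matrix, which is what avoids the cubic cost in p when p is much larger than n.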

preprint · 2016 · arXiv

Partitioned Cross-Validation for Divide-and-Conquer Density Estimation

We present an efficient method to estimate cross-validation bandwidth parameters for kernel density estimation in very large datasets where ordinary cross-validation is rendered highly inefficient, both statistically and computationally. Our approach relies on calculating multiple cross-validation bandwidths on partitions of the data, followed by suitable scaling and averaging to return a partitioned cross-validation bandwidth for the entire dataset. The partitioned cross-validation approach produces substantial computational gains over ordinary cross-validation. We additionally show that partitioned cross-validation can be statistically efficient compared to ordinary cross-validation. We derive analytic expressions for the asymptotically optimal number of partitions and study its finite sample accuracy through a detailed simulation study. We additionally propose a permuted version of partitioned cross-validation which attains even higher efficiency. Theoretical properties of the estimators are studied and the methodology is applied to the Higgs Boson dataset with 11 million observations.
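The "scaling and averaging" step described above can be made concrete under one common assumption: that the optimal bandwidth for a sample of size $n$ decays like $n^{-1/5}$, as it does for second-order kernels. With $m$ partitions of equal size $n/m$ and subset bandwidths $\hat h_1, \dots, \hat h_m$ (notation assumed here, not taken from the abstract), a sketch of the combined bandwidth is:

```latex
% Each subset bandwidth targets sample size n/m, so \hat h_j \approx c\,(n/m)^{-1/5}.
% Rescaling to sample size n multiplies by m^{-1/5}:
\hat h_{\mathrm{PCV}} = m^{-1/5} \cdot \frac{1}{m} \sum_{j=1}^{m} \hat h_j
```

Averaging reduces the variability of the individual cross-validation bandwidths, and the factor $m^{-1/5}$ corrects for each subset being smaller than the full dataset.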

preprint · 2016 · arXiv

Sub-optimality of some continuous shrinkage priors

Two-component mixture priors provide a traditional way to induce sparsity in high-dimensional Bayes models. However, several aspects of such a prior, including computational complexities in high-dimensions, interpretation of exact zeros and non-sparse posterior summaries under standard loss functions, have motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians. Interestingly, we demonstrate that many commonly used shrinkage priors, including the Bayesian Lasso, do not have adequate posterior concentration in high-dimensional settings.

preprint · 2015 · arXiv

Comment on Article by Dawid and Musio

Discussion of "Bayesian Model Selection Based on Proper Scoring Rules" by Dawid and Musio [arXiv:1409.5291].

preprint · 2015 · arXiv

Direct Evidence of Mg Incorporation Pathway in Vapor-Liquid-Solid Grown p-type Nonpolar GaN Nanowires

Doping of III-nitride based compound semiconductor nanowires is still a challenging issue to have a control over the dopant distribution in precise locations of the nanowire optoelectronic devices. Knowledge of the dopant incorporation and its pathways in nanowires for such devices is limited by the growth methods. We report the direct evidence of incorporation pathway for Mg dopants in p-type nonpolar GaN nanowires grown via vapour-liquid-solid (VLS) method in a chemical vapour deposition technique for the first time. Mg incorporation is confirmed using X-ray photoelectron (XPS) and electron energy loss spectroscopic (EELS) measurements. Energy filtered transmission electron microscopic (EFTEM) studies are used for finding the Mg incorporation pathway in the GaN nanowire. Photoluminescence studies on Mg doped GaN nanowires along with the electrical characterization on heterojunction formed between nanowires and n-Si confirm the activation of Mg atoms as p-type dopants in nonpolar GaN nanowires.

preprint · 2015 · arXiv

Optical Properties of Mono-Dispersed AlGaN Nanowires in the Single-Prong Growth Mechanism

Growth of mono-dispersed AlGaN nanowires of ternary wurtzite phase is reported using chemical vapour deposition technique in the vapour-liquid-solid process. The role of the distribution of Au catalyst nanoparticles in the size and the shape of AlGaN nanowires is discussed. These variations in the morphology of the nanowires are understood invoking Ostwald ripening of Au catalyst nanoparticles at high temperature followed by the effect of single and multi-prong growth mechanism. Energy-filtered transmission electron microscopy is used as evidence for the presence of Al in the as-prepared samples. A significant blue shift of the band gap, in the absence of quantum confinement effect in the nanowires with diameter about 100 nm, is used as supporting evidence for the AlGaN alloy formation. Polarized resonance Raman spectroscopy with strong electron-phonon coupling along with optical confinement due to the dielectric contrast of nanowire with respect to that of surrounding media are adopted to understand the crystalline orientation of a single nanowire in the sub-diffraction limit of about 100 nm using 325 nm wavelength, for the first time. The results are compared with the structural analysis using high resolution transmission microscopic study.

preprint · 2015 · arXiv

Optimal Bayesian estimation in random covariate design with a rescaled Gaussian process prior

In Bayesian nonparametric models, Gaussian processes provide a popular prior choice for regression function estimation. Existing literature on the theoretical investigation of the resulting posterior distribution almost exclusively assumes a fixed design for covariates. The only random design result we are aware of (van der Vaart & van Zanten, 2011) assumes the assigned Gaussian process to be supported on the smoothness class specified by the true function with probability one. This is a fairly restrictive assumption as it essentially rules out the Gaussian process prior with a squared exponential kernel when modeling rougher functions. In this article, we show that an appropriate rescaling of the above Gaussian process leads to a rate-optimal posterior distribution even when the covariates are independently realized from a known density on a compact set. The proofs are based on deriving sharp concentration inequalities for frequentist kernel estimators; the results might be of independent interest.

preprint · 2015 · arXiv

Optimal Bayesian estimation in stochastic block models

With the advent of structured data in the form of social networks, genetic circuits and protein interaction networks, statistical analysis of networks has gained popularity over recent years. The stochastic block model constitutes a classical cluster-exhibiting random graph model for networks. There is a substantial amount of literature devoted to proposing strategies for estimating and inferring parameters of the model, both from classical and Bayesian viewpoints. Unlike the classical counterpart, there is however a dearth of theoretical results on the accuracy of estimation in the Bayesian setting. In this article, we undertake a theoretical investigation of the posterior distribution of the parameters in a stochastic block model. In particular, we show that one obtains optimal rates of posterior convergence with routinely used multinomial-Dirichlet priors on cluster indicators and uniform priors on the probabilities of the random edge indicators. En route, we develop geometric embedding techniques to exploit the lower dimensional structure of the parameter space which may be of independent interest.

preprint · 2015 · arXiv

Optimal Gaussian approximations to the posterior for log-linear models with Diaconis-Ylvisaker priors

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis-Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. Here we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis-Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.

preprint · 2015 · arXiv

Posterior contraction in Gaussian process regression using Wasserstein approximations

We study posterior rates of contraction in Gaussian process regression with unbounded covariate domain. Our argument relies on developing a Gaussian approximation to the posterior of the leading coefficients of a Karhunen--Loève expansion of the Gaussian process. The salient feature of our result is deriving such an approximation in the $L^2$ Wasserstein distance and relating the speed of the approximation to the posterior contraction rate using a coupling argument. Specific illustrations are provided for the Gaussian or squared-exponential covariance kernel.

preprint · 2014 · arXiv

Anisotropic function estimation using multi-bandwidth Gaussian processes

In nonparametric regression problems involving multiple predictors, there is typically interest in estimating an anisotropic multivariate regression surface in the important predictors while discarding the unimportant ones. Our focus is on defining a Bayesian procedure that leads to the minimax optimal rate of posterior contraction (up to a log factor) adapting to the unknown dimension and anisotropic smoothness of the true surface. We propose such an approach based on a Gaussian process prior with dimension-specific scalings, which are assigned carefully-chosen hyperpriors. We additionally show that using a homogeneous Gaussian process with a single bandwidth leads to a sub-optimal rate in anisotropic cases.

preprint · 2014 · arXiv

Dirichlet-Laplace priors for optimal shrinkage

Penalized regression methods, such as $L_1$ regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the frequentist literature, little is known about the properties of such priors and the convergence and concentration of the corresponding posterior distribution. In this article, we propose a new class of Dirichlet--Laplace (DL) priors, which possess optimal posterior concentration and lead to efficient posterior computation exploiting results from normalized random measure theory. Finite sample performance of Dirichlet--Laplace priors relative to alternatives is assessed in simulated and real data examples.
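To give a feel for a global-local prior of this type, the sketch below draws from one commonly stated hierarchical form of the Dirichlet--Laplace prior: theta_j given (phi, tau) is double exponential with scale phi_j * tau, phi is Dirichlet(a, ..., a), and tau is Gamma with shape n * a and rate 1/2. This hierarchy and the function name are assumptions stated here for illustration; the abstract does not give the specification, so it should be checked against the paper before use.

```python
import numpy as np


def sample_dl_prior(n, a, rng):
    """One draw of an n-vector from an (assumed) Dirichlet-Laplace hierarchy.

    phi   ~ Dirichlet(a, ..., a)           local scales on the simplex
    tau   ~ Gamma(shape=n * a, rate=1/2)   global scale (NumPy uses scale = 1 / rate)
    theta ~ Laplace(0, phi_j * tau)        independently across coordinates
    """
    phi = rng.dirichlet(np.full(n, a))
    tau = rng.gamma(shape=n * a, scale=2.0)
    theta = rng.laplace(loc=0.0, scale=phi * tau)
    return theta, phi, tau


rng = np.random.default_rng(42)
theta, phi, tau = sample_dl_prior(n=50, a=0.5, rng=rng)
print(np.sort(np.abs(theta))[-5:])  # a few large entries among many small ones
```

Because phi lies on the simplex, a small concentration parameter a puts most of the scale on a few coordinates, which is how a continuous prior can mimic sparsity without a point mass at zero.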

preprint · 2014 · arXiv

Posterior contraction in sparse Bayesian factor models for massive covariance matrices

Sparse Bayesian factor models are routinely implemented for parsimonious dependence modeling and dimensionality reduction in high-dimensional applications. We provide theoretical understanding of such Bayesian procedures in terms of posterior convergence rates in inferring high-dimensional covariance matrices where the dimension can be larger than the sample size. Under relevant sparsity assumptions on the true covariance matrix, we show that commonly-used point mass mixture priors on the factor loadings lead to consistent estimation in the operator norm even when $p\gg n$. One of our major contributions is to develop a new class of continuous shrinkage priors and provide insights into their concentration around sparse vectors. Using such priors for the factor loadings, we obtain a similar rate of convergence as obtained with point mass mixture priors. To obtain the convergence rates, we construct test functions to separate points in the space of high-dimensional covariance matrices using insights from random matrix theory; the tools developed may be of independent interest. We also derive minimax rates and show that the Bayesian posterior rates of convergence coincide with the minimax rates up to a $\sqrt{\log n}$ term.

preprint · 2013 · arXiv

Bayesian factorizations of big sparse tensors

It has become routine to collect data that are structured as multiway arrays (tensors). There is an enormous literature on low rank and sparse matrix factorizations, but limited consideration of extensions to the tensor case in statistics. The most common low rank tensor factorization relies on parallel factor analysis (PARAFAC), which expresses a rank $k$ tensor as a sum of rank one tensors. When observations are only available for a tiny subset of the cells of a big tensor, the low rank assumption is not sufficient and PARAFAC has poor performance. We induce an additional layer of dimension reduction by allowing the effective rank to vary across dimensions of the table. For concreteness, we focus on a contingency table application. Taking a Bayesian approach, we place priors on terms in the factorization and develop an efficient Gibbs sampler for posterior computation. Theory is provided showing posterior concentration rates in high-dimensional settings, and the methods are shown to have excellent performance in simulations and several real data applications.
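The PARAFAC representation mentioned above, a rank $k$ tensor written as a sum of $k$ rank one tensors, can be illustrated for a three-way array. This is a minimal sketch of the plain decomposition only, not of the paper's extension with dimension-specific effective ranks; the factor matrices `A`, `B`, `C` and the function name are hypothetical.

```python
import numpy as np


def parafac_reconstruct(A, B, C):
    """Build a three-way tensor from PARAFAC factors.

    A, B, C have shapes (I, k), (J, k), (K, k). Entry (i, j, l) of the result
    is sum over r of A[i, r] * B[j, r] * C[l, r], that is, a sum of k rank one
    tensors formed from matching columns.
    """
    return np.einsum("ir,jr,kr->ijk", A, B, C)


rng = np.random.default_rng(0)
k = 3
A = rng.standard_normal((4, k))
B = rng.standard_normal((5, k))
C = rng.standard_normal((6, k))
T = parafac_reconstruct(A, B, C)
print(T.shape)
```

The tensor has 4 * 5 * 6 = 120 cells but is described by only (4 + 5 + 6) * 3 = 45 numbers, which is the kind of dimension reduction the low rank assumption buys.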

preprint · 2012 · arXiv

Bayesian shrinkage

Penalized regression methods, such as $L_1$ regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the corresponding frequentist literature, very little is known about the properties of such priors. Focusing on a broad class of shrinkage priors, we provide precise results on prior and posterior concentration. Interestingly, we demonstrate that most commonly used shrinkage priors, including the Bayesian Lasso, are suboptimal in high-dimensional settings. A new class of Dirichlet Laplace (DL) priors is proposed; these priors are optimal and lead to efficient posterior computation exploiting results from normalized random measure theory. Finite sample performance of Dirichlet Laplace priors relative to alternatives is assessed in simulations.

preprint · 2011 · arXiv

Posterior convergence rates in non-linear latent variable models

Non-linear latent variable models have become increasingly popular in a variety of applications. However, there has been little study on theoretical properties of these models. In this article, we study rates of posterior contraction in univariate density estimation for a class of non-linear latent variable models where unobserved U(0,1) latent variables are related to the response variables via a random non-linear regression with an additive error. Our approach relies on characterizing the space of densities induced by the above model as kernel convolutions with a general class of continuous mixing measures. The literature on posterior rates of contraction in density estimation almost entirely focuses on finite or countably infinite mixture models. We develop approximation results for our class of continuous mixing measures. Using an appropriate Gaussian process prior on the unknown regression function, we obtain the optimal frequentist rate up to a logarithmic factor under standard regularity conditions on the true density.
