Source author record

Kevin Roth

Kevin Roth appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Computation, Neurons and Cognition, physics.soc-ph

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

Preprint, 2020, arXiv

How Good is the Bayes Posterior in Deep Neural Networks Really?

During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency, there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as a heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: if the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.
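
For readers unfamiliar with the term, a cold posterior is commonly written as a tempered version of the Bayes posterior. The notation below (posterior energy U, temperature T, and a dataset D of n input-label pairs) is assumed here for illustration and does not come from this catalog record:

```latex
\[
  p_T(\theta \mid \mathcal{D}) \;\propto\; \exp\!\left(-\frac{U(\theta)}{T}\right),
  \qquad
  U(\theta) = -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) \;-\; \log p(\theta).
\]
```

Under this notation, T = 1 recovers the Bayes posterior, while T < 1 raises both the likelihood and the prior to the power 1/T > 1, which is one way to read the abstract's phrase "overcounts evidence".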

Preprint, 2020, arXiv

The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational distribution to a more compact parameterization. For a variety of deep Bayesian neural networks trained using Gaussian mean-field variational inference, we find that the posterior standard deviations consistently exhibit strong low-rank structure after convergence. This means that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance. Furthermore, we find that such factorized parameterizations improve the signal-to-noise ratio of stochastic gradient estimates of the variational lower bound, resulting in faster convergence.
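
The low-rank idea in this abstract can be illustrated with a minimal NumPy sketch. It assumes a single dense layer with an m x n weight matrix and builds the matrix of posterior standard deviations as a rank-k product. The function names `k_tied_std` and `parameter_counts`, the layer size, and the use of `exp` to keep the factors positive are illustrative assumptions, not details taken from the paper:

```python
import numpy as np


def k_tied_std(u, v):
    """Build an (m, n) matrix of standard deviations as the product u @ v.T.

    u has shape (m, k) and v has shape (n, k); both are assumed positive,
    so every entry of the result is positive and its rank is at most k.
    """
    return u @ v.T


def parameter_counts(m, n, k):
    """Return (full, tied): one std per weight vs. the rank-k factorization."""
    return m * n, k * (m + n)


rng = np.random.default_rng(0)
m, n, k = 256, 128, 2  # hypothetical layer shape and rank

# Positive factors (the exp keeps them positive; an assumption of this sketch).
u = np.exp(0.1 * rng.normal(size=(m, k)))
v = np.exp(0.1 * rng.normal(size=(n, k)))

sigma = k_tied_std(u, v)        # posterior standard deviations, shape (m, n)
mean = rng.normal(size=(m, n))  # posterior means keep one parameter per weight

# Reparameterized sample from the mean-field Gaussian N(mean, sigma**2).
weights = mean + sigma * rng.standard_normal((m, n))

full, tied = parameter_counts(m, n, k)
print(f"std parameters: full={full}, k-tied={tied}")
```

For this hypothetical 256 x 128 layer with k = 2, the standard deviations need k(m + n) parameters instead of mn, while the means are left untouched.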

Preprint, 2017, arXiv

Model of Brain Activation Predicts the Neural Collective Influence Map of the Brain

Efficient complex systems have a modular structure, but modularity does not guarantee robustness, because efficiency also requires an ingenious interplay of the interacting modular components. The human brain is the elemental paradigm of an efficient robust modular system interconnected as a network of networks (NoN). Understanding the emergence of robustness in such modular architectures from the interconnections of its parts is a long-standing challenge that has concerned many scientists. Current models of dependencies in NoN inspired by the power grid express interactions among modules with fragile couplings that amplify even small shocks, thus preventing functionality. Therefore, we introduce a model of NoN to shape the pattern of brain activations to form a modular environment that is robust. The model predicts the map of neural collective influencers (NCIs) in the brain, through the optimization of the influence of the minimal set of essential nodes responsible for broadcasting information to the whole-brain NoN. Our results suggest new intervention protocols to control brain activity by targeting influential neural nodes predicted by network theory.