Source author record

Robert Bamler

Robert Bamler appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, physics.chem-ph, eess.IV, Artificial Intelligence, Computer Vision, cond-mat.mtrl-sci, cond-mat.quant-gas, cond-mat.str-el, Information Theory, math.IT

Catalog footprint

What is connected

12 works

10 topics

4 close collaborators

Preprint, 2023, arXiv

A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry

Bayesian neural networks (BNNs) are a principled approach to modeling predictive uncertainties in deep learning, which are important in safety-critical applications. Since exact Bayesian inference over the weights in a BNN is intractable, various approximate inference methods exist, among which sampling methods such as Hamiltonian Monte Carlo (HMC) are often considered the gold standard. While HMC provides high-quality samples, it lacks interpretable summary statistics because its sample mean and variance are meaningless in neural networks due to permutation symmetry. In this paper, we first show that the role of permutations can be meaningfully quantified by a number-of-transpositions metric. We then show that the recently proposed rebasin method allows us to summarize HMC samples into a compact representation that provides a meaningful explicit uncertainty estimate for each weight in a neural network, thus unifying sampling methods with variational inference. We show that this compact representation allows us to compare trained BNNs directly in weight space across sampling methods and variational inference, and to efficiently prune neural networks trained without explicit Bayesian frameworks by exploiting uncertainty estimates from HMC.

Preprint, 2022, arXiv

Hybridizing Physical and Data-driven Prediction Methods for Physicochemical Properties

We present a generic way to hybridize physical and data-driven methods for predicting physicochemical properties. The approach 'distills' the physical method's predictions into a prior model and combines it with sparse experimental data using Bayesian inference. We apply the new approach to predict activity coefficients at infinite dilution and obtain significant improvements compared to the data-driven and physical baselines and established ensemble methods from the machine learning literature.

Preprint, 2022, arXiv

Making Thermodynamic Models of Mixtures Predictive by Machine Learning: Matrix Completion of Pair Interactions

Predictive models of thermodynamic properties of mixtures are paramount in chemical engineering and chemistry. Classical thermodynamic models are successful in generalizing over (continuous) conditions like temperature and concentration. On the other hand, matrix completion methods (MCMs) from machine learning successfully generalize over (discrete) binary systems; these MCMs can make predictions without any data for a given binary system by implicitly learning commonalities across systems. In the present work, we combine the strengths of both worlds in a hybrid approach. The underlying idea is to predict the pair-interaction energies, as they are used in basically all physical models of liquid mixtures, by an MCM. As an example, we embed an MCM into UNIQUAC, a widely-used physical model for the Gibbs excess energy. We train the resulting hybrid model in a Bayesian machine-learning framework on experimental data for activity coefficients in binary systems of 1146 components from the Dortmund Data Bank. We thereby obtain, for the first time, a complete set of UNIQUAC parameters for all binary systems of these components, which allows us to predict, in principle, activity coefficients at arbitrary temperature and composition for any combination of these components, not only for binary but also for multicomponent systems. The hybrid model even outperforms the best available physical model for predicting activity coefficients, the modified UNIFAC (Dortmund) model.

Preprint, 2022, arXiv

Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective

Entropy coding is the backbone of data compression. Novel machine-learning based compression methods often use a new entropy coder called Asymmetric Numeral Systems (ANS) [Duda et al., 2015], which provides very close to optimal bitrates and simplifies [Townsend et al., 2019] advanced compression techniques such as bits-back coding. However, researchers with a background in machine learning often struggle to understand how ANS works, which prevents them from exploiting its full versatility. This paper is meant as an educational resource to make ANS more approachable by presenting it from a new perspective of latent variable models and the so-called bits-back trick. We guide the reader step by step to a complete implementation of ANS in the Python programming language, which we then generalize for more advanced use cases. We also present and empirically evaluate an open-source library of various entropy coders designed for both research and production use. Related teaching videos and problem sets are available online.
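
To give a rough feel for how an ANS coder operates, here is a minimal toy sketch in Python. It is not the implementation developed in the paper: it assumes a fixed symbol-frequency table whose frequencies sum to a power of two, and it keeps the entire compressed message in a single Python big integer instead of streaming bits out. All names (`build_table`, `encode`, `decode`, `PRECISION`, `FREQS`) are hypothetical and chosen only for this illustration.

```python
# Toy ANS coder (illustrative sketch, not the paper's implementation).
# Assumption: the whole compressed message lives in one Python big integer,
# and symbol frequencies sum to exactly 2**PRECISION.

def build_table(freqs):
    """Map each symbol to (cumulative_start, frequency)."""
    table, start = {}, 0
    for symbol, f in freqs.items():
        table[symbol] = (start, f)
        start += f
    return table, start  # `start` now equals the total frequency


def encode(message, table, precision):
    """Push symbols onto the integer state; ANS behaves like a stack."""
    state = 1
    # Encode in reverse so that decoding yields the symbols in forward order.
    for symbol in reversed(message):
        start, f = table[symbol]
        state = ((state // f) << precision) + (state % f) + start
    return state


def decode(state, table, precision, length):
    """Pop `length` symbols off the integer state."""
    mask = (1 << precision) - 1
    message = []
    for _ in range(length):
        slot = state & mask  # the low bits identify the symbol
        symbol, (start, f) = next(
            (s, sf) for s, sf in table.items() if sf[0] <= slot < sf[0] + sf[1]
        )
        state = f * (state >> precision) + (slot - start)
        message.append(symbol)
    return message, state


PRECISION = 4
FREQS = {"a": 8, "b": 3, "r": 3, "c": 1, "d": 1}  # sums to 16 == 2**4
TABLE, TOTAL = build_table(FREQS)
assert TOTAL == 1 << PRECISION

MESSAGE = list("abracadabra")
COMPRESSED = encode(MESSAGE, TABLE, PRECISION)
DECODED, FINAL_STATE = decode(COMPRESSED, TABLE, PRECISION, len(MESSAGE))
```

In this sketch, decoding exactly inverts each encoding step, so `DECODED` equals `MESSAGE` and the state returns to its initial value of 1. Frequent symbols grow the state by fewer bits than rare ones, which is where the compression comes from.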

Preprint, 2021, arXiv

Improving Inference for Neural Image Compression

We consider the problem of lossy image compression with deep latent variable models. State-of-the-art methods build on hierarchical variational autoencoders (VAEs) and learn inference networks to predict a compressible latent representation of each data point. Drawing on the variational inference perspective on compression, we identify three approximation gaps which limit performance in the conventional approach: an amortization gap, a discretization gap, and a marginalization gap. We propose remedies for each of these three limitations based on ideas related to iterative inference, stochastic annealing for discrete optimization, and bits-back coding, resulting in the first application of bits-back coding to lossy compression. In our experiments, which include extensive baseline comparisons and ablation studies, we achieve new state-of-the-art performance on lossy image compression using an established VAE architecture, by changing only the inference method.

Preprint, 2020, arXiv

Extreme Classification via Adversarial Softmax Approximation

Training a classifier over a large number of classes, known as 'extreme classification', has become a topic of major interest with applications in technology, science, and e-commerce. Traditional softmax regression induces a gradient cost proportional to the number of classes $C$, which often is prohibitively expensive. A popular scalable softmax approximation relies on uniform negative sampling, which suffers from slow convergence due to a poor signal-to-noise ratio. In this paper, we propose a simple training method for drastically enhancing the gradient signal by drawing negative samples from an adversarial model that mimics the data distribution. Our contributions are three-fold: (i) an adversarial sampling mechanism that produces negative samples at a cost only logarithmic in $C$, thus still resulting in cheap gradient updates; (ii) a mathematical proof that this adversarial sampling minimizes the gradient variance while any bias due to non-uniform sampling can be removed; (iii) experimental results on large scale data sets that show a reduction of the training time by an order of magnitude relative to several competitive baselines.

Preprint, 2020, arXiv

Machine Learning in Thermodynamics: Prediction of Activity Coefficients by Matrix Completion

Activity coefficients, which are a measure of the non-ideality of liquid mixtures, are a key property in chemical engineering with relevance to modeling chemical and phase equilibria as well as transport processes. Although experimental data on thousands of binary mixtures are available, prediction methods are needed to calculate the activity coefficients in many relevant mixtures that have not been explored to date. In this report, we propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures. Although no physical descriptors for the considered components were used, our method outperforms the state-of-the-art method that has been refined over three decades while requiring much less training effort. This opens perspectives to novel methods for predicting physico-chemical properties of binary mixtures with the potential to revolutionize modeling and simulation in chemical engineering.

Preprint, 2020, arXiv

Variational Bayesian Quantization

We propose a novel algorithm for quantizing continuous latent representations in trained models. Our approach applies to deep probabilistic models, such as variational autoencoders (VAEs), and enables both data and model compression. Unlike current end-to-end neural compression methods that cater the model to a fixed quantization scheme, our algorithm separates model design and training from quantization. Consequently, our algorithm enables "plug-and-play" compression with variable rate-distortion trade-off, using a single trained model. Our algorithm can be seen as a novel extension of arithmetic coding to the continuous domain, and uses adaptive quantization accuracy based on estimates of posterior uncertainty. Our experimental results demonstrate the importance of taking into account posterior uncertainties, and show that image compression with the proposed algorithm outperforms JPEG over a wide range of bit rates using only a single standard VAE. Further experiments on Bayesian neural word embeddings demonstrate the versatility of the proposed method.

Preprint, 2019, arXiv

Tightening Bounds for Variational Inference by Revisiting Perturbation Theory

Variational inference has become one of the most widely used methods in latent variable modeling. In its basic form, variational inference employs a fully factorized variational distribution and minimizes its KL divergence to the posterior. As the minimization can only be carried out approximately, this approximation induces a bias. In this paper, we revisit perturbation theory as a powerful way of improving the variational approximation. Perturbation theory relies on a form of Taylor expansion of the log marginal likelihood, vaguely in terms of the log ratio of the true posterior and its variational approximation. While first order terms give the classical variational bound, higher-order terms yield corrections that tighten it. However, traditional perturbation theory does not provide a lower bound, making it inapt for stochastic optimization. In this paper, we present a similar yet alternative way of deriving corrections to the ELBO that resemble perturbation theory, but that result in a valid bound. We show in experiments on Gaussian Processes and Variational Autoencoders that the new bounds are more mass covering, and that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on held-out data.
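
For orientation, the classical variational bound referred to in this abstract is the standard evidence lower bound (ELBO). The notation below ($x$ for observed data, $z$ for latent variables, $q$ for the variational distribution) is assumed here for illustration and is not taken from the paper; the paper's higher-order corrections are not reproduced.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log p(x,z) - \log q(z)\right]}_{\mathrm{ELBO}(q)}
  + \mathrm{KL}\!\left(q(z)\,\middle\|\,p(z \mid x)\right)
  \;\geq\; \mathrm{ELBO}(q)
```

Because the KL divergence is non-negative and $\log p(x)$ does not depend on $q$, maximizing the ELBO over $q$ is equivalent to minimizing the KL divergence from $q$ to the true posterior.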

Preprint, 2018, arXiv

Improving Optimization for Models With Continuous Symmetry Breaking

Many loss functions in representation learning are invariant under a continuous symmetry transformation. For example, the loss function of word embeddings (Mikolov et al., 2013) remains unchanged if we simultaneously rotate all word and context embedding vectors. We show that representation learning models for time series possess an approximate continuous symmetry that leads to slow convergence of gradient descent. We propose a new optimization algorithm that speeds up convergence using ideas from gauge theory in physics. Our algorithm leads to orders of magnitude faster convergence and to more interpretable representations, as we show for dynamic extensions of matrix factorization and word embedding models. We further present an example application of our proposed algorithm that translates modern words into their historic equivalents.

Preprint, 2015, arXiv

Equilibration and Approximate Conservation Laws: Dipole Oscillations and Perfect Drag of Ultracold Atoms in a Harmonic Trap

The presence of (approximate) conservation laws can prohibit the fast relaxation of interacting many-particle quantum systems. We investigate this physics by studying the center-of-mass oscillations of two species of fermionic ultracold atoms in a harmonic trap. If their trap frequencies are equal, a dynamical symmetry (spectrum generating algebra), closely related to Kohn's theorem, prohibits the relaxation of center-of-mass oscillations. A small detuning $δω$ of the trap frequencies for the two species breaks the dynamical symmetry and ultimately leads to a damping of dipole oscillations driven by inter-species interactions. Using memory-matrix methods, we calculate the relaxation as a function of frequency difference, particle number, temperature, and strength of inter-species interactions. When interactions dominate, there is almost perfect drag between the two species and the dynamical symmetry is approximately restored. The drag can either arise from Hartree potentials or from friction. In the latter case (hydrodynamic limit), the center-of-mass oscillations decay with a tiny rate, $1/τ\propto (δω)^2/Γ$, where $Γ$ is a single particle scattering rate.

Preprint, 2013, arXiv

Phase-Space Berry Phases in Chiral Magnets: Dzyaloshinskii-Moriya Interaction and the Charge of Skyrmions

The semiclassical motion of electrons in phase space, x=(R, k), is influenced by Berry phases described by a 6-component vector potential, A=(A^R, A^k). In chiral magnets, Dzyaloshinskii-Moriya (DM) interactions induce slowly varying magnetic textures (helices and skyrmion lattices) for which all components of A are important, effectively inducing a curvature in mixed position and momentum space. We show that for smooth textures and weak spin-orbit coupling, phase-space Berry curvatures determine the DM interactions and give important contributions to the charge. Using ab initio methods, we calculate the strength of DM interactions in MnSi, in good agreement with experiment, and estimate the charge of skyrmions.
