Researcher profile

Pavel Sountsov

Pavel Sountsov contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

Focusing on Difficult Directions for Learning HMC Trajectory Lengths

Hamiltonian Monte Carlo (HMC) is a premier Markov Chain Monte Carlo (MCMC) algorithm for continuous target distributions. Its full potential can only be unleashed when its problem-dependent hyperparameters are tuned well. The adaptation of one such hyperparameter, trajectory length ($τ$), has been closely examined by many research programs with the No-U-Turn Sampler (NUTS) coming out as the preferred method in 2011. A decade later, the evolving hardware profile has lead to the proliferation of personal and cloud based SIMD hardware in the form of Graphics and Tensor Processing Units (GPUs, TPUs) which are hostile to certain algorithmic details of NUTS. This has opened up a hole in the MCMC toolkit for an algorithm that can learn $τ$ while maintaining good hardware utilization. In this work we build on recent advances along this direction and introduce SNAPER-HMC, a SIMD-accelerator-friendly adaptive-MCMC scheme for learning $τ$. The algorithm maximizes an upper bound on per-gradient effective sample size along an estimated principal component. We empirically show that SNAPER-HMC is stable when combined with mass-matrix adaptation, and is tolerant of certain pathological target distribution covariance spectra while providing excellent long and short run sampling efficiency. We provide a complete implementation for continuous multi-chain adaptive HMC combining trajectory learning with standard step-size and mass-matrix adaptation in one turnkey inference package.

preprint2022arXiv

MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC

Learning energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in high-dimensional data space is generally not mixing, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space. This is a serious handicap for both theory and practice of EBMs. In this paper, we propose to learn an EBM with a flow-based model (or in general a latent variable model) serving as a backbone, so that the EBM is a correction or an exponential tilting of the flow-based model. We show that the model has a particularly simple form in the space of the latent variables of the backbone model, and MCMC sampling of the EBM in the latent space mixes well and traverses modes in the data space. This enables proper sampling and learning of EBMs.

preprint2020arXiv

Hamiltonian Monte Carlo Swindles

Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) algorithm for estimating expectations with respect to continuous un-normalized probability distributions. MCMC estimators typically have higher variance than classical Monte Carlo with i.i.d. samples due to autocorrelations; most MCMC research tries to reduce these autocorrelations. In this work, we explore a complementary approach to variance reduction based on two classical Monte Carlo "swindles": first, running an auxiliary coupled chain targeting a tractable approximation to the target distribution, and using the auxiliary samples as control variates; and second, generating anti-correlated ("antithetic") samples by running two chains with flipped randomness. Both ideas have been explored previously in the context of Gibbs samplers and random-walk Metropolis algorithms, but we argue that they are ripe for adaptation to HMC in light of recent coupling results from the HMC theory literature. For many posterior distributions, we find that these swindles generate effective sample sizes orders of magnitude larger than plain HMC, as well as being more efficient than analogous swindles for Metropolis-adjusted Langevin algorithm and random-walk Metropolis.

preprint2020arXiv

tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware

Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations that motivated its design.