Source author record

Trevor Campbell

Trevor Campbell appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Machine Learning, Computation, math.ST, Statistics Theory, Methodology, Computer Vision, Applications, math.PR

Catalog footprint

15 works, 8 topics, 4 close collaborators

preprint, 2026, arXiv

Asymptotically exact variational flows via involutive MCMC kernels

Most expressive variational families -- such as normalizing flows -- lack practical convergence guarantees, as their theoretical assurances typically hold only at the intractable global optimum. In this work, we present a general recipe for constructing tuning-free, asymptotically exact variational flows on arbitrary state spaces from involutive MCMC kernels. The core methodological component is a novel representation of general involutive MCMC kernels as invertible, measure-preserving iterated random function systems, which act as the flow maps of our variational flows. This leads to three new variational families with provable total variation convergence. Our framework resolves key practical limitations of existing variational families with similar guarantees (e.g., MixFlows), while requiring substantially weaker theoretical assumptions. Finally, we demonstrate the competitive performance of our flows across tasks including posterior approximation, Monte Carlo estimates, and normalization constant estimation, outperforming or matching No-U-Turn sampler (NUTS) and black-box normalizing flows.
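The abstract relies on involutive maps, i.e. maps that undo themselves when applied twice. As a minimal sketch of that property only (not the paper's construction), the snippet below assumes a one-dimensional standard normal target and uses the familiar leapfrog integrator followed by a momentum flip. The function names, step size, and number of steps are illustrative assumptions.

```python
def grad_neg_log_density(x):
    # Assumed target: standard normal, so -log p(x) = x^2 / 2 and its gradient is x.
    return x


def leapfrog_flip(x, p, step=0.1, n_steps=5):
    """Run leapfrog integration, then negate the momentum.

    The composition is an involution: applying it twice returns the
    starting (position, momentum) pair up to floating-point error.
    """
    for _ in range(n_steps):
        p = p - 0.5 * step * grad_neg_log_density(x)  # half momentum step
        x = x + step * p                              # full position step
        p = p - 0.5 * step * grad_neg_log_density(x)  # half momentum step
    return x, -p


x0, p0 = 0.7, -1.3
x1, p1 = leapfrog_flip(x0, p0)   # move away from the start
x2, p2 = leapfrog_flip(x1, p1)   # applying the map again comes back
```

Here `(x2, p2)` agrees with `(x0, p0)` up to rounding, which is the self-inverse behaviour that makes such maps usable as invertible building blocks.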

preprint, 2023, arXiv

Bayesian inference via sparse Hamiltonian flows

A Bayesian coreset is a small, weighted subset of data that replaces the full dataset during Bayesian inference, with the goal of reducing computational cost. Although past work has shown empirically that there often exists a coreset with low inferential error, efficiently constructing such a coreset remains a challenge. Current methods tend to be slow, require a secondary inference step after coreset construction, and do not provide bounds on the data marginal evidence. In this work, we introduce a new method -- sparse Hamiltonian flows -- that addresses all three of these challenges. The method involves first subsampling the data uniformly, and then optimizing a Hamiltonian flow parametrized by coreset weights and including periodic momentum quasi-refreshment steps. Theoretical results show that the method enables an exponential compression of the dataset in a representative model, and that the quasi-refreshment steps reduce the KL divergence to the target. Real and synthetic experiments demonstrate that sparse Hamiltonian flows provide accurate posterior approximations with significantly reduced runtime compared with competing dynamical-system-based inference methods.

preprint, 2023, arXiv

Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run-time or the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data, and then optimizes the weights using a novel quasi-Newton method. Our algorithm is a simple-to-implement, black-box method that does not require the user to specify a low-cost posterior approximation. It is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that our method provides significant improvements in coreset quality against alternatives with comparable construction times, with far less storage cost and user input required.
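The first stage described in the abstract, a uniformly random subset with weights, can be sketched as follows. This is an illustration under assumptions: the initial weight `n_data / coreset_size` is one natural choice that makes the weighted sum an unbiased estimate of the full-data log-likelihood, the quasi-Newton weight refinement is omitted entirely, and all names are hypothetical.

```python
import random


def uniform_coreset(n_data, coreset_size, seed=0):
    """Pick a uniform random subset and give every point the same weight."""
    rng = random.Random(seed)
    indices = rng.sample(range(n_data), coreset_size)
    # Assumed initialisation: each kept point stands in for n_data / coreset_size points.
    weights = [n_data / coreset_size] * coreset_size
    return indices, weights


def weighted_log_likelihood(log_lik, data, indices, weights, theta):
    """Coreset log-likelihood: a weighted sum over the selected points only."""
    return sum(w * log_lik(data[i], theta) for i, w in zip(indices, weights))


def gaussian_log_lik(x, theta):
    # Unit-variance Gaussian log-likelihood, up to an additive constant.
    return -0.5 * (x - theta) ** 2


data = [0.1 * k for k in range(100)]
indices, weights = uniform_coreset(len(data), coreset_size=10)
approx = weighted_log_likelihood(gaussian_log_lik, data, indices, weights, theta=5.0)
```

A refinement step would then adjust `weights` so that the coreset posterior tracks the full posterior more closely than this uniform starting point does.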

preprint, 2023, arXiv

Parallel Tempering With a Variational Reference

Sampling from complex target distributions is a challenging task fundamental to Bayesian inference. Parallel tempering (PT) addresses this problem by constructing a Markov chain on the expanded state space of a sequence of distributions interpolating between the posterior distribution and a fixed reference distribution, which is typically chosen to be the prior. However, in the typical case where the prior and posterior are nearly mutually singular, PT methods are computationally prohibitive. In this work we address this challenge by constructing a generalized annealing path connecting the posterior to an adaptively tuned variational reference. The reference distribution is tuned to minimize the forward (inclusive) KL divergence to the posterior distribution using a simple, gradient-free moment-matching procedure. We show that our adaptive procedure converges to the forward KL minimizer, and that the forward KL divergence serves as a good proxy to a previously developed measure of PT performance. We also show that in the large-data limit in typical Bayesian models, the proposed method improves in performance, while traditional PT deteriorates arbitrarily. Finally, we introduce PT with two references -- one fixed, one variational -- with a novel split annealing path that ensures stable variational reference adaptation. The paper concludes with experiments that demonstrate the large empirical gains achieved by our method in a wide range of realistic Bayesian inference scenarios.
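Two ingredients named in the abstract can be sketched in a few lines. Within a Gaussian family, the forward (inclusive) KL minimizer is the Gaussian whose mean and variance match those of the target, so tuning the reference from samples reduces to moment matching. The annealing path shown is the standard linear interpolation of log densities, which is an assumption here; the paper's generalized and split paths are not reproduced, and all names are hypothetical.

```python
import math


def moment_match_gaussian(samples):
    """Fit a 1-D Gaussian reference by matching the sample mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


def gaussian_log_density(x, mean, var):
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def annealed_log_density(x, beta, log_reference, log_target):
    """Linear annealing path: the reference at beta = 0, the target at beta = 1."""
    return (1.0 - beta) * log_reference(x) + beta * log_target(x)


# Usage: tune the reference from (hypothetical) posterior samples, then anneal.
mean, var = moment_match_gaussian([1.0, 2.0, 3.0])
log_ref = lambda x: gaussian_log_density(x, mean, var)
log_tgt = lambda x: -abs(x - 2.0)  # stand-in unnormalized target
midpoint = annealed_log_density(2.5, 0.5, log_ref, log_tgt)
```

No gradients are needed for the fit, which is consistent with the gradient-free tuning the abstract describes.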

preprint, 2022, arXiv

Conditional Permutation Invariant Flows

We present a novel, conditional generative probabilistic model of set-valued data with a tractable log density. This model is a continuous normalizing flow governed by permutation equivariant dynamics. These dynamics are driven by a learnable per-set-element term and pairwise interactions, both parametrized by deep neural networks. We illustrate the utility of this model via applications including (1) complex traffic scene generation conditioned on visually specified map information, and (2) object bounding box generation conditioned directly on images. We train our model by maximizing the expected likelihood of labeled conditional data under our flow, with the aid of a penalty that ensures the dynamics are smooth and hence efficiently solvable. Our method significantly outperforms non-permutation invariant baselines in terms of log likelihood and domain-specific metrics (offroad, collision, and combined infractions), yielding realistic samples that are difficult to distinguish from real data.
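The structure of the dynamics, a per-element term plus pairwise interactions, is what makes them permutation equivariant: reordering the set reorders the output in the same way. The sketch below checks this on scalar set elements with fixed stand-in functions. In the paper both terms are deep neural networks and elements are vectors, so the two functions here are purely hypothetical placeholders.

```python
import math


def per_element_term(x):
    # Hypothetical stand-in for the learned per-element network.
    return -x


def pairwise_term(xi, xj):
    # Hypothetical stand-in for the learned pairwise interaction network.
    return math.tanh(xj - xi)


def dynamics(xs):
    """Velocity of each element: its own term plus a sum over all other elements."""
    return [
        per_element_term(xi)
        + sum(pairwise_term(xi, xj) for j, xj in enumerate(xs) if j != i)
        for i, xi in enumerate(xs)
    ]


xs = [0.3, -1.2, 2.0]
perm = [2, 0, 1]
velocity = dynamics(xs)
velocity_of_permuted = dynamics([xs[k] for k in perm])
# Equivariance: permuting the input permutes the output identically.
is_equivariant = all(
    abs(velocity_of_permuted[a] - velocity[perm[a]]) < 1e-12 for a in range(len(xs))
)
```

Because the pairwise contribution is a sum over the other elements, the result does not depend on the order in which they are listed.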

preprint, 2022, arXiv

Local Exchangeability

Exchangeability -- in which the distribution of an infinite sequence is invariant to reorderings of its elements -- implies the existence of a simple conditional independence structure that may be leveraged in the design of statistical models and inference procedures. In this work, we study a relaxation of exchangeability in which this invariance need not hold precisely. We introduce the notion of local exchangeability -- where swapping data associated with nearby covariates causes a bounded change in the distribution. We prove that locally exchangeable processes correspond to independent observations from an underlying measure-valued stochastic process. Using this main probabilistic result, we show that the local empirical measure of a finite collection of observations provides an approximation of the underlying measure-valued process and Bayesian posterior predictive distributions. The paper concludes with applications of the main theoretical results to a model from Bayesian nonparametrics and covariate-dependent permutation tests.

preprint, 2020, arXiv

Slice Sampling for General Completely Random Measures

Completely random measures provide a principled approach to creating flexible unsupervised models, where the number of latent features is infinite and the number of features that influence the data grows with the size of the data set. Due to the infinite number of latent features, posterior inference requires either marginalization -- resulting in dependence structures that prevent efficient computation via parallelization and conjugacy -- or finite truncation, which arbitrarily limits the flexibility of the model. In this paper we present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables, enabling efficient, parallelized computation without sacrificing flexibility. In contrast to past work that achieved this on a model-by-model basis, we provide a general recipe that is applicable to the broad class of completely random measure-based priors. The efficacy of the proposed algorithm is evaluated on several popular nonparametric models, demonstrating a higher effective sample size per second compared to algorithms using marginalization as well as a higher predictive performance compared to models employing fixed truncations.

preprint, 2020, arXiv

Validated Variational Inference via Practical Posterior Error Bounds

Variational inference has become an increasingly attractive fast alternative to Markov chain Monte Carlo methods for approximate Bayesian inference. However, a major obstacle to the widespread use of variational methods is the lack of post-hoc accuracy measures that are both theoretically justified and computationally efficient. In this paper, we provide rigorous bounds on the error of posterior mean and uncertainty estimates that arise from full-distribution approximations, as in variational inference. Our bounds are widely applicable, as they require only that the approximating and exact posteriors have polynomial moments. Our bounds are also computationally efficient for variational inference because they require only standard values from variational objectives, straightforward analytic calculations, and simple Monte Carlo estimates. We show that our analysis naturally leads to a new and improved workflow for validated variational inference. Finally, we demonstrate the utility of our proposed workflow and error bounds on a robust regression problem and on a real-data example with a widely used multilevel hierarchical model.

preprint, 2019, arXiv

Truncated Random Measures

Completely random measures (CRMs) and their normalizations are a rich source of Bayesian nonparametric priors. Examples include the beta, gamma, and Dirichlet processes. In this paper we detail two major classes of sequential CRM representations -- series representations and superposition representations -- within which we organize both novel and existing sequential representations that can be used for simulation and posterior inference. These two classes and their constituent representations subsume existing ones that have previously been developed in an ad hoc manner for specific processes. Since a complete infinite-dimensional CRM cannot be used explicitly for computation, sequential representations are often truncated for tractability. We provide truncation error analyses for each type of sequential representation, as well as their normalized versions, thereby generalizing and improving upon existing truncation error bounds in the literature. We analyze the computational complexity of the sequential representations, which in conjunction with our error bounds allows us to directly compare representations and discuss their relative efficiency. We include numerous applications of our theoretical results to commonly-used (normalized) CRMs, demonstrating that our results enable a straightforward representation and analysis of CRMs that has not previously been available in a Bayesian nonparametric context.
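To make the idea of truncating a sequential representation concrete, the sketch below uses the familiar stick-breaking construction of the Dirichlet process, chosen here only as a well-known example and not taken from the paper. Sticks are drawn as Beta(1, alpha) variables, and the mass left over after `truncation` atoms is what a truncation discards; its expectation is `(alpha / (1 + alpha)) ** truncation`. Function names are hypothetical.

```python
import random


def truncated_stick_breaking(alpha, truncation, seed=0):
    """Draw the first `truncation` stick-breaking weights and the leftover mass."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


def expected_remaining_mass(alpha, truncation):
    """Expected mass discarded by truncation: each stick leaves alpha / (1 + alpha) on average."""
    return (alpha / (1.0 + alpha)) ** truncation


weights, remaining = truncated_stick_breaking(alpha=2.0, truncation=20)
```

The expected leftover mass decays geometrically in the truncation level, which gives a simple sense of how large a truncation a given accuracy requires in this particular example.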

preprint, 2018, arXiv

Exchangeable Trait Allocations

Trait allocations are a class of combinatorial structures in which data may belong to multiple groups and may have different levels of belonging in each group. Often the data are also exchangeable, i.e., their joint distribution is invariant to reordering. In clustering -- a special case of trait allocation -- exchangeability implies the existence of both a de Finetti representation and an exchangeable partition probability function (EPPF), distributional representations useful for computational and theoretical purposes. In this work, we develop the analogous de Finetti representation and exchangeable trait probability function (ETPF) for trait allocations, along with a characterization of all trait allocations with an ETPF. Unlike previous feature allocation characterizations, our proofs fully capture single-occurrence "dust" groups. We further introduce a novel constrained version of the ETPF that we use to establish an intuitive connection between the probability functions for clustering, feature allocations, and trait allocations. As an application of our general theory, we characterize the distribution of all edge-exchangeable graphs, a class of recently-developed models that captures realistic sparse graph sequences.

preprint, 2016, arXiv

Efficient Global Point Cloud Alignment using Bayesian Nonparametric Mixtures

Point cloud alignment is a common problem in computer vision and robotics, with applications ranging from 3D object recognition to reconstruction. We propose a novel approach to the alignment problem that utilizes Bayesian nonparametrics to describe the point cloud and surface normal densities, and branch and bound (BB) optimization to recover the relative transformation. BB uses a novel, refinable, near-uniform tessellation of rotation space using 4D tetrahedra, leading to more efficient optimization compared to the common axis-angle tessellation. We provide objective function bounds for pruning given the proposed tessellation, and prove that BB converges to the optimum of the cost function along with providing its computational complexity. Finally, we empirically demonstrate the efficiency of the proposed approach as well as its robustness to real-world conditions such as missing data and partial overlap.

preprint, 2016, arXiv

Small-Variance Nonparametric Clustering on the Hypersphere

Structural regularities in man-made environments reflect in the distribution of their surface normals. Describing these surface normal distributions is important in many computer vision applications, such as scene understanding, plane segmentation, and regularization of 3D reconstructions. Based on the small-variance limit of Bayesian nonparametric von-Mises-Fisher (vMF) mixture distributions, we propose two new flexible and efficient k-means-like clustering algorithms for directional data such as surface normals. The first, DP-vMF-means, is a batch clustering algorithm derived from the Dirichlet process (DP) vMF mixture. Recognizing the sequential nature of data collection in many applications, we extend this algorithm to DDP-vMF-means, which infers temporally evolving cluster structure from streaming data. Both algorithms naturally respect the geometry of directional data, which lies on the unit sphere. We demonstrate their performance on synthetic directional data and real 3D surface normals from RGB-D sensors. While our experiments focus on 3D data, both algorithms generalize to high dimensional directional data such as protein backbone configurations and semantic word vectors.
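A k-means-like loop on the unit sphere, in the spirit of the batch algorithm described above, might look like the following. This is a simplified sketch rather than the paper's DP-vMF-means objective: points are assigned by cosine similarity, a new cluster is opened when no centre is similar enough, and centres are updated to the normalized sum of their members. The threshold `min_similarity` (assumed positive) and all function names are hypothetical.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sphere_means(points, min_similarity, n_iters=10):
    """Cluster unit vectors; open a new cluster when similarity falls below the threshold."""
    centers = [points[0]]
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: nearest centre by cosine similarity, or a new cluster.
        for i, x in enumerate(points):
            sims = [dot(x, c) for c in centers]
            best = max(range(len(centers)), key=sims.__getitem__)
            if sims[best] < min_similarity:
                centers.append(x)
                best = len(centers) - 1
            labels[i] = best
        # Update step: each centre becomes the mean direction of its members.
        for k in range(len(centers)):
            members = [points[i] for i in range(len(points)) if labels[i] == k]
            if members:
                centers[k] = normalize([sum(c) for c in zip(*members)])
    return labels, centers


points = [normalize(v) for v in ([1, 0.1, 0], [1, -0.1, 0], [0, 0.1, 1], [0, -0.1, 1])]
labels, centers = sphere_means(points, min_similarity=0.8)
```

Normalizing the summed members keeps every centre on the sphere, which is how the loop respects the geometry of directional data.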

preprint, 2015, arXiv

Streaming, Distributed Variational Inference for Bayesian Nonparametrics

This paper presents a methodology for creating streaming, distributed inference algorithms for Bayesian nonparametric (BNP) models. In the proposed framework, processing nodes receive a sequence of data minibatches, compute a variational posterior for each, and make asynchronous streaming updates to a central model. In contrast to previous algorithms, the proposed framework is truly streaming, distributed, asynchronous, learning-rate-free, and truncation-free. The key challenge in developing the framework, arising from the fact that BNP models do not impose an inherent ordering on their components, is finding the correspondence between minibatch and central BNP posterior components before performing each update. To address this, the paper develops a combinatorial optimization problem over component correspondences, and provides an efficient solution technique. The paper concludes with an application of the methodology to the DP mixture model, with experimental results demonstrating its practical scalability and performance.

preprint, 2014, arXiv

Approximate Decentralized Bayesian Inference

This paper presents an approximate method for performing Bayesian inference in models with conditional independence over a decentralized network of learning agents. The method first employs variational inference on each individual learning agent to generate a local approximate posterior, the agents transmit their local posteriors to other agents in the network, and finally each agent combines its set of received local posteriors. The key insight in this work is that, for many Bayesian models, approximate inference schemes destroy symmetry and dependencies in the model that are crucial to the correct application of Bayes' rule when combining the local posteriors. The proposed method addresses this issue by including an additional optimization step in the combination procedure that accounts for these broken dependencies. Experiments on synthetic and real data demonstrate that the decentralized method provides advantages in computational performance and predictive test likelihood over previous batch and distributed methods.
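For reference, the plain Bayes-rule combination that the abstract says can go wrong has a simple closed form when everything is Gaussian. With conditionally independent data, the full posterior is proportional to the product of the local posteriors divided by the prior raised to the power n - 1, so precisions and precision-weighted means add. The sketch below implements only this baseline on 1-D Gaussians; the paper's additional optimization step for broken symmetries is not shown, and the function name is hypothetical.

```python
def combine_gaussian_posteriors(prior, local_posteriors):
    """Combine 1-D Gaussian local posteriors, each given as a (mean, variance) pair.

    Every local posterior already contains the prior once, so the prior's
    natural parameters are subtracted (n - 1) times to avoid over-counting it.
    """
    n = len(local_posteriors)
    prior_precision = 1.0 / prior[1]
    prior_eta = prior[0] * prior_precision
    precision = sum(1.0 / v for _, v in local_posteriors) - (n - 1) * prior_precision
    eta = sum(m / v for m, v in local_posteriors) - (n - 1) * prior_eta
    return eta / precision, 1.0 / precision


# Prior N(0, 1) with a unit-variance likelihood: one agent observes 2.0, another 4.0.
prior = (0.0, 1.0)
local_posteriors = [(1.0, 0.5), (2.0, 0.5)]
combined_mean, combined_var = combine_gaussian_posteriors(prior, local_posteriors)
```

In this conjugate example the combination recovers the exact posterior given both observations, with mean 2 and variance 1/3; the abstract's point is that approximate local posteriors in richer models do not combine this cleanly.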

preprint, 2013, arXiv

Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture

This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a low-variance asymptotic analysis of the Gibbs sampling algorithm for the DDPMM, and provides a hard clustering with convergence guarantees similar to those of the k-means algorithm. Empirical results from a synthetic test with moving Gaussian clusters and a test with real ADS-B aircraft trajectory data demonstrate that the algorithm requires orders of magnitude less computational time than contemporary probabilistic and hard clustering algorithms, while providing higher accuracy on the examined datasets.
