Source author record

Gareth O. Roberts

Gareth O. Roberts appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation, math.PR, Methodology, math.ST, Statistics Theory, math.NA, Populations and Evolution, Applications, Machine Learning, physics.data-an, physics.soc-ph, Social and Information Networks

Catalog footprint

What is connected

29 works

12 topics

4 close collaborators

preprint 2026 arXiv

Scalability of Metropolis-within-Gibbs schemes for high-dimensional Bayesian models

We study general coordinate-wise MCMC schemes (such as Metropolis-within-Gibbs samplers), which are commonly used to fit Bayesian non-conjugate hierarchical models. We relate their convergence properties to those of the corresponding (potentially not implementable) Gibbs sampler through the notion of conditional conductance. This allows us to study the performance of popular Metropolis-within-Gibbs schemes for non-conjugate hierarchical models in high-dimensional regimes where both the number of datapoints and the number of parameters increase. Under random data-generating assumptions, we establish dimension-free convergence results, which are in close accordance with numerical evidence. Applications to Bayesian models for binary regression with unknown hyperparameters and discretely observed diffusions are also discussed. Motivated by such statistical applications, auxiliary results of independent interest on approximate conductances and perturbation of Markov operators are provided.

preprint 2026 arXiv

Sub-Cauchy Sampling: Escaping the Dark Side of the Moon

We introduce a Markov chain Monte Carlo algorithm based on Sub-Cauchy Projection, a geometric transformation that generalizes stereographic projection by mapping Euclidean space into a spherical cap of a hyper-sphere, referred to as the complement of the dark side of the moon. We prove that our proposed method is uniformly ergodic for sub-Cauchy targets, namely targets whose tails are at most as heavy as a multidimensional Cauchy distribution, and show empirically its performance for challenging high-dimensional problems. The simplicity and broad applicability of our approach open new opportunities for Bayesian modeling and computation with heavy-tailed distributions in settings where most existing methods are unreliable.

preprint 2022 arXiv

Dimension-free Mixing for High-dimensional Bayesian Variable Selection

Yang et al. (2016) proved that the symmetric random walk Metropolis--Hastings algorithm for Bayesian variable selection is rapidly mixing under mild high-dimensional assumptions. We propose a novel MCMC sampler using an informed proposal scheme, which we prove achieves a much faster mixing time that is independent of the number of covariates, under the same assumptions. To the best of our knowledge, this is the first high-dimensional result which rigorously shows that the mixing rate of informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation. Motivated by the theoretical analysis of our sampler, we further propose a new approach called "two-stage drift condition" to studying convergence rates of Markov chains on general state spaces, which can be useful for obtaining tight complexity bounds in high-dimensional settings. The practical advantages of our algorithm are illustrated by both simulation studies and real data analysis.

preprint 2022 arXiv

Optimal Scaling of MCMC Beyond Metropolis

The problem of optimally scaling the proposal distribution in a Markov chain Monte Carlo algorithm is critical to the quality of the generated samples. Much work has gone into obtaining such results for various Metropolis-Hastings (MH) algorithms. Recently, acceptance probabilities other than MH have been employed in problems with intractable target distributions. Little guidance is available on tuning the Gaussian proposal distributions in this situation. We obtain optimal scaling results for a general class of acceptance functions, which includes Barker's and Lazy-MH. In particular, optimal values for Barker's algorithm are derived and found to be significantly different from those obtained for the MH algorithm. Our theoretical conclusions are supported by numerical simulations indicating that when the optimal proposal variance is unknown, tuning to the optimal acceptance probability remains an effective strategy.
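
As a rough illustration of what an acceptance function other than MH looks like, the sketch below compares the Metropolis-Hastings rule min(1, r) with Barker's rule r / (1 + r) inside a one-dimensional Gaussian random walk. The standard normal target, the function names and the proposal scale are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def mh_accept(ratio):
    """Metropolis-Hastings acceptance probability for density ratio pi(y)/pi(x)."""
    return min(1.0, ratio)


def barker_accept(ratio):
    """Barker's acceptance probability for the same ratio."""
    return ratio / (1.0 + ratio)


def random_walk(accept, n_steps, scale, seed=0):
    """Run a 1-D random walk sampler on a standard normal target (an assumed toy target).

    Returns the chain and the observed acceptance rate.
    """
    rng = random.Random(seed)
    x, accepted, chain = 0.0, 0, []
    for _ in range(n_steps):
        y = x + scale * rng.gauss(0.0, 1.0)        # symmetric Gaussian proposal
        ratio = math.exp(0.5 * (x * x - y * y))    # pi(y) / pi(x) for N(0, 1)
        if rng.random() < accept(ratio):
            x, accepted = y, accepted + 1
        chain.append(x)
    return chain, accepted / n_steps


if __name__ == "__main__":
    for name, rule in [("MH", mh_accept), ("Barker", barker_accept)]:
        _, rate = random_walk(rule, n_steps=20000, scale=2.4)
        print(name, "acceptance rate:", round(rate, 3))
```

Both rules satisfy a(r) = r a(1/r), which is what keeps the target invariant. Barker's rule never accepts more often than MH for the same ratio, so at a fixed proposal scale its acceptance rate is lower.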

preprint 2022 arXiv

The computational cost of blocking for sampling discretely observed diffusions

Many approaches for conducting Bayesian inference on discretely observed diffusions involve imputing diffusion bridges between observations. This can be computationally challenging in settings in which the temporal horizon between subsequent observations is large, due to the poor scaling of algorithms for simulating bridges as observation distance increases. It is common in practical settings to use a blocking scheme, in which the path is split into a (user-specified) number of overlapping segments and a Gibbs sampler is employed to update segments in turn. Substituting the independent simulation of diffusion bridges for one obtained using blocking introduces an inherent trade-off: we are now imputing shorter bridges at the cost of introducing a dependency between subsequent iterations of the bridge sampler. This is further complicated by the fact that there are a number of possible ways to implement the blocking scheme, each of which introduces a different dependency structure between iterations. Although blocking schemes have had considerable empirical success in practice, there has been no analysis of this trade-off nor guidance to practitioners on the particular specifications that should be used to obtain a computationally efficient implementation. In this article we conduct this analysis and demonstrate that the expected computational cost of a blocked path-space rejection sampler applied to Brownian bridges scales asymptotically at a cubic rate with respect to the observation distance and that this rate is linear in the case of the Ornstein-Uhlenbeck process. Numerical experiments suggest applicability both of the results of our paper and of the guidance we provide beyond the class of linear diffusions considered.

preprint 2021 arXiv

Rao-Blackwellization in the MCMC era

Rao-Blackwellization is a notion often occurring in the MCMC literature, with possibly different meanings and connections with the original Rao-Blackwell theorem (Rao, 1945; Blackwell, 1947), including a reduction of the variance of the resulting Monte Carlo approximations. This survey reviews some of the meanings of the term.

preprint 2020 arXiv

An epidemic model for an evolving pathogen with strain-dependent immunity

Between pandemics, the influenza virus exhibits periods of incremental evolution via a process known as antigenic drift. This process gives rise to a sequence of strains of the pathogen that are continuously replaced by newer strains, preventing a build-up of immunity in the host population. In this paper, a parsimonious epidemic model is defined that attempts to capture the dynamics of evolving strains within a host population. The 'evolving strains' epidemic model has many properties that lie in between the Susceptible-Infected-Susceptible and the Susceptible-Infected-Removed epidemic models, due to the fact that individuals can only be infected by each strain once, but remain susceptible to reinfection by newly emerged strains. Coupling results are used to identify key properties, such as the time to extinction. A range of reproduction numbers are explored to characterize the model, including a novel quasi-stationary reproduction number that can be used to describe the re-emergence of the pathogen into a population with 'average' levels of strain immunity, analogous to the beginning of the winter peak in influenza. Finally, the quasi-stationary distribution of the evolving strains model is explored via simulation.

preprint 2020 arXiv

Optimal Scaling of Random-Walk Metropolis Algorithms on General Target Distributions

One main limitation of the existing optimal scaling results for Metropolis-Hastings algorithms is that the assumptions on the target distribution are unrealistic. In this paper, we consider optimal scaling of random-walk Metropolis algorithms on general target distributions in high dimensions arising from practical MCMC models from Bayesian statistics. For optimal scaling by maximizing expected squared jumping distance (ESJD), we show the asymptotically optimal acceptance rate $0.234$ can be obtained under general realistic sufficient conditions on the target distribution. The new sufficient conditions are easy to verify and may hold for some general classes of MCMC models arising from Bayesian statistics applications, which substantially generalize the product i.i.d. condition required in most of the existing optimal scaling literature. Furthermore, we show one-dimensional diffusion limits can be obtained under slightly stronger conditions, which still allow dependent coordinates of the target distribution. We also connect the new diffusion limit results to complexity bounds of Metropolis algorithms in high dimensions.
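
To make the $0.234$ figure concrete, here is a minimal sketch of random-walk Metropolis in the classical product i.i.d. setting that the paper generalizes. The standard normal product target, the proposal scale ell / sqrt(d) with ell = 2.38 (the value commonly quoted for this Gaussian case), and all function names are assumptions made for illustration. The closed form 2 Φ(-ell / 2) is the limiting acceptance rate for this toy case only.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a product of standard normals, an assumed toy target."""
    return -0.5 * sum(v * v for v in x)


def rwm_acceptance_rate(dim, ell, n_steps, seed=0):
    """Random-walk Metropolis with proposal N(x, (ell^2 / dim) I); returns the acceptance rate."""
    rng = random.Random(seed)
    sigma = ell / math.sqrt(dim)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # start in stationarity
    lp = log_target(x)
    accepted = 0
    for _ in range(n_steps):
        y = [v + sigma * rng.gauss(0.0, 1.0) for v in x]
        lq = log_target(y)
        if math.log(1.0 - rng.random()) < lq - lp:  # accept with prob min(1, pi(y)/pi(x))
            x, lp, accepted = y, lq, accepted + 1
    return accepted / n_steps


def limiting_acceptance(ell):
    """Asymptotic acceptance rate 2 * Phi(-ell / 2) for the i.i.d. standard normal case."""
    return math.erfc(ell / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    print("empirical:", rwm_acceptance_rate(dim=50, ell=2.38, n_steps=5000))
    print("limit    :", limiting_acceptance(2.38))
```

In moderate dimension the empirical rate is only expected to be close to the limit, not equal to it, which is why tuning to a target acceptance rate is usually treated as a rule of thumb.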

preprint 2020 arXiv

Quasi-stationary Monte Carlo and the ScaLE Algorithm

This paper introduces a class of Monte Carlo algorithms which are based upon the simulation of a Markov process whose quasi-stationary distribution coincides with a distribution of interest. This differs fundamentally from, say, current Markov chain Monte Carlo methods which simulate a Markov chain whose stationary distribution is the target. We show how to approximate distributions of interest by carefully combining sequential Monte Carlo methods with methodology for the exact simulation of diffusions. The methodology introduced here is particularly promising in that it is applicable to the same class of problems as gradient-based Markov chain Monte Carlo algorithms but entirely circumvents the need to conduct Metropolis-Hastings type accept/reject steps whilst retaining exactness: the paper gives theoretical guarantees ensuring the algorithm has the correct limiting target distribution. Furthermore, this methodology is highly amenable to big data problems. By employing a modification to existing naïve sub-sampling and control variate techniques it is possible to obtain an algorithm which is still exact but has sub-linear iterative cost as a function of data size.

preprint 2019 arXiv

An approximation scheme for quasi-stationary distributions of killed diffusions

In this paper we study the asymptotic behavior of the normalized weighted empirical occupation measures of a diffusion process on a compact manifold which is killed at a smooth rate and then regenerated at a random location, distributed according to the weighted empirical occupation measure. We show that the weighted occupation measures almost surely comprise an asymptotic pseudo-trajectory for a certain deterministic measure-valued semiflow, after suitably rescaling the time, and that with probability one they converge to the quasi-stationary distribution of the killed diffusion. These results provide theoretical justification for a scalable quasi-stationary Monte Carlo method for sampling from Bayesian posterior distributions.

preprint 2018 arXiv

Accelerating Parallel Tempering: Quantile Tempering Algorithm (QuanTA)

Using MCMC to sample from a target distribution $π(x)$ on a $d$-dimensional state space can be a difficult and computationally expensive problem. In particular, when the target exhibits multimodality, traditional methods can fail to explore the entire state space, which results in biased sample output. Methods to overcome this issue include the parallel tempering algorithm, which utilises an augmented state space approach to help the Markov chain traverse regions of low probability density and reach other modes. This method suffers from the curse of dimensionality, which dramatically slows the transfer of mixing information from the auxiliary targets to the target of interest as $d \rightarrow \infty$. This paper introduces a novel prototype algorithm, QuanTA, that uses a Gaussian-motivated transformation in an attempt to accelerate the mixing through the temperature schedule of a parallel tempering algorithm. This new algorithm is accompanied by a comprehensive theoretical analysis quantifying the improved efficiency and scalability of the approach, concluding that under weak regularity conditions the new approach gives accelerated mixing through the temperature schedule. Empirical evidence of the effectiveness of this new algorithm is illustrated on canonical examples.

preprint 2016 arXiv

Fast Langevin based algorithm for MCMC in high dimensions

We introduce new Gaussian proposals to improve the efficiency of the standard Hastings-Metropolis algorithm in Markov chain Monte Carlo (MCMC) methods, used for sampling from a target distribution in large dimension $d$. The improved complexity is $\mathcal{O}(d^{1/5})$ compared to the complexity $\mathcal{O}(d^{1/3})$ of the standard approach. We prove an asymptotic diffusion limit theorem and show that the relative efficiency of the algorithm can be characterised by its overall acceptance rate (with asymptotic value 0.704), independently of the target distribution. Numerical experiments confirm our theoretical findings.

preprint 2016 arXiv

Hitting Time and Convergence Rate Bounds for Symmetric Langevin Diffusions

We provide quantitative bounds on the convergence to stationarity of real-valued Langevin diffusions with symmetric target densities.

preprint 2016 arXiv

On the exact and $\varepsilon$-strong simulation of (jump) diffusions

This paper introduces a framework for simulating finite dimensional representations of (jump) diffusion sample paths over finite intervals, without discretisation error (exactly), in such a way that the sample path can be restored at any finite collection of time points. Within this framework we extend existing exact algorithms and introduce novel adaptive approaches. We consider an application of the methodology developed within this paper which allows the simulation of upper and lower bounding processes which almost surely constrain (jump) diffusion sample paths to any specified tolerance. We demonstrate the efficacy of our approach by showing that with finite computation it is possible to determine whether or not sample paths cross various irregular barriers, simulate to any specified tolerance the first hitting time of the irregular barrier and simulate killed diffusion sample paths.

preprint 2016 arXiv

Optimal scaling of the Random Walk Metropolis algorithm under Lp mean differentiability

This paper considers the optimal scaling problem for high-dimensional random walk Metropolis algorithms for densities which are differentiable in Lp mean but which may be irregular at some points (like the Laplace density for example) and/or are supported on an interval. Our main result is the weak convergence of the Markov chain (appropriately rescaled in time and space) to a Langevin diffusion process as the dimension d goes to infinity. Because the log-density might be non-differentiable, the limiting diffusion could be singular. The scaling limit is established under assumptions which are much weaker than those used in the original derivation of [6]. This result has important practical implications for the use of random walk Metropolis algorithms in Bayesian frameworks based on sparsity-inducing priors.

preprint 2015 arXiv

Stability of adversarial Markov chains, with an application to adaptive MCMC algorithms

We consider whether ergodic Markov chains with bounded step size remain bounded in probability when their transitions are modified by an adversary on a bounded subset. We provide counterexamples to show that the answer is no in general, and prove theorems to show that the answer is yes under various additional assumptions. We then use our results to prove convergence of various adaptive Markov chain Monte Carlo algorithms.

preprint 2015 arXiv

Stability of Noisy Metropolis-Hastings

Pseudo-marginal Markov chain Monte Carlo methods for sampling from intractable distributions have gained recent interest and have been theoretically studied in considerable depth. Their main appeal is that they are exact, in the sense that they target marginally the correct invariant distribution. However, the pseudo-marginal Markov chain can exhibit poor mixing and slow convergence towards its target. As an alternative, a subtly different Markov chain can be simulated, where better mixing is possible but the exactness property is sacrificed. This is the noisy algorithm, initially conceptualised as Monte Carlo within Metropolis (MCWM), which has also been studied but to a lesser extent. The present article provides a further characterisation of the noisy algorithm, with a focus on fundamental stability properties like positive recurrence and geometric ergodicity. Sufficient conditions for inheriting geometric ergodicity from a standard Metropolis-Hastings chain are given, as well as convergence of the invariant distribution towards the true target distribution.

preprint 2014 arXiv

Complexity Bounds for MCMC via Diffusion Limits

We connect known results about diffusion limits of Markov chain Monte Carlo (MCMC) algorithms to the Computer Science notion of algorithm complexity. Our main result states that any diffusion limit of a Markov process implies a corresponding complexity bound (in an appropriate metric). We then combine this result with previously-known MCMC diffusion limit results to prove that under appropriate assumptions, the Random-Walk Metropolis (RWM) algorithm in $d$ dimensions takes $O(d)$ iterations to converge to stationarity, while the Metropolis-Adjusted Langevin Algorithm (MALA) takes $O(d^{1/3})$ iterations to converge to stationarity.

preprint 2014 arXiv

Minimising MCMC variance via diffusion limits, with an application to simulated tempering

We derive new results comparing the asymptotic variance of diffusions by writing them as appropriate limits of discrete-time birth-death chains which themselves satisfy Peskun orderings. We then apply our results to simulated tempering algorithms to establish which choice of inverse temperatures minimises the asymptotic variance of all functionals and thus leads to the most efficient MCMC algorithm.

preprint 2014 arXiv

On the efficiency of pseudo-marginal random walk Metropolis algorithms

We examine the behaviour of the pseudo-marginal random walk Metropolis algorithm, where evaluations of the target density for the accept/reject probability are estimated rather than computed precisely. Under relatively general conditions on the target distribution, we obtain limiting formulae for the acceptance rate and for the expected squared jump distance, as the dimension of the target approaches infinity, under the assumption that the noise in the estimate of the log-target is additive and is independent of the position. For targets with independent and identically distributed components, we also obtain a limiting diffusion for the first component. We then consider the overall efficiency of the algorithm, in terms of both speed of mixing and computational time. Assuming the additive noise is Gaussian and is inversely proportional to the number of unbiased estimates that are used, we prove that the algorithm is optimally efficient when the variance of the noise is approximately 3.283 and the acceptance rate is approximately 7.001%. We also find that the optimal scaling is insensitive to the noise and that the optimal variance of the noise is insensitive to the scaling. The theory is illustrated with a simulation study using the particle marginal random walk Metropolis.

preprint 2014 arXiv

Systematic Physics Constrained Parameter Estimation of Stochastic Differential Equations

A systematic Bayesian framework is developed for physics constrained parameter inference of stochastic differential equations (SDE) from partial observations. The physical constraints are derived for stochastic climate models but are applicable for many fluid systems. A condition is derived for global stability of stochastic climate models based on energy conservation. Stochastic climate models are globally stable when a quadratic form, which is related to the cubic nonlinear operator, is negative definite. A new algorithm is developed for the efficient sampling of such negative definite matrices, and also for imputing unobserved data, which improves the accuracy of the parameter estimates. The performance of this framework is evaluated on two conceptual climate models.

preprint 2014 arXiv

Unbiased Monte Carlo: posterior estimation for intractable/infinite-dimensional models

We provide a general methodology for unbiased estimation for intractable stochastic models. We consider situations where the target distribution can be written as an appropriate limit of distributions, and where conventional approaches require truncation of such a representation leading to a systematic bias. For example, the target distribution might be representable as the $L^2$-limit of a basis expansion in a suitable Hilbert space; or alternatively the distribution of interest might be representable as the weak limit of a sequence of random variables, as in MCMC. Our main motivation comes from infinite-dimensional models which can be parameterised in terms of a series expansion of basis functions (such as that given by a Karhunen-Loève expansion). We consider schemes for direct unbiased estimation along such an expansion, as well as those based on MCMC schemes which, due to their dimensionality, cannot be directly implemented, but which can be effectively estimated unbiasedly. For all our methods we give theory to justify the numerical stability for robust Monte Carlo implementation, and in some cases we illustrate using simulations. Interestingly, the computational efficiency of our methods is usually comparable to simpler methods which are biased. Crucial to the effectiveness of our proposed methodology is the construction of appropriate couplings, many of which resonate strongly with the Monte Carlo constructions used in the coupling from the past algorithm and its variants.

preprint 2013 arXiv

Adaptive Gibbs samplers and related MCMC methods

We consider various versions of adaptive Gibbs and Metropolis-within-Gibbs samplers, which update their selection probabilities (and perhaps also their proposal distributions) on the fly during a run by learning as they go in an attempt to optimize the algorithm. We present a cautionary example of how even a simple-seeming adaptive Gibbs sampler may fail to converge. We then present various positive results guaranteeing convergence of adaptive Gibbs samplers under certain conditions.

preprint 2012 arXiv

Markov chain Monte Carlo for exact inference for diffusions

We develop exact Markov chain Monte Carlo methods for discretely-sampled, directly and indirectly observed diffusions. The qualification "exact" refers to the fact that the invariant and limiting distribution of the Markov chains is the posterior distribution of the parameters free of any discretisation error. The class of processes to which our methods directly apply are those which can be simulated using the most general to date exact simulation algorithm. The article introduces various methods to boost the performance of the basic scheme, including reparametrisations and auxiliary Poisson sampling. We contrast both theoretically and empirically how this new approach compares to irreducible high frequency imputation, which is the state-of-the-art alternative for the class of processes we consider, and we uncover intriguing connections. All methods discussed in the article are tested on typical examples.

preprint 2011 arXiv

CLTs and asymptotic variance of time-sampled Markov chains

For a Markov transition kernel $P$ and a probability distribution $μ$ on nonnegative integers, a time-sampled Markov chain evolves according to the transition kernel $P_μ = \sum_k μ(k) P^k$. In this note we obtain CLT conditions for time-sampled Markov chains and derive a spectral formula for the asymptotic variance. Using these results we compare efficiency of Barker's and Metropolis algorithms in terms of asymptotic variance.
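
The definition of the time-sampled kernel can be checked numerically on a small finite state space. The sketch below builds $P_μ$ for a hypothetical two-state kernel and a hypothetical sampling law on {0, 1, 2} steps; the matrices and helper names are illustrative assumptions, not objects from the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def time_sampled_kernel(p, mu):
    """Return P_mu = sum_k mu[k] * P^k, where mu[k] is the probability of taking k steps."""
    n = len(p)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0 = I
    result = [[0.0] * n for _ in range(n)]
    for weight in mu:
        for i in range(n):
            for j in range(n):
                result[i][j] += weight * power[i][j]
        power = mat_mul(power, p)  # advance to the next power of P
    return result


# Hypothetical two-state kernel and a sampling law on {0, 1, 2} steps.
P = [[0.9, 0.1],
     [0.2, 0.8]]
mu = [0.25, 0.5, 0.25]
P_mu = time_sampled_kernel(P, mu)
```

Because $P_μ$ is a mixture of powers of $P$, it is again a stochastic matrix and it keeps every stationary distribution of $P$; here that distribution is (2/3, 1/3).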

preprint 2010 arXiv

Latent diffusion models for survival analysis

We consider Bayesian hierarchical models for survival analysis, where the survival times are modeled through an underlying diffusion process which determines the hazard rate. We show how these models can be efficiently treated by means of Markov chain Monte Carlo techniques.

preprint 2010 arXiv

Networks and the Epidemiology of Infectious Disease

The science of networks has revolutionised research into the dynamics of interacting elements. It could be argued that epidemiology in particular has embraced the potential of network theory more than any other discipline. Here we review the growing body of research concerning the spread of infectious diseases on networks, focusing on the interplay between network theory and epidemiology. The review is split into four main sections, which examine: the types of network relevant to epidemiology; the multitude of ways these networks can be characterised; the statistical methods that can be applied to infer the epidemiological parameters on a realised network; and finally simulation and analytical methods to determine epidemic dynamics on a given network. Given the breadth of areas covered and the ever-expanding number of publications, a comprehensive review of all work is impossible. Instead, we provide a personalised overview into the areas of network epidemiology that have seen the greatest progress in recent years or have the greatest potential to provide novel insights. As such, considerable importance is placed on analytical approaches and statistical methods which are both rapidly expanding fields. Throughout this review we restrict our attention to epidemiological issues.

preprint 2010 arXiv

Optimal tuning of the Hybrid Monte-Carlo Algorithm

We investigate the properties of the Hybrid Monte-Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible with respect to a given target distribution $Π$ by using separable Hamiltonian dynamics with potential $-\log Π$. The additional momentum variables are chosen at random from the Boltzmann distribution and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis-Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an $\mathcal{O}(1)$ acceptance probability as the dimension $d$ of the state space tends to $\infty$, the leapfrog step-size $h$ should be scaled as $h = l \times d^{-1/4}$. Therefore, in high dimensions, HMC requires $\mathcal{O}(d^{1/4})$ steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be 0.651 (to three decimal places). This is the choice which optimally balances the cost of generating a proposal, which decreases as $l$ increases, against the cost related to the average number of proposals required to obtain acceptance, which increases as $l$ increases.
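
The ingredients named in this abstract (random momentum, leapfrog discretisation, Metropolis-Hastings correction, and the step-size scaling $h = l \times d^{-1/4}$) can be sketched in a few lines. The standard Gaussian target, unit mass matrix and function names below are assumptions chosen to keep the example short; this is not the paper's experimental setup.

```python
import math
import random


def step_size(ell, dim):
    """Leapfrog step size h = ell * dim^(-1/4), the scaling stated in the abstract."""
    return ell * dim ** (-0.25)


def leapfrog(q, p, h, n_steps):
    """Leapfrog integrator for potential U(q) = |q|^2 / 2, so grad U(q) = q (assumed toy target)."""
    q, p = list(q), list(p)
    p = [pi - 0.5 * h * qi for pi, qi in zip(p, q)]          # initial half step in momentum
    for step in range(n_steps):
        q = [qi + h * pi for qi, pi in zip(q, p)]            # full step in position
        scale = h if step < n_steps - 1 else 0.5 * h         # final momentum update is a half step
        p = [pi - scale * qi for pi, qi in zip(p, q)]
    return q, p


def hamiltonian(q, p):
    """Total energy: potential |q|^2 / 2 plus kinetic |p|^2 / 2."""
    return 0.5 * sum(v * v for v in q) + 0.5 * sum(v * v for v in p)


def hmc_step(q, ell, n_leapfrog, rng):
    """One HMC transition: fresh Gaussian momentum, leapfrog proposal, Metropolis-Hastings test."""
    h = step_size(ell, len(q))
    p = [rng.gauss(0.0, 1.0) for _ in q]
    q_new, p_new = leapfrog(q, p, h, n_leapfrog)
    log_accept = hamiltonian(q, p) - hamiltonian(q_new, p_new)
    if math.log(1.0 - rng.random()) < log_accept:
        return q_new, True
    return q, False
```

A chain is obtained by calling `hmc_step` repeatedly with a shared `random.Random` instance. The acceptance test uses only the energy error of the leapfrog trajectory, which is why the step size has to shrink with dimension to keep the acceptance probability bounded away from zero.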

preprint 2010 arXiv

The Random Walk Metropolis: Linking Theory and Practice Through a Case Study

The random walk Metropolis (RWM) is one of the most common Markov chain Monte Carlo algorithms in practical use today. Its theoretical properties have been extensively explored for certain classes of target, and a number of results with important practical implications have been derived. This article draws together a selection of new and existing key results and concepts and describes their implications. The impact of each new idea on algorithm efficiency is demonstrated for the practical example of the Markov modulated Poisson process (MMPP). A reparameterization of the MMPP which leads to a highly efficient RWM-within-Gibbs algorithm in certain circumstances is also presented.
