Source author record

Jimmy Olsson

Jimmy Olsson appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computation, math.ST, Statistics Theory, Machine Learning, Methodology, Discrete Mathematics, math.CO, math.PR, physics.med-ph

Catalog footprint

What is connected

17 works

9 topics

4 close collaborators


Preprint, 2023, arXiv

State and parameter learning with PaRIS particle Gibbs

Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother (PaRIS) is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao-Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.

Preprint, 2022, arXiv

A similarity-based Bayesian mixture-of-experts model

We present a new nonparametric mixture-of-experts model for multivariate regression problems, inspired by the probabilistic k-nearest neighbors algorithm. Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point, yielding predictive distributions represented by Gaussian mixtures. Posterior inference is performed on the parameters of the mixture components as well as the distance metric using a mean-field variational Bayes algorithm accompanied with a stochastic gradient-based optimization procedure. The proposed method is especially advantageous in settings where inputs are of relatively high dimension in comparison to the data size, where input-output relationships are complex, and where predictive distributions may be skewed or multimodal. Computational studies on five datasets, of which two are synthetically generated, illustrate clear advantages of our mixture-of-experts method for high-dimensional inputs, outperforming competitor models both in terms of validation metrics and visual inspection.

Preprint, 2022, arXiv

Adaptive online variance estimation in particle filters: the ALVar estimator

We present a new approach, the ALVar estimator, to the estimation of asymptotic variance in sequential Monte Carlo methods, or particle filters. The method, which adaptively adjusts the lag of the estimator proposed in [Olsson, J. and Douc, R. (2019). Numerically stable online estimation of variance in particle filters. Bernoulli, 25(2), pp. 1504-1535], applies to very general distribution flows and particle filters, including auxiliary particle filters with adaptive resampling. The algorithm operates entirely online, in the sense that it is able to monitor the variance of the particle filter in real time and with, on average, constant computational complexity and memory requirements per iteration. Crucially, it does not require the calibration of any algorithmic parameter. Estimating the variance only on the basis of the genealogy of the propagated particle cloud, without additional simulations, the routine requires only minor code additions to the underlying particle algorithm. Finally, we prove that the ALVar estimator is consistent for the true asymptotic variance as the number of particles tends to infinity and illustrate numerically its superiority to existing approaches.

Preprint, 2022, arXiv

BR-SNIS: Bias Reduced Self-Normalized Importance Sampling

Importance Sampling (IS) is a method for approximating expectations under a target distribution using independent samples from a proposal distribution and the associated importance weights. In many applications, the target distribution is known only up to a normalization constant, in which case self-normalized IS (SNIS) can be used. While the use of self-normalization can have a positive effect on the dispersion of the estimator, it introduces bias. In this work, we propose a new method, BR-SNIS, whose complexity is essentially the same as that of SNIS and which significantly reduces bias without increasing the variance. This method is a wrapper in the sense that it uses the same proposal samples and importance weights as SNIS, but makes clever use of iterated sampling--importance resampling (ISIR) to form a bias-reduced version of the estimator. We furnish the proposed algorithm with rigorous theoretical results, including new bias, variance and high-probability bounds, and these are illustrated by numerical examples.
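
The self-normalised estimator that the abstract starts from can be sketched as follows. This is a minimal illustration of plain SNIS only, not of BR-SNIS itself; the Gaussian target and proposal, the function name `snis_estimate` and the sample size are all assumptions chosen for the example.

```python
import math
import random


def snis_estimate(f, log_target_unnorm, sample_proposal, log_proposal, n, rng):
    """Self-normalised importance sampling estimate of E_target[f(X)]."""
    xs = [sample_proposal(rng) for _ in range(n)]
    # Log importance weights; the target only needs to be known up to a constant.
    log_w = [log_target_unnorm(x) - log_proposal(x) for x in xs]
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    # Ratio of two Monte Carlo sums: consistent, but biased for finite n.
    return sum(wi * f(x) for wi, x in zip(w, xs)) / total


# Hypothetical setup: target N(1, 1) known up to a constant, proposal N(0, 2^2).
def log_target_unnorm(x):
    return -0.5 * (x - 1.0) ** 2


def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


def log_proposal(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
estimate = snis_estimate(lambda x: x, log_target_unnorm, sample_proposal,
                         log_proposal, 20000, rng)
print(estimate)  # should be close to the target mean, 1.0
```

Because the weights are normalised by their own sum, any constant factor in the target or the proposal cancels. The price is that the estimator is a ratio of two random sums, which is the source of the bias the abstract refers to.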

Preprint, 2022, arXiv

Probabilistic Pareto plan generation for semiautomated multicriteria radiation therapy treatment planning

Objective: We propose a semiautomatic pipeline for radiation therapy treatment planning, combining ideas from machine learning-automated planning and multicriteria optimization (MCO). Approach: Using knowledge extracted from historically delivered plans, prediction models for spatial dose and dose statistics are trained and furthermore systematically modified to simulate changes in tradeoff priorities, creating a set of differently biased predictions. Based on the predictions, an MCO problem is subsequently constructed using previously developed dose mimicking functions, designed in such a way that its Pareto surface spans the range of clinically acceptable yet realistically achievable plans as exactly as possible. The result is an algorithm outputting a set of Pareto optimal plans, either fluence-based or machine parameter-based, which the user can navigate between in real time to make adjustments before a final deliverable plan is created. Main results: Numerical experiments performed on a dataset of prostate cancer patients show that one may often navigate to a better plan than one produced by a single-plan-output algorithm. Significance: We demonstrate the potential of merging MCO and a data-driven workflow to automate labor-intensive parts of the treatment planning process while maintaining a certain extent of manual control for the user.

Preprint, 2021, arXiv

Sequential sampling of junction trees for decomposable graphs

The junction-tree representation provides an attractive structural property for organizing a decomposable graph. In this study, we present two novel stochastic algorithms, which we call the junction-tree expander and junction-tree collapser for sequential sampling of junction trees for decomposable graphs. We show that recursive application of the junction-tree expander, expanding incrementally the underlying graph with one vertex at a time, has full support on the space of junction trees with any given number of underlying vertices. On the other hand, the junction-tree collapser provides a complementary operation for removing vertices in the underlying decomposable graph of a junction tree, while maintaining the junction tree property. A direct application of our suggested algorithms is demonstrated in a sequential-Monte-Carlo setting designed for sampling from distributions on spaces of decomposable graphs. Numerical studies illustrate the utility of the proposed algorithms for combinatorial computations on decomposable graphs and junction trees. All the methods proposed in the paper are implemented in the Python library trilearn.

Preprint, 2016, arXiv

An efficient particle-based online EM algorithm for general state-space models

Estimating the parameters of general state-space models is a topic of importance for many scientific and engineering disciplines. In this paper we present an online parameter estimation algorithm obtained by casting our recently proposed particle-based, rapid incremental smoother (PaRIS) into the framework of online expectation-maximization (EM) for state-space models proposed by Cappé (2011). Previous such particle-based implementations of online EM suffer typically from either the well-known degeneracy of the genealogical particle paths or a quadratic complexity in the number of particles. However, by using the computationally efficient and numerically stable PaRIS algorithm for estimating smoothed expectations of time-averaged sufficient statistics of the model we obtain a fast algorithm with very limited memory requirements and a computational complexity that grows only linearly with the number of particles. The efficiency of the algorithm is illustrated in a simulation study.

Preprint, 2016, arXiv

Efficient parameter inference in general hidden Markov models using the filter derivatives

Estimating online the parameters of general state-space hidden Markov models is a topic of importance in many scientific and engineering disciplines. In this paper we present an online parameter estimation algorithm obtained by casting our recently proposed particle-based, rapid incremental smoother (PaRIS) into the framework of recursive maximum likelihood estimation for general hidden Markov models. Previous such particle implementations suffer from either quadratic complexity in the number of particles or from the well-known degeneracy of the genealogical particle paths. By using the computationally efficient and numerically stable PaRIS algorithm for estimating the needed prediction filter derivatives, we obtain a fast algorithm with a computational complexity that grows only linearly with the number of particles. The efficiency and stability of the proposed algorithm are illustrated in a simulation study.

Preprint, 2016, arXiv

Posterior consistency for partially observed Markov models

In this work we establish the posterior consistency for a parametrized family of partially observed, fully dominated Markov models. As a main assumption, we suppose that the prior distribution assigns positive probability to all neighborhoods of the true parameter, for a distance induced by the expected Kullback-Leibler divergence between the family members' Markov transition densities. This assumption is easily checked in general. In addition, under some additional, mild assumptions we show that the posterior consistency is implied by the consistency of the maximum likelihood estimator. The latter has recently been established also for models with non-compact state space. The result is then extended to possibly non-compact parameter spaces and non-stationary observations. Finally, we check our assumptions on examples including the partially observed Gaussian linear model with correlated noise and a widely used stochastic volatility model.

Preprint, 2014, arXiv

Comparison of asymptotic variances of inhomogeneous Markov chains with application to Markov chain Monte Carlo methods

In this paper, we study the asymptotic variance of sample path averages for inhomogeneous Markov chains that evolve alternatingly according to two different $\pi$-reversible Markov transition kernels $P$ and $Q$. More specifically, our main result allows us to compare directly the asymptotic variances of two inhomogeneous Markov chains associated with different kernels $P_i$ and $Q_i$, $i\in\{0,1\}$, as soon as the kernels of each pair $(P_0,P_1)$ and $(Q_0,Q_1)$ can be ordered in the sense of lag-one autocovariance. As an important application, we use this result for comparing different data-augmentation-type Metropolis-Hastings algorithms. In particular, we compare some pseudo-marginal algorithms and propose a novel exact algorithm, referred to as the random refreshment algorithm, which is more efficient, in terms of asymptotic variance, than the Grouped Independence Metropolis-Hastings algorithm and has a computational complexity that does not exceed that of the Monte Carlo Within Metropolis algorithm.

Preprint, 2014, arXiv

Efficient particle-based online smoothing in general hidden Markov models: the PaRIS algorithm

This paper presents a novel algorithm, the particle-based, rapid incremental smoother (PaRIS), for efficient online approximation of smoothed expectations of additive state functionals in general hidden Markov models. The algorithm, which has a linear computational complexity under weak assumptions and very limited memory requirements, is furnished with a number of convergence results, including a central limit theorem. An interesting feature of PaRIS, which samples on-the-fly from the retrospective dynamics induced by the particle filter, is that it requires two or more backward draws per particle in order to cope with degeneracy of the sampled trajectories and to stay numerically stable in the long run with an asymptotic variance that grows only linearly with time.
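
The update described above can be sketched roughly as follows, for a linear Gaussian model and the additive functional given by the sum of past states. Everything specific here is an assumption made for illustration: the model, its parameters, the function name `paris_sum_of_states` and the choice of functional. The sketch also draws backward indices directly from the full backward distribution, which costs a quadratic number of operations per step, so it does not reproduce the linear complexity claimed in the abstract.

```python
import math
import random


def obs_weights(y, particles, sigma_y):
    """Importance weights from a Gaussian observation density (max-normalised)."""
    log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
    m = max(log_w)
    return [math.exp(lw - m) for lw in log_w]


def paris_sum_of_states(ys, n, n_back, a, sigma_x, sigma_y, rng):
    """Online estimate of E[X_0 + ... + X_{T-1} | y_0..y_T], T = len(ys) - 1."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed N(0, 1) prior
    weights = obs_weights(ys[0], particles, sigma_y)
    tau = [0.0] * n  # one running statistic per particle
    for y in ys[1:]:
        # Bootstrap particle filter step: resample, then propagate.
        ancestors = rng.choices(particles, weights=weights, k=n)
        new_particles = [a * x + sigma_x * rng.gauss(0.0, 1.0) for x in ancestors]
        new_tau = []
        for x_new in new_particles:
            # Backward weights: filter weight times transition density to x_new.
            back_w = [w * math.exp(-0.5 * ((x_new - a * x) / sigma_x) ** 2)
                      for w, x in zip(weights, particles)]
            # n_back >= 2 backward draws per particle, as the abstract requires.
            idx = rng.choices(range(n), weights=back_w, k=n_back)
            # Additive term h(x_old, x_new) = x_old accumulates the sum of states.
            new_tau.append(sum(tau[j] + particles[j] for j in idx) / n_back)
        particles, tau = new_particles, new_tau
        weights = obs_weights(y, particles, sigma_y)
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, tau)) / total


# Simulate a short trajectory from the same (hypothetical) model.
rng = random.Random(2)
a, sigma_x, sigma_y = 0.9, 1.0, 0.5
xs, ys = [], []
x = rng.gauss(0.0, 1.0)
for _ in range(30):
    xs.append(x)
    ys.append(x + sigma_y * rng.gauss(0.0, 1.0))
    x = a * x + sigma_x * rng.gauss(0.0, 1.0)

estimate = paris_sum_of_states(ys, 200, 2, a, sigma_x, sigma_y, rng)
true_sum = sum(xs[:-1])
print(estimate, true_sum)
```

Only the current particles, their weights and one statistic per particle are stored, which reflects the limited memory requirements mentioned in the abstract.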

Preprint, 2014, arXiv

Long-term stability of sequential Monte Carlo methods under verifiable conditions

This paper discusses particle filtering in general hidden Markov models (HMMs) and presents novel theoretical results on the long-term stability of bootstrap-type particle filters. More specifically, we establish that the asymptotic variance of the Monte Carlo estimates produced by the bootstrap filter is uniformly bounded in time. In contrast to most previous results of this type, which in general presuppose that the state space of the hidden state process is compact (an assumption that is rarely satisfied in practice), our very mild assumptions are satisfied for a large class of HMMs with possibly noncompact state space. In addition, we derive a similar time uniform bound on the asymptotic $\mathsf{L}^p$ error. Importantly, our results hold for misspecified models; that is, we do not at all assume that the data entering into the particle filter originate from the model governing the dynamics of the particles, or even from an HMM.
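
For readers unfamiliar with the bootstrap filter whose stability is analysed here, a minimal sketch follows. The linear Gaussian model, its parameter values, the N(0, 1) initial distribution and the function name `bootstrap_filter` are assumptions made for the example and are not taken from the paper.

```python
import math
import random


def bootstrap_filter(ys, n, a, sigma_x, sigma_y, rng):
    """Return particle estimates of the filter means E[X_t | y_0..y_t]."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed N(0, 1) prior
    weights = None
    means = []
    for t, y in enumerate(ys):
        if t > 0:
            # Resample ancestors by weight, then propagate through the state
            # dynamics (the "bootstrap" proposal).
            ancestors = rng.choices(particles, weights=weights, k=n)
            particles = [a * x + sigma_x * rng.gauss(0.0, 1.0) for x in ancestors]
        # Weight each particle by the Gaussian observation density of y.
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        m = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - m) for lw in log_w]
        total = sum(weights)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
    return means


# Simulate data from the same (hypothetical) model and run the filter.
rng = random.Random(1)
a, sigma_x, sigma_y = 0.9, 1.0, 0.5
xs, ys = [], []
x = rng.gauss(0.0, 1.0)
for _ in range(50):
    xs.append(x)
    ys.append(x + sigma_y * rng.gauss(0.0, 1.0))
    x = a * x + sigma_x * rng.gauss(0.0, 1.0)

filter_means = bootstrap_filter(ys, 500, a, sigma_x, sigma_y, rng)
mean_abs_error = sum(abs(m - x) for m, x in zip(filter_means, xs)) / len(xs)
print(mean_abs_error)
```

Note that the filter only ever evaluates the observation density at the supplied data, so it can be run on any observation sequence, including one that was not generated by the assumed model.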

Preprint, 2014, arXiv

On the use of Markov chain Monte Carlo methods for the sampling of mixture models

In this paper we study asymptotic properties of different data-augmentation-type Markov chain Monte Carlo algorithms sampling from mixture models comprising discrete as well as continuous random variables. Of particular interest to us is the situation where sampling from the conditional distribution of the continuous component given the discrete component is infeasible. In this context, we cast Carlin & Chib's pseudo-prior method into the framework of mixture models and discuss and compare different variants of this scheme. We propose a novel algorithm, the FCC sampler, which is less computationally demanding than any Metropolised Carlin & Chib-type algorithm. The significant gain of computational efficiency is however obtained at the cost of some asymptotic variance. The performance of the algorithm vis-à-vis alternative schemes is investigated theoretically, using some recent results obtained in [3] for inhomogeneous Markov chains evolving alternatingly according to two different reversible Markov transition kernels, as well as numerically.

Preprint, 2012, arXiv

Sequential Monte Carlo smoothing for general state space hidden Markov models

Computing smoothing distributions, that is, the distributions of one or more states conditional on past, present, and future observations, is a recurring problem when operating on general hidden Markov models. The aim of this paper is to provide a foundation of particle-based approximation of such distributions and to analyze, in a common unifying framework, different schemes producing such approximations. In this setting, general convergence results, including exponential deviation inequalities and central limit theorems, are established. In particular, time uniform bounds on the marginal smoothing error are obtained under appropriate mixing conditions on the transition kernel of the latent chain. In addition, we propose an algorithm approximating the joint smoothing distribution at a cost that grows only linearly with the number of particles.

Preprint, 2011, arXiv

Consistency of the maximum likelihood estimator for general hidden Markov models

Consider a parametrized family of general hidden Markov models, where both the observed and unobserved components take values in a complete separable metric space. We prove that the maximum likelihood estimator (MLE) of the parameter is strongly consistent under a rather minimal set of assumptions. As special cases of our main result, we obtain consistency in a large class of nonlinear state space models, as well as general results on linear Gaussian state space models and finite state models. A novel aspect of our approach is an information-theoretic technique for proving identifiability, which does not require an explicit representation for the relative entropy rate. Our method of proof could therefore form a foundation for the investigation of MLE consistency in more general dependent and non-Markovian time series. Also of independent interest is a general concentration inequality for $V$-uniformly ergodic Markov chains.

Preprint, 2010, arXiv

Metropolising forward particle filtering backward sampling and Rao-Blackwellisation of Metropolised particle smoothers

Smoothing in state-space models amounts to computing the conditional distribution of the latent state trajectory, given observations, or expectations of functionals of the state trajectory with respect to this distribution. For models that are not linear Gaussian or do not possess a finite state space, smoothing distributions are in general infeasible to compute as they involve integrals over a space of dimensionality at least equal to the number of observations. Recent years have seen an increased interest in Monte Carlo-based methods for smoothing, often involving particle filters. One such method is to approximate filter distributions with a particle filter, and then to simulate backwards on the trellis of particles using a backward kernel. We show that by supplementing this procedure with a Metropolis-Hastings step deciding whether to accept a proposed trajectory or not, one obtains a Markov chain Monte Carlo scheme whose stationary distribution is the exact smoothing distribution. We also show that in this procedure, backward sampling can be replaced by backward smoothing, which effectively means averaging over all possible trajectories. In an example we compare these approaches to a similar one recently proposed by Andrieu, Doucet and Holenstein, and show that the new methods can be more efficient in terms of precision (inverse variance) per computation time.

Preprint, 2010, arXiv

Particle-based likelihood inference in partially observed diffusion processes using generalised Poisson estimators

This paper concerns the use of the expectation-maximisation (EM) algorithm for inference in partially observed diffusion processes. In this context, a well-known problem is that all except a few diffusion processes lack closed-form expressions of the transition densities. Thus, in order to estimate efficiently the EM intermediate quantity we construct, using novel techniques for unbiased estimation of diffusion transition densities, a random weight fixed-lag auxiliary particle smoother, which avoids the well-known problem of particle trajectory degeneracy in the smoothing mode. The estimator is justified theoretically and demonstrated on a simulated example.

