Source author record

Simo Särkkä

Simo Särkkä appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

37 works

20 topics

4 close collaborators

preprint · 2026 · arXiv

Dual-Level Models for Physics-Informed Multi-Step Time Series Forecasting

This paper develops an approach for multi-step forecasting of dynamical systems by integrating probabilistic input forecasting with physics-informed output prediction. Accurate multi-step forecasting of time series systems is important for the automatic control and optimization of physical processes, enabling more precise decision-making. While mechanistic-based and data-driven machine learning (ML) approaches have been employed for time series forecasting, they face significant limitations. Incomplete knowledge of process mathematical models limits mechanistic-based direct employment, while purely data-driven ML models struggle with dynamic environments, leading to poor generalization. To address these limitations, this paper proposes a dual-level strategy for physics-informed forecasting of dynamical systems. On the first level, input variables are forecast using a hybrid method that integrates a long short-term memory (LSTM) network into probabilistic state transition models (STMs). On the second level, these stochastically predicted inputs are sequentially fed into a physics-informed neural network (PINN) to generate multi-step output predictions. The experimental results of the paper demonstrate that the hybrid input forecasting models achieve a higher log-likelihood and lower mean squared errors (MSE) compared to conventional STMs. Furthermore, the PINNs driven by the input forecasting models outperform their purely data-driven counterparts in terms of MSE and log-likelihood, exhibiting stronger generalization and forecasting performance across multiple test cases.

preprint · 2022 · arXiv

De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on parallel architecture. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, as well as by (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at a $\mathcal{O}(\log(T))$ cost on parallel hardware.

preprint · 2022 · arXiv

Multidimensional Projection Filters via Automatic Differentiation and Sparse-Grid Integration

The projection filter is a technique for approximating the solutions of optimal filtering problems. In projection filters, the Kushner--Stratonovich stochastic partial differential equation that governs the propagation of the optimal filtering density is projected to a manifold of parametric densities, resulting in a finite-dimensional stochastic differential equation. Despite the fact that projection filters are capable of representing complicated probability densities, their current implementations are limited to Gaussian family or unidimensional filtering applications. This work considers a combination of numerical integration and automatic differentiation to construct projection filter algorithms for more generic problems. Specifically, we provide a detailed exposition of this combination for the manifold of the exponential family, and show how to apply the projection filter to multidimensional cases. We demonstrate numerically that based on comparison to a finite-difference solution to the Kushner--Stratonovich equation and a bootstrap particle filter with systematic resampling, the proposed algorithm retains an accurate approximation of the filtering density while requiring a comparatively low number of quadrature points. Due to the sparse-grid integration and automatic differentiation used to calculate the expected values of the natural statistics and the Fisher metric, the proposed filtering algorithms are highly scalable. They therefore are suitable to many applications in which the number of dimensions exceeds the practical limit of particle filters, but where the Gaussian-approximations are deemed unsatisfactory.

preprint · 2022 · arXiv

Online Pole Segmentation on Range Images for Long-term LiDAR Localization in Urban Environments

Robust and accurate localization is a basic requirement for mobile autonomous systems. Pole-like objects, such as traffic signs, poles, and lamps are frequently used landmarks for localization in urban environments due to their local distinctiveness and long-term stability. In this paper, we present a novel, accurate, and fast pole extraction approach based on geometric features that runs online and has low computational demands. Our method performs all computations directly on range images generated from 3D LiDAR scans, which avoids processing 3D point clouds explicitly and enables fast pole extraction for each scan. We further use the extracted poles as pseudo labels to train a deep neural network for online range image-based pole segmentation. We test both our geometric and learning-based pole extraction methods for localization on different datasets with different LiDAR scanners, routes, and seasonal changes. The experimental results show that our methods outperform other state-of-the-art approaches. Moreover, boosted with pseudo pole labels extracted from multiple datasets, our learning-based method can run across different datasets and achieve even better localization results compared to our geometry-based method. We released our pole datasets to the public for evaluating the performance of pole extractors, as well as the implementation of our approach.

preprint · 2022 · arXiv

System identification using Bayesian neural networks with nonparametric noise models

System identification is of special interest in science and engineering. This article is concerned with a system identification problem arising in stochastic dynamic systems, where the aim is to estimate the parameters of a system along with its unknown noise processes. In particular, we propose a Bayesian nonparametric approach for system identification in discrete-time nonlinear random dynamical systems assuming only the order of the Markov process is known. The proposed method replaces the assumption of Gaussian distributed error components with a highly flexible family of probability density functions based on Bayesian nonparametric priors. Additionally, the functional form of the system is estimated by leveraging Bayesian neural networks, which also leads to flexible uncertainty quantification. Asymptotically on the number of hidden neurons, the proposed model converges to a full nonparametric Bayesian regression model. A Gibbs sampler for posterior inference is proposed and its effectiveness is illustrated on simulated and real time series.

preprint · 2022 · arXiv

Temporal Parallelisation of Dynamic Programming and Linear Quadratic Control

This paper proposes a general formulation for temporal parallelisation of dynamic programming for optimal control problems. We derive the elements and associative operators to be able to use parallel scans to solve these problems with logarithmic time complexity rather than linear time complexity. We apply this methodology to problems with finite state and control spaces, linear quadratic tracking control problems, and to a class of nonlinear control problems. The computational benefits of the parallel methods are demonstrated via numerical simulations run on a graphics processing unit.
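For the finite state and control case, the associative structure behind this kind of parallelisation can be illustrated with min-plus matrix products. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the stage cost matrices, the helper names (`min_plus`, `sequential_dp`, `tree_reduce`) and the random test data are all hypothetical. It only shows that combining stage costs pairwise in a tree gives the same value function as the usual backward recursion; on parallel hardware each tree level could be evaluated concurrently.

```python
import numpy as np

def min_plus(A, B):
    """Min-plus product: (A (x) B)[i, j] = min over l of A[i, l] + B[l, j]."""
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

def sequential_dp(costs, terminal):
    """Backward recursion V_k(i) = min_j costs[k][i, j] + V_{k+1}(j)."""
    V = terminal
    for C in reversed(costs):
        V = np.min(C + V[None, :], axis=1)
    return V

def tree_reduce(mats):
    """Combine neighbouring pairs level by level, preserving time order.

    The work inside one level is independent, so it could run in parallel.
    """
    while len(mats) > 1:
        paired = [min_plus(mats[i], mats[i + 1]) for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:
            paired.append(mats[-1])  # odd element is carried to the next level
        mats = paired
    return mats[0]

# Hypothetical problem: 4 states, 7 stages, random transition costs.
rng = np.random.default_rng(0)
costs = [rng.uniform(0.0, 1.0, size=(4, 4)) for _ in range(7)]
terminal = rng.uniform(0.0, 1.0, size=4)

V_seq = sequential_dp(costs, terminal)
V_par = np.min(tree_reduce(costs) + terminal[None, :], axis=1)
print(np.allclose(V_seq, V_par))
```

The tree reduction needs about log2(T) levels for T stages, which is where the logarithmic span comes from; the total amount of work is larger than in the sequential recursion.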

preprint · 2022 · arXiv

The Coupled Rejection Sampler

We propose a coupled rejection-sampling method for sampling from couplings of arbitrary distributions. The method relies on accepting or rejecting coupled samples coming from dominating marginals. Contrary to existing acceptance-rejection coupling methods, the variance of the execution time of the proposed method is limited and stays finite as the two target marginals approach each other in the sense of the total variation norm. In the important special case of coupling multivariate Gaussians with different means and covariances, we derive positive lower bounds for the resulting coupling probability of our algorithm, and we then show how the coupling method can be optimized in closed form. Finally, we show how we can modify the coupled rejection-sampling method to propose from a coupled ensemble of proposals, so as to asymptotically recover a maximal coupling. We then apply the method to the problem of coupling rare events samplers, derive a parallel coupled resampling algorithm to use in particle filtering, and show how the coupled rejection-sampler can be used to speed up unbiased MCMC methods based on couplings.

preprint · 2022 · arXiv

Uncertainty-aware deep learning methods for robust diabetic retinopathy classification

Automatic classification of diabetic retinopathy from retinal images has been widely studied using deep neural networks with impressive results. However, there is a clinical need for estimation of the uncertainty in the classifications, a shortcoming of modern neural networks. Recently, approximate Bayesian deep learning methods have been proposed for the task but the studies have only considered the binary referable/non-referable diabetic retinopathy classification applied to benchmark datasets. We present novel results by systematically investigating a clinical dataset and a clinically relevant 5-class classification scheme, in addition to benchmark datasets and the binary classification scheme. Moreover, we derive a connection between uncertainty measures and classifier risk, from which we develop a new uncertainty measure. We observe that the previously proposed entropy-based uncertainty measure generalizes to the clinical dataset on the binary classification scheme but not on the 5-class scheme, whereas our new uncertainty measure generalizes to the latter case.

preprint · 2021 · arXiv

Parallel Iterated Extended and Sigma-point Kalman Smoothers

The problem of Bayesian filtering and smoothing in nonlinear models with additive noise is an active area of research. Classical Taylor series as well as more recent sigma-point based methods are two well-known strategies to deal with these problems. However, these methods are inherently sequential and do not in their standard formulation allow for parallelization in the time domain. In this paper, we present a set of parallel formulas that replace the existing sequential ones in order to achieve lower time (span) complexity. Our experimental results done with a graphics processing unit (GPU) illustrate the efficiency of the proposed methods over their sequential counterparts.

preprint · 2020 · arXiv

Continuous-Discrete Filtering and Smoothing on Submanifolds of Euclidean Space

In this paper the issue of filtering and smoothing in continuous discrete time is studied when the state variable evolves in some submanifold of Euclidean space, which may not have the usual Lebesgue measure. Formal expressions for prediction and smoothing problems are derived, which agree with the classical results except that the formal adjoint of the generator is different in general. For approximate filtering and smoothing the projection approach is taken, where it turns out that the prediction and smoothing equations are the same as in the case when the state variable evolves in Euclidean space. The approach is used to develop projection filters and smoothers based on the von Mises-Fisher distribution.

preprint · 2020 · arXiv

Enhancing Industrial X-ray Tomography by Data-Centric Statistical Methods

X-ray tomography has applications in various industrial fields such as the sawmill industry, the oil and gas industry, chemical engineering, and geotechnical engineering. In this article, we study Bayesian methods for X-ray tomography reconstruction. In Bayesian methods, the inverse problem of tomographic reconstruction is solved with the help of a statistical prior distribution which encodes the possible internal structures by assigning probabilities for smoothness and edge distribution of the object. We compare Gaussian random field priors, which favour smoothness, to non-Gaussian total variation, Besov, and Cauchy priors which promote sharp edges and high-contrast and low-contrast areas in the object. We also present computational schemes for solving the resulting high-dimensional Bayesian inverse problem with 100,000-1,000,000 unknowns. In particular, we study the applicability of a no-U-turn variant of Hamiltonian Monte Carlo methods and of a more classical adaptive Metropolis-within-Gibbs algorithm for this purpose. These methods also enable full uncertainty quantification of the reconstructions. For faster computations, we use maximum a posteriori estimates with the limited-memory BFGS optimisation algorithm. As the first industrial application, we consider sawmill industry X-ray log tomography. The logs have knots, rotten parts, and even possibly metallic pieces, making them good examples for non-Gaussian priors. Secondly, we study drill-core rock sample tomography, an example from the oil and gas industry. We show that Cauchy priors produce a smaller number of artefacts than other choices, especially with sparse high-noise measurements, and choosing Hamiltonian Monte Carlo enables systematic uncertainty quantification.

preprint · 2020 · arXiv

Improved Calibration of Numerical Integration Error in Sigma-Point Filters

The sigma-point filters, such as the UKF, which exploit numerical quadrature to obtain an additional order of accuracy in the moment transformation step, are popular alternatives to the ubiquitous EKF. The classical quadrature rules used in the sigma-point filters are motivated via polynomial approximation of the integrand, however in the applied context these assumptions cannot always be justified. As a result, quadrature error can introduce bias into estimated moments, for which there is no compensatory mechanism in the classical sigma-point filters. This can lead in turn to estimates and predictions that are poorly calibrated. In this article, we investigate the Bayes-Sard quadrature method in the context of sigma-point filters, which enables uncertainty due to quadrature error to be formalised within a probabilistic model. Our first contribution is to derive the well-known classical quadratures as special cases of the Bayes-Sard quadrature method. Then a general-purpose moment transform is developed and utilised in the design of novel sigma-point filters, so that uncertainty due to quadrature error is explicitly quantified. Numerical experiments on a challenging tracking example with misspecified initial conditions show that the additional uncertainty quantification built into our method leads to better-calibrated state estimates with improved RMSE.
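As background for the moment transformation step mentioned above, the classical unscented transform can be sketched as follows. This is the standard textbook transform, not the Bayes-Sard moment transform proposed in the article; the function name `unscented_transform`, the parameter defaults and the linear test map are assumptions chosen for illustration. For a linear map the transform reproduces the exact mean and covariance, which gives a simple sanity check.

```python
import numpy as np

def unscented_transform(f, mean, cov, alpha=1.0, beta=0.0, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov)."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    # Columns of `root` are the scaled offsets of the sigma points from the mean.
    root = np.linalg.cholesky((n + lam) * cov)
    points = np.vstack([mean, mean + root.T, mean - root.T])  # shape (2n + 1, n)

    # Standard unscented weights for the mean and the covariance.
    w_mean = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n + lam)
    w_cov[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)

    ys = np.array([f(p) for p in points])
    y_mean = w_mean @ ys
    dev = ys - y_mean
    y_cov = (w_cov[:, None] * dev).T @ dev
    return y_mean, y_cov

# Hypothetical linear test map y = A x + offset.
A = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
offset = np.array([0.5, -1.0, 2.0])
m = np.array([1.0, -2.0])
P = np.array([[2.0, 0.3], [0.3, 1.0]])

y_mean, y_cov = unscented_transform(lambda x: A @ x + offset, m, P)
print(np.allclose(y_mean, A @ m + offset), np.allclose(y_cov, A @ P @ A.T))
```

For nonlinear maps the result is only an approximation, and the transform carries no estimate of its own quadrature error, which is the gap the abstract describes.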

preprint · 2020 · arXiv

Kernel-based interpolation at approximate Fekete points

We construct approximate Fekete point sets for kernel-based interpolation by maximising the determinant of a kernel Gram matrix obtained via truncation of an orthonormal expansion of the kernel. Uniform error estimates are proved for kernel interpolants at the resulting points. If the kernel is Gaussian we show that the approximate Fekete points in one dimension are the solution to a convex optimisation problem and that the interpolants converge with a super-exponential rate. Numerical examples are provided for the Gaussian kernel.

preprint · 2020 · arXiv

LSD$_2$ -- Joint Denoising and Deblurring of Short and Long Exposure Images with CNNs

The paper addresses the problem of acquiring high-quality photographs with handheld smartphone cameras in low-light imaging conditions. We propose an approach based on capturing pairs of short and long exposure images in rapid succession and fusing them into a single high-quality photograph. Unlike existing methods, we take advantage of both images simultaneously and perform a joint denoising and deblurring using a convolutional neural network. A novel approach is introduced to generate realistic short-long exposure image pairs. The method produces good images in extremely challenging conditions and outperforms existing denoising and deblurring methods. It also enables exposure fusion in the presence of motion blur.

preprint · 2020 · arXiv

Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions

Despite the ubiquity of the Gaussian process regression model, few theoretical results are available that account for the fact that parameters of the covariance kernel typically need to be estimated from the dataset. This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset. Specifically, we consider the scenario where the scale parameter of a Sobolev kernel (such as a Matérn kernel) is estimated by maximum likelihood. We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model in the sense that the model can become "slowly" overconfident at worst, regardless of the difference between the smoothness of the data-generating function and that expected by the model. The analysis is based on a combination of techniques from nonparametric regression and scattered data interpolation. Empirical results are provided in support of the theoretical findings.

preprint · 2020 · arXiv

Non-Stationary Multi-layered Gaussian Priors for Bayesian Inversion

In this article, we study Bayesian inverse problems with multi-layered Gaussian priors. We first describe the conditionally Gaussian layers in terms of a system of stochastic partial differential equations. We build the computational inference method using a finite-dimensional Galerkin method. We show that the proposed approximation has a convergence-in-probability property to the solution of the original multi-layered model. We then carry out Bayesian inference using the preconditioned Crank--Nicolson algorithm which is modified to work with multi-layered Gaussian fields. We show via numerical experiments in signal deconvolution and computerized X-ray tomography problems that the proposed method can offer both smoothing and edge preservation at the same time.

preprint · 2020 · arXiv

On stability of a class of filters for non-linear stochastic systems

This article develops a comprehensive framework for stability analysis of a broad class of commonly used continuous and discrete time-filters for stochastic dynamic systems with non-linear state dynamics and linear measurements under certain strong assumptions. The class of filters encompasses the extended and unscented Kalman filters and most other Gaussian assumed density filters and their numerical integration approximations. The stability results are in the form of time-uniform mean square bounds and exponential concentration inequalities for the filtering error. In contrast to existing results, it is not always necessary for the model to be exponentially stable or fully observed. We review three classes of models that can be rigorously shown to satisfy the stringent assumptions of the stability theorems. Numerical experiments using synthetic data validate the derived error bounds.

preprint · 2020 · arXiv

Taylor Moment Expansion for Continuous-Discrete Gaussian Filtering and Smoothing

The paper is concerned with non-linear Gaussian filtering and smoothing in continuous-discrete state-space models, where the dynamic model is formulated as an Itô stochastic differential equation (SDE), and the measurements are obtained at discrete time instants. We propose a novel Taylor moment expansion (TME) Gaussian filter and smoother which approximate the moments of the SDE with a temporal Taylor expansion. In contrast to classical linearisation or Itô--Taylor approaches, the Taylor expansion is formed for the moment functions directly and in the time variable, not by using a Taylor expansion on the non-linear functions in the model. We analyse the theoretical properties, including the positive definiteness of the covariance estimate and stability of the TME Gaussian filter and smoother. By numerical experiments, we demonstrate that the proposed TME Gaussian filter and smoother significantly outperform the state-of-the-art methods in terms of estimation accuracy and numerical stability.

preprint · 2020 · arXiv

Temporal Parallelization of Bayesian Smoothers

This paper presents algorithms for temporal parallelization of Bayesian smoothers. We define the elements and the operators to pose these problems as the solutions to all-prefix-sums operations for which efficient parallel scan-algorithms are available. We present the temporal parallelization of the general Bayesian filtering and smoothing equations and specialize them to linear/Gaussian models. The advantage of the proposed algorithms is that they reduce the linear complexity of standard smoothing algorithms with respect to time to logarithmic.
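The all-prefix-sums idea can be illustrated on a much simpler problem than Bayesian smoothing. The sketch below is an assumption-laden toy, not the paper's elements or operators: it treats the scalar recursion x_k = a_k x_{k-1} + b_k, represents each step as a pair (a_k, b_k), and combines pairs with an associative operator so that a Hillis-Steele style scan needs only about log2(T) vectorised passes. The names `combine` and `prefix_scan` are hypothetical.

```python
import numpy as np

def combine(e1, e2):
    """Associative operator on affine maps x -> a*x + b: apply e1 first, then e2."""
    a1, b1 = e1
    a2, b2 = e2
    return a2 * a1, a2 * b1 + b2

def prefix_scan(a, b):
    """Inclusive scan over the pairs (a_k, b_k) using ceil(log2(T)) passes."""
    a, b = a.copy(), b.copy()
    shift = 1
    while shift < len(a):
        # Element k absorbs the already combined block that ends at k - shift.
        a_new, b_new = combine((a[:-shift], b[:-shift]), (a[shift:], b[shift:]))
        a[shift:], b[shift:] = a_new, b_new
        shift *= 2
    return a, b

# Hypothetical test data.
rng = np.random.default_rng(1)
T = 10
a = rng.uniform(0.5, 1.0, size=T)
b = rng.normal(size=T)
x0 = 2.0

# Sequential reference: T dependent steps.
x_seq = np.empty(T)
x = x0
for k in range(T):
    x = a[k] * x + b[k]
    x_seq[k] = x

# Scan version: every pass is a vectorised operation over all time steps.
A_pre, B_pre = prefix_scan(a, b)
x_par = A_pre * x0 + B_pre
print(np.allclose(x_seq, x_par))
```

In the smoothing setting the elements are conditional densities or their Gaussian parameters rather than scalar pairs, but the scan structure is the same.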

preprint · 2020 · arXiv

Worst-case optimal approximation with increasingly flat Gaussian kernels

We study worst-case optimal approximation of positive linear functionals in reproducing kernel Hilbert spaces induced by increasingly flat Gaussian kernels. This provides a new perspective and some generalisations to the problem of interpolation with increasingly flat radial basis functions. When the evaluation points are fixed and unisolvent, we show that the worst-case optimal method converges to a polynomial method. In an additional one-dimensional extension, we allow also the points to be selected optimally and show that in this case convergence is to the unique Gaussian quadrature type method that achieves the maximal polynomial degree of exactness. The proofs are based on an explicit characterisation of the reproducing kernel Hilbert space of the Gaussian kernel in terms of exponentially damped polynomials.

preprint · 2019 · arXiv

Hilbert Space Methods for Reduced-Rank Gaussian Process Regression

This paper proposes a novel scheme for reduced-rank Gaussian process regression. The method is based on an approximate series expansion of the covariance function in terms of an eigenfunction expansion of the Laplace operator in a compact subset of $\mathbb{R}^d$. On this approximate eigenbasis the eigenvalues of the covariance function can be expressed as simple functions of the spectral density of the Gaussian process, which allows the GP inference to be solved under a computational cost scaling as $\mathcal{O}(nm^2)$ (initial) and $\mathcal{O}(m^3)$ (hyperparameter learning) with $m$ basis functions and $n$ data points. Furthermore, the basis functions are independent of the parameters of the covariance function, which allows for very fast hyperparameter learning. The approach also allows for rigorous error analysis with Hilbert space theory, and we show that the approximation becomes exact when the size of the compact subset and the number of eigenfunctions go to infinity. We also show that the convergence rate of the truncation error is independent of the input dimensionality provided that the differentiability order of the covariance function increases appropriately, and for the squared exponential covariance function it is always bounded by ${\sim}1/m$ regardless of the input dimensionality. The expansion generalizes to Hilbert spaces with an inner product which is defined as an integral over a specified input density. The method is compared to previously proposed methods theoretically and through empirical tests with simulated and real data.
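A one-dimensional version of the expansion can be checked numerically. The sketch below assumes the interval $[-L, L]$ with Dirichlet boundary conditions, sine eigenfunctions of the Laplace operator, and the squared exponential covariance function; the function names and the values of `m`, `L`, and the length scale are illustrative choices, not taken from the paper. It compares the reduced-rank kernel with the exact one at points well inside the interval.

```python
import numpy as np

def se_spectral_density(omega, sigma2=1.0, ell=1.0):
    """Spectral density of the 1-D squared exponential covariance function."""
    return sigma2 * np.sqrt(2.0 * np.pi) * ell * np.exp(-0.5 * (ell * omega) ** 2)

def basis(x, m, L):
    """Laplace eigenfunctions on [-L, L]: phi_j(x) = sin(pi j (x + L) / (2L)) / sqrt(L)."""
    j = np.arange(1, m + 1)
    return np.sin(np.pi * j[None, :] * (x[:, None] + L) / (2.0 * L)) / np.sqrt(L)

def approx_kernel(x1, x2, m=64, L=5.0, sigma2=1.0, ell=1.0):
    """Reduced-rank kernel: sum_j S(sqrt(lambda_j)) phi_j(x1) phi_j(x2)."""
    j = np.arange(1, m + 1)
    sqrt_lambda = np.pi * j / (2.0 * L)  # square roots of the Laplace eigenvalues
    S = se_spectral_density(sqrt_lambda, sigma2, ell)
    return (basis(x1, m, L) * S[None, :]) @ basis(x2, m, L).T

def exact_kernel(x1, x2, sigma2=1.0, ell=1.0):
    d = x1[:, None] - x2[None, :]
    return sigma2 * np.exp(-0.5 * (d / ell) ** 2)

# Evaluate well inside the domain, where boundary effects are negligible.
x = np.linspace(-1.0, 1.0, 9)
max_err = np.max(np.abs(approx_kernel(x, x) - exact_kernel(x, x)))
print(max_err < 1e-8)
```

Note that `basis` does not depend on `sigma2` or `ell`; only the weights `S` change with the hyperparameters, which is why the basis matrix can be computed once and reused during hyperparameter learning.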

preprint · 2016 · arXiv

Computationally Efficient Bayesian Learning of Gaussian Process State Space Models

Gaussian processes allow for flexible specification of prior assumptions of unknown dynamics in state space models. We present a procedure for efficient Bayesian learning in Gaussian process state space models, where the representation is formed by projecting the problem onto a set of approximate eigenfunctions derived from the prior covariance structure. Learning under this family of models can be conducted using a carefully crafted particle MCMC algorithm. This scheme is computationally efficient and yet allows for a fully Bayesian treatment of the problem. Compared to conventional system identification tools or existing learning methods, we show competitive performance and reliable quantification of uncertainties in the model.

preprint · 2016 · arXiv

Regularizing Solutions to the MEG Inverse Problem Using Space-Time Separable Covariance Functions

In magnetoencephalography (MEG) the conventional approach to source reconstruction is to solve the underdetermined inverse problem independently over time and space. Here we present how the conventional approach can be extended by regularizing the solution in space and time by a Gaussian process (Gaussian random field) model. Assuming a separable covariance function in space and time, the computational complexity of the proposed model becomes (without any further assumptions or restrictions) $\mathcal{O}(t^3 + n^3 + m^2n)$, where $t$ is the number of time steps, $m$ is the number of sources, and $n$ is the number of sensors. We apply the method to both simulated and empirical data, and demonstrate the efficiency and generality of our Bayesian source reconstruction approach which subsumes various classical approaches in the literature.

preprint · 2015 · arXiv

A Bayesian Particle Filtering Method For Brain Source Localisation

In this paper, we explore the multiple source localisation problem in the cerebral cortex using magnetoencephalography (MEG) data. We model neural currents as point-wise dipolar sources which dynamically evolve over time, then model dipole dynamics using a probabilistic state space model in which dipole locations are strictly constrained to lie within the cortex. Based on the proposed models, we develop a Bayesian particle filtering algorithm for localisation of both known and unknown numbers of dipoles. The algorithm consists of a region of interest (ROI) estimation step for initial dipole number estimation, a Gibbs multiple particle filter (GMPF) step for individual dipole state estimation, and a selection criterion step for selecting the final estimates. The estimated results from the ROI estimation are used to adaptively adjust the particle filter's sample size to reduce the overall computational cost. The proposed models and the algorithm are tested in numerical experiments. Results are compared with existing particle filtering methods. The numerical results show that the proposed methods can achieve improved performance metrics in terms of dipole number estimation and dipole localisation.

preprint · 2015 · arXiv

Combining Particle MCMC with Rao-Blackwellized Monte Carlo Data Association for Parameter Estimation in Multiple Target Tracking

We consider state and parameter estimation in multiple target tracking problems with data association uncertainties and unknown number of targets. We show how the problem can be recast into a conditionally linear Gaussian state-space model with unknown parameters and present an algorithm for computationally efficient inference on the resulting model. The proposed algorithm is based on combining the Rao-Blackwellized Monte Carlo data association algorithm with particle Markov chain Monte Carlo algorithms to jointly estimate both parameters and data associations. Both particle marginal Metropolis-Hastings and particle Gibbs variants of particle MCMC are considered. We demonstrate the performance of the method both using simulated data and in a real-data case study of using multiple target tracking to estimate the brown bear population in Finland.

preprint · 2015 · arXiv

Nonlinear State Space Model Identification Using a Regularized Basis Function Expansion

This paper is concerned with black-box identification of nonlinear state space models. By using a basis function expansion within the state space model, we obtain a flexible structure. The model is identified using an expectation maximization approach, where the states and the parameters are updated iteratively in such a way that a maximum likelihood estimate is obtained. We use recent particle methods with sound theoretical properties to infer the states, whereas the model parameters can be updated using closed-form expressions by exploiting the fact that our model is linear in the parameters. To avoid over-fitting the flexible model to the data, we also propose a regularization scheme that does not increase the computational burden. Importantly, this opens the way for systematic use of regularization in nonlinear state space models. We conclude by evaluating our proposed approach on one simulation example and two real-data problems.

preprint · 2015 · arXiv

On the relation between Gaussian process quadratures and sigma-point methods

This article is concerned with Gaussian process quadratures, which are numerical integration methods based on Gaussian process regression methods, and sigma-point methods, which are used in advanced non-linear Kalman filtering and smoothing algorithms. We show that many sigma-point methods can be interpreted as Gaussian quadrature based methods with suitably selected covariance functions. We show that this interpretation also extends to more general multivariate Gauss--Hermite integration methods and related spherical cubature rules. Additionally, we discuss different criteria for selecting the sigma-point locations: exactness for multivariate polynomials up to a given order, minimum average error, and quasi-random point sets. The performance of the different methods is tested in numerical experiments.

preprint · 2015 · arXiv

Probability Measures for Numerical Solutions of Differential Equations

In this paper, we present a formal quantification of epistemic uncertainty induced by numerical solutions of ordinary and partial differential equation models. Numerical solutions of differential equations contain inherent uncertainties due to the finite dimensional approximation of an unknown and implicitly defined function. When statistically analysing models based on differential equations describing physical, or other naturally occurring, phenomena, it is therefore important to explicitly account for the uncertainty introduced by the numerical method. This enables objective determination of its importance relative to other uncertainties, such as those caused by data contaminated with noise or model error induced by missing physical or inadequate descriptors. To this end we show that a wide variety of existing solvers can be randomised, inducing a probability measure over the solutions of such differential equations. These measures exhibit contraction to a Dirac measure around the true unknown solution, where the rates of convergence are consistent with the underlying deterministic numerical method. Ordinary differential equations and elliptic partial differential equations are used to illustrate the approach to quantifying uncertainty in both the statistical analysis of the forward and inverse problems.

preprint2015arXiv

Rao-Blackwellized particle smoothers for conditionally linear Gaussian models

Sequential Monte Carlo (SMC) methods, such as the particle filter, are by now one of the standard computational techniques for addressing the filtering problem in general state-space models. However, many applications require post-processing of data offline. In such scenarios the smoothing problem, in which all the available data is used to compute state estimates, is of central interest. We consider the smoothing problem for a class of conditionally linear Gaussian models. We present a forward-backward-type Rao-Blackwellized particle smoother (RBPS) that is able to exploit the tractable substructure present in these models. Akin to the well-known Rao-Blackwellized particle filter, the proposed RBPS marginalizes out a conditionally tractable subset of state variables, effectively making use of SMC only for the "intractable part" of the model. Compared to existing RBPS, two key features of the proposed method are: (i) it does not require structural approximations of the model, and (ii) the aforementioned marginalization is done both in the forward direction and in the backward direction.

preprint2015arXiv

Sigma-Point Filtering and Smoothing Based Parameter Estimation in Nonlinear Dynamic Systems

We consider approximate maximum likelihood parameter estimation in nonlinear state-space models. We discuss both direct optimization of the likelihood and expectation-maximization (EM). For EM, we also give closed-form expressions for the maximization step in a class of models that are linear in parameters and have additive noise. To obtain approximations to the filtering and smoothing distributions needed in the likelihood-maximization methods, we focus on using Gaussian filtering and smoothing algorithms that employ sigma-points to approximate the required integrals. We discuss different sigma-point schemes based on the third, fifth, seventh, and ninth order unscented transforms and the Gauss-Hermite quadrature rule. We compare the performance of the methods in two simulated experiments: a univariate nonlinear growth model as well as tracking of a maneuvering target. In the experiments, we also compare against approximate likelihood estimates obtained by particle filtering and extended Kalman filtering based methods. The experiments suggest that the higher-order unscented transforms may in some cases provide more accurate estimates.

preprint2014arXiv

Adaptive Metropolis Algorithm Using Variational Bayesian Adaptive Kalman Filter

Markov chain Monte Carlo (MCMC) methods are powerful computational tools for analysis of complex statistical problems. However, their computational efficiency is highly dependent on the chosen proposal distribution, which is generally difficult to find. One way to solve this problem is to use adaptive MCMC algorithms which automatically tune the statistics of a proposal distribution during the MCMC run. A new adaptive MCMC algorithm, called the variational Bayesian adaptive Metropolis (VBAM) algorithm, is developed. The VBAM algorithm updates the proposal covariance matrix using the variational Bayesian adaptive Kalman filter (VB-AKF). A strong law of large numbers for the VBAM algorithm is proven. The empirical convergence results for three simulated examples and for two real data examples are also provided.

preprint2014arXiv

Batch Nonlinear Continuous-Time Trajectory Estimation as Exactly Sparse Gaussian Process Regression

In this paper, we revisit batch state estimation through the lens of Gaussian process (GP) regression. We consider continuous-discrete estimation problems wherein a trajectory is viewed as a one-dimensional GP, with time as the independent variable. Our continuous-time prior can be defined by any nonlinear, time-varying stochastic differential equation driven by white noise; this allows the possibility of smoothing our trajectory estimates using a variety of vehicle dynamics models (e.g., 'constant-velocity'). We show that this class of prior results in an inverse kernel matrix (i.e., covariance matrix between all pairs of measurement times) that is exactly sparse (block-tridiagonal) and that this can be exploited to carry out GP regression (and interpolation) very efficiently. When the prior is based on a linear, time-varying stochastic differential equation and the measurement model is also linear, this GP approach is equivalent to classical, discrete-time smoothing (at the measurement times); when a nonlinearity is present, we iterate over the whole trajectory to maximize accuracy. We test the approach experimentally on a simultaneous trajectory estimation and mapping problem using a mobile robot dataset.

preprint2014arXiv

Gaussian filtering and variational approximations for Bayesian smoothing in continuous-discrete stochastic dynamic systems

The Bayesian smoothing equations are generally intractable for systems described by nonlinear stochastic differential equations and discrete-time measurements. Gaussian approximations are a computationally efficient way to approximate the true smoothing distribution. In this work, we present a comparison between two Gaussian approximation methods. The Gaussian filtering based Gaussian smoother uses a Gaussian approximation for the filtering distribution to form an approximation for the smoothing distribution. The variational Gaussian smoother is based on minimizing the Kullback-Leibler divergence of the approximate smoothing distribution with respect to the true distribution. The results suggest that for highly nonlinear systems, the variational Gaussian smoother can be used to iteratively improve the Gaussian filtering based smoothing solution. We also present linearization and sigma-point methods to approximate the intractable Gaussian expectations in the variational Gaussian smoothing equations. In addition, we extend the variational Gaussian smoother to a certain class of systems with a singular diffusion matrix.

preprint2014arXiv

Moment Conditions for Convergence of Particle Filters with Unbounded Importance Weights

In this paper, we derive moment conditions for particle filter importance weights, which ensure that the particle filter estimates of the expectations of bounded Borel functions converge in mean square and $L^4$ sense, and that the empirical measure of the particle filter converges weakly to the true filtering measure. The result extends the previously derived conditions by not requiring the boundedness of the importance weights, but only boundedness of second or fourth order moments. We show that the boundedness of the second order moments of the weights implies the convergence of the estimates of bounded functions in the mean square sense, and the $L^4$ convergence as well as the empirical measure convergence are assured by the boundedness of the fourth order moments of the weights. We also present an example class of models and importance distributions where the moment conditions hold, but the boundedness does not. The unboundedness in these models is caused by point-singularities in the weights which still leave the weight moments bounded. We show by using simulated data that the particle filter for this kind of model also performs well in practice.

preprint2014arXiv

Series Expansion Approximations of Brownian Motion for Non-Linear Kalman Filtering of Diffusion Processes

In this paper, we describe a novel application of sigma-point methods to continuous-discrete filtering. In principle, the nonlinear continuous-discrete filtering problem can be solved exactly. In practice, the solution contains terms that are computationally intractable. Assumed density filtering methods attempt to match statistics of the filtering distribution to some set of more tractable probability distributions. We describe a novel method that decomposes the Brownian motion driving the signal in a generalised Fourier series, which is truncated after a number of terms. This approximation to Brownian motion can be described using a relatively small number of Fourier coefficients, and allows us to compute statistics of the filtering distribution with a single application of a sigma-point method. Assumed density filters that exist in the literature usually rely on discretisation of the signal dynamics followed by iterated application of a sigma-point transform (or a limiting case thereof). Iterating the transform in this manner can lead to loss of information about the filtering distribution in highly nonlinear settings. We demonstrate that our method is better equipped to cope with such problems.

preprint2014arXiv

Sparse approximations of fractional Matérn fields

We consider a fast approximation method for a solution of a certain stochastic non-local pseudodifferential equation. This equation defines a Matérn class random field. The approximation method is based on the spectral compactness of the solution. We approximate the pseudodifferential operator with a Taylor expansion. By truncating the expansion, we can construct an approximation with Gaussian Markov random fields. We show that the solution of the truncated version can be constructed with an over-determined system of stochastic matrix equations with sparse matrices. We solve the system of equations with a sparse Cholesky decomposition. We consider the convergence of the discrete approximation of the solution to the continuous one. Finally, numerical examples are given.

preprint2013arXiv

Infinite-dimensional Bayesian filtering for detection of quasi-periodic phenomena in spatio-temporal data

This paper introduces a spatio-temporal resonator model and an inference method for detection and estimation of nearly periodic temporal phenomena in spatio-temporal data. The model is derived as a spatial extension of a stochastic harmonic resonator model, which can be formulated in terms of a stochastic differential equation (SDE). The spatial structure is included by introducing linear operators, which affect both the oscillations and damping, and by choosing the appropriate spatial covariance structure of the driving time-white noise process. With the choice of the linear operators as partial differential operators, the resonator model becomes a stochastic partial differential equation (SPDE), which is compatible with infinite-dimensional Kalman filtering. The resulting infinite-dimensional Kalman filtering problem allows for a computationally efficient solution as the computational cost scales linearly with measurements in the temporal dimension. This framework is applied to weather prediction and to physiological noise elimination in fMRI brain data.

37 published items

Dual-Level Models for Physics-Informed Multi-Step Time Series Forecasting

De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

Multidimensional Projection Filters via Automatic Differentiation and Sparse-Grid Integration

Online Pole Segmentation on Range Images for Long-term LiDAR Localization in Urban Environments

System identification using Bayesian neural networks with nonparametric noise models

Temporal Parallelisation of Dynamic Programming and Linear Quadratic Control

The Coupled Rejection Sampler

Uncertainty-aware deep learning methods for robust diabetic retinopathy classification

Parallel Iterated Extended and Sigma-point Kalman Smoothers

Continuous-Discrete Filtering and Smoothing on Submanifolds of Euclidean Space

Enhancing Industrial X-ray Tomography by Data-Centric Statistical Methods

Improved Calibration of Numerical Integration Error in Sigma-Point Filters

Kernel-based interpolation at approximate Fekete points

LSD$_2$ -- Joint Denoising and Deblurring of Short and Long Exposure Images with CNNs

Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions

Non-Stationary Multi-layered Gaussian Priors for Bayesian Inversion

On stability of a class of filters for non-linear stochastic systems

Taylor Moment Expansion for Continuous-Discrete Gaussian Filtering and Smoothing

Temporal Parallelization of Bayesian Smoothers

Worst-case optimal approximation with increasingly flat Gaussian kernels

Hilbert Space Methods for Reduced-Rank Gaussian Process Regression

Computationally Efficient Bayesian Learning of Gaussian Process State Space Models

Regularizing Solutions to the MEG Inverse Problem Using Space-Time Separable Covariance Functions

A Bayesian Particle Filtering Method For Brain Source Localisation

Combining Particle MCMC with Rao-Blackwellized Monte Carlo Data Association for Parameter Estimation in Multiple Target Tracking

Nonlinear State Space Model Identification Using a Regularized Basis Function Expansion

On the relation between Gaussian process quadratures and sigma-point methods

Probability Measures for Numerical Solutions of Differential Equations

Rao-Blackwellized particle smoothers for conditionally linear Gaussian models

Sigma-Point Filtering and Smoothing Based Parameter Estimation in Nonlinear Dynamic Systems

Adaptive Metropolis Algorithm Using Variational Bayesian Adaptive Kalman Filter

Batch Nonlinear Continuous-Time Trajectory Estimation as Exactly Sparse Gaussian Process Regression

Gaussian filtering and variational approximations for Bayesian smoothing in continuous-discrete stochastic dynamic systems

Moment Conditions for Convergence of Particle Filters with Unbounded Importance Weights

Series Expansion Approximations of Brownian Motion for Non-Linear Kalman Filtering of Diffusion Processes

Sparse approximations of fractional Matérn fields

Infinite-dimensional Bayesian filtering for detection of quasi-periodic phenomena in spatio-temporal data