Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
21 works
0 followers
20 topics
4 close collaborators

Published work

21 published item(s)

preprint · 2026 · arXiv

Dual-Level Models for Physics-Informed Multi-Step Time Series Forecasting

This paper develops an approach for multi-step forecasting of dynamical systems by integrating probabilistic input forecasting with physics-informed output prediction. Accurate multi-step forecasting of time series systems is important for the automatic control and optimization of physical processes, enabling more precise decision-making. While mechanistic and data-driven machine learning (ML) approaches have been employed for time series forecasting, they face significant limitations. Incomplete knowledge of the mathematical models of a process limits the direct use of mechanistic approaches, while purely data-driven ML models struggle with dynamic environments, leading to poor generalization. To address these limitations, this paper proposes a dual-level strategy for physics-informed forecasting of dynamical systems. On the first level, input variables are forecast using a hybrid method that integrates a long short-term memory (LSTM) network into probabilistic state transition models (STMs). On the second level, these stochastically predicted inputs are sequentially fed into a physics-informed neural network (PINN) to generate multi-step output predictions. The experimental results demonstrate that the hybrid input forecasting models achieve a higher log-likelihood and a lower mean squared error (MSE) than conventional STMs. Furthermore, the PINNs driven by the input forecasting models outperform their purely data-driven counterparts in terms of MSE and log-likelihood, exhibiting stronger generalization and forecasting performance across multiple test cases.

preprint · 2022 · arXiv

De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on a parallel architecture. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, as well as by (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at a $\mathcal{O}(\log T)$ cost on parallel hardware.

preprint · 2022 · arXiv

Multidimensional Projection Filters via Automatic Differentiation and Sparse-Grid Integration

The projection filter is a technique for approximating the solutions of optimal filtering problems. In projection filters, the Kushner--Stratonovich stochastic partial differential equation that governs the propagation of the optimal filtering density is projected to a manifold of parametric densities, resulting in a finite-dimensional stochastic differential equation. Although projection filters are capable of representing complicated probability densities, their current implementations are limited to the Gaussian family or to unidimensional filtering applications. This work considers a combination of numerical integration and automatic differentiation to construct projection filter algorithms for more generic problems. Specifically, we provide a detailed exposition of this combination for the manifold of the exponential family, and show how to apply the projection filter to multidimensional cases. We demonstrate numerically, by comparison with a finite-difference solution to the Kushner--Stratonovich equation and a bootstrap particle filter with systematic resampling, that the proposed algorithm retains an accurate approximation of the filtering density while requiring a comparatively low number of quadrature points. Due to the sparse-grid integration and automatic differentiation used to calculate the expected values of the natural statistics and the Fisher metric, the proposed filtering algorithms are highly scalable. They are therefore suitable for many applications in which the number of dimensions exceeds the practical limit of particle filters, but where Gaussian approximations are deemed unsatisfactory.

preprint · 2022 · arXiv

Online Pole Segmentation on Range Images for Long-term LiDAR Localization in Urban Environments

Robust and accurate localization is a basic requirement for mobile autonomous systems. Pole-like objects, such as traffic signs, poles, and lamps, are frequently used landmarks for localization in urban environments due to their local distinctiveness and long-term stability. In this paper, we present a novel, accurate, and fast pole extraction approach based on geometric features that runs online and has low computational demands. Our method performs all computations directly on range images generated from 3D LiDAR scans, which avoids processing 3D point clouds explicitly and enables fast pole extraction for each scan. We further use the extracted poles as pseudo labels to train a deep neural network for online range image-based pole segmentation. We test both our geometric and learning-based pole extraction methods for localization on different datasets with different LiDAR scanners, routes, and seasonal changes. The experimental results show that our methods outperform other state-of-the-art approaches. Moreover, boosted with pseudo pole labels extracted from multiple datasets, our learning-based method can run across different datasets and achieve even better localization results compared to our geometry-based method. We released our pole datasets to the public for evaluating the performance of pole extractors, as well as the implementation of our approach.

preprint · 2022 · arXiv

System identification using Bayesian neural networks with nonparametric noise models

System identification is of special interest in science and engineering. This article is concerned with a system identification problem arising in stochastic dynamic systems, where the aim is to estimate the parameters of a system along with its unknown noise processes. In particular, we propose a Bayesian nonparametric approach for system identification in discrete-time nonlinear random dynamical systems, assuming only that the order of the Markov process is known. The proposed method replaces the assumption of Gaussian distributed error components with a highly flexible family of probability density functions based on Bayesian nonparametric priors. Additionally, the functional form of the system is estimated by leveraging Bayesian neural networks, which also leads to flexible uncertainty quantification. Asymptotically in the number of hidden neurons, the proposed model converges to a full nonparametric Bayesian regression model. A Gibbs sampler for posterior inference is proposed and its effectiveness is illustrated on simulated and real time series.

preprint · 2022 · arXiv

Temporal Parallelisation of Dynamic Programming and Linear Quadratic Control

This paper proposes a general formulation for temporal parallelisation of dynamic programming for optimal control problems. We derive the elements and associative operators to be able to use parallel scans to solve these problems with logarithmic time complexity rather than linear time complexity. We apply this methodology to problems with finite state and control spaces, linear quadratic tracking control problems, and to a class of nonlinear control problems. The computational benefits of the parallel methods are demonstrated via numerical simulations run on a graphics processing unit.
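
For the finite state and control case, one way to picture an associative operator for dynamic programming is the min-plus matrix product. The sketch below is an illustration under stated assumptions, not the paper's implementation: each stage is summarised by a hypothetical cost matrix whose entry `[i][j]` is the cheapest cost of moving from state `i` to state `j` in one step, and the names `min_plus`, `tree_reduce`, and `backward_dp` are invented for this example. Because the product is associative, the stages can be combined pairwise in a balanced tree, which takes a logarithmic number of rounds when each round is executed in parallel (here the rounds run sequentially in plain Python).

```python
from typing import List

Matrix = List[List[float]]


def min_plus(a: Matrix, b: Matrix) -> Matrix:
    """Min-plus product: result[i][j] = min over k of a[i][k] + b[k][j]."""
    inner = len(b)
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def tree_reduce(stages: List[Matrix]) -> Matrix:
    """Combine stage costs pairwise; each round halves the number of matrices."""
    level = list(stages)
    while len(level) > 1:
        nxt = [min_plus(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired last matrix is carried to the next round
            nxt.append(level[-1])
        level = nxt
    return level[0]


def backward_dp(stages: List[Matrix], terminal: List[float]) -> List[float]:
    """Classical sequential Bellman recursion, used here only as a reference."""
    value = list(terminal)
    for cost in reversed(stages):
        value = [
            min(cost[i][j] + value[j] for j in range(len(value)))
            for i in range(len(cost))
        ]
    return value


stages = [
    [[1.0, 4.0], [2.0, 0.0]],
    [[3.0, 1.0], [0.0, 2.0]],
    [[2.0, 5.0], [1.0, 1.0]],
]
terminal = [0.0, 3.0]
total = tree_reduce(stages)
# Optimal cost-to-go from each initial state, obtained from the combined matrix.
value_parallel = [
    min(total[i][j] + terminal[j] for j in range(len(terminal)))
    for i in range(len(total))
]
```

Both routes give the same initial value function on this toy problem; only the order in which the stages are combined differs.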

preprint · 2022 · arXiv

The Coupled Rejection Sampler

We propose a coupled rejection-sampling method for sampling from couplings of arbitrary distributions. The method relies on accepting or rejecting coupled samples coming from dominating marginals. Contrary to existing acceptance-rejection coupling methods, the variance of the execution time of the proposed method is limited and stays finite as the two target marginals approach each other in the sense of the total variation norm. In the important special case of coupling multivariate Gaussians with different means and covariances, we derive positive lower bounds for the resulting coupling probability of our algorithm, and we then show how the coupling method can be optimized in closed form. Finally, we show how we can modify the coupled rejection-sampling method to propose from a coupled ensemble of proposals, so as to asymptotically recover a maximal coupling. We then apply the method to the problem of coupling rare-event samplers, derive a parallel coupled resampling algorithm to use in particle filtering, and show how the coupled rejection sampler can be used to speed up unbiased MCMC methods based on couplings.

preprint · 2022 · arXiv

Uncertainty-aware deep learning methods for robust diabetic retinopathy classification

Automatic classification of diabetic retinopathy from retinal images has been widely studied using deep neural networks with impressive results. However, there is a clinical need for estimation of the uncertainty in the classifications, a shortcoming of modern neural networks. Recently, approximate Bayesian deep learning methods have been proposed for the task but the studies have only considered the binary referable/non-referable diabetic retinopathy classification applied to benchmark datasets. We present novel results by systematically investigating a clinical dataset and a clinically relevant 5-class classification scheme, in addition to benchmark datasets and the binary classification scheme. Moreover, we derive a connection between uncertainty measures and classifier risk, from which we develop a new uncertainty measure. We observe that the previously proposed entropy-based uncertainty measure generalizes to the clinical dataset on the binary classification scheme but not on the 5-class scheme, whereas our new uncertainty measure generalizes to the latter case.
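
As a rough illustration of the entropy-based uncertainty measure mentioned above (and not of the new risk-based measure, which the abstract does not specify), the sketch below averages class probabilities over several stochastic forward passes and takes the entropy of the average. The function names and the toy probability vectors are assumptions made for this example.

```python
import math
from typing import List, Sequence


def mean_probs(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities over sampled predictions (one row per sample)."""
    n = len(samples)
    return [sum(s[c] for s in samples) / n for c in range(len(samples[0]))]


def predictive_entropy(samples: Sequence[Sequence[float]]) -> float:
    """Entropy (in nats) of the averaged predictive distribution."""
    probs = mean_probs(samples)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Two sampled predictions that agree on class 0 of a 5-class problem.
confident = [[1.0, 0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0]]
# Two sampled predictions that disagree completely.
conflicting = [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]]
```

The entropy is zero when all samples agree on a single class and grows as the averaged distribution spreads over more classes, up to log(5) for a uniform 5-class prediction.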

preprint · 2021 · arXiv

Parallel Iterated Extended and Sigma-point Kalman Smoothers

The problem of Bayesian filtering and smoothing in nonlinear models with additive noise is an active area of research. Classical Taylor series methods and more recent sigma-point methods are two well-known strategies for dealing with these problems. However, these methods are inherently sequential and do not, in their standard formulation, allow for parallelization in the time domain. In this paper, we present a set of parallel formulas that replace the existing sequential ones in order to achieve lower time (span) complexity. Experimental results obtained on a graphics processing unit (GPU) illustrate the efficiency of the proposed methods over their sequential counterparts.

preprint · 2020 · arXiv

Continuous-Discrete Filtering and Smoothing on Submanifolds of Euclidean Space

In this paper, the problem of filtering and smoothing in continuous-discrete time is studied when the state variable evolves in some submanifold of Euclidean space, which may not have the usual Lebesgue measure. Formal expressions for prediction and smoothing problems are derived, which agree with the classical results except that the formal adjoint of the generator is different in general. For approximate filtering and smoothing the projection approach is taken, where it turns out that the prediction and smoothing equations are the same as in the case when the state variable evolves in Euclidean space. The approach is used to develop projection filters and smoothers based on the von Mises-Fisher distribution.

preprint · 2020 · arXiv

Enhancing Industrial X-ray Tomography by Data-Centric Statistical Methods

X-ray tomography has applications in various industrial fields such as the sawmill industry, the oil and gas industry, chemical engineering, and geotechnical engineering. In this article, we study Bayesian methods for X-ray tomography reconstruction. In Bayesian methods, the inverse problem of tomographic reconstruction is solved with the help of a statistical prior distribution which encodes the possible internal structures by assigning probabilities for smoothness and edge distribution of the object. We compare Gaussian random field priors, which favour smoothness, to non-Gaussian total variation, Besov, and Cauchy priors, which promote sharp edges and high-contrast and low-contrast areas in the object. We also present computational schemes for solving the resulting high-dimensional Bayesian inverse problem with 100,000-1,000,000 unknowns. In particular, we study the applicability of a no-U-turn variant of Hamiltonian Monte Carlo methods and of a more classical adaptive Metropolis-within-Gibbs algorithm for this purpose. These methods also enable full uncertainty quantification of the reconstructions. For faster computations, we use maximum a posteriori estimates with the limited-memory BFGS optimisation algorithm. As the first industrial application, we consider sawmill industry X-ray log tomography. The logs have knots, rotten parts, and even possibly metallic pieces, making them good examples for non-Gaussian priors. Secondly, we study drill-core rock sample tomography, an example from the oil and gas industry. We show that Cauchy priors produce a smaller number of artefacts than other choices, especially with sparse high-noise measurements, and that choosing Hamiltonian Monte Carlo enables systematic uncertainty quantification.

preprint · 2020 · arXiv

Improved Calibration of Numerical Integration Error in Sigma-Point Filters

Sigma-point filters, such as the UKF, which exploit numerical quadrature to obtain an additional order of accuracy in the moment transformation step, are popular alternatives to the ubiquitous EKF. The classical quadrature rules used in sigma-point filters are motivated via polynomial approximation of the integrand; however, in applied contexts these assumptions cannot always be justified. As a result, quadrature error can introduce bias into estimated moments, for which there is no compensatory mechanism in the classical sigma-point filters. This can lead in turn to estimates and predictions that are poorly calibrated. In this article, we investigate the Bayes-Sard quadrature method in the context of sigma-point filters, which enables uncertainty due to quadrature error to be formalised within a probabilistic model. Our first contribution is to derive the well-known classical quadratures as special cases of the Bayes-Sard quadrature method. Then a general-purpose moment transform is developed and utilised in the design of novel sigma-point filters, so that uncertainty due to quadrature error is explicitly quantified. Numerical experiments on a challenging tracking example with misspecified initial conditions show that the additional uncertainty quantification built into our method leads to better-calibrated state estimates with improved RMSE.
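
The moment transformation step that sigma-point filters rely on can be sketched with the classical unscented transform in one dimension. This is the standard rule, not the Bayes-Sard transform proposed in the paper; the function name and the default values `alpha=1`, `beta=0`, `kappa=2` are assumptions chosen for this illustration. The example also shows the polynomial motivation mentioned above: for a quadratic function of a Gaussian input the rule reproduces the output mean and variance, whereas for non-polynomial functions the quadrature error is not accounted for.

```python
import math
from typing import Callable, Tuple


def unscented_transform_1d(
    f: Callable[[float], float],
    mean: float,
    var: float,
    alpha: float = 1.0,
    beta: float = 0.0,
    kappa: float = 2.0,
) -> Tuple[float, float]:
    """Approximate the mean and variance of f(x) for x ~ N(mean, var)."""
    n = 1  # state dimension
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    # Three sigma points: the mean and one point on each side of it.
    points = [mean, mean + spread, mean - spread]
    w_mean = [lam / (n + lam), 0.5 / (n + lam), 0.5 / (n + lam)]
    w_cov = [w_mean[0] + (1.0 - alpha ** 2 + beta), w_mean[1], w_mean[2]]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(w_mean, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(w_cov, ys))
    return y_mean, y_var


# For f(x) = x^2 and x ~ N(m, P), the exact moments are
# E[f] = m^2 + P and Var[f] = 4 m^2 P + 2 P^2.
m, P = 1.0, 0.25
ut_mean, ut_var = unscented_transform_1d(lambda x: x * x, m, P)
```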

preprint · 2020 · arXiv

Kernel-based interpolation at approximate Fekete points

We construct approximate Fekete point sets for kernel-based interpolation by maximising the determinant of a kernel Gram matrix obtained via truncation of an orthonormal expansion of the kernel. Uniform error estimates are proved for kernel interpolants at the resulting points. If the kernel is Gaussian we show that the approximate Fekete points in one dimension are the solution to a convex optimisation problem and that the interpolants converge with a super-exponential rate. Numerical examples are provided for the Gaussian kernel.

preprint · 2020 · arXiv

LSD$_2$ -- Joint Denoising and Deblurring of Short and Long Exposure Images with CNNs

The paper addresses the problem of acquiring high-quality photographs with handheld smartphone cameras in low-light imaging conditions. We propose an approach based on capturing pairs of short and long exposure images in rapid succession and fusing them into a single high-quality photograph. Unlike existing methods, we take advantage of both images simultaneously and perform a joint denoising and deblurring using a convolutional neural network. A novel approach is introduced to generate realistic short-long exposure image pairs. The method produces good images in extremely challenging conditions and outperforms existing denoising and deblurring methods. It also enables exposure fusion in the presence of motion blur.

preprint · 2020 · arXiv

Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions

Despite the ubiquity of the Gaussian process regression model, few theoretical results are available that account for the fact that parameters of the covariance kernel typically need to be estimated from the dataset. This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset. Specifically, we consider the scenario where the scale parameter of a Sobolev kernel (such as a Matérn kernel) is estimated by maximum likelihood. We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model in the sense that the model can become "slowly" overconfident at worst, regardless of the difference between the smoothness of the data-generating function and that expected by the model. The analysis is based on a combination of techniques from nonparametric regression and scattered data interpolation. Empirical results are provided in support of the theoretical findings.
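
For orientation, the maximum likelihood estimate of the scale parameter has a closed form in the noiseless setting. The notation below is an assumption made for this note rather than taken from the abstract: a zero-mean Gaussian process prior with covariance $\sigma^2 K(x, x')$, data $y = (f(x_1), \ldots, f(x_n))^\top$, and kernel matrix $K$ with entries $K(x_i, x_j)$.

```latex
\ell(\sigma^2)
  = -\frac{1}{2} \left( \frac{y^\top K^{-1} y}{\sigma^2}
      + n \log \sigma^2 + \log \det K + n \log 2\pi \right),
\qquad
\hat{\sigma}^2_{\mathrm{ML}} = \frac{y^\top K^{-1} y}{n}.
```

Setting the derivative of $\ell$ with respect to $\sigma^2$ to zero gives the estimate on the right. The posterior variance is then the unit-scale posterior variance multiplied by $\hat{\sigma}^2_{\mathrm{ML}}$, which is how the estimate enters the uncertainty quantification.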

preprint · 2020 · arXiv

Non-Stationary Multi-layered Gaussian Priors for Bayesian Inversion

In this article, we study Bayesian inverse problems with multi-layered Gaussian priors. We first describe the conditionally Gaussian layers in terms of a system of stochastic partial differential equations. We build the computational inference method using a finite-dimensional Galerkin method. We show that the proposed approximation has a convergence-in-probability property to the solution of the original multi-layered model. We then carry out Bayesian inference using the preconditioned Crank--Nicolson algorithm which is modified to work with multi-layered Gaussian fields. We show via numerical experiments in signal deconvolution and computerized X-ray tomography problems that the proposed method can offer both smoothing and edge preservation at the same time.
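
The following is a minimal sketch of the standard preconditioned Crank-Nicolson (pCN) sampler, not the multi-layered modification described above. It assumes a finite-dimensional standard Gaussian prior N(0, I) and a user-supplied negative log-likelihood; the function name `pcn_chain`, the step size `beta`, and the zero starting point are choices made for this illustration.

```python
import math
import random
from typing import Callable, List, Tuple


def pcn_chain(
    neg_log_lik: Callable[[List[float]], float],
    dim: int,
    n_steps: int,
    beta: float = 0.2,
    seed: int = 0,
) -> Tuple[List[List[float]], int]:
    """Run pCN for a N(0, I) prior; return the chain and the number of accepted moves."""
    rng = random.Random(seed)
    u = [0.0] * dim
    phi_u = neg_log_lik(u)
    keep = math.sqrt(1.0 - beta ** 2)
    samples: List[List[float]] = []
    accepted = 0
    for _ in range(n_steps):
        # The proposal leaves the prior invariant: v = sqrt(1 - beta^2) u + beta xi.
        v = [keep * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = neg_log_lik(v)
        # Accept with probability min(1, exp(phi_u - phi_v)); the prior cancels out.
        log_alpha = phi_u - phi_v
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            u, phi_u = v, phi_v
            accepted += 1
        samples.append(list(u))
    return samples, accepted


# With a constant likelihood every proposal is accepted and the chain targets the prior.
prior_samples, prior_accepted = pcn_chain(lambda u: 0.0, dim=3, n_steps=50)
```

Because the acceptance ratio involves only the likelihood, the step size does not need to shrink as the discretisation is refined, which is the usual reason for choosing pCN in function-space inverse problems.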

preprint · 2020 · arXiv

On stability of a class of filters for non-linear stochastic systems

This article develops a comprehensive framework for stability analysis of a broad class of commonly used continuous- and discrete-time filters for stochastic dynamic systems with non-linear state dynamics and linear measurements under certain strong assumptions. The class of filters encompasses the extended and unscented Kalman filters and most other Gaussian assumed density filters and their numerical integration approximations. The stability results are in the form of time-uniform mean square bounds and exponential concentration inequalities for the filtering error. In contrast to existing results, it is not always necessary for the model to be exponentially stable or fully observed. We review three classes of models that can be rigorously shown to satisfy the stringent assumptions of the stability theorems. Numerical experiments using synthetic data validate the derived error bounds.

preprint · 2020 · arXiv

Taylor Moment Expansion for Continuous-Discrete Gaussian Filtering and Smoothing

The paper is concerned with non-linear Gaussian filtering and smoothing in continuous-discrete state-space models, where the dynamic model is formulated as an Itô stochastic differential equation (SDE), and the measurements are obtained at discrete time instants. We propose a novel Taylor moment expansion (TME) Gaussian filter and smoother which approximate the moments of the SDE with a temporal Taylor expansion. Unlike classical linearisation or Itô--Taylor approaches, the Taylor expansion is formed for the moment functions directly and in the time variable, not by using a Taylor expansion on the non-linear functions in the model. We analyse the theoretical properties, including the positive definiteness of the covariance estimate and stability of the TME Gaussian filter and smoother. By numerical experiments, we demonstrate that the proposed TME Gaussian filter and smoother significantly outperform the state-of-the-art methods in terms of estimation accuracy and numerical stability.
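
A Taylor expansion of a moment function in the time variable can be written with the infinitesimal generator of the SDE. The notation here is assumed for illustration and is not taken from the abstract: a time-homogeneous SDE $\mathrm{d}X = f(X)\,\mathrm{d}t + L(X)\,\mathrm{d}W$, where $W$ is a Brownian motion with diffusion matrix $Q$, a sufficiently smooth test function $\phi$, and a truncation order $M$.

```latex
\mathbb{E}\bigl[\phi(X(t + \Delta t)) \mid X(t) = x\bigr]
  \approx \sum_{r=0}^{M} \frac{\Delta t^{r}}{r!} \, \mathcal{A}^{r} \phi(x),
\qquad
\mathcal{A}\phi
  = \sum_{i} f_i \, \frac{\partial \phi}{\partial x_i}
  + \frac{1}{2} \sum_{i,j} \bigl[ L Q L^\top \bigr]_{ij}
    \frac{\partial^2 \phi}{\partial x_i \, \partial x_j}.
```

Choosing $\phi(x) = x$ and $\phi(x) = x x^\top$ gives approximations of the first two conditional moments, which is the kind of input a Gaussian filter or smoother needs for its prediction step.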

preprint · 2020 · arXiv

Temporal Parallelization of Bayesian Smoothers

This paper presents algorithms for temporal parallelization of Bayesian smoothers. We define the elements and the operators to pose these problems as the solutions to all-prefix-sums operations for which efficient parallel scan-algorithms are available. We present the temporal parallelization of the general Bayesian filtering and smoothing equations and specialize them to linear/Gaussian models. The advantage of the proposed algorithms is that they reduce the linear complexity of standard smoothing algorithms with respect to time to logarithmic.
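
The all-prefix-sums idea can be illustrated on a toy problem. The sketch below is not the paper's smoother: it assumes a scalar affine recurrence x_t = a_t x_{t-1} + b_t, represents each step as a pair (a, b), and composes pairs with an associative operator. The names `combine`, `sequential_scan`, and `doubling_scan` are hypothetical. The doubling scan needs about log2(T) rounds, and within a round every combination is independent, so the rounds could be executed in parallel (here they run sequentially in plain Python).

```python
from typing import List, Tuple

Affine = Tuple[float, float]  # (a, b) represents the map x -> a * x + b


def combine(first: Affine, second: Affine) -> Affine:
    """Compose two affine maps: apply `first`, then `second`. Associative."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def sequential_scan(elems: List[Affine]) -> List[Affine]:
    """Inclusive prefix combination with T - 1 dependent steps."""
    out = [elems[0]]
    for e in elems[1:]:
        out.append(combine(out[-1], e))
    return out


def doubling_scan(elems: List[Affine]) -> List[Affine]:
    """Inclusive scan by recursive doubling: ceil(log2 T) rounds."""
    out = list(elems)
    step = 1
    while step < len(out):
        prev = out
        # Position i absorbs the partial result ending `step` positions earlier.
        out = [
            prev[i] if i < step else combine(prev[i - step], prev[i])
            for i in range(len(prev))
        ]
        step *= 2
    return out


elems = [(0.5, 1.0), (2.0, -1.0), (1.5, 0.25), (0.25, 3.0), (1.0, 1.0)]
prefix_seq = sequential_scan(elems)
prefix_par = doubling_scan(elems)
```

Each prefix pair (A_t, B_t) gives the state directly as x_t = A_t x_0 + B_t, so all states are available once the scan finishes.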

preprint · 2020 · arXiv

Worst-case optimal approximation with increasingly flat Gaussian kernels

We study worst-case optimal approximation of positive linear functionals in reproducing kernel Hilbert spaces induced by increasingly flat Gaussian kernels. This provides a new perspective and some generalisations to the problem of interpolation with increasingly flat radial basis functions. When the evaluation points are fixed and unisolvent, we show that the worst-case optimal method converges to a polynomial method. In an additional one-dimensional extension, we also allow the points to be selected optimally and show that in this case convergence is to the unique Gaussian quadrature type method that achieves the maximal polynomial degree of exactness. The proofs are based on an explicit characterisation of the reproducing kernel Hilbert space of the Gaussian kernel in terms of exponentially damped polynomials.

preprint · 2019 · arXiv

Hilbert Space Methods for Reduced-Rank Gaussian Process Regression

This paper proposes a novel scheme for reduced-rank Gaussian process regression. The method is based on an approximate series expansion of the covariance function in terms of an eigenfunction expansion of the Laplace operator in a compact subset of $\mathbb{R}^d$. On this approximate eigenbasis the eigenvalues of the covariance function can be expressed as simple functions of the spectral density of the Gaussian process, which allows the GP inference to be solved under a computational cost scaling as $\mathcal{O}(nm^2)$ (initial) and $\mathcal{O}(m^3)$ (hyperparameter learning) with $m$ basis functions and $n$ data points. Furthermore, the basis functions are independent of the parameters of the covariance function, which allows for very fast hyperparameter learning. The approach also allows for rigorous error analysis with Hilbert space theory, and we show that the approximation becomes exact when the size of the compact subset and the number of eigenfunctions go to infinity. We also show that the convergence rate of the truncation error is independent of the input dimensionality provided that the differentiability order of the covariance function is increased appropriately, and for the squared exponential covariance function it is always bounded by ${\sim}1/m$ regardless of the input dimensionality. The expansion generalizes to Hilbert spaces with an inner product which is defined as an integral over a specified input density. The method is compared to previously proposed methods theoretically and through empirical tests with simulated and real data.
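
The series expansion can be sketched in one dimension. The choices below are assumptions made for this illustration: the compact subset is the interval [-L, L] with Dirichlet boundary conditions, the covariance function is the squared exponential, and the function names are hypothetical. The Laplace eigenfunctions on this interval are scaled sines, and each is weighted by the spectral density evaluated at the square root of its eigenvalue.

```python
import math


def se_spectral_density(omega: float, sigma: float = 1.0, ell: float = 1.0) -> float:
    """Spectral density of the 1-D squared exponential covariance function."""
    return sigma ** 2 * math.sqrt(2.0 * math.pi) * ell * math.exp(-0.5 * (ell * omega) ** 2)


def eigenfunction(j: int, x: float, L: float) -> float:
    """j-th Dirichlet eigenfunction of the Laplacian on [-L, L], with unit norm."""
    return math.sin(math.pi * j * (x + L) / (2.0 * L)) / math.sqrt(L)


def approx_kernel(x: float, y: float, m: int = 64, L: float = 5.0,
                  sigma: float = 1.0, ell: float = 1.0) -> float:
    """Reduced-rank approximation of k(x, y) using m basis functions."""
    total = 0.0
    for j in range(1, m + 1):
        sqrt_eig = math.pi * j / (2.0 * L)  # square root of the j-th eigenvalue
        total += (se_spectral_density(sqrt_eig, sigma, ell)
                  * eigenfunction(j, x, L) * eigenfunction(j, y, L))
    return total


def exact_kernel(x: float, y: float, sigma: float = 1.0, ell: float = 1.0) -> float:
    """Squared exponential covariance function, for comparison."""
    return sigma ** 2 * math.exp(-0.5 * ((x - y) / ell) ** 2)


k_approx = approx_kernel(0.3, -0.2)
k_exact = exact_kernel(0.3, -0.2)
```

Note that `eigenfunction` does not depend on `sigma` or `ell`: only the weights change with the hyperparameters, which is the property the abstract credits for fast hyperparameter learning. For inputs well inside the interval the two kernel values agree closely, and the approximation degrades towards the boundary.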