Researcher profile

Ajay Jasra

Ajay Jasra contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
19 works
0 followers
12 topics
4 close collaborators

Published work

19 published item(s)

preprint · 2026 · arXiv

New Trends in the Stability of Sinkhorn Semigroups

Entropic optimal transport problems play an increasingly important role in machine learning and generative modelling. In contrast with optimal transport maps, which often have limited applicability in high dimensions, Schrödinger bridges can be solved using the celebrated Sinkhorn algorithm, a.k.a. the iterative proportional fitting procedure. The stability of Sinkhorn bridges as the number of iterations tends to infinity is a very active research area in applied probability and machine learning. Traditional proofs of convergence are mainly based on nonlinear versions of Perron-Frobenius theory and related Hilbert projective metric techniques, gradient descent, Bregman divergence techniques and Hamilton-Jacobi-Bellman equations, including propagation of convexity profiles based on coupling diffusions by reflection methods. The objective of this review article is to present, in a self-contained manner, recently developed Sinkhorn/Gibbs-type semigroup analysis based upon contraction coefficients and Lyapunov-type operator-theoretic techniques. These powerful, off-the-shelf semigroup methods are based upon transportation cost inequalities (e.g. log-Sobolev, Talagrand quadratic inequality, curvature estimates), $ϕ$-divergences, Kantorovich-type criteria and Dobrushin contraction-type coefficients on weighted Banach spaces as well as Wasserstein distances. This novel semigroup analysis allows one to unify and simplify many arguments in the stability of the Sinkhorn algorithm. It also yields new contraction estimates w.r.t. generalized $ϕ$-entropies, as well as weighted total variation norms, Kantorovich criteria and Wasserstein distances.
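As a point of orientation for the abstract above, here is a minimal sketch of the Sinkhorn (iterative proportional fitting) iteration on a small discrete problem. It is not taken from the paper: the marginals `mu` and `nu`, the squared-distance cost, the regularisation `eps` and the iteration count are all illustrative assumptions.

```python
import math

def sinkhorn(mu, nu, cost, eps=0.5, n_iter=500):
    """Return an approximate entropic-OT coupling between mu and nu."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K_ij = exp(-c_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so that the first marginal matches mu
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the second marginal matches nu
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical toy problem: three source points, two target points
mu = [0.2, 0.5, 0.3]
nu = [0.4, 0.6]
cost = [[(i - j) ** 2 for j in range(2)] for i in range(3)]

plan = sinkhorn(mu, nu, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(2)]
```

Each sweep alternately enforces one marginal constraint. The stability question discussed in the abstract concerns how fast the other marginal, and the coupling itself, settle as the number of sweeps grows.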

preprint · 2023 · arXiv

Antithetic Multilevel Particle Filters

In this paper we consider the filtering of partially observed multi-dimensional diffusion processes that are observed regularly at discrete times. This is a challenging problem which requires the use of advanced numerical schemes based upon time-discretization of the diffusion process and then the application of particle filters. Perhaps the state-of-the-art method for moderate dimensional problems is the multilevel particle filter of \cite{mlpf}. This is a method that combines multilevel Monte Carlo and particle filters. The approach in that article is based intrinsically upon an Euler discretization method. We develop a new particle filter based upon the antithetic truncated Milstein scheme of \cite{ml_anti}. We show that, for a class of diffusion problems and for $ε>0$ given, the cost to produce a mean square error (MSE) of $\mathcal{O}(ε^2)$ in estimation of the filter is $\mathcal{O}(ε^{-2}\log(ε)^2)$. In the case of multidimensional diffusions with non-constant diffusion coefficient, the method of \cite{mlpf} has a cost of $\mathcal{O}(ε^{-2.5})$ to achieve the same MSE. We support our theory with numerical results in several examples.

preprint · 2022 · arXiv

A Lagged Particle Filter for Stable Filtering of certain High-Dimensional State-Space Models

We consider the problem of high-dimensional filtering of state-space models (SSMs) at discrete times. This problem is particularly challenging as analytical solutions are typically not available and many numerical approximation methods can have a cost that scales exponentially with the dimension of the hidden state. Inspired by lag-approximation methods for the smoothing problem, we introduce a lagged approximation of the smoothing distribution that is necessarily biased. For certain classes of SSMs, particularly those that forget the initial condition exponentially fast in time, the bias of our approximation is shown to be uniformly controlled in the dimension and exponentially small in time. We develop a sequential Monte Carlo (SMC) method to recursively estimate expectations with respect to our biased filtering distributions. Moreover, we prove for a class of SSMs that can contain dependencies amongst coordinates that, as the dimension $d\rightarrow\infty$, the cost to achieve a stable mean square error in estimation, for classes of expectations, is of $\mathcal{O}(Nd^2)$ per-unit time, where $N$ is the number of simulated samples in the SMC algorithm. Our methodology is implemented on several challenging high-dimensional examples including the conservative shallow-water model.

preprint · 2022 · arXiv

Convergence Speed and Approximation Accuracy of Numerical MCMC

When implementing Markov chain Monte Carlo (MCMC) algorithms, perturbation caused by numerical errors is sometimes inevitable. This paper studies how perturbation of MCMC affects the convergence speed and Monte Carlo estimation accuracy. Our results show that when the original Markov chain converges to stationarity fast enough and the perturbed transition kernel is a good approximation to the original transition kernel, the corresponding perturbed sampler has similar convergence speed and high approximation accuracy as well. We discuss two different analysis frameworks, ergodicity and spectral gap, both of which are widely used in the literature. Our results can be easily extended to obtain non-asymptotic error bounds for MCMC estimators. We also demonstrate how to apply our convergence and approximation results to the analysis of specific sampling algorithms, including random walk Metropolis and the Metropolis-adjusted Langevin algorithm with perturbed target densities, and parallel tempering Monte Carlo with perturbed densities. Finally, we present some simple numerical examples to verify our theoretical claims.

preprint · 2022 · arXiv

Unbiased Estimation of the Vanilla and Deterministic Ensemble Kalman-Bucy Filters

In this article we consider the development of an unbiased estimator for the ensemble Kalman--Bucy filter (EnKBF). The EnKBF is a continuous-time filtering methodology which can be viewed as a continuous-time analogue of the famous discrete-time ensemble Kalman filter. Our unbiased estimators will be motivated from recent work [Rhee & Glynn 2010, [31]] which introduces randomization as a means to produce unbiased and finite variance estimators. The randomization enters through both the level of discretization, and through the number of samples at each level. Our estimator will be specific to linear and Gaussian settings, where we know that the EnKBF is consistent, in the particle limit $N \rightarrow \infty$, with the KBF. We highlight this for two particular variants of the EnKBF, i.e. the deterministic and vanilla variants, and demonstrate this on a linear Ornstein--Uhlenbeck process. We compare this with the EnKBF and the multilevel EnKBF (MLEnKBF), for experiments with varying dimension size. We also provide a proof of the multilevel deterministic EnKBF, which provides a guideline for some of the unbiased methods.

preprint · 2022 · arXiv

Unbiased Parameter Inference for a Class of Partially Observed Levy-Process Models

We consider the problem of static Bayesian inference for partially observed Levy-process models. We develop a methodology which allows one to infer static parameters and some states of the process, without a bias from the time-discretization of the aforementioned Levy process. The unbiased method is exceptionally amenable to parallel implementation and can be computationally efficient relative to competing approaches. We implement the method on S&P 500 log-return daily data and compare it to a Markov chain Monte Carlo (MCMC) algorithm.

preprint · 2021 · arXiv

A 4D-Var Method with Flow-Dependent Background Covariances for the Shallow-Water Equations

The 4D-Var method for filtering partially observed nonlinear chaotic dynamical systems consists of finding the maximum a-posteriori (MAP) estimator of the initial condition of the system given observations over a time window, and propagating it forward to the current time via the model dynamics. This method forms the basis of most currently operational weather forecasting systems. In practice the optimization becomes infeasible if the time window is too long due to the non-convexity of the cost function, the effect of model errors, and the limited precision of the ODE solvers. Hence the window has to be kept sufficiently short, and the observations in the previous windows can be taken into account via a Gaussian background (prior) distribution. The choice of the background covariance matrix is an important question that has received much attention in the literature. In this paper, we define the background covariances in a principled manner, based on observations in the previous $b$ assimilation windows, for a parameter $b\ge 1$. The method is at most $b$ times more computationally expensive than using fixed background covariances, requires little tuning, and greatly improves the accuracy of 4D-Var. As a concrete example, we focus on the shallow-water equations. The proposed method is compared against state-of-the-art approaches in data assimilation and is shown to perform favourably on simulated data. We also illustrate our approach on data from the recent tsunami of 2011 in Fukushima, Japan.

preprint · 2021 · arXiv

Log-Normalization Constant Estimation using the Ensemble Kalman-Bucy Filter with Application to High-Dimensional Models

In this article we consider the estimation of the log-normalization constant associated to a class of continuous-time filtering models. In particular, we consider estimates from the ensemble Kalman-Bucy filter, based upon several nonlinear Kalman-Bucy diffusions. Based upon new conditional bias results for the mean of the aforementioned methods, we analyze the empirical log-scale normalization constants in terms of their $\mathbb{L}_n$-errors and conditional bias. Depending on the type of nonlinear Kalman-Bucy diffusion, we show that these are of order $(\sqrt{t/N}) + t/N$ or $1/\sqrt{N}$ ($\mathbb{L}_n$-errors) and of order $[t+\sqrt{t}]/N$ or $1/N$ (conditional bias), where $t$ is the time horizon and $N$ is the ensemble size. Finally, we use these results for online static parameter estimation for the above filtering models and implement the methodology for both linear and nonlinear models.

preprint · 2021 · arXiv

On Unbiased Estimation for Discretized Models

In this article, we consider computing expectations w.r.t. probability measures which are subject to discretization error. Examples include partially observed diffusion processes or inverse problems, where one may have to discretize time and/or space, in order to practically work with the probability of interest. Given access only to these discretizations, we consider the construction of unbiased Monte Carlo estimators of expectations w.r.t. such target probability distributions. It is shown how to obtain such estimators using a novel adaptation of randomization schemes and Markov simulation methods. Under appropriate assumptions, these estimators possess finite variance and finite expected cost. There are two important consequences of this approach: (i) unbiased inference is achieved at the canonical complexity rate, and (ii) the resulting estimators can be generated independently, thereby allowing strong scaling to arbitrarily many parallel processors. Several algorithms are presented, and applied to some examples of Bayesian inference problems, with both simulated and real observed data.

preprint · 2021 · arXiv

Unbiased inference for discretely observed hidden Markov model diffusions

We develop a Bayesian inference method for diffusions observed discretely and with noise, which is free of discretisation bias. Unlike existing unbiased inference methods, our method does not rely on exact simulation techniques. Instead, our method uses standard time-discretised approximations of diffusions, such as the Euler--Maruyama scheme. Our approach is based on particle marginal Metropolis--Hastings, a particle filter, randomised multilevel Monte Carlo, and importance sampling type correction of approximate Markov chain Monte Carlo. The resulting estimator leads to inference without a bias from the time-discretisation as the number of Markov chain iterations increases. We give convergence results and recommend allocations for algorithm inputs. Our method admits a straightforward parallelisation, and can be computationally efficient. The user-friendly approach is illustrated on three examples, where the underlying diffusion is an Ornstein--Uhlenbeck process, a geometric Brownian motion, and a 2d non-reversible Langevin equation.
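The Euler--Maruyama scheme mentioned above can be sketched for the Ornstein--Uhlenbeck example as follows. This shows only the time-discretised simulation step, not the inference method of the paper. The function name, the parameter values and the number of paths are hypothetical choices made for illustration.

```python
import math
import random

def euler_maruyama_ou(x0, theta, sigma, T, n_steps, rng):
    """Simulate dX = -theta * X dt + sigma dW on [0, T]; return X_T."""
    h = T / n_steps
    x = x0
    for _ in range(n_steps):
        # One Euler--Maruyama step: drift * h plus diffusion * sqrt(h) * N(0, 1)
        x += -theta * x * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(1)
# Illustrative parameters (assumed, not from the paper)
paths = [euler_maruyama_ou(1.0, 1.0, 0.5, 1.0, 100, rng) for _ in range(5000)]
mc_mean = sum(paths) / len(paths)
# The exact mean of X_1 is x0 * exp(-theta * T); the Euler mean is
# x0 * (1 - theta * h) ** n_steps, so a small discretisation bias remains.
```

The residual gap between the Euler mean and the exact mean is the discretisation bias that the method described in the abstract is designed to remove.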

preprint · 2020 · arXiv

A practical and efficient approach for Bayesian quantum state estimation

Bayesian inference is a powerful paradigm for quantum state tomography, treating uncertainty in meaningful and informative ways. Yet the numerical challenges associated with sampling from complex probability distributions hamper Bayesian tomography in practical settings. In this Article, we introduce an improved, self-contained approach for Bayesian quantum state estimation. Leveraging advances in machine learning and statistics, our formulation relies on highly efficient preconditioned Crank--Nicolson sampling and a pseudo-likelihood. We theoretically analyze the computational cost, and provide explicit examples of inference for both actual and simulated datasets, illustrating improved performance with respect to existing approaches.
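For readers unfamiliar with preconditioned Crank--Nicolson (pCN) sampling, the sketch below applies the generic pCN update to a scalar toy problem with a standard Gaussian prior. It is not the quantum-state sampler of the paper: the data value `y`, the noise level `noise_sd`, the step size `beta` and the function names are assumptions chosen so that the posterior is known in closed form.

```python
import math
import random

def pcn(neg_log_lik, n_iter, beta, rng, x0=0.0):
    """pCN chain targeting prior N(0, 1) times exp(-neg_log_lik)."""
    x = x0
    phi_x = neg_log_lik(x)
    chain = []
    for _ in range(n_iter):
        # The proposal leaves the N(0, 1) prior invariant
        prop = math.sqrt(1.0 - beta ** 2) * x + beta * rng.gauss(0.0, 1.0)
        phi_prop = neg_log_lik(prop)
        # Accept with probability min(1, exp(phi_x - phi_prop));
        # only the likelihood enters the acceptance ratio
        if math.log(1.0 - rng.random()) < phi_x - phi_prop:
            x, phi_x = prop, phi_prop
        chain.append(x)
    return chain

# Hypothetical toy data: one observation y = x + N(0, noise_sd^2)
y, noise_sd = 1.0, 1.0
chain = pcn(lambda x: 0.5 * ((y - x) / noise_sd) ** 2, 20000, 0.5, random.Random(2))
kept = chain[2000:]  # discard burn-in
post_mean = sum(kept) / len(kept)
# Closed-form posterior mean for this toy model: y / (1 + noise_sd^2) = 0.5
```

Because the proposal preserves the prior, the acceptance rule depends only on the likelihood, which is the property that makes pCN attractive when the prior is high-dimensional.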

preprint · 2020 · arXiv

A Wasserstein Coupled Particle Filter for Multilevel Estimation

In this paper, we consider the filtering problem for partially observed diffusions, which are regularly observed at discrete times. We are concerned with the case when one must resort to time-discretization of the diffusion process if the transition density is not available in an appropriate form. In such cases, one must resort to advanced numerical algorithms such as particle filters to consistently estimate the filter. It is also well known that the particle filter can be enhanced by considering hierarchies of discretizations and the multilevel Monte Carlo (MLMC) method, in the sense of reducing the computational effort to achieve a given mean square error (MSE). A variety of multilevel particle filters (MLPF) have been suggested in the literature, e.g., in Jasra et al., SIAM J. Numer. Anal., 55, 3068--3096. Here we introduce a new alternative that involves a resampling step based on the optimal Wasserstein coupling. We prove a central limit theorem (CLT) for the new method. On considering the asymptotic variance, we establish that in some scenarios, there is a reduction, relative to the approach in the aforementioned paper by Jasra et al., in computational effort to achieve a given MSE. These findings are confirmed in numerical examples. We also consider filtering diffusions with unstable dynamics; we empirically show that in such cases a change of measure technique seems to be required to maintain our findings.
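The baseline that the abstract above builds on, a particle filter applied to a time-discretised diffusion, can be sketched as below. This is a plain single-level bootstrap filter with multinomial resampling, not the Wasserstein-coupled multilevel method of the paper. The Ornstein--Uhlenbeck signal, the unit observation spacing, the observation values and all parameter names are assumptions for illustration.

```python
import math
import random

def bootstrap_pf(obs, n_particles, theta, sigma, obs_sd, n_euler, rng):
    """Filter means for an Euler-discretised OU signal observed at unit times."""
    h = 1.0 / n_euler
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # N(0, 1) prior
    means = []
    for y in obs:
        # Propagate each particle over one unit of time with Euler steps
        for i in range(n_particles):
            x = particles[i]
            for _ in range(n_euler):
                x += -theta * x * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
            particles[i] = x
        # Weight by the Gaussian observation density (up to a constant)
        logw = [-0.5 * ((y - x) / obs_sd) ** 2 for x in particles]
        mx = max(logw)
        w = [math.exp(lw - mx) for lw in logw]
        total = sum(w)
        w = [wi / total for wi in w]
        means.append(sum(wi * x for wi, x in zip(w, particles)))
        # Multinomial resampling according to the normalised weights
        particles = rng.choices(particles, weights=w, k=n_particles)
    return means

obs = [0.3, -0.1, 0.5, 0.2]  # hypothetical observations
means = bootstrap_pf(obs, 2000, 1.0, 1.0, 0.5, 10, random.Random(3))
```

Since the Euler-discretised model here is linear and Gaussian, its filter means can be checked against a Kalman filter. Multilevel variants run coupled pairs of such filters at consecutive step sizes and differ mainly in how the resampling step keeps each pair correlated.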

preprint · 2020 · arXiv

Multi-Index Sequential Monte Carlo Methods for partially observed Stochastic Partial Differential Equations

In this paper we consider sequential joint state and static parameter estimation given discrete time observations associated to a partially observed stochastic partial differential equation (SPDE). It is assumed that one can only estimate the hidden state using a discretization of the model. In this context, it is known that the multi-index Monte Carlo (MIMC) method of [11] can be used to improve over direct Monte Carlo from the most precise discretization. However, in the context of interest, it cannot be directly applied, but rather must be used within another advanced method such as sequential Monte Carlo (SMC). We show how one can use the MIMC method by renormalizing the MI identity and approximating the resulting identity using the SMC$^2$ method of [5]. We prove that our approach can reduce the cost to obtain a given mean square error (MSE), relative to just using SMC$^2$ on the most precise discretization. We demonstrate this with some numerical examples.

preprint · 2020 · arXiv

Multilevel Particle Filters for the Non-Linear Filtering Problem in Continuous Time

In the following article we consider the numerical approximation of the non-linear filter in continuous-time, where the observations and signal follow diffusion processes. Given access to high-frequency, but discrete-time observations, we resort to a first order time discretization of the non-linear filter, followed by an Euler discretization of the signal dynamics. In order to approximate the associated discretized non-linear filter, one can use a particle filter (PF). Under assumptions, this can achieve a mean square error of $\mathcal{O}(ε^2)$, for $ε>0$ arbitrary, such that the associated cost is $\mathcal{O}(ε^{-4})$. We prove, under assumptions, that the multilevel particle filter (MLPF) of Jasra et al (2017) can achieve a mean square error of $\mathcal{O}(ε^2)$, for cost $\mathcal{O}(ε^{-3})$. This is supported by numerical simulations in several examples.

preprint · 2020 · arXiv

Unbiased Estimation of the Gradient of the Log-Likelihood in Inverse Problems

We consider the problem of estimating a parameter associated to a Bayesian inverse problem. Treating the unknown initial condition as a nuisance parameter, typically one must resort to a numerical approximation of the gradient of the log-likelihood and also adopt a discretization of the problem in space and/or time. We develop a new methodology to unbiasedly estimate the gradient of the log-likelihood with respect to the unknown parameter, i.e. the expectation of the estimate has no discretization bias. Such a property is not only useful for estimation in terms of the original stochastic model of interest, but can be used in stochastic gradient algorithms which benefit from unbiased estimates. Under appropriate assumptions, we prove that our estimator is not only unbiased but of finite variance. In addition, when implemented on a single processor, we show that the cost to achieve a given level of error is comparable to multilevel Monte Carlo methods, both practically and theoretically. However, the new algorithm provides the possibility for parallel computation on arbitrarily many processors without any loss of efficiency, asymptotically. In practice, this means any precision can be achieved in a fixed, finite constant time, provided that enough processors are available.

preprint · 2020 · arXiv

Unbiased Estimation of the Solution to Zakai's Equation

In the following article we consider the non-linear filtering problem in continuous-time and in particular the solution to Zakai's equation or the normalizing constant. We develop a methodology to produce finite variance, almost surely unbiased estimators of the solution to Zakai's equation. That is, given access to only a first order discretization of the solution to the Zakai equation, we present a method which can remove this discretization bias. The approach, under assumptions, is proved to have finite variance and is numerically compared to using a particular multilevel Monte Carlo method.

preprint · 2020 · arXiv

Unbiased Filtering of a Class of Partially Observed Diffusions

In this article we consider a Monte Carlo-based method to filter partially observed diffusions observed at regular and discrete times. Given access only to Euler discretizations of the diffusion process, we present a new procedure which can return online estimates of the filtering distribution with no discretization bias and finite variance. Our approach is based upon a novel double application of the randomization methods of Rhee & Glynn (2015) along with the multilevel particle filter (MLPF) approach of Jasra et al (2017). A numerical comparison of our new approach with the MLPF, on a single processor, shows that similar errors are possible for a mild increase in computational cost. However, the new method scales strongly to arbitrarily many processors.
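The randomization idea attributed to Rhee & Glynn above can be illustrated with the single-term estimator on a deterministic toy sequence. This is a sketch of the general debiasing device only, not of the double randomization used in the paper. The function `approx` (a hypothetical biased approximation converging to 1), the geometric level distribution and its parameter `r` are assumptions.

```python
import math
import random

def approx(level):
    """Hypothetical biased approximation at a given level; converges to 1.0."""
    return 1.0 - 2.0 ** -(level + 1)

def single_term_estimate(rng, r=0.6):
    # Draw a random level L with P(L = l) = (1 - r) * r**l, l = 0, 1, 2, ...
    u = 1.0 - rng.random()  # uniform on (0, 1]
    level = int(math.log(u) / math.log(r))
    prob = (1.0 - r) * r ** level
    # Increment between consecutive levels (the level "-1" value is taken as 0)
    prev = approx(level - 1) if level > 0 else 0.0
    # Reweighting by 1 / prob makes the expectation the full telescoping sum
    return (approx(level) - prev) / prob

rng = random.Random(0)
samples = [single_term_estimate(rng) for _ in range(20000)]
estimate = sum(samples) / len(samples)
```

No single level is exact, yet the average of independent draws targets the limit 1.0 rather than any fixed-level value. The draws are independent of each other, which is why estimators of this kind parallelise so readily. Finite variance requires the increments to shrink faster than the level probabilities, which holds here because the increments decay like 2^-l while the probabilities decay like 0.6^l.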

preprint · 2020 · arXiv

Uncertainty modelling and computational aspects of data association

A novel solution to the smoothing problem for multi-object dynamical systems is proposed and evaluated. The systems of interest contain an unknown and varying number of dynamical objects that are partially observed under noisy and corrupted observations. An alternative representation of uncertainty is considered in order to account for the lack of information about the different aspects of this type of complex system. The corresponding statistical model can be formulated as a hierarchical model consisting of conditionally-independent hidden Markov models. This particular structure is leveraged to propose an efficient method in the context of Markov chain Monte Carlo (MCMC) by relying on an approximate solution to the corresponding filtering problem, in a similar fashion to particle MCMC. This approach is shown to outperform existing algorithms in a range of scenarios.

preprint · 2018 · arXiv

Central Limit Theorems for Coupled Particle Filters

In this article we prove a new central limit theorem (CLT) for coupled particle filters (CPFs). CPFs are used for the sequential estimation of the difference of expectations w.r.t. filters which are in some sense close. Examples include the estimation of the filtering distribution associated to different parameters (finite difference estimation) and filters associated to partially observed discretized diffusion processes (PODDP) and the implementation of the multilevel Monte Carlo (MLMC) identity. We develop new theory for CPFs and based upon several results, we propose a new CPF which approximates the maximal coupling (MCPF) of a pair of predictor distributions. In the context of ML estimation associated to PODDP with discretization $Δ_l$ we show that the MCPF and the approach in Jasra et al. (2018) have, under assumptions, an asymptotic variance that is upper-bounded by an expression that is (almost) $\mathcal{O}(Δ_l)$, uniformly in time. The $\mathcal{O}(Δ_l)$ rate preserves the so-called forward rate of the diffusion in some scenarios which is not the case for the CPF in Jasra et al (2017).
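To make the notion of a maximal coupling concrete, the sketch below samples a pair of indices from two discrete distributions so that the indices coincide with the largest possible probability, which is the sum of the elementwise minima. It is a textbook construction, not the MCPF algorithm of the paper. The vectors `p` and `q` stand in for two sets of normalised resampling weights and are made up for illustration.

```python
import random

def sample_discrete(weights, rng):
    """Draw an index with probability proportional to the given weights."""
    u = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if u < acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off

def maximal_coupling(p, q, rng):
    """Return (i, j) with i ~ p, j ~ q and P(i == j) = sum_k min(p_k, q_k)."""
    overlap = [min(a, b) for a, b in zip(p, q)]
    alpha = sum(overlap)
    if rng.random() < alpha:
        # Common component: both indices are equal
        i = sample_discrete(overlap, rng)
        return i, i
    # Residual components have disjoint supports, so the indices differ
    res_p = [a - o for a, o in zip(p, overlap)]
    res_q = [b - o for b, o in zip(q, overlap)]
    return sample_discrete(res_p, rng), sample_discrete(res_q, rng)

p = [0.5, 0.3, 0.2]  # hypothetical weights for the first filter
q = [0.2, 0.3, 0.5]  # hypothetical weights for the second filter
rng = random.Random(4)
pairs = [maximal_coupling(p, q, rng) for _ in range(20000)]
frac_equal = sum(i == j for i, j in pairs) / len(pairs)
frac_first_zero = sum(i == 0 for i, _ in pairs) / len(pairs)
```

For these weights the overlap is 0.2 + 0.3 + 0.2 = 0.7, so about 70% of the sampled pairs should share an index while each marginal is preserved.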