Source author record

Prashant G. Mehta

Prashant G. Mehta appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: math.OC, math.PR, eess.SY, math.NA, Systems and Control, Machine Learning, Computer Science and Game Theory, math.CO, math.ST, Multiagent Systems, nlin.AO, Statistics Theory

Catalog footprint: 25 works, 12 topics, 4 close collaborators


preprint, 2026, arXiv

Transformer-like Inference from Optimal Control

Decoder-only transformers compute the conditional probability of the next token from a sequence of past observations. This paper derives, from first principles, inference architectures that solve the same prediction problem - and in doing so, recovers transformer-like layer operations as a consequence of optimal control theory. The framework is developed for two model classes: a nonlinear model of discrete-valued processes, directly motivated by the transformer, and a linear Gaussian model as a tractable baseline. For both model classes, the prediction objective is reformulated as an optimal control problem whose solution yields an explicit inference algorithm, the dual filter, with a layer structure that mirrors the layer structure of a decoder-only transformer. Numerical experiments provide a comparison of the optimal control to attention weights from a trained transformer. These experiments reveal that when the embedding dimension is insufficient, the transformer implicitly exploits non-Markovian structure.

preprint, 2022, arXiv

Controlled Interacting Particle Algorithms for Simulation-based Reinforcement Learning

This paper is concerned with optimal control problems for control systems in continuous time, and interacting particle system methods designed to construct approximate control solutions. Particular attention is given to the linear quadratic (LQ) control problem. There is a growing interest in re-visiting this classical problem, in part due to the successes of reinforcement learning (RL). The main question of this body of research (and also of our paper) is to approximate the optimal control law without explicitly solving the Riccati equation. A novel simulation-based algorithm, namely a dual ensemble Kalman filter (EnKF), is introduced. The algorithm is used to obtain formulae for optimal control, expressed entirely in terms of the EnKF particles. An extension to the nonlinear case is also presented. The theoretical results and algorithms are illustrated with numerical experiments.

preprint, 2022, arXiv

Duality for Nonlinear Filtering I: Observability

This paper is concerned with the development and use of duality theory for a hidden Markov model (HMM) with white noise observations. The main contribution of this work is to introduce a backward stochastic differential equation (BSDE) as a dual control system. A key outcome is that stochastic observability (resp. detectability) of the HMM is expressed in dual terms: as controllability (resp. stabilizability) of the dual control system. All aspects of controllability, namely, definition of controllable space and controllability gramian, along with their properties and explicit formulae, are discussed. The proposed duality is shown to be an exact extension of the classical duality in linear systems theory. One can then relate and compare the linear and the nonlinear systems. A side-by-side summary of this relationship is given in a tabular form (Table II).

preprint, 2022, arXiv

Duality for Nonlinear Filtering II: Optimal Control

This paper is concerned with the development and use of duality theory for a nonlinear filtering model with white noise observations. The main contribution of this paper is to introduce a stochastic optimal control problem as a dual to the nonlinear filtering problem. The mathematical statement of the dual relationship between the two problems is given in the form of a duality principle. The constraint for the optimal control problem is the backward stochastic differential equation (BSDE) introduced in the companion paper. The optimal control solution is obtained from an application of the maximum principle, and subsequently used to derive the equation of the nonlinear filter. The proposed duality is shown to be an exact extension of the classical Kalman-Bucy duality, and different from other types of optimal control and variational formulations given in literature.

preprint, 2022, arXiv

How does a Rational Agent Act in an Epidemic?

Evolution of disease in a large population is a function of the top-down policy measures from a centralized planner, as well as the self-interested decisions (to be socially active) of individual agents in a large heterogeneous population. This paper is concerned with understanding the latter based on a mean-field type optimal control model. Specifically, the model is used to investigate the role of partial information on an agent's decision-making, and study the impact of such decisions by a large number of agents on the spread of the virus in the population. The motivation comes from the presymptomatic and asymptomatic spread of the COVID-19 virus where an agent unwittingly spreads the virus. We show that even in a setting with fully rational agents, limited information on the viral state can result in an epidemic growth.

preprint, 2022, arXiv

Optimality vs Stability Trade-off in Ensemble Kalman Filters

This paper is concerned with optimality and stability analysis of a family of ensemble Kalman filter (EnKF) algorithms. EnKF is commonly used as an alternative to the Kalman filter for high-dimensional problems, where storing the covariance matrix is computationally expensive. The algorithm consists of an ensemble of interacting particles driven by a feedback control law. The control law is designed such that, in the linear Gaussian setting and asymptotic limit of infinitely many particles, the mean and covariance of the particles follow the exact mean and covariance of the Kalman filter. The problem of finding a control law that is exact does not have a unique solution, reminiscent of the problem of finding a transport map between two distributions. A unique control law can be identified by introducing control cost functions, that are motivated by the optimal transportation problem or Schrödinger bridge problem. The objective of this paper is to study the relationship between optimality and long-term stability of a family of exact control laws. Remarkably, the control law that is optimal in the optimal transportation sense leads to an EnKF algorithm that is not stable.

preprint, 2021, arXiv

Feedback Particle Filter for Collective Inference

The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number ($M$) of non-interacting agents (targets) with a large number ($M$) of non-agent specific observations (measurements) that originate from these agents. In its basic form, the problem is characterized by data association uncertainty whereby the association between the observations and agents must be deduced in addition to the agent state. In this paper, the large-$M$ limit is interpreted as a problem of collective inference. This viewpoint is used to derive the equation for the empirical distribution of the hidden agent states. A feedback particle filter (FPF) algorithm for this problem is presented and illustrated via numerical simulations. Results are presented for the Euclidean and the finite state-space cases, both in continuous-time settings. The classical FPF algorithm is shown to be the special case (with $M=1$) of these more general results. The simulations help show that the algorithm well approximates the empirical distribution of the hidden states for large $M$.

preprint, 2021, arXiv

Optimal Transportation Methods in Nonlinear Filtering: The feedback particle filter

Feedback particle filter (FPF) is a Monte-Carlo (MC) algorithm to approximate the solution of a stochastic filtering problem. In contrast to conventional particle filters, the Bayesian update step in FPF is implemented via a mean-field type feedback control law. The objective for this paper is to situate the development of FPF and related controlled interacting particle system algorithms within the framework of optimal transportation theory. Starting from the simplest setting of the Bayes' update formula, a coupling viewpoint is introduced to construct particle filters. It is shown that the conventional importance sampling resampling particle filter implements an independent coupling. Design of optimal couplings is introduced first for the simple Gaussian settings and subsequently extended to derive the FPF algorithm. The final half of the paper provides a review of some of the salient aspects of the FPF algorithm including the feedback structure, algorithms for gain function design, and comparison with conventional particle filters. The comparison serves to illustrate the benefit of feedback in particle filtering.
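
The coupling viewpoint in the simple Gaussian setting can be illustrated with a short sketch. It assumes a scalar state, a linear observation y = h*x + noise with noise variance r, and a Gaussian prior represented by particles; the function name and all parameter values are hypothetical and are not taken from the paper. In one dimension the optimal transport map between two Gaussians is affine, so each prior particle is moved deterministically instead of being reweighted and resampled.

```python
import math
import random

def gaussian_transport_update(particles, y, h, r):
    """Move scalar prior particles to the posterior with an affine map."""
    n = len(particles)
    m = sum(particles) / n                          # empirical prior mean
    p = sum((x - m) ** 2 for x in particles) / n    # empirical prior variance
    k = p * h / (h * h * p + r)                     # Kalman gain
    m_post = m + k * (y - h * m)                    # Kalman posterior mean
    p_post = (1.0 - k * h) * p                      # Kalman posterior variance
    scale = math.sqrt(p_post / p)                   # slope of the affine map
    moved = [m_post + scale * (x - m) for x in particles]
    return moved, m_post, p_post

# Hypothetical numbers, chosen only for illustration.
random.seed(0)
prior = [random.gauss(1.0, 2.0) for _ in range(500)]
posterior, m_post, p_post = gaussian_transport_update(prior, y=0.3, h=1.0, r=0.5)
```

By construction, the sample mean and variance of the moved particles equal the Kalman update computed from the prior sample statistics, with no resampling noise. This is only the linear Gaussian special case; the FPF itself covers nonlinear, non-Gaussian problems.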

preprint, 2020, arXiv

A Dual Characterization of Observability for Stochastic Systems

This paper is concerned with a characterization of observability for a continuous-time hidden Markov model where the state evolves as a general continuous-time Markov process and the observation process is modeled as a nonlinear function of the state corrupted by Gaussian measurement noise. The main technical tool is based on the recently discovered duality relationship between minimum variance estimation and stochastic optimal control: observability is defined as the dual of controllability for a certain backward stochastic differential equation. Based on the dual formulation, a test for observability is presented and related to the literature. The proposed duality-based framework allows one to easily relate and compare the linear and the nonlinear systems. A side-by-side summary of this relationship is given in a tabular form (Table 1).

preprint, 2020, arXiv

Convex Q-Learning, Part 1: Deterministic Optimal Control

It is well known that the extension of Watkins' algorithm to general function approximation settings is challenging: does the projected Bellman equation have a solution? If so, is the solution useful in the sense of generating a good policy? And, if the preceding questions are answered in the affirmative, is the algorithm consistent? These questions are unanswered even in the special case of Q-function approximations that are linear in the parameter. The challenge seems paradoxical, given the long history of convex analytic approaches to dynamic programming. The paper begins with a brief survey of linear programming approaches to optimal control, leading to a particular over-parameterization that lends itself to applications in reinforcement learning. The main conclusions are summarized as follows: (i) A new class of convex Q-learning algorithms is introduced based on the convex relaxation of the Bellman equation. Convergence is established under general conditions, including a linear function approximation for the Q-function. (ii) A batch implementation appears similar to the famed DQN algorithm (one engine behind AlphaZero). It is shown that in fact the algorithms are very different: while convex Q-learning solves a convex program that approximates the Bellman equation, theory for DQN is no stronger than for Watkins' algorithm with function approximation: (a) it is shown that both seek solutions to the same fixed point equation, and (b) the ODE approximations for the two algorithms coincide, and little is known about the stability of this ODE. These results are obtained for deterministic nonlinear systems with total cost criterion. Many extensions are proposed, including kernel implementation, and extension to MDP models.

preprint, 2020, arXiv

Deep FPF: Gain function approximation in high-dimensional setting

In this paper, we present a novel approach to approximate the gain function of the feedback particle filter (FPF). The exact gain function is the solution of a Poisson equation involving a probability-weighted Laplacian. The numerical problem is to approximate the exact gain function using only finitely many particles sampled from the probability distribution. Inspired by the recent success of the deep learning methods, we represent the gain function as a gradient of the output of a neural network. Thereupon considering a certain variational formulation of the Poisson equation, an optimization problem is posed for learning the weights of the neural network. A stochastic gradient algorithm is described for this purpose. The proposed approach has two significant properties/advantages: (i) The stochastic optimization algorithm allows one to process, in parallel, only a batch of samples (particles) ensuring good scaling properties with the number of particles; (ii) The remarkable representation power of neural networks means that the algorithm is potentially applicable and useful to solve high-dimensional problems. We numerically establish these two properties and provide extensive comparison to the existing approaches.

preprint, 2020, arXiv

On the Lyapunov Foster criterion and Poincaré inequality for Reversible Markov Chains

This paper presents an elementary proof of stochastic stability of a discrete-time reversible Markov chain starting from a Foster-Lyapunov drift condition. Besides its relative simplicity, there are two salient features of the proof: (i) it relies entirely on functional-analytic non-probabilistic arguments; and (ii) it makes explicit the connection between a Foster-Lyapunov function and Poincaré inequality. The proof is used to derive an explicit bound for the spectral gap. An extension to the non-reversible case is also presented.

preprint, 2016, arXiv

Attitude Estimation with Feedback Particle Filter

This paper presents theory, application, and comparisons of the feedback particle filter (FPF) algorithm for the problem of attitude estimation. The paper builds upon our recent work on the exact FPF solution of the continuous-time nonlinear filtering problem on compact Lie groups. In this paper, the details of the FPF algorithm are presented for the problem of attitude estimation - a nonlinear filtering problem on SO(3). The quaternions are employed for computational purposes. The algorithm requires a numerical solution of the filter gain function, and two methods are applied for this purpose. Comparisons are also provided between the FPF and some popular algorithms for attitude estimation on SO(3), including the invariant EKF, the multiplicative EKF, and the unscented Kalman filter. Simulation results are presented that help illustrate the comparisons.

preprint, 2016, arXiv

Error Estimates for the Kernel Gain Function Approximation in the Feedback Particle Filter

This paper is concerned with the analysis of the kernel-based algorithm for gain function approximation in the feedback particle filter. The exact gain function is the solution of a Poisson equation involving a probability-weighted Laplacian. The kernel-based method -- introduced in our prior work -- allows one to approximate this solution using only particles sampled from the probability distribution. This paper describes new representations and algorithms based on the kernel-based method. Theory surrounding the approximation is improved and a novel formula for the gain function approximation is derived. A procedure for carrying out error analysis of the approximation is introduced. Certain asymptotic estimates for bias and variance are derived for the general nonlinear non-Gaussian case. Comparison with the constant gain function approximation is provided. The results are illustrated with the aid of some numerical experiments.
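
For orientation, the constant gain baseline mentioned above can be sketched in a few lines. This is not the kernel-based algorithm analyzed in the paper. It assumes a scalar state and the common form of the constant gain found in the FPF literature, namely the sample covariance between the observation function h and the state, with any observation-noise normalization left out because conventions differ. The function name and the test densities are hypothetical.

```python
import random

def constant_gain(particles, h):
    """Constant-gain approximation: K ~ (1/N) * sum_i (h(x_i) - h_hat) * x_i."""
    n = len(particles)
    h_vals = [h(x) for x in particles]
    h_hat = sum(h_vals) / n                 # particle estimate of E[h(X)]
    return sum((hv - h_hat) * x for hv, x in zip(h_vals, particles)) / n

# Hypothetical particle set, for illustration only.
random.seed(1)
particles = [random.gauss(0.0, 1.5) for _ in range(1000)]
gain_linear = constant_gain(particles, lambda x: x)       # linear observation
gain_cubic = constant_gain(particles, lambda x: x ** 3)   # nonlinear observation
```

For the linear observation h(x) = x, the formula reduces to the sample variance of the particles, which matches the Kalman gain up to the noise normalization. For nonlinear h, the exact gain generally varies with x, and a single constant is only a best-constant approximation of it.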

preprint, 2016, arXiv

Gain Function Approximation in the Feedback Particle Filter

This paper is concerned with numerical algorithms for gain function approximation in the feedback particle filter. The exact gain function is the solution of a Poisson equation involving a probability-weighted Laplacian. The problem is to approximate this solution using only particles sampled from the probability distribution. Two algorithms are presented: a Galerkin algorithm and a kernel-based algorithm. Both the algorithms are adapted to the samples and do not require approximation of the probability distribution as an intermediate step. The paper contains error analysis for the algorithms as well as some comparative numerical results for a non-Gaussian distribution. These algorithms are also applied and illustrated for a simple nonlinear filtering example.

preprint, 2015, arXiv

An Optimal Transport Formulation of the Linear Feedback Particle Filter

Feedback particle filter (FPF) is an algorithm to numerically approximate the solution of the nonlinear filtering problem in continuous time. The algorithm implements a feedback control law for a system of particles such that the empirical distribution of particles approximates the posterior distribution. However, it has been noted in the literature that the feedback control law is not unique. To find a unique control law, the filtering task is formulated here as an optimal transportation problem between the prior and the posterior distributions. Based on this formulation, a time stepping optimization procedure is proposed for the optimal control design. A key difference between the optimal control law and the one in the original FPF is the replacement of the noise term with a deterministic term. This difference serves to decrease the simulation variance, as illustrated with a simple numerical example.

preprint, 2015, arXiv

Feedback Particle Filter on Matrix Lie Groups

This paper is concerned with the problem of continuous-time nonlinear filtering for stochastic processes on a compact and connected matrix Lie group without boundary, e.g. SO(n) and SE(n), in the presence of real-valued observations. This problem is important to numerous applications in attitude estimation, visual tracking and robotic localization. The main contribution of this paper is to derive the feedback particle filter (FPF) algorithm for this problem. In its general form, the FPF provides a coordinate-free description of the filter that furthermore satisfies the geometric constraints of the manifold. The particle dynamics are encapsulated in a Stratonovich stochastic differential equation that preserves the feedback structure of the original Euclidean FPF. Specific examples for SO(2) and SO(3) are provided to help illustrate the filter using the phase and the quaternion coordinates, respectively.

preprint, 2014, arXiv

Poisson's equation in nonlinear filtering

The aim of this paper is to provide a variational interpretation of the nonlinear filter in continuous time. A time-stepping procedure is introduced, consisting of successive minimization problems in the space of probability densities. The weak form of the nonlinear filter is derived via analysis of the first-order optimality conditions for these problems. The derivation shows the nonlinear filter dynamics may be regarded as a gradient flow, or a steepest descent, for a certain energy functional with respect to the Kullback-Leibler divergence. The second part of the paper is concerned with derivation of the feedback particle filter algorithm, based again on the analysis of the first variation. The algorithm is shown to be exact. That is, the posterior distribution of the particle matches exactly the true posterior, provided the filter is initialized with the true prior.

preprint, 2014, arXiv

Probabilistic Data Association-Feedback Particle Filter for Multiple Target Tracking Applications

This paper is concerned with the problem of tracking single or multiple targets with multiple non-target specific observations (measurements). For such filtering problems with data association uncertainty, a novel feedback control-based particle filter algorithm is introduced. The algorithm is referred to as the probabilistic data association-feedback particle filter (PDA-FPF). The proposed filter is shown to represent a generalization to the nonlinear non-Gaussian case of the classical Kalman filter-based probabilistic data association filter (PDAF). One remarkable conclusion is that the proposed PDA-FPF algorithm retains the innovation error-based feedback structure of the classical PDAF algorithm, even in the nonlinear non-Gaussian case. The theoretical results are illustrated with the aid of numerical examples motivated by multiple target tracking applications.

preprint, 2013, arXiv

Feedback Particle Filter

A new formulation of the particle filter for nonlinear filtering is presented, based on concepts from optimal control and from mean-field game theory. The optimal control is chosen so that the posterior distribution of a particle matches as closely as possible the posterior distribution of the true state given the observations. This is achieved by introducing a cost function, defined by the Kullback-Leibler (K-L) divergence between the actual posterior, and the posterior of any particle. The optimal control input is characterized by a certain Euler-Lagrange (E-L) equation, and is shown to admit an innovation error-based feedback structure. For diffusions with continuous observations, the value of the optimal control solution is ideal. The two posteriors match exactly, provided they are initialized with identical priors. The feedback particle filter is defined by a family of stochastic systems, each evolving under this optimal control law. A numerical algorithm is introduced and implemented in two general examples, and a neuroscience application involving coupled oscillators. Some preliminary numerical comparisons between the feedback particle filter and the bootstrap particle filter are described.

preprint, 2013, arXiv

Interacting Multiple Model-Feedback Particle Filter for Stochastic Hybrid Systems

In this paper, a novel feedback control-based particle filter algorithm for the continuous-time stochastic hybrid system estimation problem is presented. This particle filter is referred to as the interacting multiple model-feedback particle filter (IMM-FPF), and is based on the recently developed feedback particle filter. The IMM-FPF is comprised of a series of parallel FPFs, one for each discrete mode, and an exact filter recursion for the mode association probability. The proposed IMM-FPF represents a generalization of the Kalman filter-based IMM algorithm to the general nonlinear filtering problem. The remarkable conclusion of this paper is that the IMM-FPF algorithm retains the innovation error-based feedback structure even for the nonlinear problem. The interaction/merging process is also handled via a control-based approach. The theoretical results are illustrated with the aid of a numerical example problem for a maneuvering target tracking application.

preprint, 2013, arXiv

Joint Probabilistic Data Association-Feedback Particle Filter for Multiple Target Tracking Applications

This paper introduces a novel feedback-control based particle filter for the solution of the filtering problem with data association uncertainty. The particle filter is referred to as the joint probabilistic data association-feedback particle filter (JPDA-FPF). The JPDA-FPF is based on the feedback particle filter introduced in our earlier papers. The remarkable conclusion of our paper is that the JPDA-FPF algorithm retains the innovation error-based feedback structure of the feedback particle filter, even with data association uncertainty in the general nonlinear case. The theoretical results are illustrated with the aid of two numerical example problems drawn from multiple target tracking applications.

preprint, 2013, arXiv

Multivariable Feedback Particle Filter

In recent work it is shown that importance sampling can be avoided in the particle filter through an innovation structure inspired by traditional nonlinear filtering combined with Mean-Field Game formalisms. The resulting feedback particle filter (FPF) offers significant variance improvements; in particular, the algorithm can be applied to systems that are not stable. The filter comes with an up-front computational cost to obtain the filter gain. This paper describes new representations and algorithms to compute the gain in the general multivariable setting. The main contributions are, (i) Theory surrounding the FPF is improved: Consistency is established in the multivariate setting, as well as well-posedness of the associated PDE to obtain the filter gain. (ii) The gain can be expressed as the gradient of a function, which is precisely the solution to Poisson's equation for a related MCMC diffusion (the Smoluchowski equation). This provides a bridge to MCMC as well as to approximate optimal filtering approaches such as TD-learning, which can in turn be used to approximate the gain. (iii) Motivated by a weak formulation of Poisson's equation, a Galerkin finite-element algorithm is proposed for approximation of the gain. Its performance is illustrated in numerical experiments.

preprint, 2010, arXiv

Stability Margin Scaling Laws for Distributed Formation Control as a Function of Network Structure

We consider the problem of distributed formation control of a large number of vehicles. An individual vehicle in the formation is assumed to be a fully actuated point mass. A distributed control law is examined: the control action on an individual vehicle depends on (i) its own velocity and (ii) the relative position measurements with a small subset of vehicles (neighbors) in the formation. The neighbors are defined according to an information graph. In this paper we describe a methodology for modeling, analysis, and distributed control design of such vehicular formations whose information graph is a D-dimensional lattice. The modeling relies on an approximation based on a partial differential equation (PDE) that describes the spatio-temporal evolution of position errors in the formation. The analysis and control design is based on the PDE model. We deduce asymptotic formulae for the closed-loop stability margin (absolute value of the real part of the least stable eigenvalue) of the controlled formation. The stability margin is shown to approach 0 as the number of vehicles N goes to infinity. The exponent on the scaling law for the stability margin is influenced by the dimension and the structure of the information graph. We show that the scaling law can be improved by employing a higher dimensional information graph. Apart from analysis, the PDE model is used for a mistuning-based design of control gains to maximize the stability margin. Mistuning here refers to small perturbation of control gains from their nominal symmetric values. We show that the mistuned design can have a significantly better stability margin even with a small amount of perturbation. The results of the analysis with the PDE model are corroborated with numerical computation of eigenvalues with the state-space model of the formation.

preprint, 2008, arXiv

Mistuning-based Control Design to Improve Closed-Loop Stability of Vehicular Platoons

We consider a decentralized bidirectional control of a platoon of N identical vehicles moving in a straight line. The control objective is for each vehicle to maintain a constant velocity and inter-vehicular separation using only the local information from itself and its two nearest neighbors. Each vehicle is modeled as a double integrator. To aid the analysis, we use a continuous approximation to derive a partial differential equation (PDE) approximation of the discrete platoon dynamics. The PDE model is used to explain the progressive loss of closed-loop stability with increasing number of vehicles, and to devise ways to combat this loss of stability. If every vehicle uses the same controller, we show that the least stable closed-loop eigenvalue approaches zero as O(1/N^2) in the limit of a large number (N) of vehicles. We then show how to ameliorate this loss of stability by small amounts of "mistuning", i.e., changing the controller gains from their nominal values. We prove that with arbitrarily small amounts of mistuning, the asymptotic behavior of the least stable closed-loop eigenvalue can be improved to O(1/N). All the conclusions drawn from analysis of the PDE model are corroborated via numerical calculations of the state-space platoon model.
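
The O(1/N^2) decay for the nominal (symmetric) design can be checked numerically with a simplified model. The sketch below is an assumption-laden stand-in, not the paper's exact controller: each double integrator gets position feedback k from its two neighbors and damping b on its own velocity error, the first vehicle follows a fictitious leader, and the last vehicle has no follower. Under these assumptions each mode of the chain Laplacian satisfies s^2 + b*s + k*lambda = 0, and the smallest Laplacian eigenvalue has a known closed form. The function name and gain values are hypothetical, and mistuning is not modeled.

```python
import cmath
import math

def stability_margin(n, k=1.0, b=1.0):
    """|Re| of the least stable closed-loop eigenvalue for an n-vehicle chain."""
    # Smallest eigenvalue of the chain Laplacian with a leader-anchored
    # front end and a free rear end (closed form for this tridiagonal matrix).
    lam_min = 2.0 - 2.0 * math.cos(math.pi / (2 * n + 1))
    # Each Laplacian mode obeys s^2 + b*s + k*lam = 0; the root closest to
    # the imaginary axis comes from the smallest lam.
    root = (-b + cmath.sqrt(b * b - 4.0 * k * lam_min)) / 2.0
    return abs(root.real)

for n in (50, 100, 200, 400):
    margin = stability_margin(n)
    # margin * n^2 settling to a constant indicates O(1/N^2) decay.
    print(n, margin, margin * n * n)
```

Doubling N cuts the margin by roughly a factor of four in this model, which is the quadratic decay the abstract describes for identical controllers.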
