Source author record

Siddhartha Mishra

Siddhartha Mishra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.NA, Numerical Analysis, Machine Learning, math.AP, math.DS, physics.comp-ph, physics.flu-dyn, Artificial Intelligence, Computation and Language, physics.class-ph

Catalog footprint

What is connected

21 works

10 topics

4 close collaborators


preprint, 2026, arXiv

Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains

The very challenging task of learning solution operators of PDEs on arbitrary domains accurately and efficiently is of vital importance to engineering and industrial simulations. Despite the existence of many operator learning algorithms to approximate such PDEs, we find that accurate models are not necessarily computationally efficient and vice versa. We address this issue by proposing a geometry aware operator transformer (GAOT) for learning PDEs on arbitrary domains. GAOT combines novel multiscale attentional graph neural operator encoders and decoders, together with geometry embeddings and (vision) transformer processors to accurately map information about the domain and the inputs into a robust approximation of the PDE solution. Multiple innovations in the implementation of GAOT also ensure computational efficiency and scalability. We demonstrate this significant gain in both accuracy and efficiency of GAOT over several baselines on a large number of learning tasks from a diverse set of PDEs, including achieving state of the art performance on three large scale three-dimensional industrial CFD datasets.

preprint, 2022, arXiv

Agnostic Physics-Driven Deep Learning

This work establishes that a physical system can perform statistical learning without gradient computations, via an Agnostic Equilibrium Propagation (Aeqprop) procedure that combines energy minimization, homeostatic control, and nudging towards the correct response. In Aeqprop, the specifics of the system do not have to be known: the procedure is based only on external manipulations, and produces a stochastic gradient descent without explicit gradient computations. Thanks to nudging, the system performs a true, order-one gradient step for each training sample, in contrast with order-zero methods like reinforcement or evolutionary strategies, which rely on trial and error. This procedure considerably widens the range of potential hardware for statistical learning to any system with enough controllable parameters, even if the details of the system are poorly known. Aeqprop also establishes that in natural (bio)physical systems, genuine gradient-based statistical learning may result from generic, relatively simple mechanisms, without backpropagation and its requirement for analytic knowledge of partial derivatives.

preprint, 2022, arXiv

Error analysis for deep neural network approximations of parametric hyperbolic conservation laws

We derive rigorous bounds on the error resulting from the approximation of the solution of parametric hyperbolic scalar conservation laws with ReLU neural networks. We show that the approximation error can be made as small as desired with ReLU neural networks that overcome the curse of dimensionality. In addition, we provide an explicit upper bound on the generalization error in terms of the training error, number of training samples and the neural network size. The theoretical results are illustrated by numerical experiments.

preprint, 2022, arXiv

Error estimates for DeepOnets: A deep learning framework in infinite dimensions

DeepONets have recently been proposed as a framework for learning nonlinear operators mapping between infinite dimensional Banach spaces. We analyze DeepONets and prove estimates on the resulting approximation and generalization errors. In particular, we extend the universal approximation property of DeepONets to include measurable mappings in non-compact spaces. By a decomposition of the error into encoding, approximation and reconstruction errors, we prove both lower and upper bounds on the total error, relating it to the spectral decay properties of the covariance operators associated with the underlying measures. We derive almost optimal error bounds with very general affine reconstructors and with random sensor locations as well as bounds on the generalization error, using covering number arguments. We illustrate our general framework with four prototypical examples of nonlinear operators, namely those arising in a nonlinear forced ODE, an elliptic PDE with variable coefficients and nonlinear parabolic and hyperbolic PDEs. While the approximation of arbitrary Lipschitz operators by DeepONets to accuracy $ε$ is argued to suffer from a "curse of dimensionality" (requiring neural networks of exponential size in $1/ε$), in contrast, for all the above concrete examples of interest, we rigorously prove that DeepONets can break this curse of dimensionality (achieving accuracy $ε$ with neural networks of size that can grow algebraically in $1/ε$). Thus, we demonstrate the efficient approximation of a potentially large class of operators with this machine learning framework.

preprint, 2022, arXiv

Graph-Coupled Oscillator Networks

We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs), which model a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g. convolutional or attentional) as the coupling function, from which a multi-layer deep neural network is built up via the dynamics of the proposed ODEs. We relate the oversmoothing problem, commonly encountered in GNNs, to the stability of steady states of the underlying ODE and show that zero-Dirichlet energy steady states are not stable for our proposed ODEs. This demonstrates that the proposed framework mitigates the oversmoothing problem. Moreover, we prove that GraphCON mitigates the exploding and vanishing gradients problem to facilitate training of deep multi-layer GNNs. Finally, we show that our approach offers competitive performance with respect to the state-of-the-art on a variety of graph-based learning tasks.
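As a rough illustration of the kind of system this abstract describes, a second-order ODE for node features $X(t)$ could be written as below. Here $F_\theta$ stands for the graph coupling function (any basic GNN layer), $\sigma$ for an activation function, and $\alpha, \gamma \ge 0$ for damping and control parameters. These symbols and the exact placement of the terms are assumptions made for this sketch and may differ from the paper's formulation.

```latex
\[
X'' = \sigma\bigl(F_\theta(X, t)\bigr) - \gamma X - \alpha X'
\qquad\Longleftrightarrow\qquad
\begin{cases}
X' = Y, \\
Y' = \sigma\bigl(F_\theta(X, t)\bigr) - \gamma X - \alpha Y.
\end{cases}
\]
```

Under this reading, discretizing the first-order form in time with a step $\Delta t$ gives one network layer per time step, which is how a multi-layer network arises from the dynamics.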

preprint, 2022, arXiv

Long Expressive Memory for Sequence Modeling

We propose a novel method called Long Expressive Memory (LEM) for learning long-term sequential dependencies. LEM is gradient-based, it can efficiently process sequential tasks with very long-term dependencies, and it is sufficiently expressive to be able to learn complicated input-output maps. To derive LEM, we consider a system of multiscale ordinary differential equations, as well as a suitable time-discretization of this system. For LEM, we derive rigorous bounds to show the mitigation of the exploding and vanishing gradients problem, a well-known challenge for gradient-based recurrent sequential learning methods. We also prove that LEM can approximate a large class of dynamical systems to high accuracy. Our empirical results, ranging from image and time-series classification through dynamical systems prediction to speech recognition and language modeling, demonstrate that LEM outperforms state-of-the-art recurrent neural networks, gated recurrent units, and long short-term memory models.

preprint, 2022, arXiv

On Bayesian data assimilation for PDEs with ill-posed forward problems

We study Bayesian data assimilation (filtering) for time-evolution PDEs, for which the underlying forward problem may be very unstable or ill-posed. Such PDEs, which include the Navier-Stokes equations of fluid dynamics, are characterized by a high sensitivity of solutions to perturbations of the initial data, a lack of rigorous global well-posedness results as well as possible non-convergence of numerical approximations. Under very mild and readily verifiable general hypotheses on the forward solution operator of such PDEs, we prove that the posterior measure expressing the solution of the Bayesian filtering problem is stable with respect to perturbations of the noisy measurements, and we provide quantitative estimates on the convergence of approximate Bayesian filtering distributions computed from numerical approximations. For the Navier-Stokes equations, our results imply uniform stability of the filtering problem even at arbitrarily small viscosity, when the underlying forward problem may become ill-posed, as well as the compactness of numerical approximants in a suitable metric on time-parametrized probability measures.

preprint, 2022, arXiv

Physics Informed Neural Networks (PINNs) for approximating nonlinear dispersive PDEs

We propose a novel algorithm, based on physics-informed neural networks (PINNs), to efficiently approximate solutions of nonlinear dispersive PDEs such as the KdV-Kawahara, Camassa-Holm and Benjamin-Ono equations. The stability of solutions of these dispersive PDEs is leveraged to prove rigorous bounds on the resulting error. We present several numerical experiments to demonstrate that PINNs can approximate solutions of these dispersive PDEs very accurately.

preprint, 2022, arXiv

Variable-Input Deep Operator Networks

Existing architectures for operator learning require that the number and locations of sensors (where the input functions are evaluated) remain the same across all training and test samples, significantly restricting the range of their applicability. We address this issue by proposing a novel operator learning framework, termed Variable-Input Deep Operator Network (VIDON), which allows for random sensors whose number and locations can vary across samples. VIDON is invariant to permutations of sensor locations and is proved to be universal in approximating a class of continuous operators. We also prove that VIDON can efficiently approximate operators arising in PDEs. Numerical experiments with a diverse set of PDEs are presented to illustrate the robust performance of VIDON in learning operators.

preprint, 2022, arXiv

Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings

Learning representations of words in a continuous space is perhaps the most fundamental task in NLP; however, words interact in ways much richer than vector dot product similarity can provide. Many relationships between words can be expressed set-theoretically; for example, adjective-noun compounds (e.g. "red cars" $\subseteq$ "cars") and homographs (e.g. "tongue" $\cap$ "body" should be similar to "mouth", while "tongue" $\cap$ "language" should be similar to "dialect") have natural set-theoretic interpretations. Box embeddings are a novel region-based representation which provides the capability to perform these set-theoretic operations. In this work, we provide a fuzzy-set interpretation of box embeddings, and learn box representations of words using a set-theoretic training objective. We demonstrate improved performance on various word similarity tasks, particularly on less common words, and perform a quantitative and qualitative analysis exploring the additional unique expressivity provided by Word2Box.

preprint, 2022, arXiv

wPINNs: Weak Physics informed neural networks for approximating entropy solutions of hyperbolic conservation laws

Physics informed neural networks (PINNs) require regularity of solutions of the underlying PDE to guarantee accurate approximation. Consequently, they may fail at approximating discontinuous solutions of PDEs such as nonlinear hyperbolic equations. To ameliorate this, we propose a novel variant of PINNs, termed weak PINNs (wPINNs), for accurate approximation of entropy solutions of scalar conservation laws. wPINNs are based on approximating the solution of a min-max optimization problem for a residual, defined in terms of Kruzhkov entropies, to determine parameters for the neural networks approximating the entropy solution as well as test functions. We prove rigorous bounds on the error incurred by wPINNs and illustrate their performance through numerical experiments to demonstrate that wPINNs can approximate entropy solutions accurately.
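For context, the standard Kruzhkov entropy condition for a scalar conservation law $u_t + f(u)_x = 0$ is sketched below, first as a distributional inequality and then in its weak form against test functions. The notation ($u$, $f$, the constant $c$ and the test function $\varphi$) is assumed here for illustration; the exact residual that wPINNs optimize is defined in the paper and may be organized differently.

```latex
\[
\partial_t\,|u - c| \;+\; \partial_x\bigl[\operatorname{sgn}(u - c)\,\bigl(f(u) - f(c)\bigr)\bigr] \;\le\; 0
\qquad \text{for all } c \in \mathbb{R},
\]
\[
\int_0^T\!\!\int_{\mathbb{R}}
\Bigl( |u - c|\,\varphi_t + \operatorname{sgn}(u - c)\,\bigl(f(u) - f(c)\bigr)\,\varphi_x \Bigr)\,dx\,dt \;\ge\; 0
\qquad \text{for all } \varphi \in C_c^\infty\bigl((0,T)\times\mathbb{R}\bigr),\ \varphi \ge 0 .
\]
```

Because the weak form involves both the candidate solution and a test function, it lends itself naturally to a min-max problem of the kind the abstract mentions.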

preprint, 2020, arXiv

A Multi-level procedure for enhancing accuracy of machine learning algorithms

We propose a multi-level method to increase the accuracy of machine learning algorithms for approximating observables in scientific computing, particularly those that arise in systems modeled by differential equations. The algorithm relies on judiciously combining a large amount of computationally cheap training data on coarse resolutions with a few expensive training samples on fine grid resolutions. Theoretical arguments for lowering the generalization error, based on reducing the variance of the underlying maps, are provided, and numerical evidence, indicating significant gains over underlying single-level machine learning algorithms, is presented. Moreover, we also apply the multi-level algorithm in the context of forward uncertainty quantification and observe a considerable speed-up over competing algorithms.
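One common way to make the coarse/fine combination concrete is a telescoping decomposition across resolution levels. In the sketch below, $\mathcal{L}_\ell$ denotes the observable computed at resolution level $\ell$ (with $\ell = 0$ the coarsest and $\ell = L$ the finest), and hats denote learned surrogates. This notation is assumed for illustration and is not taken from the paper.

```latex
\[
\mathcal{L}_L \;=\; \mathcal{L}_0 \;+\; \sum_{\ell=1}^{L} \bigl(\mathcal{L}_\ell - \mathcal{L}_{\ell-1}\bigr),
\qquad
\widehat{\mathcal{L}}_L \;=\; \widehat{\mathcal{L}}_0 \;+\; \sum_{\ell=1}^{L} \widehat{\mathcal{D}}_\ell,
\quad
\widehat{\mathcal{D}}_\ell \approx \mathcal{L}_\ell - \mathcal{L}_{\ell-1}.
\]
```

Under this reading, the coarse surrogate $\widehat{\mathcal{L}}_0$ is trained on many cheap samples, while each correction $\widehat{\mathcal{D}}_\ell$ targets a difference that typically has smaller variance than the map itself and so can be learned from fewer expensive samples.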

preprint, 2020, arXiv

Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences

We propose a deep supervised learning algorithm based on low-discrepancy sequences as the training set. By a combination of theoretical arguments and extensive numerical experiments we demonstrate that the proposed algorithm significantly outperforms standard deep learning algorithms that are based on randomly chosen training data, for problems in moderately high dimensions. The proposed algorithm provides an efficient method for building inexpensive surrogates for many underlying maps in the context of scientific computing.
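To make the idea of a low-discrepancy training set concrete, here is a minimal sketch that generates Halton points and, for comparison, pseudo-random points in the unit cube. The abstract does not say which low-discrepancy sequence is used, so the choice of Halton, along with the helper names `van_der_corput`, `halton` and `random_points`, is an assumption made for illustration.

```python
import random

# First few primes, used as bases for the Halton sequence (one per dimension).
PRIMES = [2, 3, 5, 7, 11, 13]


def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton(n_points, dim):
    """First n_points of the Halton sequence in [0, 1)^dim (indices start at 1)."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most %d dimensions" % len(PRIMES))
    return [[van_der_corput(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n_points + 1)]


def random_points(n_points, dim, seed=0):
    """Baseline: independent uniform pseudo-random training inputs."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n_points)]


if __name__ == "__main__":
    # Either point set would be used as inputs at which the underlying map
    # is evaluated to produce training labels for a network.
    for point in halton(8, 2):
        print(point)
```

The only difference from a standard supervised setup is where the training inputs come from: a deterministic, evenly spread sequence instead of random draws.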

preprint, 2019, arXiv

Deep learning observables in computational fluid dynamics

Many large scale problems in computational fluid dynamics such as uncertainty quantification, Bayesian inversion, data assimilation and PDE constrained optimization are considered very challenging computationally as they require a large number of expensive (forward) numerical solutions of the corresponding PDEs. We propose a machine learning algorithm, based on deep artificial neural networks, that predicts the underlying input parameters to observable map from a few training samples (computed realizations of this map). By a judicious combination of theoretical arguments and empirical observations, we find suitable network architectures and training hyperparameters that result in robust and efficient neural network approximations of the parameters to observable map. Numerical experiments are presented to demonstrate low prediction errors for the trained networks, even when the network has been trained with a few samples, at a computational cost which is several orders of magnitude lower than the underlying PDE solver. Moreover, we combine the proposed deep learning algorithm with Monte Carlo (MC) and Quasi-Monte Carlo (QMC) methods to efficiently compute uncertainty propagation for nonlinear PDEs. Under the assumption that the underlying neural networks generalize well, we prove that the deep learning MC and QMC algorithms are guaranteed to be faster than the baseline (quasi-) Monte Carlo methods. Numerical experiments demonstrating one to two orders of magnitude speed up over baseline QMC and MC algorithms, for the intricate problem of computing probability distributions of the observable, are also presented.

preprint, 2019, arXiv

Statistical solutions of the incompressible Euler equations

We propose and study the framework of dissipative statistical solutions for the incompressible Euler equations. Statistical solutions are time-parameterized probability measures on the space of square-integrable functions, whose time-evolution is determined from the underlying Euler equations. We prove partial well-posedness results for dissipative statistical solutions and propose a Monte Carlo type algorithm, based on spectral viscosity spatial discretizations, to approximate them. Under verifiable hypotheses on the computations, we prove that the approximations converge to a statistical solution in a suitable topology. In particular, multi-point statistical quantities of interest converge on increasing resolution. We present several numerical experiments to illustrate the theory.

preprint, 2014, arXiv

Computation of measure-valued solutions for the incompressible Euler equations

We combine the spectral (viscosity) method and ensemble averaging to propose an algorithm that computes admissible measure-valued solutions of the incompressible Euler equations. The resulting approximate Young measures are proved to converge (with increasing numerical resolution) to a measure-valued solution. We present numerical experiments demonstrating the robustness and efficiency of the proposed algorithm, as well as the appropriateness of measure-valued solutions as a solution framework for the Euler equations. Furthermore, we report an extensive computational study of the two dimensional vortex sheet, which indicates that the computed measure-valued solution is non-atomic and implies possible non-uniqueness of weak solutions constructed by Delort.

preprint, 2013, arXiv

Numerical methods with controlled dissipation for small-scale dependent shocks

We provide a "user guide" to the literature of the past twenty years concerning the modeling and approximation of discontinuous solutions to nonlinear hyperbolic systems that admit small-scale dependent shock waves. We cover several classes of problems and solutions: nonclassical undercompressive shocks, hyperbolic systems in nonconservative form, and boundary layer problems. We review the relevant models arising in continuum physics and describe the numerical methods that have been proposed to capture small-scale dependent solutions. In agreement with the general well-posedness theory, small-scale dependent solutions are characterized by a kinetic relation, a family of paths, or an admissible boundary set. We provide a review of numerical methods (front tracking schemes, finite difference schemes, finite volume schemes), which, at the discrete level, reproduce the effect of the physically-meaningful dissipation mechanisms of interest in the applications. An essential role is played by the equivalent equation associated with discrete schemes, which is found to be relevant even for solutions containing shock waves.

preprint, 2011, arXiv

Accurate numerical schemes for approximating initial-boundary value problems for systems of conservation laws

Solutions of initial-boundary value problems for systems of conservation laws depend on the underlying viscous mechanism, namely different viscosity operators lead to different limit solutions. Standard numerical schemes for approximating conservation laws do not take into account this fact and converge to solutions that are not necessarily physically relevant. We design numerical schemes that incorporate explicit information about the underlying viscosity mechanism and approximate the physically relevant solution. Numerical experiments illustrating the robust performance of these schemes are presented.

preprint, 2011, arXiv

Entropy Stable Numerical Schemes for Two-Fluid Plasma Equations

Two-fluid ideal plasma equations are a generalized form of the ideal MHD equations in which electrons and ions are considered as separate species. The design of efficient numerical schemes for these equations is complicated on account of their non-linear nature and the presence of stiff source terms, especially for high charge to mass ratios and for low Larmor radii. In this article, we design entropy stable finite difference schemes for the two-fluid equations by combining entropy conservative fluxes and suitable numerical diffusion operators. Furthermore, to overcome the time step restrictions imposed by the stiff source terms, we devise time-stepping routines based on implicit-explicit (IMEX) Runge-Kutta (RK) schemes. We exploit the special structure of the two-fluid plasma equations to design IMEX schemes in which only local (in each cell) linear equations need to be solved at each time step. Benchmark numerical experiments are presented to illustrate the robustness and accuracy of these schemes.
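The benefit of implicit-explicit time stepping for stiff source terms can be seen on a toy problem. The sketch below applies first-order IMEX Euler to the scalar equation u' = 1 - u/eps, treating the forcing explicitly and the stiff decay implicitly, so each step needs only one scalar linear solve. The model equation, the first-order scheme and the function names are assumptions chosen for illustration; the paper uses IMEX Runge-Kutta schemes on the full two-fluid system.

```python
def imex_euler(u0, dt, n_steps, eps):
    """IMEX Euler for u' = 1 - u/eps (forcing explicit, stiff decay implicit)."""
    u = u0
    for _ in range(n_steps):
        # Solve u_new = u + dt * 1 - dt * u_new / eps for u_new:
        # a single local linear equation per step.
        u = (u + dt * 1.0) / (1.0 + dt / eps)
    return u


def explicit_euler(u0, dt, n_steps, eps):
    """Fully explicit Euler for the same equation, for comparison."""
    u = u0
    for _ in range(n_steps):
        u = u + dt * (1.0 - u / eps)
    return u


if __name__ == "__main__":
    eps, dt = 1e-6, 0.1          # time step far larger than the stiff scale eps
    print("IMEX Euler:    ", imex_euler(1.0, dt, 50, eps))
    print("explicit Euler:", explicit_euler(1.0, dt, 20, eps))
```

With the time step much larger than eps, the IMEX iterate settles at the steady state u = eps, while the explicit iterate grows without bound. This is the time step restriction that the implicit treatment of the source removes.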

preprint, 2011, arXiv

Higher order finite difference schemes for the magnetic induction equations

We describe high order accurate and stable finite difference schemes for the initial-boundary value problem associated with the magnetic induction equations. These equations model the evolution of a magnetic field due to a given velocity field. The finite difference schemes are based on Summation by Parts (SBP) operators for spatial derivatives and a Simultaneous Approximation Term (SAT) technique for imposing boundary conditions. We present various numerical experiments that demonstrate both the stability as well as high order of accuracy of the schemes.
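As a small illustration of the Summation by Parts property, the sketch below builds the classical second-order SBP first-derivative operator D = H^{-1} Q on a uniform grid and exposes the defining identity Q + Q^T = diag(-1, 0, ..., 0, 1). This low-order operator and the helper names are assumptions chosen for brevity; the paper's schemes are of higher order and additionally use SAT boundary terms, which are not shown here.

```python
def sbp_first_derivative(n, h):
    """Second-order SBP operator on n >= 3 uniform grid points with spacing h.

    Returns (H, Q, D): H is the diagonal of the norm matrix, Q is the
    nearly skew-symmetric part, and D = H^{-1} Q approximates d/dx.
    """
    if n < 3:
        raise ValueError("need at least 3 grid points")
    H = [h] * n
    H[0] = H[-1] = h / 2.0
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            Q[i][i - 1] = -0.5
        if i < n - 1:
            Q[i][i + 1] = 0.5
    Q[0][0] = -0.5          # boundary closures
    Q[n - 1][n - 1] = 0.5
    D = [[Q[i][j] / H[i] for j in range(n)] for i in range(n)]
    return H, Q, D


def apply_operator(D, u):
    """Matrix-vector product D u using plain lists."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in D]


if __name__ == "__main__":
    n, h = 6, 0.2
    H, Q, D = sbp_first_derivative(n, h)
    x = [i * h for i in range(n)]
    print(apply_operator(D, x))   # derivative of a linear function
```

The identity Q + Q^T = B is the discrete analogue of integration by parts, and it is what allows energy estimates, and hence stability proofs, to carry over from the continuous problem to the scheme.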

preprint, 2009, arXiv

On the upstream mobility scheme for two-phase flow in porous media

When neglecting capillarity, two-phase incompressible flow in porous media is modelled as a scalar nonlinear hyperbolic conservation law. A change in the rock type results in a change of the flux function. Discretizing in one dimension with a finite volume method, we investigate two numerical fluxes, an extension of the Godunov flux and the upstream mobility flux, the latter being widely used in hydrogeology and petroleum engineering. Then, in the case of a changing rock type, one can give examples when the upstream mobility flux does not give the right answer.
