Source author record

Ashesh Chattopadhyay

Ashesh Chattopadhyay appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning physics.ao-ph physics.flu-dyn physics.comp-ph nlin.CD Computation Computer Vision math.DS math.NA Numerical Analysis physics.geo-ph

Catalog footprint

What is connected

9works

11topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

Generative forecasting with joint probability models

Chaotic dynamical systems exhibit strong sensitivity to initial conditions and often contain unresolved multiscale processes, making deterministic forecasting fundamentally limited. Generative models offer an appealing alternative by learning distributions over plausible system evolutions; yet, most existing approaches focus on next-step conditional prediction rather than the structure of the underlying dynamics. In this work, we reframe forecasting as a fully generative problem by learning the joint probability distribution of lagged system states over short temporal windows and obtaining forecasts through marginalization. This new perspective allows the model to capture nonlinear temporal dependencies, represent multistep trajectory segments, and produce next-step predictions consistent with the learned joint distribution. We also introduce a general, model-agnostic training and inference framework for joint generative forecasting and show how it enables assessment of forecast robustness and reliability using three complementary uncertainty quantification metrics (ensemble variance, short-horizon autocorrelation, and cumulative Wasserstein drift), without access to ground truth. We evaluate the performance of the proposed method on two canonical chaotic dynamical systems, the Lorenz-63 system and the Kuramoto-Sivashinsky equation, and show that joint generative models yield improved short-term predictive skill, preserve attractor geometry, and achieve substantially more accurate long-range statistical behaviour than conventional conditional next-step models.

preprint2022arXiv

Discovery of interpretable structural model errors by combining Bayesian sparse regression and data assimilation: A chaotic Kuramoto-Sivashinsky test case

Models of many engineering and natural systems are imperfect. The discrepancy between the mathematical representations of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial differences between the numerical solutions of the model and the state of the system, particularly in those involving nonlinear, multi-scale phenomena. Thus, there is increasing interest in reducing model errors, particularly by leveraging the rapidly growing observational data to understand their physics and sources. Here, we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA only requires a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, first the model error is estimated from differences between the observed states and model-predicted states (the latter are obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation (DA) technique such as ensemble Kalman filter (EnKF) is employed to provide the analysis state of the system, which is then used to estimate the model error. Finally, an equation-discovery technique, here the relevance vector machine (RVM), a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, and closed-form representation of the model error. Using the chaotic Kuramoto-Sivashinsky (KS) system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural/parametric model errors, representing different types of missing physics, using noise-free and noisy observations.

preprint2022arXiv

FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators

FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at $0.25^{\circ}$ resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning wind energy resources, predicting extreme weather events such as tropical cyclones, extra-tropical cyclones, and atmospheric rivers. FourCastNet matches the forecasting accuracy of the ECMWF Integrated Forecasting System (IFS), a state-of-the-art Numerical Weather Prediction (NWP) model, at short lead times for large-scale variables, while outperforming IFS for variables with complex fine-scale structure, including precipitation. FourCastNet generates a week-long forecast in less than 2 seconds, orders of magnitude faster than IFS. The speed of FourCastNet enables the creation of rapid and inexpensive large-ensemble forecasts with thousands of ensemble-members for improving probabilistic forecasting. We discuss how data-driven deep learning models such as FourCastNet are a valuable addition to the meteorology toolkit to aid and augment NWP models.

preprint2022arXiv

Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models if trained on observations can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lot of training data which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free, initial condition to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from issues with long-term stability and unphysical climate drift, which makes these data-driven models unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this paper, we propose a convolutional variational autoencoder-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation from a 2-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder-decoder-based convolutional model in terms of short-term skills while remaining stable for long-term climate simulations yielding accurate climatology.

preprint2021arXiv

Stable a posteriori LES of 2D turbulence using convolutional neural networks: Backscattering analysis and generalization to higher Re via transfer learning

There is a growing interest in developing data-driven subgrid-scale (SGS) models for large-eddy simulation (LES) using machine learning (ML). In a priori (offline) tests, some recent studies have found ML-based data-driven SGS models that are trained on high-fidelity data (e.g., from direct numerical simulation, DNS) to outperform baseline physics-based models and accurately capture the inter-scale transfers, both forward (diffusion) and backscatter. While promising, instabilities in a posteriori (online) tests and inabilities to generalize to a different flow (e.g., with a higher Reynolds number, Re) remain as major obstacles in broadening the applications of such data-driven SGS models. For example, many of the same aforementioned studies have found instabilities that required often ad-hoc remedies to stabilize the LES at the expense of reducing accuracy. Here, using 2D decaying turbulence as the testbed, we show that deep fully convolutional neural networks (CNNs) can accurately predict the SGS forcing terms and the inter-scale transfers in a priori tests, and if trained with enough samples, lead to stable and accurate a posteriori LES-CNN. Further analysis attributes these instabilities to the disproportionally lower accuracy of the CNNs in capturing backscattering when the training set is small. We also show that transfer learning, which involves re-training the CNN with a small amount of data (e.g., 1%) from the new flow, enables accurate and stable a posteriori LES-CNN for flows with 16x higher Re (as well as higher grid resolution if needed). These results show the promise of CNNs with transfer learning to provide stable, accurate, and generalizable LES for practical use.

preprint2020arXiv

Analog forecasting of extreme-causing weather patterns using deep learning

Numerical weather prediction (NWP) models require ever-growing computing time/resources, but still, have difficulties with predicting weather extremes. Here we introduce a data-driven framework that is based on analog forecasting (prediction using past similar patterns) and employs a novel deep learning pattern-recognition technique (capsule neural networks, CapsNets) and impact-based auto-labeling strategy. CapsNets are trained on mid-tropospheric large-scale circulation patterns (Z500) labeled $0-4$ depending on the existence and geographical region of surface temperature extremes over North America several days ahead. The trained networks predict the occurrence/region of cold or heat waves, only using Z500, with accuracies (recalls) of $69\%-45\%$ $(77\%-48\%)$ or $62\%-41\%$ $(73\%-47\%)$ $1-5$ days ahead. CapsNets outperform simpler techniques such as convolutional neural networks and logistic regression. Using both temperature and Z500, accuracies (recalls) with CapsNets increase to $\sim 80\%$ $(88\%)$, showing the promises of multi-modal data-driven frameworks for accurate/fast extreme weather predictions, which can augment NWP efforts in providing early warnings.

preprint2020arXiv

Data-driven subgrid-scale modeling of forced Burgers turbulence using deep learning with generalization to higher Reynolds numbers via transfer learning

Developing data-driven subgrid-scale (SGS) models for large eddy simulations (LES) has received substantial attention recently. Despite some success, particularly in a priori (offline) tests, challenges have been identified that include numerical instabilities in a posteriori (online) tests and generalization (i.e., extrapolation) of trained data-driven SGS models, for example to higher Reynolds numbers. Here, using the stochastically forced Burgers turbulence as the test-bed, we show that deep neural networks trained using properly pre-conditioned (augmented) data yield stable and accurate a posteriori LES models. Furthermore, we show that transfer learning enables accurate/stable generalization to a flow with 10x higher Reynolds number.

preprint2019arXiv

Data-driven prediction of a multi-scale Lorenz 96 chaotic system using deep learning methods: Reservoir computing, ANN, and RNN-LSTM

In this paper, the performance of three deep learning methods for predicting short-term evolution and for reproducing the long-term statistics of a multi-scale spatio-temporal Lorenz 96 system is examined. The methods are: echo state network (a type of reservoir computing, RC-ESN), deep feed-forward artificial neural network (ANN), and recurrent neural network with long short-term memory (RNN-LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting variables representing slow/large-scale ($X$), intermediate ($Y$), and fast/small-scale ($Z$) processes. For training or testing, only $X$ is available; $Y$ and $Z$ are never known or used. We show that RC-ESN substantially outperforms ANN and RNN-LSTM for short-term prediction, e.g., accurately forecasting the chaotic trajectories for hundreds of numerical solver's time steps, equivalent to several Lyapunov timescales. The RNN-LSTM and ANN show some prediction skills as well; RNN-LSTM bests ANN. Furthermore, even after losing the trajectory, data predicted by RC-ESN and RNN-LSTM have probability density functions (PDFs) that closely match the true PDF, even at the tails. The PDF of the data predicted using ANN, however, deviates from the true PDF. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling of complex nonlinear dynamical systems such as weather/climate are discussed.

preprint2018arXiv

A test case for application of convolutional neural networks to spatio-temporal climate data: Re-identifying clustered weather patterns

Convolutional neural networks (CNNs) can potentially provide powerful tools for classifying and identifying patterns in climate and environmental data. However, because of the inherent complexities of such data, which are often spatio-temporal, chaotic, and non-stationary, the CNN algorithms must be designed/evaluated for each specific dataset and application. Yet to start, CNN, a supervised technique, requires a large labeled dataset. Labeling demands (human) expert time, which combined with the limited number of relevant examples in this area, can discourage using CNNs for new problems. To address these challenges, here we (1) Propose an effective auto-labeling strategy based on using an unsupervised clustering algorithm and evaluating the performance of CNNs in re-identifying these clusters; (2) Use this approach to label thousands of daily large-scale weather patterns over North America in the outputs of a fully-coupled climate model and show the capabilities of CNNs in re-identifying the 4 clustered regimes. The deep CNN trained with $1000$ samples or more per cluster has an accuracy of $90\%$ or better. Accuracy scales monotonically but nonlinearly with the size of the training set, e.g. reaching $94\%$ with $3000$ training samples per cluster. Effects of architecture and hyperparameters on the performance of CNNs are examined and discussed.

Ashesh Chattopadhyay

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Generative forecasting with joint probability models

Discovery of interpretable structural model errors by combining Bayesian sparse regression and data assimilation: A chaotic Kuramoto-Sivashinsky test case

FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators

Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

Stable a posteriori LES of 2D turbulence using convolutional neural networks: Backscattering analysis and generalization to higher Re via transfer learning

Analog forecasting of extreme-causing weather patterns using deep learning

Data-driven subgrid-scale modeling of forced Burgers turbulence using deep learning with generalization to higher Reynolds numbers via transfer learning

Data-driven prediction of a multi-scale Lorenz 96 chaotic system using deep learning methods: Reservoir computing, ANN, and RNN-LSTM

A test case for application of convolutional neural networks to spatio-temporal climate data: Re-identifying clustered weather patterns