Researcher profile

Pierre Gentine

Pierre Gentine contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
14 works
0 followers
17 topics
4 close collaborators

Published work

14 published items

preprint | 2026 | arXiv

Earth-o1: A Grid-free Observation-native Atmospheric World Model

Despite the unprecedented volume of multimodal data provided by modern Earth observation systems, our ability to model atmospheric dynamics remains constrained. Traditional modeling frameworks force heterogeneous measurements into predefined spatial grids, inherently limiting the full exploitation of raw sensor data and creating severe computational bottlenecks. Here we present Earth-o1, an observation-native atmospheric world model that overcomes these structural limitations. Rather than relying on conventional atmospheric dynamical modeling systems or traditional data assimilation, Earth-o1 directly learns the continuous, three-dimensional physical evolution of the Earth system from ungridded observational data. By integrating diverse sensor inputs into a unified, grid-free dynamical field, the model autonomously advances the atmospheric state in space and time. We show that this fundamentally distinct paradigm enables direct, real-time forecasting and cross-sensor inference without the overhead of explicit numerical solvers. In hindcast evaluations, Earth-o1 achieves surface forecast skill comparable to the operational Integrated Forecasting System (IFS). These results establish that continuous, observation-driven world models -- a new class of fully observation-native geophysical simulators -- can match the fidelity of established physical frameworks, providing a scalable data-driven foundation for a digital twin of the Earth.

preprint | 2026 | arXiv

In-context learning to predict critical transitions in dynamical systems

Critical transitions - abrupt, often irreversible changes in system dynamics - arise across human and natural systems, often with catastrophic consequences. Real-world observations of such shifts remain scarce, preventing the development of reliable early warning systems. Conventional statistical and spectral indicators, such as increasing variance, tend to fail under realistic conditions of limited data and correlated noise, whereas existing deep learning classifiers do not extrapolate beyond their training data distribution. In this work, we introduce TipPFN, an in-context learning (ICL) framework that uses a prior-data fitted network to infer a system's proximity to a critical transition. Trained on our novel synthetic data generator, which is based on canonical bifurcation scenarios coupled to diverse, randomized stochastic dynamics, TipPFN flexibly capitalizes on contexts of various sizes, complexity and dimensionalities. We demonstrate robust, state-of-the-art early detection of critical transitions in previously unseen tipping regimes, sim-to-real examples, and real-world observations in both ICL and zero-shot settings.
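For context, the conventional indicators this abstract refers to can be sketched as follows. This is a minimal illustration of rolling variance and lag-1 autocorrelation on a synthetic series, not the TipPFN method; the function name `rolling_indicators`, the window length, and the AR(1) test series are all assumptions made for the example.

```python
import numpy as np

def rolling_indicators(series, window):
    """Return rolling variance and lag-1 autocorrelation over sliding windows."""
    series = np.asarray(series, dtype=float)
    if window < 3 or window > series.size:
        raise ValueError("window must be between 3 and len(series)")
    variances, autocorrs = [], []
    for start in range(series.size - window + 1):
        chunk = series[start:start + window]
        centered = chunk - chunk.mean()
        variances.append(centered.var())
        # Lag-1 autocorrelation; defined as 0 for a constant window.
        denom = np.sum(centered ** 2)
        ac1 = np.sum(centered[:-1] * centered[1:]) / denom if denom > 0 else 0.0
        autocorrs.append(ac1)
    return np.array(variances), np.array(autocorrs)

# Hypothetical test series: an AR(1) process whose persistence rises
# towards 1, mimicking critical slowing down before a transition.
rng = np.random.default_rng(42)
n = 2000
phi = np.linspace(0.1, 0.97, n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi[t] * x[t - 1] + rng.normal()

var, ac1 = rolling_indicators(x, window=400)
```

On this long, clean series both indicators rise towards the end. The abstract's point is that such trends become unreliable with short records and correlated noise.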

preprint | 2026 | arXiv

Wavelet Flow Matching for Multi-Scale Physics Emulation

Accurate emulation of multi-scale physical systems governed by PDEs demands models that remain stable over long autoregressive rollouts while preserving fine-scale structures. Deterministic emulators produce overly smoothed predictions, while generative approaches better capture details but are costly. Latent-space generative models have emerged as a compromise but with the additional cost of separately pre-trained autoencoders. We propose Wavelet Flow Matching (WFM), a novel generative emulator that overcomes current trade-offs between cost and skill by performing optimal transport directly in the multi-scale wavelet space. Rather than learning a latent compression, WFM leverages the hierarchical structure of a U-Net to jointly predict transport velocities of a prescribed wavelet representation. On three challenging systems of chaotic fluid dynamics, WFM achieves superior long-horizon stability, accuracy and spectral coherence compared to state-of-the-art models. Our results clearly position the wavelet space as an effective training-free representation for generative emulation of complex physical dynamics.

preprint | 2022 | arXiv

Carbon Monitor-Power: near-real-time monitoring of global power generation on hourly to daily scales

We constructed a frequently updated, near-real-time global power generation dataset, Carbon Monitor-Power, covering the period since January 2016 at national level with near-global coverage and hourly-to-daily time resolution. The data presented here are collected from 37 countries across all continents for eight source groups, including three types of fossil sources (coal, gas, and oil), nuclear energy and four groups of renewable energy sources (solar energy, wind energy, hydro energy and other renewables including biomass, geothermal, etc.). The global near-real-time power dataset shows the dynamics of the global power system, including its hourly, daily, weekly and seasonal patterns as influenced by daily periodical activities, weekends, seasonal cycles, regular and irregular events (e.g., holidays) and extreme events (e.g., the COVID-19 pandemic). The Carbon Monitor-Power dataset reveals that the COVID-19 pandemic caused strong disruptions in some countries (e.g., China and India), leading to a temporary or long-lasting shift to low carbon intensity, while it had little impact in some other countries (e.g., Australia). This dataset offers a large range of opportunities for power-related scientific research and policy-making.

preprint | 2022 | arXiv

Dryland evapotranspiration from remote sensing solar-induced chlorophyll fluorescence: constraining an optimal stomatal model within a two-source energy balance model

Evapotranspiration (ET) represents the largest water loss flux in drylands, but ET and its partition into plant transpiration (T) and soil evaporation (E) are poorly quantified, especially at fine temporal scales. Physically-based remote sensing models relying on sensible heat flux estimates, like the two-source energy balance model, could benefit from considering more explicitly the key effect of stomatal regulation on dryland ET. The objective of this study is to assess the value of solar-induced chlorophyll fluorescence (SIF), a proxy for photosynthesis, to constrain the canopy conductance (Gc) of an optimal stomatal model within a two-source energy balance model in drylands. We assessed our ET estimation using in situ eddy covariance GPP as a benchmark, and compared it with results from using the Contiguous solar-induced chlorophyll fluorescence (CSIF) remote sensing product instead of GPP, with and without the effect of root-zone soil moisture on Gc. The estimated ET was robust across four steppe and two tree-grass dryland ecosystems. Comparison of ET simulated against in situ GPP yielded an average R$^2$ of 0.73 (0.86) and RMSE of 0.031 (0.36) mm at half-hourly (daily) timescale. Explicitly including the soil moisture effect on Gc increased the R$^2$ to 0.76 (0.89). For the CSIF model, the average R$^2$ for ET estimates also improved when including the effect of soil moisture: from 0.65 (0.79) to 0.71 (0.84), with RMSE ranging between 0.023 (0.22) and 0.043 (0.54) mm depending on the site. Our results demonstrate the capacity of SIF to estimate subdaily and daily ET fluxes under very low ET conditions. SIF can provide effective vegetation signals to constrain stomatal conductance and partition ET into T and E in drylands. This approach could be extended for regional estimates using remote sensing SIF estimates such as CSIF, TROPOMI-SIF, or the upcoming FLEX mission, among others.

preprint | 2022 | arXiv

Non-Linear Dimensionality Reduction with a Variational Encoder Decoder to Understand Convective Processes in Climate Models

Deep learning can accurately represent sub-grid-scale convective processes in climate models, learning from high resolution simulations. However, deep learning methods usually lack interpretability due to large internal dimensionality, resulting in reduced trustworthiness in these methods. Here, we use Variational Encoder Decoder structures (VED), a non-linear dimensionality reduction technique, to learn and understand convective processes in an aquaplanet superparameterized climate model simulation, where deep convective processes are simulated explicitly. We show that, similar to previous deep learning studies based on feed-forward neural nets, the VED is capable of learning and accurately reproducing convective processes. In contrast to past work, we show this can be achieved by compressing the original information into only five latent nodes. As a result, the VED can be used to understand convective processes and delineate modes of convection through the exploration of its latent dimensions. A close investigation of the latent space enables the identification of different convective regimes: a) stable conditions are clearly distinguished from deep convection with low outgoing longwave radiation and strong precipitation; b) high optically thin cirrus-like clouds are separated from low optically thick cumulus clouds; and c) shallow convective processes are associated with large-scale moisture content and surface diabatic heating. Our results demonstrate that VEDs can accurately represent convective processes in climate models, while enabling interpretability and better understanding of sub-grid-scale physical processes, paving the way to increasingly interpretable machine learning parameterizations with promising generative properties.

preprint | 2022 | arXiv

Pre-averaging fractional processes contaminated by noise, with an application to turbulence

In this article, we consider the problem of estimating fractional processes based on noisy high-frequency data. Generalizing the idea of pre-averaging to a fractional setting, we exhibit a sequence of consistent estimators for the unknown parameters of interest by proving a law of large numbers for associated variation functionals. In contrast to the semimartingale setting, the optimal window size for pre-averaging depends on the unknown roughness parameter of the underlying process. We evaluate the performance of our estimators in a simulation study and use them to empirically verify Kolmogorov's 2/3-law in turbulence data contaminated by instrument noise.

preprint | 2021 | arXiv

De-carbonization of global energy use during the COVID-19 pandemic

The COVID-19 pandemic has disrupted human activities, leading to unprecedented decreases in both global energy demand and GHG emissions. Yet it is little known that there was also a low-carbon shift of the global energy system in 2020. Here, using near-real-time data on energy-related GHG emissions from 30 countries (about 70% of global power generation), we show that the pandemic caused an unprecedented de-carbonization of the global power system, represented by a dramatic decrease in the carbon intensity of the power sector that reached a historical low of 414.9 tCO2eq/GWh in 2020. Moreover, the share of energy derived from renewable and low-carbon sources (nuclear, hydro-energy, wind, solar, geothermal, and biomass) exceeded that from coal and oil for the first time in history in May of 2020. The decrease in global net energy demand (-1.3% in the first half of 2020 relative to the average of the period in 2016-2019) masks a large down-regulation of supply from fossil-fuel-burning power plants (-6.1%) coincident with a surge of low-carbon sources (+6.2%). Concomitant changes in the diurnal cycle of electricity demand also favored low-carbon generators, including a flattening of the morning ramp, a lower midday peak, and delays in both the morning and midday load peaks in most countries. However, emission intensities in the power sector have since rebounded in many countries, and a key question for climate mitigation is thus to what extent countries can achieve and maintain lower, pandemic-level carbon intensities of electricity as part of a green recovery.

preprint | 2021 | arXiv

Enforcing Analytic Constraints in Neural-Networks Emulating Physical Systems

Neural networks can emulate nonlinear physical systems with high accuracy, yet they may produce physically inconsistent results when violating fundamental constraints. Here, we introduce a systematic way of enforcing nonlinear analytic constraints in neural networks via constraints in the architecture or the loss function. Applied to convective processes for climate modeling, architectural constraints enforce conservation laws to within machine precision without degrading performance. Enforcing constraints also reduces errors in the subsets of the outputs most impacted by the constraints.
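The idea of enforcing a constraint through the architecture can be sketched in its simplest form. The example below assumes a single linear conservation law, sum_i w_i y_i = c, which is a simplification of the nonlinear constraints the abstract describes. The function name, the weights, and the choice of the last output as the residual are illustrative assumptions, not taken from the paper. The model predicts all but one output freely, and a fixed final step computes the remaining output from the constraint, so the law holds to floating-point precision regardless of the learned parameters.

```python
import numpy as np

def constrained_outputs(free_outputs, weights, total):
    """Append one residual output so that weights @ y == total.

    free_outputs: shape (batch, n - 1), e.g. raw network predictions
    weights:      shape (n,), coefficients of the linear constraint;
                  the last coefficient must be non-zero
    total:        shape (batch,), required value of the weighted sum
    """
    free_outputs = np.asarray(free_outputs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = np.asarray(total, dtype=float)
    if weights[-1] == 0.0:
        raise ValueError("the residual output needs a non-zero weight")
    # Contribution of the freely predicted outputs to the weighted sum.
    partial = free_outputs @ weights[:-1]
    # Solve the constraint for the one remaining output.
    residual = (total - partial) / weights[-1]
    return np.concatenate([free_outputs, residual[:, None]], axis=1)

# Hypothetical stand-in for network predictions (batch of 4, 3 free outputs).
rng = np.random.default_rng(0)
free = rng.normal(size=(4, 3))
w = np.array([1.0, 2.0, -1.0, 0.5])
target = np.zeros(4)

y = constrained_outputs(free, w, target)
violation = np.abs(y @ w - target).max()
```

A loss-function penalty on the same quantity would only push `violation` towards zero on average, whereas this construction removes it up to rounding error.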

preprint | 2021 | arXiv

Global Daily CO$_2$ emissions for the year 2020

The diurnal cycle of CO$_2$ emissions from fossil fuel combustion and cement production reflects seasonality, weather conditions, working days, and more recently the impact of the COVID-19 pandemic. Here, for the first time we provide a daily CO$_2$ emission dataset for the whole year of 2020 calculated from inventory and near-real-time activity data (called Carbon Monitor project: https://carbonmonitor.org). It was previously suggested from preliminary estimates that did not cover the entire year of 2020 that the pandemic may have caused more than 8% annual decline of global CO$_2$ emissions. Here we show from detailed estimates of the full year data that the global reduction was only 5.4% (-1,901 MtCO$_2$). This decrease is 5 times larger than the annual emission drop at the peak of the 2008 Global Financial Crisis. However, global CO$_2$ emissions gradually recovered towards 2019 levels from late April with global partial re-opening. More importantly, global CO$_2$ emissions even increased slightly by +0.9% in December 2020 compared with 2019, indicating a rebound trend in global emissions. Later waves of COVID-19 infections in late 2020 and corresponding lockdowns have caused further CO$_2$ emissions reductions particularly in western countries, but to a much smaller extent than the declines in the first wave. That even substantial world-wide lockdowns of activity led to a one-time decline in global CO$_2$ emissions of only 5.4% in one year highlights the significant challenges for climate change mitigation that we face in the post-COVID era. These declines are significant, but will be quickly overtaken with new emissions unless the COVID-19 crisis is utilized as a break-point with our fossil-fuel trajectory, notably through policies that make the COVID-19 recovery an opportunity to green national energy and development plans.

preprint | 2021 | arXiv

Global Gridded Daily CO$_2$ Emissions

Precise and high-resolution carbon dioxide (CO$_2$) emission data is of great importance for achieving carbon neutrality around the world. Here we present for the first time the near-real-time Global Gridded Daily CO$_2$ Emission Datasets (called GRACED) from fossil fuel and cement production with a global spatial resolution of 0.1$^\circ$ by 0.1$^\circ$ and a temporal resolution of 1 day. Gridded fossil emissions are computed for different sectors based on the daily national CO$_2$ emissions from the near-real-time dataset (Carbon Monitor), the spatial patterns of the point source emission dataset Global Carbon Grid (GID), the Emission Database for Global Atmospheric Research (EDGAR) and spatiotemporal patterns of satellite nitrogen dioxide (NO$_2$) retrievals. Our study on the global CO$_2$ emissions responds to the growing and urgent need for high-quality, fine-grained near-real-time CO$_2$ emissions estimates to support global emissions monitoring across various spatial scales. We show the spatial patterns of emission changes for power, industry, residential consumption, ground transportation, domestic and international aviation, and international shipping sectors between 2019 and 2020. This helps us gain insight into the relative contributions of various sectors and provides a faster and more fine-grained overview of where and when fossil CO$_2$ emissions have decreased and rebounded in response to emergencies (e.g. COVID-19) and other disturbances of human activities than any previously published dataset. As the world recovers from the pandemic and decarbonizes its energy systems, regular updates of this dataset will allow policymakers to more closely monitor the effectiveness of climate and energy policies and quickly adapt.

preprint | 2020 | arXiv

Towards Physically-consistent, Data-driven Models of Convection

Data-driven algorithms, in particular neural networks, can emulate the effect of sub-grid scale processes in coarse-resolution climate models if trained on high-resolution climate simulations. However, they may violate key physical constraints and lack the ability to generalize outside of their training set. Here, we show that physical constraints can be enforced in neural networks, either approximately by adapting the loss function or to within machine precision by adapting the architecture. As these physical constraints are insufficient to guarantee generalizability, we additionally propose to physically rescale the training and validation data to improve the ability of neural networks to generalize to unseen climates.

preprint | 2018 | arXiv

Deep learning to represent sub-grid processes in climate models

The representation of nonlinear sub-grid processes, especially clouds, has been a major source of uncertainty in climate models for decades. Cloud-resolving models better represent many of these processes and can now be run globally but only for short-term simulations of at most a few years because of computational limitations. Here we demonstrate that deep learning can be used to capture many advantages of cloud-resolving modeling at a fraction of the computational cost. We train a deep neural network to represent all atmospheric sub-grid processes in a climate model by learning from a multi-scale model in which convection is treated explicitly. The trained neural network then replaces the traditional sub-grid parameterizations in a global general circulation model in which it freely interacts with the resolved dynamics and the surface-flux scheme. The prognostic multi-year simulations are stable and closely reproduce not only the mean climate of the cloud-resolving simulation but also key aspects of variability, including precipitation extremes and the equatorial wave spectrum. Furthermore, the neural network approximately conserves energy despite not being explicitly instructed to. Finally, we show that the neural network parameterization generalizes to new surface forcing patterns but struggles to cope with temperatures far outside its training manifold. Our results show the feasibility of using deep learning for climate model parameterization. In a broader context, we anticipate that data-driven Earth System Model development could play a key role in reducing climate prediction uncertainty in the coming decade.