Towards the combination of physical and data-driven forecasts for Earth system prediction – Eviatar Bach (ENS Paris)

The seminar is on January 26th at 14:00 (CET), both remotely and in person. The in-person meeting will be held in the SCAI conference room (map at the end of the post).

If you would like to attend online, here is the Zoom link:

Eviatar Bach’s presentation is entitled:

«Towards the combination of physical and data-driven forecasts for Earth system prediction»


Due to the recent success of machine learning (ML) in many prediction problems, there is a high degree of interest in applying ML to Earth system prediction. However, because of the high dimensionality of the system, it is critical to use hybrid methods which combine data-driven models, physical models, and observations. I will present two such hybrid methods: Ensemble Oscillation Correction (EnOC) and multi-model data assimilation (MM-DA).

Oscillatory modes of the climate system are one of its most predictable features, especially at intraseasonal timescales. It has previously been shown that these oscillations can be predicted well with statistical methods, often with better skill than dynamical models. However, they only represent a portion of the signal, and a method for beneficially combining them with dynamical forecasts of the full system has not previously been developed. Ensemble Oscillation Correction (EnOC) is a method which corrects oscillatory modes in ensemble forecasts from dynamical models. We show results of EnOC applied to chaotic toy models with significant oscillatory components, as well as to forecasts of South Asian monsoon rainfall.

A more general method for combining multiple models and observations is multi-model data assimilation (MM-DA). MM-DA generalizes the variational or Bayesian formulation of the Kalman filter. However, previous implementations of this approach have not estimated the model error, and have therefore not been able to correctly weight the separate models and the observations. Here, we show how multiple models can be combined for both forecasting and DA by using an ensemble Kalman filter with adaptive model error estimation. This methodology is applied to multiscale chaotic models and results in significant error reductions compared to the best model and to an unweighted multi-model ensemble. Lastly, I will discuss the potential of this method for combining physical model forecasts, ML, and observations.
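To see why error estimates matter for the weighting, here is a minimal sketch of the precision-weighted (inverse-variance) combination of independent scalar Gaussian estimates, which is the scalar case of the Bayesian update behind the Kalman filter. It is not the MM-DA algorithm presented in the talk; the function name and the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def combine_gaussian_estimates(means, variances):
    """Combine independent scalar Gaussian estimates by inverse-variance weighting."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances                  # weight of each estimate
    combined_variance = 1.0 / precisions.sum()    # always below the smallest input variance
    combined_mean = combined_variance * (precisions * means).sum()
    return combined_mean, combined_variance

# Hypothetical values: two model forecasts and one observation of the same quantity.
combined_mean, combined_variance = combine_gaussian_estimates(
    means=[1.0, 4.0, 2.0],
    variances=[1.0, 4.0, 0.5],
)
print(combined_mean, combined_variance)
```

With wrong variances the weights are wrong too, which is why an unweighted multi-model average can be beaten once the error of each model is estimated.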


Eviatar Bach is a Make Our Planet Great Again (MOPGA) postdoctoral fellow in Michael Ghil’s group at the École Normale Supérieure in Paris. Previously, he obtained his PhD at the University of Maryland, College Park with Eugenia Kalnay and Safa Mote. He is currently working on improving geophysical forecasts with data assimilation and data-driven prediction methods, and is interested in understanding the nonlinear dynamics and predictability of the climate system.

New ways for dynamical prediction of extreme heat waves: rare event simulations and machine learning with deep neural networks. – Freddy Bouchet (ENS Lyon)

The seminar is on October 19th at 14:00 (CEST), both in person and remotely.

Place of the seminar: « Campus Pierre & Marie Curie » of Sorbonne University. It will take place in the SCAI seminar room, building « Esclangon », 1st floor.

If you would like to attend online, here is the Zoom link:

Freddy Bouchet’s presentation is entitled:

«New ways for dynamical prediction of extreme heat waves: rare event simulations and machine learning with deep neural networks.»


In the climate system, extreme events or transitions between climate attractors are of primary importance for understanding the impact of climate change. Recent extreme heat waves with huge impacts are striking examples. However, it is very hard to study those events with conventional approaches because of the lack of statistics: they are too rare in historical data, and realistic models are too complex to be run long enough.

We cope with this lack of data using rare event simulations. Using some of the best climate models, we oversample extremely rare events and obtain several hundred more events than with usual climate runs, at a fixed numerical cost. Coupled with deep neural networks, this approach drastically improves the prediction of extreme heat waves.

This sheds new light on the fluid mechanics processes which lead to extreme heat waves. We will describe quasi-stationary patterns of turbulent Rossby waves that lead to global teleconnection patterns in connection with heat waves and analyze their dynamics. We stress the relevance of these patterns for recently observed extreme heat waves and the prediction potential of our approach.

Climate Modeling in the Age of Machine Learning – Laure Zanna (NYU)

The seminar is on June 23rd at 15:00 and will be held remotely, in English.

Link to the zoom session:

Laure Zanna’s presentation is entitled:

«Climate Modeling in the Age of Machine Learning»


Numerical simulations used for weather and climate predictions solve approximations of the governing laws of fluid motions on a grid. Ultimately, uncertainties in climate predictions originate from the poor or lacking representation of processes, such as ocean turbulence and clouds, that are not resolved on the grid of global climate models. The representation of these unresolved processes has been a bottleneck in improving climate simulations and projections. The explosion of climate data and the power of machine learning algorithms are suddenly offering new opportunities: can we deepen our understanding of these unresolved processes and simultaneously improve their representation in climate models to reduce uncertainty in climate projections? In this talk, I will discuss the current state of climate modeling and its future, focusing on the advantages and challenges of using machine learning for climate projections. I will present some of our recent work in which we leverage tools from machine learning and deep learning to learn representations of unresolved ocean processes and improve climate simulations. Our work suggests that machine learning could open the door to discovering new physics from data and enhance climate predictions.

Short bio:

Laure Zanna is a Professor in Mathematics & Atmosphere/Ocean Science at the Courant Institute, New York University.  Her research focuses on the role of ocean dynamics in climate change. Prior to NYU, she was a faculty member at the University of Oxford until 2019 and obtained her PhD in 2009 in Climate Dynamics from Harvard University. She was the recipient of the 2020 Nicholas P. Fofonoff Award from the American Meteorological Society “For exceptional creativity in the development and application of new concepts in ocean and climate dynamics”. She is the lead principal investigator of M²LInES, an international effort supported by Schmidt Futures to improve climate models with scientific machine learning. 

Filling gaps in ocean satellite data – Aida Alvera-Azcárate & Alexander Barth

Link to the slides.

The seminar is on May 6th at 14:00 and will be held remotely, in English.

Link to the zoom session:

Aida Alvera-Azcárate’s presentation is entitled:

« Filling gaps in ocean satellite data »


Satellite data offer an unequalled amount of information about the Earth’s surface, including the ocean. However, data measured using visible and infrared wavebands are affected by the presence of clouds and therefore have a large amount of missing data (on average, clouds cover about 75% of the Earth). The spatial and temporal scales of variability in the ocean require techniques able to handle undersampling of the dominant scales of variability. The GHER (GeoHydrodynamics and Environment Research) of the University of Liege in Belgium has been working over the last two decades on interpolation techniques for satellite and in situ ocean data. In this talk we will focus on techniques developed for satellite data. We’ll start with DINEOF (Data Interpolating Empirical Orthogonal Functions), which is a data-driven technique using EOFs to infer missing information in satellite datasets. We will follow with a more recent development, DINCAE (Data Interpolating Convolutional AutoEncoder). Training a neural network with incomplete data is problematic, and this is overcome in DINCAE by using the satellite data and its expected error variance as input. The autoencoder provides the reconstructed field along with its expected error variance as output. We will provide examples of reconstructed satellite data for several variables, like sea surface temperature and chlorophyll concentration, and some recent developments with DINCAE to grid altimetry data into complete fields.
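The idea of using EOFs to infer missing values can be sketched as an iterated truncated-SVD reconstruction. This is a simplified illustration in the spirit of DINEOF, not the GHER implementation: the function name, the fixed number of modes, the first guess, and the synthetic rank-1 field are all assumptions made here, and the real method includes steps (such as choosing the number of modes) that this sketch leaves out.

```python
import numpy as np

def fill_gaps_with_eofs(data, n_modes=1, n_iter=200):
    """Fill NaN gaps in a (space x time) matrix by iterating a truncated-SVD reconstruction."""
    missing = np.isnan(data)
    # First guess for the gaps: the mean of the observed values (an assumption of this sketch).
    filled = np.where(missing, np.nanmean(data), data)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Reconstruct the field from the leading EOF modes only.
        reconstruction = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]
        # Update the gaps only; observed values are never modified.
        filled[missing] = reconstruction[missing]
    return filled

# Synthetic example: a rank-1 field with three "cloudy" pixels removed.
space_pattern = np.array([1.0, 2.0, 3.0, 4.0])
time_series = np.array([1.0, 0.5, 2.0, 1.5, 3.0])
truth = np.outer(space_pattern, time_series)
gappy = truth.copy()
gappy[0, 1] = gappy[2, 3] = gappy[3, 0] = np.nan
reconstructed = fill_gaps_with_eofs(gappy, n_modes=1)
print(np.abs(reconstructed - truth).max())
```

On real satellite fields the number of retained modes matters a great deal; here it is fixed to the known rank of the synthetic field.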

Short bios:

Aida Alvera-Azcárate is a researcher at the GHER (GeoHydrodynamics and Environment Research) of the University of Liege in Belgium. She did a PhD in Science at the University of Liege and a postdoc at the University of South Florida (US) before joining the GHER in 2007, where she studies the ocean using satellite and in situ data and works on the development of interpolation techniques to reconstruct satellite data.

Alexander Barth is a researcher working at the University of Liege (Belgium) in the GHER group (GeoHydrodynamics and Environment Research). He did a PhD on nested numerical ocean models and data assimilation. Currently he is working on variational analysis schemes for climatologies and neural networks to reconstruct missing data.

Working group 3: Pierre Lepetit – Estimation of visibility and snow height on webcam images with a learning-to-rank approach

The internal workshop « SCAI & AI4Climate » brings together researchers, engineers, PhD students and postdocs interested in topics related to the design and use of new Artificial Intelligence methods for studying the environment, from models to observations. The first meetings will be devoted to the work of PhD students. The talk will be followed by a discussion with the participants on the approach and possible directions for the work.

March 16th at 10:00
on the Jussieu campus,
SCAI meeting room
Esclangon building, 1st floor

Join the Zoom meeting
(see connection details below)

  • The image-based estimation of meteorological parameters provides clear benefits for surface weather observation. When a local event arises, such as dense fog or settling snow, webcams and CCTV cameras are sources of valuable information. These images inform about the class of weather (sunny, rainy, foggy, snowy, etc.). They also make it possible to gauge quantitative parameters such as the horizontal visibility (the farthest you can see), the snow height, the precipitation rate, etc., with variable precision.
  • Recently, the weather classification task has been successfully addressed by deep learning approaches. However, quantitative estimation faces a strong difficulty: the existing data sets that contain both images and precise weather measurements are rare and involve only a few different outdoor scenes. It is virtually impossible for an expert to assign image-wise quantitative labels, but it is possible to compare two images from the same webcam and therefore assign pairwise labels. An “uncomparable” label is assigned to pairs for which the expert is not able to distinguish the two images with respect to the parameter.
  • This analysis gives the starting point of the workshop. The discussion will deal with the methods of labeling, learning to rank and calibration that may help to yield such comparisons and to predict ordinal or quantitative estimations of visibility and snow height. The way uncomparable pairs could lead to the prediction of an image-wise uncertainty will also be addressed.
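As a rough illustration of learning from pairwise labels, the sketch below fits a linear scoring function with a logistic pairwise loss, so that the image judged "higher" in each pair receives the larger score. It is only a toy under stated assumptions: the one-dimensional feature, the pairs, and the function name are hypothetical, real work would use image features from a deep network, and the uncomparable pairs discussed above are simply ignored here.

```python
import numpy as np

def train_pairwise_ranker(features, pairs, lr=0.5, n_epochs=200):
    """Fit linear scores from pairs (i, j) meaning item i ranks above item j."""
    w = np.zeros(features.shape[1])
    for _ in range(n_epochs):
        grad = np.zeros_like(w)
        for i, j in pairs:
            diff = features[i] - features[j]
            # Pairwise logistic loss log(1 + exp(-w.diff)); its gradient is
            # -diff * sigmoid(-w.diff) = -diff / (1 + exp(w.diff)).
            grad -= diff / (1.0 + np.exp(w @ diff))
        w -= lr * grad / len(pairs)
    return w

# Hypothetical data: one scalar feature per image (for example an image contrast measure).
features = np.array([[0.1], [0.4], [0.7], [0.9]])
# Expert comparisons: the first image of each pair has the greater visibility.
pairs = [(1, 0), (2, 1), (3, 2), (3, 0)]
w = train_pairwise_ranker(features, pairs)
scores = features @ w
print(scores)
```

The learned scores are only ordinal; mapping them to a visibility in metres or a snow height would need the calibration step mentioned in the discussion topics.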

Join the Zoom meeting

Meeting ID: 982 7831 9724

Find your local number:

Machine learning and natural hazards – Sophie Giffard-Roisin

Link for the slides

The seminar is on February 10th at 14:00 and will be held remotely.

Link to the zoom session:

Sophie Giffard-Roisin’s presentation is entitled:

« Machine learning and natural hazards »

The goal of this talk is to show how we can use the strength of artificial intelligence to help make diagnoses and find concrete, local solutions to natural hazards. Tropical cyclones, avalanches, earthquakes and landslides often affect vulnerable areas and populations, where a better understanding of the phenomena and better risk assessment and predictions can make a substantial impact. The data available to monitor these natural phenomena has considerably increased in recent years. For example, SAR (synthetic aperture radar) imaging data, provided by the Sentinel 1 satellites, is now freely available up to every 6 days in a majority of regions, even remote areas. Yet artificial intelligence (AI) and machine learning (ML) have only scarcely been used in these domains, although these techniques have already shown their impact in many scientific fields with similar data structures (large volume of data, presence of noise, complex physical phenomena), such as medical imaging (detection/segmentation of pathologies), crop yield (prediction) and security (recognition). We will see in this talk, with concrete examples, how to design machine learning models for specific tasks with real imaging or temporal data inputs. Concretely, starting mainly from convolutional neural networks, what are the key aspects to consider and what are the pitfalls to avoid?

Short bio:
Sophie Giffard-Roisin is a researcher hired by IRD (French National Institute for Sustainable Development) and based at ISTerre, Grenoble (UGA, France). Her work focuses on machine learning applications for natural hazards, especially using remote sensing and time series data. She did her PhD at Inria, Nice (France) under the supervision of Nicholas Ayache on machine learning and modelling for medical image analysis. She then did a postdoc at CU Boulder, Colorado (USA) in Claire Monteleoni’s team, where she worked on climate and meteorological applications of machine learning. In 2019 she moved to ISTerre, the Earth Science Laboratory of Grenoble Université (UGA, France), for a permanent position, where she now focuses on machine learning for natural hazards in geosciences.

Power-efficient deep learning algorithms – Sébastien Loustau

Link for the slides

The next seminar is on October 14th (14:30) at the « Campus Pierre & Marie Curie » of Sorbonne University. It will take place in the SCAI seminar room, building « Esclangon », 1st floor.

If you would like to attend this seminar in person:

Sébastien will present his work in the SCAI seminar room (access map:
Please register using this link:
We nevertheless recommend that you bring your laptop so that you can also be connected to the Zoom room (see below)

If you would like to attend remotely:

Here is the Zoom link:
You will also be able to ask questions in the chat, which will be relayed to the room.

Sébastien Loustau’s presentation is entitled:

« Power-efficient deep learning algorithms »

In this talk, I will present both theoretical and practical aspects of designing power-efficient deep learning algorithms. After a non-exhaustive survey of different contributions covering the machine learning perspective (training low bit-width networks), the hardware counterpart (CNN accelerators) and the relationship with Auto-ML and the NAS procedure, I will present a theoretically grounded approach for adding a power-efficiency constraint to the optimization procedure used to train deep nets. This work in progress bridges optimal transport and information theory with online learning.

Short bio:
Sébastien is a researcher in mathematical statistics and Machine Learning. He has studied the theoretical aspects of both statistical and online learning. His research interests include online learning, unsupervised learning, adaptive algorithms and minimax theory. He also founded LumenAI 5 years ago.