Our Nature Paper: Using Gaussian Processes to Perturb Climate Models

I am excited to share that I am a co-author on a paper recently published in Nature Communications Earth & Environment: “A framework for generating stochastic perturbations in chaotic dynamical systems using machine learning”. This work introduces a novel method for representing model uncertainty in complex climate models.

The Challenge of Climate Model Uncertainty

Climate models are incredibly complex, and one of the major challenges is accounting for “model uncertainty”. A key source of this uncertainty is processes that occur at scales smaller than the model’s grid spacing. These “sub-grid” processes can significantly affect the large-scale climate dynamics, yet they are difficult to represent and to understand physically.

Our Approach: Data-Driven Perturbations with Gaussian Processes

Our paper presents a new framework that uses machine learning to generate physically realistic perturbations. The core idea is to generate training data from high-resolution model runs and to “learn” the structure of the sub-grid variability from that data. We then use the learned structure to generate perturbations at run time in a lower-resolution climate model.
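
To make the "training data from high-resolution runs" step concrete, here is a toy sketch in Python. It is not the code or the data from the paper: the synthetic 1D field, the block-averaging in `coarse_grain`, and the choice of the sub-grid standard deviation as the training target are all assumptions made purely for illustration.

```python
import numpy as np

def coarse_grain(field, factor):
    """Block-average a 1D field onto a grid `factor` times coarser.

    Returns the coarse-cell means and the high-resolution values
    grouped by coarse cell (shape: n_coarse x factor).
    """
    n_coarse = field.size // factor
    blocks = field[: n_coarse * factor].reshape(n_coarse, factor)
    return blocks.mean(axis=1), blocks

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)

# Synthetic stand-in for a high-resolution run (assumption):
# a smooth large-scale wave plus small-scale noise.
high_res = np.sin(x) + 0.1 * rng.standard_normal(x.size)

coarse_mean, blocks = coarse_grain(high_res, factor=8)

# Sub-grid residual: the variability inside each coarse cell
# that the low-resolution grid cannot represent.
subgrid = blocks - coarse_mean[:, None]

# One (input, target) pair per coarse cell:
# resolved state -> spread of the unresolved variability.
train_inputs = coarse_mean
train_targets = subgrid.std(axis=1)
```

In this toy setup, `train_inputs` and `train_targets` play the role of the training data that a statistical model would learn from; a real climate model would involve many variables and a far more careful choice of inputs and targets.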

My key contributions to this work were in the following areas:

  • Gaussian Process (GP) Design and Training: I was heavily involved in designing and training the Gaussian Process models that form the heart of this framework. GPs are a powerful tool for non-parametric regression, and they are particularly well suited to this task because they provide a distribution over possible functions rather than a single best fit. Sampling from that distribution is what lets us generate perturbations that reflect the complex, multi-scale nature of the sub-grid variability.
  • Experimental Design: A crucial part of this project was designing the experiments to effectively learn the sub-grid dynamics. I worked on the experimental design to ensure that we could efficiently train our GP models and validate their performance.
  • Extensive Coding and Implementation: I was responsible for a significant portion of the coding and implementation of the GP framework. This involved working with Python and its scientific computing stack to build a robust and efficient pipeline for training the GPs and generating the perturbations.
  • Data Visualization: I created numerous plots and visualizations to analyze the results and communicate our findings. While many of these did not make it into the final paper, they were crucial for understanding the performance of our method and will be featured in my PhD thesis.
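
For readers unfamiliar with GPs, the "distribution over functions" point can be illustrated with a minimal NumPy sketch. This is not the implementation from the paper: the squared-exponential kernel, the hyperparameter values, and the synthetic sine data are assumptions chosen only to show how random draws from a GP posterior can serve as perturbations around a mean prediction.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1D points."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2, **kernel_args):
    """Posterior mean and covariance of a zero-mean GP at x_test.

    `noise` is the assumed observation-noise variance.
    """
    k_tt = rbf_kernel(x_train, x_train, **kernel_args)
    k_tt = k_tt + noise * np.eye(x_train.size)
    k_ts = rbf_kernel(x_train, x_test, **kernel_args)
    k_ss = rbf_kernel(x_test, x_test, **kernel_args)

    # Cholesky-based solves are more stable than forming an explicit inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)

    mean = k_ts.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov

rng = np.random.default_rng(0)

# Synthetic training data (assumption): noisy samples of a sine curve.
x_train = np.linspace(-3.0, 3.0, 12)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(x_train.size)
x_test = np.linspace(-3.0, 3.0, 50)

mean, cov = gp_posterior(x_train, y_train, x_test, noise=0.01, length_scale=1.0)

# Each draw is one plausible function; subtracting the mean leaves
# a zero-centred perturbation with the learned covariance structure.
jitter = 1e-6 * np.eye(x_test.size)  # small diagonal term for numerical stability
samples = rng.multivariate_normal(mean, cov + jitter, size=5)
perturbations = samples - mean
```

Each row of `samples` is a different function consistent with the training data, which is the property a single best-fit regression cannot offer. In practice a library such as GPyTorch or scikit-learn would handle hyperparameter fitting, which this sketch skips by fixing the values by hand.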

Impact and Future Work

This research demonstrates the potential of machine learning to create more realistic representations of model uncertainty in climate science. The framework we developed is already being extended to more complex climate models, such as the Community Atmosphere Model (CAM) within CESM, in partnership with IIT Delhi. This next phase of the project will involve deploying advanced Gaussian Process methods on state-of-the-art NVIDIA GH200 Grace Hopper Superchips, with the aim of pushing the boundaries of what is possible in climate model calibration.

I am incredibly proud to have been a part of this research, and I am excited to continue exploring the intersection of machine learning and climate science in my future work. If you are interested in learning more, please feel free to reach out or read the full paper here.