Open In Colab   Open in Kaggle

Bonus Tutorial 8: Paleoclimate Models#

Week 1, Day 4, Paleoclimate

Content creators: Sloane Garelick

Content reviewers: Yosmely Bermúdez, Dionessa Biton, Katrina Dobson, Maria Gonzalez, Will Gregory, Nahid Hasan, Paul Heubel, Sherry Mi, Beatriz Cosenza Muralles, Brodie Pearson, Jenna Pearson, Rieke Schäfer, Chi Zhang, Ohad Zivan

Content editors: Yosmely Bermúdez, Paul Heubel, Zahra Khodakaramimaghsoud, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan

Production editors: Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan

Our 2024 Sponsors: CMIP, NFDI4Earth

Tutorial Objectives#

So far today, we’ve been focusing on how proxy-based reconstructions can be used to understand past variations in Earth’s climate sytem. However, another useful tool in paleoclimate is the use of climate models.

In this tutorial, you’ll explore data from the Paleoclimate Modelling Intercomparison Project 3 (PMIP3). More specifically, you’ll be analyzing global mean surface temperature (GMST) data from simulations for the past 1,000 years.

You’ll also compare the PMIP3 GMST data to a proxy-based reconstruction of temperature from Lake Tanganyika in East Africa (Tierney et al., 2010). Through this proxy-model comparison, you’ll be able to assess the differences and similarities between the two datasets.

By the end of this tutorial you will be able to:

  • Plot time series of paleoclimate model simulations

  • Compare and interpret proxy-based reconstructions of climate to paleoclimate model simulations of climate

Setup#

# installations ( uncomment and run this cell ONLY when using google colab or kaggle )

# !pip install pyleoclim
# imports
import pandas as pd
import matplotlib.pyplot as plt
import pooch
import os
import tempfile
import pyleoclim as pyleo

Figure Settings#

Hide code cell source
# @title Figure Settings
import ipywidgets as widgets  # interactive display

%config InlineBackend.figure_format = 'retina'
plt.style.use(
    "https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle"
)

Helper functions#

Hide code cell source
# @title Helper functions

def pooch_load(filelocation=None, filename=None, processor=None):
    shared_location = "/home/jovyan/shared/Data/tutorials/W1D4_Paleoclimate"  # this is different for each day
    user_temp_cache = tempfile.gettempdir()

    if os.path.exists(os.path.join(shared_location, filename)):
        file = os.path.join(shared_location, filename)
    else:
        file = pooch.retrieve(
            filelocation,
            known_hash=None,
            fname=os.path.join(user_temp_cache, filename),
            processor=processor,
        )

    return file

Video 1: Paleoclimate Models#

Section 1: Load Proxy-Based Temperature Reconstructions#

The proxy record we’ll be analyzing in this tutorial is a 1,000 year-long lake surface temperature reconstruction from Tierney et al., 2010. This record is from Lake Taganyika in equatorial East Africa and is based on the TEX86 ratio, which is a temperature proxy derived from the distribution of the isoprenoid glycerol dialkyl glycerol tetraether (iGDGT) of archaeal membrane lipids. The organisms producing these iGDGT compounds alter the structure of the compound in response to changes in temperature, so by measuring changes in the ratio of the different compounds, we can infer past changes in temperature.

Let’s start by loading the proxy data, saving it as a Series in Pyleoclim, and plotting a time series.

filename_tang = "tang_sst.csv"
url_tang = "https://osf.io/p8tx3/download"
proxy_temp = pd.read_csv(pooch_load(filelocation=url_tang, filename=filename_tang))
proxy_temp.head()
Downloading data from 'https://osf.io/p8tx3/download' to file '/tmp/tang_sst.csv'.
SHA256 hash of downloaded file: 238b0e12afb70c60d649bfc18072330e0f73fb376f3fccae5afdab72a37c311a
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Year LST
0 1996 25.7
1 1986 25.6
2 1976 25.0
3 1966 24.6
4 1956 24.5
proxy_t = pyleo.Series(
    time=proxy_temp["Year"],
    value=proxy_temp["LST"],
    time_name="Time",
    time_unit="yrs",
    value_name="Surface Temperature",
    value_unit="ºC",
    label="Lake Tanganyika Surface Temperature",
)

proxy_t.plot(color="C1")
Time axis values sorted in ascending order
/tmp/ipykernel_112773/914135199.py:1: UserWarning: auto_time_params is not specified. Currently default behavior sets this to True, which might modify your supplied time metadata.  Please set to False if you want a different behavior.
  proxy_t = pyleo.Series(
(<Figure size 1000x400 with 1 Axes>,
 <Axes: xlabel='Time [yrs]', ylabel='Surface Temperature [ºC]'>)
../../../_images/f207d5f3d54015bcfd0d77c9cc2d5111ad6053da76ed28e9a6ad8462c4ae5642.png

Questions 1#

Let’s make some initial observations about the data:

  1. What is the overall temperature trend over the past 2,000 years? Where are the major increases or decrease in temperature?

  2. What could be the cause of these shifts in lake surface temperature?

Click for solution

Section 2: Last Millenium PMIP3 GMST Data#

We can now load GMST anomaly data from the PMIP3 simulations for the past 1,000 years (Braconnot et al., 2012 and PAGES 2k-PMIP3 group, 2015). The anomalies are computed compared to the mean of the time series over the full length of temporal overlap between simulations.

# load the raw data 'PMIP3_GMST.txt'
filename_PMIP3 = "PMIP3_GMST.txt"
url_PMIP3 = "https://osf.io/gw2m5/download"
df = pd.read_table(pooch_load(filelocation=url_PMIP3, filename=filename_PMIP3))

# display the raw data
df
Downloading data from 'https://osf.io/gw2m5/download' to file '/tmp/PMIP3_GMST.txt'.
SHA256 hash of downloaded file: 7cb042abdc072202119a45fff3383a590ef770e8602afe1b0583e68fcfcfb843
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Year bcc_csm1_1 CCSM4 FGOALS_gl FGOALS_s2 IPSL_CM5A_LR MPI_ESM_P CSIRO GISS-E2-R_r1i1p121 GISS-E2-R_r1i1p127 ... CESM_member_1 CESM_member_2 CESM_member_3 CESM_member_4 CESM_member_5 CESM_member_6 CESM_member_7 CESM_member_8 CESM_member_9 CESM_member_10
0 850 -0.570693 -0.431830 NaN -0.620995 -0.475963 -0.170230 NaN 0.116333 0.155407 ... 0.036672 0.067692 0.085340 -0.000616 0.157021 0.048458 0.038173 -0.027151 0.143404 -0.053464
1 851 -0.698903 -0.411177 NaN -0.753160 -0.742970 -0.303124 -0.398695 0.068174 0.210337 ... 0.246816 0.181400 0.251417 0.170710 0.165139 0.324856 0.191677 0.120951 0.216921 0.068698
2 852 -0.575440 -0.404802 NaN -0.743508 -0.758939 -0.422623 -0.406343 0.060088 0.240585 ... 0.187429 0.065922 0.190229 0.264551 0.092629 0.386593 0.068904 0.292246 0.101564 0.200259
3 853 -0.724757 -0.552719 NaN -0.869331 -0.746460 -0.335177 -0.353557 -0.074396 0.030596 ... 0.202443 0.089054 -0.031298 0.205805 0.049447 0.023312 -0.041356 0.206064 0.212954 0.288272
4 854 -0.724328 -0.734938 NaN -0.826238 -0.684093 -0.650792 -0.416140 -0.402800 -0.330589 ... 0.062795 0.137882 -0.233049 -0.227240 -0.156577 -0.339176 -0.103825 0.058420 -0.006102 -0.006619
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1161 2011 1.013544 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1162 2012 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1163 2013 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1164 2014 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1165 2015 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

1166 rows × 22 columns

Note that the data file includes several ensemble members for Community Earth System Model (CESM) and Goddard Institute for Space Studies (GISS) simulations. Ensembles are essentially groups of climate model simulations used for climate projections, or in this case, reconstructions. You will learn about this in much more detail on W1D5: Climate Modeling.

For now, we can replace these with their ensemble mean series.

# create a new pandas.DataFrame to store the processed data
df_new = df.copy()

# remove the data columns for CESM and GISS ensemble members
for i in range(10):
    df_new = df_new.drop([f"CESM_member_{i+1}"], axis=1)

df_new = df_new.drop(["GISS-E2-R_r1i1p127.1"], axis=1)
df_new = df_new.drop(["GISS-E2-R_r1i1p127"], axis=1)
df_new = df_new.drop(["GISS-E2-R_r1i1p121"], axis=1)

# calculate the ensemble mean for CESM and GISS, and add the results into the table
df_new["CESM"] = df[
    [
        "CESM_member_1",
        "CESM_member_2",
        "CESM_member_3",
        "CESM_member_4",
        "CESM_member_5",
        "CESM_member_6",
        "CESM_member_7",
        "CESM_member_8",
        "CESM_member_9",
        "CESM_member_10",
    ]
].mean(axis=1)

df_new["GISS"] = df[
    [
        "GISS-E2-R_r1i1p127.1",
        "GISS-E2-R_r1i1p127",
        "GISS-E2-R_r1i1p121",
    ]
].mean(axis=1)
# display the processed data
df_new
Year bcc_csm1_1 CCSM4 FGOALS_gl FGOALS_s2 IPSL_CM5A_LR MPI_ESM_P CSIRO HadCM3 CESM GISS
0 850 -0.570693 -0.431830 NaN -0.620995 -0.475963 -0.170230 NaN -0.620517 0.049553 0.127429
1 851 -0.698903 -0.411177 NaN -0.753160 -0.742970 -0.303124 -0.398695 -0.553043 0.193858 0.138796
2 852 -0.575440 -0.404802 NaN -0.743508 -0.758939 -0.422623 -0.406343 -0.560791 0.185033 0.098170
3 853 -0.724757 -0.552719 NaN -0.869331 -0.746460 -0.335177 -0.353557 -0.438949 0.120470 -0.054552
4 854 -0.724328 -0.734938 NaN -0.826238 -0.684093 -0.650792 -0.416140 -0.812194 -0.081349 -0.407169
... ... ... ... ... ... ... ... ... ... ... ...
1161 2011 1.013544 NaN NaN NaN NaN NaN NaN NaN NaN NaN
1162 2012 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1163 2013 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1164 2014 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1165 2015 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

1166 rows × 11 columns

In our new dataframe, you can now see that the ensemble members for CESM and GISS are now replaced with one ensemble mean for each model simulation.

Now we can create a Pyleoclim Series object for each simulated GMST time series, which will allow us to easily plot the time series for each simulation and perform data analysis using various built-in tools.

# store each pyleoclim.Series() object into a dictionary
ts_dict = {}
for name in df_new.columns[1:]:
    ts_dict[name] = pyleo.Series(
        time=df_new["Year"].values,  # the time axis
        value=df_new[name].values,  # the value axis
        label=name,  # optional metadata: the nickname of the series
        time_name="Time",  # optional metadata: the name of the time axis
        time_unit="yrs",  # optional metadata: the unit of the time axis
        value_name="GMST anomaly",  # optional metadata: the name of the value axis
        value_unit="ºC",  # optional metadata: the unit of the value axis
    )
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
NaNs have been detected and dropped.
Time axis values sorted in ascending order
/tmp/ipykernel_112773/3348475101.py:4: UserWarning: auto_time_params is not specified. Currently default behavior sets this to True, which might modify your supplied time metadata.  Please set to False if you want a different behavior.
  ts_dict[name] = pyleo.Series(

We can now plot each simulation. For example, let’s plot the CCSM4 GMST anomaly:

ts_dict["CCSM4"].plot()
(<Figure size 1000x400 with 1 Axes>,
 <Axes: xlabel='Time [yrs]', ylabel='GMST anomaly [ºC]'>)
../../../_images/0d9c75bd1c3400fd3e0a1b82c3ee73e4d5724007f65eaafee8d8997cc7a66760.png

But what if we wanted to plot all of the PMIP3 time series at once? We can do that using the MultipleSeries object in Pyleoclim, which takes a list of Series objects as input. To do so, we have to convert the dictionary of Series into a list and then create a MultipleSeries object.

ts_list = [
    v for k, v in ts_dict.items()
]  # a pythonic way to convert the pyleo.Series items in the dictionary to a list
ms_pmip = pyleo.MultipleSeries(ts_list)

We can now plot all PMIP3 simulation time series at once:

ms_pmip.plot(
    lgd_kwargs={
        "bbox_to_anchor": (1.25, 1),  # move the legend to the right side
    }
)
(<Figure size 1000x400 with 1 Axes>,
 <Axes: xlabel='Time [yrs]', ylabel='GMST anomaly [ºC]'>)
../../../_images/0fdd6ed7feeae34a5d747a38b3c68c50d6a2a10086c3f7d432a58d8a23a620bd.png

Note that like the paleo proxy data we have looked at, these model simulations are also referred to as reconstructions as they are attempts to recreate past climates.

The reconstructed GMST anomalies from all of the PMIP3 simulations follow the same overall trend of relatively stable, long-term temperature from 800-1800 AD, followed by an increase in temperature over the past 200 years. What do you think is driving this recent warming trend?

Despite the long-term similarities, there are also noticeable differences between the GMST time series from each simulation.

Questions 2: Climate Connection#

  1. How are the GMST anomalies from each simulation different? What could be causing these differences?

  2. How do we know which simulation is the most accurate and reliable?

Click for solution

Section 3: Proxy-Model Comparisons#

Proxy-based reconstructions of climate variables in the past can provide direct measurements of temperature, precipitation, greenhouse gas concentration, etc. Comparing proxy paleoclimate records with paleoclimate model simulations can help to clarify the interpretation of the proxy record and also help to improve the ability of climate models to simulate past variations in climate.

Here, we’ll compare the proxy-based Lake Tanganyika surface temperature reconstruction we downloaded and plotted at the beginning of the tutorial, with the GMST anomaly PMIP3 simulations. But first, we need to standardize the Lake Tanganyika data since we’re comparing the data to the GMST anomaly.

# standardize the proxy data
proxy_stnd = proxy_t.standardize()
fig, ax = ms_pmip.plot(
    lgd_kwargs={
        "bbox_to_anchor": (1.25, 1),  # move the legend to the right side
    }
)

ax.set_ylabel("GMST anomaly (ºC)")
ax1 = ax.twinx()  # create a second axes that shares the same x-axis
ax1.set_ylabel("Tanganyika Surface Temperature Anomaly (ºC)")

proxy_stnd.plot(ax=ax1, color="black")
ax.set_xlim(xmin=850, xmax=2000)
ax.set_ylim(ymin=-4, ymax=2)
ax1.set_ylim(ymin=-4, ymax=2)
(-4.0, 2.0)
../../../_images/86e0f8429629df6202ec210e3f6c65690a56738ccbfe437bdfcd7c8b5203e589.png

Questions 3: Climate Connection#

How do the model simulated GMST and proxy-based surface temperature compare?

  1. Is there more variability in the proxy or model temperatures? What might be causing this?

  2. Are the long-term trends over the last millenium in the simulated GMST anomaly and proxy-based surface temperature record similar or different?

Click for solution

This is just one simplified example of a proxy-model comparison. Larger-scale proxy-model comparisons can help to identify spatial patterns of temperature change and assess forcing mechanisms of paleoclimate variations. In W2D1, you’ll spend more time exploring climate models.

Summary#

In this tutorial, you explored the role of climate models in paleoclimate research, particularly focusing on data from PMIP3. You learned to analyze and plot time series of paleoclimate model simulations, specifically GMST anomaly data from the past 1,000 years. This data reveals a trend of relatively stable global mean surface temperature anomalies from 800-1800 AD, followed by a noticeable rise in temperature over the last two centuries.

Furthermore, you have learned to compare this model data with proxy-based reconstructions. Using the example of temperature data from Lake Tanganyika in East Africa, we’ve assessed the differences and similarities between the model and proxy data. This process not only enhances our interpretation of the proxy record but also improves our understanding of how well climate models simulate past climate variations.

In the next and final tutorial for today, you’ll explore another tool for assessing past climate variability using both proxies and models.

Resources#

Code for this tutorial is based on existing notebooks from LinkedEarth that explore paleoclimate model-data comparisons using PMIP3 model simulations.

Data from the following sources are used in this tutorial:

  • Braconnot, P., Harrison, S., Kageyama, M. et al. Evaluation of climate models using palaeoclimatic data. Nature Clim Change 2, 417–424 (2012). https://doi.org/10.1038/nclimate1456

  • Tierney, J., Mayes, M., Meyer, N. et al. Late-twentieth-century warming in Lake Tanganyika unprecedented since AD 500. Nature Geosci 3, 422–425 (2010). https://doi.org/10.1038/ngeo865