
Tutorial 6: Implementing the Analysis#

Good Research Practices

Content creators: Marguerite Brown, Zane Mitrevica, Natalie Steinemann, Yuxin Zhou

Content reviewers: Katrina Dobson, Sloane Garelick, Maria Gonzalez, Paul Heubel, Nahid Hasan, Sherry Mi, Beatriz Cosenza Muralles, Cheng Zhang

Content editors: Jenna Pearson, Chi Zhang, Ohad Zivan

Production editors: Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan

Our 2024 Sponsors: CMIP, NFDI4Earth

Tutorial Objectives#

In Tutorials 5-8, you will learn about the research process. This includes how to

  1. Draft analyses of data to test a hypothesis

  2. Implement analysis of data

  3. Interpret results in the context of existing knowledge

  4. Communicate your results and conclusions

By the end of these tutorials you will be able to:

  • Understand the principles of good research practices

  • Learn to view a scientific data set or question through the lens of equity: Who is represented by this data and who is not? Who has access to this information? Who is in a position to use it?

Activity: Implement the Analysis#

In this tutorial, you will implement the linear regression analysis outlined in Step 5 on real-world CO2 and temperature records.

The CO2 and temperature records we will be analyzing are both examples of paleoclimate data (for more information, refer back to Step 3). The CO2 record (Bereiter et al., 2015) was generated by measuring the CO2 concentration in ancient air bubbles trapped in ice from multiple ice cores retrieved from Antarctica. The temperature record (Shakun et al., 2015) is based on chemical analyses of the shells of planktic foraminifera, which were identified and picked from deep-sea sediments; it combines sea-surface temperature records from a range of sites globally.

Why are we focusing on these two records specifically? The CO2 record from Antarctic ice cores is the gold standard for atmospheric CO2 variability on glacial-interglacial time scales, with a temporal resolution unmatched by any other reconstruction method. The temperature record draws on sediment cores from across the global ocean and is therefore likely representative of global sea surface temperature (SST) variability. All SST records were shifted to a mean of zero and combined as an unweighted global average, as sketched below. Polar air temperature records are also available from ice cores, but they may give an exaggerated view of global temperature change because of polar amplification.
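
To make that processing concrete, here is a minimal sketch (not the published workflow) of how several site-level SST records could be shifted to a mean of zero and combined as an unweighted average. The three "site" series and the age scale below are made-up illustrative numbers.

import numpy as np
import pandas as pd

# hypothetical common age scale (ka BP) and three made-up site records
ages = np.arange(0, 50, 10)
sites = pd.DataFrame(
    {
        "site_A": [18.0, 17.5, 16.0, 15.5, 17.0],
        "site_B": [25.0, 24.0, 22.5, 23.0, 24.5],
        "site_C": [10.0, 9.0, 8.5, 9.5, 10.0],
    },
    index=ages,
)

# shift each record to a mean of zero, removing site-to-site offsets
anomalies = sites - sites.mean()

# an unweighted average across sites gives a single stacked anomaly record
sst_stack = anomalies.mean(axis=1)
print(sst_stack)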

If you would like to learn more, the data sources are listed at the bottom of the page.

# imports

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
from scipy import interpolate
from scipy import stats
import os
import pooch
import tempfile

Figure Settings#

# @title Figure Settings
import ipywidgets as widgets  # interactive display

%config InlineBackend.figure_format = 'retina'
plt.style.use(
    "https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle"
)

Helper functions#

# @title Helper functions


def pooch_load(filelocation=None, filename=None, processor=None):
    """Return a local path to the requested file: use the shared course copy
    if it exists, otherwise download it with pooch and cache it locally."""
    shared_location = "/home/jovyan/shared/Data/tutorials/W2D1_FutureClimate-IPCCIPhysicalBasis"  # this is different for each day
    user_temp_cache = tempfile.gettempdir()

    if os.path.exists(os.path.join(shared_location, filename)):
        file = os.path.join(shared_location, filename)
    else:
        file = pooch.retrieve(
            filelocation,
            known_hash=None,
            fname=os.path.join(user_temp_cache, filename),
            processor=processor,
        )

    return file
# time series
# read SST data "Shakun2015_SST.txt"
filename_Shakun2015_SST = "Shakun2015_SST.txt"
url_Shakun2015_SST = "https://osf.io/kmy5w/download"
SST = pd.read_table(pooch_load(url_Shakun2015_SST, filename_Shakun2015_SST))
SST.set_index("Age", inplace=True)
SST
# read CO2 data "antarctica2015co2composite_cleaned.txt"
filename_antarctica2015co2composite_cleaned = "antarctica2015co2composite_cleaned.txt"
url_antarctica2015co2composite_cleaned = "https://osf.io/45fev/download"
CO2 = pd.read_table(
    pooch_load(
        url_antarctica2015co2composite_cleaned,
        filename_antarctica2015co2composite_cleaned,
    )
)
CO2.set_index("age_gas_calBP", inplace=True)
CO2
# plot
# set up two subplots in a grid of 2 rows and 1 column
# also make sure the two plots share the same x(time) axis
fig, axes = plt.subplots(2, 1, sharex=True)
# move the two subplots closer to each other
fig.subplots_adjust(hspace=-0.5)
axes[0].plot(SST.index, SST["SST stack"], color="C4")
axes[1].plot(CO2.index / 1000, CO2["co2_ppm"], color="C1")

# beautification
# since sharex=True in plt.subplots(), this sets the x axis limit for both panels
axes[1].set_xlim((0, 805))
# axis labels
axes[1].set_xlabel("Age (ka BP)")
axes[0].set_ylabel(r"Sea Surface Temperature" "\n" "Anomaly (°C)", color="C4")
axes[1].set_ylabel(r"CO${}_\mathrm{2}$ (ppm)", color="C1")

# despine makes the plots look cleaner
sns.despine(ax=axes[0], top=True, right=False, bottom=True, left=True)
sns.despine(ax=axes[1], top=True, right=True, bottom=False, left=False)
# clean up top panel x axis ticks
axes[0].xaxis.set_ticks_position("none")
# move top panel xlabel to the right side
axes[0].yaxis.set_label_position("right")
# the following code ensures the subplots don't overlap
for ax in axes:
    ax.set_zorder(10)
    ax.set_facecolor("none")
# color the axis
axes[0].spines["right"].set_color("C4")
axes[1].spines["left"].set_color("C1")
axes[0].tick_params(axis="y", colors="C4")
axes[1].tick_params(axis="y", colors="C1")

Now that we’ve taken a look at the two time series, let’s make a scatter plot between them and fit a linear regression model through the data.

# in this code block, we will make a scatter plot of CO2 and temperature
# and fit a linear regression model through the data


def age_model_interp(CO2_age, CO2, SST_age):
    """
    This helper function linearly interpolates CO2 data, which
    have a very high temporal resolution, to temperature data,
    which have a relatively low resolution
    """
    f = interpolate.interp1d(CO2_age, CO2)
    all_ages = f(SST_age)
    return all_ages


# interpolate CO2 data onto the SST ages
# (CO2 ages are in years BP, so divide by 1000 to get ka BP, matching the SST index)
CO2_interpolated = age_model_interp(CO2.index / 1000, CO2["co2_ppm"], SST.index)

# plot
# set up a single panel for the scatter plot and the regression line
fig, ax = plt.subplots(1, 1)

ax.scatter(CO2_interpolated, SST["SST stack"], color="gray")

# regression
X = CO2_interpolated
y = SST["SST stack"]
res = stats.linregress(X, y)  # ordinary least squares fit

# evaluate the fitted line y = slope * x + intercept over the observed CO2 range
x_fit = np.arange(180, 280)
y_fit = x_fit * res.slope + res.intercept
ax.plot(x_fit, y_fit, color="k")

# beautification
# axis labels
ax.set_xlabel(r"CO${}_\mathrm{2}$ (ppm)")
ax.set_ylabel(r"Sea Surface Temperature" "\n" "Anomaly (°C)")
print(
    "Pearson (r^2) value: "
    + "{:.2f}".format(res.rvalue**2)
    + " \nwith a p-value of: "
    + "{:.2e}".format(res.pvalue)
)
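
As a quick sanity check on the fit above, the same slope and intercept can be recovered with np.polyfit. This is a sketch, assuming CO2_interpolated, SST, and res are already defined as in the previous cell.

# np.polyfit with deg=1 performs the same ordinary least squares line fit
slope_check, intercept_check = np.polyfit(CO2_interpolated, SST["SST stack"], deg=1)
print(f"np.polyfit:  slope = {slope_check:.4f}, intercept = {intercept_check:.4f}")
print(f"linregress:  slope = {res.slope:.4f}, intercept = {res.intercept:.4f}")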

Figure Making Through the Equity Lens#

Are the colors in your figure distinguishable for people with color-vision deficiencies?

Further reading on this topic:

Color-blindness simulator: https://www.color-blindness.com/coblis-color-blindness-simulator/

Coloring for color blindness: https://davidmathlogic.com/colorblind

Python-specific color palettes that are friendly to those with color-vision deficiency: https://seaborn.pydata.org/tutorial/color_palettes.html
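
If you want to experiment with this yourself, one option is seaborn's built-in "colorblind" palette. The snippet below is only a minimal sketch of previewing the palette and picking a color from it; the colors in this tutorial's figures are set by the course style sheet, and the plotted line here is made-up example data.

import matplotlib.pyplot as plt
import seaborn as sns

# preview the colorblind-friendly palette that ships with seaborn
palette = sns.color_palette("colorblind")
sns.palplot(palette)

# individual palette colors can then be passed to matplotlib
fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], color=palette[0], label="first palette color")
ax.legend()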

Resources#

Data from the following sources are used in this tutorial:

CO2: Bereiter, B., Eggleston, S., Schmitt, J., Nehrbass-Ahles, C., Stocker, T.F., Fischer, H., Kipfstuhl, S., Chappellaz, J., 2015. Revision of the EPICA Dome C CO2 record from 800 to 600 kyr before present. Geophysical Research Letters 42, 542–549. https://doi.org/10.1002/2014GL061957

Temperature: Shakun, J.D., Lea, D.W., Lisiecki, L.E., Raymo, M.E., 2015. An 800-kyr record of global surface ocean δ18O and implications for ice volume-temperature coupling. Earth and Planetary Science Letters 426, 58–68. https://doi.org/10.1016/j.epsl.2015.05.042 (not Open Access)