Tutorial 5: Calculating Anomalies Using Precipitation Data#
Week 1, Day 3, Remote Sensing
Content creators: Douglas Rao
Content reviewers: Katrina Dobson, Younkap Nina Duplex, Maria Gonzalez, Will Gregory, Nahid Hasan, Paul Heubel, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan
Content editors: Paul Heubel, Jenna Pearson, Chi Zhang, Ohad Zivan
Production editors: Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan
Our 2024 Sponsors: CMIP, NFDI4Earth
Tutorial Objectives#
Estimated timing of tutorial: 25 minutes
In this tutorial, you will learn how to calculate climate anomalies using satellite climate data records.
By the end of this tutorial you will be able to:
Calculate an anomaly relative to a climatology.
Calculate the rolling mean of the anomaly data to smooth the time series and extract long-term signals/patterns.
Setup#
# installations (uncomment and run this cell ONLY when using Google Colab or Kaggle)
# !pip install s3fs --quiet
# # properly install cartopy in colab to avoid session crash
# !apt-get install libproj-dev proj-data proj-bin --quiet
# !apt-get install libgeos-dev --quiet
# !pip install cython --quiet
# !pip install cartopy --quiet
# !pip install geoviews
# !apt-get -qq install python-cartopy python3-cartopy --quiet
# !pip uninstall -y shapely --quiet
# !pip install shapely --no-binary shapely --quiet
# !pip install boto3 --quiet
# you may need to restart the runtime after running this cell and that is ok
# imports
import s3fs
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import boto3
import botocore
import os
import pooch
import tempfile
import holoviews
from geoviews import Dataset as gvDataset
import geoviews.feature as gf
from geoviews import Image as gvImage
Install and import feedback gadget#
Show code cell source
# @title Install and import feedback gadget
!pip3 install vibecheck datatops --quiet
from vibecheck import DatatopsContentReviewContainer
def content_review(notebook_section: str):
return DatatopsContentReviewContainer(
"", # No text prompt
notebook_section,
{
"url": "https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab",
"name": "comptools_4clim",
"user_key": "l5jpxuee",
},
).render()
feedback_prefix = "W1D3_T5"
Figure Settings#
Show code cell source
# @title Figure Settings
import ipywidgets as widgets # interactive display
%config InlineBackend.figure_format = 'retina'
plt.style.use(
"https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle"
)
Helper functions#
Show code cell source
# @title Helper functions
def pooch_load(filelocation=None, filename=None, processor=None):
shared_location = "/home/jovyan/shared/data/tutorials/W1D3_RemoteSensing" # this is different for each day
user_temp_cache = tempfile.gettempdir()
if os.path.exists(os.path.join(shared_location, filename)):
file = os.path.join(shared_location, filename)
else:
file = pooch.retrieve(
filelocation,
known_hash=None,
fname=os.path.join(user_temp_cache, filename),
processor=processor,
)
return file
Video 1: Calculating Anomaly#
Submit your feedback#
Show code cell source
# @title Submit your feedback
content_review(f"{feedback_prefix}_Calculating_Anomaly_Video")
If you want to download the slides: https://osf.io/download/kvdgr/
Submit your feedback#
Show code cell source
# @title Submit your feedback
content_review(f"{feedback_prefix}_Calculating_Anomaly_Slides")
Section 1: From Climatology to Anomaly#
Building upon your knowledge of climatology from the last tutorial, Tutorial 4, you will now calculate anomalies from this climatology. An anomaly, in the context of climate studies, represents a departure from standard climatological conditions. For example, if the normal January temperature of the city you live in is 10°C and this year's January temperature is 15°C, we say that this year's January temperature anomaly is 5°C above normal (i.e., above the climatology). The anomaly is an essential tool for detecting changes in climate patterns and is frequently used in critical climate reports such as those produced by the Intergovernmental Panel on Climate Change (IPCC).
To calculate an anomaly, we first establish a reference period, usually a 30-year window, to define our climatology. In this process, it is crucial to use high-quality data and aggregate it to the desired spatial resolution and temporal frequency, such as weekly or monthly. The anomaly is then determined by subtracting this long-term average from a given observation, thus creating a useful metric for further climate analysis such as trend analysis.
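The two steps above (build a per-month climatology over a reference window, then subtract it from each observation) can be sketched with plain NumPy on synthetic monthly values. This is an illustrative toy example, not the GPCP workflow; all names and numbers here are made up:

```python
import numpy as np

# synthetic monthly "precipitation" series: 10 years with a seasonal cycle plus noise
rng = np.random.default_rng(42)
n_years = 10
months = np.tile(np.arange(12), n_years)                  # calendar-month index 0..11
seasonal = 3.0 + 2.0 * np.sin(2 * np.pi * months / 12)    # seasonal cycle (mm/day)
precip = seasonal + rng.normal(0, 0.3, size=months.size)  # add noise

# step 1: climatology = mean of each calendar month over a reference period
# (the first 5 years stand in for a 30-year window)
ref = precip[: 5 * 12].reshape(5, 12)
climatology = ref.mean(axis=0)                            # shape (12,)

# step 2: anomaly = observation minus the climatology of its calendar month
anomaly = precip - climatology[months]

# by construction, the mean anomaly of each month over the reference period is zero
print(np.allclose(anomaly[: 5 * 12].reshape(5, 12).mean(axis=0), 0.0))  # prints True
```

In the tutorial itself, xarray's `.groupby("time.month")` performs the split-by-calendar-month bookkeeping that the `reshape` and fancy indexing do here.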
In this tutorial, we will employ the GPCP monthly precipitation Climate Data Record (CDR) to compute a monthly anomaly time series. Furthermore, we will learn to calculate the rolling mean of the generated precipitation anomaly time series. This knowledge will be invaluable for our upcoming tutorial on climate variability.
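As a preview of the smoothing step, a rolling (running) mean can be sketched with plain NumPy; on an xarray time series the equivalent is `.rolling(time=window, center=True).mean()`. The series below is synthetic and purely illustrative:

```python
import numpy as np

# synthetic "anomaly" series: a slow oscillation buried in noise
rng = np.random.default_rng(0)
anomaly = np.sin(np.linspace(0, 6 * np.pi, 120)) + rng.normal(0, 0.5, 120)

# 12-point (i.e., 12-month) rolling mean via convolution with a flat kernel
window = 12
kernel = np.ones(window) / window
smoothed = np.convolve(anomaly, kernel, mode="valid")  # length 120 - 12 + 1 = 109

# smoothing suppresses the noise, so the smoothed series has lower variance
print(len(smoothed), smoothed.std() < anomaly.std())
```

The rolling mean damps month-to-month noise so that longer-term signals stand out, which is exactly why we will apply it to the precipitation anomaly time series.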
Section 1.1: Calculating the Monthly Anomaly#
To calculate the anomaly, you first need to calculate the monthly climatology. Since you already learned how to do this during the last tutorial, we will fast-forward this step.
# connect to the AWS S3 bucket for the GPCP Monthly Precipitation CDR data
fs = s3fs.S3FileSystem(anon=True)
# get the list of all data files in the AWS S3 bucket
file_pattern = "noaa-cdr-precip-gpcp-monthly-pds/data/*/gpcp_v02r03_monthly_*.nc"
file_location = fs.glob(file_pattern)
# open connection to all data files
client = boto3.client(
"s3", config=botocore.client.Config(signature_version=botocore.UNSIGNED)
) # initialize aws s3 bucket client
file_ob = [
pooch_load(filelocation="http://s3.amazonaws.com/" + file, filename=file)
for file in file_location
]
# open all the monthly data files and concatenate them along the time dimension
# this process will take ~ 1 minute to complete due to the number of data files
ds = xr.open_mfdataset(file_ob, combine="nested", concat_dim="time")
# comment for Colab users only: this cell may raise an error message for you.
# you should still be able to use the dataset despite the error, just not print ds
# you can try uncommenting the following line to avoid the error
# ds.attrs['history'] = ''  # the history attribute contains unusual characters that can crash Google Colab
ds
<xarray.Dataset> Size: 46MB Dimensions: (latitude: 72, longitude: 144, time: 540, nv: 2) Coordinates: * latitude (latitude) float32 288B -88.75 -86.25 -83.75 ... 86.25 88.75 * longitude (longitude) float32 576B 1.25 3.75 6.25 ... 353.8 356.2 358.8 * time (time) datetime64[ns] 4kB 1979-01-01 1979-02-01 ... 2023-12-01 Dimensions without coordinates: nv Data variables: lat_bounds (time, latitude, nv) float32 311kB dask.array<chunksize=(1, 72, 2), meta=np.ndarray> lon_bounds (time, longitude, nv) float32 622kB dask.array<chunksize=(1, 144, 2), meta=np.ndarray> time_bounds (time, nv) datetime64[ns] 9kB dask.array<chunksize=(1, 2), meta=np.ndarray> precip (time, latitude, longitude) float32 22MB dask.array<chunksize=(1, 72, 144), meta=np.ndarray> precip_error (time, latitude, longitude) float32 22MB dask.array<chunksize=(1, 72, 144), meta=np.ndarray> Attributes: (12/45) Conventions: CF-1.6, ACDD 1.3 title: Global Precipitation Climatatology Project (G... source: oc.197901.sg references: Huffman et al. 1997, http://dx.doi.org/10.117... history: 1) �R`�, Dr. Jian-Jian Wang, U of Maryland,... Metadata_Conventions: CF-1.6, Unidata Dataset Discovery v1.0, NOAA ... ... ... metadata_link: gov.noaa.ncdc:C00979 product_version: v23rB1 platform: NOAA POES (Polar Orbiting Environmental Satel... sensor: AVHRR>Advanced Very High Resolution Radiometer spatial_resolution: 2.5 degree comment: Processing computer: eagle2.umd.edu
# calculate climatology using `.sel()` and `.groupby()` directly.
precip_clim = (
ds.precip.sel(time=slice("1981-01-01", "2010-12-01"))
.groupby("time.month")
.mean(dim="time")
)
precip_clim
<xarray.DataArray 'precip' (month: 12, latitude: 72, longitude: 144)> Size: 498kB dask.array<transpose, shape=(12, 72, 144), dtype=float32, chunksize=(1, 72, 144), chunktype=numpy.ndarray> Coordinates: * latitude (latitude) float32 288B -88.75 -86.25 -83.75 ... 86.25 88.75 * longitude (longitude) float32 576B 1.25 3.75 6.25 ... 353.8 356.2 358.8 * month (month) int64 96B 1 2 3 4 5 6 7 8 9 10 11 12 Attributes: long_name: NOAA Climate Data Record (CDR) of GPCP Monthly Satellite-... standard_name: precipitation amount units: mm/day valid_range: [ 0. 100.] cell_methods: area: mean time: mean
Now we have the monthly precipitation climatology. How can we calculate the monthly anomaly?
As we learned before, let's use `.groupby()` from xarray. We can split the entire time period by calendar month, subtract the climatology of that specific month from the monthly values, and then recombine the values.
# use `.groupby()` to calculate the monthly anomaly
precip_anom = ds.precip.groupby("time.month") - precip_clim
precip_anom
<xarray.DataArray 'precip' (time: 540, latitude: 72, longitude: 144)> Size: 22MB dask.array<sub, shape=(540, 72, 144), dtype=float32, chunksize=(1, 72, 144), chunktype=numpy.ndarray> Coordinates: * latitude (latitude) float32 288B -88.75 -86.25 -83.75 ... 86.25 88.75 * longitude (longitude) float32 576B 1.25 3.75 6.25 ... 353.8 356.2 358.8 * time (time) datetime64[ns] 4kB 1979-01-01 1979-02-01 ... 2023-12-01 month (time) int64 4kB 1 2 3 4 5 6 7 8 9 10 ... 3 4 5 6 7 8 9 10 11 12
You may have noticed that there is an additional coordinate in the anomaly dataset. The additional coordinate, `month`, is a direct outcome of the `.groupby()` operation we just performed.
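A tiny self-contained example (synthetic values, not the GPCP data) reproduces this behavior; if you do not need the extra coordinate, `.drop_vars()` removes it:

```python
import numpy as np
import pandas as pd
import xarray as xr

# 24 months of made-up data, just to show the coordinate bookkeeping
time = pd.date_range("2000-01-01", periods=24, freq="MS")
da = xr.DataArray(np.arange(24.0), coords={"time": time}, dims="time")

clim = da.groupby("time.month").mean()        # per-month climatology, dim "month"
anom = da.groupby("time.month") - clim        # anomaly, back on the "time" dim

print("month" in anom.coords)                 # True: the group label travels with the result
anom_clean = anom.drop_vars("month")
print("month" in anom_clean.coords)           # False
```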
If you want to save the data for future use, you can write it out to a netCDF file using `.to_netcdf()`. It will automatically carry all the coordinates, dimensions, and relevant information into the netCDF file.
# an example of how to export the GPCP monthly anomaly data relative to the climatology period of 1981-2010
# precip_anom.to_netcdf('t5_gpcp-monthly-anomaly_1981-2010.nc')
Section 1.2: Examining the Anomaly#
First, let’s take a look at the geospatial pattern of the monthly anomaly of a selected month – January of 1979.
# initiate plot
fig = plt.figure()
# set map projection
ax = plt.axes(projection=ccrs.Robinson())
# add coastal lines to indicate land/ocean
ax.coastlines()
# add grid lines for latitude and longitude
ax.gridlines()
# draw the precipitation data
precip_anom.sel(time="1979-01-01").plot(
ax=ax,
transform=ccrs.PlateCarree(),
vmin=-8,
vmax=8,
cmap="BrBG",
cbar_kwargs=dict(shrink=0.5, label="Monthly anomaly \n(mm/day)"),
)
<cartopy.mpl.geocollection.GeoQuadMesh at 0x7f34c40c5280>
From the map of this monthly anomaly, we can see the spatial pattern of how precipitation in January 1979 departed from the 30-year normal. Parts of the Amazon saw a notable increase in precipitation during this month, as did the northeast coast of the United States.
Interactive Demo 1.2#
In the interactive demo below (make sure to run the code), you will be able to scroll through the anomaly for a few months in 1979 to see how it changes from month to month during this year.
holoviews.extension("bokeh")
dataset_plot = gvDataset(
precip_anom.isel(time=slice(0, 10))
) # only the first 10, as it is a time consuming task
images = dataset_plot.to(gvImage, ["longitude", "latitude"], ["precip"], "time")
images.opts(
cmap="BrBG",
colorbar=True,
width=600,
height=400,
projection=ccrs.Robinson(),
clim=(-8, 8),
clabel="Monthly anomaly \n(mm/day)",
) * gf.coastline