{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9735141d",
   "metadata": {
    "execution": {}
   },
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuromatch/climate-course-content/blob/main/tutorials/W1D1_ClimateSystemOverview/student/W1D1_Tutorial7.ipynb)   <a href=\"https://kaggle.com/kernels/welcome?src=https://raw.githubusercontent.com/neuromatch/climate-course-content/main/tutorials/W1D1_ClimateSystemOverview/student/W1D1_Tutorial7.ipynb\" target=\"_blank\"><img alt=\"Open in Kaggle\" src=\"https://kaggle.com/static/images/open-in-kaggle.svg\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5yJKHFT3Dnmu",
   "metadata": {
    "execution": {}
   },
   "source": [
    "# Tutorial 7: Other Computational Tools in Xarray\n",
    "\n",
    "**Week 1, Day 1, Climate System Overview**\n",
    "\n",
    "**Content creators:** Sloane Garelick, Julia Kent\n",
    "\n",
    "**Content reviewers:** Katrina Dobson, Younkap Nina Duplex, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Paul Heubel, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan\n",
    "\n",
    "**Content editors:** Paul Heubel, Jenna Pearson, Chi Zhang, Ohad Zivan\n",
    "\n",
    "**Production editors:** Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan\n",
    "\n",
    "**Our 2024 Sponsors:** CMIP, NFDI4Earth"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97bbc2d6-8eae-4e11-94e5-f7320fecd163",
   "metadata": {
    "execution": {}
   },
   "source": [
    "## ![project pythia](https://projectpythia.org/_static/images/logos/pythia_logo-blue-rtext.svg)\n",
    "\n",
    "Pythia credit: Rose, B. E. J., Kent, J., Tyle, K., Clyne, J., Banihirwe, A., Camron, D., May, R., Grover, M., Ford, R. R., Paul, K., Morley, J., Eroglu, O., Kailyn, L., & Zacharias, A. (2023). Pythia Foundations (Version v2023.05.01) https://zenodo.org/record/8065851\n",
    "\n",
    "## ![CMIP.png](https://github.com/ClimateMatchAcademy/course-content/blob/main/tutorials/Art/CMIP.png?raw=true)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "y8yIi1WBkwHB",
   "metadata": {
    "execution": {}
   },
   "source": [
    "# Tutorial Objectives\n",
    "\n",
    "*Estimated timing of tutorial:* 15 minutes \n",
    "\n",
    "Thus far, we've learned about various climate processes in the videos, and we've explored tools in Xarray that are useful for analyzing and interpreting climate data in the tutorials. \n",
    "\n",
    "In this tutorial, you'll continue using the SST data from CESM2 and practice using some additional computational tools in Xarray to resample your data, which can help with data comparison and analysis. The functions you will use are:\n",
    "\n",
    "- `.resample()`: Groupby-like functionality specifically for time dimensions. Can be used for temporal upsampling and downsampling. Additional information about resampling in Xarray can be found [here](https://xarray.pydata.org/en/stable/user-guide/time-series.html#resampling-and-grouped-operations).\n",
    "- `.rolling()`: Useful for computing aggregations on moving windows of your dataset e.g. computing moving averages. Additional information about resampling in Xarray can be found [here](https://xarray.pydata.org/en/stable/user-guide/computation.html#rolling-window-operations).\n",
    "- `.coarsen()`: Generic functionality for downsampling data. Additional information about resampling in Xarray can be found [here](https://xarray.pydata.org/en/stable/user-guide/computation.html#coarsen-large-arrays)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0af7bee1-3de3-453a-8ae8-bcd7910b4266",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "source": [
    "# Setup\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9fa2315d-13a8-4499-b962-4434934b39ed",
   "metadata": {
    "execution": {}
   },
   "outputs": [],
   "source": [
    "# installations ( uncomment and run this cell ONLY when using google colab or kaggle )\n",
    "#!pip install pythia_datasets cftime nc-time-axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06073287-7bdb-45b5-9cec-8cdf123adb49",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "# imports\n",
    "import matplotlib.pyplot as plt\n",
    "import xarray as xr\n",
    "from pythia_datasets import DATASETS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Install and import feedback gadget\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21b5614e-3a3f-46d1-a926-283e9458da74",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Install and import feedback gadget\n",
    "\n",
    "!pip3 install vibecheck datatops --quiet\n",
    "\n",
    "from vibecheck import DatatopsContentReviewContainer\n",
    "def content_review(notebook_section: str):\n",
    "    return DatatopsContentReviewContainer(\n",
    "        \"\",  # No text prompt\n",
    "        notebook_section,\n",
    "        {\n",
    "            \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n",
    "            \"name\": \"comptools_4clim\",\n",
    "            \"user_key\": \"l5jpxuee\",\n",
    "        },\n",
    "    ).render()\n",
    "\n",
    "\n",
    "feedback_prefix = \"W1D1_T7\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Figure Settings\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c385e97-285f-45d4-88e2-26ee76b8d039",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Figure Settings\n",
    "import ipywidgets as widgets  # interactive display\n",
    "\n",
    "%config InlineBackend.figure_format = 'retina'\n",
    "plt.style.use(\n",
    "    \"https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Video 1: Carbon Cycle and the Greenhouse Effect\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2fd1ee75-9e23-4204-a896-9a08da3e794f",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Video 1: Carbon Cycle and the Greenhouse Effect\n",
    "\n",
    "from ipywidgets import widgets\n",
    "from IPython.display import YouTubeVideo\n",
    "from IPython.display import IFrame\n",
    "from IPython.display import display\n",
    "\n",
    "\n",
    "class PlayVideo(IFrame):\n",
    "  def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n",
    "    self.id = id\n",
    "    if source == 'Bilibili':\n",
    "      src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n",
    "    elif source == 'Osf':\n",
    "      src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n",
    "    super(PlayVideo, self).__init__(src, width, height, **kwargs)\n",
    "\n",
    "\n",
    "def display_videos(video_ids, W=400, H=300, fs=1):\n",
    "  tab_contents = []\n",
    "  for i, video_id in enumerate(video_ids):\n",
    "    out = widgets.Output()\n",
    "    with out:\n",
    "      if video_ids[i][0] == 'Youtube':\n",
    "        video = YouTubeVideo(id=video_ids[i][1], width=W,\n",
    "                             height=H, fs=fs, rel=0)\n",
    "        print(f'Video available at https://youtube.com/watch?v={video.id}')\n",
    "      else:\n",
    "        video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n",
    "                          height=H, fs=fs, autoplay=False)\n",
    "        if video_ids[i][0] == 'Bilibili':\n",
    "          print(f'Video available at https://www.bilibili.com/video/{video.id}')\n",
    "        elif video_ids[i][0] == 'Osf':\n",
    "          print(f'Video available at https://osf.io/{video.id}')\n",
    "      display(video)\n",
    "    tab_contents.append(out)\n",
    "  return tab_contents\n",
    "\n",
    "\n",
    "video_ids = [('Youtube', 'y55CNZbTqw8'), ('Bilibili', 'BV1uV411T7RK')]\n",
    "tab_contents = display_videos(video_ids, W=730, H=410)\n",
    "tabs = widgets.Tab()\n",
    "tabs.children = tab_contents\n",
    "for i in range(len(tab_contents)):\n",
    "  tabs.set_title(i, video_ids[i][0])\n",
    "display(tabs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Submit your feedback\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "43bfb42a-8ce6-44b6-9a88-6bca2000bd7e",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Submit your feedback\n",
    "content_review(f\"{feedback_prefix}_Carbon_Cycle_Video\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c9027490-8086-41a3-9489-0f164b6375a2",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @markdown\n",
    "from ipywidgets import widgets\n",
    "from IPython.display import IFrame\n",
    "\n",
    "link_id = \"sb3n5\"\n",
    "\n",
    "print(f\"If you want to download the slides: https://osf.io/download/{link_id}/\")\n",
    "IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render\", width=854, height=480)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Submit your feedback\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3918492b-f1c5-4b8e-8479-748a1652f724",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Submit your feedback\n",
    "content_review(f\"{feedback_prefix}_Carbon_Cycle_Slides\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3045c67e-21cd-4ef9-a49f-e12ae7db23cf",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "source": [
    "# Section 1: High-level Computation Functionality\n",
    "\n",
    "In this tutorial, you will learn about several methods for dealing with the resolution of data. Here are some links for quick reference, and we will go into detail in each of them in the sections below.\n",
    "\n",
    "- `.resample()`: [Groupby-like functionality especially for time dimensions. Can be used for temporal upsampling and downsampling](https://xarray.pydata.org/en/stable/user-guide/time-series.html#resampling-and-grouped-operations)\n",
    "- `.rolling()`: [Useful for computing aggregations on moving windows of your dataset e.g. computing moving averages](https://xarray.pydata.org/en/stable/user-guide/computation.html#rolling-window-operations)\n",
    "- `.coarsen()`: [Generic functionality for downsampling data](https://xarray.pydata.org/en/stable/user-guide/computation.html#coarsen-large-arrays)\n",
    "\n",
    "First, let's load the same data that we used in the previous tutorials (monthly SST data from CESM2):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7837f8bd-da89-4718-ab02-d5107576d2d6",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "filepath = DATASETS.fetch(\"CESM2_sst_data.nc\")\n",
    "ds = xr.open_dataset(filepath)\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eaf4dc7d-dfac-419e-a875-fc0c70fcd08c",
   "metadata": {
    "execution": {}
   },
   "source": [
    "## Section 1.1: Resampling Data\n",
    "\n",
    "For upsampling or downsampling temporal resolutions, we can use the `.resample()` method in Xarray.  For example, you can use this function to downsample a dataset from hourly to 6-hourly resolution.\n",
    "\n",
    "Our original SST data is monthly resolution. Let's use `.resample()` to downsample to annual frequency:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9cfb76b-c4ab-441e-a474-c66b7af944ad",
   "metadata": {
    "execution": {},
    "executionInfo": {
     "elapsed": 499,
     "status": "ok",
     "timestamp": 1681569618680,
     "user": {
      "displayName": "Sloane Garelick",
      "userId": "04706287370408131987"
     },
     "user_tz": 240
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# resample from a monthly to an annual frequency\n",
    "tos_yearly = ds.tos.resample(time=\"AS\")\n",
    "tos_yearly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6927f5be-d313-4d03-bab8-d22b3cb13899",
   "metadata": {
    "execution": {},
    "executionInfo": {
     "elapsed": 312,
     "status": "ok",
     "timestamp": 1681569621935,
     "user": {
      "displayName": "Sloane Garelick",
      "userId": "04706287370408131987"
     },
     "user_tz": 240
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# calculate the global mean of the resampled data\n",
    "annual_mean = tos_yearly.mean()\n",
    "annual_mean_global = annual_mean.mean(dim=[\"lat\", \"lon\"])\n",
    "annual_mean_global.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a373cb9e",
   "metadata": {
    "execution": {}
   },
   "source": [
    "\n",
    "## Section 1.2: Moving Average\n",
    "\n",
    "The `.rolling()` method allows for a rolling window aggregation and is applied along one dimension using the name of the dimension as a key (e.g. `time`) and the window size as the value (e.g. `6`). We will use these values in the demonstration below.\n",
    "\n",
    "Let's use the `.rolling()` function to compute a 6-month moving average of our SST data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "342acbf1-4eee-4d0d-bb52-b394ffcd556d",
   "metadata": {
    "execution": {},
    "executionInfo": {
     "elapsed": 791,
     "status": "ok",
     "timestamp": 1681569626067,
     "user": {
      "displayName": "Sloane Garelick",
      "userId": "04706287370408131987"
     },
     "user_tz": 240
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# calculate the running mean\n",
    "tos_m_avg = ds.tos.rolling(time=6, center=True).mean()\n",
    "tos_m_avg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0eb0cc4e-661a-4ab1-96ad-e096917ef104",
   "metadata": {
    "execution": {},
    "executionInfo": {
     "elapsed": 1333,
     "status": "ok",
     "timestamp": 1681569630104,
     "user": {
      "displayName": "Sloane Garelick",
      "userId": "04706287370408131987"
     },
     "user_tz": 240
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# calculate the global average of the running mean\n",
    "tos_m_avg_global = tos_m_avg.mean(dim=[\"lat\", \"lon\"])\n",
    "tos_m_avg_global.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c719e22",
   "metadata": {
    "execution": {}
   },
   "source": [
    "\n",
    "## Section 1.3: Coarsening the Data\n",
    "\n",
    "The `.coarsen()` function allows for block aggregation along multiple dimensions. \n",
    "\n",
    "Let's use the `.coarsen()` function to take a block mean for every 4 months and globally (i.e., 180 points along the latitude dimension and 360 points along the longitude dimension). Although we know the dimensions of our data quite well, we will include code that finds the length of the latitude and longitude variables so that it could work for other datasets that have a different format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "118df2c7",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "# coarsen the data\n",
    "coarse_data = ds.coarsen(time=4, lat=len(ds.lat), lon=len(ds.lon)).mean()\n",
    "coarse_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f0b9ef33",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "coarse_data.tos.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f93092b",
   "metadata": {
    "execution": {}
   },
   "source": [
    "## Section 1.4: Compare the Resampling Methods\n",
    "\n",
    "Now that we've tried multiple resampling methods on different temporal resolutions, we can compare the resampled datasets to the original."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1517017a",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "original_global = ds.mean(dim=[\"lat\", \"lon\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "daf709a8",
   "metadata": {
    "execution": {},
    "tags": []
   },
   "outputs": [],
   "source": [
    "original_global.tos.plot(size=6)\n",
    "coarse_data.tos.plot()\n",
    "tos_m_avg_global.plot()\n",
    "annual_mean_global.plot()\n",
    "\n",
    "\n",
    "plt.legend(\n",
    "    [\n",
    "        \"original data (monthly)\",\n",
    "        \"coarsened (4 months)\",\n",
    "        \"moving average (6 months)\",\n",
    "        \"annually resampled (12 months)\",\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1d99f56-adcc-410f-811d-a2a71289513d",
   "metadata": {
    "execution": {}
   },
   "source": [
    "### Questions 1.4: Climate Connection\n",
    "\n",
    "1. What type of information can you obtain from each time series?\n",
    "2. In what scenarios would you use different temporal resolutions?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34f4c5ba-8a12-4472-b482-4ca4f2948a2c",
   "metadata": {
    "colab_type": "text",
    "execution": {},
    "tags": []
   },
   "source": [
    "[*Click for solution*](https://github.com/neuromatch/climate-course-content/tree/main/tutorials/W1D1_ClimateSystemOverview/solutions/W1D1_Tutorial7_Solution_9e05e276.py)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "####  Submit your feedback\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b026c173-582c-48e1-8504-ef1c13acee9e",
   "metadata": {
    "cellView": "form",
    "execution": {},
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# @title Submit your feedback\n",
    "content_review(f\"{feedback_prefix}_Questions_1_4\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c18bb57b-63f2-4f9d-ac8d-32d0231d02fa",
   "metadata": {
    "execution": {}
   },
   "source": [
    "# Summary\n",
    "In this tutorial, we've explored Xarray tools to simplify and understand climate data better. Given the complexity and variability of climate data, tools like `.resample()`, `.rolling()`, and `.coarsen()` come in handy to make the data easier to compare and find long-term trends. You've also looked at valuable techniques like calculating moving averages. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b438a488-b3c7-4f87-90db-2685d44c65fe",
   "metadata": {
    "execution": {}
   },
   "source": [
    "# Resources"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0db8d5a-8c2c-4019-96df-c15783b84cd1",
   "metadata": {
    "execution": {}
   },
   "source": [
    "Code and data for this tutorial is based on existing content from [Project Pythia](https://foundations.projectpythia.org/core/xarray/computation-masking.html)."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "include_colab_link": true,
   "name": "W1D1_Tutorial7",
   "provenance": [],
   "toc_visible": true
  },
  "kernel": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}