From d18c8eecc37d7e6903f782d2e8d3c9677befa04f Mon Sep 17 00:00:00 2001
From: Robert Ford <50349951+r-ford@users.noreply.github.com>
Date: Fri, 24 May 2024 12:44:12 -0400
Subject: [PATCH] Delete notebooks/example-workflows/enso-esgf.ipynb

Notebook is retained in separate branch
---
 notebooks/example-workflows/enso-esgf.ipynb | 501 --------------------
 1 file changed, 501 deletions(-)
 delete mode 100644 notebooks/example-workflows/enso-esgf.ipynb

diff --git a/notebooks/example-workflows/enso-esgf.ipynb b/notebooks/example-workflows/enso-esgf.ipynb
deleted file mode 100644
index 88956b6..0000000
--- a/notebooks/example-workflows/enso-esgf.ipynb
+++ /dev/null
@@ -1,501 +0,0 @@
-{ - "cells": [ - { - "cell_type": "markdown", - "id": "7d1ea66b-fa0c-4fa4-9b65-3d7b07716e62", - "metadata": {}, - "source": [ - "# Calculating ENSO Using Intake-ESGF" - ] - }, - { - "cell_type": "markdown", - "id": "4a415308-0e9a-470c-bb68-da75b349c006", - "metadata": {}, - "source": [ - "## Overview\n", - "\n", - "In this workflow, we combine topics covered in previous Pythia Foundations and CMIP6 Cookbook content to apply the [Niño 3.4 Index](https://climatedataguide.ucar.edu/climate-data/nino-sst-indices-nino-12-3-34-4-oni-and-tni) to a broader range of datasets. As a refresher on what the ENSO 3.4 index is, please see the following text, which is also included in the [Calculating ENSO with Xarray](https://foundations.projectpythia.org/core/xarray/enso-xarray.html) chapter of Pythia Foundations.\n", - "\n", - "> Niño 3.4 (5N-5S, 170W-120W): The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more.\n", - "\n", - "> Niño X Index computation: a) Compute area averaged total SST from Niño X region; b) Compute monthly climatology (e.g., 1950-1979) for area averaged total SST from Niño X region, and subtract climatology from area averaged total SST time series to obtain anomalies; c) Smooth the anomalies with a 5-month running mean; d) Normalize the smoothed values by its standard deviation over the climatological period.\n", - "\n", - "![](https://www.ncdc.noaa.gov/monitoring-content/teleconnections/nino-regions.gif)\n", - "\n", - "The previous example in Pythia Foundations detailed a single simulation. In this example, we aim to apply this computation more generically across a variety of datasets.\n", - "\n", - "The overall goal of this tutorial is to produce a plot of ENSO data using Xarray and intake-ESGF.
The plots will resemble the Oceanic Niño Index plot shown below.\n", - "\n", - "![ONI index plot from NCAR Climate Data Guide](https://climatedataguide.ucar.edu/sites/default/files/styles/extra_large/public/2022-03/indices_oni_2_2_lg.png)" - ] - }, - { - "cell_type": "markdown", - "id": "2d4c6aed-a9c5-4d29-bfa3-c8e8be230567", - "metadata": {}, - "source": [ - "## Prerequisites\n", - "\n", - "| Concepts | Importance | Notes |\n", - "| --- | --- | --- |\n", - "| [Intro to Xarray](https://foundations.projectpythia.org/core/xarray/xarray-intro.html) | Necessary | |\n", - "| [hvPlot Basics](https://hvplot.holoviz.org/getting_started/hvplot.html) | Necessary | Interactive Visualization with hvPlot |\n", - "| [Understanding of NetCDF](https://foundations.projectpythia.org/core/data-formats/netcdf-cf.html) | Helpful | Familiarity with metadata structure |\n", - "| [Calculating ENSO with Xarray](https://foundations.projectpythia.org/core/xarray/enso-xarray.html) | Necessary | Understanding of Masking and Xarray Functions |\n", - "| Dask | Helpful | |\n", - "\n", - "- **Time to learn**: 30 minutes" - ] - }, - { - "cell_type": "markdown", - "id": "7ff38f37-8f14-443f-b0c7-188baf75d1be", - "metadata": {}, - "source": [ - "## Imports" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "52bcfa1a-3907-446d-b384-29e97b5c8cb9", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "import hvplot.xarray\n", - "import holoviews as hv\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "import cartopy.crs as ccrs\n", - "from intake_esgf import ESGFCatalog\n", - "import xarray as xr\n", - "import cf_xarray\n", - "import warnings\n", - "warnings.filterwarnings(\"ignore\")\n", - "\n", - "hv.extension(\"bokeh\")" - ] - }, - { - "cell_type": "markdown", - "id": "7aa56cff-6a80-47fc-b99e-fd9fa960032c", - "metadata": {}, - "source": [ - "## Access ESGF-hosted CMIP6 Data\n", - "We will use data from the Coupled Model Intercomparison Project Phase 6 (CMIP6), which is available from the Earth System Grid Federation (ESGF) data servers.\n", - "\n", - "We can use the `intake-esgf` toolkit to interface with these data servers, making it easier to search for our datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b82ae2da-1928-4f6b-b17b-551b82465845", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cat = ESGFCatalog()\n", - "cat.search(\n", - " activity_id='CMIP',\n", - " experiment_id=[\"historical\",\"ssp585\"],\n", - " source_id=\"CESM2\",\n", - " variable_id=[\"tos\"],\n", - " member_id='r11i1p1f1',\n", - " grid_label=\"gn\",\n", - " table_id=\"Omon\",\n", - " )" - ] - }, - { - "cell_type": "markdown", - "id": "29bc9830-67b0-40ed-aae3-ad5b759878d5", - "metadata": {}, - "source": [ - "### Load into a Dictionary of Datasets\n", - "Once we have subset our data, we can load it into a dictionary of `xarray.Dataset`s, which also includes the cell measures (such as `areacello`) needed for area weighting!"
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2b8d3bc8-f568-44f7-bee2-833cf532d823", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "tos_tree = cat.to_dataset_dict()\n", - "tos_tree" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3694d7de-3b75-4e25-8001-1fa5c7877f53", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "tos_tree.keys()" - ] - }, - { - "cell_type": "markdown", - "id": "5395c611-afda-405e-9b1c-35b7566c557f", - "metadata": { - "tags": [] - }, - "source": [ - "There is only one variable in this dictionary of datasets, `tos`, which is the \"Sea Surface Temperature\". We can extract this dataset by passing in that key:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "317a9198-19ba-4e73-b006-4a8ba61327a8", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "ds = tos_tree[\"tos\"]\n", - "ds" - ] - }, - { - "cell_type": "markdown", - "id": "f3ac774f-7a6f-425b-82c9-c5b3099eb203", - "metadata": {}, - "source": [ - "## Calculate ENSO\n", - "The calculation is covered in more detail in the Pythia Foundations book; here, we apply it to our dataset!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c01c56eb-71b9-4b4b-92c3-31ff3714b55b", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def calculate_enso(ds):\n", - " \n", - " # Subset the Niño 3.4 index region (5N-5S, 170W-120W)\n", - " dso = ds.where(\n", - " (ds.cf[\"latitude\"] < 5) & (ds.cf[\"latitude\"] > -5) & (ds.cf[\"longitude\"] > 190) & (ds.cf[\"longitude\"] < 240), drop=True\n", - " )\n", - " \n", - " # Group by month to build the monthly climatology\n", - " gb = dso.tos.groupby('time.month')\n", - " \n", - " # Subtract the monthly averages, returning the anomalies\n", - " tos_nino34_anom = gb - gb.mean(dim='time')\n", - " \n", - " # Determine the non-time dimensions and average using these\n", - " non_time_dims = set(tos_nino34_anom.dims)\n", - " non_time_dims.remove(ds.tos.cf[\"T\"].name)\n", - " weighted_average = tos_nino34_anom.weighted(ds[\"areacello\"]).mean(dim=list(non_time_dims))\n", - " \n", - " # Smooth with a 5-month rolling average and normalize by the standard deviation\n", - " rolling_average = weighted_average.rolling(time=5, center=True).mean()\n", - " std_dev = weighted_average.std()\n", - " return rolling_average / std_dev" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "193b0055-35a4-4f48-8d00-b24059f00160", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "enso_index = calculate_enso(ds).compute()\n", - "enso_index" - ] - }, - { - "cell_type": "markdown", - "id": "0a6dba6d-f305-4909-bbde-63366c5cb2b5", - "metadata": {}, - "source": [ - "## Visualize ENSO\n", - "\n", - "### Basic Visualization\n", - "We can create a basic visualization of the dataset using hvPlot!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "67cac155-bb58-4c80-a0e4-1c3fd8de8eb4", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "enso_index.hvplot(x='time')" - ] - }, - { - "cell_type": "markdown", - "id": "629d1d24-accb-426d-858c-6422d568d97d", - "metadata": {}, - "source": [ - "### Identify El Niño and La Niña\n", - "Plotting the raw index as we did above is not always the most helpful presentation.
We need to add context that helps the reader see when El Niño and La Niña conditions are reached, using thresholds that are widely adopted by the community.\n", - "\n", - "A typical threshold to use is 0.4, which means El Niño occurs when the ENSO 3.4 index is equal to or greater than 0.4, and La Niña occurs when the ENSO 3.4 index is equal to or less than -0.4.\n", - "\n", - "We apply this using the following function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0bf6b460-f64a-47d2-b819-53abdc0b03f6", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def add_enso_thresholds(da, threshold=0.4):\n", - " \n", - " # Convert the xr.DataArray into an xr.Dataset\n", - " ds = da.to_dataset()\n", - " \n", - " # Convert a cftime index to a standard DatetimeIndex (if needed), then apply the thresholds\n", - " try:\n", - " ds[\"time\"] = ds.indexes[\"time\"].to_datetimeindex()\n", - " except:\n", - " pass\n", - " ds[\"tos_gt_04\"] = (\"time\", ds.tos.where(ds.tos >= threshold, threshold).data)\n", - " ds[\"tos_lt_04\"] = (\"time\", ds.tos.where(ds.tos <= -threshold, -threshold).data)\n", - " \n", - " # Add fields for the thresholds\n", - " ds[\"el_nino_threshold\"] = (\"time\", np.zeros_like(ds.tos) + threshold)\n", - " ds[\"la_nina_threshold\"] = (\"time\", np.zeros_like(ds.tos) - threshold)\n", - " \n", - " return ds" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13680161-2f71-40f0-9c56-df8cac9621a5", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "enso_ds = add_enso_thresholds(enso_index)\n", - "enso_ds" - ] - }, - { - "cell_type": "markdown", - "id": "b502d1d8-e4d7-497d-84fa-c478ec7b4e91", - "metadata": {}, - "source": [ - "### Configure a Function to Plot the Data\n", - "We will use the `hvplot.area` functionality here, which enables us to shade the area between values. The newly added threshold variables in our dataset help with this." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d2e8e487-5158-4825-a7aa-9dfe582192c4", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def plot_enso(ds):\n", - " # Create the El Niño area graphs\n", - " el_nino = ds.hvplot.area(x=\"time\", y2='tos_gt_04', y='el_nino_threshold', color='red', hover=False)\n", - " el_nino_label = hv.Text(ds.isel(time=40).time.values, 2, 'El Niño').opts(text_color='red')\n", - "\n", - " # Create the La Niña area graphs\n", - " la_nina = ds.hvplot.area(x=\"time\", y2='tos_lt_04', y='la_nina_threshold', color='blue', hover=False)\n", - " la_nina_label = hv.Text(ds.isel(time=-40).time.values, -2, 'La Niña').opts(text_color='blue')\n", - "\n", - " # Plot a timeseries of the ENSO 3.4 index\n", - " enso = ds.tos.hvplot(x='time', line_width=0.5, color='k', xlabel='Year', ylabel='ENSO 3.4 Index')\n", - "\n", - " # Combine all the plots into a single plot\n", - " return (el_nino_label * la_nina_label * el_nino * la_nina * enso)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ed848ecc-1731-46a3-a898-a4cfcaa55d71", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "plot_enso(enso_ds)" - ] - }, - { - "cell_type": "markdown", - "id": "c3bf5d22-57fa-4fad-b194-064a2aae342b", - "metadata": {}, - "source": [ - "## Apply to Multiple Datasets\n", - "Now that we have the workflow, let's apply this to multiple datasets.
We focus here on two different institutions:\n", - "- The National Center for Atmospheric Research (NCAR)\n", - "- The Model for Interdisciplinary Research on Climate (MIROC)\n", - "\n", - "Both of these modeling centers produced output for CMIP6.\n" - ] - }, - { - "cell_type": "markdown", - "id": "0624f670-f805-4813-8853-d7f94f2c2a86", - "metadata": {}, - "source": [ - "### Set Up a Function for Searching and Combining Datasets\n", - "We can use the query mentioned previously to configure our search. Here, we parameterize it based on the institution ID (e.g., NCAR, MIROC)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "256207a2-0275-4897-b380-334de13e9ce1", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def search_esgf(institution_id, grid='gn'):\n", - " \n", - " # Search for and load the sea surface temperature (tos)\n", - " cat = ESGFCatalog()\n", - " cat.search(\n", - " activity_id=\"CMIP\",\n", - " experiment_id=\"historical\",\n", - " institution_id=institution_id,\n", - " grid_label=grid,\n", - " variable_id=[\"tos\"],\n", - " member_id='r11i1p1f1',\n", - " table_id=\"Omon\",\n", - " )\n", - "\n", - " return cat.to_dataset_dict()[\"tos\"]" - ] - }, - { - "cell_type": "markdown", - "id": "28fd9293-4b6e-4881-af47-8f95ece00c62", - "metadata": {}, - "source": [ - "### Apply the Search and Computations" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3a173bf5-c152-4e63-a0c3-c1055ea1439b", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "ncar_ds = search_esgf(\"NCAR\")\n", - "enso_index_ncar = add_enso_thresholds(calculate_enso(ncar_ds).compute())" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64d12adf-1c0f-4f8d-a37e-6b45b20c55e8", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "miroc_ds = search_esgf(\"MIROC\")\n", - "enso_index_miroc = add_enso_thresholds(calculate_enso(miroc_ds).compute())" - ] - }, - { - "cell_type": "markdown", - "id": "d2f0a6b0-07d7-4c95-9598-0d746cfcaf72", - "metadata": {}, - "source": [ - "### Visualize our ENSO Comparison\n", - "Now that we have our data, we can plot the comparison, stacking the two plots together using hvPlot."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "54594fe4-136b-4150-bb1f-25f70df82364", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "ncar_enso_plot = plot_enso(enso_index_ncar).opts(title=f'NCAR {ncar_ds.attrs[\"source_id\"]} \\n Ensemble Member: {ncar_ds.attrs[\"variant_label\"]}')\n", - "miroc_enso_plot = plot_enso(enso_index_miroc).opts(title=f'MIROC {miroc_ds.attrs[\"source_id\"]} \\n Ensemble Member: {miroc_ds.attrs[\"variant_label\"]}')\n", - "\n", - "(ncar_enso_plot + miroc_enso_plot).cols(1)" - ] - }, - { - "cell_type": "markdown", - "id": "b3bfceb9-124a-4d72-9d49-3ca255965e29", - "metadata": {}, - "source": [ - "## Summary\n", - "In this notebook, we searched for and accessed two different CMIP6 datasets hosted through ESGF, calculated the ENSO 3.4 indices for the datasets, and created interactive plots comparing where we see El Niño and La Niña.\n", - "\n", - "### What's next?\n", - "We will see more advanced examples of accessing CMIP6 and other datasets, as well as additional computations.\n", - "\n", - "## Resources and references\n", - "- [Intake-ESGF Documentation](https://github.com/nocollier/intake-esgf)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5b2864f-6661-4aa4-8d65-8dc10c961b36", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}
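The Overview in the notebook above describes the Niño 3.4 computation as four steps (area-averaged SST, monthly anomalies, 5-month smoothing, normalization) before applying them with `calculate_enso` and `add_enso_thresholds`. Below is a minimal sketch of those same steps on a synthetic, already area-averaged monthly SST series, so it runs offline without ESGF access or `areacello` weighting. The synthetic series and the names `sst` and `nino34` are illustrative assumptions, not part of the notebook.

```python
# Minimal sketch of the Niño 3.4 steps from the notebook, on synthetic data.
# Assumes the series is already area-averaged over the Niño 3.4 box and uses
# the notebook's +/-0.4 threshold for classification.
import numpy as np
import pandas as pd
import xarray as xr

threshold = 0.4

# Synthetic monthly SST: a seasonal cycle plus smoothed noise standing in
# for ENSO variability (illustrative only, not real model output).
rng = np.random.default_rng(0)
time = pd.date_range("1950-01-01", periods=600, freq="MS")
seasonal = 1.5 * np.sin(2 * np.pi * (time.month - 1) / 12)
noise = np.convolve(rng.normal(size=time.size), np.ones(6) / 6, mode="same")
sst = xr.DataArray(26.5 + seasonal + noise, coords={"time": time}, dims="time", name="tos")

# a) + b) monthly climatology and anomalies
climatology = sst.groupby("time.month").mean("time")
anomalies = sst.groupby("time.month") - climatology

# c) smooth the anomalies with a 5-month centered rolling mean
smoothed = anomalies.rolling(time=5, center=True).mean()

# d) normalize by the standard deviation of the (unsmoothed) anomalies,
# mirroring calculate_enso in the notebook
nino34 = smoothed / anomalies.std()

# Classify months using the +/-0.4 threshold from add_enso_thresholds
el_nino_months = int((nino34 >= threshold).sum())
la_nina_months = int((nino34 <= -threshold).sum())
print(f"El Niño months: {el_nino_months}, La Niña months: {la_nina_months}")
```

On real CMIP6 output, these steps follow the spatial subsetting and `areacello`-weighted averaging shown in `calculate_enso` above.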