add dvsg tutorial
jasonlyik committed May 17, 2024
1 parent 60718c7 commit 2c74da7
Showing 1 changed file with 306 additions and 0 deletions.
306 changes: 306 additions & 0 deletions neurobench/examples/dvs_gesture/DVS_Gesture_tutorial.ipynb
@@ -0,0 +1,306 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "yGm4fad3M-Sr"
},
"source": [
"# DVS Gesture Benchmark Tutorial\n",
"\n",
"This tutorial aims to provide an insight on how the NeuroBench framework is organized and how you can use it to benchmark your own models!\n",
"\n",
"## About DVS Gesture:\n",
"The IBM Dynamic Vision Sensor (DVS) Gesture dataset is composed of recordings of 29 distinct individuals executing 10 different types of gestures, including but not limited to clapping, waving, etc. Additionally, an 11th gesture class is included that comprises gestures that cannot be categorized within the first 10 classes. The gestures are recorded under four distinct lighting conditions, and each gesture is associated with a label that indicates the corresponding lighting condition under which it was performed.\n",
"\n",
"### Benchmark Task:\n",
"The task is to classify gestures and achieve high accuracy. This tutorial demonstrates with a trained convolutional spiking neural network."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we will import the relevant libraries. We will use the [Tonic library](https://tonic.readthedocs.io/en/latest/) for loading and pre-processing the data, and the model wrapper, post-processor, and benchmark object from NeuroBench."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "lqtM6XbMM_hO"
},
"outputs": [],
"source": [
"# Tonic library is used for DVS Gesture dataset loading and processing\n",
"import tonic\n",
"import tonic.transforms as transforms\n",
"from torch.utils.data import DataLoader\n",
"\n",
"from neurobench.models import SNNTorchModel\n",
"from neurobench.postprocessing import choose_max_count\n",
"from neurobench.benchmarks import Benchmark"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R7HMjVPX7LZh"
},
"source": [
"For this tutorial, we will make use of a four-layer convolutional SNN, written using snnTorch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "r0yYDNRZ7UxY"
},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import snntorch as snn\n",
"from snntorch import surrogate\n",
"\n",
"class Net(torch.nn.Module):\n",
" def __init__(self):\n",
" super().__init__()\n",
"\n",
" # Hyperparameters\n",
" beta_1 = 0.9999903192467171\n",
" beta_2 = 0.7291118090686332\n",
" beta_3 = 0.9364650136740154\n",
" beta_4 = 0.8348241794080301\n",
" threshold_1 = 3.511291184386264\n",
" threshold_2 = 3.494437965584431\n",
" threshold_3 = 1.5986853560315544\n",
" threshold_4 = 0.3641469130041378\n",
" spike_grad = surrogate.atan()\n",
" dropout = 0.5956071342984011\n",
" \n",
" # Initialize layers\n",
" self.conv1 = nn.Conv2d(2, 16, 5, padding=\"same\")\n",
" self.pool1 = nn.MaxPool2d(2)\n",
" self.lif1 = snn.Leaky(beta=beta_1, threshold=threshold_1, spike_grad=spike_grad, init_hidden=True)\n",
" \n",
" self.conv2 = nn.Conv2d(16, 32, 5, padding=\"same\")\n",
" self.pool2 = nn.MaxPool2d(2)\n",
" self.lif2 = snn.Leaky(beta=beta_2, threshold=threshold_2, spike_grad=spike_grad, init_hidden=True)\n",
" \n",
" self.conv3 = nn.Conv2d(32, 64, 5, padding=\"same\")\n",
" self.pool3 = nn.MaxPool2d(2)\n",
" self.lif3 = snn.Leaky(beta=beta_3, threshold=threshold_3, spike_grad=spike_grad, init_hidden=True)\n",
" \n",
" self.linear1 = nn.Linear(64*4*4, 11)\n",
" self.dropout_4 = nn.Dropout(dropout)\n",
" self.lif4 = snn.Leaky(beta=beta_4, threshold=threshold_4, spike_grad=spike_grad, init_hidden=True, output=True)\n",
"\n",
" def forward(self, x):\n",
" # x is expected to be in shape (batch, channels, height, width) = (B, 2, 32, 32)\n",
" \n",
" # Layer 1\n",
" y = self.conv1(x)\n",
" y = self.pool1(y)\n",
" spk1 = self.lif1(y)\n",
"\n",
" # Layer 2\n",
" y = self.conv2(spk1)\n",
" y = self.pool2(y)\n",
" spk2 = self.lif2(y)\n",
"\n",
" # Layer 3\n",
" y = self.conv3(spk2)\n",
" y = self.pool3(y)\n",
" spk3 = self.lif3(y)\n",
"\n",
" # Layer 4\n",
" y = self.linear1(spk3.flatten(1))\n",
" y = self.dropout_4(y)\n",
" spk4, mem4 = self.lif4(y)\n",
"\n",
" return spk4, mem4"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VNIgTfvuOMe-"
},
"source": [
"We load a pre-trained model. The model is wrapped in the SNNTorchModel wrapper, which includes boilerplate inference code and interfaces with the top-level Benchmark class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "chZeyUTAOQ6B"
},
"outputs": [],
"source": [
"device = torch.device(\"cuda\") if torch.cuda.is_available() else torch.device(\"cpu\")\n",
"\n",
"net = Net()\n",
"net.load_state_dict(torch.load(\"model_data/dvs_gesture_snn\", map_location=device))\n",
"\n",
"model = SNNTorchModel(net)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will load the dataset. Here, we are using the DVSGesture dataset from the Tonic library, as well as transforms to turn the events into frames that can be processed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "x4jOfnt6OeIH"
},
"outputs": [],
"source": [
"# Load the dataset, here we are using the Tonic library\n",
"data_dir = \"../../../data/dvs_gesture\" # data in repo root dir\n",
"test_transform = transforms.Compose([transforms.Denoise(filter_time=10000),\n",
" transforms.Downsample(spatial_factor=0.25),\n",
" transforms.ToFrame(sensor_size=(32, 32, 2),\n",
" n_time_bins=150),\n",
" ])\n",
"test_set = tonic.datasets.DVSGesture(save_to=data_dir, transform=test_transform, train=False)\n",
"test_set_loader = DataLoader(test_set, batch_size=16,\n",
" collate_fn=tonic.collation.PadTensors(batch_first=True))"
]
},
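{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check, we can inspect one batch from the loader. With the transforms above, the frames are expected to have a shape along the lines of (batch, time bins, polarity, height, width), i.e. roughly (16, 150, 2, 32, 32); the exact layout may vary with the Tonic version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: inspect one batch to check the frame tensor shape\n",
"frames, targets = next(iter(test_set_loader))\n",
"print(frames.shape, targets.shape)"
]
},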
{
"cell_type": "raw",
"metadata": {
"id": "UfRfdvXvOqRP"
},
"source": [
"Specify any pre-processors and post-processors you want to use. These will be applied to your data before feeding into the model, and to the output spikes respectively.\n",
"Here, the transforms listed above account for all necessary pre-processing. The post-processor counts up the spikes corresponding to the output labels, and chooses the label with the max count."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3GHY8vTROwzP"
},
"outputs": [],
"source": [
"preprocessors = []\n",
"postprocessors = [choose_max_count]"
]
},
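{
"cell_type": "markdown",
"metadata": {},
"source": [
"For intuition, a max-count post-processor can be sketched as below. This is only an illustration assuming output spikes of shape (batch, timesteps, classes); it is not the actual NeuroBench implementation of `choose_max_count`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def max_count_postprocessor(spikes):\n",
"    # spikes: (batch, timesteps, num_classes) tensor of output spikes\n",
"    # Sum spikes over time and pick the class with the highest count per sample\n",
"    return spikes.sum(dim=1).argmax(dim=1)\n",
"\n",
"# Illustrative example: random binary spikes for 4 samples, 150 timesteps, 11 classes\n",
"example_spikes = (torch.rand(4, 150, 11) > 0.9).float()\n",
"print(max_count_postprocessor(example_spikes))"
]
},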
{
"cell_type": "markdown",
"metadata": {
"id": "o9doNsI0O0Jl"
},
"source": [
"Next specify the metrics which you want to calculate. The metrics include static metrics, which are computed before any model inference, and workload metrics, which show inference results.\n",
"\n",
"- Footprint: Bytes used to store the model parameters and buffers.\n",
"- Connection sparsity: Proportion of zero weights in the model.\n",
"- Classification accuracy: Accuracy of keyword predictions.\n",
"- Activation sparsity: Proportion of zero activations, averaged over all neurons, timesteps, and samples.\n",
"- Synaptic operations: Number of weight-activation operations, averaged over keyword samples.\n",
" - Effective MACs: Number of non-zero multiply-accumulate synops, where the activations are not spikes with values -1 or 1.\n",
" - Effective ACs: Number of non-zero accumulate synops, where the activations are -1 or 1 only.\n",
" - Dense: Total zero and non-zero synops."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sDUczVTkPOsQ"
},
"outputs": [],
"source": [
"static_metrics = [\"footprint\", \"connection_sparsity\"]\n",
"workload_metrics = [\"classification_accuracy\", \"activation_sparsity\", \"synaptic_operations\"]"
]
},
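{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the footprint metric, the memory taken up by the model parameters and buffers can be estimated directly in PyTorch. This sketch only conveys the idea; it is not the NeuroBench implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative estimate: bytes occupied by the model parameters and buffers\n",
"param_bytes = sum(p.numel() * p.element_size() for p in net.parameters())\n",
"buffer_bytes = sum(b.numel() * b.element_size() for b in net.buffers())\n",
"print(param_bytes + buffer_bytes)"
]
},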
{
"cell_type": "markdown",
"metadata": {
"id": "KXQYfiJpPTZb"
},
"source": [
"Next, we instantiate the benchmark. We pass the model, the dataloader, the preprocessors, the postprocessor and the list of the static and data metrics which we want to measure:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "U0_N96ADPeO5"
},
"outputs": [],
"source": [
"benchmark = Benchmark(model, test_set_loader, preprocessors, postprocessors, [static_metrics, workload_metrics])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6ytLJ-dUPp0b"
},
"source": [
"Now, let's run the benchmark and print our results!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ldww7kiYPsU2"
},
"outputs": [],
"source": [
"results = benchmark.run()\n",
"print(results)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Expected output:\n",
"{'footprint': 304828, 'connection_sparsity': 0.0, \n",
"'classification_accuracy': 0.8636363636363633, 'activation_sparsity': 0.9507192967815323, \n",
"'synaptic_operations': {'Effective_MACs': 9227011.575757576, 'Effective_ACs': 30564577.174242426, 'Dense': 891206400.0}}"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
