From 6538cd32ff1a07aa9a696fabce96b9a38868ac4e Mon Sep 17 00:00:00 2001
From: mrdbourke
Date: Fri, 5 Apr 2024 14:02:42 +1000
Subject: [PATCH] fix typos, add text for dog vision v2

---
 .../end-to-end-dog-vision-v2.ipynb | 167 ++++++++++++------
 1 file changed, 111 insertions(+), 56 deletions(-)

diff --git a/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb b/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb
index 850ac7cb7..963abd3e1 100644
--- a/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb
+++ b/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb
@@ -19,10 +19,11 @@
 "# Introduction to TensorFlow, Deep Learning and Transfer Learning (work in progress)\n",
 "\n",
 "* **Project:** Dog Vision 🐢👁 - Using computer vision to classify dog photos into different breeds.\n",
-"* **Goals:** Learn TensorFlow, deep learning and transfer learning.\n",
+"* **Goals:** Learn TensorFlow, deep learning and transfer learning, and beat the original research paper's results (22% accuracy).\n",
 "* **Domain:** Computer vision.\n",
 "* **Data:** Images of dogs from [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) (120 dog breeds, 20,000+ images).\n",
 "* **Problem type:** Multi-class classification (120 different classes).\n",
+"* **Runtime:** This project is designed to run end-to-end in [Google Colab](https://colab.research.google.com/) (for free GPU access and easy setup). If you'd like to run it locally, it will require some environment setup.\n",
 "\n",
 "Welcome, welcome!\n",
 "\n",
@@ -50,7 +51,7 @@
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Last updated: 2024-04-03 23:39:24.257592\n"
+"Last updated: 2024-04-04 15:06:16.564421\n"
 ]
 }
 ],
@@ -68,7 +69,7 @@
 "source": [
 "## TK - What we're going to cover\n",
 "\n",
-"In this project, we're going to be introduced to the power of deep learning and more specifically, transfer learning using TensorFlow.\n",
+"In this project, we're going to be introduced to the power of deep learning and, more specifically, transfer learning using TensorFlow/Keras.\n",
 "\n",
 "We'll go through each of these in the context of the [6 step machine learning framework](https://dev.mrdbourke.com/zero-to-mastery-ml/a-6-step-framework-for-approaching-machine-learning-projects/):\n",
 "\n",
@@ -97,9 +98,12 @@
 "source": [
 "## TK - Where can you get help?\n",
 "\n",
-"All of the materials for this course [live on GitHub](https://github.com/mrdbourke/zero-to-mastery-ml/tree/master).\n",
+"All of the materials for this course are [available on GitHub](https://github.com/mrdbourke/zero-to-mastery-ml/tree/master).\n",
 "\n",
-"If you run into trouble, you can ask a question on the course [GitHub Discussions page](https://github.com/mrdbourke/zero-to-mastery-ml/discussions) there too."
+"If you run into trouble, you can also ask a question on the course [GitHub Discussions page](https://github.com/mrdbourke/zero-to-mastery-ml/discussions).\n",
+"\n",
+"* TK - Ask Stack Overflow\n",
+"* TK - Ask ChatGPT/Gemini"
 ]
 },
 {
@@ -114,7 +118,7 @@
 "\n",
 "### TK - What is TensorFlow?\n",
 "\n",
-"[TensorFlow](https://www.tensorflow.org/) is an open source machine learning and deep learning framework originally developed by Google.\n",
+"[TensorFlow](https://www.tensorflow.org/) is an open source machine learning and deep learning framework originally developed by Google. Inside TensorFlow, you can also use Keras, another machine learning framework known for its ease of use.\n",
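+"\n",
+"As a quick sanity check, you can import TensorFlow and print its version (a minimal sketch; TensorFlow comes pre-installed in Google Colab, and Keras is bundled inside it as `tf.keras`):\n",
+"\n",
+"```python\n",
+"import tensorflow as tf\n",
+"\n",
+"# Print the installed TensorFlow version\n",
+"print(tf.__version__)\n",
+"```\n",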
 "\n",
 "### TK - Why use TensorFlow?\n",
 "\n",
@@ -124,13 +128,17 @@
 "\n",
 "Many of the world's largest companies [power their machine learning workloads with TensorFlow](https://www.tensorflow.org/about/case-studies).\n",
 "\n",
+"TK - image of people using TensorFlow\n",
+"\n",
 "### TK - What is deep learning?\n",
 "\n",
 "[Deep learning](https://en.wikipedia.org/wiki/Deep_learning) is a form of machine learning where data passes through a series of progressive layers which all contribute to learning an overall representation of that data.\n",
 "\n",
-"The series of progressive layers combines to form what's referred to as a [**neural network**](https://en.wikipedia.org/wiki/Artificial_neural_network).\n",
+"Each layer performs a pre-defined operation.\n",
+"\n",
+"The series of progressive layers combine to form what's referred to as a [**neural network**](https://en.wikipedia.org/wiki/Artificial_neural_network).\n",
 "\n",
-"For example, a photo may be turned into numbers and those numbers are then manipulated mathematically through each progressive layer to learn patterns in the photo.\n",
+"For example, a photo may be turned into numbers (e.g. red, green and blue pixel values) and those numbers are then manipulated mathematically through each progressive layer to learn patterns in the photo.\n",
 "\n",
 "The \"deep\" in deep learning comes from the number of layers used in the neural network.\n",
 "\n",
@@ -142,15 +150,17 @@
 "\n",
 "Most of the modern forms of artificial intelligence (AI) applications you see are powered by deep learning.\n",
 "\n",
-"[ChatGPT](https://chat.openai.com) uses deep learning to process text and return a response.\n",
+"[ChatGPT](https://chat.openai.com) and other large language models (LLMs) such as Llama, Claude and Gemini use deep learning to process text and return a response.\n",
 "\n",
 "Tesla's [self-driving cars use deep learning](https://www.tesla.com/AI) to power their computer vision systems.\n",
 "\n",
 "Apple's Photos app uses deep learning to [recognize faces in images](https://machinelearning.apple.com/research/recognizing-people-photos) and create Photo Memories.\n",
 "\n",
+"Siri and Google Assistant use deep learning to recognize and understand voice commands.\n",
+"\n",
 "[Nutrify](https://nutrify.app) (an app my brother and I built) uses deep learning to recognize food in images.\n",
 "\n",
-"TK - image of examples\n",
+"TK - image of examples of what deep learning can be used for\n",
 "\n",
 "### TK - What is transfer learning?\n",
 "\n",
@@ -160,9 +170,11 @@
 "\n",
 "In our case, we're going to use transfer learning to take the patterns a neural network has learned from the 1 million+ images and over 1000 classes in [ImageNet](https://www.image-net.org/) (a gold standard computer vision benchmark) and apply them to our own problem of recognizing dog breeds.\n",
 "\n",
+"However, this concept can be applied to many different domains. For example, you could take a large language model (LLM) that has been pre-trained on *most* of the text on the internet (and so has learned the patterns of natural language very well) and customize it for your own specific chat use case.\n",
 "\n",
 "The biggest benefit of transfer learning is that it often allows you to get outstanding results with less data and time.\n",
 "\n",
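+"As a taste of the idea in code, here's a minimal sketch of transfer learning with Keras (the choice of `EfficientNetV2B0` here is just an illustration, we'll build a full working version later in the project):\n",
+"\n",
+"```python\n",
+"import tensorflow as tf\n",
+"\n",
+"# 1. Load a model pre-trained on ImageNet, without its original classification layer\n",
+"base_model = tf.keras.applications.EfficientNetV2B0(include_top=False, weights='imagenet')\n",
+"\n",
+"# 2. Freeze the learned patterns so we reuse them rather than retrain them\n",
+"base_model.trainable = False\n",
+"\n",
+"# 3. Stack a new output layer on top for our own 120 dog breed classes\n",
+"model = tf.keras.Sequential([\n",
+"    base_model,\n",
+"    tf.keras.layers.GlobalAveragePooling2D(),\n",
+"    tf.keras.layers.Dense(120, activation='softmax')\n",
+"])\n",
+"```\n",
+"\n",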
-"TK - Transfer learning workflow - Large data -> Large model -> Patterns -> Custom data -> Custom model"
+"TK - image Transfer learning workflow - Large data -> Large model -> Patterns -> Custom data -> Custom model"
 ]
 },
 {
@@ -173,11 +185,11 @@
 "## TK - Getting set up\n",
 "\n",
-"This section of the course is taught with Google Colab, an online Jupyter Notebook that provides free access to GPUs (Graphics Processing Units, we'll hear more on these later).\n",
+"This section of the course is taught with [Google Colab](https://colab.research.google.com/), an online Jupyter Notebook that provides free access to GPUs (Graphics Processing Units, we'll hear more on these later).\n",
 "\n",
 "For a quick rundown on how to use Google Colab, see their [introductory guide](https://colab.research.google.com/notebooks/basic_features_overview.ipynb) (it's quite similar to a Jupyter Notebook with a few different options).\n",
 "\n",
-"Google Colab also comes with many data science and machine learning libraries pre-installed, including TensorFlow.\n"
+"Google Colab also comes with many data science and machine learning libraries pre-installed, including TensorFlow/Keras."
 ]
 },
 {
@@ -192,7 +204,7 @@
 "\n",
 "Why use a GPU?\n",
 "\n",
-"Since neural networks perform a large amount of calculations behind the scenes (the main one being matrix multiplication), you need a computer chip that perform these calculations quickly, otherwise you'll be waiting all day for a model to train.\n",
+"Since neural networks perform a large number of calculations behind the scenes (the main one being [matrix multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication)), you need a computer chip that can perform these calculations quickly, otherwise you'll be waiting all day for a model to train.\n",
 "\n",
 "And in short, GPUs are much faster at performing matrix multiplications than CPUs.\n",
 "\n",
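+"You can verify whether TensorFlow can see a GPU with one line (a minimal sketch; in Google Colab you may first need to enable a GPU via Runtime -> Change runtime type):\n",
+"\n",
+"```python\n",
+"import tensorflow as tf\n",
+"\n",
+"# Returns a non-empty list if TensorFlow can access a GPU\n",
+"print(tf.config.list_physical_devices('GPU'))\n",
+"```\n",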
@@ -291,24 +303,32 @@
 "source": [
 "## TK - Getting Data\n",
 "\n",
+"All machine learning (and deep learning) projects start with data.\n",
+"\n",
+"If you have no data, you have no project.\n",
+"\n",
+"If you have no project, you have no cool models to show your friends or improve your business.\n",
+"\n",
+"Not to worry!\n",
+"\n",
 "There are several options and locations to get data for a deep learning project.\n",
 "\n",
 "| Resource | Description |\n",
-"|----------|-------------|\n",
+"| :----- | :----- |\n",
 "| [Kaggle Datasets](https://www.kaggle.com/datasets) | A collection of datasets across a wide range of topics. |\n",
 "| [TensorFlow Datasets](https://www.tensorflow.org/datasets) | A collection of machine learning datasets ready for use under the `tf.data.Dataset` API. You can see a list of [all available datasets](https://www.tensorflow.org/datasets/catalog/overview#all_datasets) in the TensorFlow documentation. |\n",
 "| [Hugging Face Datasets](https://huggingface.co/datasets) | A continually growing resource of datasets broken into several different kinds of topics. |\n",
 "| [Google Dataset Search](https://datasetsearch.research.google.com/) | A search engine by Google specifically focused on searching online datasets. |\n",
 "| Original sources | Datasets which are made available by researchers or companies with the release of a product or research paper (sources for these will vary, they could be a link on a website or a link to an application form). |\n",
 "| Custom datasets | These are datasets comprised of your own custom source of data. You may build these from scratch on your own or have access to them from an existing product or service. For example, your entire photos library could be your own custom dataset, or your entire notes and documents folder, or your company's customer order history. |\n",
 "\n",
 "In our case, the dataset we're going to use is called the Stanford Dogs dataset (or ImageNet dogs, as the images are dogs separated from ImageNet).\n",
 "\n",
 "Because the Stanford Dogs dataset has been around for a while (since 2011, which as of writing this in 2024 is like a lifetime in deep learning), it's available from several resources:\n",
 "\n",
-"* The [original project website](http://vision.stanford.edu/aditya86/ImageNetDogs/) via link download\n",
-"* Inside [TensorFlow datasets under `stanford_dogs`](https://www.tensorflow.org/datasets/catalog/stanford_dogs)\n",
-"* On [Kaggle as a downloadable dataset](https://www.kaggle.com/datasets/jessicali9530/stanford-dogs-dataset)\n",
+"* The [original project website](http://vision.stanford.edu/aditya86/ImageNetDogs/) via link download.\n",
+"* Inside [TensorFlow datasets under `stanford_dogs`](https://www.tensorflow.org/datasets/catalog/stanford_dogs).\n",
+"* On [Kaggle as a downloadable dataset](https://www.kaggle.com/datasets/jessicali9530/stanford-dogs-dataset).\n",
 "\n",
 "The point here is that when you're starting out practicing deep learning projects, there's no shortage of datasets available.\n",
 "\n",
@@ -320,7 +340,7 @@
 "\n",
 "To practice formatting a dataset for a machine learning problem, we're going to download the Stanford Dogs dataset from the original website.\n",
 "\n",
-"Before we do so, the following code is an example of how we'd get the Stanford Dogs dataset from TensorFlow Datasets."
+"Before we do so, the following code is an example of how we'd get the Stanford Dogs dataset from [TensorFlow Datasets](https://www.tensorflow.org/datasets)."
 ]
 },
 {
@@ -354,7 +374,7 @@
 "\n",
 "1. [Images](http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar) (757MB) - `images.tar`\n",
-"3. [Annotations](http://vision.stanford.edu/aditya86/ImageNetDogs/annotation.tar) (21MB) - `annotation.tar`\n",
-"3. [Lists](http://vision.stanford.edu/aditya86/ImageNetDogs/lists.tar), with train/test splits (0.5MB) - `lists.tar`\n",
+"2. [Annotations](http://vision.stanford.edu/aditya86/ImageNetDogs/annotation.tar) (21MB) - `annotation.tar`\n",
+"3. [Lists](http://vision.stanford.edu/aditya86/ImageNetDogs/lists.tar) with train/test splits (0.5MB) - `lists.tar`\n",
 "\n",
 "Our goal is to get a file structure like this:\n",
 "\n",
@@ -465,7 +485,7 @@
 "source": [
 "Data downloaded!\n",
 "\n",
-"Nice work!\n",
+"Nice work! This may seem like a bit of work, but it's an important step with any deep learning project: getting data to work with.\n",
 "\n",
 "Now if we get the contents of `local_dir` (`dog_vision_data`), what do we get?\n",
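+"\n",
+"One way to peek at the downloaded files is with Python's `pathlib` (a minimal sketch, assuming the downloads landed in `dog_vision_data/`):\n",
+"\n",
+"```python\n",
+"from pathlib import Path\n",
+"\n",
+"local_dir = Path('dog_vision_data')\n",
+"\n",
+"# List the contents of the download directory\n",
+"print(sorted(local_dir.iterdir()))\n",
+"```\n",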
 "\n",
@@ -543,7 +563,7 @@
 },
 "outputs": [],
 "source": [
-"# Untar images\n",
+"# Untar images, flag notes:\n",
 "# -x = extract files from the zipped file\n",
 "# -v = verbose\n",
 "# -z = decompress files\n",
@@ -634,11 +654,18 @@
 "Once you've got a dataset, before building a model, it's wise to explore it for a bit to see what kind of data you're working with.\n",
 "\n",
-"* TK - things you should do when you start with a new dataset\n",
-"* visualize\n",
-"* check the distributions (e.g. number of samples per class)\n",
+"Exploring a dataset can mean many things.\n",
+"\n",
+"But a few rules of thumb when exploring new data:\n",
+"* **View at least 100+ random samples for a \"vibe check\".** For example, if you have a large dataset of images, randomly sample 10 images at a time and view them. Or if you have a large dataset of texts, what do some of them say? The same with audio. It will often be impossible to view all samples in your dataset, but you can start to get a good idea of what's inside by randomly inspecting samples.\n",
+"* ***Visualize, visualize, visualize!*** This is the data explorer's motto. Use it often. As in, it's good to get statistics about your dataset but it's often even better to view 100s of samples with your own eyes (see the point above).\n",
+"* **Check the distributions and other statistics.** How many samples are there? If you're dealing with classification, how many classes and labels per class are there? Which classes don't you understand? If you don't have labels, investigate [clustering methods](https://developers.google.com/machine-learning/clustering/overview) to put similar samples close together.\n",
 "\n",
-"TK - daniel bourke tweet about abraham loss function - https://twitter.com/mrdbourke/status/1456087631641473033"
+"As Abraham Lossfunction says...\n",
+"\n",
+"TK - image: \"Abraham Lossfunction\" tweet by Daniel Bourke\n",
+"\n",
+"*A play on words of Abraham Lincoln's [famous quote](https://www.brainyquote.com/quotes/abraham_lincoln_109275) about sharpening an axe before cutting down a tree, recast in the theme of machine learning. Source: [Daniel Bourke X/Twitter](https://twitter.com/mrdbourke/status/1456087631641473033).*"
 ]
 },
 {
@@ -647,7 +674,7 @@
 "id": "SBuYrnJZWecW"
 },
 "source": [
-"### Discussing our target data format\n",
+"### Our target data format\n",
 "\n",
 "Since our goal is to build a computer vision model to classify dog breeds, we need a way to tell our model what breed of dog is in what image.\n",
 "\n",
@@ -655,7 +682,7 @@
 "For example:\n",
 "\n",
-"\n",
+"TK - image: an example of the standard image classification folder format\n",
 "\n",
 "In the case of dog images, we'd put all of the images labelled \"chihuahua\" in a folder called `chihuahua/` (and so on for all the other classes and images).\n",
 "\n",
@@ -663,7 +690,7 @@
 "This is what we'll be working towards creating.\n",
 "\n",
-"> **Note:** This structure of folder format doesn't just work for only images, it works for text, audio and other kind of classification data too.\n"
+"> **Note:** This folder structure doesn't just work for images, it can work for text, audio and other kinds of classification data too."
 ]
 },
 {
@@ -677,7 +704,7 @@
 "\n",
 "How about we check out the `train_list.mat`, `test_list.mat` and `full_list.mat` files?\n",
 "\n",
-"Searching online, for \"what is a .mat file?\", I found that [it's a MATLAB file](https://www.mathworks.com/help/matlab/import_export/mat-file-versions.html). Before Python became the default language for machine learning and deep learning, many models and datasets were built in MATLAB.\n",
+"Searching online for \"what is a .mat file?\", I found that [it's a MATLAB file](https://www.mathworks.com/help/matlab/import_export/mat-file-versions.html). Before Python became the default language for machine learning and deep learning, many models and datasets were built in [MATLAB](https://www.mathworks.com/products/matlab.html).\n",
 "\n",
 "Then I searched \"how to open a .mat file with Python?\" and found an [answer on Stack Overflow](https://stackoverflow.com/questions/874461/read-mat-files-in-python) saying I could use the [`scipy` library](https://scipy.org/) (a scientific computing library).\n",
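+"\n",
+"A minimal sketch of what that looks like (assuming the file lives at `dog_vision_data/train_list.mat`):\n",
+"\n",
+"```python\n",
+"from scipy.io import loadmat\n",
+"\n",
+"# Load the MATLAB file as a dictionary-like object\n",
+"train_list = loadmat('dog_vision_data/train_list.mat')\n",
+"\n",
+"# Inspect which keys (variables) the file contains\n",
+"print(train_list.keys())\n",
+"```\n",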
 "\n",
@@ -1471,7 +1498,7 @@
 ],
 "source": [
-"sorted(folder_to_class_name_dict.items())[:10]"
+"list(folder_to_class_name_dict.items())[:10]"
 ]
 },
 {
@@ -1586,14 +1613,13 @@
 "source": [
-"from typing import List\n",
 "from pathlib import Path\n",
 "import matplotlib.pyplot as plt\n",
 "import random\n",
 "\n",
 "# 1. Take in a select list of image paths\n",
-"def plot_10_random_images_from_path_list(path_list: List[Path],\n",
-"                                         extract_title=True) -> None:\n",
+"def plot_10_random_images_from_path_list(path_list: list[Path],\n",
+"                                         extract_title: bool = True) -> None:\n",
 "    # 2. Set up a grid of plots\n",
 "    fig, axes = plt.subplots(nrows=2, ncols=5, figsize=(20, 10))\n",
 "\n",
@@ -1636,7 +1662,7 @@
 "\n",
 "What I like to do here is rerun the random visualizations until I've seen 100+ samples so I've got an idea of the data we're working with.\n",
 "\n",
-"> **Question:** Here's something to think about, how would you code a system to differentiate between all the different breeds of dogs? Perhaps you write an algorithm to look at the shapes or the colours? You might be thinking \"that would take quite a long time...\" And you'd be right. Then how would we do it? Machine learning of course!"
+"> **Question:** Here's something to think about: how would you code a system of rules to differentiate between all the different breeds of dogs? Perhaps you'd write an algorithm to look at the shapes or the colours? For example, if the dog has black fur, it's unlikely to be a golden retriever. You might be thinking \"that would take quite a long time...\" And you'd be right. Then how would we do it? With machine learning of course!"
 ]
 },
 {
@@ -2779,7 +2805,7 @@
 "\n",
 "The main takeaway(s):\n",
 "* When working on a classification problem, ideally, all classes have a similar number of samples (however, in some problems this may be unattainable, such as fraud detection, where you may have 1000x more \"not fraud\" samples than \"fraud\" samples).\n",
 "* If you wanted to add a new class of dog breed to the existing 120, ideally, you'd have at least ~150 images for it (though as we'll see with transfer learning, the number of required images could be less as long as they're high quality).\n",
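+"\n",
+"For reference, here's a quick way to count the number of images per class straight from the folder structure (a minimal sketch, assuming training images live in `images/train/` in the standard image classification format):\n",
+"\n",
+"```python\n",
+"from collections import Counter\n",
+"from pathlib import Path\n",
+"\n",
+"image_paths = list(Path('images/train').glob('*/*.jpg'))\n",
+"\n",
+"# Use each image's parent folder name as its class label\n",
+"class_counts = Counter(path.parent.name for path in image_paths)\n",
+"print(class_counts.most_common(5))\n",
+"```"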
 ]
 },
 {
@@ -2794,9 +2820,9 @@
 "\n",
 "This includes:\n",
 "\n",
-"| Set Name | Description | Typical Percentage of Data |\n",
-"|--------------------------|------------------------------------------------------|---------------------------|\n",
-"| Training Set | A dataset for the model to learn on | 70-80% |\n",
+"| Set Name | Description | Typical Percentage of Data |\n",
+"|:-----|:-----|:-----|\n",
+"| Training Set | A dataset for the model to learn on | 70-80% |\n",
 "| Testing Set | A dataset for the model to be evaluated on | 20-30% |\n",
 "| (Optional) Validation Set | A dataset to tune the model on | 50% of the test data |\n",
 "| (Optional) Smaller Training Set | A smaller size dataset to run quick experiments on | 5-20% of the training set |\n",
@@ -2834,8 +2860,8 @@
 "\n",
 "So let's write some code to create:\n",
-"* `images/train/` directory.\n",
-"* `images/test/` directory.\n",
+"* `images/train/` directory to hold all of the training images.\n",
+"* `images/test/` directory to hold all of the testing images.\n",
 "* A directory inside each of `images/train/` and `images/test/` for each of the dog breed classes.\n",
 "\n",
 "We can make each of the directories we need using [`Path.mkdir()`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.mkdir).\n",
@@ -2876,8 +2902,8 @@
 "# Using Path.mkdir with exist_ok=True ensures the directory is created only if it doesn't exist\n",
 "train_dir.mkdir(parents=True, exist_ok=True)\n",
 "test_dir.mkdir(parents=True, exist_ok=True)\n",
 "print(f\"Directory {train_dir} exists.\")\n",
 "print(f\"Directory {test_dir} exists.\")\n",
 "\n",
 "# Make a folder for each dog name\n",
 "for dog_name in dog_names:\n",
@@ -3259,9 +3285,36 @@
 "\n",
 "Once you find something that does work, you can start to scale up your experiments (more data, bigger models, longer training times - we'll see these later on).\n",
 "\n",
-"* TK image - make an image diagram of the image split folder we're going to make e.g. train_10_percent...\n",
 "\n",
 "To make our 10% training dataset, let's copy a random 10% of the existing training set to a new folder called `images_split/train_10_percent`, so we've got the layout:\n",
 "\n",
+"```\n",
+"images_split/\n",
+"├── train/\n",
+"│   ├── class_1/\n",
+"│   │   ├── train_image1.jpg\n",
+"│   │   ├── train_image2.jpg\n",
+"│   │   └── ...\n",
+"│   ├── class_2/\n",
+"│   │   ├── train_image1.jpg\n",
+"│   │   ├── train_image2.jpg\n",
+"│   │   └── ...\n",
+"├── train_10_percent/    <--- NEW!\n",
+"│   ├── class_1/\n",
+"│   │   ├── random_train_image42.jpg\n",
+"│   │   └── ...\n",
+"│   ├── class_2/\n",
+"│   │   ├── random_train_image106.jpg\n",
+"│   │   └── ...\n",
+"└── test/\n",
+"    ├── class_1/\n",
+"    │   ├── test_image1.jpg\n",
+"    │   ├── test_image2.jpg\n",
+"    │   └── ...\n",
+"    ├── class_2/\n",
+"    │   ├── test_image1.jpg\n",
+"    │   ├── test_image2.jpg\n",
+"    │   └── ...\n",
+"```\n",
 "\n",
 "Let's start by creating that folder.\n",
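+"\n",
+"Here's a sketch of the copying step we're working towards (a minimal example using `random.sample` and `shutil.copy2`; the paths are assumptions based on the layout above, and we'll write the full version below):\n",
+"\n",
+"```python\n",
+"import random\n",
+"import shutil\n",
+"from pathlib import Path\n",
+"\n",
+"random.seed(42)  # make the random 10% sample reproducible\n",
+"\n",
+"train_paths = list(Path('images_split/train').glob('*/*.jpg'))\n",
+"train_10_percent_dir = Path('images_split/train_10_percent')\n",
+"\n",
+"# Randomly sample 10% of the training image paths and copy them across\n",
+"for path in random.sample(train_paths, k=int(0.1 * len(train_paths))):\n",
+"    target_path = train_10_percent_dir / path.parent.name / path.name\n",
+"    target_path.parent.mkdir(parents=True, exist_ok=True)\n",
+"    shutil.copy2(src=path, dst=target_path)\n",
+"```"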
 ]
 },
 {
@@ -3865,7 +3918,9 @@
 "\n",
 "However, it could be better.\n",
 "\n",
-"If we really wanted to, we could recreate the train 10% dataset with 10% of the images from *each class* rather than 10% of images globally."
+"If we really wanted to, we could recreate the train 10% dataset with 10% of the images from *each class* rather than 10% of images globally.\n",
+"\n",
+"> **Extension:** How would you create the `train_10_percent` data split with 10% of the images from each class? For example, each folder would have at least 10 images of a particular dog breed."
 ]
 },
 {
@@ -3880,7 +3935,7 @@
 "\n",
 "But how do we get the images from different folders into a machine learning model?\n",
 "\n",
-"Well, like the other machine learning models we've built, we need a way to turn our images into numbers.\n",
+"Well, like the other machine learning models we've built throughout the course, we need a way to turn our images into numbers.\n",
 "\n",
 "Specifically, we're going to turn our images into tensors.\n",
 "\n",
@@ -3895,13 +3950,13 @@
 "The reason why we spent time getting our data into the standard image classification format (where the class name is the folder name) is because TensorFlow includes several utility functions to load data from this directory format.\n",
 "\n",
 "| Function | Description |\n",
-"| --- | --- |\n",
+"| :----- | :----- |\n",
 "| [`tf.keras.utils.image_dataset_from_directory()`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory) | Creates a [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) from image files in a directory. |\n",
 "| [`tf.keras.utils.audio_dataset_from_directory()`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/audio_dataset_from_directory) | Creates a `tf.data.Dataset` from audio files in a directory. |\n",
 "| [`tf.keras.utils.text_dataset_from_directory()`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/text_dataset_from_directory) | Creates a `tf.data.Dataset` from text files in a directory. |\n",
 "| [`tf.keras.utils.timeseries_dataset_from_array()`](https://www.tensorflow.org/api_docs/python/tf/keras/utils/timeseries_dataset_from_array) | Creates a dataset of sliding windows over a timeseries provided as an array. |\n",
 "\n",
-"What is a `tf.data.Dataset`?\n",
+"What is a [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)?\n",
 "\n",
 "It's TensorFlow's efficient way to store a potentially large set of elements.\n",
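+"\n",
+"For instance, you can build a tiny `tf.data.Dataset` from an in-memory list (a minimal sketch to show the idea):\n",
+"\n",
+"```python\n",
+"import tensorflow as tf\n",
+"\n",
+"# Create a small dataset from a Python list\n",
+"small_dataset = tf.data.Dataset.from_tensor_slices([10, 20, 30])\n",
+"\n",
+"for element in small_dataset:\n",
+"    print(element)  # each element is a tf.Tensor\n",
+"```\n",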
 "\n",
@@ -3916,8 +3971,8 @@
 "We'll pass in the following parameters:\n",
 "\n",
 "* `directory` = the target directory we'd like to turn into a `tf.data.Dataset`.\n",
-"* `label_mode` = the kind of labels we'd like to use, in our case it's `\"categorical\"` since we're dealing with a multi-class classification problem.\n",
-"* `batch_size` = the number of images we'd like our model to see at a time (due to computation limitations, our model won't be able to look at every image at once), generally [32 is a good value to start](https://x.com/ylecun/status/989610208497360896?s=20).\n",
+"* `label_mode` = the kind of labels we'd like to use, in our case it's `\"categorical\"` since we're dealing with a multi-class classification problem (we would use `\"binary\"` if we were working on a binary classification problem).\n",
+"* `batch_size` = the number of images we'd like our model to see at a time (due to computation limitations, our model won't be able to look at every image at once, so we split the images into small batches and the model looks at each batch individually). Generally, [32 is a good value to start](https://x.com/ylecun/status/989610208497360896?s=20), meaning our model will look at 32 images at a time (this number is flexible).\n",
 "* `image_size` = the size we'd like to shape our images to before we feed them to our model (height x width).\n",
 "* `shuffle` = whether we'd like our dataset to be shuffled to randomize the order.\n",
 "* `seed` = if we're shuffling the order in a random fashion, do we want that to be reproducible?\n",
@@ -4037,8 +4092,8 @@
 "You'll notice a few things going on here.\n",
 "\n",
 "Essentially, we've got a collection of tuples:\n",
-"1. The image tensor(s) - `TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None)` where `(None, 224, 224, 3)` is the shape of the image tensor (`None` is the batch size, `(224, 224)` is the `IMG_SIZE` we set and `3` is the number of colour channels, as in, red, green, blue or RGB since our images are in colour).\n",
-"2. The label tensor(s) - `TensorSpec(shape=(None, 120), dtype=tf.int32, name=None)` where `None` is the batch size and `120` is the number of labels we're using.\n",
+"1. The image tensor(s) - `TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None)` where `(None, 224, 224, 3)` is the shape of the image tensor (`None` is the batch size, `(224, 224)` is the `IMG_SIZE` we set and `3` is the number of colour channels, as in, [red, green, blue or RGB](https://en.wikipedia.org/wiki/RGB_color_model) since our images are in colour).\n",
+"2. The label tensor(s) - `TensorSpec(shape=(None, 120), dtype=tf.float32, name=None)` where `None` is the batch size and `120` is the number of labels we're using.\n",
 "\n",
 "The batch size often appears as `None` since it's flexible and can change on the fly.\n",
 "\n",
@@ -4094,7 +4149,7 @@
 "\n",
 "These are numerical representations of our data (images and labels)!\n",
 "\n",
-"> **Note:** The shape of a tensor does not necessarily reflect the values inside a tensor.\n",
+"> **Note:** The shape of a tensor does not necessarily reflect the values inside a tensor. The shape only reflects the dimensionality of a tensor. For example, `[32, 224, 224, 3]` is a 4-dimensional tensor. Values inside a tensor can be any number (positive, negative, 0, float, integer, etc.) representing almost any kind of data.\n",
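+"\n",
+"To illustrate (a small sketch of shapes vs. values):\n",
+"\n",
+"```python\n",
+"import tensorflow as tf\n",
+"\n",
+"# A 4-dimensional tensor of all zeros with shape [32, 224, 224, 3]\n",
+"example_tensor = tf.zeros([32, 224, 224, 3])\n",
+"\n",
+"print(example_tensor.shape)  # (32, 224, 224, 3)\n",
+"print(example_tensor.ndim)   # 4\n",
+"```\n",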
 "\n",
 "We can further inspect our data by looking at a single sample."
 ]
 },
 {
@@ -4189,7 +4244,7 @@
 "source": [
 "Woah!!\n",
 "\n",
-"We've got a numerical representation of a dog image!\n",
+"We've got a numerical representation of a dog image (in the form of red, green and blue pixel values)!\n",
 "\n",
 "This is exactly the kind of format our model will want.\n",
 "\n",