Commit: vision_learner
jph00 committed Apr 24, 2022
1 parent b7f756b · commit 9f9e597
Showing 20 changed files with 78 additions and 54 deletions.
22 changes: 17 additions & 5 deletions 01_intro.ipynb
@@ -576,7 +576,7 @@
" path, get_image_files(path), valid_pct=0.2, seed=42,\n",
" label_func=is_cat, item_tfms=Resize(224))\n",
"\n",
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(1)"
]
},
@@ -1580,7 +1580,7 @@
"The fifth line of the code training our image recognizer tells fastai to create a *convolutional neural network* (CNN) and specifies what *architecture* to use (i.e. what kind of model to create), what data we want to train it on, and what *metric* to use:\n",
"\n",
"```python\n",
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"```\n",
"\n",
"Why a CNN? It's the current state-of-the-art approach to creating computer vision models. We'll be learning all about how CNNs work in this book. Their structure is inspired by how the human vision system works.\n",
@@ -1596,9 +1596,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"`cnn_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
"`vision_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
"\n",
"When using a pretrained model, `cnn_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
"When using a pretrained model, `vision_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
"\n",
"Using pretrained models is the *most* important method we have to allow us to train more accurate models, more quickly, with less data, and less time and money. You might think that would mean that using pretrained models would be the most studied area in academic deep learning... but you'd be very, very wrong! The importance of pretrained models is generally not recognized or discussed in most courses, books, or software library features, and is rarely considered in academic papers. As we write this at the start of 2020, things are just starting to change, but it's likely to take a while. So be careful: most people you speak to will probably greatly underestimate what you can do in deep learning with few resources, because they probably won't deeply understand how to use pretrained models.\n",
"\n",
@@ -2914,9 +2914,21 @@
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
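The 01_intro change above amounts to the following usage. A minimal, runnable sketch assuming fastai 2.6 or later (where `cnn_learner` was renamed to `vision_learner`; the old name remains as a deprecated alias) and the Oxford-IIIT Pets download used in the chapter:

```python
from fastai.vision.all import *

# Cats vs. dogs from the Oxford-IIIT Pets dataset, as in chapter 1
path = untar_data(URLs.PETS)/'images'

def is_cat(x):
    # In this dataset, cat filenames start with an uppercase letter
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# pretrained=True is the default: ImageNet weights are downloaded and the
# final layer(s) are replaced with a randomly initialized head sized for
# the two classes in dls.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)  # one frozen epoch on the head, then unfreeze and train the whole model
```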
2 changes: 1 addition & 1 deletion 02_production.ipynb
@@ -1063,7 +1063,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet18, metrics=error_rate)\n",
"learn.fine_tune(4)"
]
},
4 changes: 2 additions & 2 deletions 04_mnist_basics.ipynb
@@ -4984,7 +4984,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a `Learner` without using an application (such as `cnn_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
"To create a `Learner` without using an application (such as `vision_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
]
},
{
@@ -5706,7 +5706,7 @@
],
"source": [
"dls = ImageDataLoaders.from_folder(path)\n",
"learn = cnn_learner(dls, resnet18, pretrained=False,\n",
"learn = vision_learner(dls, resnet18, pretrained=False,\n",
" loss_func=F.cross_entropy, metrics=accuracy)\n",
"learn.fit_one_cycle(1, 0.1)"
]
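The `Learner`-from-scratch point made in 04_mnist_basics can be sketched as follows. This uses a simplified, hypothetical model rather than the chapter's exact one; MNIST_SAMPLE contains only 3s and 7s, hence two output classes:

```python
from fastai.vision.all import *
from torch import nn
import torch.nn.functional as F

path = untar_data(URLs.MNIST_SAMPLE)
dls = ImageDataLoaders.from_folder(path)

# Hand-assembled Learner: DataLoaders, model, optimizer function, loss
# function and metrics are all passed in explicitly; no vision_learner involved.
simple_net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3*28*28, 30),  # images are loaded here as 3-channel 28x28 tensors
    nn.ReLU(),
    nn.Linear(30, 2))
learn = Learner(dls, simple_net, opt_func=SGD,
                loss_func=F.cross_entropy, metrics=accuracy)
learn.fit(1, lr=0.1)
```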
14 changes: 7 additions & 7 deletions 05_pet_breeds.ipynb
@@ -610,7 +610,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(2)"
]
},
@@ -1774,7 +1774,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(1, base_lr=0.1)"
]
},
@@ -1821,7 +1821,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"lr_min,lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))"
]
},
@@ -1927,7 +1927,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(2, base_lr=3e-3)"
]
},
@@ -2053,7 +2053,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fit_one_cycle(3, 3e-3)"
]
},
@@ -2406,7 +2406,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fit_one_cycle(3, 3e-3)\n",
"learn.unfreeze()\n",
"learn.fit_one_cycle(12, lr_max=slice(1e-6,1e-4))"
@@ -2626,7 +2626,7 @@
],
"source": [
"from fastai.callback.fp16 import *\n",
"learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
"learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
"learn.fine_tune(6, freeze_epochs=3)"
]
},
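Taken together, the 05_pet_breeds changes cover the chapter's main training recipes. A condensed sketch under the new name — the `DataBlock` setup follows the chapter, and the learning rates and epoch counts are the illustrative values from the cells above:

```python
from fastai.vision.all import *
from fastai.callback.fp16 import *

path = untar_data(URLs.PETS)
pets = DataBlock(blocks=(ImageBlock, CategoryBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")

# Use the learning-rate finder, then fine-tune with a sensible base_lr
learn = vision_learner(dls, resnet34, metrics=error_rate)
lr_min, lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))
learn.fine_tune(2, base_lr=3e-3)

# Or: train the head, unfreeze, and continue with discriminative learning
# rates, in mixed precision for speed on a recent GPU
learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fit_one_cycle(3, 3e-3)
learn.unfreeze()
learn.fit_one_cycle(12, lr_max=slice(1e-6, 1e-4))
```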
12 changes: 6 additions & 6 deletions 06_multicat.ipynb
@@ -885,7 +885,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `cnn_learner` to create a `Learner`, so we can look at its activations:"
"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `vision_learner` to create a `Learner`, so we can look at its activations:"
]
},
{
@@ -894,7 +894,7 @@
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18)"
"learn = vision_learner(dls, resnet18)"
]
},
{
@@ -1225,7 +1225,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
"learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
"learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)"
]
},
@@ -1782,7 +1782,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As usual, we can use `cnn_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
"As usual, we can use `vision_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
]
},
{
@@ -1791,7 +1791,7 @@
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, y_range=(-1,1))"
"learn = vision_learner(dls, resnet18, y_range=(-1,1))"
]
},
{
@@ -1880,7 +1880,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `cnn_learner` using the `loss_func` parameter.\n",
"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `vision_learner` using the `loss_func` parameter.\n",
"\n",
"Note also that we didn't specify any metrics. That's because the MSE is already a useful metric for this task (although it's probably more interpretable after we take the square root). \n",
"\n",
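A sketch of the multi-label use of `vision_learner` discussed in 06_multicat, with the thresholded accuracy metric from the diff above. The PASCAL 2007 setup is roughly the chapter's, but simplified: `ColSplitter` stands in for the chapter's hand-written splitter function:

```python
from fastai.vision.all import *
from functools import partial
import pandas as pd

path = untar_data(URLs.PASCAL_2007)
df = pd.read_csv(path/'train.csv')

def get_x(r): return path/'train'/r['fname']
def get_y(r): return r['labels'].split(' ')

dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   splitter=ColSplitter('is_valid'),
                   get_x=get_x, get_y=get_y,
                   item_tfms=RandomResizedCrop(128, min_scale=0.35))
dls = dblock.dataloaders(df)

# Multi-label targets: fastai picks BCEWithLogitsLossFlat automatically,
# and accuracy needs an explicit threshold on the sigmoid outputs.
learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))
learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)

# For the point-regression task later in the chapter, the same call takes
# y_range=(-1,1) to bound the outputs, and MSE serves as the loss, e.g.:
#   learn = vision_learner(dls, resnet18, y_range=(-1,1))
```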
2 changes: 1 addition & 1 deletion 07_sizing_and_tta.ipynb
@@ -371,7 +371,7 @@
"\n",
"This means that when you distribute a model, you need to also distribute the statistics used for normalization, since anyone using it for inference, or transfer learning, will need to use the same statistics. By the same token, if you're using a model that someone else has trained, make sure you find out what normalization statistics they used, and match them.\n",
"\n",
"We didn't have to handle normalization in previous chapters because when using a pretrained model through `cnn_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
"We didn't have to handle normalization in previous chapters because when using a pretrained model through `vision_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
"\n",
"All our training up until now has been done at size 224. We could have begun training at a smaller size before going to that. This is called *progressive resizing*."
]
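Since 07_sizing_and_tta trains from scratch, the `Normalize` transform mentioned above has to be added by hand. A sketch roughly along the lines of the chapter's `get_dls` helper, normalizing with ImageNet statistics (for a very different image domain you could compute your own):

```python
from fastai.vision.all import *

path = untar_data(URLs.IMAGENETTE)

def get_dls(bs, size):
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       item_tfms=Resize(460),
                       batch_tfms=[*aug_transforms(size=size, min_scale=0.75),
                                   Normalize.from_stats(*imagenet_stats)])
    return dblock.dataloaders(path, bs=bs)

dls = get_dls(64, 224)

# No pretrained weights here, so no vision_learner: a plain Learner wraps an
# xresnet50 sized to the number of classes in dls.
learn = Learner(dls, xresnet50(n_out=dls.c),
                loss_func=CrossEntropyLossFlat(), metrics=accuracy)
learn.fit_one_cycle(5, 3e-3)
```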
2 changes: 1 addition & 1 deletion 08_collab.ipynb
@@ -919,7 +919,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `cnn_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `vision_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
]
},
{
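The plain-`Learner` pattern referenced in 08_collab looks roughly like this: a dot-product model along the lines of the chapter, with illustrative hyperparameters:

```python
from fastai.collab import *
import pandas as pd

path = untar_data(URLs.ML_100k)
ratings = pd.read_csv(path/'u.data', delimiter='\t', header=None,
                      names=['user', 'movie', 'rating', 'timestamp'])
dls = CollabDataLoaders.from_df(ratings, item_name='movie', bs=64)
n_users = len(dls.classes['user'])
n_movies = len(dls.classes['movie'])

class DotProduct(Module):
    "Bare-bones collaborative-filtering model: one embedding per user and per movie"
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)
    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        return (users * movies).sum(dim=1)

# No application helper such as vision_learner: the plain Learner gets our
# own model plus a loss function suited to predicting ratings.
learn = Learner(dls, DotProduct(n_users, n_movies, 50), loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)
```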
16 changes: 14 additions & 2 deletions 10_nlp.ipynb
@@ -1424,7 +1424,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `cnn_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `vision_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
]
},
{
@@ -2276,9 +2276,21 @@
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
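The freezing behaviour described in the 10_nlp cell above mirrors `vision_learner`. A compact sketch — the `DataLoaders` here come from the convenience `TextDataLoaders.from_folder` rather than the chapter's `DataBlock`, purely to keep the example short:

```python
from fastai.text.all import *

path = untar_data(URLs.IMDB)
dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)

# Like vision_learner, language_model_learner freezes the pretrained body by
# default, so this first cycle only trains the randomly initialized embeddings
# for IMDb-specific vocabulary.
learn = language_model_learner(dls_lm, AWD_LSTM, drop_mult=0.3,
                               metrics=[accuracy, Perplexity()]).to_fp16()
learn.fit_one_cycle(1, 2e-2)
learn.save('1epoch')  # fit_one_cycle doesn't checkpoint, so save intermediate results manually
```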
2 changes: 1 addition & 1 deletion 11_midlevel_data.ipynb
@@ -1209,7 +1209,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `cnn_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `vision_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
]
},
{
0 comments on commit 9f9e597