Commit: vision_learner
jph00 committed Apr 24, 2022
1 parent b7f756b · commit 9f9e597
Showing 20 changed files with 78 additions and 54 deletions.
22 changes: 17 additions & 5 deletions 01_intro.ipynb
@@ -576,7 +576,7 @@
" path, get_image_files(path), valid_pct=0.2, seed=42,\n",
" label_func=is_cat, item_tfms=Resize(224))\n",
"\n",
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(1)"
]
},
@@ -1580,7 +1580,7 @@
"The fifth line of the code training our image recognizer tells fastai to create a *convolutional neural network* (CNN) and specifies what *architecture* to use (i.e. what kind of model to create), what data we want to train it on, and what *metric* to use:\n",
"\n",
"```python\n",
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"```\n",
"\n",
"Why a CNN? It's the current state-of-the-art approach to creating computer vision models. We'll be learning all about how CNNs work in this book. Their structure is inspired by how the human vision system works.\n",
@@ -1596,9 +1596,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"`cnn_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
"`vision_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
"\n",
"When using a pretrained model, `cnn_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
"When using a pretrained model, `vision_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
"\n",
"Using pretrained models is the *most* important method we have to allow us to train more accurate models, more quickly, with less data, and less time and money. You might think that would mean that using pretrained models would be the most studied area in academic deep learning... but you'd be very, very wrong! The importance of pretrained models is generally not recognized or discussed in most courses, books, or software library features, and is rarely considered in academic papers. As we write this at the start of 2020, things are just starting to change, but it's likely to take a while. So be careful: most people you speak to will probably greatly underestimate what you can do in deep learning with few resources, because they probably won't deeply understand how to use pretrained models.\n",
"\n",
@@ -2914,9 +2914,21 @@
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
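The 01_intro change above amounts to the following usage. A minimal, runnable sketch assuming fastai 2.6 or later (where `cnn_learner` was renamed to `vision_learner`; the old name remains as a deprecated alias) and the Oxford-IIIT Pets download used in the chapter:

```python
from fastai.vision.all import *

# Cats vs. dogs from the Oxford-IIIT Pets dataset, as in chapter 1
path = untar_data(URLs.PETS)/'images'

def is_cat(x):
    # In this dataset, cat filenames start with an uppercase letter
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# pretrained=True is the default: ImageNet weights are downloaded and the
# final layer(s) are replaced with a randomly initialized head sized for
# the two classes in dls.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)  # one frozen epoch on the head, then unfreeze and train the whole model
```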
2 changes: 1 addition & 1 deletion 02_production.ipynb
@@ -1063,7 +1063,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet18, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet18, metrics=error_rate)\n",
"learn.fine_tune(4)"
]
},
4 changes: 2 additions & 2 deletions 04_mnist_basics.ipynb
@@ -4984,7 +4984,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a `Learner` without using an application (such as `cnn_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
"To create a `Learner` without using an application (such as `vision_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
]
},
{
@@ -5706,7 +5706,7 @@
],
"source": [
"dls = ImageDataLoaders.from_folder(path)\n",
"learn = cnn_learner(dls, resnet18, pretrained=False,\n",
"learn = vision_learner(dls, resnet18, pretrained=False,\n",
" loss_func=F.cross_entropy, metrics=accuracy)\n",
"learn.fit_one_cycle(1, 0.1)"
]
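The `Learner`-from-scratch point made in 04_mnist_basics can be sketched as follows. This uses a simplified, hypothetical model rather than the chapter's exact one; MNIST_SAMPLE contains only 3s and 7s, hence two output classes:

```python
from fastai.vision.all import *
from torch import nn
import torch.nn.functional as F

path = untar_data(URLs.MNIST_SAMPLE)
dls = ImageDataLoaders.from_folder(path)

# Hand-assembled Learner: DataLoaders, model, optimizer function, loss
# function and metrics are all passed in explicitly; no vision_learner involved.
simple_net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3*28*28, 30),  # images are loaded here as 3-channel 28x28 tensors
    nn.ReLU(),
    nn.Linear(30, 2))
learn = Learner(dls, simple_net, opt_func=SGD,
                loss_func=F.cross_entropy, metrics=accuracy)
learn.fit(1, lr=0.1)
```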
14 changes: 7 additions & 7 deletions 05_pet_breeds.ipynb
@@ -610,7 +610,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(2)"
]
},
@@ -1774,7 +1774,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(1, base_lr=0.1)"
]
},
@@ -1821,7 +1821,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"lr_min,lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))"
]
},
@@ -1927,7 +1927,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(2, base_lr=3e-3)"
]
},
@@ -2053,7 +2053,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fit_one_cycle(3, 3e-3)"
]
},
@@ -2406,7 +2406,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fit_one_cycle(3, 3e-3)\n",
"learn.unfreeze()\n",
"learn.fit_one_cycle(12, lr_max=slice(1e-6,1e-4))"
@@ -2626,7 +2626,7 @@
],
"source": [
"from fastai.callback.fp16 import *\n",
"learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
"learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
"learn.fine_tune(6, freeze_epochs=3)"
]
},
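Taken together, the 05_pet_breeds changes cover the chapter's main training recipes. A condensed sketch under the new name — the `DataBlock` setup follows the chapter, and the learning rates and epoch counts are the illustrative values from the cells above:

```python
from fastai.vision.all import *
from fastai.callback.fp16 import *

path = untar_data(URLs.PETS)
pets = DataBlock(blocks=(ImageBlock, CategoryBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path/"images")

# Use the learning-rate finder, then fine-tune with a sensible base_lr
learn = vision_learner(dls, resnet34, metrics=error_rate)
lr_min, lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))
learn.fine_tune(2, base_lr=3e-3)

# Or: train the head, unfreeze, and continue with discriminative learning
# rates, in mixed precision for speed on a recent GPU
learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()
learn.fit_one_cycle(3, 3e-3)
learn.unfreeze()
learn.fit_one_cycle(12, lr_max=slice(1e-6, 1e-4))
```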
12 changes: 6 additions & 6 deletions 06_multicat.ipynb
@@ -885,7 +885,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `cnn_learner` to create a `Learner`, so we can look at its activations:"
"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `vision_learner` to create a `Learner`, so we can look at its activations:"
]
},
{
@@ -894,7 +894,7 @@
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18)"
"learn = vision_learner(dls, resnet18)"
]
},
{
@@ -1225,7 +1225,7 @@
}
],
"source": [
"learn = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
"learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
"learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)"
]
},
@@ -1782,7 +1782,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As usual, we can use `cnn_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
"As usual, we can use `vision_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
]
},
{
@@ -1791,7 +1791,7 @@
"metadata": {},
"outputs": [],
"source": [
"learn = cnn_learner(dls, resnet18, y_range=(-1,1))"
"learn = vision_learner(dls, resnet18, y_range=(-1,1))"
]
},
{
@@ -1880,7 +1880,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `cnn_learner` using the `loss_func` parameter.\n",
"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `vision_learner` using the `loss_func` parameter.\n",
"\n",
"Note also that we didn't specify any metrics. That's because the MSE is already a useful metric for this task (although it's probably more interpretable after we take the square root). \n",
"\n",
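A sketch of the multi-label use of `vision_learner` discussed in 06_multicat, with the thresholded accuracy metric from the diff above. The PASCAL 2007 setup is roughly the chapter's, but simplified: `ColSplitter` stands in for the chapter's hand-written splitter function:

```python
from fastai.vision.all import *
from functools import partial
import pandas as pd

path = untar_data(URLs.PASCAL_2007)
df = pd.read_csv(path/'train.csv')

def get_x(r): return path/'train'/r['fname']
def get_y(r): return r['labels'].split(' ')

dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   splitter=ColSplitter('is_valid'),
                   get_x=get_x, get_y=get_y,
                   item_tfms=RandomResizedCrop(128, min_scale=0.35))
dls = dblock.dataloaders(df)

# Multi-label targets: fastai picks BCEWithLogitsLossFlat automatically,
# and accuracy needs an explicit threshold on the sigmoid outputs.
learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))
learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)

# For the point-regression task later in the chapter, the same call takes
# y_range=(-1,1) to bound the outputs, and MSE serves as the loss, e.g.:
#   learn = vision_learner(dls, resnet18, y_range=(-1,1))
```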
2 changes: 1 addition & 1 deletion 07_sizing_and_tta.ipynb
@@ -371,7 +371,7 @@
"\n",
"This means that when you distribute a model, you need to also distribute the statistics used for normalization, since anyone using it for inference, or transfer learning, will need to use the same statistics. By the same token, if you're using a model that someone else has trained, make sure you find out what normalization statistics they used, and match them.\n",
"\n",
"We didn't have to handle normalization in previous chapters because when using a pretrained model through `cnn_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
"We didn't have to handle normalization in previous chapters because when using a pretrained model through `vision_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
"\n",
"All our training up until now has been done at size 224. We could have begun training at a smaller size before going to that. This is called *progressive resizing*."
]
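Since 07_sizing_and_tta trains from scratch, the `Normalize` transform mentioned above has to be added by hand. A sketch roughly along the lines of the chapter's `get_dls` helper, normalizing with ImageNet statistics (for a very different image domain you could compute your own):

```python
from fastai.vision.all import *

path = untar_data(URLs.IMAGENETTE)

def get_dls(bs, size):
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       item_tfms=Resize(460),
                       batch_tfms=[*aug_transforms(size=size, min_scale=0.75),
                                   Normalize.from_stats(*imagenet_stats)])
    return dblock.dataloaders(path, bs=bs)

dls = get_dls(64, 224)

# No pretrained weights here, so no vision_learner: a plain Learner wraps an
# xresnet50 sized to the number of classes in dls.
learn = Learner(dls, xresnet50(n_out=dls.c),
                loss_func=CrossEntropyLossFlat(), metrics=accuracy)
learn.fit_one_cycle(5, 3e-3)
```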
2 changes: 1 addition & 1 deletion 08_collab.ipynb
@@ -919,7 +919,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `cnn_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `vision_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
]
},
{
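The plain-`Learner` pattern referenced in 08_collab looks roughly like this: a dot-product model along the lines of the chapter, with illustrative hyperparameters:

```python
from fastai.collab import *
import pandas as pd

path = untar_data(URLs.ML_100k)
ratings = pd.read_csv(path/'u.data', delimiter='\t', header=None,
                      names=['user', 'movie', 'rating', 'timestamp'])
dls = CollabDataLoaders.from_df(ratings, item_name='movie', bs=64)
n_users = len(dls.classes['user'])
n_movies = len(dls.classes['movie'])

class DotProduct(Module):
    "Bare-bones collaborative-filtering model: one embedding per user and per movie"
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)
    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        return (users * movies).sum(dim=1)

# No application helper such as vision_learner: the plain Learner gets our
# own model plus a loss function suited to predicting ratings.
learn = Learner(dls, DotProduct(n_users, n_movies, 50), loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)
```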
16 changes: 14 additions & 2 deletions 10_nlp.ipynb
@@ -1424,7 +1424,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `cnn_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `vision_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
]
},
{
@@ -2276,9 +2276,21 @@
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
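The freezing behaviour described in the 10_nlp cell above mirrors `vision_learner`. A compact sketch — the `DataLoaders` here come from the convenience `TextDataLoaders.from_folder` rather than the chapter's `DataBlock`, purely to keep the example short:

```python
from fastai.text.all import *

path = untar_data(URLs.IMDB)
dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)

# Like vision_learner, language_model_learner freezes the pretrained body by
# default, so this first cycle only trains the randomly initialized embeddings
# for IMDb-specific vocabulary.
learn = language_model_learner(dls_lm, AWD_LSTM, drop_mult=0.3,
                               metrics=[accuracy, Perplexity()]).to_fp16()
learn.fit_one_cycle(1, 2e-2)
learn.save('1epoch')  # fit_one_cycle doesn't checkpoint, so save intermediate results manually
```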
2 changes: 1 addition & 1 deletion 11_midlevel_data.ipynb
@@ -1209,7 +1209,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `cnn_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `vision_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
]
},
{
0 comments on commit 9f9e597