
Vehicle Detection


Vehicle Detection Project

The goals / steps of this project are the following:

  • Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a Linear SVM classifier
  • Optionally, you can also apply a color transform and append binned color features, as well as histograms of color, to your HOG feature vector.
  • Note: for those first two steps don't forget to normalize your features and randomize a selection for training and testing.
  • Implement a sliding-window technique and use your trained classifier to search for vehicles in images.
  • Run your pipeline on a video stream (start with the test_video.mp4 and later implement on full project_video.mp4) and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.
  • Estimate a bounding box for vehicles detected.

Rubric Points

Here I will consider the rubric points individually and describe how I addressed each point in my implementation.


Submitted Files

1. Submission includes the following files:

Histogram of Oriented Gradients (HOG)

1. Explain how (and identify where in your code) you extracted HOG features from the training images.

The code for this step is contained in the extract_features method in features.py (line #55).

I started by reading in all the vehicle and non-vehicle images.
Here is an example of one of each of the vehicle and non-vehicle classes:

[image: example vehicle and non-vehicle training images]

I then explored different color spaces and different skimage.hog() parameters (orientations, pixels_per_cell, and cells_per_block).
I grabbed random images from each of the two classes and displayed them to get a feel for what the skimage.hog() output looks like for different color spaces and HOG parameters of orientations=6, pixels_per_cell=(8, 8) and cells_per_block=(2, 2):

[image: HOG visualizations for different color spaces with orientations=6, pixels_per_cell=(8, 8), cells_per_block=(2, 2)]
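As an illustration, here is a minimal sketch of such an exploration call. The image path is hypothetical and this is not the exact code from features.py; note that older versions of skimage spell the flag visualise instead of visualize:

```python
import matplotlib.image as mpimg
from skimage.feature import hog

# Load one training image (hypothetical path) and take a single channel.
img = mpimg.imread('vehicles/GTI_Far/image0000.png')
channel = img[:, :, 0]

# Compute HOG features plus a visualization image for inspection.
features, hog_image = hog(channel,
                          orientations=6,
                          pixels_per_cell=(8, 8),
                          cells_per_block=(2, 2),
                          visualize=True,
                          feature_vector=True)
```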

2. Explain how you settled on your final choice of HOG parameters.

I tried various combinations of parameters (orientations, pixels_per_cell, and cells_per_block):

orientations=9, pixels_per_cell=(8, 8) and cells_per_block=(2, 2): [image: HOG visualization]

I noticed that orientations=9 gives more precise gradients (than orientations=6), which makes it easier to spot a car (especially in the YCrCb color space).

Then I tried to tune pixels_per_cell:

orientations=9, pixels_per_cell=(4, 4) and cells_per_block=(2, 2): [image: HOG visualization]

orientations=9, pixels_per_cell=(16, 16) and cells_per_block=(2, 2): [image: HOG visualization]

For me, pixels_per_cell=(8, 8) produces the optimal result.

For cells_per_block I did not notice a big difference:

orientations=9, pixels_per_cell=(8, 8) and cells_per_block=(4, 4): [image: HOG visualization]

So I decided to settle on the following values:

color_space=YCrCb, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2)
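A minimal sketch of extracting HOG features with this final configuration over all three YCrCb channels (this mirrors, but is not, the actual extract_features code):

```python
import cv2
import numpy as np
from skimage.feature import hog

def hog_features_ycrcb(img_bgr):
    """Concatenate HOG features from all three YCrCb channels
    using the chosen parameters."""
    ycrcb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YCrCb)
    channel_feats = [hog(ycrcb[:, :, ch],
                         orientations=9,
                         pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2),
                         feature_vector=True)
                     for ch in range(3)]
    return np.concatenate(channel_feats)
```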

3. Describe how (and identify where in your code) you trained a classifier using your selected HOG features (and color features if you used them).

The code for training the classifier is contained in the train method in car_detect_svm.py (line #13).

First, I tried different values for hog_channel and noticed that using ALL channels gives a better result than using any single channel separately.

Then I tried including and excluding the spatial_feature and hist_feature options.

Including all of them gives the best result.
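For context, here is a minimal sketch of the usual normalize-then-split-then-fit recipe, with stand-in feature matrices; the actual train method may differ in details:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Stand-in feature matrices; in the project these come from extract_features().
car_features = np.random.rand(100, 1000)
notcar_features = np.random.rand(100, 1000)

X = np.vstack((car_features, notcar_features)).astype(np.float64)
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

# Normalize the features, then split into randomized train/test sets.
X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42)

svc = LinearSVC()
svc.fit(X_train, y_train)
print('Test accuracy:', svc.score(X_test, y_test))
```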

Sliding Window Search

1. Describe how (and identify where in your code) you implemented a sliding window search. How did you decide what scales to search and how much to overlap windows?

I implemented the idea shown in lecture #33, Multi-scale Windows.

The code is contained in the find_windows method in car_detect.py (line #74).

The result with no overlapping is shown in the image below:

[image: multi-scale windows with no overlap]

The method has several parameters, which specify:

  • n_levels - How many levels of windows should be constructed.
  • (y_start, y_stop) - Which Y coordinates should be used, respectively, for the first and final levels of windows.
  • (dx_start, dx_stop) - How many pixels along the X axis the windows should be shifted from the boundaries, respectively, for the first and final levels.
  • (x_overlap_start, x_overlap_stop) - How much the windows should overlap, respectively, for the first and final levels.
  • (start_level_win_size, end_level_win_size) - The size of the windows, respectively, for the first and final levels.

For intermediate levels, the necessary parameters are interpolated according to the level position, as in the sketch below.
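This is a hypothetical reconstruction of that interpolation idea, not the actual find_windows code; the default values are made up, and the (dx_start, dx_stop) shift is omitted for brevity:

```python
import numpy as np

def find_windows_sketch(img_width, n_levels=4,
                        y_start=400, y_stop=500,
                        start_level_win_size=64, end_level_win_size=128,
                        x_overlap_start=0.5, x_overlap_stop=0.75):
    """Interpolate window size, Y position, and X overlap between the
    first and last level, then tile each level along the X axis."""
    windows = []
    for level in range(n_levels):
        t = level / max(n_levels - 1, 1)  # 0.0 at first level, 1.0 at last
        size = int(np.interp(t, [0, 1], [start_level_win_size, end_level_win_size]))
        y = int(np.interp(t, [0, 1], [y_start, y_stop]))
        overlap = np.interp(t, [0, 1], [x_overlap_start, x_overlap_stop])
        step = max(int(size * (1 - overlap)), 1)
        for x in range(0, img_width - size + 1, step):
            windows.append(((x, y), (x + size, y + size)))
    return windows
```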

2. Show some examples of test images to demonstrate how your pipeline is working. What did you do to optimize the performance of your classifier?

Ultimately I searched on two scales using YCrCb 3-channel HOG features plus spatially binned color and histograms of color in the feature vector, which provided a nice result.
Here are some example images:

Sample1: [image: detections on test image 1]

Sample2: [image: detections on test image 2]

Sample3: [image: detections on test image 3]

To optimize the classification results, I mostly tuned the sliding-window method to get good results on the test images.

For the SVM itself, I used GridSearchCV to find the best parameters (P5-Vehicle-Detection.ipynb, cell #8).
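A minimal sketch of such a search; the parameter grid here is hypothetical (the actual grid lives in cell #8 of the notebook), and the data is a stand-in for the scaled feature matrix:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Stand-in training data; in the project this is the scaled feature matrix.
X_train = np.random.rand(200, 1000)
y_train = np.random.randint(0, 2, 200)

param_grid = {'C': [0.001, 0.01, 0.1, 1, 10]}  # hypothetical grid
search = GridSearchCV(LinearSVC(), param_grid, cv=3)
search.fit(X_train, y_train)
print('Best parameters:', search.best_params_)
```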


Video Implementation

1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (somewhat wobbly or unstable bounding boxes are ok as long as you are identifying the vehicles most of the time with minimal false positives.)

Here's a link to my video result

2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

To handle false positives and multiple detections, I used a heatmap.

The code is contained in the process_car_candidates method in car_detect.py (line #53).

The method takes the positions of the positive detections.

It then creates a heatmap from the positive detections and thresholds that map to identify vehicle positions.

Then the method uses scipy.ndimage.measurements.label() to identify individual blobs in the heatmap.

It is assumed that each blob corresponds to a vehicle.

The final step is constructing bounding boxes to cover the area of each detected blob.
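Putting those steps together, here is a minimal sketch of the heatmap filtering, assuming hot_windows is a list of ((x1, y1), (x2, y2)) positive detections; the actual process_car_candidates code may differ in details:

```python
import numpy as np
from scipy.ndimage import label  # newer home of scipy.ndimage.measurements.label()

def windows_to_bboxes(hot_windows, img_shape, threshold=2):
    """Accumulate detections into a heatmap, threshold it,
    and return one bounding box per labeled blob."""
    heat = np.zeros(img_shape[:2], dtype=np.float32)
    for (x1, y1), (x2, y2) in hot_windows:
        heat[y1:y2, x1:x2] += 1       # add heat for each positive window
    heat[heat <= threshold] = 0       # reject sparse detections as false positives
    labeled, n_cars = label(heat)     # identify individual blobs
    bboxes = []
    for car in range(1, n_cars + 1):
        ys, xs = (labeled == car).nonzero()
        bboxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
    return bboxes
```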

Here's an example result showing the whole pipeline:

Sample1: [image: heatmap and resulting bounding boxes, example 1]

Sample2: [image: heatmap and resulting bounding boxes, example 2]

Sample3: [image: heatmap and resulting bounding boxes, example 3]


Discussion

1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?

This is my first project involving feature engineering for a computer vision task.

Each step required a lot of tuning to get an appropriate result.

Even after all the work done, I'm not fully satisfied with the final result.

I tried using a CNN (cnn.py) to see how it could help, but frankly the CNN I built produced a lot of false positives and false negatives (for cars). I think that collecting more training data (especially from the test video) could improve the CNN's performance.

I found that there are many deep learning approaches that may give much better results than manual feature engineering.
