- Files
- Pipeline
- Pipeline In Action
- Figure: Feature Vector - Spatial Features and HLS Histogram
- Figure: Feature Vector - HOG and HLS
- Figure: Limited Sliding Window Search (Scale = 1x)
- Figure: Rolling Sum is Robust to Noise
- Figure: Rolling Sum Picks up where current frame heatmap may not
- Figure: Rolling Sum is averse to cars on other side of the Highway
- Figure: Rolling Sum smoother than current frame (See Video)
- Challenges
- Shortcomings
- Future Enhancements
- Acknowledgements & References
This video contains results and illustrations of the challenges encountered during this project:
The project was designed to be modular and reusable. Each significant independent domain gets its own class and its own file:

- `main.py` - Main test runner with the `VehicleDetection` class and test functions like `train_svc_with_color_hog_hist`, `test_sliding_window`, and `train_or_load_model`
- `utils.py` - Handy utilities like `imcompare`, `warper`, and `debug`, shared across modules
- `settings.py` - Hyperparameters and settings shared across modules
- `rolling_statistics.py` - `RollingStatistics` class to compute `moving_average` and `rolling_sum`
- `README.md` - Description of the development process (this file)
The Pipeline section below gives a high-level description of the pipeline with pointers to the implementation. The code is fairly readable and contains detailed comments that explain how it works.
Set hyperparameters and configuration in `settings.py` and run the `main.py` script as shown below. The repository includes all required files and can be used to rerun vehicle detection & tracking on a given video. Refer to the References section below for the training dataset.
$ grep 'INPUT\|OUTPUT' -Hn settings.py
settings.py:9: INPUT_VIDEOFILE = 'test_video_output.mp4'
settings.py:11: OUTPUT_DIR = 'output_images/'
$ python main.py
[test_slide_search_window:369] Processing Video: test_video.mp4
[MoviePy] >>>> Building video output_images/test_video_output.mp4
[MoviePy] Writing video output_images/test_video_output.mp4
100%|██████████████████████████████████████████████████████████████████| 39/39 [02:01<00:03, 3.05s/it]
[MoviePy] Done.
[MoviePy] >>>> Video ready: output_images/test_video_output.mp4
$ open output_images/test_video_output.mp4
- Basic Data Exploration
  - Visually scanned images from each class and selected a `Car` and a `Road` image as canonical "Vehicle" and "Non-Vehicle" class samples for data exploration. Figures in section Pipeline In Action
  - Explored color histograms and HOG features by comparing how well they separate the above classes. Figures in section Pipeline In Action
  - Validated that the "in class" and "out of class" samples have nearly equal sizes (~8.5K samples each), so the dataset is balanced and training wouldn't be too biased (a quick count of both classes is sketched below)
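A minimal sketch of that balance check, assuming the training images from the References section are unpacked under hypothetical `data/vehicles/` and `data/non-vehicles/` directories:

```python
import glob

# Hypothetical layout; point the globs at wherever the vehicle and
# non-vehicle PNGs from the References section are unpacked.
vehicle_files = glob.glob('data/vehicles/**/*.png', recursive=True)
non_vehicle_files = glob.glob('data/non-vehicles/**/*.png', recursive=True)

print('Vehicles:    ', len(vehicle_files))      # ~8.5K expected
print('Non-vehicles:', len(non_vehicle_files))  # ~8.5K expected
```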
- Feature Extraction - from "Vehicle" and "Non-Vehicle" classes in `extract_features_hog` and `single_img_features` (a general sketch follows below):
  - Image spatial features as `spatial_features`
  - Image color histograms as `hist_features`, and
  - Histogram of Oriented Gradients as `hog_features`
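The exact implementation lives in `extract_features_hog` and `single_img_features`; the sketch below only illustrates the general recipe (spatial bins + `HLS` channel histograms + per-channel HOG), with illustrative parameter values rather than the ones in `settings.py`:

```python
import cv2
import numpy as np
from skimage.feature import hog

def illustrative_features(rgb_img, spatial_size=(32, 32), hist_bins=32,
                          orient=9, pix_per_cell=8, cell_per_block=2):
    """Feature vector sketch: spatial bins + HLS histograms + HOG."""
    hls = cv2.cvtColor(rgb_img, cv2.COLOR_RGB2HLS)

    # 1. Spatial features: a small, flattened thumbnail of the image
    spatial_features = cv2.resize(hls, spatial_size).ravel()

    # 2. Color histogram features: one histogram per HLS channel
    hist_features = np.concatenate([
        np.histogram(hls[:, :, ch], bins=hist_bins, range=(0, 256))[0]
        for ch in range(3)
    ])

    # 3. HOG features computed on each HLS channel
    hog_features = np.concatenate([
        hog(hls[:, :, ch], orientations=orient,
            pixels_per_cell=(pix_per_cell, pix_per_cell),
            cells_per_block=(cell_per_block, cell_per_block),
            feature_vector=True)
        for ch in range(3)
    ])

    return np.concatenate([spatial_features, hist_features, hog_features])
```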
- Training Car Detector - with `train_or_load_model` using `LinearSVC` in `train_svc_with_color_hog_hist` (a hedged training sketch follows below)
  - Initial classifier test accuracy was 90% without HOG
  - Including HOG, plus experimentation and a careful combination of hyperparameters (`settings.py:L31-L41`), raised the accuracy to 99%
  - Save Model & Features - using `joblib`, not `pickle`, since `joblib` handles large numpy arrays a lot more efficiently
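A sketch of what that training step boils down to (feature scaling, a train/test split, a `LinearSVC` fit, and a `joblib` dump). The helper `illustrative_features` is the sketch above, not the actual code in `main.py`, and the output file name is an assumption:

```python
import numpy as np
import joblib
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def train_car_detector(car_images, notcar_images):
    """car_images / notcar_images: lists of RGB arrays for the two classes."""
    X = np.array([illustrative_features(img) for img in car_images + notcar_images],
                 dtype=np.float64)
    y = np.hstack([np.ones(len(car_images)), np.zeros(len(notcar_images))])

    # Scale features so spatial, histogram, and HOG components are comparable
    scaler = StandardScaler().fit(X)
    X_scaled = scaler.transform(X)

    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.2, random_state=42)

    svc = LinearSVC()
    svc.fit(X_train, y_train)
    print('Test accuracy:', svc.score(X_test, y_test))

    # joblib handles large numpy arrays far more efficiently than pickle
    joblib.dump({'model': svc, 'scaler': scaler}, 'svc_model.joblib')
    return svc, scaler
```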
- Vehicle Detection - `VehicleDetection` class that utilizes region-limited sliding window search, heatmaps, thresholding, labelling, and a rolling sum to eventually filter out the vehicles (a minimal sketch of this heatmap filtering follows this list).
  - `__init__` - Initializes instance variables, like those for feature extraction and sliding window search
  - Memory - Rolling Statistics: Moving Average and Rolling Sum
    - A `RollingStatistics` object with a circular queue saves the `MEMORY_SIZE` most recent frames; it leverages `Pandas` underneath.
    - The `rolling_sum` based heatmap accumulates heatmaps from the past `MEMORY_SIZE` frames and thresholds them together, eliminating one-off noisy detections. I experimented with 25+ different `MEMORY_SIZE`, `ROLLING_SUM_HEAT_THRESHOLD` combinations to arrive at a video that was smooth, avoided false positives, and was still responsive to a visible car in the video.
    - Prior to `rolling_sum` I experimented with `moving_average`, but soon realized that a literal moving average is a very strict thresholding criterion, so I decided to graduate to `rolling_sum`, which is simpler, more intuitive, more lenient, and offers finer thresholding control.
  - `sliding_window_search` - Sliding window search
    - The window size (i.e., scale) and overlap, `XY_WINDOW` and `XY_OVERLAP`, were set to `(96, 96)` and 70% respectively. A `96px` window is a fair middle ground that works well to identify cars both near and far, and 70% overlap covers enough ground to avoid missing true positives; it also improves the heat score of a successful detection. Having a single scale makes the algorithm less robust, and this could be improved. See the caching discussion further down.
    - The search is region-limited to Y = `[400, 656]` for optimization. I did not limit the X region, since in the general case it is possible to find cars in lanes to both the left and right of the autonomous car.
    - Utilizes memory, 🐛 debug, & exception 🎇 handling
  - `update_overlay` - Highlights the sliding window search area with the `identifier` and `dimensions` of each `bounding_box`
  - `heat_and_threshold` - Computes the heatmaps 🔥, labels, and bounding boxes with metrics for each labelled box
  - `rolling_sum` - Gets a rolling-sum heatmap from `memory`
  - `add_to_debugbar` - Insets debug information picture-in-picture, or rather video-in-video. Quite professional, you see! 👔
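A minimal sketch of the heatmap filtering described above, assuming a list of windows the classifier flagged as positive in the current frame. `MEMORY_SIZE` and the threshold value here are illustrative, and the real implementation keeps the buffer in the Pandas-backed `RollingStatistics` object rather than a plain `deque`:

```python
from collections import deque

import numpy as np
from scipy.ndimage import label

MEMORY_SIZE = 20                  # illustrative; the real value is tuned in settings.py
ROLLING_SUM_HEAT_THRESHOLD = 19   # illustrative

heatmap_memory = deque(maxlen=MEMORY_SIZE)   # circular buffer of recent heatmaps

def filter_detections(frame_shape, hot_windows):
    """hot_windows: [((x1, y1), (x2, y2)), ...] flagged as 'car' by the classifier."""
    # 1. Heat for the current frame: +1 for every positive window
    heatmap = np.zeros(frame_shape[:2], dtype=np.float32)
    for (x1, y1), (x2, y2) in hot_windows:
        heatmap[y1:y2, x1:x2] += 1

    # 2. Rolling sum over the last MEMORY_SIZE frames
    heatmap_memory.append(heatmap)
    rolling_sum = np.sum(np.stack(heatmap_memory), axis=0)

    # 3. Threshold the accumulated heat, then label connected regions
    rolling_sum[rolling_sum <= ROLLING_SUM_HEAT_THRESHOLD] = 0
    labels, n_cars = label(rolling_sum)

    # 4. One bounding box per labelled blob
    boxes = []
    for car_id in range(1, n_cars + 1):
        ys, xs = np.nonzero(labels == car_id)
        boxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
    return boxes
```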
It was non-trivial to choose the hyperparameters; it has primarily been a trial-and-error process. Mohan Karthik's blogpost was precise and served as a good general direction for the color histogram and HOG parameters, though I still experimented with them on my own to determine what works best for me. As mentioned earlier, just the spatial features and channel histograms yielded a classifier test accuracy of 90%. I chose the `HLS` color space as it (or `HSV`) yielded great results for lane keeping, and by some argument `HLS` is more intuitive than the `HSV` color space. I added HOG features to bump the accuracy up to 99%.
It wasn't easy to visualize why the system didn't work for a given frame of video, and using a rolling sum made things even harder. Hence I decided to add a few elements to make my life easier:
- Add insets to preview the `Current Frame Heatmap` and the `Rolling Sum Heatmap` (a sketch of the inset overlay follows this list)
- Color code the respective detections differently, with thin red from the `Current Frame Heatmap` and THICK GREEN from the `Rolling Sum Heatmap`.
- Their respective thresholds are presented on the status screen as `HeatTh: RollSum * CurFr`, 19 & 1 respectively.
- The rolling window buffer size is also displayed as `Memory`
- The current frame id is displayed on the left, e.g. `1046`
- The accuracy of the classifier in use is also presented as `Accuracy`
- Bounding box ids and sizes are displayed as `id | width x height` around each box; this will be useful in considering a weighted average (see Enhancements below)
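The inset overlay itself amounts to resizing a heatmap, color-mapping it, and pasting it into a corner of the frame. A rough OpenCV sketch (the function name and parameters here are illustrative, not the actual `add_to_debugbar` API):

```python
import cv2
import numpy as np

def inset_heatmap(frame, heatmap, scale=0.25, corner=(10, 10)):
    """Paste a color-mapped preview of `heatmap` into the top-left of `frame`."""
    h, w = frame.shape[:2]
    inset_w, inset_h = int(w * scale), int(h * scale)

    # Normalize heat values to 0-255 and apply a color map for visibility
    norm = cv2.normalize(heatmap.astype(np.float32), None, 0, 255, cv2.NORM_MINMAX)
    preview = cv2.applyColorMap(norm.astype(np.uint8), cv2.COLORMAP_HOT)
    preview = cv2.resize(preview, (inset_w, inset_h))

    # Overwrite the corner region of the frame with the preview
    x, y = corner
    frame[y:y + inset_h, x:x + inset_w] = preview
    return frame
```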
Figure: Example of a frame where a shorter bounding box needs to be merged with an adjacent bigger one
Figure: Example of a frame where the current implementation detects a long tail due to long frame memory
There was a tradeoff between a long tail and the possibility of not detecting a car at all. I chose to be conservative and err on the side of the long tail.
- Optimize the hog feature extraction by caching and reusing hog for different windows and scales.
- Try different neural-network-based vehicle detection approaches, like YOLO, instead of the `LinearSVC` SVM
- Ideas to Reduce False Positives:
- Consider Weighted Averaging of frames (on Bounding Box size, for example); Penalize small boxes, sustain large ones
- Consider frame weight expiry by incorporating a decay factor, like a half-life (λ); see the sketch after this list
- Consider using Optical Flow techniques
- Debug: Add heatmap preview window in upper Debug Bar of Video Output (done)
- Minor: Write asserts for unit tests in `rolling_statistics.py`
- Use sliders to tune thresholds (Original idea courtesy Sagar Bhokre)
- Integrate with Advanced Lane Finding Implementation
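One possible reading of the half-life idea above: instead of a flat rolling sum, weight each buffered heatmap by `0.5 ** (age / half_life)` so that old detections fade out gradually rather than dropping off a hard memory edge. Names and values below are illustrative, not existing code:

```python
import numpy as np

def decayed_heat(heatmaps, half_life=10):
    """heatmaps: list of per-frame heatmaps, newest last.

    Each frame's contribution halves every `half_life` frames of age."""
    ages = np.arange(len(heatmaps))[::-1]      # newest frame has age 0
    weights = 0.5 ** (ages / half_life)
    # Weighted sum over the frame axis -> a single decayed heatmap
    return np.tensordot(weights, np.stack(heatmaps), axes=1)
```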
- Sagar Bhokre - For constant support
- Caleb Kirksey - For motivation and company
- Mohan Karthik's Blogpost - Excellent analysis on feature extraction and hyperparameter selection
- GridSpec Reference: Subplot vs GridSpec - Subplotting made sense
- CarND-Vehicle-Detection - Udacity repository with test images and test videos
- Vehicle Training Dataset - Dataset to train the classifier
- Non-Vehicle Training Dataset - Dataset to train the classifier