- Violence Detection Using YOLOv8: Towards Automated Video Surveillance and Public Safety
- Custom training dataset: Roboflow Dataset
  - Total = 2834 images
  - Train = 1969 images
  - Valid = 575 images
  - Test = 290 images
- Video dataset: Kaggle Dataset (not used, because it is the same dataset as our selected image dataset)
  - Total = 2000 videos
  - Non-violence = 1000 videos
  - Violence = 1000 videos
- Video dataset: RWF-2000: An Open Large Scale Video Database for Violence Detection
  - Total = 2000 mixed videos
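The image split listed above (1969 / 575 / 290 out of 2834) can be sanity-checked with a few lines of arithmetic. This is a minimal sketch; the helper name `split_fractions` is hypothetical and not part of the project code.

```python
def split_fractions(train, valid, test, total):
    """Return each split's share of the dataset after checking the counts add up."""
    if train + valid + test != total:
        raise ValueError("split sizes do not sum to the reported total")
    splits = (("train", train), ("valid", valid), ("test", test))
    return {name: count / total for name, count in splits}


# Counts taken from the Roboflow dataset summary above.
fractions = split_fractions(train=1969, valid=575, test=290, total=2834)
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The counts sum to the reported total, and the split works out to roughly 70% train, 20% validation, and 10% test.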
- Model used = YOLOv8s
- Number of epochs = 25
- Batch size = 16
- Total training time = 0.255 hours
- Confidence threshold = 0.25
- Prediction on videos = 10 videos
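The settings above map onto the Ultralytics Python API roughly as follows. This is a sketch under assumptions, not the project's actual training script: the `data_yaml` and `video_paths` arguments are hypothetical placeholders for the Roboflow export's `data.yaml` and the test videos, and only the epochs, batch size, and confidence threshold come from the notes.

```python
# Hyperparameters taken from the list above; everything else is an assumption.
TRAIN_ARGS = {"epochs": 25, "batch": 16}
CONF_THRESHOLD = 0.25


def train_and_predict(data_yaml, video_paths, weights="yolov8s.pt"):
    """Fine-tune YOLOv8s, then run prediction on each video path.

    data_yaml and video_paths are hypothetical inputs supplied by the caller.
    """
    # Imported lazily so this module can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # pretrained YOLOv8s checkpoint
    model.train(data=data_yaml, **TRAIN_ARGS)

    for path in video_paths:
        # stream=True yields results frame by frame instead of holding them all
        # in memory; save=True writes an annotated copy of the video.
        for _ in model.predict(source=path, conf=CONF_THRESHOLD,
                               save=True, stream=True):
            pass
    return model
```

A call would look like `train_and_predict("data.yaml", ["clip_01.mp4"])`, where both file names are illustrative.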
As advised by the supervisor, we also trained several CNN models and a YOLO-NAS model and compared them with one another.
CNN models we used:
- VGG16
- VGG19
- ResNet152V2
- InceptionV3
- MobileNetV2
- DenseNet201
CNN models:
- Number of epochs = 25
- Batch size = 32
- Loss function used = smooth_l1_loss
- Intersection over Union (IoU) is observed on the train, test, and validation sets
- Total training time:
  - VGG16 - 1640.69 seconds
  - VGG19 - 1962.71 seconds
  - InceptionV3 - 1204.79 seconds
  - MobileNetV2 - 1023.70 seconds
  - DenseNet201 - 1547.24 seconds
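The notes name the loss and the metric but do not define them. A minimal pure-Python sketch of both is given below, assuming the standard piecewise smooth L1 definition with `beta = 1.0` (the default of PyTorch's `torch.nn.functional.smooth_l1_loss`) and boxes in `(x_min, y_min, x_max, y_max)` corner format. The box format and the function names are assumptions, since the notes do not state how the CNN outputs are encoded.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 loss over two equal-length sequences of numbers."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
# One small error (quadratic branch) and one large error (linear branch).
print(smooth_l1([0.5, 2.0], [0.0, 0.0]))
```

Smooth L1 penalises small coordinate errors quadratically and large ones linearly, which makes it less sensitive to outlier boxes than a plain squared error.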
YOLO-NAS model:
- Model used = yolo_nas_s
- Number of epochs = 25
- Batch size = 16
- Caching annotation time (minutes):
  - Train dataset - 07:35
  - Valid dataset - 02:10
  - Test dataset - 01:03
- Total training time = 75.4 minutes
- Confidence threshold = 0.25
- Prediction on videos = 10 videos
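The training times reported so far use three different units (hours for YOLOv8s, seconds for the CNN models, minutes for YOLO-NAS). The sketch below converts them to minutes so they can be read side by side. The numbers are copied from the notes; the helper `to_minutes` is hypothetical. No time is listed for ResNet152V2, so it is left out rather than guessed, and the notes do not say whether all runs used the same hardware, so the comparison is only indicative.

```python
# (value, unit) pairs exactly as reported in the notes.
TRAINING_TIMES = {
    "YOLOv8s": (0.255, "hours"),
    "VGG16": (1640.69, "seconds"),
    "VGG19": (1962.71, "seconds"),
    "InceptionV3": (1204.79, "seconds"),
    "MobileNetV2": (1023.70, "seconds"),
    "DenseNet201": (1547.24, "seconds"),
    "YOLO-NAS (yolo_nas_s)": (75.4, "minutes"),
}

MINUTES_PER_UNIT = {"seconds": 1 / 60, "minutes": 1.0, "hours": 60.0}


def to_minutes(value, unit):
    """Convert a duration in seconds, minutes, or hours to minutes."""
    return value * MINUTES_PER_UNIT[unit]


# Print the models from fastest to slowest training run.
for name, (value, unit) in sorted(TRAINING_TIMES.items(),
                                  key=lambda item: to_minutes(*item[1])):
    print(f"{name:<22} {to_minutes(value, unit):6.1f} min")
```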
Selected Dataset:
- Custom training dataset: Roboflow Dataset
  - Total = 17491 images
  - Train = 15296 images
  - Valid = 1458 images
  - Test = 737 images
- Model used = YOLOv8s
- Number of epochs = 15
- Batch size = 16
- Total training time = 2.568 hours
- Confidence threshold = 0.25
- Prediction on images = 737 images
- For object tracking = ByteTrack
- For line drawing, annotation, and frame-by-frame coloring = Supervision
- Prediction and tracking on videos = 2 videos
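The combination of YOLOv8s detection, ByteTrack tracking, and Supervision annotation listed above could be wired together roughly as follows. This is a sketch under assumptions rather than the project's pipeline: the weight and video paths are hypothetical arguments, the line is assumed to be a horizontal line through the middle of the frame (the notes do not say where it is drawn or what it is used for), and annotator class names can differ between Supervision versions.

```python
def horizontal_midline(width, height):
    """Endpoints of a horizontal line across the middle of the frame (assumed placement)."""
    return (0, height // 2), (width, height // 2)


def track_video(weights_path, source_path, target_path, conf=0.25):
    """Detect, track, and annotate a video frame by frame; paths are hypothetical."""
    # Imported lazily so this module can be loaded without the packages installed.
    import supervision as sv
    from ultralytics import YOLO

    model = YOLO(weights_path)           # trained YOLOv8s weights
    tracker = sv.ByteTrack()             # assigns persistent IDs to detections
    box_annotator = sv.BoxAnnotator()    # draws the bounding boxes

    info = sv.VideoInfo.from_video_path(source_path)
    (x1, y1), (x2, y2) = horizontal_midline(info.width, info.height)
    line_zone = sv.LineZone(start=sv.Point(x1, y1), end=sv.Point(x2, y2))
    line_annotator = sv.LineZoneAnnotator()

    def callback(frame, index):
        result = model(frame, conf=conf, verbose=False)[0]
        detections = sv.Detections.from_ultralytics(result)
        detections = tracker.update_with_detections(detections)
        line_zone.trigger(detections)    # update the line-crossing counts
        annotated = box_annotator.annotate(scene=frame.copy(),
                                           detections=detections)
        return line_annotator.annotate(annotated, line_counter=line_zone)

    sv.process_video(source_path=source_path, target_path=target_path,
                     callback=callback)
    return line_zone.in_count, line_zone.out_count
```

The confidence threshold defaults to the 0.25 used in the notes; the returned pair is the number of tracked objects that crossed the line in each direction.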
- Viewed some sample images from the dataset to include in the paper.
- The Kaggle video dataset won’t be used in model testing.
- The RWF-2000 video dataset will be used in model testing.