Code for "CCTV Latent Representations for Reducing Accident Response Time", Shafin Haque. Published in ACM Proceedings of ICDSP 2022.
Emergency Medical Services' response times are crucial to saving lives in vehicle accidents. Using deep learning to detect accidents in public camera footage as they happen, and automatically alerting authorities, could shorten these response times. However, this requires a large set of public camera footage to train on, and such data hardly exists in a usable form. Current deep learning approaches to vehicle accidents typically use first-person cameras, which do not help reduce response time because we do not have access to these cameras at all times. Public cameras such as closed-circuit television (CCTV) also capture far more street activity than private cameras. We therefore create a video dataset from live CCTV feeds, which are accessible at all times. Because the videos are unlabeled, we annotate them with metadata that gives further information about each video and supports future trend prediction. We train an unsupervised learning model on this video dataset and visualize latent space representations of the data in order to cluster different types of street activity and pinpoint vehicle accidents.
git clone https://github.com/ShafinH/CCTV-LSRV.git
cd CCTV-LSRV
Install the Python requirements for this implementation:
pip install -r requirements.txt
The video downloader requires FFmpeg and downloads hour-long videos into the designated folder.
mkdir -p scraped_data/Maryland
python downloader.py
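For readers unfamiliar with FFmpeg, the following is a minimal sketch of how an hour of a live stream could be recorded into the designated folder. It is not the code in downloader.py: the function name build_ffmpeg_command, the stream URL, and the file name camera_001 are hypothetical placeholders, and only standard FFmpeg options (-y, -i, -t, -c copy) are assumed.

```python
import shlex
from pathlib import Path


def build_ffmpeg_command(stream_url, out_dir, name, duration_s=3600):
    """Build an ffmpeg command that records `duration_s` seconds of a stream.

    Hypothetical helper for illustration; not part of the repository.
    """
    out_path = (Path(out_dir) / f"{name}.mp4").as_posix()
    return [
        "ffmpeg",
        "-y",                   # overwrite an existing output file
        "-i", stream_url,       # input: live stream URL (placeholder below)
        "-t", str(duration_s),  # stop writing after this many seconds
        "-c", "copy",           # copy the streams without re-encoding
        out_path,
    ]


cmd = build_ffmpeg_command(
    "https://example.com/camera/playlist.m3u8",  # placeholder, not a real camera
    "scraped_data/Maryland",
    "camera_001",
)
print(shlex.join(cmd))
# To actually record, FFmpeg must be installed and on the PATH:
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list and passing it to subprocess.run avoids shell quoting problems with URLs that contain special characters.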
The dataset can be modified in cctv.py for customizable training.
Pretrained checkpoints can be found in the checkpoints folder.
To train a new model, edit conv_autoencoder.py and run main.py. A results/cctv folder will be created, where the model checkpoint and results can be found.
python main.py
Experiments can be edited and run in encoder.py:
python encoder.py
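As an illustration of what clustering latent representations can look like, here is a minimal sketch using plain k-means on toy 2-D vectors. This is an assumption for illustration only: the source does not state which clustering method encoder.py uses, the vectors below are made-up stand-ins for encoder outputs, and the kmeans function and its evenly spaced initialization are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain k-means (Lloyd's algorithm) on a list of float tuples.

    Centroids start at evenly spaced points from the input list, a simple
    deterministic choice made here only to keep the example reproducible.
    """
    n = len(points)
    centroids = [points[i * n // k] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


# Toy stand-ins for latent vectors (real ones would come from the encoder).
latents = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(latents, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In practice, latent vectors have many more dimensions, and a library implementation with a more robust initialization would likely be preferable to this hand-written version.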