Performs anomaly detection in videos using neural network architectures such as a 2D convolutional auto-encoder [2] and a spatial-temporal auto-encoder [3]. The focus is on detecting frame-level anomalies in the UCSD Ped1 dataset [1, 5].
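Both models flag a frame as anomalous when the auto-encoder reconstructs it poorly, following the regularity-score idea of [2]. Below is a minimal NumPy sketch of that per-frame scoring step; the function and array names are illustrative and not taken from this repo.

```python
import numpy as np

def frame_anomaly_scores(frames, reconstructions):
    """Per-frame anomaly score = reconstruction error, min-max scaled to [0, 1].

    frames, reconstructions: arrays of shape (num_frames, height, width, channels).
    """
    # Sum of squared pixel differences for each frame.
    err = np.sum((frames - reconstructions) ** 2, axis=(1, 2, 3))
    # Normalize so that scores are comparable across videos.
    return (err - err.min()) / (err.max() - err.min() + 1e-8)
```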
- Python 2.7
- PIL
- glob
- cv2
- numpy
- matplotlib
- sklearn
- CUDA Toolkit 8+
- TensorFlow, 1.12 (?) <= tf.VERSION <= 1.13
- config/
- config.ini: contains settings for a run, such as which network to use, the learning rate, the batch size, etc.
- data.nosync/
- (empty): space for train.tfrecords, test.tfrecords and frame-level annotation files created using src/create_tfrecords.py and src/create_<dataset_name>_frame_annotation.py.
- models.nosync/
- (empty): space for saved model using TensorFlow's saver methods.
- results/
- (empty): space for log files, plots and data structures that could be useful for post processing.
- src/
- evaluation/*: space for routines used to evaluate the quality of anomaly detection (frame- and pixel-level AUCs).
- create_ped1_frame_annotation.py: creates frame-level ground-truth annotations for UCSD Ped1, used in the frame-level AUC calculation that guides training.
- create_ped2_frame_annotation.py: creates frame-level ground-truth annotations for UCSD Ped2, used in the frame-level AUC calculation that guides training.
- create_streetscene_frame_annotation.py: creates frame-level ground-truth annotations for Street Scene, used in the frame-level AUC calculation that guides training.
- conv_AE_2D.py: implements a 2D convolutional auto-encoder.
- conv_lstm_cell.py: implements a convLSTM cell to be used in an RNN. Credit: [4].
- create_tfrecords.py: creates train.tfrecords and test.tfrecords from a video anomaly detection dataset's raw frames after preprocessing (a minimal sketch of the record format appears after this list).
- data_iterator.py: implements a tf.data pipeline that feeds batches of preprocessed video clips for training and testing.
- plots.py: implements plotting functions for results from a run.
- spatial_temporal_autoencoder.py: implements a spatial-temporal auto-encoder, an RNN that places convLSTM cells between the convolutional encoder and deconvolutional decoder of a convAE (see the architecture sketch after this list).
- train.py: implements functions to run the network in training and testing modes by interacting with the data iterator and a model.
- max_unpool.py: implements the max_unpool operation in the convolutional auto-encoder. Credit: [6].
- main.py: reads the config file, starts logging, initializes the data iterator and the model builder, and performs training.
- Note: src/evaluation/compute_frame_roc_auc and src/evaluation/compute_pixel_roc_auc cannot be made available due to copyright. They are not essential to this repo; details on how to implement them can be found in [1, 5], and a minimal sklearn-based stand-in for the frame-level metric is sketched below.
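Because the original evaluation routines cannot be shared, here is an unofficial sketch of the frame-level ROC AUC using sklearn (already a dependency). It assumes one binary ground-truth label and one anomaly score per frame and is only an outline of the procedure described in [1, 5].

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def frame_level_auc(frame_labels, frame_scores):
    """Frame-level ROC AUC: a frame counts as detected if its anomaly score
    exceeds the threshold being swept.

    frame_labels: binary ground truth per frame (1 = anomalous frame).
    frame_scores: anomaly score per frame, e.g. normalized reconstruction error.
    """
    fpr, tpr, _ = roc_curve(np.asarray(frame_labels), np.asarray(frame_scores))
    return auc(fpr, tpr)
```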
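For orientation, a TF 1.x-style sketch of the kind of TFRecord writing and tf.data reading that create_tfrecords.py and data_iterator.py perform. The feature name, clip shape and dtype here are assumptions, not the repo's actual schema.

```python
import numpy as np
import tensorflow as tf

CLIP_SHAPE = (8, 227, 227, 1)  # (time, height, width, channels); illustrative values

def write_clips(clips, path):
    """Serialize preprocessed float32 clips into a .tfrecords file."""
    writer = tf.python_io.TFRecordWriter(path)
    for clip in clips:
        feature = {'clip': tf.train.Feature(bytes_list=tf.train.BytesList(
            value=[clip.astype(np.float32).tostring()]))}
        example = tf.train.Example(features=tf.train.Features(feature=feature))
        writer.write(example.SerializeToString())
    writer.close()

def parse_clip(serialized):
    """Decode one serialized example back into a float32 clip tensor."""
    parsed = tf.parse_single_example(
        serialized, features={'clip': tf.FixedLenFeature([], tf.string)})
    clip = tf.decode_raw(parsed['clip'], tf.float32)
    return tf.reshape(clip, CLIP_SHAPE)

def clip_batches(path, batch_size):
    """Return a tensor that yields one batch of clips per session run."""
    dataset = tf.data.TFRecordDataset([path]).map(parse_clip).batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()
```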
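And a much-simplified sketch of the spatial-temporal auto-encoder idea: per-frame convolutions encode appearance, a convLSTM models temporal evolution, and transposed convolutions reconstruct the frames. The repo uses the convLSTM cell from [4]; this sketch substitutes tf.contrib.rnn.ConvLSTMCell and made-up layer sizes, so treat it as an outline rather than the actual model in spatial_temporal_autoencoder.py.

```python
import tensorflow as tf

T, H, W, C = 8, 227, 227, 1  # clip length and frame size; illustrative only

def spatial_temporal_ae(clips):
    """clips: float32 tensor of shape (batch, T, H, W, C); returns reconstructions."""
    # Spatial encoder applied per frame: fold time into the batch dimension.
    x = tf.reshape(clips, [-1, H, W, C])
    x = tf.layers.conv2d(x, 128, 11, strides=4, padding='valid', activation=tf.nn.relu)
    x = tf.layers.conv2d(x, 64, 5, strides=2, padding='valid', activation=tf.nn.relu)
    feat = tf.reshape(x, [-1, T, 26, 26, 64])  # 227 -> 55 -> 26 with 'valid' padding

    # Temporal model: a single convLSTM over the encoded feature maps.
    cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2, input_shape=[26, 26, 64],
                                       output_channels=64, kernel_shape=[3, 3])
    hidden, _ = tf.nn.dynamic_rnn(cell, feat, dtype=tf.float32)

    # Spatial decoder mirrors the encoder, again applied per frame.
    y = tf.reshape(hidden, [-1, 26, 26, 64])
    y = tf.layers.conv2d_transpose(y, 128, 5, strides=2, padding='valid', activation=tf.nn.relu)
    y = tf.layers.conv2d_transpose(y, C, 11, strides=4, padding='valid', activation=tf.nn.sigmoid)
    return tf.reshape(y, [-1, T, H, W, C])
```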
- Run src/create_<dataset_name>_frame_annotation.py.
- Set DATA_DIR and EXT in config/config.ini and run src/create_tfrecords.py.
- Set all variables in config/config.ini and run main.py (a sketch of how the config might be read follows below).
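For context, a hypothetical illustration of how main.py might read config/config.ini with Python 2.7's ConfigParser. DATA_DIR and EXT come from the steps above, but the section names and the remaining option names are assumptions, not the repo's actual schema.

```python
import ConfigParser  # stdlib in Python 2.7; named configparser in Python 3

config = ConfigParser.ConfigParser()
config.read('config/config.ini')

# Assumed section/option layout, for illustration only.
data_dir = config.get('data', 'DATA_DIR')              # root directory of the raw dataset
ext = config.get('data', 'EXT')                        # frame file extension, e.g. .tif
network = config.get('model', 'NETWORK')               # which architecture to build
batch_size = config.getint('training', 'BATCH_SIZE')
learning_rate = config.getfloat('training', 'LEARNING_RATE')
```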
- Bharathkumar "Tiny" Ramachandra: tnybny at gmail dot com
- Zexi "Jay" Chen
- [1] Mahadevan, Vijay, et al. "Anomaly detection in crowded scenes." 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2010.
- [2] Hasan, Mahmudul, et al. "Learning temporal regularity in video sequences." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
- [3] Chong, Yong Shean, and Yong Haur Tay. "Abnormal event detection in videos using spatiotemporal autoencoder." International Symposium on Neural Networks. Springer, Cham, 2017.
- [4] https://github.com/carlthome/tensorflow-convlstm-cell/blob/master/cell.py
- [5] Li, Weixin, Vijay Mahadevan, and Nuno Vasconcelos. "Anomaly detection and localization in crowded scenes." IEEE Transactions on Pattern Analysis and Machine Intelligence 36.1 (2014): 18-32.
- [6] https://github.com/Pepslee