This repo re-implements Faster R-CNN entirely with the MXNet Gluon API. It supports batch sizes larger than one and multi-GPU training. You can use the code to train, validate, and test object detection models.
- RPN, Fast R-CNN, Faster R-CNN with VGG16 model
- Inference and prediction in hybridize mode
- Multi-GPU and larger batch size support
- End to end training/validating/testing
- Alternate training/validating/testing
More features are under development.
Note: This repo depends on MXNet 1.2.1 or later, because the MXNet Symbol and Gluon Proposal APIs are inconsistent in earlier versions.
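The version requirement above can be checked before running any script. The following is a minimal sketch, not part of this repo: the helper names `parse_version` and `meets_minimum` are hypothetical, and only the leading numeric part of the version string is compared.

```python
import re

MIN_MXNET = (1, 2, 1)  # minimum version stated in this README


def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=MIN_MXNET):
    """True if `version` is at least `minimum`; missing components count as 0."""
    parsed = parse_version(version)
    minimum = tuple(minimum)
    width = max(len(parsed), len(minimum))
    parsed += (0,) * (width - len(parsed))
    minimum += (0,) * (width - len(minimum))
    return parsed >= minimum


# Usage with an installed MXNet (left commented so the sketch runs without it):
# import mxnet as mx
# if not meets_minimum(mx.__version__):
#     raise RuntimeError("MXNet >= 1.2.1 is required, found " + mx.__version__)
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "1.10.0" sorts before "1.2.1".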
This repo requires Python3 with the following packages:
mxnet
tqdm
EasyDict
matplotlib
opencv-python
You may also need a GPU with at least 8GB memory for training.
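To confirm that the packages listed above are importable, a small check like the one below may help. It is a sketch under stated assumptions: the mapping from pip names to import names (for example `opencv-python` to `cv2`, `EasyDict` to `easydict`) reflects the usual conventions and is not taken from this repo, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Assumed pip name -> import name mapping for the packages listed above.
REQUIRED = {
    "mxnet": "mxnet",
    "tqdm": "tqdm",
    "easydict": "easydict",
    "matplotlib": "matplotlib",
    "opencv-python": "cv2",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose top-level modules cannot be found, sorted."""
    return sorted(
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only locates the modules and does not import them, so it does not confirm that a GPU build of MXNet is installed.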
- Install all required packages for Python 3.
- Clone the Gluon Faster R-CNN repository.
git clone https://github.com/WalterMa/gluon-faster-rcnn
cd gluon-faster-rcnn
- Download the pre-trained model parameters from the release, then extract them to the ./model directory.
- Run demo_faster_rcnn.py.
python ./demo_faster_rcnn.py
Currently, this repo supports only the VOC2007/2012 datasets. You can modify it or create your own dataset by following the Gluon-CV dataset code, or generate and use a record dataset.
Note: The record dataset works only with num_workers=0, due to an MXNet issue.
- We need the following three files from Pascal VOC:

Filename | Size | SHA-1 |
---|---|---|
VOCtrainval_06-Nov-2007.tar | 439 MB | 34ed68851bce2a36e2a223fa52c661d592c66b3c |
VOCtest_06-Nov-2007.tar | 430 MB | 41a8d6e12baa5ab18ee7f8f8029b9e11805b4ef1 |
VOCtrainval_11-May-2012.tar | 1.9 GB | 4e443f8a2eca6b1dac8a6c57641b67dd40621a49 |

- Download and extract the VOC dataset to ./data/VOCdevkit/, or specify the dataset path in .utils/config.py or the related Python scripts.
- Start end-to-end training and validation:
python ./train_faster_rcnn.py
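Before extracting the Pascal VOC archives listed above, it may be worth checking them against the published SHA-1 digests. The sketch below is not part of this repo: `sha1_of_file` and `verify_archives` are hypothetical helpers, and the directory holding the downloaded `.tar` files is an assumption you should adapt.

```python
import hashlib
import os

# File names and SHA-1 digests copied from the Pascal VOC file list above.
EXPECTED_SHA1 = {
    "VOCtrainval_06-Nov-2007.tar": "34ed68851bce2a36e2a223fa52c661d592c66b3c",
    "VOCtest_06-Nov-2007.tar": "41a8d6e12baa5ab18ee7f8f8029b9e11805b4ef1",
    "VOCtrainval_11-May-2012.tar": "4e443f8a2eca6b1dac8a6c57641b67dd40621a49",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


def verify_archives(directory, expected=EXPECTED_SHA1):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    status = {}
    for name, digest in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif sha1_of_file(path) == digest:
            status[name] = "ok"
        else:
            status[name] = "mismatch"
    return status


# Usage (the download directory "./data" is an assumption):
# for name, state in verify_archives("./data").items():
#     print(name, state)
```

A "mismatch" usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before training.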
Method | Network | Training Data | Testing Data | Reference | Result |
---|---|---|---|---|---|
Faster R-CNN end-to-end | VGG16 | VOC07+12 | VOC07test | 73.2 | - |
This is a re-implementation of the original Faster R-CNN, which is based on Caffe. The arXiv paper is available here.
This repository uses code from MXNet, Faster R-CNN, MX R-CNN, MXNet SSD, and Gluon CV.