Code for paper "Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation"
The table below presents the expected performance on PASCAL3D (L0) for 12 object categories:
plane | bike | boat | bottle | bus | car | chair | table | mbike | sofa | train | tv | Mean | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Pi/6 | 84.8 | 88.1 | 82.5 | 91.7 | 98.7 | 99.2 | 95.9 | 88.8 | 85.6 | 97.0 | 98.0 | 90.0 | 92.3 |
Pi/18 | 59.5 | 42.8 | 54.2 | 68.7 | 94.5 | 95.9 | 70.4 | 71.8 | 33.9 | 69.9 | 88.7 | 58.7 | 72.2 |
MedErr | 8.2 | 11.6 | 9.4 | 7.1 | 3.0 | 3.1 | 6.7 | 6.3 | 13.5 | 6.5 | 3.9 | 8.4 | 6.6 |
-
Install Miniconda
-
Create and activate a new conda environment:
conda create -n myenv
conda activate myenv
- Install pytorch related packages:
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
- Install dependencies:
pip3 install -r /path/to/requirements.txt
(Needed for training and/or evaluation)
Follow the instructions in the repository of NeMo to download and preprocess PASCAL3D
Follow the instructions in OccludedPASCAL3D to generate an occluded version of PASCAL3D.
wget "http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar"
tar -xf "VOCtrainval_11-May-2012.tar"
(Needed for training and/or evaluation)
Generate rendering counterparts for train and test splits of PASCAL3D+. The script will generate rendering, silhouette, depth, and normal images for the train and test splits, located on PASCAL_(train_)NeMo/renderings/. The script takes a few hours at most to generate around 11k rendering per split.
python3 scripts/generate_pascal3d_renderings.py \
--split="all" \
--from_scratch="False" \
--object_category="car" \
--positive_type='all' \
--root_dir="/path/to/datasets" \
--downsample_rate="2" \
You can use the following command to inspect the images:
python3 scripts/visualize_pascal3d.py \
--root_dir="/path/to/datasets"
--symmetric \
--split=train \
--positive=normals
You can download trained models and encoded reference sets for all of the 12 PASCAL3D object categories from the following link: https://rdr.kuleuven.be/api/access/datafile/28679
In order to train on PASCAL3D you need to have followed all the aforementioned steps to download and preprocess PASCAL3D to generate the training and testing sets. Then simply execute the following script:
./scripts/train_pascal3d.sh
Make sure the provided paths in the script are correct!!!
To evaluate a trained model simply execute the following script after adjusting the dataset and models paths and specifying the correct object category:
./scripts/evaluate_occluded_pascal3d.sh
To get an idea how to use the pose estimator take a look at the predict_pose.py script or execute it giiven paths to the image, model weights, and reference embeddings.
python3 scripts/predict_pose.py
--image_path=/path/to/query-image
--weights_path=/path/to/trained_models/car
--refset_path=/path/to/referense-embeddings
Please cite the following paper if you find this the code useful for your research/projects.
@inproceedings{Kouros_2022_BMVC,
author = {Georgios Kouros and Shubham Shrivastava and Cédric Picron and Sushruth Nagesh and Punarjay Chakravarty and Tinne Tuytelaars},
title = {Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year = {2022},
url = {https://bmvc2022.mpi-inf.mpg.de/0026.pdf}
}