Diff9D: Diffusion-Based Domain-Generalized Category-Level 9DoF Object Pose Estimation

This is the PyTorch implementation of the paper Diff9D by J. Liu, W. Sun, H. Yang, P. Deng, C. Liu, N. Sebe, H. Rahmani, and A. Mian. Diff9D is a simple yet effective, prior-free, domain-generalized (sim-to-real) category-level 9-DoF object pose generator based on diffusion.
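At its core, a diffusion-based pose generator produces a pose by iteratively denoising a sample drawn from pure Gaussian noise. The sketch below shows a generic DDPM reverse-sampling loop for intuition only; it is not Diff9D's actual sampler, and `denoise_fn` is a stand-in for the trained noise-prediction network in the released code.

```python
import numpy as np

def ddpm_sample(denoise_fn, shape, betas, rng=None):
    """Generic DDPM reverse-sampling loop (illustrative sketch only).

    denoise_fn(x_t, t) is assumed to predict the noise added at step t;
    betas is the forward-process variance schedule.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(len(betas))):
        eps = denoise_fn(x, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x
```

For a 9-DoF pose, the sampled vector would encode rotation, translation, and size; Diff9D conditions the denoiser on observed features, which this sketch omits.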


Installation

Our code has been trained and tested with:

  • Ubuntu 20.04
  • Python 3.8.15
  • PyTorch 1.12.0
  • CUDA 11.3

For the complete installation, refer to our environment.

Datasets

Download the NOCS dataset (CAMERA_train, Real_test, gt annotations, mesh models, and segmentation results) and Wild6D (testset). For data preprocessing, refer to IST-Net. Unzip and organize these files in ../data as follows:

data
├── CAMERA
├── camera_full_depths
├── Real
├── gts
├── obj_models
├── segmentation_results
└── Wild6D

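A quick way to catch layout mistakes before training or evaluation is to check that the expected subdirectories exist. The helper below is a small sketch (not part of the released code); the directory names are taken from the tree above.

```python
from pathlib import Path

# Expected subdirectories under ../data, per the tree above.
EXPECTED = [
    "CAMERA", "camera_full_depths", "Real", "gts",
    "obj_models", "segmentation_results", "Wild6D",
]

def missing_dirs(root):
    """Return the expected dataset subdirectories not present under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]

# Example: print(missing_dirs("../data")) should report [] when set up correctly.
```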
Evaluation

You can download our pretrained model epoch_1000.pth (trained solely on the synthetic CAMERA25 dataset) and put it in the '../log1/diffusion_pose' directory. Then you can quickly evaluate on the real-world REAL275 dataset using the following command:

python test.py --config config/diffusion_pose.yaml

The real-world Wild6D dataset can be evaluated using the following command:

bash test_wild6d.sh

Note that there is a small mistake in the original NOCS evaluation code for the 3D IoU metric. We thank CATRE and SSC-6D for pointing this out. We have revised it and recalculated the metrics of some methods. The revised evaluation code is included in our released code.
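For intuition, 3D IoU measures the overlap between predicted and ground-truth bounding boxes as intersection volume over union volume. The sketch below handles only axis-aligned boxes and is purely illustrative; the actual NOCS evaluation uses oriented boxes derived from the predicted rotation, translation, and size, and the revised code should be treated as the reference.

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """3D IoU of two axis-aligned boxes, each given as (min_xyz, max_xyz).

    Illustrative sketch only: the NOCS benchmark evaluates oriented
    bounding boxes, which additionally require handling rotation.
    """
    min_a, max_a = np.asarray(box_a[0], float), np.asarray(box_a[1], float)
    min_b, max_b = np.asarray(box_b[0], float), np.asarray(box_b[1], float)
    # Intersection box; clip negative extents to zero (no overlap).
    inter_dims = np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b), 0, None)
    inter_vol = np.prod(inter_dims)
    vol_a = np.prod(max_a - min_a)
    vol_b = np.prod(max_b - min_b)
    union = vol_a + vol_b - inter_vol
    return float(inter_vol / union) if union > 0 else 0.0
```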

Training

To train the model, remember to download the synthetic CAMERA25 dataset and organize and preprocess it as described above.

train.py is the main file for training. You can start training using the following command:

python train.py --gpus 0 --config config/diffusion_pose.yaml

The complete training log has been provided.

Citation

If you find our work useful, please consider citing:

@article{Diff9D,
  author={Liu, Jian and Sun, Wei and Yang, Hui and Deng, Pengchao and Liu, Chongpei and Sebe, Nicu and Rahmani, Hossein and Mian, Ajmal},
  title={Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation},
  journal={arXiv preprint arXiv:2502.02525},
  year={2025}
}

Acknowledgment

Our implementation leverages the code from DPDN and IST-Net. We thank the authors for releasing the code.

License

This project is licensed under the terms of the MIT license.