How to use this repository:
- Extract optical flow from the videos by running the video_preprocessing.py script (a sketch of this step is given after this list).
- Train a model by running the training.py script.
- Evaluate the trained model by running the embeddings_cluster_explore.py script. This produces UMAP plots along with gesture, skill, and task classification results (see the clustering sketch after this list).
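For orientation, here is a minimal sketch of dense optical flow extraction, assuming OpenCV's Farneback method; the actual video_preprocessing.py may use a different flow algorithm and output format, and the file paths here are placeholders.

```python
import cv2
import numpy as np

def extract_optical_flow(video_path, out_path):
    """Compute dense optical flow between consecutive frames and save it."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError(f"Cannot read {video_path}")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    flows = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense flow: one (dx, dy) vector per pixel between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        flows.append(flow)
        prev_gray = gray
    cap.release()
    np.save(out_path, np.stack(flows))  # shape: (T-1, H, W, 2)

extract_optical_flow("video.avi", "flow.npy")  # placeholder paths
```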
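Similarly, the following sketch illustrates the kind of UMAP visualization the evaluation step produces: projecting learned embeddings to 2-D and coloring points by gesture label. The embedding and label files are hypothetical stand-ins for whatever embeddings_cluster_explore.py actually loads.

```python
import numpy as np
import matplotlib.pyplot as plt
import umap  # pip install umap-learn

embeddings = np.load("embeddings.npy")   # (N, D) learned representations (placeholder file)
labels = np.load("gesture_labels.npy")   # (N,) integer gesture ids (placeholder file)

# Project the high-dimensional embeddings onto 2-D for visualization.
proj = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)

plt.scatter(proj[:, 0], proj[:, 1], c=labels, cmap="tab10", s=5)
plt.title("UMAP of learned embeddings, colored by gesture")
plt.savefig("umap_gestures.png", dpi=150)
```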
The paper associated with this repository can be found at https://link.springer.com/article/10.1007/s11548-021-02343-y.
The citation details are as follows:
@article{wu2021cross,
  title={Cross-modal self-supervised representation learning for gesture and skill recognition in robotic surgery},
  author={Wu, Jie Ying and Tamhane, Aniruddha and Kazanzides, Peter and Unberath, Mathias},
  journal={International Journal of Computer Assisted Radiology and Surgery},
  volume={16},
  number={5},
  pages={779--787},
  year={2021},
  publisher={Springer}
}