Multimodal learning

How to use this repository:

  1. Extract optical flows from the videos by running the video_preprocessing.py script (see the optical-flow sketch after this list).
  2. Train a model by running the training.py script.
  3. Evaluate the trained model by running the embeddings_cluster_explore.py script, which produces UMAP plots as well as gesture, skill, and task classification results (see the UMAP sketch after this list).
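
For reference, a minimal sketch of what step 1 conceptually involves: dense optical flow between consecutive frames, here computed with OpenCV's Farneback method. This is an illustration only, not the repository's actual video_preprocessing.py logic; the file paths and flow parameters are placeholders.

```python
# Minimal sketch: dense optical flow extraction with OpenCV (Farneback).
# Illustrative only; video_preprocessing.py in this repo may differ.
import cv2
import numpy as np

cap = cv2.VideoCapture("surgical_video.avi")  # placeholder input path
ret, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

flows = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Farneback dense optical flow between consecutive frames; returns an
    # (H, W, 2) array of per-pixel (dx, dy) displacements. Positional args:
    # prev, next, flow, pyr_scale, levels, winsize, iterations,
    # poly_n, poly_sigma, flags.
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    flows.append(flow)
    prev_gray = gray

cap.release()
np.save("optical_flows.npy", np.stack(flows))  # placeholder output path
```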
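
Similarly, a hedged sketch of how learned embeddings might be projected to 2-D with UMAP for visualization in step 3, assuming the umap-learn package and an (N, D) embedding array with per-sample gesture labels. The file names and variables are placeholders, not the output format of embeddings_cluster_explore.py.

```python
# Minimal sketch: projecting learned embeddings to 2-D with UMAP
# (umap-learn) and coloring points by gesture label. Illustrative only;
# embeddings_cluster_explore.py in this repo may differ.
import numpy as np
import umap
import matplotlib.pyplot as plt

embeddings = np.load("embeddings.npy")   # placeholder: (N, D) float array
labels = np.load("gesture_labels.npy")   # placeholder: (N,) integer labels

# Fit UMAP and reduce the embeddings to two dimensions for plotting.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=42)
proj = reducer.fit_transform(embeddings)

plt.scatter(proj[:, 0], proj[:, 1], c=labels, cmap="tab10", s=5)
plt.title("UMAP projection of learned embeddings")
plt.savefig("umap_embeddings.png", dpi=150)
```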

The paper associated with this repository can be found at https://link.springer.com/article/10.1007/s11548-021-02343-y.

The citation details are as follows:

@article{wu2021cross,
  title={Cross-modal self-supervised representation learning for gesture and skill recognition in robotic surgery},
  author={Wu, Jie Ying and Tamhane, Aniruddha and Kazanzides, Peter and Unberath, Mathias},
  journal={International Journal of Computer Assisted Radiology and Surgery},
  volume={16},
  number={5},
  pages={779--787},
  year={2021},
  publisher={Springer}
}
