TPU support #277

Open
kbrajwani opened this issue Jun 11, 2021 · 7 comments
Assignees
Labels
enhancement New feature or request tpu

Comments

@kbrajwani

Do we have tpu support for training and inference models?

@innerlee innerlee added the tpu label Jun 11, 2021
@innerlee
Contributor

Haven't tried that yet. Can TPUs run normal PyTorch training?

@kbrajwani
Author

I think it differs per model, but yes: normal PyTorch training can also run on TPU with some modifications to the training loop. We can check the differences here: https://www.kaggle.com/tanulsingh077/pytorch-xla-understanding-tpu-s-and-xla .
This is the older method. The newer way is PyTorch Lightning, which automatically identifies the device and runs the same code on GPU and TPU.
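For reference, the loop changes described in the Kaggle notebook above mostly come down to getting the device from `xm.xla_device()` and calling `xm.optimizer_step(optimizer)` instead of `optimizer.step()`, so gradients are reduced across TPU cores and the lazily built XLA graph actually executes. A minimal sketch of that switch, assuming the public `torch_xla` API (the helper and its fallback behaviour here are my own illustration, not mmocr code):

```python
import importlib.util


def make_optimizer_step(optimizer):
    """Return the right step function for the current runtime.

    On TPU, torch_xla requires xm.optimizer_step(optimizer) instead of
    optimizer.step(), so that gradients are reduced across TPU cores and
    the XLA graph is executed. When torch_xla is not installed (plain
    GPU/CPU training), fall back to the ordinary optimizer.step.
    """
    if importlib.util.find_spec("torch_xla") is not None:
        import torch_xla.core.xla_model as xm
        return lambda: xm.optimizer_step(optimizer, barrier=True)
    return optimizer.step
```

With a helper like this, the rest of the training loop can stay identical on GPU and TPU, which is essentially what PyTorch Lightning automates.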

@kbrajwani
Author

Hi @innerlee, do you have any plans to work on TPU support? It would help a lot when training models on large datasets.
Thanks

@innerlee
Contributor

For TPU support, it's better to build the infrastructure for the whole mm-ecosystem, and then mmocr's support will be ready automatically. cc @hellock, do you have plans for this?

BTW,

to train model on large dataset

how large is this? We use GPUs and train on moderately large data.

@innerlee innerlee added the enhancement New feature or request label Jun 21, 2021
@kbrajwani
Author

I am training on 10,000 train, 1,000 val, and 1,000 test images with the PSENet model for the detection part, for 600 epochs. Currently I am using the Colab ecosystem, where it estimates around 23 days on GPU. So it would be great if we could support TPU; it would be much quicker.
Thanks
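As a sanity check on those numbers (my own back-of-the-envelope arithmetic, not a benchmark), 600 epochs in 23 days over 10,000 training images works out to roughly 3 images per second, which is unusually slow for a T4 and hints at a data-loading or configuration bottleneck rather than raw GPU speed:

```python
# Back-of-the-envelope throughput from the numbers in this thread:
# 600 epochs over ~23 days, 10,000 training images per epoch.
epochs = 600
days = 23
train_images = 10_000

seconds_total = days * 24 * 3600            # 1,987,200 s
seconds_per_epoch = seconds_total / epochs  # 3,312 s, about 55 min
images_per_second = train_images / seconds_per_epoch  # about 3.0 img/s

print(f"{seconds_per_epoch / 60:.1f} min/epoch, {images_per_second:.1f} images/s")
```

If the real per-image time is much lower than this (e.g. with a larger batch size or faster input pipeline), the 23-day estimate shrinks accordingly, independent of any TPU speedup.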

@innerlee
Contributor

That is relatively small, so you may want to double-check the Colab environment.

@kbrajwani
Author

The Colab environment gives a T4 GPU with 16 GB of RAM, and it also offers a TPU v2. TPUs are much faster than GPUs, which is why I want to make use of them.
Thanks


3 participants