Performance on TextOCR Dataset #259

Open
jkcg-learning opened this issue Jun 3, 2021 · 6 comments

@jkcg-learning

Motivation

Improve the benchmark performance of all algorithms using the TextOCR dataset released by the Facebook AI Research team.

Related resources
https://textvqa.org/textocr

Overview
TextOCR requires models to perform text recognition on arbitrarily shaped scene text in natural images. It provides ~1M high-quality word annotations on TextVQA images, which allows end-to-end reasoning on downstream tasks such as visual question answering or image captioning.

Statistics
28,134 natural images from TextVQA
903,069 annotated scene-text words
32 words per image on average
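
As a quick sanity check on the figures above, here is a minimal sketch that recomputes the average from the two counts. The variable names are only illustrative; the numbers are the ones quoted in the list.

```python
# Counts quoted in the Statistics list (from the TextOCR page)
num_images = 28_134
num_words = 903_069

# Average number of annotated words per image
words_per_image = num_words / num_images
print(f"{words_per_image:.1f} words per image on average")
```

This comes out slightly above 32, which matches the rounded figure in the list.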

@cuhk-hbsun
Collaborator

Thanks for your suggestion. We will add it to our July plan.

@jkcg-learning
Author

Team, is this under consideration for the next release?

@gaotongxiao
Collaborator

We already support the TextOCR dataset now (https://mmocr.readthedocs.io/en/latest/datasets.html).

@jkcg-learning
Author

Thanks for adding this dataset for training purposes.

Can we also expect a model checkpoint from the team that was trained specifically on this dataset?

@gaotongxiao
Collaborator

gaotongxiao commented Jul 13, 2021

Currently we only have DBNet pretrained on TextOCR. Do you have any requests for the model type and the specific datasets it should be pretrained on? We may add that to our plan if we believe it would also benefit our community.

@jkcg-learning
Author

jkcg-learning commented Jul 13, 2021

https://mmocr.readthedocs.io/en/latest/textdet_models.html#icdar2015

Is it possible to update the DBNet model zoo with the details of your model training and the evaluation metrics on the TextOCR dataset?
