Roadmap of MMOCR #39

Open
jeffreykuang opened this issue Apr 9, 2021 · 10 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@jeffreykuang
Collaborator

jeffreykuang commented Apr 9, 2021

We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here.

You can either:

  1. Suggest a new feature by leaving a comment.

  2. Vote for a feature request with 👍 or against it with 👎. (Remember that developers are busy and cannot respond to every feature request, so vote for the one you want most!)

  3. Tell us that you would like to help implement one of the features in the list or review the PRs. (This is the best thing to hear!)

@jeffreykuang jeffreykuang added good first issue Good for newcomers help wanted Extra attention is needed labels Apr 9, 2021
@innerlee innerlee pinned this issue Apr 9, 2021
@INF800

INF800 commented Apr 11, 2021

I think it would be a good idea to have a Colab demo or tutorial for all available features, so that developers can get familiar with the package.

@innerlee
Contributor

@rakesh4real A Colab demo is planned for the next iteration.

@huyhoang17

@rakesh4real Hi, thanks for your great library.
What do you think about integrating features such as end-to-end spotting, in which detection and recognition are merged into a single network that learns both tasks? Some related papers:

@jeffreykuang
Collaborator Author

@huyhoang17 End-to-end spotting is an important direction in OCR, and our framework can readily support end-to-end methods. We would like to reimplement them in the future. If you are interested in doing so, you are welcome to send a PR to this repo.

@seekingdeep
Contributor

  • Production Deployment: the ability to deploy easily on ARM-based devices such as the Raspberry Pi, and on CPU-only devices.
    Benefits: ordinary people can detect and recognize text in documents without coding knowledge.
    Requirements: optimize the models for inference-only environments (TensorRT, ONNX, quantization, etc.).

  • Training Documentation: provide detailed documentation on how to label images and how to train and deploy models.
    Requirements: simple YouTube videos and GitHub documentation.

@zcuncun

zcuncun commented May 26, 2021

The results do not report inference speed or memory usage. Some algorithms with huge models or complicated post-processing are very slow.
These figures are important when deploying algorithms.

@innerlee innerlee mentioned this issue Jun 7, 2021
@SWHL

SWHL commented Jun 20, 2021

I hope there will be an online demo, so that we can quickly test images and see the OCR results.

@kbrajwani

One more end-to-end text spotting model:
PGNet: https://arxiv.org/pdf/2104.05458v1.pdf

@cpwan

cpwan commented Jan 3, 2022

Hi there, I suggest adding pre-trained models for document visual question answering (VQA).
Motivation
Document VQA is an important task in OCR. It recognizes text regions and finds the relationships between them, which is useful for processing visually rich documents such as tables, forms, receipts, and invoices.
There are several families of document VQA algorithms, but they are maintained in different frameworks, which makes it difficult to compare their performance on downstream tasks.

Model         Paper                              Source framework
LayoutXLM     https://arxiv.org/abs/2104.08836   PyTorch
StructuralLM  https://arxiv.org/abs/2105.11210   TensorFlow
StrucTexT     https://arxiv.org/abs/2108.02923   PaddlePaddle

Features

  • Inference with pre-trained transformers
  • Training pipeline for downstream tasks, such as entity labeling, entity linking, document classification
  • Document datasets, such as DocVQA, FUNSD

@gaotongxiao
Collaborator

@cpwan Hi, thanks for your suggestion - that sounds really interesting! We'll definitely take this into our plan.

@gaotongxiao gaotongxiao unpinned this issue Mar 2, 2022
gaotongxiao pushed a commit to gaotongxiao/mmocr that referenced this issue Jul 15, 2022
10 participants