Numbers are not detected well in psenet_r50_fpnf_600e_icdar2017 #274

kbrajwani · 2021-06-10T12:04:56Z

PSENet works great on alphabetic text, but floating-point numbers get split at the comma (,) or the decimal point (.).
For example, 1,000,00.00 is detected as [ "1" , ",000" , ",00" , "00" ], so the decimal point or a comma is sometimes missed. It also sometimes misses the leading 1 in 1,000,00.00, so we only get ,000,00.00.
Can you tell me how we can resolve this issue?

cuhk-hbsun · 2021-06-11T02:17:04Z

Can you provide the test image to us? We can use it to analyze the issue.
Btw, the best solution may be to train a new model on your own dataset instead of using the one trained on icdar2017, since most annotated boxes in icdar2017 are words rather than sentences.

kbrajwani · 2021-06-11T03:17:48Z

Hey, I will try to train the model. I want to finetune psenet_r50_fpnf_600e_icdar2017 on 200 images. Do you think that is sufficient data for finetuning?

I have prepared the data according to https://mmocr.readthedocs.io/en/latest/datasets.html.
Then I am running
./tools/dist_train.sh configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2017.py /content/psenet 1
but I think this command will start training from scratch rather than finetuning. Can you tell me where I can give the model path to start finetuning?

Here is the sample of my data.

cuhk-hbsun · 2021-06-11T06:16:44Z

I think 200 images is ok for finetuning the model.
Add load_from = "/path/to/pretrained_checkpoint.pth" to the config file psenet_r50_fpnf_600e_icdar2017.py to load the pretrained model.

kbrajwani · 2021-06-15T13:29:39Z

Hi @cuhk-hbsun, can you tell me whether psenet_r50_fpnf_600e_icdar2017 was trained on synthtext? As far as I know, it was first trained on icdar2017 and then on icdar2015, so was it trained on synthtext as well?
I am asking because I have trained it on my images but it is not performing well. I created the annotations with Google Document OCR, which splits words: a date like 21/10/1997 is returned as [21, / , 10 , / , 1997], so I can't give correct annotations to the PSENet training.
Thanks
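
One possible workaround for the over-split annotations described above is to merge neighbouring OCR tokens on the same text line before writing the detection labels. The sketch below is illustrative only and is not part of MMOCR or the Google OCR API. The function names, the (x_min, y_min, x_max, y_max) box format, and the max_gap and min_overlap thresholds are all assumptions, and the tokens are assumed to arrive in reading order.

```python
from typing import List, Tuple

# Assumed formats: axis-aligned box (x_min, y_min, x_max, y_max) in pixels,
# and a token as (text, box). Real OCR output must be converted to this first.
Box = Tuple[float, float, float, float]
Token = Tuple[str, Box]


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that overlaps the other box."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return inter / shorter if shorter > 0 else 0.0


def merge_adjacent_tokens(tokens: List[Token],
                          max_gap: float = 5.0,
                          min_overlap: float = 0.5) -> List[Token]:
    """Merge consecutive tokens that sit on the same line and nearly touch.

    max_gap and min_overlap are hypothetical thresholds to tune per dataset.
    """
    merged: List[Token] = []
    for text, box in tokens:
        if merged:
            prev_text, prev_box = merged[-1]
            gap = box[0] - prev_box[2]  # horizontal distance to previous token
            if gap <= max_gap and vertical_overlap(prev_box, box) >= min_overlap:
                # Extend the previous token with the union of both boxes.
                union = (min(prev_box[0], box[0]), min(prev_box[1], box[1]),
                         max(prev_box[2], box[2]), max(prev_box[3], box[3]))
                merged[-1] = (prev_text + text, union)
                continue
        merged.append((text, box))
    return merged


if __name__ == "__main__":
    tokens = [("21", (10, 10, 30, 30)), ("/", (32, 10, 38, 30)),
              ("10", (40, 10, 60, 30)), ("/", (62, 10, 68, 30)),
              ("1997", (70, 10, 110, 30)), ("Total", (150, 10, 200, 30))]
    print(merge_adjacent_tokens(tokens))
    # [('21/10/1997', (10, 10, 110, 30)), ('Total', (150, 10, 200, 30))]
```

A fixed pixel gap is the simplest choice. Scaling max_gap with the text height would probably be more robust across font sizes, but that is left out here to keep the sketch short.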

innerlee · 2021-06-15T14:41:37Z

Please check out the model zoo page. Each model has its training configs.

kbrajwani · 2021-06-15T14:50:30Z

https://mmocr.readthedocs.io/en/latest/textdet_models.html#psenet
Yes, I have checked this page, but it doesn't mention the synthtext dataset, which is why I asked.

innerlee · 2021-06-15T15:23:02Z

The page should contain the full reproducible information. If not, then we will fix it.

kbrajwani · 2021-06-22T13:53:32Z

Hi, I have finished training on my images. Numbers are detected now, but I have one problem: single-character or two-character words, and other small words in images, are not detected. Can you tell me which parameter I can change to detect small text in an image?
Thanks

kbrajwani · 2021-06-22T15:58:48Z

One more issue: the robust_scanner model misclassifies / as 1. Also, if a number contains .000, the . is classified as C, so the result we get is c000. Can you tell me how I can correct this as well?

innerlee · 2021-06-22T16:23:51Z

The first thing to try is to check the data. Make sure there is a sufficiently large number of high-quality training samples for the bad cases.

kbrajwani · 2021-06-23T11:37:15Z

Hey, can you tell me how to convert the synthtext dataset for text detection? It is listed at https://github.com/open-mmlab/mmocr/blob/main/docs/datasets.md but the conversion steps are missing. So how can I train PSENet on synthtext?

kbrajwani · 2021-06-24T14:56:10Z

@innerlee, can you please guide me on synthtext training for text detection, and on which config I can use to train PSENet on synthtext?

innerlee · 2021-06-24T15:03:13Z

@cuhk-hbsun

kbrajwani · 2021-06-28T08:58:31Z

@cuhk-hbsun

Hey, can you tell me how to convert the synthtext dataset for text detection? It is listed at https://github.com/open-mmlab/mmocr/blob/main/docs/datasets.md but the conversion steps are missing. So how can I train PSENet on synthtext?

Can you please guide me on synthtext training for text detection, and on which config I can use to train PSENet on synthtext?

Sasidev90 · 2021-07-06T13:20:58Z

Hi,
Single-character numbers are not detected properly with 'psenet_r50_fpnf_600e_icdar2017.pth', so I am unable to extract the numeric values. I have attached a sample image for your reference. Thanks in advance.

innerlee added the community discussion label Jun 10, 2021

innerlee assigned cuhk-hbsun Jun 24, 2021

innerlee added the documentation label Jun 24, 2021

rmdmohan20 mentioned this issue Dec 1, 2021

Poor detection of words with pretrained models #636

Closed
