anhtu-phan/transformers-text-recognition

transformers-text-recognition

Architecture of transformer-text-recognition model

This project applies a transformer to recognize text in images. The model takes an image as input and outputs the word it contains. A convolutional network extracts features from the input image, and the extracted features are then treated as the input sentence of a transformer model that is trained to translate the image into text.
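To make the "features as an input sentence" idea concrete, here is a minimal, dependency-free sketch of how a convolutional feature map could be flattened into a token sequence. The repository's actual model code is not shown on this page, so the function name `feature_map_to_sequence`, the `[channel][row][column]` layout, and the row-major reading order are all assumptions made for illustration.

```python
from typing import List

# Assumed layout of a CNN feature map: [channel][row][column].
FeatureMap = List[List[List[float]]]


def feature_map_to_sequence(fmap: FeatureMap) -> List[List[float]]:
    """Hypothetical helper: turn a C x H x W feature map into H*W tokens.

    Each spatial position becomes one "word" whose embedding is the
    vector of channel values at that position (length C).
    """
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])

    sequence = []
    # Read positions row by row, left to right, like text in an image.
    for row in range(height):
        for col in range(width):
            sequence.append([fmap[c][row][col] for c in range(channels)])
    return sequence


# Toy feature map with 2 channels, 1 row, and 3 columns.
toy = [
    [[1.0, 2.0, 3.0]],     # channel 0
    [[10.0, 20.0, 30.0]],  # channel 1
]
tokens = feature_map_to_sequence(toy)
```

In this sketch `tokens` holds three 2-dimensional vectors, one per image column, which is the shape of input a transformer encoder expects: a sequence of embeddings.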

How to run

  • Download dataset from here
  • The trained models can be downloaded from here

Install

# Python 3.7
pip install --upgrade pip
pip install -r requirements.txt

Demo

python run_demo_server.py --port PORT --model_folder FOLDER_PATH
  • PORT: port the server listens on (by default the server runs at http://localhost:9595)
  • model_folder: folder that stores the trained model

Training

python training.py --model_type MODEL_TYPE
  • model_type:
    • 1: transformer-random-trg
    • 2: transformer-no-trg
    • 3: transformer-no-decoder
    • 4: transformer-trg-same-src
    • 5: transformer
  • The trained model is saved to ./checkpoints/{model_type}.pt
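The mapping from `--model_type` to a model variant and checkpoint file can be sketched as follows. This is not the code in `training.py`: the helper names `checkpoint_path` and `parse_args` are hypothetical, and the sketch assumes that `{model_type}` in the checkpoint path is filled with the variant name (it may instead be the numeric id).

```python
import argparse
from pathlib import Path

# Numeric ids and variant names copied from the list above.
MODEL_TYPES = {
    1: "transformer-random-trg",
    2: "transformer-no-trg",
    3: "transformer-no-decoder",
    4: "transformer-trg-same-src",
    5: "transformer",
}


def checkpoint_path(model_type: int, root: str = "./checkpoints") -> Path:
    """Hypothetical helper: assumes {model_type} is the variant name."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model_type: {model_type}")
    return Path(root) / f"{MODEL_TYPES[model_type]}.pt"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse a --model_type flag restricted to the ids listed above."""
    parser = argparse.ArgumentParser(description="Train a text recognition model")
    parser.add_argument(
        "--model_type",
        type=int,
        choices=sorted(MODEL_TYPES),
        required=True,
        help="numeric id of the model variant (1-5)",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(MODEL_TYPES[args.model_type], "->", checkpoint_path(args.model_type))
```

Restricting `choices` to the known ids makes an invalid value fail at argument parsing rather than partway through training.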

Eval

python evaluate.py --model_type MODEL_TYPE
  • model_type:
    • 1: transformer-random-trg
    • 2: transformer-no-trg
    • 3: transformer-no-decoder
    • 4: transformer-trg-same-src
    • 5: transformer

About

Apply transformers to text recognition problem
