This work supports the development of OCR for the Kurdish language, focusing on Kurdish texts written in Persian-Arabic script. Kurdish OCR is currently in its early stages, and this resource can help prepare the environment for a full-fledged Kurdish OCR application. If you use this resource, please cite the related paper as follows:
Harvard:
Idrees, S. and Hassani, H., 2021. Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR. Applied Sciences, 11(20), p.9752.
BibTeX:
@article{idrees2021exploiting,
title={{Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR}},
author={Idrees, Saman and Hassani, Hossein},
journal={Applied Sciences},
volume={11},
number={20},
pages={9752},
year={2021},
publisher={{Multidisciplinary Digital Publishing Institute}}
}