RedFantom edited this page Apr 22, 2017 · 1 revision

To perform OCR on an image, the image first has to be grounded. Grounding is the process of classifying the segments of the image that contain characters, so the engine knows what each character looks like. The project offers several ways to ground an image: interactively, or through a non-interactive Python function.

TextGrounder

TextGrounder().ground(imagefile, segments, text)

The TextGrounder class offers a non-interactive way of grounding an image. The ground function takes three arguments: an ImageFile object, the segments returned by a Segmenter, and text, a str containing the characters that appear in the segments. The number of segments must equal the number of characters, or a ValueError is raised. This is the simplest way of grounding.
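The length rule can be illustrated with a small stand-alone sketch. The function pair_segments_with_text is hypothetical and not part of the project; it only mimics the documented rule that every segment needs exactly one character. Representing segments as (x, y, width, height) tuples is also an assumption made for the example.

```python
def pair_segments_with_text(segments, text):
    """Hypothetical helper: pair each segment with its character.

    Mirrors the documented rule that the number of segments must
    equal the number of characters, otherwise ValueError is raised.
    """
    if len(segments) != len(text):
        raise ValueError(
            "got %d segments but %d characters" % (len(segments), len(text))
        )
    return list(zip(segments, text))


# Assumed segment format: (x, y, width, height)
segments = [(0, 0, 10, 12), (12, 0, 10, 12), (24, 0, 10, 12)]
pairs = pair_segments_with_text(segments, "abc")
```

Passing a text of any other length, for example "ab", raises a ValueError in this sketch, which is the same failure mode the TextGrounder documents.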

TerminalGrounder

TerminalGrounder().ground(imagefile, segments)

The TerminalGrounder offers an interactive shell to ground the segments. The function prints a short description of what needs to happen for the process to complete. Due to the limitations of the terminal, the TerminalGrounder cannot show any visual indication of what the segment looks like. The process exits automatically after all segments have been classified.
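A terminal-based grounding loop of this kind might look roughly like the sketch below. It is not the project's implementation: the function name ground_in_terminal, the prompt text, and the read and write parameters are all assumptions introduced so the example can run without a real terminal.

```python
def ground_in_terminal(segments, read=input, write=print):
    """Hypothetical sketch: ask for exactly one character per segment."""
    write("Type the character for each segment and press Enter.")
    classes = []
    for index, segment in enumerate(segments):
        answer = ""
        # Ask again until exactly one character has been entered
        while len(answer) != 1:
            answer = read("Segment %d at %r: " % (index, segment))
        classes.append(answer)
    # The loop ends by itself once every segment has a class
    return classes
```

Because the reader and writer are injectable, the loop can be exercised with canned answers instead of keyboard input.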

UserGrounder

UserGrounder().ground(imagefile, segments)

The UserGrounder offers the most sophisticated grounding experience. It opens a window in which the segment to ground is highlighted, so the user can classify each segment by looking at it and pressing the matching character on the keyboard. Grounding does not quit once every segment has been classified, but instead loops until the user presses ESC. If not all segments have been classified at that point, only the classified ones are saved.
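The rule that only classified segments are saved can be sketched as a simple filter. The helper keep_classified is hypothetical, and using None to mark a segment that has not been classified yet is an assumption, not necessarily how the project stores its state.

```python
def keep_classified(segments, classes):
    """Hypothetical helper: drop segments whose class is still None."""
    kept = [(s, c) for s, c in zip(segments, classes) if c is not None]
    kept_segments = [s for s, _ in kept]
    kept_classes = [c for _, c in kept]
    return kept_segments, kept_classes


# Assumed segment format: (x, y, width, height); the middle one was skipped
segments = [(0, 0, 10, 12), (12, 0, 10, 12), (24, 0, 10, 12)]
classes = ["a", None, "c"]
saved_segments, saved_classes = keep_classified(segments, classes)
```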
