Grounding
To perform OCR on an image, the image first has to be grounded. Grounding is the process of classifying the segments of the image that contain characters, so that the engine knows what each character looks like. The project offers several ways to ground an image: interactively, or through a non-interactive Python function.
TextGrounder().ground(imagefile, segments, text)
The TextGrounder class offers a non-interactive way of grounding an image. As arguments to the ground function, you pass an ImageFile object, the segments returned by a Segmenter, and text, a str that contains the characters shown in the segments. Please note that the number of segments must equal the number of characters in text, or a ValueError will be raised. This is the simplest way of grounding.
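The length rule above can be illustrated with a minimal stand-alone sketch. This is not the project's code: the function name pair_segments_with_text and the (x, y, width, height) segment tuples are assumptions made for illustration, and the real TextGrounder may order or store its classes differently.

```python
from typing import List, Sequence, Tuple

# Hypothetical segment representation: an (x, y, width, height) bounding box.
Segment = Tuple[int, int, int, int]


def pair_segments_with_text(
    segments: Sequence[Segment], text: str
) -> List[Tuple[Segment, str]]:
    """Pair each segment with the character it is assumed to show."""
    if len(segments) != len(text):
        # Mirrors the documented behaviour: a count mismatch is an error.
        raise ValueError(
            f"got {len(segments)} segments but {len(text)} characters"
        )
    # The i-th segment is labelled with the i-th character of the text.
    return list(zip(segments, text))


pairs = pair_segments_with_text([(0, 0, 5, 5), (6, 0, 5, 5)], "ab")
```

If the segmenter finds more or fewer segments than expected (for example because of noise in the image), the check fails early instead of silently mislabelling characters.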
TerminalGrounder().ground(imagefile, segments)
The TerminalGrounder offers an interactive shell to ground the segments. The function prints a short description of what needs to happen for the process to complete. Due to the limitations of the terminal, the TerminalGrounder cannot show any visual indication of what the segment looks like. The process exits automatically after all segments have been classified.
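As a rough picture of such a prompt loop, here is a minimal sketch. The function classify_in_terminal, its prompts, and its read/write parameters are all hypothetical and are not taken from the project; the actual TerminalGrounder may prompt and validate input differently.

```python
from typing import Callable, List, Sequence


def classify_in_terminal(
    segments: Sequence[object],
    read: Callable[[str], str] = input,
    write: Callable[[str], None] = print,
) -> List[str]:
    """Ask for one character per segment and stop when all are classified."""
    write("Type the character shown in each segment and press Enter.")
    classes: List[str] = []
    for index, _segment in enumerate(segments, start=1):
        answer = ""
        # Keep asking until exactly one character has been entered.
        while len(answer) != 1:
            answer = read(f"Segment {index}/{len(segments)}: ").strip()
        classes.append(answer)
    # Returning here corresponds to the shell exiting automatically.
    return classes
```

Passing read and write as parameters is only a design choice of this sketch; it allows the loop to be exercised without a real terminal.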
UserGrounder().ground(imagefile, segments)
The UserGrounder offers the most sophisticated grounding experience. It opens a window in which the segment to ground is highlighted, so the user can classify each segment by looking at it and pressing the matching character on the keyboard. Grounding does not quit after the last segment, but instead loops until the user presses ESC. If not all segments have been classified at that point, only the classified ones are saved.
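The loop-until-ESC behaviour can be sketched without any window at all. The function classify_until_escape below is hypothetical, and it assumes that every key press classifies the current segment and moves on to the next one, wrapping around at the end; the real UserGrounder may navigate between segments differently.

```python
from typing import Dict, Iterable

ESC = 27  # ASCII code of the Escape key


def classify_until_escape(segment_count: int, keys: Iterable[int]) -> Dict[int, str]:
    """Map segment indices to characters until ESC is seen."""
    classes: Dict[int, str] = {}
    if segment_count == 0:
        return classes
    current = 0
    for key in keys:
        if key == ESC:
            break  # the user ends the session
        classes[current] = chr(key)
        # Wrap around instead of quitting after the last segment.
        current = (current + 1) % segment_count
    # Segments that never received a key are simply absent from the result.
    return classes


partial = classify_until_escape(3, [ord("a"), ord("b"), ESC])
```

In this sketch, pressing ESC after two of three segments leaves the third segment out of the result, which matches the note that only classified segments are saved.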