Creating a Wrapper for Tesseract is Several Times Faster Than PyTesseract
by @nuralem



Too Long; Didn't Read

The basic idea is to use Python's built-in multiprocessing features to split a document into separate pages and run multiple Tesseract engine instances that recognize the pages in parallel. Tesseract uses one core to recognize an image. In average cases that is enough, but for "heavy" documents with many pages, recognition becomes very slow. The underlying technology is called OCR (Optical Character Recognition), and Tesseract is one of the most popular OCR engines that is both free and open source.
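The idea summarized above can be sketched roughly as follows. This is a minimal illustration, not the author's actual wrapper: it assumes the document has already been split into one image file per page, that the `tesseract` command-line tool is installed and on the `PATH`, and that English (`eng`) is the language wanted. The helper names `build_command`, `recognize_page`, and `ocr_pages` are hypothetical and were chosen only for this example.

```python
import multiprocessing
import subprocess
import sys


def build_command(image_path, lang="eng"):
    """Build the CLI call; `tesseract <image> stdout -l <lang>` prints text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def recognize_page(image_path):
    """Run one Tesseract process on one page image and return the recognized text."""
    result = subprocess.run(
        build_command(image_path),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError if tesseract fails
    )
    return result.stdout


def ocr_pages(page_paths, workers=None):
    """Recognize pages in parallel; results are returned in page order."""
    page_paths = list(page_paths)
    if not page_paths:
        return []
    # Never start more worker processes than there are pages to recognize.
    workers = min(workers or multiprocessing.cpu_count(), len(page_paths))
    with multiprocessing.Pool(processes=workers) as pool:
        # Each worker handles one page at a time, so several Tesseract
        # instances run side by side on different cores.
        return pool.map(recognize_page, page_paths)


if __name__ == "__main__":
    # Assumed usage: python ocr_parallel.py page1.png page2.png ...
    pages = sys.argv[1:]
    for path, text in zip(pages, ocr_pages(pages)):
        print(f"--- {path} ---")
        print(text)
```

`pool.map` keeps the output in the same order as the input pages, so the recognized text can be joined back into a single document. The `if __name__ == "__main__":` guard matters on platforms where worker processes re-import the main module.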

Nuralem Abizov