How do I choose between Tesseract and OpenCV? [closed]

python opencv computer-vision ocr tesseract

Tesseract is an OCR engine. It is used, developed and funded by Google specifically to read text from images, perform basic document segmentation, and operate on specific kinds of image input (a single word, a line, a paragraph, a page, limited dictionaries, etc.).
OpenCV, on the other hand, is a computer vision library with tools for feature extraction and data classification. You can build a simple letter segmenter and classifier with it that performs basic OCR, but the result is not a very good OCR engine. (I've written one in Python from scratch; it is really inaccurate on input that deviates from the training data.)

If you want to get a basic understanding of how hard OCR is, try OpenCV. Tesseract is for real OCR.

I am the author of the digit recognition tutorial you mentioned, and I would say it is in no way a substitute for Tesseract.

Tesseract is a really good OCR engine, maybe the best open-source OCR engine.

The tutorial you mentioned is just an attempt to show the simplest workings of OCR.

So, if you are looking to build an OCR app, I would recommend using OpenCV to preprocess the image and then applying the Tesseract engine.

The two can be complementary. See the paper on Tesseract: https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf

It highlights that "Since HP had independently-developed page layout analysis technology that was used in products, (and therefore not released for open-source) Tesseract never needed its own page layout analysis. Tesseract therefore assumes that its input is a binary image with optional polygonal text regions defined."

This type of task (page layout analysis and binarization) can be performed by OpenCV and the resulting image handed off to Tesseract. You can find sample code of this kind in the opencv_contrib repo at https://github.com/Itseez/opencv_contrib/tree/master/modules/text/samples (the samples use the Tesseract API to do the image-to-text conversion).
