Use pytesseract OCR to recognize text from an image

python image image-processing computer-vision ocr

Here is my solution:

    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter

    im = Image.open("temp.jpg")  # the second one
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    im.save('temp2.jpg')
    text = pytesseract.image_to_string(Image.open('temp2.jpg'))
    print(text)
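To show why a median filter helps before OCR, here is a minimal pure-Python sketch of a 3x3 median on a small grayscale grid. The helper name `median_filter_3x3` and the edge handling (using only the neighbours that exist) are assumptions made for illustration; this is not Pillow's implementation.

```python
from statistics import median_low


def median_filter_3x3(pixels):
    """Apply a 3x3 median to a 2D list of grayscale values (0-255).

    Assumption for this sketch: border pixels use only the neighbours
    that exist, which may differ from how an imaging library pads edges.
    """
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for y in range(height):
        for x in range(width):
            window = [
                pixels[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value that is present in the window
            out[y][x] = median_low(window)
    return out


# A white patch with a single dark speck in the middle
noisy = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]
cleaned = median_filter_3x3(noisy)
print(cleaned)
```

The isolated dark pixel is outvoted by its white neighbours, which is the effect the filter step relies on: small specks disappear while larger shapes such as characters mostly survive.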


Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image first. The idea is to obtain a processed image where the text to extract is black on a white background. To do this, we can convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. From here, we can apply morphological operations to remove noise. Finally, we invert the image. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Run tesseract --help-psm to list the other page segmentation modes.

Here's a visualization of the image processing pipeline:

Input image

Convert to grayscale -> Gaussian blur -> Otsu's threshold

Notice the tiny specks of noise. To remove them, we can perform morphological operations

Finally we invert the image

Result from Pytesseract OCR

2HHH

Code

    import cv2
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

    # Grayscale, Gaussian blur, Otsu's threshold
    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Morph open to remove noise and invert image
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    invert = 255 - opening

    # Perform text extraction
    data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
    print(data)

    cv2.imshow('thresh', thresh)
    cv2.imshow('opening', opening)
    cv2.imshow('invert', invert)
    cv2.waitKey()
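The Otsu step in the code above picks the threshold automatically rather than using a fixed value. As a rough illustration of the idea, the following pure-Python sketch chooses the threshold that maximises the between-class variance of the pixel histogram. The function name `otsu_threshold` and the sample pixel values are made up for this example, and this is a simplified sketch, not OpenCV's implementation.

```python
def otsu_threshold(values):
    """Return the threshold t in 0..255 that maximises between-class variance.

    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical sample: dark text pixels and bright background pixels
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(pixels)

# Same convention as THRESH_BINARY_INV: dark pixels become 255, bright become 0
binary_inv = [255 if v <= t else 0 for v in pixels]
print(t, binary_inv)
```

With an inverted binary image like this, the text is white on black, which is why the pipeline applies the morphological opening first and inverts only at the end.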


I have a different pytesseract approach to share. Here it is:

    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
                                       config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    print(text)
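The config string above combines three Tesseract options: --psm 10 treats the image as a single character, --oem 3 selects the default OCR engine mode, and tessedit_char_whitelist restricts the output to the listed characters. If you want to try several combinations, a small helper can assemble the string for you. The function `build_tesseract_config` below is a hypothetical convenience for this answer, not part of pytesseract.

```python
def build_tesseract_config(psm=None, oem=None, whitelist=None):
    """Assemble a Tesseract config string from optional settings.

    Hypothetical helper: it only formats the string that would be passed
    as config= to pytesseract.image_to_string.
    """
    parts = []
    if psm is not None:
        parts.append(f"--psm {psm}")  # page segmentation mode
    if oem is not None:
        parts.append(f"--oem {oem}")  # OCR engine mode
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


digits_config = build_tesseract_config(psm=10, oem=3, whitelist="0123456789")
block_config = build_tesseract_config(psm=6)
print(digits_config)
print(block_config)
```

The resulting string can be passed as the config argument, for example config=digits_config, in place of the literal used above.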
