Use pytesseract OCR to recognize text from an image


Here is my solution:

    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter

    im = Image.open("temp.jpg")  # the second one
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    im.save('temp2.jpg')
    text = pytesseract.image_to_string(Image.open('temp2.jpg'))
    print(text)
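The same preprocessing can probably be done in memory, since pytesseract.image_to_string accepts a PIL Image directly. The sketch below is one possible arrangement, not the original author's code: the names preprocess and ocr_file are made up for illustration, and it assumes the input opens as an RGB or grayscale image.

```python
from PIL import Image, ImageEnhance, ImageFilter


def preprocess(im):
    """Median-filter, boost contrast, and convert to 1-bit, as in the snippet above."""
    im = im.filter(ImageFilter.MedianFilter())   # default 3x3 window
    im = ImageEnhance.Contrast(im).enhance(2)    # contrast factor 2, same as above
    return im.convert('1')                       # 1-bit black and white


def ocr_file(path):
    """Hypothetical wrapper: run OCR on a file without writing temp2.jpg."""
    # Imported here so preprocess() can be tried without Tesseract installed.
    import pytesseract
    with Image.open(path) as im:
        return pytesseract.image_to_string(preprocess(im))
```

With Tesseract installed, print(ocr_file("temp.jpg")) should behave much like the snippet above, minus the intermediate JPEG file.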


Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image first. The idea is to obtain a processed image where the text to extract is black and the background is white. To do this, we convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. From here, we apply morphological operations to remove noise. Finally, we invert the image. We perform text extraction using the --psm 6 configuration option, which assumes a single uniform block of text. Take a look here for more options.


Here's a visualization of the image processing pipeline:

Input image

Convert to grayscale -> Gaussian blur -> Otsu's threshold

Notice the tiny specks of noise. To remove them, we can perform morphological operations

Finally, we invert the image

Result from Pytesseract OCR

2HHH

Code

    import cv2
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

    # Grayscale, Gaussian blur, Otsu's threshold
    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Morph open to remove noise and invert image
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    invert = 255 - opening

    # Perform text extraction
    data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
    print(data)

    cv2.imshow('thresh', thresh)
    cv2.imshow('opening', opening)
    cv2.imshow('invert', invert)
    cv2.waitKey()


I have a different pytesseract approach for the community. Here is my approach:

    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
                                       config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    print(text)
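For reference, the config string combines three Tesseract options: --psm 10 treats the image as a single character, --oem 3 selects the default OCR engine mode, and -c tessedit_char_whitelist=0123456789 restricts recognition to digits. A small helper can make these combinations easier to vary; tesseract_config below is a hypothetical name for illustration and is not part of pytesseract.

```python
def tesseract_config(psm, oem=None, whitelist=None):
    """Build a config string for pytesseract (hypothetical helper for illustration)."""
    parts = [f"--psm {psm}"]                    # page segmentation mode
    if oem is not None:
        parts.append(f"--oem {oem}")            # OCR engine mode
    if whitelist:
        # Restrict recognition to the given characters.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


# Same string as used above: single character, default engine, digits only.
single_digit = tesseract_config(10, oem=3, whitelist="0123456789")

# Single uniform block of text, as in the OpenCV answer.
block_of_text = tesseract_config(6)
```

The result is passed as config=single_digit to pytesseract.image_to_string. If the image holds more than one character, --psm 10 is likely the wrong mode and something like --psm 6 or --psm 7 may work better.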