What is optical character recognition (OCR)?

In simple terms, optical character recognition (OCR) lets a computer recognize text in images. It is also called an optical character reader.

It reads the text from a given image or document. OCR is used to recognize printed text on signboards, invoices, and cheque books, as well as text in handwritten documents and other images or documents.


Ref: Andersonarchival

This article does not go into the mathematics of how OCR works. Instead, it shows how to implement OCR in Python using the ‘EasyOCR’ library. EasyOCR is built on deep learning models for text detection and text recognition. It first detects the text present in the image or document and identifies its bounding boxes. It then recognizes the characters inside each box. For more information on how EasyOCR works, please check the documentation.

Reference: EasyOCR

Let’s take the sign image below.


We want to read the text from the above image. Let’s see how to do that with the EasyOCR library. I saved the image file as “SIGN.jpg”.

Installing and importing the ‘EasyOCR’ library

Code
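A minimal sketch of this step, assuming the package is installed from PyPI under the name `easyocr`. The helper `easyocr_available` is a hypothetical name introduced here for illustration; it only reports whether the import would succeed.

```python
# Install once from a shell (in a notebook cell, prefix the command with "!"):
#   pip install easyocr
import importlib.util


def easyocr_available() -> bool:
    """Return True if the easyocr package can be imported in this environment."""
    return importlib.util.find_spec("easyocr") is not None


if easyocr_available():
    import easyocr  # the library used in the rest of this article
else:
    print("easyocr is not installed; run: pip install easyocr")
```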


Reading the image file using the PIL library and storing the sign image in a “file” object.

Code
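One way this step might look, assuming Pillow (the maintained PIL fork) is installed. The function name `open_image` is hypothetical, and the default path is the “SIGN.jpg” file mentioned above.

```python
def open_image(path="SIGN.jpg"):
    """Open an image file with Pillow and return the PIL Image object."""
    # Imported inside the function so this snippet loads even without Pillow.
    from PIL import Image

    return Image.open(path)


# Usage (requires SIGN.jpg in the working directory):
# file = open_image("SIGN.jpg")
# print(file.size)  # (width, height) in pixels
```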

Output

The next step is to download the EasyOCR model. EasyOCR supports 40+ languages. For more information, please check the list of supported languages in the EasyOCR documentation.

Code
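A hedged sketch of creating the reader. `easyocr.Reader` takes a list of language codes and a `gpu` flag; the wrapper name `load_reader` and its defaults are assumptions made for this example.

```python
def load_reader(languages=("en",), use_gpu=False):
    """Create an EasyOCR reader for the given language codes.

    The first call downloads the detection and recognition models,
    so it needs network access and may take a while.
    """
    # Imported inside the function so this snippet loads without easyocr.
    import easyocr

    return easyocr.Reader(list(languages), gpu=use_gpu)


# Usage:
# reader = load_reader(["en"], use_gpu=True)
```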

I am running this in Google Colab. The first time the reader is created, it downloads the detection and recognition models. A GPU does not speed up the download itself, but it makes detection and recognition run faster.

‘en’ is the language code for English. If the image contains other languages, we can add their codes to the list.

With the model loaded, we now need to get the bounding regions of the text in the image.

Code

EasyOCR supports multiple hyper-parameters for the readtext method; the full list can be found in the documentation. I have used the defaults, and you can try different values. Please note that no OCR method gives one hundred percent accuracy; however, tuning the hyper-parameters can bring the result as close as possible for a given image or document.
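As an illustration, the sketch below passes a few `readtext` parameters explicitly. The values shown are, to the best of my knowledge, the library defaults, so calling it without overrides should behave like a plain `reader.readtext(path)`. The wrapper name `read_with_options` is hypothetical.

```python
def read_with_options(reader, image_path, text_threshold=0.7,
                      contrast_ths=0.1, paragraph=False, allowlist=None):
    """Run OCR on an image, exposing a few of readtext's tuning parameters.

    reader         -- an easyocr.Reader instance
    text_threshold -- minimum detection confidence for a text region
    contrast_ths   -- low-contrast boxes below this value are retried
                      with adjusted contrast
    paragraph      -- merge nearby boxes into paragraphs when True
    allowlist      -- optional string of characters the recognizer may emit
    """
    return reader.readtext(
        image_path,
        text_threshold=text_threshold,
        contrast_ths=contrast_ths,
        paragraph=paragraph,
        allowlist=allowlist,
    )


# Usage, restricting output to digits (e.g. for a numeric sign):
# bounds = read_with_options(reader, "SIGN.jpg", allowlist="0123456789")
```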

Output

From the above output, we can see the extracted text and its values. EasyOCR reads the text correctly and also returns the values associated with each piece of text, called bounding values.
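To make the shape of that output concrete: with default settings, `readtext` returns a list of `(bounding_box, text, confidence)` entries, where the bounding box is four `[x, y]` corner points. The sample values below are made up for illustration and are not the output for the sign image; `describe` is a hypothetical helper.

```python
# Hypothetical sample in the shape readtext returns by default.
sample_bounds = [
    ([[10, 12], [210, 12], [210, 60], [10, 60]], "EXAMPLE", 0.98),
    ([[14, 70], [180, 70], [180, 110], [14, 110]], "TEXT", 0.91),
]


def describe(bounds):
    """Summarize each entry as text, box position, box size, and confidence."""
    rows = []
    for bbox, text, confidence in bounds:
        xs = [point[0] for point in bbox]
        ys = [point[1] for point in bbox]
        rows.append({
            "text": text,
            "left": min(xs),
            "top": min(ys),
            "width": max(xs) - min(xs),
            "height": max(ys) - min(ys),
            "confidence": confidence,
        })
    return rows


for row in describe(sample_bounds):
    print(row)
```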

The next step is to draw a bounding box wherever text appears in the image. Below is a function for drawing bounding boxes.

Code
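A sketch of such a function, assuming Pillow's `ImageDraw` and the `(bounding_box, text, confidence)` entries returned by `readtext`. The names `box_outline` and `draw_boxes`, the colour, and the line width are choices made for this example.

```python
def box_outline(bbox):
    """Flatten four corner points into a closed polyline [x0, y0, ..., x0, y0]."""
    p0, p1, p2, p3 = bbox
    # Repeat the first corner at the end so the outline closes.
    return [float(v) for point in (p0, p1, p2, p3, p0) for v in point]


def draw_boxes(image, bounds, color="yellow", width=2):
    """Draw one outline per detected text region on a PIL image, in place."""
    # Imported inside the function so this snippet loads even without Pillow.
    from PIL import ImageDraw

    draw = ImageDraw.Draw(image)
    for entry in bounds:
        draw.line(box_outline(entry[0]), fill=color, width=width)
    return image


# Usage:
# boxed = draw_boxes(file, bounds)
# boxed.save("SIGN_boxes.jpg")  # hypothetical output file name
```

Drawing a closed line rather than a rectangle keeps the outline faithful when the detected box is slightly rotated.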

Output

From the above output, we can see that it draws the bounding boxes accurately around all the text.

Printing the text from the image

Code
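Printing the text amounts to taking the second element of each result entry. A small sketch follows; the sample data is made up, and `extract_text` is a hypothetical helper name.

```python
def extract_text(bounds):
    """Return only the recognized strings, in the order they were returned."""
    return [entry[1] for entry in bounds]


# Hypothetical results in the shape readtext returns by default.
sample_bounds = [
    ([[10, 12], [210, 12], [210, 60], [10, 60]], "EXAMPLE", 0.98),
    ([[14, 70], [180, 70], [180, 110], [14, 110]], "TEXT", 0.91),
]

for line in extract_text(sample_bounds):
    print(line)
```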

Output

Let’s try another example, this time in the Tamil language. EasyOCR supports 40+ languages; I am trying Tamil here, and you can try other languages.

I want to read the image below.

There is not much difference in the code. We need to download the Tamil model.

Code

Here, language = ‘ta’ (Tamil)
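Since only the language code changes, the whole flow can be sketched as one function parameterized by language. This is an illustrative wrapper, not the code from the original post: `ocr_lines` and its `reader_factory` argument (which lets a stand-in reader be supplied for testing) are hypothetical names.

```python
def ocr_lines(image_path, languages=("ta",), use_gpu=False, reader_factory=None):
    """Create a reader for the given languages and return the recognized strings."""
    if reader_factory is None:
        # Imported inside the function so this snippet loads without easyocr.
        import easyocr

        reader_factory = easyocr.Reader

    # Creating the reader downloads the models for these languages on first use.
    reader = reader_factory(list(languages), gpu=use_gpu)
    return [text for _bbox, text, _confidence in reader.readtext(image_path)]


# Usage (the image file name is a placeholder):
# for line in ocr_lines("tamil_sign.jpg", languages=["ta"]):
#     print(line)
```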

Code

Reading text and generating bounding boxes

Output

From the above output, we can see the extracted text and its values. It reads the text correctly.

Let’s draw the bounding boxes

Code

Output

Printing the text

Code

Output

In recent years, OCR has evolved a lot. However, it does not give 100% accuracy. The examples used here are simple ones; with other images, the result may not be fully accurate. Tuning the hyper-parameters can bring the result as close as possible for a given image or document.
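One practical way to live with imperfect accuracy is to use the confidence score that comes with each result and set aside low-confidence entries for manual review. The sketch below assumes the default `(bounding_box, text, confidence)` result shape; the threshold of 0.5, the helper name `split_by_confidence`, and the sample data are arbitrary choices for illustration.

```python
def split_by_confidence(bounds, min_confidence=0.5):
    """Split results into (kept, flagged) around a confidence threshold."""
    kept = [entry for entry in bounds if entry[2] >= min_confidence]
    flagged = [entry for entry in bounds if entry[2] < min_confidence]
    return kept, flagged


# Hypothetical results: one confident reading and one doubtful one.
sample_bounds = [
    ([[10, 12], [210, 12], [210, 60], [10, 60]], "EXAMPLE", 0.98),
    ([[14, 70], [180, 70], [180, 110], [14, 110]], "T3XT", 0.31),
]

kept, flagged = split_by_confidence(sample_bounds)
print("kept:", [entry[1] for entry in kept])
print("needs review:", [entry[1] for entry in flagged])
```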

Sources of Article

Image credit docupile
