Extract text from images using keras-ocr in Python


This post explains how to extract text from images using keras-ocr, a Python package that ships a pretrained text detector and recognizer and also provides an end-to-end training pipeline for building new OCR models.

Extracting text with keras-ocr

Let's build a keras-ocr pipeline to extract text from the two sample images used below.

1. Install `keras-ocr`

   
pip install keras-ocr

2. Import `keras_ocr` and create the pipeline, which downloads pretrained weights for the detector and recognizer

  
import keras_ocr 
pipeline = keras_ocr.pipeline.Pipeline()

3. Read the images from URLs into image objects

   
images = [
    keras_ocr.tools.read(url) for url in [
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-1.jpg',        
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-2.png'
    ]
]

4. Check that the images were loaded (each image object is a NumPy array of pixel values)

   
print(images[0])
print(images[1])

5. Run the pipeline's recognizer on the images

   
prediction_groups = pipeline.recognize(images)
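
The call returns one prediction list per input image, and each prediction is a `(text, box)` tuple in which `box` holds the four corner points of the detected word as `(x, y)` pairs. As a minimal sketch of how that structure can be unpacked, the helper below turns one box into an axis-aligned rectangle. The name `box_to_rect` and the sample coordinates are illustrative assumptions and are not part of keras-ocr.

```python
def box_to_rect(box):
    """Return (x_min, y_min, x_max, y_max) for a box given as four (x, y) corner points."""
    xs = [float(point[0]) for point in box]
    ys = [float(point[1]) for point in box]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical prediction, shaped like one entry of a prediction group.
sample_prediction = ("hello", [[10, 20], [110, 22], [109, 52], [9, 50]])
text, box = sample_prediction
print(text, box_to_rect(box))
```

With real results, the same helper could be applied to each `box` in `prediction_groups[0]`, for example to crop a word out of the image or to draw a rectangle around it.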

6. Extract text from the first image

   
predicted_image_1 = prediction_groups[0]
for text, box in predicted_image_1:
    print(text)

7. Extract text from the second image

   
predicted_image_2 = prediction_groups[1]
for text, box in predicted_image_2:
    print(text)
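
The loops in steps 6 and 7 print words in whatever order the pipeline returns them, which may not match reading order. The sketch below shows one possible way to use the otherwise unused `box` values to group words into lines and sort each line left to right. The function name `words_in_reading_order`, the `line_tolerance` parameter, and the sample data are assumptions made for illustration, and the grouping rule is a simple heuristic rather than anything keras-ocr provides.

```python
def words_in_reading_order(predictions, line_tolerance=0.5):
    """Group (text, box) predictions into lines and sort each line left to right.

    Assumption: a word joins the current line when its vertical center is
    within line_tolerance * word height of the previous word's center.
    """
    items = []
    for text, box in predictions:
        xs = [float(point[0]) for point in box]
        ys = [float(point[1]) for point in box]
        items.append({
            "text": text,
            "x": min(xs),
            "y_center": (min(ys) + max(ys)) / 2,
            "height": max(ys) - min(ys),
        })

    # Visit words from top to bottom so that lines are built in order.
    items.sort(key=lambda item: item["y_center"])

    lines = []
    for item in items:
        threshold = line_tolerance * max(item["height"], 1.0)
        if lines and abs(item["y_center"] - lines[-1][-1]["y_center"]) < threshold:
            lines[-1].append(item)
        else:
            lines.append([item])

    # Within each line, order words by their left edge.
    return [
        " ".join(word["text"] for word in sorted(line, key=lambda word: word["x"]))
        for line in lines
    ]


# Hypothetical predictions: two words on one line, one word on the next line.
sample_predictions = [
    ("world", [[120, 10], [200, 10], [200, 40], [120, 40]]),
    ("hello", [[10, 12], [100, 12], [100, 42], [10, 42]]),
    ("again", [[10, 60], [100, 60], [100, 90], [10, 90]]),
]
for line in words_in_reading_order(sample_predictions):
    print(line)
```

Passing `prediction_groups[0]` or `prediction_groups[1]` instead of the sample list should work the same way, although skewed or multi-column layouts would likely need a more careful grouping rule.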

8. Complete code snippet to extract text with `keras-ocr` in Python

   
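import keras_ocr

# Creating the pipeline downloads the pretrained detector and recognizer weights.
pipeline = keras_ocr.pipeline.Pipeline()

# Read each image from its URL into an image object.
images = [
    keras_ocr.tools.read(url) for url in [
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-1.jpg',
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-2.png'
    ]
]

# Check that both images were loaded.
print(images[0])
print(images[1])

# Each prediction group is a list of (text, box) tuples for one image.
prediction_groups = pipeline.recognize(images)

# Print the text found in the first image.
predicted_image_1 = prediction_groups[0]
for text, box in predicted_image_1:
    print(text)

# Print the text found in the second image.
predicted_image_2 = prediction_groups[1]
for text, box in predicted_image_2:
    print(text)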
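
The complete snippet handles each image with its own loop. For a larger batch, it may be more convenient to collect the recognized words per image in one pass. The following is a minimal sketch under that assumption; `texts_by_source` and the sample data are hypothetical names introduced here, not part of keras-ocr.

```python
def texts_by_source(sources, prediction_groups):
    """Map each image source (URL or path) to the list of words recognized in it."""
    if len(sources) != len(prediction_groups):
        raise ValueError("expected one prediction group per source")
    return {
        source: [text for text, _box in predictions]
        for source, predictions in zip(sources, prediction_groups)
    }


# Hypothetical sources and results, shaped like the output of pipeline.recognize.
sample_sources = ["img-1.jpg", "img-2.png"]
sample_groups = [
    [("keras", [[0, 0], [50, 0], [50, 20], [0, 20]])],
    [("ocr", [[0, 0], [30, 0], [30, 20], [0, 20]]),
     ("demo", [[40, 0], [90, 0], [90, 20], [40, 20]])],
]
for source, words in texts_by_source(sample_sources, sample_groups).items():
    print(source, words)
```

In the tutorial's setting, the list of image URLs and `prediction_groups` would take the place of the sample values.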
