This post explains how to extract text from images using keras-ocr. keras-ocr provides an end-to-end training pipeline to build new OCR models.
Let's build a keras-ocr pipeline to extract text from the two sample images used in the code below.
Install keras-ocr:
pip install keras-ocr
Import keras-ocr and download pretrained weights for the detector and recognizer:
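import keras_ocr

# Build the pipeline; the first run downloads pretrained weights
# for the detector and the recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Read the two sample images, here from their URLs.
images = [
    keras_ocr.tools.read(url) for url in [
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-1.jpg',
        'https://storage.googleapis.com/gcptutorials.com/examples/keras-ocr-img-2.png'
    ]
]

# Each image is loaded as an array of pixel values.
print(images[0])
print(images[1])

# Run detection and recognition; the result holds one list of
# (text, box) predictions per input image.
prediction_groups = pipeline.recognize(images)

# Print the words recognized in the first image.
predicted_image_1 = prediction_groups[0]
for text, box in predicted_image_1:
    print(text)

# Print the words recognized in the second image.
predicted_image_2 = prediction_groups[1]
for text, box in predicted_image_2:
    print(text)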
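The loops above print one word per line, in whatever order the pipeline returns them. If you want the words arranged as readable lines of text, one option is to use the box that comes with each word. The sketch below is not part of keras-ocr: `group_into_lines` and its `tolerance` parameter are hypothetical helpers, and it assumes each box is a 4x2 array of (x, y) corner points. It runs on hand-made sample data, so it does not need the pipeline or the images.

```python
import numpy as np


def group_into_lines(predictions, tolerance=0.5):
    """Group (text, box) pairs into lines, top to bottom, left to right.

    Assumption: each box is a 4x2 array of (x, y) corner points.
    A word joins the current line when its vertical center is within
    tolerance * (its own height) of the first word on that line.
    """
    items = []
    for text, box in predictions:
        pts = np.asarray(box, dtype=float)
        cx = pts[:, 0].mean()                      # horizontal center
        cy = pts[:, 1].mean()                      # vertical center
        height = pts[:, 1].max() - pts[:, 1].min()
        items.append((cy, cx, height, text))

    # Visit words from the top of the image to the bottom.
    items.sort(key=lambda item: item[0])

    lines = []
    for cy, cx, height, text in items:
        if lines and abs(cy - lines[-1]["y"]) < tolerance * height:
            lines[-1]["words"].append((cx, text))
        else:
            lines.append({"y": cy, "words": [(cx, text)]})

    # Within each line, order words from left to right.
    return [" ".join(word for _, word in sorted(line["words"]))
            for line in lines]


# Hand-made sample in the assumed (text, box) format, deliberately unordered.
sample = [
    ("world", np.array([[60, 10], [110, 10], [110, 30], [60, 30]])),
    ("hello", np.array([[5, 12], [50, 12], [50, 32], [5, 32]])),
    ("again", np.array([[5, 50], [55, 50], [55, 70], [5, 70]])),
]
print(group_into_lines(sample))  # ['hello world', 'again']
```

With real results you would call `group_into_lines(prediction_groups[0])`. The fixed tolerance is a simple heuristic; rotated or strongly skewed text would likely need a different grouping rule.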