Learn how to extract pages from a PDF as JPEG or PNG files using Python. This post walks through the steps to convert a PDF to images using Python modules.
This article uses Python's pdf2image module, which requires the Poppler PDF rendering library. Let's set up Poppler for Windows first.
1. Download Poppler from this Link.
2. Extract the downloaded binary and place the extracted folder in C:\Program Files\.
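Before moving on, it can help to confirm that the Poppler executables are where you expect them. The snippet below is a minimal sketch of such a check, not part of pdf2image: the helper name find_poppler_tool is hypothetical, and the folder name assumes the poppler-0.68.0 build used later in this post. It looks for pdftoppm, one of the Poppler tools that pdf2image calls.

```python
import shutil
from pathlib import Path


def find_poppler_tool(poppler_dir, tool="pdftoppm"):
    """Return the full path of a Poppler tool found via poppler_dir, or None.

    Hypothetical helper for this post; it is not part of pdf2image.
    """
    # shutil.which looks in the directory passed as `path`;
    # on Windows it also tries executable extensions such as .exe.
    return shutil.which(tool, path=str(Path(poppler_dir)))


if __name__ == "__main__":
    # Assumed location: adjust to match the folder you extracted.
    poppler_path = r'C:\Program Files\poppler-0.68.0\bin'
    if find_poppler_tool(poppler_path) is None:
        print(f"pdftoppm not found in {poppler_path}; check the folder name.")
    else:
        print("Poppler looks ready.")
```

If the check fails, the usual cause is pointing at the extracted folder itself rather than its bin subfolder.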
Now that the Poppler setup is complete, let's proceed with installing the pdf2image module and writing the Python code.
1. Install the pdf2image module.
pip install pdf2image
2. Import the required function from the pdf2image module.
from pdf2image import convert_from_path
3. Define the path to the folder that contains the Poppler executables.
poppler_path = r'C:\Program Files\poppler-0.68.0\bin'
4. Define the PDF file path.
pdf_path = "sample.pdf" # Replace with path of PDF File
5. Convert the PDF file to images using the convert_from_path function. It returns a list of PIL images, one per page.
images = convert_from_path(pdf_path=pdf_path, poppler_path=poppler_path)
6. Iterate over the PDF pages and save each page as a PNG image.
for count, img in enumerate(images):
    img_name = f"page_{count}.png"
    img.save(img_name, "PNG")
7. After successful execution, you should see one image for each PDF page in the current working directory (page_0.png, page_1.png, and so on, since enumerate counts from zero).
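8. Complete code for converting PDF pages to images:
from pdf2image import convert_from_path

# Folder that contains the Poppler executables
poppler_path = r'C:\Program Files\poppler-0.68.0\bin'
pdf_path = "sample.pdf"  # Replace with the path of your PDF file

# Render every page of the PDF as a PIL image
images = convert_from_path(pdf_path=pdf_path, poppler_path=poppler_path)

# Save each page as a PNG file in the current working directory
for count, img in enumerate(images):
    img_name = f"page_{count}.png"
    img.save(img_name, "PNG")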
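The steps above save every page as PNG, while the introduction also mentions JPEG output and extracting a single page. One possible way to do that is sketched below. The function names page_image_name and save_page_as_jpeg are hypothetical helpers written for this post; dpi, first_page and last_page are existing convert_from_path parameters, and the dpi value of 200 is only an assumed starting point.

```python
from pathlib import Path


def page_image_name(pdf_path, page_number, ext="jpg"):
    """Build an output name such as 'sample_page_3.jpg' (pages start at 1)."""
    return f"{Path(pdf_path).stem}_page_{page_number}.{ext}"


def save_page_as_jpeg(pdf_path, page_number, poppler_path=None, dpi=200):
    """Render one page of a PDF and save it as a JPEG; returns the file name."""
    # Imported here so the naming helper works without pdf2image installed.
    from pdf2image import convert_from_path

    # first_page and last_page are 1-based; equal values render one page.
    images = convert_from_path(
        pdf_path,
        dpi=dpi,
        first_page=page_number,
        last_page=page_number,
        poppler_path=poppler_path,
    )
    if not images:
        raise ValueError(f"Page {page_number} not found in {pdf_path}")

    out_name = page_image_name(pdf_path, page_number)
    # JPEG has no alpha channel, so make sure the image is RGB before saving.
    images[0].convert("RGB").save(out_name, "JPEG")
    return out_name


# Example call (requires pdf2image and Poppler):
# save_page_as_jpeg("sample.pdf", 1,
#                   poppler_path=r'C:\Program Files\poppler-0.68.0\bin')
```

Rendering only the requested page avoids loading the whole document into memory, which matters for large PDFs. A higher dpi gives sharper images at the cost of larger files.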