How to convert PDF file to image using Python

Learn how to save the pages of a PDF as JPEG or PNG files using Python. This post walks through the steps to convert a PDF to images with Python modules.

PDF to image using Python

This article uses Python's pdf2image module, which requires the Poppler PDF rendering library. Let's start by setting up Poppler on Windows.

1. Download Poppler from this Link.

2. Extract the downloaded archive and place the extracted folder in C:\Program Files\.
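Before moving on, it can help to confirm that the extracted folder really contains the Poppler executables. The sketch below is one way to do that with the standard library only. Note that `find_poppler_bin` is a hypothetical helper name (it is not part of pdf2image), and the example folder is an assumption based on the path used later in this post.

```python
from pathlib import Path


def find_poppler_bin(install_root):
    """Return the first folder under install_root that contains pdftoppm, or None."""
    root = Path(install_root)
    if not root.is_dir():
        return None
    # pdftoppm is the Poppler tool pdf2image calls by default;
    # on Windows the file is named pdftoppm.exe.
    for name in ("pdftoppm.exe", "pdftoppm"):
        for exe in sorted(root.rglob(name)):
            if exe.is_file():
                return exe.parent
    return None


if __name__ == "__main__":
    # Assumed location; adjust to wherever the archive was extracted.
    print(find_poppler_bin(r"C:\Program Files\poppler-0.68.0"))
```

If the function prints a folder ending in `bin`, that folder is the value to use as the Poppler path in the code that follows. If it prints `None`, the archive was probably extracted somewhere else.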

Now that the Poppler setup is complete, let's install the pdf2image module and write the Python code.

1. Install the pdf2image module.

   

pip install pdf2image

2. Import the required function.

  
    
from pdf2image import convert_from_path

3. Define the path to the Poppler executables (the bin folder).

   

poppler_path = r'C:\Program Files\poppler-0.68.0\bin'

4. Define the PDF file path.

   

pdf_path = "sample.pdf"  # Replace with the path to your PDF file

5. Convert the PDF file to images using the convert_from_path function. It returns a list with one image per page.

  

images = convert_from_path(pdf_path=pdf_path, poppler_path=poppler_path)
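For large PDFs, rendering every page may be more than is needed. As a hedged sketch, `convert_from_path` also accepts `dpi`, `first_page`, and `last_page` arguments, which can be wrapped as shown below. The wrapper name `convert_page_range` and its input check are illustrative assumptions, not part of pdf2image.

```python
def convert_page_range(pdf_path, first_page, last_page, dpi=200, poppler_path=None):
    """Render only pages first_page..last_page (1-based, inclusive) at the given DPI."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    # Imported here so the helper can be defined before pdf2image is installed.
    from pdf2image import convert_from_path

    return convert_from_path(
        pdf_path,
        dpi=dpi,  # higher values give sharper but larger images
        first_page=first_page,
        last_page=last_page,
        poppler_path=poppler_path,
    )
```

For example, `convert_page_range("sample.pdf", 1, 3, dpi=300, poppler_path=poppler_path)` would render only the first three pages at a higher resolution.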

6. Iterate over the PDF pages and save each page as a PNG image.

   

# count starts at 0, so the first page is saved as page_0.png
for count, img in enumerate(images):
    img_name = f"page_{count}.png"
    img.save(img_name, "PNG")

7. After successful execution, you should see one image for each PDF page in the current working directory.

8. Complete code for converting PDF pages to images:

   

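from pdf2image import convert_from_path

# Folder that contains the Poppler executables
poppler_path = r'C:\Program Files\poppler-0.68.0\bin'
# Replace with the path to your PDF file
pdf_path = "sample.pdf"

# Render every page of the PDF as an image
images = convert_from_path(pdf_path=pdf_path, poppler_path=poppler_path)

# Save each page as page_0.png, page_1.png, ...
for count, img in enumerate(images):
    img_name = f"page_{count}.png"
    img.save(img_name, "PNG")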
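The loop above always writes PNG files numbered from 0. As a variation, the sketch below saves pages as either PNG or JPEG, numbers them from 1, and zero-pads the numbers so the files sort in page order. The helper name `save_pages` and its parameters are hypothetical. It assumes each item in `images` has a `save(path, format)` method, as the images returned by `convert_from_path` do, and that pages are in RGB mode (normally the case with default settings), which JPEG requires.

```python
from pathlib import Path


def save_pages(images, out_dir=".", fmt="PNG", stem="page"):
    """Save each page image as <stem>_<NNN>.<ext> and return the written paths."""
    ext = {"PNG": "png", "JPEG": "jpg"}[fmt.upper()]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    # start=1 so file numbers match the page numbers shown in a PDF viewer.
    for number, img in enumerate(images, start=1):
        path = out / f"{stem}_{number:03d}.{ext}"
        img.save(path, fmt.upper())
        paths.append(path)
    return paths
```

With the `images` list from the complete code, `save_pages(images, "output", fmt="JPEG")` would write `output/page_001.jpg`, `output/page_002.jpg`, and so on.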
