How to use pandas in AWS Lambda

pandas is an open-source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The pandas library is not available by default in AWS Lambda Python environments. If you try to import pandas in an AWS Lambda function, you will get the error below.

    
  import json
  import pandas
  
  def lambda_handler(event, context):
      return {
          'statusCode': 200,
          'body': json.dumps('Hello from Lambda!')
      }
  
  # Output 
  
  Response
  {
  "errorMessage": "Unable to import module 'lambda_function': No module named 'pandas'",
  "errorType": "Runtime.ImportModuleError",
  "requestId": "a9dfd983-8cd6-4fe2-85c3-5e107ac230a4",
  "stackTrace": []
  }
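
The failure above happens at import time, before the handler runs. As a minimal sketch (not part of the original tutorial), the handler below checks whether a module can be found on the import path instead of crashing; the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


def lambda_handler(event, context):
    # Report whether pandas is importable rather than failing with
    # Runtime.ImportModuleError when the layer is missing.
    return {
        "statusCode": 200,
        "body": "pandas available: " + str(module_available("pandas")),
    }
```

Deploying this before and after attaching the layer is one way to confirm that the layer is what makes pandas importable.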
  
    
  

To use the pandas library in a Lambda function, a Lambda layer needs to be attached to the function. This tutorial lists the steps required to create and attach a Lambda layer for the pandas module.

Note: Steps 1 to 6 need to be performed on an EC2 instance that uses the same Amazon Linux version as AWS Lambda, so that the dependencies are built correctly. The steps in this tutorial were performed with Python 3.9.7; to follow along, make sure you are using Python 3.9.7.

Step 1: Create Python Virtual Environment
  
  
  python3.9 -m venv test_venv
  
  
  
Step 2: Activate Virtual Environment
  
  source test_venv/bin/activate
  
  
Step 3: Check Python Version
  
  
  python --version  
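
The version printed here decides which runtime to pick in Step 9 and Step 14. As a small sketch, the interpreter version can be turned into the matching Lambda runtime identifier (for example python3.9); the function name lambda_runtime_id is an assumption made for illustration.

```python
import sys


def lambda_runtime_id(version_info=sys.version_info) -> str:
    """Build a Lambda runtime identifier from the major and minor version numbers."""
    return f"python{version_info[0]}.{version_info[1]}"


if __name__ == "__main__":
    # Run inside the activated virtual environment from Step 2.
    print(lambda_runtime_id())
```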
  
  
  
Step 4: Create a directory named python
  
  
  mkdir python
  
  
  
Step 5: Install the pandas library in the python directory created in Step 4
  
  
  pip install pandas -t python  
  
  
  
Step 6: Zip the python directory
  
  
  zip -r pandas.zip python
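
Lambda looks for Python packages in a layer under a top-level python directory, which is why the archive is built from that folder. The sketch below is an optional sanity check, not part of the original steps; the function name layer_zip_is_valid is hypothetical.

```python
import zipfile


def layer_zip_is_valid(zip_path: str) -> bool:
    """Return True if the archive is non-empty and every entry sits under python/."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return bool(names) and all(name.startswith("python/") for name in names)


if __name__ == "__main__":
    # Assumes pandas.zip from Step 6 is in the current directory.
    print(layer_zip_is_valid("pandas.zip"))
```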
  
  
  
Step 7: Log in to your AWS account and navigate to the AWS Lambda service.

Step 8: In AWS Lambda, select Layers under Additional resources.

Step 9: Click on Create layer and enter the required information.
  • Name: pandas_layer
  • Description: Lambda layer for pandas module
  • Select Upload a .zip file, click on Upload and choose the pandas.zip created in Step 6
  • Compatible architectures - optional: x86_64
  • Compatible runtimes: choose the runtime that matches the Python version from the output of Step 3

Step 10: Click on Create
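
Steps 7 to 10 can also be done from code instead of the console. The following is a rough sketch using the boto3 publish_layer_version call, assuming boto3 is installed and AWS credentials are configured; the helper names build_publish_layer_request and publish_layer are hypothetical, and the python3.9 default simply mirrors the version used in this tutorial.

```python
def build_publish_layer_request(zip_bytes: bytes, runtime: str = "python3.9") -> dict:
    """Collect the values entered in Step 9 as keyword arguments for boto3."""
    return {
        "LayerName": "pandas_layer",
        "Description": "Lambda layer for pandas module",
        "Content": {"ZipFile": zip_bytes},
        "CompatibleRuntimes": [runtime],
        "CompatibleArchitectures": ["x86_64"],
    }


def publish_layer(zip_path: str = "pandas.zip") -> str:
    """Upload the archive from Step 6 and return the new layer version ARN."""
    import boto3  # imported lazily so the request builder works without boto3

    with open(zip_path, "rb") as archive:
        request = build_publish_layer_request(archive.read())
    response = boto3.client("lambda").publish_layer_version(**request)
    return response["LayerVersionArn"]
```

The returned ARN identifies the specific layer version, which is the value needed when attaching the layer to a function.
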
Step 11: Navigate to AWS Lambda and select Functions

Step 12: Click on Create function
Step 13: Select Author from scratch
Step 14: Enter the details below in Basic information
  • Function name: test_lambda_function
  • Runtime: choose the runtime that matches the Python version from the output of Step 3
  • Architecture: x86_64
Step 15: Click on Create function
Step 16: In the Function overview pane, click on Layers, or scroll down to the Layers section

Step 17: Click on Add a layer

Step 18: Select Custom layers, choose the layer created in Step 9, select version 1 and click on Add.
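
Attaching the layer (Steps 16 to 18) can be scripted as well. This is a sketch under the same assumptions as before (boto3 installed, credentials configured), using the boto3 update_function_configuration call. Its Layers argument replaces the function's whole layer list, so the hypothetical helper build_attach_layer_request keeps any layers that are already attached.

```python
def build_attach_layer_request(function_name, layer_version_arn, existing_layers=()):
    """Build keyword arguments that add one layer while keeping existing ones."""
    layers = list(existing_layers)
    if layer_version_arn not in layers:
        layers.append(layer_version_arn)
    return {"FunctionName": function_name, "Layers": layers}


def attach_layer(function_name: str, layer_version_arn: str) -> None:
    """Attach a layer version to an existing Lambda function."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=function_name)
    existing = [layer["Arn"] for layer in config.get("Layers", [])]
    request = build_attach_layer_request(function_name, layer_version_arn, existing)
    client.update_function_configuration(**request)
```

For the function in this tutorial, the first argument would be test_lambda_function and the second the ARN of version 1 of pandas_layer.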

Step 19: Write the code below in the Lambda function and click on Deploy
  
  
  import logging
  import pandas as pd
  
  logger = logging.getLogger()
  logger.setLevel(logging.INFO)
  
  
  def lambda_handler(event, context):
      # Log the pandas version to confirm that the layer was loaded
      logger.info(pd.__version__)
    
  
  
  
Step 20: Click on Test, enter any name for the test event under Configure test event and click on Create

Step 21: Click on Test again; you should see the pandas version in the output.

If you don't want to create your own layer, you can directly use the Lambda layers from this link.
