How to Write and Delete Batch Items in DynamoDB Using Python

In this tutorial we will learn how to write and delete DynamoDB table items in batches using the Boto3 batch_writer API.

Prerequisite

Before starting this tutorial, follow the steps in How to create a DynamoDB Table.

If you have followed the steps in How to create a DynamoDB Table, you should have a DynamoDB table with the following attributes (a quick Boto3 check follows the list).

  • Table Name: sample-movie-table-resource
  • Partition Key: year
  • Sort Key: title
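
If you want to verify the table from Python, a minimal sanity check with Boto3 looks like this (a sketch, assuming your AWS credentials and default region are already configured):

import boto3

# Load the table created in the prerequisite tutorial and inspect
# its key schema; expect year (HASH) and title (RANGE).
table = boto3.resource("dynamodb").Table("sample-movie-table-resource")
print(table.key_schema)
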
Download Data

Now, let's download a sample JSON file containing movie data from here: moviedata.zip

Extract Data

Create a new folder named tutorial, extract the zip file, place moviedata.json inside the newly created folder, and open the folder in VS Code.

   
  PS C:\Users\welcome\Downloads> Expand-Archive moviedata.zip
  PS C:\Users\welcome\Downloads> mkdir tutorial
  PS C:\Users\welcome\Downloads> copy .\moviedata\moviedata.json .\tutorial\
  PS C:\Users\welcome\Downloads> cd .\tutorial\
  PS C:\Users\welcome\Downloads\tutorial> code .
  PS C:\Users\welcome\Downloads\tutorial>
   
Item Attributes

Let's have a look at a sample item from moviedata.json. It has year as a number attribute and title as a string attribute, which serve as the HASH (partition) key and RANGE (sort) key of the table, respectively.

   
{
  "year": 2013,
  "title": "We're the Millers",
  "info": {
      "directors": ["Rawson Marshall Thurber"],
      "release_date": "2013-08-03T00:00:00Z",
      "rating": 7.2,
      "genres": [
          "Comedy",
          "Crime"
      ],
      "image_url": "http://ia.media-imdb.com/images/M/MV5BMjA5Njc0NDUxNV5BMl5BanBnXkFtZTcwMjYzNzU1OQ@@._V1_SX400_.jpg",
      "plot": "A veteran pot dealer creates a fake family as part of his plan to move a huge shipment of weed into the U.S. from Mexico.",
      "rank": 13,
      "running_time_secs": 6600,
      "actors": [
          "Jason Sudeikis",
          "Jennifer Aniston",
          "Emma Roberts"
      ]
  }
}
   
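
For comparison with the batch approach below, a single item of this shape can be written with a plain put_item call. This is just a sketch; note that numeric values must be sent as Decimal, because DynamoDB does not accept Python floats:

import boto3
from decimal import Decimal

table = boto3.resource("dynamodb").Table("sample-movie-table-resource")

# Write one movie item; floats such as the rating become Decimal.
table.put_item(
    Item={
        "year": 2013,
        "title": "We're the Millers",
        "info": {"rating": Decimal("7.2")},
    }
)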

batch_writer

batch_writer creates a batch writer object. The batch writer automatically handles buffering and sending items in batches; each underlying BatchWriteItem request carries at most 25 items. It also automatically handles any unprocessed items and resends them for processing.
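
The object is used as a context manager, which flushes any remaining buffered items when the with block exits. Here is a minimal sketch, assuming table is a Boto3 Table resource for the table above (the key values are made up):

# A minimal sketch of the batch_writer context manager.
with table.batch_writer() as batch:
    batch.put_item(Item={"year": 2020, "title": "Example Movie"})
    batch.delete_item(Key={"year": 2019, "title": "Old Movie"})

If your input may contain duplicate keys within one run, batch_writer(overwrite_by_pkeys=["year", "title"]) de-duplicates buffered items so that a single request never carries two writes for the same key.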


Using batch_writer to put items

Create a new file demo.py inside the tutorial directory and copy the below code snippet.

   
import json
from decimal import Decimal

import boto3


dynamodb_resource = boto3.resource("dynamodb")
table_name = "sample-movie-table-resource"
file_path = "moviedata.json"
table = dynamodb_resource.Table(table_name)


def read_json_data(file_path):
    movies_data = []
    with open(file_path) as f:
        movies_data = json.loads(f.read())
        print(type(movies_data))
        print(len(movies_data))
    # Return only the first 100 movies to keep the demo small.
    return movies_data[:100]


def write_in_batches(batch_items):
    with table.batch_writer() as batch:
        for item in batch_items:
            # DynamoDB does not accept Python floats, so round-trip each
            # item through JSON to convert floats to Decimal.
            item = json.loads(json.dumps(item), parse_float=Decimal)
            batch.put_item(Item=item)


if __name__ == "__main__":
    movies_data = read_json_data(file_path=file_path)
    write_in_batches(batch_items=movies_data)
   

In the above code snippet, the read_json_data function reads data from the sample file and returns only the first 100 items for the demo.

The write_in_batches function then writes the items to the DynamoDB table using table.batch_writer, which takes care of buffering and of resending any unprocessed items. Each item is round-tripped through json.dumps and json.loads with parse_float=Decimal, because DynamoDB does not accept Python floats.
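
After running the script, you can sanity-check the write with a quick item count. This is a sketch for demo purposes only, since Scan reads the whole table:

# Count the items currently in the table (expect 100 after the demo).
response = table.scan(Select="COUNT")
print(response["Count"])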


Using batch_writer to delete items

Create a new file delete_demo.py inside the tutorial directory and copy the below code snippet.

   
import json
from decimal import Decimal

import boto3


dynamodb_resource = boto3.resource("dynamodb")
table_name = "sample-movie-table-resource"
file_path = "moviedata.json"
table = dynamodb_resource.Table(table_name)


def read_json_data(file_path):
    movies_data = []
    with open(file_path) as f:
        movies_data = json.loads(f.read())
        print(type(movies_data))
        print(len(movies_data))
    # Return only the first 100 movies to keep the demo small.
    return movies_data[:100]


def write_in_batches(batch_items):
    with table.batch_writer() as batch:
        for item in batch_items:
            # DynamoDB does not accept Python floats, so round-trip each
            # item through JSON to convert floats to Decimal.
            item = json.loads(json.dumps(item), parse_float=Decimal)
            batch.put_item(Item=item)


def delete_in_batches(batch_items):
    # Build the full primary key (year + title) for every item.
    batch_keys = [
        {"year": item["year"], "title": item["title"]} for item in batch_items
    ]
    with table.batch_writer() as batch:
        for key in batch_keys:
            batch.delete_item(Key=key)


if __name__ == "__main__":
    movies_data = read_json_data(file_path=file_path)
    # write_in_batches(batch_items=movies_data)
    delete_in_batches(movies_data)
   

In the above code snippet, the read_json_data function reads data from the sample file and returns only the first 100 items for the demo.

The delete_in_batches function then builds the primary key for each item and deletes the items using batch.delete_item inside table.batch_writer, which takes care of buffering and of resending any unprocessed items.
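
To confirm the deletes went through, look up one of the keys from the sample data (a sketch using the sample item shown earlier):

# get_item returns a response without an "Item" field when the key is absent.
response = table.get_item(Key={"year": 2013, "title": "We're the Millers"})
print("Item" in response)  # False once the item has been deleted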

