Getting the Size of an S3 Bucket using Boto3 for AWS

I’m writing this on 9/14/2016. I note the date because the size of an S3 bucket seems like an important piece of information, yet AWS does not offer an easy way to collect it. I fully expect them to add that functionality at some point. As of this date, I could only come up with two methods to get the size of a bucket. One is to list all the items in the bucket and iterate over the objects while keeping a running total. That method works, but I found that for a bucket with many thousands of items it could take hours per bucket.
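
For reference, the listing approach can be sketched roughly as follows. This is only a minimal illustration, not the script I ended up using: the function name bucket_size_by_listing and the bucket name in the usage comment are made up for the example, and it assumes the list_objects_v2 paginator that boto3 provides for S3.

```python
def bucket_size_by_listing(s3_client, bucket_name):
    """Sum the size of every object in a bucket by listing it page by page.

    s3_client is expected to behave like boto3.client('s3').
    (Hypothetical helper, written for illustration only.)
    """
    total_bytes = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # A page for an empty bucket has no 'Contents' key at all
        for obj in page.get("Contents", []):
            total_bytes += obj["Size"]
    return total_bytes

# Example usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   print(bucket_size_by_listing(boto3.client("s3"), "my-example-bucket"))
```

Each page holds at most 1,000 keys, so a large bucket needs one request per thousand objects, which is where the time goes.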

A better method uses AWS CloudWatch metrics instead. When an S3 bucket is created, AWS also creates 2 CloudWatch metrics for it (BucketSizeBytes and NumberOfObjects), and I use the first one to pull the Average size over a set period, usually 1 day.

Here’s what I came up with:

import boto3
import datetime

# CloudWatch works in UTC, so use a timezone-aware UTC timestamp
now = datetime.datetime.now(datetime.timezone.utc)

cw = boto3.client('cloudwatch')
s3client = boto3.client('s3')

# Get a list of all buckets
allbuckets = s3client.list_buckets()

# Header Line for the output going to standard out
print('Bucket'.ljust(45) + 'Size in Bytes'.rjust(25))

# Iterate through each bucket
for bucket in allbuckets['Buckets']:
    # For each bucket item, look up the corresponding metrics from CloudWatch
    response = cw.get_metric_statistics(Namespace='AWS/S3',
                                        MetricName='BucketSizeBytes',
                                        Dimensions=[
                                            {'Name': 'BucketName', 'Value': bucket['Name']},
                                            {'Name': 'StorageType', 'Value': 'StandardStorage'}
                                        ],
                                        Statistics=['Average'],
                                        Period=86400,  # in seconds, i.e. 1 day
                                        StartTime=now - datetime.timedelta(days=1),
                                        EndTime=now)
    # The cloudwatch metrics will have the single datapoint, so we just report on it.
    for item in response["Datapoints"]:
        print(bucket["Name"].ljust(45) + "{:,}".format(int(item["Average"])).rjust(25))
        # Note the use of "{:,}".format.
        # This is a new shorthand method to format output.
        # I just discovered it recently.
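
One thing the loop above glosses over is that a bucket can come back with no datapoints at all (for example, if nothing is stored in StandardStorage), and with a longer time window there can be more than one. A small sketch of how that might be handled is below. The helper names latest_average and format_row are my own inventions for illustration, and the code only assumes that each datapoint is a dict with 'Timestamp' and 'Average' keys, as returned by get_metric_statistics.

```python
import datetime

def latest_average(datapoints):
    """Return the Average of the most recent datapoint, or None if there are none.

    (Hypothetical helper; datapoints is the response["Datapoints"] list.)
    """
    if not datapoints:
        return None
    # Datapoints are not guaranteed to arrive in time order, so pick the newest
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return int(newest["Average"])

def format_row(bucket_name, size_bytes):
    """Format one output line using the same column widths as the script."""
    size_text = "no data" if size_bytes is None else "{:,}".format(size_bytes)
    return bucket_name.ljust(45) + size_text.rjust(25)

# Example with made-up datapoints
sample = [
    {"Timestamp": datetime.datetime(2016, 9, 13), "Average": 1000.0},
    {"Timestamp": datetime.datetime(2016, 9, 14), "Average": 1234567.0},
]
print(format_row("example-bucket", latest_average(sample)))
print(format_row("empty-bucket", latest_average([])))
```

This way every bucket gets a line in the output, and a bucket with no metric data is reported as such instead of being silently skipped.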

