Use Boto3 to open an AWS S3 file directly

In this example I want to open a file directly from an S3 bucket without first downloading it to the local file system. The response body comes back as a stream that is not fetched until you read it, also known as a ‘Lazy Read’, and reading it pulls the contents of the file into a Python variable.

import boto3
 
s3client = boto3.client(
    's3',
    region_name='us-east-1'
)
 
# These define the bucket and object to read.
# Both must be quoted strings, and the key has no leading slash.
bucketname = 'mybucket'
file_to_read = 'dir1/filename'

# Request the object. get_object returns a dict describing the response.
fileobj = s3client.get_object(
    Bucket=bucketname,
    Key=file_to_read
)
# fileobj['Body'] is a stream; read() pulls its full contents into filedata.
filedata = fileobj['Body'].read()

# filedata is a bytes object, so decode it to get text.
contents = filedata.decode('utf-8')

# Once decoded, you can treat the file as plain text if appropriate 
print(contents)
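
Because the decoded text is an ordinary string, the same pattern extends to JSON. The following is a minimal sketch rather than a definitive implementation: save_json and load_json are hypothetical helper names, the bucket and key in the usage comment are placeholders, and the client is passed in as a parameter so the helpers can be tried against a stand-in object without AWS access. It relies only on the client's put_object and get_object calls.

```python
import json


def save_json(client, bucket, key, data):
    """Serialize `data` to JSON and upload it as the object `key`."""
    body = json.dumps(data).encode("utf-8")
    client.put_object(Bucket=bucket, Key=key, Body=body)


def load_json(client, bucket, key):
    """Fetch the object `key` and parse its body as JSON."""
    response = client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read().decode("utf-8"))


# Example usage (bucket and key are placeholders):
#   import boto3
#   s3client = boto3.client('s3', region_name='us-east-1')
#   save_json(s3client, 'mybucket', 'dir1/settings.json', {'retries': 3})
#   print(load_json(s3client, 'mybucket', 'dir1/settings.json'))
```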

And that is all there is to it. Be careful when reading in very large files, because read() loads the entire object into memory. Also, this example works well with text files. I use it a lot when saving and reading in JSON data from an S3 bucket.
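
For very large text files, one option is to skip read() and iterate over the body line by line. The sketch below assumes the object is newline-delimited UTF-8 text. iter_s3_lines is a hypothetical helper name, and the client is again passed in as a parameter. It uses the iter_lines() method that botocore's StreamingBody provides.

```python
def iter_s3_lines(client, bucket, key, encoding="utf-8"):
    """Yield decoded lines one at a time instead of loading the whole object."""
    response = client.get_object(Bucket=bucket, Key=key)
    # iter_lines() pulls the body in chunks and yields each line as bytes,
    # without the trailing newline.
    for raw_line in response["Body"].iter_lines():
        yield raw_line.decode(encoding)


# Example usage (bucket and key are placeholders):
#   import boto3
#   s3client = boto3.client('s3', region_name='us-east-1')
#   for line in iter_s3_lines(s3client, 'mybucket', 'dir1/filename'):
#       print(line)
```

Only one chunk and the current line are held at a time, so memory use should stay roughly flat regardless of object size.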

Good Luck.

10 Responses to Use Boto3 to open an AWS S3 file directly

  1. Ayush Jain says:

    This worked for me when I replaced mybucket with ‘mybucket’ and the same for the filename.

  2. dhrumil says:

    : ‘utf-8’ codec can’t decode byte 0x8c in position 7: invalid start byte

    I am getting this error message when I try to read a Parquet file.

  5. Imre says:

    Thanks! Solved my problem easily

  6. Doesnt Matter says:

    You have an error on the line:
    contents = filedata.decode('utf-8'))

    Should be:
    contents = filedata.decode('utf-8')

  7. Kapil says:

    filedata = fileobj['Body'].read()

    This line always throws an error for me:

    file_object = self.client.get_object(Bucket=self.bucket_name, Key=self.get_mnp_checksum_file())
    log.info(f"File object : {file_object}, it's type: {type(file_object)}")

    file_content = file_object['Body']
    > file_content = file_content.read().decode()('utf-8')
    E TypeError: 'str' object is not callable

    Please help

  8. Gene M says:

    Aside from quoting the bucket name and input file path values, you MUST NOT include the leading slash in the input S3 file path.
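
The problems raised in the comments above (decoding a binary Parquet file, calling decode()('utf-8'), and a leading slash in the key) can be sketched in one small helper. This is an illustration under stated assumptions, not a definitive fix: read_s3_object is a hypothetical name, and stripping the leading slash assumes the objects were uploaded without one, since S3 treats '/dir1/filename' and 'dir1/filename' as different keys.

```python
def read_s3_object(client, bucket, key, encoding=None):
    """Read a whole S3 object; return bytes, or str when `encoding` is given."""
    # Assumption: objects were uploaded without a leading slash, so
    # '/dir1/filename' is normalized to 'dir1/filename'.
    response = client.get_object(Bucket=bucket, Key=key.lstrip("/"))
    data = response["Body"].read()  # always a bytes object
    if encoding is None:
        # Binary formats such as Parquet must stay undecoded.
        return data
    # The encoding is an argument to decode(). Writing decode()('utf-8')
    # would call the resulting str and raise
    # "TypeError: 'str' object is not callable".
    return data.decode(encoding)
```

For a Parquet file, the returned bytes can be wrapped in io.BytesIO and handed to a reader that accepts file-like objects, for example pandas.read_parquet. For text, pass encoding='utf-8'.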
