In this article we will look in depth at how to upload files to and download files from an S3 bucket, generate presigned URLs to view and download files from the bucket, and delete files from the bucket, all using Python.
What is boto3?
Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to write software that makes use of services like Amazon S3 and Amazon EC2.
We can install boto3 with pip:
$ pip install boto3
After installing it, we need a few things ready before getting started. The prerequisites are:
- AWS access key
- AWS secret key
- bucket name
- bucket region
Once we have all of these we are good to go. Create an AWS account to get an access key and a secret key, then go to S3 and create a bucket.
All of the operations above require a connection to be established first, which we do by creating a session with boto3 from the keys and the bucket region.
It is important to keep your keys safe and protected. Keep them in a separate file and import them from there. In a Django project, keep the keys in the settings file and import them from there. Once the connection is established we can proceed to the next steps.
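A minimal sketch of this step, assuming the four prerequisites are stored in environment variables rather than in a separate Python file. The variable names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, S3_BUCKET_NAME) and the helper names read_s3_settings and make_s3_client are assumptions made for illustration; only Session and client come from boto3.

```python
import os


def read_s3_settings(env=None):
    """Collect the four prerequisites from environment variables.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "region_name": env["AWS_REGION"],
        "bucket_name": env["S3_BUCKET_NAME"],
    }


def make_s3_client(settings):
    """Create a boto3 session from the keys and return an S3 client."""
    import boto3  # third-party package, installed with `pip install boto3`

    session = boto3.session.Session(
        aws_access_key_id=settings["aws_access_key_id"],
        aws_secret_access_key=settings["aws_secret_access_key"],
        region_name=settings["region_name"],
    )
    return session.client("s3")
```

With the variables set, `client = make_s3_client(read_s3_settings())` gives a client that can be passed to the S3 calls described in the rest of the article.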
Upload a file to the S3 bucket
We use the following four parameters when uploading a file to S3:
- Key (A key is the unique identifier of a file in the bucket, and we use the same key later to retrieve the file. We use Python’s random and string modules to generate 10-character random strings like these [‘jqirKgzI4h’, ‘cMvkbMobUq’, ‘B5wDdcEiKa’]. You can make the string longer by increasing the value passed to range() ).
- Body (Body holds the byte data of the file. Open the file in binary mode with open() to get the byte data).
- ACL or Access Control List lets you set the permissions for the file. Setting it to ‘public-read’ makes the file readable by anyone through its plain S3 URL, with no expiry. It is not required for presigned URLs: a presigned URL is signed with your credentials, so it gives temporary access even to a private file, and the expiry time we set on the URL limits how long it works.
- ContentType is the type of the file. Since I am uploading a PDF file, I set the ContentType to ‘application/pdf’.
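The four parameters can be put together roughly as follows. This is a sketch, assuming client is a boto3 S3 client and that the upload is done with its put_object method; the function names generate_key and upload_pdf are hypothetical. The acl default of ‘private’ is a choice of this sketch; pass ‘public-read’ if the file should be publicly readable.

```python
import random
import string


def generate_key(length=10):
    """Return a random alphanumeric string to use as the object key."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def upload_pdf(client, bucket_name, path, acl="private"):
    """Upload the PDF at `path` and return the key it was stored under."""
    key = generate_key()
    # Binary mode gives put_object the raw bytes of the file.
    with open(path, "rb") as f:
        client.put_object(
            Bucket=bucket_name,
            Key=key,
            Body=f,
            ACL=acl,
            ContentType="application/pdf",
        )
    return key
```

Keep the returned key: it is what the download, presigned URL and delete calls use to find the file again.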
That is all you need to know about the upload function.
Download files from S3
The download_file function takes two arguments:
- key (The same key we used to upload the file)
- filename (The name under which the file is saved. The file is downloaded into the current directory)
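A minimal sketch of the download step, assuming client is a boto3 S3 client. Note that the client's own download_file method takes the bucket name as its first argument in addition to the key and the filename; the wrapper name download_from_s3 is hypothetical.

```python
def download_from_s3(client, bucket_name, key, filename):
    """Download the object stored under `key` and save it as `filename`."""
    # A bare filename (no directory part) is written to the current directory.
    client.download_file(bucket_name, key, filename)
    return filename
```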
Generate presigned URLs to view and download files
First, let us understand what a presigned URL is.
A presigned URL is a URL that you can provide to your users to grant temporary access to a specific S3 object.
Generate a presigned URL to view the files
First let us create a presigned URL to view a file. Anyone with this URL can view the file directly from S3, so it is important to set an expiry time when creating the URL.
The generate_presigned_url method of the client generates a presigned URL to view the file. The parameters required are:
- ClientMethod – Set this to ‘get_object’, the S3 operation the URL is allowed to perform.
- Params – A dictionary with two keys (Bucket – the name of the bucket, Key – the same key we used for uploading the file).
- ExpiresIn – The lifetime of the URL in seconds. With a value of 600, the URL expires after 600 seconds and others won’t be able to see our file.
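The three parameters above map onto a call like the following. This is a sketch, assuming client is a boto3 S3 client; the function name presigned_view_url is hypothetical.

```python
def presigned_view_url(client, bucket_name, key, expires_in=600):
    """Return a URL that lets anyone view the object until it expires."""
    return client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket_name, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

The URL is computed locally from your credentials, so generating it does not check whether the key actually exists in the bucket.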
Generate a presigned URL to download files
This is similar to the previous method, but when a user clicks on this URL the file starts downloading automatically.
The call is almost identical to the previous one; the only additional key is ‘ResponseContentDisposition’. Add it to Params with a value of ‘attachment’ to make the file download when the URL is clicked instead of being displayed.
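A sketch of the download variant, assuming client is a boto3 S3 client. The function name presigned_download_url and the filename argument, which suggests a name for the saved file, are assumptions of this example.

```python
def presigned_download_url(client, bucket_name, key, filename, expires_in=600):
    """Return a URL that downloads the object instead of displaying it."""
    return client.generate_presigned_url(
        ClientMethod="get_object",
        Params={
            "Bucket": bucket_name,
            "Key": key,
            # Sent back by S3 as the Content-Disposition response header.
            "ResponseContentDisposition": f'attachment; filename="{filename}"',
        },
        ExpiresIn=expires_in,
    )
```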
Upload files to S3 using a presigned URL
What if we want others to upload files to our S3 bucket? This problem can also be solved by generating a presigned URL. Any user with this URL can upload a file to our bucket until the URL expires.
We use generate_presigned_post to generate a URL for posting objects to the bucket. We pass three parameters to this method:
- Bucket – The name of the bucket to which the files are to be uploaded.
- Key – The unique identifier under which the uploaded file is stored. We generate it the same way as for a direct upload: a 10-character random string such as ‘jqirKgzI4h’, built with Python’s random and string modules.
- ExpiresIn – The lifetime of the URL in seconds. After 600 seconds the URL expires and others won’t be able to upload with it.
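A sketch of generating the upload URL, assuming client is a boto3 S3 client and that the key has already been generated; the function name presigned_upload_post is hypothetical.

```python
def presigned_upload_post(client, bucket_name, key, expires_in=600):
    """Return the URL and form fields that allow one upload under `key`."""
    response = client.generate_presigned_post(
        Bucket=bucket_name,
        Key=key,
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
    # `response` is a dict with two entries: "url" and "fields".
    return response["url"], response["fields"]
```

The uploader then sends an HTTP POST to the returned URL, with the returned fields as form data and the file itself in a form field named file.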
Delete bucket files
We have come to the last subtopic of this article: how to delete files from the bucket using boto3.
The delete_object() function deletes a file from the bucket. We pass two parameters to the function:
- Bucket – The name of the bucket
- Key – The same key with which we uploaded the file to the bucket.
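As a sketch, assuming client is a boto3 S3 client (the wrapper name delete_file is hypothetical), the call looks like this:

```python
def delete_file(client, bucket_name, key):
    """Delete the object stored under `key` from the bucket."""
    client.delete_object(Bucket=bucket_name, Key=key)
```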
If we want to list the objects in the bucket we can use the following line of code.
objects = client.list_objects(Bucket=BUCKET_NAME)
If you want to delete files after a certain period of time, compare the LastModified timestamp of each listed object with the current time and delete the objects that are older than your limit, for example an hour (3600 seconds).
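One way to sketch this cleanup, assuming client is a boto3 S3 client. The function name delete_old_objects and the now argument (which makes the cutoff easy to test) are assumptions of this example. A single list_objects call returns at most 1000 objects, so a larger bucket would need paging, which this sketch leaves out.

```python
from datetime import datetime, timedelta, timezone


def delete_old_objects(client, bucket_name, max_age_seconds=3600, now=None):
    """Delete objects last modified more than `max_age_seconds` ago.

    Returns the list of keys that were deleted.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=max_age_seconds)
    deleted = []
    response = client.list_objects(Bucket=bucket_name)
    # "Contents" is absent from the response when the bucket is empty.
    for obj in response.get("Contents", []):
        # LastModified is a timezone-aware datetime, so it compares with cutoff.
        if obj["LastModified"] < cutoff:
            client.delete_object(Bucket=bucket_name, Key=obj["Key"])
            deleted.append(obj["Key"])
    return deleted
```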
That is it, we have come to the end of this article. Thank you for making it all the way to the bottom. If you have any questions, leave them in the comments and I will try to reply as soon as possible.