A deep dive into boto3 and how AWS built it

AWS defines boto3 as a Python Software Development Kit to create, configure, and manage AWS services. In this article, we'll look at how boto3 works and how it can help us interact with various AWS services.

Photo by Kindel Media from Pexels

Boto3 Under the Hood

Both AWS CLI and boto3 are built on top of botocore, a low-level Python library that takes care of everything needed to send an API request to AWS and receive a response back.

Botocore:
- handles session, credentials, and configuration,
- gives fine-grained access to all operations (ex. ListObjects, DeleteObject) within a specific service (ex. S3),
- takes care of serializing input parameters, signing requests, and deserializing response data into Python dictionaries,
- provides the low-level **clients** on which boto3 builds its high-level **resource** abstractions to interact with AWS services from Python.

You can think of botocore as a package that allows us to forget about underlying JSON specifications and use Python (boto3) when interacting with AWS APIs.

Clients vs. Resources

In most cases, we should use boto3 rather than botocore. Using boto3, we can choose to either interact with lower-level clients or higher-level object-oriented resource abstractions.

Level of abstraction in boto3, aws-cli, and botocore based on S3 as an example (image by author)

To understand the difference between those components, let's look at a simple task that demonstrates the difference between an S3 client and an S3 resource. We want to list all objects from the images directory, i.e., all objects with the prefix images/.

The difference shows up even in such a simple task: with a client, you directly interact with a **response dictionary** from a deserialized API response. In contrast, with the resource, you interact with standard **Python classes** and objects rather than raw response dictionaries.
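A minimal sketch of this comparison could look as follows. The helper names, the demo() function, and the bucket name are hypothetical and only serve the illustration; list_objects_v2(), Bucket(), and objects.filter() are regular boto3 calls. The client or resource is passed in as an argument, so the helpers can also be tried with a stand-in object.

```python
def list_keys_with_client(s3_client, bucket, prefix):
    """Return object keys by reading the raw response dictionary."""
    # A single call returns at most 1,000 keys.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is missing from the response when nothing matches the prefix.
    return [item["Key"] for item in response.get("Contents", [])]


def list_keys_with_resource(s3_resource, bucket, prefix):
    """Return object keys by iterating over Python objects."""
    bucket_obj = s3_resource.Bucket(bucket)
    # Each item is an object with attributes such as .key, not a dictionary.
    return [obj.key for obj in bucket_obj.objects.filter(Prefix=prefix)]


def demo(bucket="my-example-bucket"):  # hypothetical bucket name
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3  # needs configured AWS credentials to actually run

    print(list_keys_with_client(boto3.client("s3"), bucket, "images/"))
    print(list_keys_with_resource(boto3.resource("s3"), bucket, "images/"))
```

Calling demo() against a bucket you own should print the same keys twice, once from the dictionary-based client code and once from the object-based resource code.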
You can investigate the functionality of **resource** objects using help() and dir().

Overall, the resource abstraction results in more readable code (interacting with Python objects rather than parsing response dictionaries). It also handles many low-level details such as pagination. Resource methods usually return a generator so that you can lazily iterate over a large number of returned objects without having to worry about pagination or running out of memory.

*Fun fact: Both client and resource code are dynamically generated based on JSON models describing various AWS APIs. For clients, AWS uses a JSON service description, and for resources a resource description, as a basis for auto-generated code. This facilitates quicker updates and provides a consistent interface across all ways you can interact with AWS (CLI, boto3, management console). The only real difference between the JSON service description and the final boto3 code is that PascalCase operations are converted to a more Pythonic snake_case notation.*

Why is resource often much easier to use than client?

Imagine that you need to list thousands of objects from an S3 bucket. We could try the same approach we used for listing objects with the images/ prefix. The only problem is that the s3_client.list_objects_v2() method returns a maximum of one thousand objects per call. To solve this problem, we could leverage pagination. While the paginator code is easy enough, the resource abstraction gets the job done in just two lines of code.

Why you will still use clients for some of your work

Despite the benefits of resource abstractions, clients provide more functionality, as they map almost 1:1 with the AWS service APIs. Thus, you will most likely end up using both client and resource, depending on the specific use case.
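To make the pagination comparison from this section concrete, here is a short sketch of both approaches. The function names and the bucket name are hypothetical; get_paginator("list_objects_v2"), paginate(), and objects.filter() are existing boto3 calls.

```python
def iter_keys_with_paginator(s3_client, bucket, prefix=""):
    """Yield every key under a prefix, requesting one page at a time."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Each page is a response dictionary holding up to 1,000 objects.
        for item in page.get("Contents", []):
            yield item["Key"]


def iter_keys_with_resource(s3_resource, bucket, prefix=""):
    """Yield every key under a prefix; the collection paginates for us."""
    for obj in s3_resource.Bucket(bucket).objects.filter(Prefix=prefix):
        yield obj.key


def demo(bucket="my-example-bucket"):  # hypothetical bucket name
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3  # needs configured AWS credentials to actually run

    client_keys = list(iter_keys_with_paginator(boto3.client("s3"), bucket))
    resource_keys = list(iter_keys_with_resource(boto3.resource("s3"), bucket))
    print(len(client_keys), len(resource_keys))
```

Both helpers are generators, so keys are produced lazily and a large bucket never has to fit into memory at once.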
Apart from a difference in functionality, resources are not thread-safe, so if you plan to use multithreading or multiprocessing to speed up AWS operations such as file uploads, you should use clients rather than resources. More on that here.

Note: There is a way to access client methods directly from a resource object: s3_resource.meta.client.some_client_method().

Waiters

Waiters poll the status of a specific resource until it reaches a state that you are interested in. For instance, when you create an EC2 instance using boto3, you may want to wait until it reaches a "Running" state before you do something with this EC2 instance.

Boto3: using waiter to poll a new EC2 instance for a running state (image by the author)

Note that the ImageId used to launch an instance is different for each AWS region. You can find the ID of the AMI by following the "Launch instances" wizard in the AWS console.

Finding the AMI ID (image by the author)

A more common waiter example: wait until a specific S3 object arrives in S3

Let's be honest. How often are you launching new instances? Probably not that often. So let's build a more realistic waiter example. Imagine that your ETL process is waiting until a specific file arrives in an S3 bucket. In this scenario, we wait until somebody from the marketing department uploads a file with current campaign costs. While you could implement the same with AWS Lambda using an S3 event trigger, waiter-based logic is not tied to Lambda and can run anywhere.
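A sketch of such an S3 waiter, in both the resource and the client flavor, might look like this. The function names, bucket name, object key, and the delay and attempt values are hypothetical; wait_until_exists(), get_waiter("object_exists"), and the WaiterConfig argument are existing boto3 features.

```python
def wait_with_resource(s3_resource, bucket, key):
    """Block until the object exists, using the waiter's default settings."""
    s3_resource.Object(bucket, key).wait_until_exists()


def wait_with_client(s3_client, bucket, key, delay=5, max_attempts=20):
    """Block until the object exists, with an explicit polling configuration."""
    waiter = s3_client.get_waiter("object_exists")
    # Raises botocore.exceptions.WaiterError once max_attempts is exhausted.
    waiter.wait(
        Bucket=bucket,
        Key=key,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )


def demo():
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3  # needs configured AWS credentials to actually run

    bucket = "my-marketing-bucket"  # hypothetical bucket name
    key = "campaign_costs.csv"      # hypothetical object key
    wait_with_client(boto3.client("s3"), bucket, key, delay=30, max_attempts=10)
    print(f"{key} arrived, continue with the ETL step")
```

With the values used in demo(), the script gives up after roughly five minutes (10 attempts, 30 seconds apart) instead of waiting forever.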
Using waiter in S3 resource (image by author)

The same can be done with a more configurable client method.

Using waiter in S3 client (image by author)

Using the client's waiter abstraction, we can specify:
- MaxAttempts: how many times we should check whether a specific object arrived. This prevents zombie processes, because the waiter fails if the object hasn't arrived within the time we would expect it to arrive,
- Delay: the number of seconds to wait between each attempt.

As soon as the file arrives in S3, the script stops waiting, which allows you to do something with this newly arrived data.

Collections

Collections indicate a group of resources such as a group of S3 objects in a bucket or a group of SQS queues. They allow us to perform various actions on a group of AWS resources with a single method call. Collections can be used to:
- get all S3 objects with a specific object prefix,
- get all S3 objects with a specific content type, for example, to find all CSV files,
- get all S3 object versions,
- specify a **chunk size** of objects to iterate over, for instance when the default page size of 1000 objects is too large for your application,
- delete all objects with a single call (be careful about that!).

A more common operation is to delete all objects with a specific prefix.

Sessions: How to pass IAM credentials to your boto3 code?

There are many ways you can pass access keys when interacting with boto3. Here is the order of places where boto3 tries to find credentials:

1. Explicitly passed to boto3.client(), boto3.resource(), or boto3.Session().
2. Set as environment variables.
3. Set as credentials in the ~/.aws/credentials file (this file is generated automatically using aws configure in the AWS CLI).
4. If you attach **IAM roles** with proper permissions to your AWS resources, you don't have to pass credentials at all but rather assign a policy with required permission scopes.
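The lookup order above can be illustrated with a small sketch. The function names are hypothetical, and env_credentials_present() is only a convenience check written for this example, not part of boto3; the environment variable names and the boto3.Session() arguments are the standard ones.

```python
import os

# Standard environment variable names that boto3 reads (option 2).
ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def env_credentials_present(environ=None):
    """Report whether both access key variables are set and non-empty."""
    environ = os.environ if environ is None else environ
    return all(environ.get(name) for name in ENV_VARS)


def explicit_session(access_key, secret_key):
    """Option 1: explicitly passed credentials are used before anything else."""
    import boto3  # imported lazily so the check above works without boto3

    return boto3.Session(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def default_session():
    """Options 2 to 4: pass nothing and let boto3 search the remaining places."""
    import boto3

    return boto3.Session()
```

In practice, default_session() is usually the safer choice, because no keys end up in the source code; explicit_session() is shown only to illustrate the first entry in the list.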
In AWS Lambda, for example, this means that with IAM roles attached to resources such as Lambda functions, you don't need to manually pass or configure any long-term access keys. Instead, IAM roles dynamically generate temporary access keys, making the process more secure.

A useful feature of AWS Lambda is that boto3 is already preinstalled in all Python runtime environments. This way, you can run any of the examples from this article directly in your Lambda function. Just make sure to add a proper policy corresponding to the service you want to use in your Lambda's IAM role.

Creating a function in AWS Lambda (image by author)

If you plan to run a number of Lambda functions in production, you may explore Dashbird, an observability platform that will help you monitor and debug your serverless workloads. It's particularly valuable for building automated alerts on failure, grouping related resources based on a project or domain, providing an overview of all serverless resources, interactively browsing through the logs, and visualizing operational bottlenecks. The tool is completely free to use and only takes 2 minutes to set up, and you can start exploring your data immediately.

How to change a default boto3 session?

Boto3 makes it easy to change the default session. For instance, if you have several profiles (such as one for dev and one for prod AWS environment), you can switch between those using a single line of code that calls boto3.setup_default_session() with the profile name. Alternatively, you can attach credentials directly to the default session, so that you don't have to define them separately for every new client or resource.

Conclusion

In this article, we looked at how to use boto3 and how it is built internally. We examined the differences between clients and resources and investigated how each of them handles pagination. We explored how waiters can help us poll for a specific status of AWS resources before proceeding with other parts of our code. We also looked at how collections allow us to perform actions on multiple AWS objects.
Finally, we explored different ways of providing credentials to boto3 and how those are handled using IAM roles and user-specific access keys.

Thank you for reading!

References and further reading:
- AWS re:Invent 2014 | (DEV307) Introduction to Version 3 of the AWS SDK for Python (Boto)
- Boto3 documentation

Also published on: https://dashbird.io/blog/boto3-aws-python/