My first attempt to log EC2 instance names to PagerDuty and Airbrake broke most of our infrastructure. I failed to account for unpublished AWS rate limits, and when an unexpected volume of errors caused my code to hit those rate limits, insufficient error handling led to an infinite loop as errors were thrown inside our exception loggers.
I hope that this tutorial can save you some of my headache. I’ll walk you through how to use the boto3 Python client to access the name of a running EC2 instance from that instance, and along the way I’ll include caveats and gotchas that will help you avoid some of my mistakes.
Most information about the instance is accessible with the boto3 Instance resource. To create that resource, we first need to retrieve the instance id and instance region.
AWS provides Instance Metadata and User Data via the URL http://169.254.169.254, which you can request from any running EC2 instance. In particular, we are interested in the Instance Identity Document, which is accessible at http://169.254.169.254/latest/dynamic/instance-identity/document:
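import requests

# Ask the instance metadata service for this instance's identity document.
# The timeout keeps the call from hanging when the endpoint is unreachable.
r = requests.get("http://169.254.169.254/latest/dynamic/instance-identity/document", timeout=2)
response_json = r.json()
region = response_json.get('region')
instance_id = response_json.get('instanceId')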
If you are not familiar with the requests library, I would recommend checking out Response Status Codes, particularly the raise_for_status method, as a starting point for error handling.
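To make the error-handling advice concrete, here is a minimal sketch of a metadata lookup that never raises. The names fetch_identity and http_get are hypothetical, not part of requests or boto3, and the two-second timeout is an assumed value. The caller passes in requests.get (or a test double) as http_get.

```python
METADATA_URL = "http://169.254.169.254/latest/dynamic/instance-identity/document"


def fetch_identity(http_get, timeout=2.0):
    """Return (region, instance_id), or (None, None) if the lookup fails.

    http_get is any callable with the same shape as requests.get.
    """
    try:
        response = http_get(METADATA_URL, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        document = response.json()
        return document.get("region"), document.get("instanceId")
    except Exception:
        # Deliberately broad: this helper feeds an error logger, so it
        # must never raise on its own.
        return None, None
```

With requests installed, the call would be region, instance_id = fetch_identity(requests.get). Returning a pair of None values lets the caller skip the boto3 step instead of handling an exception inside the logger.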
We can then use the instance id and region to retrieve the boto3 Instance resource.
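import boto3

# Build the EC2 resource for the instance's own region, then reference the instance by id.
ec2 = boto3.resource('ec2', region_name=region)
instance = ec2.Instance(instance_id)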
Validate region and instance_id before passing them to boto3
The first step of boto3 error handling is to catch ClientError and BotoCoreError, both found in the botocore.exceptions module.
In my experience, the boto3 client has pretty confusing error handling for invalid or None region or instance ids. In addition to the errors mentioned above, None values in either field will raise the Python built-in ValueError. I would recommend that you do not attempt to use the boto3 client unless both region and instance_id are truthy.
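The guard and the exception handling described above can be combined into one helper. This is a sketch rather than a definitive implementation: get_instance_or_none is a hypothetical name, and returning None on failure is an assumed convention. boto3 is imported inside the function so that the guard runs before boto3 is touched at all.

```python
def get_instance_or_none(region, instance_id):
    """Return a boto3 ec2.Instance resource, or None if it cannot be built."""
    # Skip boto3 entirely when either value is None or an empty string.
    if not (region and instance_id):
        return None

    # Imported here so the guard above never depends on boto3 being importable.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    try:
        ec2 = boto3.resource("ec2", region_name=region)
        return ec2.Instance(instance_id)
    except (BotoCoreError, ClientError, ValueError):
        # Treat any boto3 failure as "no instance" rather than raising
        # from inside an error logger.
        return None
```

A caller can then check the result for None once, rather than wrapping every later step in its own try block.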
An instance’s “Name” is really an instance tag with the key “Name”. You can retrieve tags from the instance resource, and filter for the tag whose key is “Name”:
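# instance.tags is None when the instance has no tags, so fall back to an empty list.
tags = instance.tags or []
names = [tag.get('Value') for tag in tags if tag.get('Key') == 'Name']
# Take the first match, or None if there is no "Name" tag.
name = names[0] if names else None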
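For illustration, the same filter can be packaged as a small pure function and tried on a hand-written tag list. The function name and the sample tags are made up for this example. The Key/Value dictionary shape is the one the snippet above already relies on.

```python
def name_from_tags(tags):
    """Return the value of the first "Name" tag, or None if there is none."""
    for tag in tags or []:  # tags may be None for an untagged instance
        if tag.get("Key") == "Name":
            return tag.get("Value")
    return None


sample_tags = [
    {"Key": "Environment", "Value": "production"},  # made-up sample data
    {"Key": "Name", "Value": "api-server-1"},
]
print(name_from_tags(sample_tags))  # api-server-1
print(name_from_tags(None))         # None
```

Keeping the filter free of boto3 calls makes it easy to test without an EC2 instance.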
Because attributes are lazy-loaded, some invalid instance ids throw errors here
According to the boto3 documentation, resource attributes are lazy-loaded, meaning that no API call is made until an attribute is first accessed. This means that while None or empty strings are validated when creating the ec2.Instance resource, non-empty string ids that are the right type but the wrong value are only validated here, with the first DescribeInstances call. To handle this, wrap the attribute access in a try block and catch the botocore.exceptions exceptions from the last section.
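The toy model below may help show why a bad id slips past construction and only fails later. LazyInstance and fake_describe are hypothetical stand-ins written for this example. They are not boto3 classes and they make no network calls. LookupError stands in for the botocore exceptions you would catch with the real client.

```python
class LazyInstance:
    """Toy stand-in for a lazily loaded resource (not boto3's real class)."""

    def __init__(self, instance_id, describe):
        if not instance_id:
            # Cheap local check: no API call is needed to reject an empty id.
            raise ValueError("instance id is required")
        self.id = instance_id
        self._describe = describe  # stands in for the DescribeInstances call
        self._data = None

    @property
    def tags(self):
        if self._data is None:
            # First attribute access: this is where the "API call" happens.
            self._data = self._describe(self.id)
        return self._data.get("Tags")


def fake_describe(instance_id):
    """Pretend API: only one made-up instance id exists."""
    known = {"i-real": {"Tags": [{"Key": "Name", "Value": "api-server-1"}]}}
    if instance_id not in known:
        raise LookupError("no such instance: " + instance_id)
    return known[instance_id]


instance = LazyInstance("i-wrong", fake_describe)  # constructing succeeds
try:
    tags = instance.tags  # the error only surfaces here
except LookupError:
    tags = None
```

With the real client, the try block would wrap the instance.tags access in the same way, and the except clause would name the botocore.exceptions classes instead.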
From the Open Guide To AWS section on EC2 gotchas and limitations:
❗If the EC2 API itself is a critical dependency of your infrastructure (e.g. for automated server replacement, custom scaling algorithms, etc.) and you are running at a large scale or making many EC2 API calls, make sure that you understand when they might fail (calls to it are rate limited and the limits are not published and subject to change) and code and test against that possibility.
The boto3 client loads information about an instance with the DescribeInstances API call. If you make this API call to retrieve the instance name every time you log an error, you could easily hit the DescribeInstances rate limit.
In addition to the error handling mentioned above, you will want to consolidate your calls to the AWS API to avoid hitting the unpublished AWS rate limits. Our solution was to fetch the instance name once at the startup of the API server, and cache the result in a global data structure. Instead of calling the EC2 API every time we need to log an error, we now call it only once when deploying new code to a machine.
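One way to sketch the fetch-once approach is a small cache that runs the lookup on first use and remembers the result, including a failed result, so a broken lookup is not retried on every logged error. InstanceNameCache and the loader argument are hypothetical names, and the lock is an assumption for multi-threaded servers.

```python
import threading


class InstanceNameCache:
    """Run a lookup at most once and remember the result, even a failure."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that returns the name
        self._lock = threading.Lock()
        self._loaded = False
        self._name = None

    def get(self):
        with self._lock:
            if not self._loaded:
                try:
                    self._name = self._loader()
                except Exception:
                    # Never let a failed lookup escape into the error logger.
                    self._name = None
                self._loaded = True
            return self._name
```

Calling get() once during server startup warms the cache. Later calls from the error logger return the stored value without touching the EC2 API.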
Here is an example of what a final get_instance_name function could look like.