Error handling with API Gateway and Go Lambda functions

This is a follow-up to our previous article on Managing multi-environment serverless architecture using AWS API Gateway. To see an implementation of the approach described in this article, see the sample project on GitHub.

Error handling in microservices or serverless architectures can be tricky. Different components may be integrated using different protocols and built on different stacks, yet every client-facing error response should honour the same API contract.

The API Gateway pattern relies on a facade/gated proxy service.

In this article we are going to look at a simple serverless To-Do app built using SAM (Serverless Application Model), AWS CloudFormation, API Gateway and Lambda functions written in Go.

Our goal is to meet the following requirements in relation to error handling:

Consistent error responses, regardless of the error’s type or origin
Clear separation of external and internal interfaces
No leaking of private error details

Client-facing errors

Our API contract for error responses is simple: clients expect every non-2XX response to contain an application/json body with the same shape, namely an error code and a human-readable message.
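As a rough illustration, this contract could be modelled in Go as follows. The field names code and message are inferred from the gateway mappings discussed in this article; the type name publicError, the helper encodePublicError and the sample values are hypothetical, and the real payload may nest these fields differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// publicError is a hypothetical model of the client-facing error body.
// The two fields are an assumption based on the mappings described in the article.
type publicError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// encodePublicError renders the error as the JSON body a client would receive.
func encodePublicError(e publicError) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodePublicError(publicError{Code: "TASK_NOT_FOUND", Message: "Task not found"})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```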

AWS API Gateway

Let’s investigate the request flow with AWS API Gateway and AWS Lambda. As we can see below, there are two sources of errors: the API Gateway itself (Gateway Responses) and the integration function (Integration Responses).

Errors can originate from various sources within a serverless application

Gateway responses

Gateway Responses represent errors that occur before reaching the integration (such as access control errors, internal configuration errors, etc), or when the integration response cannot be mapped to a method response. These can be customised to fit our error schema by using a simple mapping template. Here’s the relevant section of our CloudFormation template.

AWS provides a full list of response types that can be used to define response mappings.

For the purpose of this example we chose to map only the catch-all 4XX and 5XX types. In the case of 4XX errors we return error.responseType as our code, and error.messageString as our message, which will provide validation and access control error details to clients. But in the case of 5XX errors we hardcode the code and message in order to avoid leaking internal configuration error details to clients.
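The rule above lives in the gateway configuration, not in Go, but its logic can be sketched as a small function. This is only a simulation of the behaviour described: the function name gatewayErrorBody and the hardcoded INTERNAL_ERROR values are hypothetical, and the sample response type is used purely for illustration.

```go
package main

import "fmt"

// gatewayErrorBody mimics the Gateway Response rule described in the article:
// 4XX errors pass the gateway's own response type and message through to the
// client, while 5XX errors are replaced with fixed values so that no internal
// configuration detail leaks. The fixed values here are assumptions.
func gatewayErrorBody(status int, responseType, messageString string) (code, message string) {
	if status >= 500 {
		return "INTERNAL_ERROR", "Internal error"
	}
	return responseType, messageString
}

func main() {
	code, msg := gatewayErrorBody(400, "BAD_REQUEST_BODY", "Invalid request body")
	fmt.Println(code, "/", msg)

	code, msg = gatewayErrorBody(500, "API_CONFIGURATION_ERROR", "some internal detail")
	fmt.Println(code, "/", msg)
}
```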

Integration responses

The strategy for handling errors returned by a Lambda function is dependent on how the function is integrated with the endpoint, which can be done using either a proxy integration or a custom integration. The former proxies HTTP requests to the Lambda whereas the latter decouples the function from the original HTTP request further and completely relies on request and response mappings.

Our Lambda handlers use the custom integration type. This allows us to write handlers with clearly defined inputs and outputs, without any knowledge of the HTTP request initially made to API Gateway.

Not only does this keep the function simple, it also makes it easier to invoke and test our command with a simple event payload and inspect the result. We found sam-local to be a useful tool during development and leverage it for integration testing, as will be explained later.

Lambda error responses

AWS Lambda uses its own error schema which can later be inspected and modified by API Gateway. This is why the following Lambda handler may not do what you expect.

Instead of outputting a simple error string, the Go Lambda runtime wraps the message string in a custom error type which results in the following response.

The Lambda error response includes the actual type of the original error, see Go source, and the error value.
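To make the wrapping concrete, here is a standard-library-only approximation of what happens to a plain error. It does not use the aws-lambda-go package; the invokeError type and the wrap function are hypothetical stand-ins that imitate the errorMessage and errorType fields of the Lambda error response, and the real runtime may include further fields.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
)

// invokeError approximates the envelope reported for a failed invocation.
type invokeError struct {
	Message string `json:"errorMessage"`
	Type    string `json:"errorType"`
}

// handler stands in for a Lambda handler that returns a plain error.
func handler() (string, error) {
	return "", errors.New("task not found")
}

// wrap imitates how a Go error could be turned into the envelope:
// the message comes from Error(), the type name from reflection.
func wrap(err error) invokeError {
	t := reflect.TypeOf(err)
	if t.Kind() == reflect.Ptr {
		t = t.Elem() // report the underlying type name for pointer errors
	}
	return invokeError{Message: err.Error(), Type: t.Name()}
}

func main() {
	_, err := handler()
	b, marshalErr := json.Marshal(wrap(err))
	if marshalErr != nil {
		panic(marshalErr)
	}
	fmt.Println(string(b))
}
```

The point to take away is that the handler only controls the string inside errorMessage; everything else in the envelope is decided for it.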

However, we want to retain error codes and reliably separate private and public error details. To do that, our handler needs to return a structured error whose type produces a JSON-encoded string when its Error() method is called, which results in a JSON-within-JSON integration response.

We introduced a lambdaError type, meant to be used by a Lambda handler function to wrap errors before returning them.

The error value is the JSON encoded structured error.

As you can see, our internal error schema contains a code, a public_message and a private_message. The code will be useful for matching on the gateway later, the public_message is a human readable string that does not leak any technical details, and the private_message is the detailed error string.
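A minimal sketch of such a type might look like this. The three JSON field names come from the schema described above; the Go field names, the fallback string and the sample values are assumptions, and the implementation in the sample project may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lambdaError is a sketch of a structured error whose Error() method
// returns the JSON encoding of the error itself.
type lambdaError struct {
	Code           string `json:"code"`
	PublicMessage  string `json:"public_message"`
	PrivateMessage string `json:"private_message"`
}

// Error implements the error interface by JSON-encoding the struct.
func (e lambdaError) Error() string {
	b, err := json.Marshal(e)
	if err != nil {
		// Hypothetical fallback; encoding a struct of strings should not fail.
		return `{"code":"INTERNAL_ERROR","public_message":"Internal error"}`
	}
	return string(b)
}

func main() {
	var err error = lambdaError{
		Code:           "TASK_NOT_FOUND",
		PublicMessage:  "Task not found",
		PrivateMessage: "no task with id 42 in table tasks",
	}
	fmt.Println(err.Error())
}
```

Because the type satisfies the error interface, a handler can return it like any other error, and the JSON string ends up as the error value in the Lambda error response.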

Invoking our handler now returns the following error response.

The handler does not know how this response will be processed by the gateway before being sent to the client.

This is not particularly elegant, but it’s the only way to return a structured error message when using custom Lambda integration in API Gateway.

Response mappings in API Gateway

A response received from the Lambda function then gets mapped by API Gateway so that it conforms to our external API contract. To map errors we rely on integration response mappings, which use regular expressions to match error codes.

Our example app’s matching strategy is to first create a matching rule for the absence of error (success response) and then for any expected errors that can be proxied to the client, and finally a catch-all for unexpected errors that simply maps to a hardcoded 500 Internal error.
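The ordering of this strategy can be simulated in Go. This is not how API Gateway is configured (that is done in the template); it only mirrors the decision logic. The rule type, the statusFor function and the pattern are hypothetical, and the pattern is anchored on the assumption that the gateway matches the whole error string.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a selection pattern with the HTTP status it maps to.
type rule struct {
	pattern *regexp.Regexp
	status  int
}

// rules lists the expected errors that may be proxied to the client.
var rules = []rule{
	{regexp.MustCompile(`^.*"code":"TASK_NOT_FOUND".*$`), 404},
}

// statusFor mirrors the matching order: no error first, then the
// expected errors, and finally a catch-all that maps to 500.
func statusFor(errorMessage string) int {
	if errorMessage == "" {
		return 200 // absence of error: success response
	}
	for _, r := range rules {
		if r.pattern.MatchString(errorMessage) {
			return r.status
		}
	}
	return 500 // unexpected error: hardcoded internal error
}

func main() {
	fmt.Println(statusFor(""))
	fmt.Println(statusFor(`{"code":"TASK_NOT_FOUND","public_message":"Task not found","private_message":"details"}`))
	fmt.Println(statusFor("runtime error: invalid memory address"))
}
```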

Below is a snippet from the template showing the integration response mapping for ‘not found’ errors.

Note that a 404 method response must also be defined for that endpoint.

The regular expression matches a specific error code inside the errorMessage string value (which happens to be our JSON-encoded structured error) and maps it to a 404 status code. The mapping template then uses the Velocity Template Language (VTL) to decode the JSON string and fit it to our expected output shape, keeping out the private message, which still gets logged.
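The transformation performed by the mapping template can be expressed in Go for clarity. This is a simulation of the behaviour described, not the VTL template itself; the type names internalError and publicError and the function toPublic are hypothetical, and the public field names are assumed from the error contract.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalError is the structured error carried inside errorMessage.
type internalError struct {
	Code           string `json:"code"`
	PublicMessage  string `json:"public_message"`
	PrivateMessage string `json:"private_message"`
}

// publicError is the assumed shape returned to the client.
type publicError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// toPublic decodes the JSON string found in errorMessage and keeps only
// the code and the public message, dropping the private message.
func toPublic(errorMessage string) (publicError, error) {
	var in internalError
	if err := json.Unmarshal([]byte(errorMessage), &in); err != nil {
		return publicError{}, err
	}
	return publicError{Code: in.Code, Message: in.PublicMessage}, nil
}

func main() {
	out, err := toPublic(`{"code":"TASK_NOT_FOUND","public_message":"Task not found","private_message":"no task with id 42"}`)
	if err != nil {
		panic(err)
	}
	b, err := json.Marshal(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```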

The resulting error returned by the gateway to our client is just as we expect.

The most important thing to note here is that we can manipulate the integration errors in any way that we see fit, for example, changing a code like TASK_NOT_FOUND to RESOURCE_NOT_FOUND if that’s what the client expects.

Below is the flow diagram from before, now annotated with the error payloads at different stages:

Conclusion

The good news is that we were able to fulfil our error handling requirements using AWS API Gateway and Lambda, but we did find that aspects of the solution have some drawbacks.

The JSON-within-JSON hack used to return a structured error, together with the regex matching, seems particularly brittle. This could perhaps be ameliorated by the introduction of support for a protocol such as gRPC on the integration side.

In the end the trade-off is some clunkiness vs the time that it would take to roll out your own gateway service.

One more thing: Integration testing with SAM Local

Full integration testing including all gateway mappings, validation, authorisers, etc. would have to be done in a testing CloudFormation stack. However, SAM Local helps considerably with testing the Lambda responses and gives us a sense of how our handler command will behave when executed in the real Lambda environment.

Using sam local invoke we can execute a handler cmd in a local environment by providing it with an event file similar to what API Gateway would send it.

We decided to automate this process by writing tests that execute sam local invoke with a prebuilt cmd binary and check the response payloads.
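A helper for this kind of test could be sketched as below. It assumes the sam CLI is on the PATH and that the function's response is written to standard output; the names invokeArgs and invoke, the function identifier and the event path are hypothetical, and the helper in the sample project may be structured differently.

```go
package main

import (
	"fmt"
	"os/exec"
)

// invokeArgs builds the arguments for "sam local invoke" with an event file.
func invokeArgs(function, eventPath string) []string {
	return []string{"local", "invoke", function, "--event", eventPath}
}

// invoke runs the command and returns whatever it wrote to standard output.
// It requires the sam CLI and its local runtime to be available.
func invoke(function, eventPath string) ([]byte, error) {
	out, err := exec.Command("sam", invokeArgs(function, eventPath)...).Output()
	if err != nil {
		return nil, fmt.Errorf("sam local invoke failed: %w", err)
	}
	return out, nil
}

func main() {
	// Only print the command line here; calling invoke needs a working sam setup.
	fmt.Println("sam", invokeArgs("GetTaskFunction", "testdata/get_task_event.json"))
}
```

In a test, the bytes returned by invoke could be decoded as JSON and compared with the expected success or error payload.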

You can find our helper invoke function here.

Note: SAM Local is also able to bring up a local API gateway; however, custom integration is currently unsupported, which meant that we couldn’t use that functionality.

This investigation was carried out with Christian Klotz at 2PAx (a startup that aims to revolutionise how restaurants allocate covers). Thanks to Christian for helping with this article.

References

Sample App Source https://github.com/smalleats/serverless-todo-example
AWS Lambda Go https://github.com/aws/aws-lambda-go
AWS SAM Local https://github.com/awslabs/aws-sam-local
Dave Cheney’s error package https://github.com/pkg/errors
VTL Reference http://velocity.apache.org/engine/devel/vtl-reference.html
Set up API Gateway Request and Response Data Mappings https://docs.aws.amazon.com/apigateway/latest/developerguide/mappings.html
API Gateway Mapping Template Reference https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
Set up Gateway Responses to Customize Error Responses https://docs.aws.amazon.com/apigateway/latest/developerguide/customize-gateway-responses.html
Domain errors discussion https://softwareengineering.stackexchange.com/a/351062
API Gateway pattern http://microservices.io/patterns/apigateway.html