GraphQL authorization with multiple data sources using AWS AppSync

I recently gave a talk at React Amsterdam on running GraphQL at scale, covering different use cases and techniques that come up in production. One such scenario, cascading authorization across multiple data sources, motivated this article. This is a fairly advanced subject, so if you’re just getting started, I recommend starting with the more introductory information here and here.

Granting access to data in GraphQL can be a tricky subject, with multiple strategies and emerging best practices starting to pop up as the technology becomes more widely adopted. GraphQL gives you powerful techniques to enforce different authorization controls for use cases like:

Completely public API
Private and Public access to sections of an API
Private and Public records, checked at runtime on fields
One or more users can write/read to a record(s)
One or more groups can write/read to a record(s)
Everyone can read but only record creators can edit or delete

Fundamentally when implementing access control in any system, some metadata must exist about who or what can access a resource. The classical way of defining this is with Butler Lampson’s Access Control Matrix where granted permissions are the intersection of rows and columns comprised of resources and actors (which can be users, roles, groups, etc.). When interacting directly with a database this “authorization metadata” many times is a part of that system and the access control is performed at connection time or runtime. In a service architecture fronted by a GraphQL API, a level of indirection exists where database interactions are happening on behalf of the caller. GraphQL also allows authorization to take place at the field level for partial results to be returned on a query. While this may seem daunting at first, in fact you end up with powerful controls which allow you to store authorization data on a resource, in a separate data source, or mix/match for combinations of controls. You can even cascade checks at different levels to meet your unique business needs.
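To make the access control matrix idea a little more concrete, here is a minimal sketch in JavaScript. The actors, resource names, actions, and the isAllowed helper are all hypothetical and invented for illustration; nothing here is an AppSync API.

```javascript
// Hypothetical access control matrix: rows are actors, columns are resources,
// and each cell holds the set of actions that actor may perform.
const accessMatrix = {
  Nadia: {
    "record-1": new Set(["read", "write"]),
    "record-2": new Set(["read"]),
  },
  Shaggy: {
    "record-2": new Set(["read", "write"]),
  },
};

// Returns true only when the (actor, resource) cell exists and contains the action.
function isAllowed(actor, resource, action) {
  const row = accessMatrix[actor];
  if (!row) return false;
  const cell = row[resource];
  return cell ? cell.has(action) : false;
}

console.log(isAllowed("Nadia", "record-1", "write")); // true
console.log(isAllowed("Nadia", "record-2", "write")); // false
console.log(isAllowed("Shaggy", "record-1", "read")); // false
```

The rest of this article is about where that matrix lives (on the resource, in a separate data source, or both) and at which resolver the lookup happens.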

This article demonstrates how you can use GraphQL techniques for authorization using AWS AppSync, however these strategies can be applied to custom GraphQL solutions as well. While AppSync has a first class concept called a “data source”, the generalization can be applied to GraphQL resolvers as the location they use to fetch data. Using the above description on performing access control checks, you can see in the below diagram how this can be done by storing the metadata for authorization on a resource (the “author” column of a database record) and threading identity information of the caller through a request to perform a conditional check when a GraphQL resolver is invoked. Read more here.

While many scenarios for access control can use a single data source, some authorization use cases require the use of multiple data sources. For instance in AppSync you might do this because you want to first perform some logic in a Lambda function before fetching or writing data to a DynamoDB table, or it could be to first perform a lookup such as allowing only people who you are “friends” with to read your data (1:many checks).

GraphQL makes this possible by using resolvers on your fields and walking the application graph, fetching data from different data sources and performing authorization checks where appropriate.

You can use this technique along with the built-in fine grained access controls of AWS AppSync for many advanced scenarios. For example while fine grained access controls allow you to perform conditional logic inside the resolver, such as user or group checks, nested resolvers and aggregated context allow you to perform this logic using results from the parent object. To demonstrate this we’ll start with a simple use case.

Nested resolvers

In the AWS AppSync fine grained access control examples, authorization metadata is stored directly on the resource (such as an attribute for an item in a table called “owner”). For demonstrative purposes, suppose these were separate tables and I want to read data in a “Data” table but first run an authorization check against data in an “Auth” table which lists who is the “Owner” of an item in the “Data” table. Only the “Owner” listed in the “Auth” table has rights to read the corresponding information in the “Data” table (the IDs in each table have an implicit association). The table layout is in the below image with some records and pseudocode of how you would want the logical check to work.

In this layout, user Nadia would have access to items 1&3 in the “Data” table while Shaggy has access to item 2. To use separate data sources, you can “nest” the data you want to retrieve inside a GraphQL type that gets your authorization metadata from a separate data source and essentially walk the object graph in your query. An example GraphQL schema to perform this nesting is below.

type AuthCheckData {
  id: ID!
  data: Data
}

type Data {
  id: ID!
  title: String
  content: String
}

type Query {
  getData(id: ID!): AuthCheckData
}

In the above GraphQL schema the getData query will invoke a resolver against the Auth table and return a type AuthCheckData. Then a resolver set on the data: Data field will have access to the data returned by its parent field to perform authorization checks. A diagram of the flow is below.

The getData resolver request template is fairly standard — it actually only needs to run a GetItem against the Auth DynamoDB data source with no other arguments:

{
  "version": "2017-02-28",
  "operation": "GetItem",
  "key": {
    "id": $util.dynamodb.toDynamoDBJson($ctx.args.id)
  }
}

The response template is just passing through the data:

$util.toJson($context.result)

Since the whole result is being returned, any of the attributes from the Auth table (in this case Owner) are returned and available for the child fields to use in a resolver template via the $context.source object (aliased as $ctx.source).

This is where things get interesting. Set a resolver on data: Data and again use a GetItem, but now use the ID from the parent with $ctx.source.id in the resolver request template:

{
  "version" : "2017-02-28",
  "operation" : "GetItem",
  "key" : {
    "id" : { "S" : "${ctx.source.id}" }
  }
}

This is important because you’re essentially ensuring that the authorization metadata from the parent check matches the data you’re performing validation against, all driven by the single id argument passed into your GraphQL query. Now you can filter the results in the response template for the data: Data resolver to only return values from the Data table if the Owner (available in $context.source.Owner from the Auth table) is equal to the current user, as seen in the below resolver response template:

#if ($context.source.Owner == $context.identity.username)
  $util.toJson($ctx.result)
#else
  $utils.unauthorized()
#end

At this point a GraphQL query of getData(id:1) or getData(id:3) can be run and if Nadia is logged in she’ll see her data. The query would look similar to this:

query {
  getData(id: 1) {
    data {
      id
      title
      content
    }
  }
}

If Nadia runs a query of getData(id:2) she’ll get an unauthorized message, but if Shaggy runs that query he’ll see the data as he’s listed as the owner in the Auth table.
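If it helps to see the whole nested flow in one place, the following is a rough JavaScript simulation of it, not AppSync code. The two objects stand in for the Auth and Data tables, the function names are hypothetical, and the titles and contents are made-up sample values.

```javascript
// In-memory stand-ins for the Auth and Data tables (sample values are made up).
const authTable = {
  "1": { id: "1", Owner: "Nadia" },
  "2": { id: "2", Owner: "Shaggy" },
  "3": { id: "3", Owner: "Nadia" },
};
const dataTable = {
  "1": { id: "1", title: "First", content: "Item one" },
  "2": { id: "2", title: "Second", content: "Item two" },
  "3": { id: "3", title: "Third", content: "Item three" },
};

// Parent resolver: GetItem against the Auth table, result passed through as-is.
function resolveGetData(args) {
  return authTable[args.id] || null;
}

// Child resolver on AuthCheckData.data: fetches from the Data table using the
// parent's id, then returns the item only if the parent's Owner is the caller.
function resolveData(source, identity) {
  const item = dataTable[source.id] || null;
  if (source.Owner !== identity.username) {
    throw new Error("Unauthorized");
  }
  return item;
}

// Walks the graph roughly the way getData(id) { data { ... } } would.
function runQuery(id, identity) {
  const parent = resolveGetData({ id });
  if (parent === null) return { getData: null };
  try {
    return { getData: { data: resolveData(parent, identity) } };
  } catch (err) {
    return { getData: { data: null }, errors: [err.message] };
  }
}

console.log(runQuery("1", { username: "Nadia" }).getData.data.title); // First
console.log(runQuery("2", { username: "Nadia" }).errors);             // [ 'Unauthorized' ]
```

Note that the child receives the Owner through its source argument, which is the role $context.source plays in the resolver templates above.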

AWS Lambda Authorizer

Suppose you have more complex business logic for authorization, token validation, or you need to interact with a data source that AWS AppSync does not yet support. However, you want to return results from a DynamoDB table if an authorization check passes. You can use the same architecture as before, but this time have the getData() resolver use an AWS Lambda function for the first layer of access to perform authorization and pass the results to your DynamoDB resolver.

To set this up, use the same schema from earlier, but now add a new AppSync data source of an AWS Lambda function. If you’re not familiar with setting up and using Lambda functions in AWS AppSync take a look at this tutorial first. You’ll need to create the Lambda function before using it with AppSync. For this example, use the below function written in Node.js:

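'use strict';

// Resolves AppSync fields; expects the payload built by the resolver request template.
exports.handler = (event, context, callback) => {
  console.log(event);

  const valid = allow(event);

  switch (event.field) {
    case "getData":
      if (valid) {
        // Pass the id through so the child resolver can read it from $ctx.source.id
        callback(null, { id: event.arguments.id });
      } else {
        // Returned as a result (not a Lambda error) so the response template can inspect errorType
        callback(null, {
          errorMessage: "Error with the authorization",
          errorType: "AUTHORIZATION_ERROR"
        });
      }
      break;
    case "addData":
      // Write similar authorization check here
      callback(null, event.arguments);
      break;
    default:
      callback("Unknown field, unable to resolve " + event.field, null);
      break;
  }
};

// Authorizes the request if the user, API key, or source IP is on an allow list.
function allow(event) {
  const allowedKeys = ["abcdef", "ghijkl"];
  const allowedUsers = ["Nadia", "Caesar"];
  const allowedIps = ["192.168.0.1"];

  // identity and request are only present if the request template forwards them
  const identity = event.identity || {};
  const headers = (event.request && event.request.headers) || {};
  // sourceIp may arrive as a single string or as a list of addresses
  const sourceIps = [].concat(identity.sourceIp || []);

  if (allowedUsers.includes(identity.username)) return true;
  // Bracket notation is required because the header name contains hyphens
  if (allowedKeys.includes(headers["x-api-key"])) return true;
  return sourceIps.some((ip) => allowedIps.includes(ip));
}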

The code above has an allow() function that authorizes requests from a set of users, API keys, or IP addresses. If any one of them matches, the request is authorized. For instance, if a client passes in a valid API key, comes from a whitelisted IP address, or is one of a valid list of users, then access is granted. You can see the full list of identity fields threaded into the context object of an AppSync request here. If the client calling your GraphQL API doesn’t pass these checks then the Lambda will return an errorType of AUTHORIZATION_ERROR. This is a fictitious example, but the point is that you can perform your custom authorization logic in your Lambda and cascade the data fetching invocation to other resolvers, which you’ll do next, without needing to put everything in a Lambda function.

Change the resolver on getData(): AuthCheckData to use this AWS Lambda data source. The resolver request template needs to specify the GraphQL field being invoked, and also the identity and arguments:

{
  "version" : "2017-02-28",
  "operation": "Invoke",
  "payload": {
    "field": "getData",
    "identity" : $utils.toJson($context.identity),
    "arguments": $utils.toJson($context.arguments)
  }
}

However, unlike with the previous scenario where you passed down $context.source.Owner from the parent and did the authorization check in the child, with the Lambda returning the result.errorType directly you can have your check on the resolver response template of the getData() itself:

#if ($context.result.get("errorType") == "AUTHORIZATION_ERROR")
  $util.unauthorized()
#else
  $util.toJson($context.result)
#end

With a pattern like this you’re essentially “wrapping” all of the auth logic in the parent getData() resolver so that the child resolver strictly focuses on data fetching and the separation of concerns is cleaner. Now the resolver request template on the field AuthCheckData:data still does a lookup with the data source on the “Data” table using $ctx.source.id (since that’s what the Lambda returned in a valid authorization case) but the response template is simply:

$util.toJson($ctx.result)

The nice thing about this pattern is the parent resolver, getData(), essentially “wrapped” the authorization logic and only invoked the child to return data if it passed a validity check.
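As a rough sketch of this “wrapping” pattern outside of AppSync, the JavaScript below simulates a parent that makes the whole authorization decision and a child that only fetches. The authorize function is a simplified, hypothetical stand-in for the Lambda (user check only), and the table contents are made-up sample values.

```javascript
// Simplified stand-in for the Lambda authorizer: returns { id } or an error object.
function authorize(event) {
  const allowedUsers = ["Nadia", "Caesar"];
  const username = event.identity ? event.identity.username : undefined;
  if (allowedUsers.includes(username)) {
    return { id: event.arguments.id };
  }
  return { errorType: "AUTHORIZATION_ERROR", errorMessage: "Error with the authorization" };
}

// Made-up sample row standing in for the Data table.
const dataTable = { "1": { id: "1", title: "Hello", content: "World" } };

// Parent resolver: the entire authorization decision lives here,
// mirroring the errorType check in the response template.
function resolveGetData(args, identity) {
  const result = authorize({ field: "getData", identity, arguments: args });
  if (result.errorType === "AUTHORIZATION_ERROR") {
    throw new Error("Unauthorized");
  }
  return result;
}

// Child resolver: pure data fetching keyed on the id the parent returned.
function resolveData(source) {
  return dataTable[source.id] || null;
}

function getData(id, identity) {
  const parent = resolveGetData({ id }, identity); // throws before the child ever runs
  return { data: resolveData(parent) };
}

console.log(getData("1", { username: "Nadia" }).data.title); // Hello
```

Compared with the nested-resolver example, the child here never sees identity information at all, which is the separation of concerns described above.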

Only let my friends read

Another very common use case with authorization is where an individual entity will allow several other entities to access a resource in a 1:MANY manner. In social media applications this might be stated as “only allow Nadia’s friends to see her information”. The typical way that you model this using DynamoDB would be to have a “friendship table” that does the check before accessing the actual record. The friendship table would have a composite key comprised of the user ID and the friend ID.

Create a “Friend” table in DynamoDB with a Primary Key of Username and a Sort Key of Friend. Use strings for both. For convenience, also add a column of Valid so that you can quickly switch it off if someone is no longer a friend (you could always delete the row too). Add this table as an AWS AppSync data source and edit your schema to have a query type of friendGetData(id:ID!, friend:String!) like below.

type Query {
  getData(id: ID!): AuthCheckData
  friendGetData(id: ID!, friend: String!): AuthCheckData
}

The idea is a user can access a record if they happen to be friends with someone. To keep things simple we will pass in an ID of a record we wish to receive from the same Data table as before, but first do a friendship check against the new Friend table. The request template for friendGetData() looks like the following:

{
  "version" : "2017-02-28",
  "operation" : "GetItem",
  "key" : {
    "Username" : { "S" : "${ctx.identity.username}" },
    "Friend" : { "S" : "${ctx.args.friend}" }
  }
}

This will return the attribute of Valid from your table if you are a friend with the person. Like the last example, we’ll keep the authorization logic in the top level resolver’s response template to keep responsibilities separate. However unlike the last two cases, the “Friend” table doesn’t have a primary key of “ID” so it won’t get automatically passed in the source object. Since we want to use the id passed as an argument for a key lookup in the “Data” table when the child resolver is invoked, we also add this to the context object in the response template:

$util.qr($context.result.put("id", "$context.arguments.id"))
#if ($context.result.get("Valid") == "TRUE")
  $util.toJson($context.result)
#else
  $util.unauthorized()
#end

Now we can access the id as in the other examples through $context.source.id. If you run a query:

query friend {
  friendGetData(id: 1, friend: "Nadia") {
    data {
      title
      content
    }
  }
}

This runs successfully for logged in users whom Nadia is friends with.
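The friendship check can be sketched in plain JavaScript along the same lines. This is a simulation under assumptions: the rows are invented sample data, the function names are hypothetical, and the "Username#Friend" string is only a convenience for keying an in-memory Map, not how DynamoDB stores a composite key.

```javascript
// Invented rows for the Friend table, keyed by Username (partition) + Friend (sort).
const friendTable = new Map([
  ["Shaggy#Nadia", { Username: "Shaggy", Friend: "Nadia", Valid: "TRUE" }],
  ["Caesar#Nadia", { Username: "Caesar", Friend: "Nadia", Valid: "FALSE" }],
]);
const dataTable = { "1": { id: "1", title: "Trip notes", content: "Only for friends" } };

// Parent resolver: GetItem on the composite key, then the Valid check
// that the response template performs.
function resolveFriendGetData(args, identity) {
  const row = friendTable.get(`${identity.username}#${args.friend}`);
  if (!row || row.Valid !== "TRUE") {
    throw new Error("Unauthorized");
  }
  // Copy the id argument onto the result so the child can read it from its source.
  return { ...row, id: String(args.id) };
}

// Child resolver: plain lookup in the Data table by the id the parent attached.
function resolveData(source) {
  return dataTable[source.id] || null;
}

function friendGetData(args, identity) {
  const parent = resolveFriendGetData(args, identity);
  return { data: resolveData(parent) };
}

console.log(friendGetData({ id: 1, friend: "Nadia" }, { username: "Shaggy" }).data.title); // Trip notes
```

A missing row and a row with Valid set to anything other than "TRUE" are treated the same way here, so removing a friendship by either route denies access.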

Mutations

All of the examples so far have covered queries, because I wanted to demonstrate conceptually the controls that GraphQL gives you over your authorization schemes. You can apply these techniques to mutations as well. However, there is a subtle difference: you might need to perform conditional checks in the request pipeline of your resolver rather than filtering the response. How you control this will be very specific to the data source implementation that is used for your mutation.

For example, AWS AppSync supports several data sources including Amazon DynamoDB which supports condition expressions that can be evaluated by the database engine itself. If you think back to the patterns shown in this article, one showed how you can pass authorization metadata from the parent to the child and make a decision vs. “wrapping” all of the authorization decisions in the parent resolver. If your design is using the first technique then you won’t be able to apply authorization logic on the response of a database operation as the write would have already happened. Instead, you’ll need to use that database engine’s execution criteria along with the metadata passed from the parent resolver. In a DynamoDB resolver with AppSync you would add something similar to the following to your resolver request template:

"operation" : "UpdateItem",
"condition" : {
  "expression" : "contains(Edit,:canEdit)",
  "expressionValues" : {
    ":canEdit" : { "S" : "${context.source.canedit}" }
  }
}

The above takes the canedit attribute that the parent resolver passed down to the child resolver and, at runtime, has DynamoDB check that the item’s Edit attribute contains it before the write is applied. Several examples can be found here. Of course it may be the case that your database implementation doesn’t support these capabilities, isn’t performant for these types of runtime evaluations, or your schema design deliberately separates the authorization logic into the parent, in which case the other method can still be used.
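To illustrate why the check has to happen as part of the write rather than after it, here is a small JavaScript simulation of a conditional update. It only imitates the idea of contains(Edit, :canEdit) against an in-memory object; the conditionalUpdate function, the error message, and the sample item are hypothetical.

```javascript
// Made-up item in the Data table; Edit lists who may modify it.
const dataTable = {
  "1": { id: "1", title: "Draft", Edit: ["Nadia", "Caesar"] },
};

// Imitates UpdateItem with the condition contains(Edit, :canEdit):
// the change is applied only if the stored Edit attribute contains the value.
function conditionalUpdate(id, canEdit, changes) {
  const item = dataTable[id];
  if (!item || !Array.isArray(item.Edit) || !item.Edit.includes(canEdit)) {
    // Nothing is written when the condition fails.
    throw new Error("ConditionalCheckFailed");
  }
  Object.assign(item, changes);
  return item;
}

console.log(conditionalUpdate("1", "Nadia", { title: "Final" }).title); // Final

try {
  conditionalUpdate("1", "Shaggy", { title: "Hacked" });
} catch (err) {
  console.log(err.message, dataTable["1"].title); // ConditionalCheckFailed Final
}
```

Because the condition and the write are evaluated together, a rejected caller never leaves a partial change behind, which a response-side filter could not guarantee.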

Bonus: Mocking and Testing

When building out authorization schemes, it can be tricky to mock the scenarios and test out different flows for multiple users or groups to see how the authorization rules will actually run in production. You should look to implement mocking and simulation techniques for authorization with any system which will return data to clients. AWS AppSync provides a few different tools for this. First, AppSync supports a full test and debug flow that allows you to mock the GraphQL request and response context. You can use this to see what the behavior will be with different scenarios and information passed or received in resolvers. For instance, if I edit the getData resolver from the AppSync console and select the Select test context button, I can create a mock context object simulating the user as well as any response. Using the Lambda resolver from earlier I’ll pass in the AUTHORIZATION_ERROR result to simulate an unauthorized request:

Pressing the Test button in the console will evaluate the request object, including field and identity information, as well as perform any conditional checks. If the logic results in an unauthorized request that will be printed to the screen:

After mocking your data, you can also run this “live” from the Queries page of the AppSync console which can live stream results from Amazon CloudWatch Logs. But don’t stop there, the console also allows you to login with a valid user account from Amazon Cognito User Pools to perform a real authorization check:

You can use this to test conditional rules against user accounts, groups, claims or other properties of the identity context object.
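Outside the console, the same idea can be approximated with a small table of mock context objects run against a rule. The sketch below is only that: ownerCheck restates the owner comparison from the nested resolver example as a JavaScript function, and the mock contexts are hypothetical and only loosely shaped like the real context object.

```javascript
// The rule under test: the same owner comparison used in the nested resolver example.
function ownerCheck(context) {
  return context.source.Owner === context.identity.username;
}

// Hypothetical mock contexts with the outcome each one should produce.
const cases = [
  {
    name: "owner reads own item",
    context: { identity: { username: "Nadia" }, source: { Owner: "Nadia" } },
    expected: true,
  },
  {
    name: "other user is rejected",
    context: { identity: { username: "Shaggy" }, source: { Owner: "Nadia" } },
    expected: false,
  },
  {
    name: "item without an owner is rejected",
    context: { identity: { username: "Nadia" }, source: {} },
    expected: false,
  },
];

// Collect the names of any cases whose result differs from the expectation.
const failures = cases
  .filter((c) => ownerCheck(c.context) !== c.expected)
  .map((c) => c.name);

console.log(failures.length === 0 ? "all cases passed" : `failed: ${failures.join(", ")}`);
```

Keeping the cases in a list makes it cheap to add a new user, group, or claim scenario whenever a rule changes.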

Summary

These are just a few examples of using GraphQL along with the existing AWS AppSync authorization use cases and techniques. Security controls can be a complex subject in general, so it’s always best to take a look at all of the options and when possible, start simple and only add more when your business requirements change.

Richard Threlkeld (@undef_obj) works at AWS Mobile and was part of the teams that launched AWS AppSync and AWS Amplify.

All opinions expressed herein are my own.