Back in late 2016 I wrote an article called GraphQL: Tips after a year in production. Since then, GraphQL started offering native subscriptions, Relay got so good you could replace Redux with Relay Modern, and I learned a few neat tricks along the way. I’ve also made a bunch of mistakes. Looking back on my GraphQL journey, here’s what I’d change.
DataLoader is a small cache that is beautiful in its simplicity. Instead of caching the entirety of the GraphQL response, it caches database queries to be used in resolve functions. I put off implementing it in my app for fear that it was a premature optimization. Boy was I wrong. Looking back, I should have done it a lot sooner. Aside from the generous performance boost, it simplified my resolve functions by standardizing how I fetch things from my database. Less code = less room for me to write bugs. The only drawback is that the cache wasn’t designed to work for subscriptions. To fix that, I wrote my own little (100 LOCs) add-on package called dataloader-warehouse. Instead of caching data for each subscriber, it gives you the option to cache data for each publish, essentially turning an O(n) operation into O(1), which is nice.
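To make the batching and caching idea concrete, here is a minimal sketch of the behavior described above. It is not the dataloader package's implementation, only an illustration of the technique; `createLoader`, `fakeDb`, and `batchGetUsers` are hypothetical names, and a real app would use the library itself with a fresh loader per request.

```javascript
// A tiny batching cache in the spirit of DataLoader (illustrative only).
// batchFn receives an array of keys and must resolve to values in the same order.
function createLoader(batchFn) {
  const cache = new Map(); // key -> Promise of the value
  let queue = [];          // pending {key, resolve, reject} for the current tick

  const dispatch = () => {
    const batch = queue;
    queue = [];
    batchFn(batch.map((item) => item.key))
      .then((values) => batch.forEach((item, i) => item.resolve(values[i])))
      .catch((err) => batch.forEach((item) => item.reject(err)));
  };

  return {
    load(key) {
      // Repeated keys reuse the cached promise and never reach the database.
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        // The first key queued in a tick schedules a single flush for the whole batch.
        if (queue.length === 0) Promise.resolve().then(dispatch);
        queue.push({ key, resolve, reject });
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// Hypothetical stand-in for a database table.
const fakeDb = { u1: { id: 'u1', name: 'Ada' }, u2: { id: 'u2', name: 'Grace' } };
let dbCalls = 0;
const batchGetUsers = async (ids) => {
  dbCalls += 1; // one round trip per batch
  return ids.map((id) => fakeDb[id] || null);
};

const userLoader = createLoader(batchGetUsers);

// Three resolve functions ask for users in the same tick;
// the "database" is queried once, with the keys ['u1', 'u2'].
const demo = Promise.all([
  userLoader.load('u1'),
  userLoader.load('u2'),
  userLoader.load('u1'),
]);
```

Every resolve function calls `load` the same way, which is where the standardization comes from: the resolver no longer cares whether the value came from the cache or the database.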
If you have a real-time app (and who doesn’t? It’s 2018!), you’ve probably written a few GraphQL subscriptions to keep all the data on your page fresh without a pesky refetch. The biggest mistake I made in GraphQL was how I organized my subscriptions. I started out by building 1 subscription for each page view in my app, but that meant my back-end had to change whenever the front-end changed. Next, I tried breaking subscriptions into CRUD types for each entity, e.g. CreateTaskSubscription, UpdateTaskSubscription, DeleteTaskSubscription. That was awful for 2 reasons: I had 3x more code to maintain, and I still had to write hacks because sometimes I needed to know how the entity was updated. For example, was a single task deleted, or was a user deleted, which triggered 10 calls to DeleteTaskSubscription?
Finally, I arrived at something I call the Hybrid Strategy. It works by first breaking the mutation payload into a fragment.
fragment UpdateTaskMutation_task on UpdateTaskPayload {
  task {
    dueDate
  }
}
Then, using the power of GraphQL, I include that fragment in both my mutation and my subscription:
mutation UpdateTaskMutation($task: Task!) {
  updateTask(task: $task) {
    error {
      message
    }
    ...UpdateTaskMutation_task
  }
}

subscription TaskSubscription {
  taskSubscription {
    __typename
    ...CreateTaskMutation_task
    ...DeleteTaskMutation_task
    ...UpdateTaskMutation_task
  }
}
Because the subscription shares the mutation fragment and handler, I’m guaranteed that if the mutation works, the subscription works. To learn more, see The Hybrid Strategy for GraphQL Subscriptions.
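A minimal sketch of what sharing the handler could look like on the client, assuming a plain object as the store. Everything here is hypothetical and for illustration only: the `taskHandlers` map, the store shape, the `id` field on a task, and the `CreateTaskPayload` and `DeleteTaskPayload` names (guessed by analogy with `UpdateTaskPayload`). A Relay app would do the same work inside its updater functions.

```javascript
// One handler per mutation payload, keyed by the payload's __typename.
const taskHandlers = {
  CreateTaskPayload: (payload, store) => {
    store.tasks[payload.task.id] = payload.task;
  },
  UpdateTaskPayload: (payload, store) => {
    Object.assign(store.tasks[payload.task.id], payload.task);
  },
  DeleteTaskPayload: (payload, store) => {
    delete store.tasks[payload.task.id];
  },
};

// Called on the client that ran the mutation.
const onMutationResponse = (typename, payload, store) => {
  if (payload.error) return; // a returned error leaves the store untouched
  taskHandlers[typename](payload, store);
};

// Called on every other client when the subscription pushes a payload.
// The subscription selects __typename, so it can dispatch to the same handler.
const onSubscriptionPayload = (payload, store) => {
  const handler = taskHandlers[payload.__typename];
  if (handler) handler(payload, store);
};

const mutatorStore = { tasks: { t1: { id: 't1', dueDate: '2018-01-01' } } };
const subscriberStore = { tasks: { t1: { id: 't1', dueDate: '2018-01-01' } } };
const payload = {
  __typename: 'UpdateTaskPayload',
  task: { id: 't1', dueDate: '2018-02-01' },
};

onMutationResponse('UpdateTaskPayload', payload, mutatorStore);
onSubscriptionPayload(payload, subscriberStore);
```

Because both entry points end up in the same function with the same payload shape, the two stores cannot drift apart.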
In the code sample above, you probably noticed that I included an error object in the response for updateTask. It goes back to the timeless question, “If I succeed at failing, was I successful?”
Errors are the same way. I used to throw them, but that made it difficult to figure out if the error was something I threw or if it was something unexpected. If I threw it, I wanted to use it in a client-side error message, but if it was unexpected, I wanted a generic “Server Error” message to hide the gory details. By writing every mutation with a succeed-by-failing mentality, I can replace any thrown error with that generic message. I can also extend the returned error object to make it as helpful as possible, because nothing makes me hate an app more than hitting an error and not knowing how to fix it. To track what errors folks are seeing, you can even send an alert to your exception tracker whenever you return an error. I wrote a whole bunch on the topic in The Definitive Guide to Handling GraphQL Errors.
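Here is a rough sketch of the succeed-by-failing idea, assuming a resolver signature of `(source, args, context)`. The wrapper name `withSafeErrors`, the `reportToTracker` stand-in, and the example messages are all hypothetical; swap in whatever exception tracker you actually use.

```javascript
// Stand-in for an exception tracker: it just records messages in memory.
const reported = [];
const reportToTracker = (err) => reported.push(err.message);

// Wrap a mutation resolver so unexpected exceptions never leak to the client.
const withSafeErrors = (resolve) => async (source, args, context) => {
  try {
    const result = await resolve(source, args, context);
    // Expected failures come back as data, so they are safe to show to the user.
    if (result && result.error) reportToTracker(new Error(result.error.message));
    return result;
  } catch (err) {
    // Anything thrown is unexpected: record the details, return a generic message.
    reportToTracker(err);
    return { error: { message: 'Server Error' } };
  }
};

const updateTask = withSafeErrors(async (source, { task }) => {
  if (!task.dueDate) {
    // Succeed at failing: a helpful message the client can display as-is.
    return { error: { message: 'Please pick a due date' } };
  }
  if (task.id === 'boom') {
    // Simulates an unexpected failure with gory details.
    throw new Error('db connection refused');
  }
  return { task };
});
```

The client only ever has to check `error.message` on the payload, and the gory details stay on the server.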
In my original post, I advocated for breaking your queries into folders by entity type. Since then, my app has grown from medium-sized to large, and the hierarchy didn’t scale with it. Some mutation files were growing to well over 1000 LOCs, which was just a pain to look at. Now, I advocate for a flatter hierarchy of 4 folders: 1 for each of your queries, mutations, subscriptions, and types. Each file contains a single query (or type) and life is a lot simpler. Sure, there are a lot more files, but just get yourself an IDE that auto-imports as you type and you’ll never need to rummage through the folder. The only exception is for Connection and Edge types: I create those with a helper and export them from the same file as the base type.
The boilerplate advice is to use an interface for things that are related, and a union for things that aren’t, but have common fields (whatever that means). In practice, I tried using unions plenty of times, but always refactored them to interfaces. In fact, the only unions in my entire app are my subscription payloads (since they’re the amalgam of many mutations). Since interfaces can share fields, your queries will be cleaner since you can extract shared fields before you fragment on the specific types. Additionally, as your data structures get more complicated, you can sub-class them, which becomes very useful.
For example, let’s say I have a Vehicle, which is either a Car or a Truck. Every Vehicle has an Engine, but a Car has a CarEngine and a Truck has a TruckEngine:
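import {GraphQLInterfaceType, GraphQLObjectType} from 'graphql'

// Engine, CarEngine, and TruckEngine are defined elsewhere
const vehicleFields = () => ({
  engine: {type: Engine}
})

const Vehicle = new GraphQLInterfaceType({
  name: 'Vehicle',
  fields: vehicleFields
})

const Car = new GraphQLObjectType({
  name: 'Car',
  interfaces: [Vehicle],
  fields: () => ({
    ...vehicleFields(),
    engine: {type: CarEngine}
  })
})

const Truck = new GraphQLObjectType({
  name: 'Truck',
  interfaces: [Vehicle],
  fields: () => ({
    ...vehicleFields(),
    engine: {type: TruckEngine}
  })
})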
Sidenote: Thunkifying my shared fields makes schema changes pretty painless.
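To see why the thunk helps, here is a small sketch that runs without graphql-js by using plain strings in place of real types. The `wheels` field and the `carFields` and `truckFields` names are hypothetical, added only to show a shared field propagating to every subtype.

```javascript
// Plain strings stand in for GraphQL types so the snippet runs on its own.
const vehicleFields = () => ({
  engine: { type: 'Engine' },
  wheels: { type: 'Int' }, // add a shared field once, here
});

// Later keys win in an object spread, so each subtype can narrow `engine`
// while inheriting everything else from the shared thunk.
const carFields = () => ({
  ...vehicleFields(),
  engine: { type: 'CarEngine' },
});

const truckFields = () => ({
  ...vehicleFields(),
  engine: { type: 'TruckEngine' },
  bedSize: { type: 'Float' },
});
```

Since each thunk is only evaluated when it is called, the referenced types do not need to exist yet when the file is first loaded, which also sidesteps circular import headaches.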
Now, I can write my queries in a very concise manner, without the need for extra fragments. For example, if a truck engine is just a car engine with an auxiliary power unit, I can get everything I need with a single fragment, instead of having to fragment on both Engine and Truck.
vehicle {
  engine {
    horsepower
  }
  ... on Truck {
    bedSize
    engine {
      APU
    }
  }
}
While it doesn’t look like much here, this makes component fragments in your app much cleaner. It also means Relay-generated Flow types are as accurate as possible, which saves me from myself. You may be asking yourself, if 2 types are almost the same, why not just use the superset and leave a few fields blank? That was my strategy to avoid interfaces, and it served me for a while, until it didn’t. Given enough time, your types will evolve to the point where you’ll need to interface them. A good rule of thumb: if you know the best path forward, spend the extra time to do it right; if you don’t, pick the fastest path. For me, that means building interfaces from the start.
If you use TypeScript or Flow, chances are you’ve found yourself building typings that look a lot like pieces of your schema. For the Relay folks, you already get bespoke Flow types for every fragment you create, but those don’t include things like enums and query input variables. For things like these, I use gql2ts (the successor to gql2flow) to generate types for general use. You can even use these types on your Node server, which can be pretty helpful when writing more complex resolve functions. This single source of truth is a huge benefit because now when you extend an interface, you don’t have to remember to extend your Flow type, too.
With these tips, I hope you manage to avoid some of the time-consuming pitfalls that got me. GraphQL continues to be a great pleasure to use, and that’s largely because of the growing, active community around it. There are still plenty of best practices I haven’t touched on here, including schema stitching, internal schemas, using GraphQL to transport OT/CRDT changes, persisted queries, etc. If you’ve found some neat patterns yourself, or think some of mine are baloney, be sure to let me know!