The upside down world of DynamoDB

Written by jbesw | Published 2018/07/25


DynamoDB can be hard to grasp if you come from the relational database world — but the benefits for serverless apps are too good to ignore.

In the path to serverless, one of the hardest parts has been getting to grips with DynamoDB, the mysterious Amazon NoSQL database that’s so simple to learn it takes six months to master. It’s a beguiling, frustrating, wondrous tool that alternately has me crying with joy or yelling obscenities at my mouse.

I’ve worked all my career with various relational databases, so it’s been hard to walk away from baked-in patterns and practices and try to think about data in a completely different way. I’ve read many pieces about how it wasn’t for everyone, possibly isn’t for anyone, and if you try it, you’ll end up switching back to RDS. But we stuck with it and found it was essential for our serverless data needs.


Why not just use RDS?

There are a few reasons why relational databases and serverless don’t mix so well. First, if your function scales up dramatically — and it can, that’s the whole point — classic databases aren’t typically designed to handle large numbers of new connections appearing.

Back in the good ol’ days when we had servers, these connections were shared by many end-users, since the middle layer might use a single connection per server. This isn’t necessarily the case in the FaaS world, where your provider can spin up new instances of your function as needed and you might have more connections appearing than you expected.

Second, database connections use ‘keep alive’ pings to, well, keep the connection alive — but your Lambda function freezes when it’s not in use, so those pings stop. To the database, a connection that stops responding looks like one that has gone away. By using RDS, you’re forcing a connection model that just doesn’t work well with Lambda.

Third, relational databases don’t scale automatically. Many have phenomenal architectures and ways to handle scaling, but it’s not automatic. If you’re in the business of building scalable web/mobile apps, you can spend much more time managing a database than you might think. DynamoDB is a truly managed option when the design is right, and you’ll really appreciate this if you’re coding an app where the usage is spiking.

Still, RDS usage is probably fine for low-use serverless apps, but if you have any potential to scale up quickly, it’s all a little tenuous compared with the drop-in “it works” world of DynamoDB. It’s possible that Aurora Serverless may bridge this gap when it gets out of technical preview, but so far there’s no indication that the interaction will be different.

The benefits of DynamoDB for serverless.

The original Dynamo paper from 2007 explains the main objectives of the system that inspired DynamoDB, built for Amazon’s e-commerce store. High availability and consistent performance are the top two goals, and these are achieved at the cost of ‘traditional’ database properties such as strong consistency and transaction support.

As a result, DynamoDB shines when providing data support in serverless apps where the data needs are separated by user — user profiles, shopping carts, private collections of items. Your users can access and update these items at practically unlimited scale and your only choke point will be the throughput allocated on a table (which itself is auto-scalable).
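To make that concrete, here’s a minimal sketch in Python with boto3 of the kind of per-user lookup this pattern relies on. The userProfiles table name and userId key are illustrative assumptions, not a real schema.

```python
import boto3

# Hypothetical table: 'userProfiles', partition key 'userId' (string).
dynamodb = boto3.resource('dynamodb')
profiles = dynamodb.Table('userProfiles')

def get_profile(user_id):
    # A single-item lookup by partition key: the access pattern DynamoDB
    # is built for, and it stays fast no matter how large the table grows.
    response = profiles.get_item(Key={'userId': user_id})
    return response.get('Item')  # None if this user has no profile yet

def save_profile(user_id, attributes):
    # put_item replaces the whole item; only the key attribute is required.
    profiles.put_item(Item={'userId': user_id, **attributes})
```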

If you’re planning to launch a web or mobile app with potentially rapid growth, DynamoDB’s primary design features will serve you well. You don’t have to worry about managing connections, scaling or latency — it will, for the most part, ‘just work’.

Still, you have to be careful where users are accessing common data, even in simple cases — lookup tables or counters, for example — since the partitioning and throughput model can cause some unexpected side effects.

The World’s Fastest Primer on DynamoDB.

DynamoDB is designed to give you very consistent performance in an almost maintenance-free way if you create the tables properly, so it’s a natural fit for serverless apps (yet surprisingly hard to get right):

  • Compared to relational databases: tables are tables, records are items and fields are attributes. An attribute can be a flat value or a nested JSON-like structure. Each item can have different attributes, but no attribute can be a zero-length string.
  • The size of the table is potentially unlimited but access patterns are important to make that work well (the ‘hot key’ problem).
  • You can change or update any data except the partition and sort keys, and your choice of those keys is a critical design choice (see the sketch after this list). The keys are hashed to determine where items are stored, so ideally access should be spread evenly across your data set.
  • You pay for throughput (RCUs/WCUs) and storage space used. When you query on an item, the size of the item matters since you’ll be charged for the whole thing (even if you filter out or return a subset of the data).
  • Local Secondary Indexes must use the original partition key (but different sort key) and must be defined when the table is created. They also share the table’s throughput (often forgotten).
  • Global Secondary Indexes can be created later on a new partition key, but they use their own throughput. GSI keys do not have to be unique. The naming is unhelpful and neither index works the way you would expect.
  • There are no traditional SQL aggregate operators like sum, average or count, and no group by clauses. Mostly it’s select something (projections) from somewhere (tables) where some condition is satisfied (partition ID, sort key and filters). And then the rest is up to you.
  • Consistency is something you pay for if you really need it. Without it, you might find a query returns stale data if an update just happened. In many cases this doesn’t matter but you wouldn’t want to run a banking app this way.
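To pull a few of those points together — keys, provisioned throughput, and a GSI that carries its own throughput — here’s a rough sketch using boto3. The orders table, the byStatus index and the capacity numbers are all made up for illustration.

```python
import boto3

dynamodb = boto3.resource('dynamodb')

# Hypothetical 'orders' table: customerId is the partition key,
# orderId the sort key. Both are unchangeable once the table exists.
table = dynamodb.create_table(
    TableName='orders',
    KeySchema=[
        {'AttributeName': 'customerId', 'KeyType': 'HASH'},   # partition key
        {'AttributeName': 'orderId', 'KeyType': 'RANGE'},     # sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'customerId', 'AttributeType': 'S'},
        {'AttributeName': 'orderId', 'AttributeType': 'S'},
        {'AttributeName': 'orderStatus', 'AttributeType': 'S'},  # used by the GSI
    ],
    # You pay for throughput: read/write capacity units on the base table...
    ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5},
    # ...and a Global Secondary Index has its own, separate throughput.
    GlobalSecondaryIndexes=[{
        'IndexName': 'byStatus',
        'KeySchema': [{'AttributeName': 'orderStatus', 'KeyType': 'HASH'}],
        'Projection': {'ProjectionType': 'ALL'},
        'ProvisionedThroughput': {'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5},
    }],
)
table.wait_until_exists()
```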

Some query patterns work really well — shopping carts, for example. You have a single item in a carts table mapped to a user ID, containing a nested JSON list of items in the cart. It’s great because it will scale easily, the lookup will be blazingly fast and the eventual consistency model shines in this kind of use case.
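A hedged sketch of that cart pattern, again with boto3 and an assumed carts table keyed by userId:

```python
import boto3

dynamodb = boto3.resource('dynamodb')
carts = dynamodb.Table('carts')  # assumed: partition key 'userId', no sort key

def save_cart(user_id, items):
    # One item per user; the cart contents live in a nested list attribute.
    carts.put_item(Item={
        'userId': user_id,
        'items': items,  # e.g. [{'sku': 'book-123', 'qty': 2}]
    })

def get_cart(user_id):
    # Eventually consistent by default, which is fine for a cart view.
    response = carts.get_item(Key={'userId': user_id})
    return response.get('Item', {}).get('items', [])
```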

Even some classic relational database patterns are fairly obvious — like returning a list of orders for a customer. Here you simply make the customerId the partition key, the orderId the sort key, and you can quickly pull the list of orders.
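For illustration, that query might look like this against the hypothetical orders table sketched earlier:

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
orders = dynamodb.Table('orders')  # hypothetical table from the earlier sketch

def orders_for_customer(customer_id):
    # One partition key, many sort keys: a single Query pulls back every
    # order for this customer without touching any other partition.
    response = orders.query(
        KeyConditionExpression=Key('customerId').eq(customer_id)
    )
    # Large result sets are paginated; follow LastEvaluatedKey if present.
    return response['Items']
```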

But… if you want to know how many orders there are, or the average size of an order… well, we can’t do that. In fact, we can’t even tell you the total number of items in the table without a table scan or an insanity-inducing counter system. Welcome to the upside-down world of DynamoDB.
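If you really do need a count, the usual workaround is exactly that counter system: keep a separate item and increment it atomically on every write. A sketch, assuming a counters table keyed by counterId:

```python
import boto3

dynamodb = boto3.resource('dynamodb')
counters = dynamodb.Table('counters')  # assumed: partition key 'counterId'

def increment_order_count(by=1):
    # ADD performs an atomic increment, so concurrent Lambda invocations
    # won't lose updates. A single hot counter item can still become a
    # throughput bottleneck under heavy write load.
    response = counters.update_item(
        Key={'counterId': 'totalOrders'},
        UpdateExpression='ADD orderCount :inc',
        ExpressionAttributeValues={':inc': by},
        ReturnValues='UPDATED_NEW',
    )
    return response['Attributes']['orderCount']
```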

Think of your DynamoDB table as a filing cabinet

Coming from relational databases, I love tables and believe they can solve most of the world’s ills (well, almost). But what DynamoDB calls a table doesn’t seem much like one — it has no columns, no required values beyond the keys, and its indexing behavior reveals it’s not much like a table at all. I think it’s more like a filing cabinet.

Imagine this: DynamoDB is run by the world’s fastest librarian — Alice — who bends the laws of physics and the Dewey Decimal System, capable of finding and organizing your data in a dependably snappy way. When you create a table, you tell the librarian two things — an ID (partition key) and a range (sort key) — and she sets up some space for you.

A DynamoDB table is a filing cabinet where the partition key tells Alice which drawer to search, and the sort key identifies a particular file within that drawer. The file itself can contain many documents or nothing at all. But critically, when the librarian comes back with your files, she doesn’t look inside, peek at the data or even care what’s there.

The size of the drawer is limited (10GB), and it can only be opened and closed so fast (3,000 RCUs / 1,000 WCUs), regardless of the number of files. If you go over these limits, Alice will find you another drawer. But once you have more than a single drawer, you will never shrink back to one.

Alice will always find space for your information but is motivated by charging you for the combined weight of all the files brought back for each of your requests. The librarian does not care if the files contain things you didn’t want.

If you tell your librarian from the outset that you need another index, you’ll be given an LSI — this has the same partition key (drawer) but allows you to put more than one label on each file. This is why the LSI uses the same throughput resources as the main table, because it’s part of that table.

But once you have an allocated filing cabinet, what happens when you ask Alice for another index? You’ve already been set up so the librarian does the next best thing — she finds you another filing cabinet that contains a read-only duplicate of some of the items in the main drawer. This is why GSIs use additional resources outside of the table, and why you cannot update items retrieved via a GSI.

It’s a brave new world.

For our team, learning serverless apps has been about discovering the right tools for the job. DynamoDB has provided a critical piece in allowing consistent performance and reliable scaling in customer-facing data access, but there is a significant learning curve in getting the design right.

