In this article, we'll dive deep into the basics to help you decide whether AWS RDS is the right choice for your architecture, and to help you hit the ground running if you do end up using AWS RDS.
For many decades now, relational databases have been the place to store your data. They are pretty flexible and usually use some kind of SQL dialect, which is one of the main languages taught in computer science classes and is widely understood by the average developer.
Before the NoSQL movement started in 2009, there wasn't a question of whether relational databases were a good fit. The mainstream hosting providers didn't offer any alternatives.
Amazon Relational Database Service, or AWS RDS for short, is AWS's relational database offering and one of the first cloud services they had in their portfolio. It's also the most mature one. But this means that it came long before the serverless movement at AWS started, so it isn't the first pick for serverless architectures.
If you want to leverage all your SQL knowledge but still get some serverless benefits in your next application, read on! This article will talk about AWS RDS and how it could be the database of choice.
First things first, RDS is just a service on top of EC2. While this might also be true for DynamoDB, the abstractions for RDS are much less sophisticated. In its most basic configuration, you don't tell AWS that you need a database with a specific number of tables; you tell them how many EC2 instances you need and what type of database should be installed on them.
For example, say you need a cluster of four memory-optimized EC2 instances with MySQL installed. Once it's running, you can use the AWS RDS endpoint and login credentials with a MySQL client of your choice to connect from your application. AWS will then keep track of OS and DB updates for you and keep the cluster running, starting and stopping instances as your traffic changes over time.
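As a minimal sketch of what "endpoint and login credentials" means on the application side, the helper below collects connection settings from environment variables. The variable names (`RDS_HOST`, `RDS_USER`, and so on), the default database name, and the five-second timeout are assumptions made for illustration, not anything AWS prescribes.

```python
import os


def rds_connection_settings(env=os.environ):
    """Collect settings for an RDS MySQL endpoint from environment variables.

    The variable names are hypothetical; use whatever your deployment provides.
    """
    return {
        "host": env["RDS_HOST"],  # the endpoint shown for your RDS instance or cluster
        "port": int(env.get("RDS_PORT", "3306")),  # 3306 is the MySQL default port
        "user": env["RDS_USER"],
        "password": env["RDS_PASSWORD"],
        "database": env.get("RDS_DATABASE", "app"),  # assumed default name
        "connect_timeout": 5,  # seconds; fail fast instead of hanging
    }
```

The resulting dictionary can be handed to the MySQL client you prefer. PyMySQL, for example, accepts these keyword names in `pymysql.connect(**settings)`, while other clients may spell the timeout option differently.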
This also means that you have to deal with VPCs when using AWS RDS. A common issue with AWS Lambda to AWS RDS connections is timeouts. If your Lambda isn't in the same VPC as your RDS cluster, the function might not be able to reach it at all. The VPC binding is optional for Lambda functions, and you usually want to avoid it because it comes with a noticeable performance cost. But if you want to access a relational database, you have to bite the bullet.
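When a function times out while connecting, it helps to check first whether the database endpoint is reachable from the function's network at all. The following is a rough sketch of such a probe using only Python's standard library. The function name `can_reach` and the three-second default timeout are illustrative choices.

```python
import socket


def can_reach(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike.
        return False
```

Calling something like `can_reach(endpoint, 3306)` from inside the function tells you whether the problem is networking, such as VPC, subnet, or security group settings, or something further up the stack like credentials.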
Basic AWS RDS is billed for every hour your instances run, even if nobody is using them. Not the serverless way, right? But at least you know upfront what you'll have to pay.
Yes, AWS RDS also has a free tier: one instance and 20GB of general-purpose storage for one year. It's only a tiny db.t2.micro instance, which means one core and 1GB of memory, but it should be enough for a small or experimental workload.
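To make the hourly model concrete, here is a small back-of-the-envelope sketch. The hourly rate in the usage note is a made-up number, not an actual AWS price, and 730 is just a common approximation for the hours in a month.

```python
def monthly_instance_cost(hourly_rate_usd, instance_count=1, hours_per_month=730):
    """Estimate the monthly bill for instances that run around the clock.

    There is no traffic parameter on purpose: with hourly billing,
    an idle instance costs the same as a busy one.
    """
    return hourly_rate_usd * instance_count * hours_per_month
```

With a hypothetical rate of 0.10 USD per hour, one instance comes to roughly 73 USD per month and a four-instance cluster to roughly 292 USD, whether or not a single query is ever executed.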
A few years ago, AWS released its own database engine, Amazon Aurora. It's an alternative to MySQL and PostgreSQL, and it's API compatible with both. According to AWS, it's faster than both open-source database engines on the same instance types.
While this meant some savings for people who already used this kind of database, Aurora later became part of AWS' serverless strategy. In 2018, they released Aurora Serverless, a particular configuration of Aurora that allows for on-demand billing and scaling.
With Aurora Serverless, you can use all your SQL skills and still have some of the serverless benefits known from DynamoDB. Sadly, not all of them. While it frees you from thinking about instances and lets you think more abstractly in capacity units, the experience won't be as seamless as with DynamoDB.
Aurora Serverless allows scaling down to zero capacity, but AWS discourages this for production environments because it takes a few seconds to scale up and down. Also, the fact that this service is only available with the Aurora engine means it won't work with SQL Server or an Oracle DB.
Aurora Serverless has also improved lately, with more of its non-serverless restrictions removed.
RDS is the base service, Aurora is a database engine you can use with RDS, and Aurora Serverless is a special serverless configuration for Aurora.
RDS lets you scale up and down, too, so you can use these essential features with other database engines, but it can't go down to zero, and it's much slower in doing so. Aurora Serverless can scale up and down in under 30 seconds.
While billing and scaling are huge problems serverless tries to solve, there is another problem relational databases pose to serverless applications: their long-lived connections.
Their creators built these databases with a long-running app server in mind. This means that considerable resources for a connection are set up on both the client and the server side at startup, and the app reuses that connection for all requests.
Things are the opposite with serverless backends built on AWS Lambda. A function has to connect to the database again on every cold start, and in the worst case for every event it handles. The overhead of establishing all these connections can exhaust a database server quickly.
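A common way to limit the damage is to open the connection outside the handler so that warm invocations can reuse it. The sketch below illustrates the pattern with an in-memory SQLite database standing in for the real RDS connection, purely so the example runs anywhere. The names `get_connection` and `connections_opened` are illustrative.

```python
import sqlite3

# Module-level state survives between invocations while the
# execution environment stays warm.
_connection = None
connections_opened = 0  # counter for illustration only


def get_connection():
    """Open the connection once per cold start and reuse it afterwards."""
    global _connection, connections_opened
    if _connection is None:
        # Stand-in for an expensive network connection to RDS.
        _connection = sqlite3.connect(":memory:")
        connections_opened += 1
    return _connection


def handler(event, context):
    """Lambda-style handler that reuses the cached connection."""
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"result": row[0], "connections_opened": connections_opened}
```

Calling `handler` twice in the same process opens only one connection. This only helps within a single execution environment, though. Every concurrent environment still holds its own connection, so a traffic spike can still flood the database.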
RDS Proxy tries to solve this issue by placing a proxy server between the database and the Lambda functions. This proxy server is now the one that gets hammered by Lambda invocations that constantly establish new connections to the database. The actual database server only has to keep the proxy connected and has more resources to do other work.
Amazon RDS Proxy is now generally available for Aurora MySQL, Aurora PostgreSQL, RDS MySQL, and RDS PostgreSQL.
With its latest update, Dashbird now includes monitoring RDS instances, clusters, and proxies. So, it's not only DynamoDB anymore if you want to build something with Dashbird insights!