Why Are Databases Exposed as APIs?

Written by datastax | Published 2023/04/17

TL;DR: Stargate is an open source data API gateway that simplifies how developers work with database APIs. It provides an abstraction layer that hides the complexities of the database and exposes data through APIs in various styles, including gRPC, REST and others.

An application isn’t worth much until it’s put into production. For developers, getting to this point quickly means easy access to data they need to build without having to worry about the details of spinning up, managing, and maintaining databases.

APIs have become the de facto standard for connecting applications to databases, but it wasn’t always that way. Here we’ll discuss what has changed in the software world to elevate the importance of exposing databases as APIs. We’ll also discuss Stargate, the open source project that simplifies how developers can work with these APIs. The latest version of Stargate, which includes improvements in scalability and flexibility, was released recently.

How We Build Software That Uses Databases

Database administrators (DBAs) used to be the ones in charge of queries, because it took a database expert to design the way developers interacted with data. Even so, until recently developers were still expected to know a lot about interacting with databases, even if they weren’t experts at SQL, queries or data models.

I was a proficient developer in the 1990s, but databases intimidated me. It took years for me to feel comfortable working with SQL. Simply offering an “SQL lookalike” isn’t enough, either. Take CQL (Cassandra Query Language), which was developed to offer a query language similar to SQL for communicating with Apache Cassandra. The idea was to provide an abstraction for communicating with Cassandra, making it easier for those comfortable with relational databases to do so.

But in the network world, we have new concepts of identity, permissions and security that are completely separate from the query language. The notion of a driver simply communicating on a binary protocol doesn’t work well in the cloud. HTTP is the foundational application layer protocol for the cloud, but most HTTP-based APIs are slow. A low-latency option like gRPC is critical for real-time communications in distributed systems.

How Software Talks to Software

The standard way that clients or app servers used to talk to databases involved drivers that ran within the data center. Now everything runs in the cloud, and developers write in a wide range of languages and frameworks, such as JavaScript or Python. The means by which software accesses data therefore needs to be abstracted differently from classic database drivers: through APIs (JSON, gRPC, GraphQL or Document) that developers are comfortable with.

The modern way that software talks to software is an API; it’s the abstraction layer that hides the complexities of the database.
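As a rough illustration of that abstraction, the sketch below builds (but does not send) an HTTP request for a single row. The host, path layout, token header and table names are hypothetical placeholders rather than any product's documented API; the point is only that the client needs nothing beyond HTTP and JSON.

```python
import urllib.request
from urllib.parse import quote


def build_row_request(base_url, keyspace, table, key, token):
    """Build a GET request for a single row from a hypothetical data API."""
    # Hypothetical path layout: /v2/keyspaces/<keyspace>/<table>/<primary key>
    url = f"{base_url}/v2/keyspaces/{quote(keyspace)}/{quote(table)}/{quote(key)}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Accept": "application/json",
            "X-Auth-Token": token,  # hypothetical header name
        },
    )


req = build_row_request("https://data.example.com", "shop", "users", "alice", "secret")
```

Passing `req` to `urllib.request.urlopen` would send it. No database driver, connection pool or query language appears anywhere on the client side.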

The Nature of Data

Data used to be much more uniform; it fit neatly into rows and columns in database tables. But the dynamics of data have changed. Data now moves between in-memory representations and the database seamlessly, without much intervening software. And we’re dealing with new data formats that are much richer than the data primitives people used to deal with, or the half-dozen or so data types that SQL could handle.
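To make the contrast concrete, here is a small sketch of a nested JSON-style document that does not fit a single flat row, together with one possible way to break it into path/value pairs. The document and the flattening scheme are purely illustrative assumptions, not a description of how any particular database stores documents.

```python
def flatten(doc, prefix=()):
    """Yield (path, value) pairs for every leaf in a nested JSON-like value."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, prefix + (key,))
    elif isinstance(doc, list):
        for index, value in enumerate(doc):
            yield from flatten(value, prefix + (index,))
    else:
        # A scalar leaf: emit the path that leads to it.
        yield prefix, doc


# Hypothetical order document with nested objects and arrays.
order = {
    "id": 42,
    "customer": {"name": "Ada", "tags": ["vip", "beta"]},
    "items": [{"sku": "A1", "qty": 2}],
}
rows = list(flatten(order))
```

Each entry in `rows` pairs a path such as `("customer", "tags", 0)` with a scalar value, which is the kind of translation developers would otherwise have to write by hand.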

Database APIs

APIs are how today’s developers work with databases. Here’s a summary of why:

  • HTTP is the network protocol of the cloud. Many developers are already familiar with web APIs, and using HTTP makes cloud application deployment much easier.
  • There’s no need to install and run databases locally. Installing a database locally requires effort and creates yet another environment in which issues have to be debugged.

A Gateway to Simplicity, Scalability and Extensibility

A data API gateway is a piece of software infrastructure that provides access to data via APIs of various styles including REST, gRPC and others. The gateway abstracts the details of storing and retrieving data using one or more persistent stores. This enables application developers to focus on writing business services that access data via easy-to-use APIs instead of having to learn the intricacies of a database query language.
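The idea can be sketched in a few lines. The example below is a toy gateway, not Stargate's implementation: the class names, the `/<table>/<key>` route shape and the in-memory store are all assumptions made for illustration. It shows the application speaking only in HTTP-style verbs and JSON while the gateway decides how data is stored.

```python
import json


class InMemoryStore:
    """Stand-in for a persistent store; a real gateway would wrap a database."""

    def __init__(self):
        self._tables = {}

    def put(self, table, key, row):
        self._tables.setdefault(table, {})[key] = row

    def get(self, table, key):
        return self._tables.get(table, {}).get(key)


class DataGateway:
    """Maps HTTP-style requests onto store operations (hypothetical routes)."""

    def __init__(self, store):
        self._store = store

    def handle(self, method, path, body=None):
        parts = path.strip("/").split("/")
        if len(parts) != 2:
            return 400, json.dumps({"error": "expected /<table>/<key>"})
        table, key = parts
        if method == "PUT":
            try:
                row = json.loads(body)
            except (TypeError, ValueError):
                return 400, json.dumps({"error": "body must be JSON"})
            self._store.put(table, key, row)
            return 200, json.dumps({"ok": True})
        if method == "GET":
            row = self._store.get(table, key)
            if row is None:
                return 404, json.dumps({"error": "not found"})
            return 200, json.dumps(row)
        return 405, json.dumps({"error": "method not allowed"})


gateway = DataGateway(InMemoryStore())
put_status, _ = gateway.handle("PUT", "/users/alice", '{"name": "Alice"}')
get_status, get_body = gateway.handle("GET", "/users/alice")
```

Swapping `InMemoryStore` for a class backed by a real database would not change the calling code, which is the property the gateway is meant to provide.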

Stargate is an open source data API gateway that sits between an app and the databases it needs to query. It was first introduced in September 2020, and Apache Cassandra was chosen as the first database in part because it solves the world’s hardest scalability and availability challenges.

A data API gateway is a powerful way to enable your developers to work in frameworks and structures they’re familiar with, providing a range of tradeoffs between productivity and performance. Stargate offers the power of Cassandra by presenting REST, Document, and GraphQL as simple APIs. It also offers a set of gRPC libraries for doing CQL over gRPC as an easier, lightweight and more cloud-friendly alternative to native drivers for CQL without sacrificing performance.
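To give a feel for how these styles differ, the sketch below expresses one read ("fetch user alice") in three ways. The keyspace, table, paths and query shapes are generic illustrations chosen for this example; they are assumptions, not Stargate's documented request formats.

```python
import json

KEYSPACE, TABLE, USER_ID = "shop", "users", "alice"  # hypothetical names


def as_rest():
    """REST style: the resource is identified by the URL path."""
    return {"method": "GET", "path": f"/keyspaces/{KEYSPACE}/{TABLE}/{USER_ID}"}


def as_graphql():
    """GraphQL style: one endpoint, with the query carried in a JSON body."""
    query = "query ($id: String!) { users(id: $id) { id name } }"
    return {
        "method": "POST",
        "path": f"/graphql/{KEYSPACE}",
        "body": json.dumps({"query": query, "variables": {"id": USER_ID}}),
    }


def as_cql():
    """Query-language style: a CQL statement with a bound parameter."""
    return {
        "statement": f"SELECT id, name FROM {KEYSPACE}.{TABLE} WHERE id = ?",
        "values": [USER_ID],
    }


requests_by_style = {"rest": as_rest(), "graphql": as_graphql(), "cql": as_cql()}
```

The REST form is the simplest to call from any HTTP client, the GraphQL form lets the caller pick fields, and the CQL form keeps the full query language available, for example when carried over gRPC.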

The Stargate engineering team at DataStax has been working on an architectural update to Stargate. Stargate v2, as we’re calling it, was released in October 2022. Based on feedback from the Stargate developer community, we’ve made some significant updates in Stargate v2 that make it easier for developers to use and for the community to contribute to the project. Most importantly, Stargate v2’s high-performance gRPC API can offer speed that’s equivalent to native database drivers. This enables developers to use cloud-friendly network protocols like HTTP to connect apps and databases, with no performance loss over the wire.

Extensible and Adaptable

No top-down strategy or API flavor of the month ever survived contact with a large pool of developers. A key goal of Stargate v2 is to enable the community to add new API services quickly and easily by making the implementation itself easier to understand, debug, enhance and extend.

Adding a new API service is much simpler now, and the source code for the REST, GraphQL and Document API services provides developers with instructive example code that shows what a finished API service should look like.

The API tier should be multimodel; developers want to find their preferred API readily available, rather than be forced to adapt their development work to a different API. And if the API isn’t available, the API tier should be able to adapt.
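One way to picture an adaptable API tier is as a registry of API services that share a single data layer. The sketch below is a generic plugin pattern under that assumption; `ApiTier`, `register` and the two example services are hypothetical names and do not reflect Stargate's actual extension mechanism.

```python
from typing import Callable, Dict

# A handler takes a request payload and the shared data layer, returns a response.
Handler = Callable[[dict, dict], dict]


class ApiTier:
    """Registry of API services that all share one data layer (hypothetical)."""

    def __init__(self, data: dict):
        self._data = data
        self._services: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._services[name] = handler

    def call(self, name: str, request: dict) -> dict:
        if name not in self._services:
            return {"status": 404, "error": f"no API service named {name!r}"}
        return self._services[name](request, self._data)


def rest_service(request: dict, data: dict) -> dict:
    row = data.get(request["key"])
    return {"status": 200 if row is not None else 404, "row": row}


def count_service(request: dict, data: dict) -> dict:
    # A "new" API style added later without touching rest_service.
    return {"status": 200, "count": len(data)}


tier = ApiTier({"alice": {"name": "Alice"}})
tier.register("rest", rest_service)
before = tier.call("count", {})  # not registered yet
tier.register("count", count_service)
after = tier.call("count", {})
```

The second service is added without modifying the first, which is the sense in which the tier adapts when a developer's preferred API is missing.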

Cloud Native

Throwing code in front of a database has been done for years. But actually building a platform that can scale with your database, and be adaptable and reliable, is something new. If you’re using Cassandra, you’re probably already running a high-growth app, or aspiring to. So everything that sits in front of it must facilitate a high-growth environment. Stargate began life as a fork of Cassandra coordinator code, so it inherits much of the reliability and availability for which Cassandra is well known.

The API tier has to be fully capable of operating at scale, so another Stargate v2 goal was to make it more cloud-friendly. Several changes facilitate Stargate’s scalability:

  • Stargate is now fully containerized and runs within Kubernetes pods, which gives operators more control over how workloads can scale.

  • The API services have been moved out of the monolithic Stargate node into separate microservices, which will enable each API to be scaled independently.

  • Storage nodes and coordinator nodes are independently deployable and scalable, also giving operators more control over how workloads can scale.

If a workload is query- or storage-intensive, it can be tuned without scaling the entire cluster.
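A back-of-the-envelope calculation shows why independent scaling matters. The request rates, the per-replica capacity and the simplified monolith model below (every node runs every service, and services do not compete for a node's resources) are all made-up assumptions for illustration.

```python
import math


def replicas_needed(load, capacity_per_replica):
    """Replicas required to serve `load` requests per second."""
    return max(1, math.ceil(load / capacity_per_replica))


# Hypothetical requests/second per API service and per-replica capacity.
loads = {"rest": 900, "graphql": 150, "document": 300}
CAPACITY = 200

# Independent microservices: each service gets only the replicas it needs.
independent = {name: replicas_needed(load, CAPACITY) for name, load in loads.items()}

# Monolith: every node bundles all services, so the busiest service sets the
# node count and every other service is replicated just as many times.
monolith_nodes = max(independent.values())
monolith_instances = monolith_nodes * len(loads)
independent_instances = sum(independent.values())
```

Under these assumptions the monolith runs 15 service instances where 8 would do, because the lightly used services are scaled along with the busiest one.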

Developing with Something Familiar

Cloud data services have become the dominant theme in the technology world, so it’s no surprise that developers tend to think in terms of data abstractions like JSON instead of idioms unique to particular databases. Stargate is the culmination of a lot of hard work to meet developers where they are, enabling them to work in frameworks and structures that they’re familiar with.

Learn more about Stargate here.

By Ed Anuff.



Published by HackerNoon on 2023/04/17