One of our goals at DataStax is to enable every developer — regardless of the language they build in — to deliver AI applications to production as fast as possible.
Cassandra is well-known to be the most powerful, scalable, and production-ready database. With the addition of vector search, Cassandra and Astra DB have become a critical foundation for building enterprise-grade Gen AI applications. But we also need to ensure that this powerful technology is accessible and easy to use for the broadest set of developers, regardless of their preferred language or skill set.
Today we take a big step in that direction by providing the huge population of JavaScript developers access to the most powerful vector database in the world, through a simple API: introducing the JSON API for Astra DB.
In the JavaScript world, document databases are prominent. That is no surprise: JSON is JavaScript’s native notation, so being able to store and retrieve JSON documents directly speeds up development considerably.
The new JSON API is designed to provide a smooth developer experience for JavaScript developers creating new AI applications. We set out to ensure that if you’re a JavaScript developer, you can fire up an instance of Astra DB and start coding right away using paradigms and frameworks you’re familiar with.
Exposing Astra DB as a document database provides multiple improvements in developer experience:
You think in terms of JSON objects, which gives you natural alignment with the JavaScript ecosystem.
There is no data modeling step, as it’s taken care of by the database itself. You just save and retrieve documents.
You can start developing quickly, and focus on the application logic rather than on what is happening on the backend.
We also noticed that many members of the JavaScript community work with document databases through object data modeling (ODM) libraries, specifically MongooseJS. MongooseJS is a popular framework for object modeling on top of document databases.
The new JSON API for Astra DB is fully compatible with MongooseJS. This means that it takes just a couple lines of code to point MongooseJS to an Astra DB instance:
// Import MongooseJS.
const mongoose = require("mongoose");
// Import the driver for Astra DB (shipped as a part of stargate.io).
const { driver } = require("stargate-mongoose");
// Tell MongooseJS to use the Astra DB driver instead of the default one.
mongoose.setDriver(driver);
// Connect to Astra DB.
await mongoose.connect(astraDbUri, {
  isAstra: true,
});
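Note that the snippet above uses `await` at the top level, which CommonJS modules (the kind that use `require`) do not allow. A minimal sketch of one way to wrap the same calls in an async function follows. The helper name `connectToAstra` and the `ASTRA_DB_URI` environment variable are assumptions made for illustration, not part of the driver; `mongoose` and `driver` are passed in as arguments so that the helper can also be exercised with stand-in objects.

```javascript
// Hypothetical helper: wraps the connection steps shown above in an async
// function. `mongoose` and `driver` are the objects obtained from
// require("mongoose") and require("stargate-mongoose").
async function connectToAstra(mongoose, driver, astraDbUri) {
  if (!astraDbUri) {
    throw new Error("Missing Astra DB connection URI");
  }
  // Swap in the Astra DB driver before opening the connection.
  mongoose.setDriver(driver);
  // isAstra: true is the option used in the snippet above.
  await mongoose.connect(astraDbUri, { isAstra: true });
  return mongoose;
}

// Usage (assumes the URI is stored in an ASTRA_DB_URI environment variable):
// const mongoose = require("mongoose");
// const { driver } = require("stargate-mongoose");
// connectToAstra(mongoose, driver, process.env.ASTRA_DB_URI)
//   .then(() => console.log("connected"))
//   .catch((err) => { console.error(err); process.exit(1); });
```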
Once connected, you can use MongooseJS APIs, and Astra DB will take care of the heavy lifting of storing your documents in an efficient way, indexing them, and scaling out when needed.
Even better, when developing with MongooseJS backed by Astra DB, you get full access to Astra DB Vector, the only database designed for simultaneous search and update on distributed data and streaming workloads with ultra-low latency, as well as highly relevant vector results that eliminate redundancies. As a result, you get the ease of use and familiarity of MongooseJS, combined with the rich vector support and scalability of Astra DB. Developing AI applications in JavaScript has never been easier!
Let’s take a look at a simple example of how to use Astra DB’s vector search within a MongooseJS application. In this example, we’ll create a collection of movies with their text descriptions and some other information, like title, production year, and genre. In addition, we’ll instruct MongooseJS that we want to store vector embeddings for the descriptions. Here is how the model definition will look:
const Movie = mongoose.model(
  "Movie",
  new mongoose.Schema(
    {
      title: String,
      year: Number,
      genre: String,
      description: String,
      $vector: {
        type: [Number],
        validate: (vector) => vector && vector.length === 1536,
      },
    },
    {
      collectionOptions: {
        vector: {
          size: 1536,
          function: "cosine",
        },
      },
    },
  ),
);
Those familiar with MongooseJS will find this to be a typical MongooseJS model, except for two additional pieces that Astra DB’s driver allows for:
$vector field that is used to store vector embeddings.
collectionOptions.vector object that tells Astra DB how to index the vector embedding field.
With the model above, you can insert documents along with the embeddings:
await Movie.create({
  title: "In the Border States",
  year: 1910,
  genre: "Drama",
  description: "In the Border States is a 1910 American drama film...",
  // Generate the embedding for the description,
  // for example by invoking the OpenAI API.
  $vector: await embedding("In the Border States is a 1910 American drama film..."),
});
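The `embedding` function in the insert is left undefined in this post. A rough sketch of what it might look like is shown here, under several assumptions that are not from the original: it calls the OpenAI embeddings REST endpoint, uses the `text-embedding-ada-002` model (whose 1536-dimensional output matches the schema), reads the key from an `OPENAI_API_KEY` environment variable, and relies on the global `fetch` available in Node.js 18 and later. The two helper names are illustrative only.

```javascript
// Sketch of the `embedding` helper. Assumptions: OpenAI embeddings REST
// endpoint, text-embedding-ada-002 (1536 dimensions), OPENAI_API_KEY env var.
const EMBEDDING_SIZE = 1536;

// Build the HTTP request for one piece of text.
function buildEmbeddingRequest(text, apiKey) {
  return {
    url: "https://api.openai.com/v1/embeddings",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
    },
  };
}

// Pull the vector out of the parsed response body and check its length.
function parseEmbeddingResponse(json) {
  const vector = json?.data?.[0]?.embedding;
  if (!Array.isArray(vector) || vector.length !== EMBEDDING_SIZE) {
    throw new Error("Unexpected embedding response");
  }
  return vector;
}

// Request an embedding for `text`; uses the global fetch (Node.js 18+).
async function embedding(text, apiKey = process.env.OPENAI_API_KEY) {
  const { url, options } = buildEmbeddingRequest(text, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }
  return parseEmbeddingResponse(await response.json());
}
```

Because a helper like this returns a promise, callers need `await embedding(...)` wherever a vector is expected.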
Your application can now let users enter a free-form query to search movies by their descriptions. To do that, generate an embedding for the user’s query with the same embedding model you used for the descriptions, then use Astra DB’s vector search to find the most relevant entries in the database:
await Movie.find({})
  .sort({ $vector: { $meta: await embedding("Something funny") } })
  .limit(3);
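To make the ranking more concrete: the collection was created with `function: "cosine"`, so results are presumably ordered by cosine similarity between the query embedding and each stored `$vector`. The sketch below computes that ordering by brute force over an in-memory array. It only illustrates the metric and is not how Astra DB executes the query; the names `cosineSimilarity` and `topK` are made up for this example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents whose $vector is most similar to queryVector.
function topK(docs, queryVector, k) {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(doc.$vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ doc }) => doc);
}

// Toy 3-dimensional vectors stand in for the 1536-dimensional embeddings.
const movies = [
  { title: "A", $vector: [1, 0, 0] },
  { title: "B", $vector: [0.9, 0.1, 0] },
  { title: "C", $vector: [0, 1, 0] },
];
const nearest = topK(movies, [1, 0, 0], 2).map((m) => m.title);
console.log(nearest); // [ 'A', 'B' ]
```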
Of course, in many cases vector search on its own isn’t enough, because you also want to filter on other fields in the document. For example, here is a query like the previous one, restricted to dramas:
await Movie.find({ genre: "Drama" })
  .sort({ $vector: { $meta: await embedding("Criminals and detectives") } })
  .limit(3);
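Since the plain and the filtered query differ only in the filter passed to `find`, an application might fold both into one small function. The following is a sketch built solely from the query syntax shown above; `findSimilar` and its option names are hypothetical and not part of MongooseJS or the Astra DB driver.

```javascript
// Hypothetical helper built only from the query syntax shown above.
// `Model` is a Mongoose model such as Movie; `queryVector` is an embedding
// with the same dimension as the collection's vectors.
function findSimilar(Model, queryVector, { filter = {}, limit = 3 } = {}) {
  return Model.find(filter)
    .sort({ $vector: { $meta: queryVector } })
    .limit(limit);
}

// Usage, assuming the Movie model and an async `embedding` helper:
// const queryVector = await embedding("Criminals and detectives");
// const dramas = await findSimilar(Movie, queryVector, {
//   filter: { genre: "Drama" },
// });
```

Defaulting `filter` to an empty object keeps the pure vector search as the simple case, while any field filter turns the same call into a hybrid query.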
With MongooseJS and Astra DB, you are not limited to simple CRUD operations. You can accompany them with relevancy search using vectors, or even combine the two into powerful hybrid search queries.
The new JSON API is currently in public preview and is available on Astra DB to anyone who wants to try it out. Follow these three simple steps to get started:
Go to Astra DB and create a vector database.
Once the database is active, switch to the “Connect” tab, choose “JSON API” as your preferred method, and follow the instructions.
Enjoy developing!
More details on how to use the JSON API can be found in the documentation.
By introducing the JSON API, our vision is clear: we want Astra DB to be the first choice for JavaScript developers building AI applications. This is just the beginning — stay tuned for further improvements and additions.
Got questions, feedback, or maybe you’re just as excited as we are? Drop us a message at [email protected].
By Val Kulichenko, DataStax