Artificial intelligence is the most transformative paradigm shift since the internet took hold in 1994. And it’s got a lot of corporations, understandably, scrambling to infuse AI into the way they do business.
One of the most important ways this is happening is via generative AI and large language models (LLMs), and it’s far beyond asking ChatGPT to write a post about a particular topic for a corporate blog or even to help write code.
In fact, LLMs are rapidly becoming an integral part of the application stack.
Building generative AI interfaces like ChatGPT ("agents") atop a database that contains all the necessary data and can "speak the language" of LLMs is the future (and, increasingly, the present) of mobile apps.
The level of dynamic interaction, access to vast amounts of public and proprietary data, and ability to adapt to specific situations make applications built on LLMs powerful and engaging in a way that’s not been available until recently.
And the technology has quickly evolved to the extent that virtually anyone with the right database and the right APIs can build these experiences. Let’s take a look at what’s involved.
When some people hear “agent” and “AI” in the same sentence, they think about the simple chatbot they’ve experienced as a popup window that asks how it can help when they visit an e-commerce site.
But LLMs can do much more than respond with simple conversational prompts and answers pulled from an FAQ.
When they have access to the right data, applications built on LLMs can drive far more advanced ways to interact with us that deliver expertly curated information that is more useful, specific, rich — and often uncannily prescient.
Here’s an example.
You want to build a deck in your backyard, so you open your home improvement store’s mobile application and ask it to build you a shopping list.
Because the application is connected to an LLM like GPT-4 and multiple data sources (the company’s own product catalog, store inventory, customer information, and order history, along with a host of other data sources), it can easily tell you what you’ll need to complete your DIY project.
But it can do much more.
If you describe the dimensions and features you want to incorporate in your deck, the application can offer visualization tools and design aids. Because it knows your postal code, it can tell you which stores in your vicinity have the items you need in stock.
It can also, based on the data in your purchase history, suggest that you might need a contractor to help you with the job – and provide contact information for professionals near you.
Then, based on variables like the amount of time it takes deck stain to dry (even incorporating the seasonal climate trends where you live), it can tell you how long it'll be until you can actually have that birthday party on your deck that you've been planning.
The application could also assist with and provide information on a host of other related areas, including details on project permit requirements and the effect of the construction on your property value. Have more questions?
The application can help you at every step of the way as a helpful assistant that gets you where you want to go.
This isn’t science fiction. Many organizations, including some of the largest DataStax customers, are working on multiple projects that incorporate generative AI right now.
But these projects aren't just the realm of big, established enterprises; they don't require deep expertise in machine learning, data science, or ML model training.
In fact, building LLM-based applications requires little more than a developer who can make a database call and an API call.
Building applications that provide levels of personalized context that were unheard of until recently is within reach of anyone who has the right database, a few lines of code, and an LLM like GPT-4.
LLMs are very simple to use. They take context (often referred to as a “prompt”) and produce a response. So, building an agent starts with thinking about how to provide the right context to the LLM to get the desired response.
Broadly speaking, this context comes from three places: the user's question, the pre-defined prompts created by the agent's developer, and data sourced from a database or other sources.
The context provided by the user is typically just the question they type into the application.
The second piece could be provided by a product manager who worked with a developer to describe the role the agent should play (for example, “You’re a helpful sales agent who is trying to help customers as they plan their projects; please include a list of relevant products in your responses”).
Finally, the third bucket of provided context includes external data pulled in from your databases and other data sources that the LLM should use in constructing the response.
Some agent applications make several calls to the LLM to construct a more detailed answer before returning the final response to the user.
This is what technologies such as ChatGPT Plug-ins and LangChain facilitate (more on these below).
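To make this concrete, here's a minimal sketch of how an agent could assemble those three pieces of context into a single LLM call. It assumes the openai Python package; the fetch_product_context helper is a hypothetical stand-in for a real database lookup, not part of any library.

```python
# A minimal sketch of assembling the three kinds of context into one
# LLM call, using the `openai` Python package. `fetch_product_context`
# is a hypothetical stand-in for a real database lookup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def fetch_product_context(question: str) -> str:
    """Hypothetical helper: pull relevant catalog/inventory rows."""
    return "Deck boards: in stock at store #42. Stain: 2-day dry time."

def ask_agent(user_question: str) -> str:
    # 1. Pre-defined prompt describing the role the agent should play.
    system_prompt = (
        "You're a helpful sales agent who is trying to help customers "
        "as they plan their projects; please include a list of relevant "
        "products in your responses."
    )
    # 2. External data pulled from databases and other sources.
    retrieved = fetch_product_context(user_question)
    # 3. The user's question, sent along with the other two pieces.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "system", "content": f"Context:\n{retrieved}"},
            {"role": "user", "content": user_question},
        ],
    )
    return response.choices[0].message.content

print(ask_agent("What do I need to build a 12x16-foot deck?"))
```

A multi-step agent would simply repeat this pattern, feeding one call's output into the next call's context.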
AI agents need a source of knowledge, but that knowledge has to be understandable by an LLM. Let's take a quick step back and think about how LLMs work. When you ask ChatGPT a question, it works within a limited memory, or "context window."
If you're having an extended conversation with ChatGPT, it packs up your previous queries and the corresponding responses and sends them back to the model with each new request; once the conversation outgrows that window, the model starts to "forget" earlier context.
This is why connecting an agent to a database is so important to companies that want to build agent-based applications on top of LLMs. But the database has to store information in a way that an LLM understands: as vectors.
Simply put, vectors enable you to reduce a sentence, concept, or image to a set of numeric dimensions. You can take a piece of content or context, such as a product description, and turn it into a list of numbers along those dimensions: a vector representation.
Recording those dimensions enables vector search: the ability to search on multidimensional concepts, rather than keywords.
This helps LLMs generate more accurate and contextually appropriate responses while also providing a form of long-term memory for the models. In essence, vector search is a vital bridge between LLMs and the vast knowledge bases on which they are trained.
Vectors are the "language" of LLMs; vector search is a required capability of any database that provides them with context.
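As a rough illustration, vector search in its simplest form works like this: embed each document once, embed the query, and rank documents by cosine similarity. The sketch below assumes the openai package for embeddings; the model name is one common choice, not a requirement.

```python
# A minimal sketch of vector search: rank documents by the cosine
# similarity between their embeddings and the query's embedding.
# Assumes the `openai` package; the embedding model name is one
# common choice, not a requirement.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

docs = [
    "Pressure-treated deck boards, 16 ft",
    "Exterior deck stain, semi-transparent",
    "Kitchen cabinet hinges, satin nickel",
]
doc_vectors = [embed(d) for d in docs]

def search(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    # Cosine similarity: normalized dot product of the two vectors.
    sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in doc_vectors]
    top = np.argsort(sims)[::-1][:k]  # indices of the k most similar docs
    return [docs[i] for i in top]

# The query shares few keywords with the catalog text, but vector
# search still surfaces the semantically related products.
print(search("supplies for finishing a backyard project"))
```

A production system wouldn't scan every vector this way; that's what a vector database's indexes are for, as the next section describes.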
Consequently, a key component of being able to serve LLMs with the appropriate data is a vector database that has the throughput, scalability, and reliability to handle the massive datasets required to fuel agent experiences.
Scalability and performance are two critical factors to consider when choosing a database for any AI/ML application. Agents require access to vast amounts of real-time data and high-speed processing, especially when the agents might be used by every customer who visits your website or uses your mobile application.
The ability to scale quickly when needed is paramount to success when it comes to storing data that feeds agent applications.
As engagement becomes agent-powered, Cassandra provides the horizontal scalability, speed, and rock-solid stability that make it a natural choice for storing the data required to power agent-based applications. For this reason, the Cassandra community developed the critical vector search capability the database needs to speak the language of LLMs.
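To illustrate, here's a rough sketch of what storing and querying embeddings can look like with Cassandra's vector type and approximate-nearest-neighbor (ANN) search, using the DataStax Python driver. The keyspace, table, and index names are made up for the example, and this assumes a Cassandra version with vector search (5.0+) or Astra DB.

```python
# A rough sketch of vector storage and ANN search with Cassandra 5.0+
# (or Astra DB), via the DataStax Python driver. The keyspace, table,
# and index names are illustrative, and the keyspace is assumed to
# already exist.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("catalog")

# A table whose `embedding` column holds a fixed-size float vector.
session.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id uuid PRIMARY KEY,
        description text,
        embedding vector<float, 1536>
    )
""")

# A storage-attached index (SAI) enables ANN queries on the column.
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS products_embedding_idx
    ON products (embedding) USING 'StorageAttachedIndex'
""")

# Fetch the five products closest to a query vector produced by the
# same embedding model used at write time.
query_embedding = [0.0] * 1536  # placeholder: use your model's output
rows = session.execute(
    "SELECT description FROM products ORDER BY embedding ANN OF %s LIMIT 5",
    [query_embedding],
)
for row in rows:
    print(row.description)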
There are a few routes for organizations to create agent application experiences, as we alluded to earlier.
You'll hear developers talking about frameworks like LangChain, which help orchestrate multiple LLM calls and wire in external data sources.
But the most important way to move forward with building these kinds of experiences is to tap into the most popular agent on the globe right now: ChatGPT.
Think of how Facebook became the social network platform, with a huge ecosystem of organizations building games, content, and news feeds that could plug into it. ChatGPT has become that kind of platform: a "super agent."
Your developers might be working on building your own proprietary agent-based application experience using a framework like LangChain, but focusing solely on that will come with a huge opportunity cost.
If they aren’t working on a ChatGPT plugin, your organization will miss out on a massive distribution opportunity to integrate context that is specific to your business into the range of possible information ChatGPT can supply or actions it can recommend to its users.
A range of companies, including Instacart, Expedia, OpenTable, and Slack, have already built ChatGPT plug-ins.
Building ChatGPT plug-ins will be a critical part of the AI agent projects that businesses take on.
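Under the hood, a ChatGPT plug-in is essentially an HTTP API that ChatGPT discovers through an ai-plugin.json manifest and an OpenAPI spec. Here's a hypothetical sketch of such an endpoint; the framework (FastAPI), route, and response shape are illustrative only.

```python
# A hypothetical sketch of the HTTP endpoint behind a ChatGPT plug-in.
# ChatGPT discovers the API through an ai-plugin.json manifest and an
# OpenAPI spec; the framework (FastAPI), route, and response shape
# here are illustrative only.
from fastapi import FastAPI

app = FastAPI()

@app.get("/inventory")
def inventory(query: str, zip_code: str) -> dict:
    """Return in-stock items matching the query near a ZIP code."""
    # A real plug-in would run a vector search against the product
    # catalog here; this returns canned data for illustration.
    return {
        "query": query,
        "zip_code": zip_code,
        "items": [{"sku": "DECK-BOARD-16", "store": "#42", "in_stock": True}],
    }

# Run locally with: uvicorn plugin:app --reload
```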
Having the right data architecture — in particular, a vector database — makes it substantially easier to build very high-performance agent experiences that can quickly retrieve the right information to power those responses.
All applications will become AI applications. The rise of LLMs and capabilities like ChatGPT plugins is making this future much more accessible.
Want to learn more about vector search and generative AI? Join us for Agent X: Build the Agent AI Experience, a free digital event on July 11.
By Ed Anuff, DataStax