
Bringing AI to the Datacenter

by DataStax | June 20th, 2023

Too Long; Didn't Read

Bringing AI to the data center, and not just the cloud, is another very important step to making the transformational AI technology wave something all companies can be a part of.

With all the assumptions we make about the advancements in enterprise data and cloud technologies, there’s a plain fact that often gets overlooked: The majority of the most important enterprise data remains in the corporate data center.


There are plenty of reasons for this — some reasonable, some not so much. In some cases, it’s the highly sensitive nature of the data, whether that’s HIPAA-regulated records, sensitive banking data or other privacy concerns. In other cases, the data resides in systems (think legacy enterprise resource planning data or petabyte-scale scientific research data) that are difficult to move to the cloud. And sometimes, it’s just inertia. It’s not a great excuse, but it happens all the time.


Whatever the reason, housing data on racks of corporate servers has proved a real hindrance to many enterprises’ ability to take advantage of AI to transform their business, because the infrastructure underpinning most data centers has been all but incapable of providing the significant compute power AI requires.


But there’s a movement under way, via a small constellation of startups and big device makers, to optimize machine learning models and make AI available to companies whose data isn’t in the cloud. It’s going to be a game changer.

The processing power problem

The graphics processing unit, or GPU, was developed to handle high-intensity video-processing applications like those required by modern video games and high-resolution movies. But the ability of these processors to break complex tasks into smaller ones and execute them in parallel also makes these high-powered chips very useful for artificial intelligence. AI, after all, requires massive streams of data to refine and train machine learning models.


CPUs, on the other hand, are the flexible brains of servers. As such, they’re built to handle a wide variety of operations, like accessing hard drive data or moving data from cache to storage, but they can’t match GPUs at parallel execution (multicore processors handle parallel tasks, but not at the level of GPUs). They simply aren’t built for the high-throughput workloads that AI demands.
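To make that gap concrete, here’s a minimal sketch, in PyTorch (my choice; the article names no framework), that times the same large matrix multiply — the core operation behind ML workloads — on the CPU and, when one is available, on a GPU:

```python
import time

import torch

# One large matrix multiply: a throughput-style workload typical of ML.
a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

start = time.perf_counter()
a @ b  # runs across the available CPU cores
cpu_secs = time.perf_counter() - start

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()  # GPU calls are asynchronous; sync before timing
    start = time.perf_counter()
    a_gpu @ b_gpu
    torch.cuda.synchronize()
    print(f"CPU: {cpu_secs:.3f}s, GPU: {time.perf_counter() - start:.3f}s")
else:
    print(f"CPU: {cpu_secs:.3f}s (no GPU available)")
```

On typical hardware the GPU version finishes one to two orders of magnitude faster, and that is the gap most data center infrastructure has had to live with.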


High-performance GPUs are very expensive, and until recently they’ve been scarce, thanks to crypto miners’ reliance on the same chips. For the most part, they’re the realm of the cloud providers. Indeed, high-performance computing services are a big reason companies move their data to the cloud. Google’s Tensor Processing Unit, or TPU, is a custom ASIC developed solely to accelerate machine learning workloads, and Amazon likewise has its own chips for powering AI/ML workloads.

Optimizing for AI

GPUs have been the foundation of the rush of AI innovation that has recently taken over the headlines. Many of these high-profile developments have been driven by companies pushing the envelope on what’s possible without thinking too much about efficiency or optimization. Consequently, the workloads produced by new AI tools have been massive and, by necessity, managed in the cloud.


But in the past six months or so, that’s been changing. For one thing, the sprawling ML models that drive all of these cutting-edge AI tools are being condensed significantly, through techniques like quantization and distillation, while still generating the same powerful results.
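Quantization is the easier of the two to picture. As a minimal sketch, again assuming PyTorch (the article doesn’t name a tool), dynamic quantization stores a model’s weights as 8-bit integers instead of 32-bit floats, cutting their memory footprint to roughly a quarter with little loss in quality:

```python
import os

import torch
import torch.nn as nn

# Toy stand-in for one transformer feed-forward block (sizes are illustrative).
model = nn.Sequential(nn.Linear(4096, 11008), nn.ReLU(), nn.Linear(11008, 4096))

# Dynamic quantization: Linear weights are stored as int8 and activations
# are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def size_mb(m: nn.Module, path: str = "/tmp/model.pt") -> float:
    torch.save(m.state_dict(), path)
    return os.path.getsize(path) / 1e6

print(f"fp32: {size_mb(model):.0f} MB -> int8: {size_mb(quantized):.0f} MB")
```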


I installed the Vicuna app on my mobile phone, for example. It’s a 13 billion-parameter model that delivers ChatGPT-like results and runs in real time, right on my phone. It’s not in the cloud at all – it’s an app that resides on the device.
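The article doesn’t say how the phone app packages the model, but as a hypothetical illustration of the same idea on a laptop, the llama-cpp-python bindings can run a quantized Vicuna entirely on local hardware (the weights-file path below is made up):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path to a 4-bit-quantized Vicuna-13B weights file; 4-bit
# weights are what let a 13B-parameter model fit in consumer-device memory.
llm = Llama(model_path="./vicuna-13b-q4.bin")

result = llm(
    "Q: Why might a company keep its data in its own data center? A:",
    max_tokens=64,
    stop=["Q:"],  # stop before the model invents a follow-up question
)
print(result["choices"][0]["text"])
```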


The Vicuna project emerged from the Large Model Systems Organization, a collaboration between UC Berkeley, UC San Diego and Carnegie Mellon University with the mission to “make large models accessible to everyone by co-development of open datasets, models, systems, and evaluation tools.”


It’s a mission that big tech isn’t ignoring. Apple’s latest desktops and iPhones have specialized processing capabilities that accelerate ML processes. Google and Apple are doing a lot of work to optimize their software for ML too.


There are also plenty of talented engineers at startups working to make hardware more performant in ways that make AI/ML more accessible.


ThirdAI is a great example. The company offers a software-based engine that can train large deep-learning models using CPUs. DataStax has been working with the ThirdAI team for months and has been impressed with what they have developed — so much so that last week we announced a partnership with the company to make sophisticated large language models (LLMs) and other AI technologies accessible to any organization, regardless of where their data resides. (Read more about the partnership news here.)
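ThirdAI’s engine relies on its own sparsity techniques, and I won’t guess at its API; the sketch below is plain PyTorch on CPU, a generic stand-in that only shows the class of workload in question — training a neural network entirely on CPU cores, close to where the data lives:

```python
import torch
import torch.nn as nn

# Generic illustration only: plain PyTorch on CPU, not ThirdAI's engine or API.
torch.set_num_threads(8)  # use the server's CPU cores for the math

model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for a batch read from an on-premises data store.
features = torch.randn(64, 256)
labels = torch.randint(0, 2, (64,))

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
print(f"final loss: {loss.item():.4f}")
```

The point isn’t the toy model; it’s that nothing in the loop touches a GPU, which is exactly the constraint most corporate data centers impose.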

Bring AI to the data

Because of all this hard work and innovation, AI will no longer be available exclusively to organizations with data in the cloud. This is extremely important for privacy, which is a big reason many organizations keep their data on their own servers in the first place.


With the AI transformation wave that has washed over everything in the past 18 months or so, it’s all about data. Indeed, there is no AI without data, wherever it might reside. Efforts by teams like ThirdAI also enable all organizations to “bring AI to the data.”


For a long time, companies have been forced to do the opposite: bring their data to AI. They had to dedicate massive resources, time and budget to migrate data from data warehouses and data lakes to dedicated machine learning platforms before analyzing it for key insights.


This results in significant data transfer costs, and the time required to migrate and analyze the data affects how quickly organizations can learn new patterns and act on them with customers in the moment.


Bringing AI to the data is something we have focused on a lot at DataStax with our real-time AI efforts, because it’s the fastest way to take actions based on ML/AI, delight customers and drive revenue. Bringing AI to the data center, and not just the cloud, is another very important step to making the transformational AI technology wave something all companies can be a part of.


*Learn about the new DataStax AI Partner Program, which connects enterprises with groundbreaking AI startups to accelerate the development and deployment of AI applications for customers.*


By Ed Anuff, DataStax


Also published here.