DataStax on Wednesday said it is partnering with Houston-based startup ThirdAI to bring large language models (LLMs) to its database offerings, including the on-premises DataStax Enterprise and the NoSQL database-as-a-service AstraDB.
The partnership, according to DataStax’s chief product officer, Ed Anuff, is part of the company’s strategy to bring artificial intelligence to where the data resides.
ThirdAI can be installed in the same cluster as DataStax, whether on-premises or in the cloud, because it ships as a small library that can be installed using Python.
“The benefit is that the data does not have to move from DataStax to another environment, it is just passed to ThirdAI which is adjacent to it. This guarantees full privacy and also speed because no time is lost in transferring data over a network,” a DataStax spokesperson said.
“ThirdAI can be run as a Python package or be accessed via an API, depending on the customer preference,” the spokesperson added.
Enterprises running DataStax Enterprise or AstraDB can use the data residing in those databases and ThirdAI's tech and LLMs to spin up their own generative AI applications. The foundation models from ThirdAI can be trained to understand data and answer queries, such as which product recommendation would likely result in a sale, based on a customer's history.
As part of the integration, DataStax will adopt the startup’s Bolt technology, which ThirdAI says can deliver better AI training performance on CPUs than on GPUs for relatively small models. The advantage is that CPUs are generally priced lower than the GPUs typically used for AI and machine learning workloads.
“The Bolt engine, which is an algorithmic accelerator for training deep learning models, can reduce computations exponentially. The algorithm achieves neural network training in 1% or fewer floating point operations per second (FLOPS), unlike standard tricks like quantization, pruning, and structured sparsity, which only offer a slight constant factor improvement,” ThirdAI said in a blog post.
“The speedups are naturally observed on any CPU, be it Intel, AMD, or ARM. Even older versions of commodity CPUs can be made equally capable of training billion parameter models faster than A100 GPUs,” it added.
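To put the "1% or fewer" figure in perspective, the arithmetic can be sketched as a back-of-the-envelope count of floating point operations for a single neural network layer. This is only an illustration, not a description of how Bolt works internally: it assumes that the savings come from computing a small fraction of a layer's output neurons per input, and the function names, layer sizes, and the 1% active fraction are all hypothetical values chosen for the example.

```python
def dense_layer_flops(n_inputs: int, n_outputs: int) -> int:
    """Approximate FLOPs for one dense layer forward pass.

    Each weight contributes one multiply and one add, so 2 FLOPs per weight.
    """
    return 2 * n_inputs * n_outputs


def sparse_layer_flops(n_inputs: int, n_outputs: int, active_fraction: float) -> int:
    """Approximate FLOPs when only a fraction of output neurons is computed.

    Assumption for illustration: the cost of choosing which neurons are
    active is ignored here.
    """
    active_outputs = max(1, round(n_outputs * active_fraction))
    return 2 * n_inputs * active_outputs


# Hypothetical layer: 1,024 inputs feeding 100,000 output neurons.
dense = dense_layer_flops(1024, 100_000)
sparse = sparse_layer_flops(1024, 100_000, active_fraction=0.01)

print(f"dense:  {dense:,} FLOPs")
print(f"sparse: {sparse:,} FLOPs")
print(f"ratio:  {sparse / dense:.2%}")
```

Under these assumptions the sparse pass needs about one hundredth of the operations of the dense pass for that layer, which is the scale of reduction the blog post describes. How close real workloads come to this would depend on the model and on the overhead of selecting the active neurons, which the sketch leaves out.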
Bolt can also be invoked by “just a few” line changes in existing Python machine learning pipelines, according to ThirdAI.
The ThirdAI announcement is the first in a new partnership program that DataStax is setting up to bring in more technology from AI startups, with the aim of helping enterprises whose data resides in DataStax databases develop generative AI applications.