Kinetica, a relational database provider for online analytical processing (OLAP) and real-time analytics, is harnessing the power of OpenAI’s ChatGPT to let developers use natural language to run SQL queries.
Kinetica, which offers its database in multiple flavors including hosted, SaaS and on-premises, announced on Tuesday that it will offer the ChatGPT integration at no cost in its free developer edition, adding that the developer edition can be installed on any laptop or PC.
The ChatGPT interface, which is built into the front end of Kinetica Workbench, can answer any query asked in natural language about proprietary data sets in the database, the company said.
“What ChatGPT brings to the table is it will turn natural language into Structured Query Language (SQL). So, a user can type in any query and it can send an API call off ChatGPT. And in return, you get that SQL syntax that can be run to generate results,” said Philip Darringer, vice president of product management at Kinetica.
“Further, it can understand the intent of the query. This means that the user doesn't have to know the exact names of columns for running a query. The generative AI engine infers from the query and maps it to the correct column. This is a big step forward,” Darringer said.
To help ChatGPT interpret natural-language queries accurately, Kinetica’s product managers supplied it with prompts and context based on their knowledge of already deployed databases.
“We're sending certain table definitions and metadata about the data to the generative AI engine,” said Darringer, adding that no enterprise data was being shared with ChatGPT.
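The flow Darringer describes, in which only table definitions and metadata accompany the user's question, can be sketched roughly as follows. This is an illustrative sketch, not Kinetica's implementation: the `Column` type, the `table_definition` and `build_prompt` helpers, the `trips` table and the prompt wording are all hypothetical, and the call to the language model is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical column descriptor: a name and a SQL type, no data."""
    name: str
    sql_type: str


def table_definition(table: str, columns: list[Column]) -> str:
    """Render a table's structure as a CREATE TABLE statement."""
    cols = ", ".join(f"{c.name} {c.sql_type}" for c in columns)
    return f"CREATE TABLE {table} ({cols});"


def build_prompt(question: str, definitions: list[str]) -> str:
    """Combine schema metadata and the question into one prompt.

    Only table definitions are included; no rows from the tables are sent.
    """
    schema = "\n".join(definitions)
    return (
        "Translate the question into a single SQL query.\n"
        "Use only the tables and columns defined below.\n\n"
        f"{schema}\n\n"
        f"Question: {question}\nSQL:"
    )


# Hypothetical example table; the names are assumptions for illustration.
trips = table_definition(
    "trips",
    [Column("pickup_time", "TIMESTAMP"), Column("fare_usd", "DECIMAL(8,2)")],
)
prompt = build_prompt("What was the average fare yesterday?", [trips])
print(prompt)
```

Because the model sees the column names and types, a question about "fare" can plausibly be mapped to `fare_usd` without the user knowing the exact column name, which is the behavior Darringer describes.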
The database, according to the company, can also answer up-to-date, real-time analytical queries as it continuously ingests streaming data.
Vectorization speeds query processing
Kinetica says that vectorization boosts the speed with which its relational database processes queries.
“In a vectorized query engine, data is stored in fixed-size blocks called vectors, and query operations are performed on these vectors in parallel, rather than on individual data elements,” the company said, adding that this allows the query engine to process multiple data elements simultaneously, resulting in faster query execution on a smaller compute footprint.
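The difference between processing individual elements and processing fixed-size blocks can be illustrated with a small sketch. It is a simplified model under stated assumptions, not Kinetica's engine: the block size, function names and sample data are hypothetical, and plain Python does not actually execute the per-block work in parallel the way GPU or SIMD hardware would. What the sketch shows is the block-at-a-time structure that makes such parallelism possible.

```python
from typing import Iterator

BLOCK_SIZE = 4  # hypothetical; real engines size blocks to fit the hardware


def blocks(column: list[float], size: int = BLOCK_SIZE) -> Iterator[list[float]]:
    """Split a column into fixed-size blocks (the last one may be shorter)."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def filter_sum_row_at_a_time(column: list[float], threshold: float) -> float:
    """Evaluate the predicate and accumulate one element at a time."""
    total = 0.0
    for value in column:
        if value > threshold:
            total += value
    return total


def filter_sum_block_at_a_time(column: list[float], threshold: float) -> float:
    """Apply the same operation to a whole block in each step.

    Each step is one uniform operation over a block: build a mask for
    the block, then reduce the values the mask keeps. On parallel
    hardware, each of these steps can run across all elements at once.
    """
    total = 0.0
    for block in blocks(column):
        mask = [v > threshold for v in block]
        total += sum(v for v, keep in zip(block, mask) if keep)
    return total


fares = [5.0, 12.5, 7.0, 30.0, 2.5, 18.0, 9.5, 21.0, 4.0]
print(filter_sum_row_at_a_time(fares, 10.0))
print(filter_sum_block_at_a_time(fares, 10.0))
```

Both functions compute the same answer; the second is organized so that each step operates on a block rather than a single value.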
In Kinetica, vectorization is made possible by the combined use of graphics processing units (GPUs) and CPUs, the company said. The database uses SQL-92 as its query language, just like PostgreSQL and MySQL, and supports text search, time series analysis, location intelligence and graph analytics, all of which can now be accessed via natural language.
Kinetica claims that the integration of ChatGPT will make its database easier to use, increase productivity and improve insights from data.
“Database administrators, data scientists, and other practitioners will use this methodology to accelerate, refine, and extend the command line interface and API work they're doing programmatically,” said Bradley Shimmin, chief analyst at Omdia Research.
Kinetica is one of the first database companies to integrate ChatGPT or generative AI features within a database, according to Shimmin.
“Within databases themselves, however, there's been less effort to integrate natural language querying (NLQ), as these platforms are used by database administrators, developers, and other practitioners who are accustomed to working with SQL, Spark, Python, and other languages,” Shimmin said, noting that vendors in the business intelligence (BI) market have made more progress in integrating NLQ.
Kinetica’s use of ChatGPT for natural language querying is not, strictly speaking, actual database querying, according to Shimmin.
“What Kinetica's talking about isn't using natural language to query the database. Rather, Kinetica works the same way Pinecone, Chroma, and other vector databases work, by creating a searchable index (vectorized view) of corporate data that can be fed into natural language models like ChatGPT to create a natural way to search the vectorized data. It's super slick,” Shimmin said.
“One very popular implementation of this kind of conversational query is the combination of Chroma, LangChain, and ChatGPT,” Shimmin added. LangChain is a software development framework.
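The kind of searchable vector index Shimmin describes can be sketched in a few lines. This is a toy illustration, not the API of Pinecone, Chroma, LangChain or Kinetica: the three-dimensional embeddings, the document names and the `cosine` and `nearest` helpers are all hypothetical, and a real system would obtain embeddings from a model and use an approximate nearest-neighbor index instead of a full scan.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the ids of the k entries most similar to the query vector."""
    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:k]


# Hypothetical embeddings; real ones would come from an embedding model.
index = {
    "shipping_delays": [0.9, 0.1, 0.0],
    "quarterly_revenue": [0.1, 0.9, 0.2],
    "fleet_telemetry": [0.7, 0.0, 0.6],
}
query = [0.8, 0.2, 0.1]  # stands in for an embedded natural-language question

print(nearest(query, index, k=2))
```

The entries returned by such a search are what would then be passed to a language model as context for answering the question.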
Though there may be competition, Kinetica stands to gain by integrating natural language, Shimmin said.
“Vector databases will be the hot ticket later in 2023 as enterprise practitioners begin looking for ways to put large language models (LLMs) to work behind the firewall without having to spend a ton of money on training their own LLM or fine-tuning an existing LLM using company data,” Shimmin said.
Kinetica said that it is open to working with other LLM providers as new use cases arise.
“We do think over time, there will be other use cases where it will make sense for us to fine tune models or even work with other models,” said Chad Meley, chief marketing officer at Kinetica.
The company, which derives more than half of its revenue from US government organizations such as NORAD, has customers in the connected car space along with clients in logistics, financial services, telecom and the entertainment sector.
(This story has been updated to clarify information about Kinetica's customer base.)