Data lakehouse provider Databricks on Monday said that it was acquiring large language model (LLM) and model-training software provider MosaicML for $1.3 billion to boost its generative AI offerings.
Databricks, which already offers an LLM named Dolly, is expected to add MosaicML’s models, training and inference capabilities to its lakehouse platform for enterprises to develop generative AI applications, the company said, underlining its open source LLM policy.
Dolly was developed on open data sets in order to cater to enterprises’ demand to control LLMs used to develop new applications, in contrast to closed-loop trained models, such as ChatGPT, that put constraints on commercial usage.
MosaicML’s models, namely MPT-7B and the recently released MPT-30B, are open source, putting them in line with Databricks’ existing policy.
Another advantage of these models, according to MosaicML, is the “zero human intervention” feature that allows the training systems to be automated.
“We trained MPT-7B with zero human intervention from start to finish: over 9.5 days on 440 GPUs, the MosaicML platform detected and addressed 4 hardware failures and resumed the training run automatically, and — due to architecture and optimization improvements we made — there were no catastrophic loss spikes,” MosaicML wrote in a blog post.
The deal calls for MosaicML’s entire team of over 60 employees, including co-founder and CEO Naveen Rao, to move to Databricks, where they will continue to work on developing more foundation models, the companies said.
MosaicML’s existing customers, according to a company post, will still be able to access their LLMs and inference offerings. Existing customers include Allen Institute for AI, Generally Intelligent, Hippocratic AI, Replit and Scatter Labs. The San Francisco-based startup, which was founded in 2021, has raised nearly $64 million to date from investors including Lux Capital, DCVC, Future Ventures, Maverick Ventures, and Playground.
The $1.3 billion deal includes retention packages for MosaicML employees, Databricks said.
In May, Databricks acquired AI-centric data governance platform provider Okera for an undisclosed sum.
Databricks’ acquisition of MosaicML also comes just weeks after a rival, Snowflake, acquired Mountain View-based AI startup Neeva in an effort to add generative AI-based search to its Data Cloud platform.