Aug 15, 2023 2:00 AM

The looming battle over where generative AI systems will run

Putting AI on cloud versus on-premises systems may seem like a simple decision, but it's much more complex (and potentially expensive).

David Vellante of “the Cube” fame wrote an awe-inspiring post on the cloud versus on-premises question for artificial intelligence, and he brought new survey data, which is raw meat for bloggers. The most valuable takeaway was something I had already assumed: “While most customers are reporting a modest spend increase of 10% or less [on generative AI], 36% say their spending will increase by double digits.”

Generative AI may increase technical debt

I called this earlier in the year, when the writing was on the wall regarding the “perceived value” of generative AI and it was clear that fast adoption would drive much of the cloud growth in 2024 and 2025. But that’s not exactly going out on a limb.

What is becoming more apparent is that the location where most generative AI systems will reside (public cloud platforms versus on-premises and edge-based platforms) is still being determined. Vellante’s article points out that on-premises and public cloud platforms are running neck and neck as homes for AI systems. Driving this is the assumption that the public cloud comes with some risk, including IP leakage, where better conclusions drawn from your data show up at the competition.

Also, enterprises still have a lot of data in traditional data centers or on edge systems rather than in the cloud. This causes problems when the data is not easily moved to the cloud, and data silos are common within most enterprises today. AI systems are only as valuable as the data they can reach, so it may make sense to host them as close to the data as possible.

I would argue that data should not exist in silos and that hosting AI systems next to those silos just enables an existing problem. However, many enterprises may not have other, more pragmatic choices, given the cost of fixing such issues. Generative AI is considered a priority for most enterprises, even if it means working with underoptimized infrastructure that they are unwilling to change or cannot afford to change. Indeed, this means generative AI could drive another layer of technical debt for many businesses.

To prem or not to prem

One thing that bugged me in Vellante’s article is that I saw enterprises making many of the same mistakes they made in the early days of cloud computing. This time, however, companies are putting things in data centers rather than on the cloud. The opposite problem also persists: moving applications and data to the cloud without enough preparation and planning. Both extremes leave you with underoptimized solutions.

The cloud has many advantages that may never be found within traditional legacy platforms. You can’t match the availability of tools and technology on public clouds, nor the speed at which you can deploy them. Public cloud providers already do generative AI well and have the infrastructure to scale and adapt as the technology evolves.

On public clouds, maintaining these platforms is somebody else’s problem. Most enterprises already have support and facilities management in place or are using a managed service, but an on-premises AI system is still another rack of servers that does all the bad things that physical servers you own and operate do.

However, just as we saw with the rise in repatriations, if the generative AI systems are indeed collocated with the training data, and the use of that data is going to be relatively easy to predict, on-premises systems could be half the cost of public cloud platforms.

It depends

Generative AI systems are primarily purpose-built to do things such as automate supply chains with intelligent processing, automate repeatable manual work to reduce headcount (as pointed out in the article), and provide marketing intelligence. Where the systems run depends mainly on the type of problem you’re looking to solve and the attributes of that generative AI system. People hate that answer (very consultant), but it’s true. In many respects, it’s not much different from any other system you build and deploy.

I do get concerned about assumptions regarding either on-premises or cloud that are only sometimes true. I’m unsure if the cloud is actually more vulnerable to “IP leakage”; many core capabilities, such as security, operations, and scalability, are better on public clouds. Public clouds can be more expensive than on-prem systems but, depending on the use case, can still be a good fit. This trend of pre-solving problems (“cloud-only” or “cloud-never”) without fully understanding the situation has gotten us into trouble in the past. We’re making similar mistakes with net-new generative AI systems.

I suspect that I’ll have many difficult conversations in 2025 about why generative AI costs twice as much as it should. The likely answer: It’s on the wrong platform for the wrong reasons. I would rather not have those conversations. Here’s your chance to avoid them.