Data by itself isn’t very useful. Data only becomes useful as it’s understood and as it infuses application experiences. This desire to put data to work has driven a boom in cloud-based analytics. Though a relatively small share of IT spending currently goes to cloud (roughly 6%, according to IDC in 2020), all of the momentum is shifting away from on-premises, legacy business intelligence tools and toward more modern, cloud-native options like Google BigQuery, Amazon Redshift, Databricks, or Snowflake. The popularity of bringing data and cloud together shows up in Snowflake’s rocketing rise up the DB-Engines database popularity rankings, from number 170 in November 2016 to number 11 in January 2023. Some of Snowflake’s success absolutely comes down to performance, scalability, the separation of storage and compute, and other benefits.
But arguably an even bigger benefit is simply the cloud itself. Snowflake was born in the cloud and offers a natural path for enterprises looking to move there. That same cloud keeps propelling new databases forward against legacy alternatives, and it promises to continue upending the world of data in 2023.
All cloud, all the time?
While I don’t agree fully with my InfoWorld colleague David Linthicum that “2023 could be the year of public cloud repatriation,” I can agree that we shouldn’t blindly fall in love with a technology, treating it as a hammer and every business problem as a nail. Cloud solves many problems, but not all of them. In areas related to advanced data-driven applications, however, cloud is indispensable, as Linthicum acknowledges: “When advanced IT services are involved (AI, deep analytics, massive scaling, quantum computing, etc.), public clouds typically are more economical.”
Not only more economical, but also more practical.
Years ago, AWS executive Matt Wood made this case to me, and it’s as persuasive today as it was in 2015. “Those that go out and buy expensive infrastructure find that the problem scope and domain shift really quickly,” he said. “By the time they get around to answering the original question, the business has moved on.” As he continued, “If you drop a huge chunk of change on a data center that is frozen in time,” the questions you can ask of your data are stuck in a time warp. Even in straitened economic times, exactly the wrong way to think about cloud is through a narrow lens of cost. Elastic infrastructure begets flexibility in making sense of data. Dollars from sense, as it were, rather than dollars and cents. That’s the promise of cloud-based analytics tools.
Companies seem to understand this. At a recent analyst conference, Snowflake CFO Mike Scarpelli talked about competitive dynamics in the data warehousing market: “We are never competing with Teradata [an incumbent data analytics company founded in the on-premises software era]. When a customer has made the decision to go off-prem, it is never against Teradata. They’ve made the decision to leave.” If enterprises are already looking to the cloud as they work through digital transformation, where do they look? According to Scarpelli, “When we are competing for an on-premises migration, it is always [against] Google, Microsoft, [and] AWS [but AWS] tends to partner with us more [out of] the gate.”
The customer, in other words, has likely spent years with their on-premises data warehouse or BI solution, but that’s not where they’re betting their future. Their future is cloud. If they’re considering a next step, it’s not likely to be Oracle unless they’re in so deep with Oracle as to make introducing a new system seem hard. Most of the time, enterprises will be looking for a cloud-based database, data warehouse/lakehouse, or machine learning/artificial intelligence system. More Google BigQuery, in other words, and less SAP BusinessObjects.
Democratizing data
One other reason for cloud’s success is simplicity, or at least it can be. Cloud, of course, isn’t inherently more user-friendly, but many cloud systems have emphasized a SaaS approach that puts a premium on user experience. Take, for example, this comment from a Reddit user describing their experience with Snowflake: “If you need a PhD in physics to use your SaaS tool, your tool is useless. MySQL users love it (analysts), the C-suite loves it, the only people it struggles to win over are the nerdy engineers like myself who had enough hubris to think they could do it all themselves and everyone in the world would learn PySpark one day.”
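That low barrier to entry is easy to picture. Here is a minimal sketch, assuming the snowflake-connector-python package and hypothetical connection values and table names, of how an analyst with ordinary SQL skills could pull an aggregate out of Snowflake from Python:

```python
# A minimal sketch: querying Snowflake with plain SQL from Python.
# Requires the snowflake-connector-python package; every connection
# value below is a placeholder, not a real account or credential.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # hypothetical account identifier
    user="analyst",
    password="...",
    warehouse="ANALYTICS_WH",  # hypothetical virtual warehouse
    database="SALES",          # hypothetical database and schema
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # Ordinary SQL: the same GROUP BY any MySQL-trained analyst
    # already writes, with no Spark or distributed-systems knowledge.
    cur.execute(
        "SELECT region, SUM(amount) AS revenue "
        "FROM orders GROUP BY region ORDER BY revenue DESC"
    )
    for region, revenue in cur.fetchall():
        print(region, revenue)
finally:
    conn.close()
```

The point isn’t the particular library; it’s that the query itself is the familiar SQL analysts already know, which is exactly why the “nerdy engineers” in that Reddit comment are the only holdouts.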
I’ve written recently about data democratization, and how enterprises are trying to give more employees the access and the ability to work with more, and more varied, data. I noted that if enterprises want to truly democratize data, they’ll need to teach employees how to use cloud-based tools effectively to probe cloud-based data.
Fortunately, the cloud also enables machine learning systems to take some of the heavy load. As my MongoDB colleague Adam Hughes writes, “Combining real-time, operational, and embedded analytics—what some call translytics, HTAP, or augmented transaction databases—now enables analytics driven by application data to help determine, influence, and automate decision-making for the app and provide real-time insights for the user.” This doesn’t mean machines do the thinking for us, but rather that they remove the undifferentiated heavy lifting of computation-heavy data processing, leaving the user with the more thoughtful work of understanding what that data means for an application and, ultimately, the business.
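To make that idea concrete, here is a minimal sketch of analytics running directly against operational application data, using a MongoDB aggregation pipeline via pymongo. The connection string, collection, and field names are hypothetical, and the same pattern applies to other translytics/HTAP-style systems:

```python
# A minimal sketch of embedded, real-time analytics on operational
# data: a MongoDB aggregation pipeline computed where the app data
# lives, with no ETL hop to a separate analytics system first.
from datetime import datetime, timedelta, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
orders = client["shop"]["orders"]  # hypothetical collection

one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)

# Summarize the last hour of live transactions: top products by
# revenue, ready to feed a dashboard or an in-app recommendation.
pipeline = [
    {"$match": {"created_at": {"$gte": one_hour_ago}}},
    {"$group": {"_id": "$product_id",
                "units": {"$sum": "$quantity"},
                "revenue": {"$sum": "$amount"}}},
    {"$sort": {"revenue": -1}},
    {"$limit": 5},
]

for doc in orders.aggregate(pipeline):
    print(doc["_id"], doc["units"], doc["revenue"])
```

The machine does the computation-heavy grouping and sorting over live data; the human decides what a sudden revenue spike should mean for the application and the business.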
None of this is driven entirely by cloud, but all of it is enhanced and accelerated by cloud. Data has never been more important, and accessing and understanding data has never been easier, thanks to cloud computing. If you want a near-certain prediction for 2023, it’s that this trend will continue and accelerate.