Data issues are not new: data integration, data security, data management, and defining a single source of truth have challenged enterprises for years. What is new is combining these issues with multicloud deployments. Many of these problems can be avoided with a bit of upfront planning and the common data architecture best practices that have been understood for years.
The core problem, as I see it, is enterprises attempting to lift and shift data to multicloud deployments without good forethought around the common problems that are likely to arise:
Forming data silos. The use of multiple cloud services can result in isolated data silos, making it difficult to integrate and manage data across multiple platforms. This should come as no surprise to anyone, but in many respects, multicloud has made data silos more numerous.
These silos need to be dealt with through well-understood approaches such as data integration technology or data abstraction/virtualization. Or, better yet, design your data storage systems not to become silos in the first place.
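The virtualization idea can be sketched in a few lines: put a uniform read interface in front of each cloud's store so callers never see the silo. This is a minimal illustration, not a real implementation; the adapter classes and in-memory records are hypothetical stand-ins for provider SDK calls.

```python
from abc import ABC, abstractmethod

class DataSource(ABC):
    """Uniform read interface over one cloud-specific data store."""
    @abstractmethod
    def read(self, dataset: str) -> list[dict]:
        ...

class AwsS3Source(DataSource):
    # Hypothetical adapter; a real one would call the AWS SDK.
    def __init__(self, records: dict):
        self._records = records
    def read(self, dataset: str) -> list[dict]:
        return self._records.get(dataset, [])

class AzureBlobSource(DataSource):
    # Hypothetical adapter; a real one would call the Azure SDK.
    def __init__(self, records: dict):
        self._records = records
    def read(self, dataset: str) -> list[dict]:
        return self._records.get(dataset, [])

class VirtualDataLayer:
    """Federates reads across clouds behind one interface."""
    def __init__(self, sources: dict[str, DataSource]):
        self._sources = sources
    def read(self, dataset: str) -> list[dict]:
        rows = []
        for source in self._sources.values():
            rows.extend(source.read(dataset))
        return rows

layer = VirtualDataLayer({
    "aws": AwsS3Source({"orders": [{"id": 1}]}),
    "azure": AzureBlobSource({"orders": [{"id": 2}]}),
})
print(len(layer.read("orders")))  # 2
```

The point of the pattern is that applications depend on the abstract layer, not on any one provider's storage API.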
Neglecting data security. Ensuring the security of sensitive data across multiple cloud services can be a complex task and often increases security risks.
It is essential to have a robust data security strategy in place that addresses the unique security needs of each cloud service without adding to the complexity of managing data security. This often means abstracting native security services with a central security manager or other technology that sits above the public cloud providers, in other words, a supercloud or metacloud. This layer of logical technology exists above the clouds, and the concept is gaining real momentum right now.
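The "define once, enforce everywhere" idea behind that abstraction layer looks roughly like this: one central policy is translated into each provider's native access rules. All policy fields, translator functions, and output shapes here are illustrative assumptions, not real provider APIs.

```python
# Central policy, defined once above the clouds.
POLICY = {"dataset": "customer_pii", "allow_roles": ["analyst"], "encrypt": True}

def to_aws_rule(policy: dict) -> dict:
    # Hypothetical translation into an AWS-style allow statement.
    return {"Effect": "Allow", "Principal": policy["allow_roles"],
            "Resource": f"arn:aws:s3:::{policy['dataset']}"}

def to_azure_rule(policy: dict) -> dict:
    # Hypothetical translation into an Azure-style role assignment.
    return {"roleName": "Reader", "assignees": policy["allow_roles"],
            "scope": f"/datasets/{policy['dataset']}"}

TRANSLATORS = {"aws": to_aws_rule, "azure": to_azure_rule}

def enforce(policy: dict, clouds: list[str]) -> dict:
    """Return the provider-native rule each cloud should apply."""
    return {cloud: TRANSLATORS[cloud](policy) for cloud in clouds}

rules = enforce(POLICY, ["aws", "azure"])
print(sorted(rules))  # ['aws', 'azure']
```

Security teams then maintain one policy, and the translation layer absorbs the per-cloud differences.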
Not considering data portability. Migrating data from one cloud service to another can be challenging. It is important to have a solid data portability strategy in place that considers data format, size, and dependencies.
Most of those moving to multicloud can’t answer this question: “What would it take to migrate this data set from here to there?” You need that answer in your back pocket, as we’re seeing some data sets move from single-cloud and multicloud deployments back to on-premises systems. You must give yourself options.
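A portability strategy can start as a back-of-the-envelope estimate that answers exactly that question. The sketch below is a hypothetical helper; the field names, egress rate, and bandwidth defaults are illustrative assumptions, not real pricing.

```python
def migration_estimate(dataset: dict, egress_cost_per_gb: float = 0.09,
                       mbps: int = 1000) -> dict:
    """Rough answer to 'what would it take to move this data set?'"""
    size_gb = dataset["size_gb"]
    # Size in gigabits divided by line rate gives transfer time.
    hours = (size_gb * 8 * 1024) / (mbps * 3600)
    blockers = [d for d in dataset["dependencies"] if d.get("cloud_specific")]
    return {
        "egress_cost_usd": round(size_gb * egress_cost_per_gb, 2),
        "transfer_hours": round(hours, 1),
        "needs_format_conversion": dataset["format"] not in ("parquet", "csv"),
        "blocking_dependencies": [d["name"] for d in blockers],
    }

report = migration_estimate({
    "size_gb": 500,
    "format": "proprietary",
    "dependencies": [{"name": "vendor-etl", "cloud_specific": True},
                     {"name": "dbt-models", "cloud_specific": False}],
})
print(report["egress_cost_usd"])        # 45.0
print(report["blocking_dependencies"])  # ['vendor-etl']
```

Even a crude estimate like this surfaces the format conversions and cloud-specific dependencies that make a migration painful before you are locked in.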
No centralized data management. Managing data across multiple cloud services can be a resource-intensive task if you attempt to do everything manually. It is essential to have a centralized data management system in place that can handle diverse data sources and ensure data consistency. Again, this needs to be centralized, abstracted above the public cloud providers and their native data management implementations. You need to deal with data complexity on your terms, not on the terms the complexity dictates. Most enterprises let the complexity dictate, which is a huge mistake.
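At its simplest, that centralized layer is a catalog: every data set is registered once, with its schema checked for consistency wherever it lives. This is a minimal sketch under assumed names; a production catalog would add lineage, ownership, and access metadata.

```python
class DataCatalog:
    """Single registry of data sets living across multiple clouds."""

    def __init__(self):
        self._entries = {}  # name -> {"schema": dict, "locations": set}

    def register(self, name: str, schema: dict, cloud: str) -> None:
        entry = self._entries.setdefault(
            name, {"schema": schema, "locations": set()})
        # Refuse registrations whose schema drifts from the canonical one.
        if entry["schema"] != schema:
            raise ValueError(f"schema drift for {name!r} on {cloud}")
        entry["locations"].add(cloud)

    def locations(self, name: str) -> list[str]:
        return sorted(self._entries[name]["locations"])

catalog = DataCatalog()
catalog.register("orders", {"id": "int", "total": "float"}, "aws")
catalog.register("orders", {"id": "int", "total": "float"}, "gcp")
print(catalog.locations("orders"))  # ['aws', 'gcp']
```

The consistency check is the payoff: the catalog, not each provider's console, becomes the single source of truth about what data exists and where.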
Lacking interoperability. The big issue is interoperability. It’s really a combination of the problems listed thus far: data silos, poor data portability, and the lack of centralized data management. But it’s worth calling out on its own.
Ensuring the interoperability of different cloud services and cloud data can be a huge pain in the neck. It is important to have a clear understanding of the data exchange standards supported by each cloud service and a plan for bridging any gaps.
Most data is just tossed into multicloud deployments with little thought and no interoperability mechanisms. Interoperability then becomes a tactical effort when it should be strategic and well understood before and after deployment.
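Making interoperability strategic can be as simple as agreeing on a neutral exchange schema up front and normalizing each cloud's native record shape into it. The field mappings below are hypothetical; the technique is the point.

```python
# Per-provider field names mapped to one agreed exchange schema.
MAPPINGS = {
    "aws": {"OrderId": "order_id", "Amt": "amount"},
    "azure": {"id": "order_id", "orderAmount": "amount"},
}

def normalize(record: dict, cloud: str) -> dict:
    """Rename provider-specific fields to the shared exchange schema."""
    mapping = MAPPINGS[cloud]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

aws_rec = normalize({"OrderId": 7, "Amt": 19.5}, "aws")
azure_rec = normalize({"id": 7, "orderAmount": 19.5}, "azure")
print(aws_rec == azure_rec)  # True
```

Once every system produces and consumes the shared schema, adding a new cloud means writing one mapping rather than a mesh of point-to-point converters.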
The frustrating thing about all these challenges is that they are very solvable, with well-established solution patterns and enabling technologies. Enterprises are making dumb mistakes by dashing to multicloud deployments as quickly as they can, and then they don’t see the ROI from multicloud or cloud migrations in general. Most of the damage is self-inflicted.
Do your homework. Plan. Leverage the proper technology. It’s not that hard, and it will save you and your enterprise a ton of time and money in the long run.