The two most frequently cited advantages of cloud-based platforms are pay-per-use billing and the ability to scale up to an almost unlimited number of resources. No more buying ahead of demand, trying to guess how much physical hardware and software you’ll need.
But enterprise IT needs to understand that scale and cost are coupled concepts in cloud computing. The more resources you use, whether through self-scaling or auto-scaling, the more you pay. And how much you pay can depend as much on your architecture patterns as on the price of the resources themselves. Here’s why.
In building cloud-based systems, I’ve found that cloud architecture really comes down to making a series of right decisions. Those who make bad decisions aren’t punished; their systems are just underoptimized. The fact that everything works conceals the reality that you may be paying twice as much as you would if the architecture were fully optimized for scaling and cost.
This can be a major factor when deciding whether to refactor or rewrite applications to be optimized for a specific cloud platform (cloud native), or when selecting core enabling technologies such as microservices, event-driven design, containers, and container orchestration. These decisions determine what you’ll see on your cloud bill at the end of the month.
What should architects think about when it comes to cost and scalability? I have a few general architecture patterns to recommend:
Learn to tune cloud-based applications to optimize their use of every cloud service they consume. In other words, optimize applications so that they use the fewest resources needed to process data and drive functionality.
This kind of architectural optimization was commonplace back in the early days, when we were dealing with 8KB of memory on 1970s-era equipment. These days, developers are not as disciplined about writing applications that take a minimalist approach to resources. But if you do, you’ll find that the application can scale fast and far, with much lower incremental cost.
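As a minimal sketch of what I mean by a minimalist approach (the file name and the “count errors in a log” task are my own illustration, not tied to any particular platform), processing data as a stream instead of loading it all into memory keeps the compute footprint, and therefore the instance size you pay for, small:

```python
# Hypothetical illustration: scan a large log file incrementally instead of
# reading it all into memory, so the job fits on a smaller (cheaper) instance.

def count_error_lines(path: str) -> int:
    """Stream the file line by line; memory use stays flat regardless of file size."""
    errors = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:              # only one line held in memory at a time
            if "ERROR" in line:
                errors += 1
    return errors

if __name__ == "__main__":
    print(count_error_lines("app.log"))  # "app.log" is a placeholder path
```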
Learn to deallocate services as soon as they are no longer needed. In many instances I’ve seen developers provision cloud resources, such as virtual servers, and then fail to deprovision them the moment they’re no longer needed. Or worse, never deprovision them at all, leaving zombie processes eating resources and adding to the bill. If you look at what’s running on your cloud right now, I bet at least 20 processes are costing money and doing nothing.
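Here’s a sketch of the kind of cleanup job I have in mind, assuming AWS with the boto3 SDK and a hypothetical “lifecycle: ephemeral” tag that teams apply to short-lived servers. The tag name and the four-hour TTL are assumptions for illustration; the specifics will differ by cloud and by shop:

```python
# Sketch: stop tagged "ephemeral" EC2 instances that have been running longer
# than an assumed TTL. The tag and TTL are illustrative, not an AWS convention.
# Requires boto3 and AWS credentials with EC2 permissions.
from datetime import datetime, timedelta, timezone
import boto3

TTL = timedelta(hours=4)            # assumed maximum lifetime for ephemeral servers
ec2 = boto3.client("ec2")

resp = ec2.describe_instances(
    Filters=[
        {"Name": "tag:lifecycle", "Values": ["ephemeral"]},      # hypothetical tag
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)

stale = [
    inst["InstanceId"]
    for reservation in resp["Reservations"]
    for inst in reservation["Instances"]
    if datetime.now(timezone.utc) - inst["LaunchTime"] > TTL
]

if stale:
    ec2.stop_instances(InstanceIds=stale)   # stop, not terminate, to stay reversible
    print(f"Stopped {len(stale)} idle instances: {stale}")
```

Run on a schedule, a job like this keeps forgotten servers from quietly billing you for weeks.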
Understand the scalability trade-offs. It’s fine to allocate resources such as storage and compute when you need them, but the granularity of what you allocate, and how you allocate it, makes a huge difference.
For instance, if you’re allocating a terabyte of storage when a few gigabytes would do, you’re not being optimal. The notion of provisioning resources with “headroom” gets you in trouble: you’re unlikely to return the unneeded capacity to the pool to reduce costs, so the excess just keeps running and keeps billing.
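A trivial sketch of the idea, with the 25 percent growth margin and 10GB floor as my own assumptions: size the request from measured usage plus an explicit, modest margin instead of a round “big enough” number.

```python
# Sketch: derive a storage request from measured usage plus an explicit margin,
# rather than defaulting to a round "safe" size like 1 TB. The 25% margin and
# 10 GB floor are assumptions for illustration.
import math

def storage_request_gb(measured_usage_gb: float,
                       growth_margin: float = 0.25,
                       minimum_gb: int = 10) -> int:
    """Return the number of gigabytes to provision."""
    sized = math.ceil(measured_usage_gb * (1 + growth_margin))
    return max(sized, minimum_gb)

# A workload measured at 60 GB gets a 75 GB allocation, not a 1 TB one.
print(storage_request_gb(60))   # -> 75
```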
Serverless computing comes in handy here. It allocates resources only for as long as the application needs them to process a request, then returns them to the pool immediately. However, not all applications are economically portable to serverless systems; for those, the architect needs to make the kinds of choices outlined above.
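As a minimal sketch of the serverless model, assuming AWS Lambda and Python (any function-as-a-service platform follows the same shape, and the event fields here are a hypothetical example): the code exists only for the duration of an invocation, so you pay per request rather than for an idle server.

```python
# Sketch of an AWS Lambda-style handler: resources are held only while the
# event is processed, and billing follows that duration. The "items" event
# shape is a hypothetical example, not a standard payload.

def handler(event, context):
    """Tally an order total from the incoming event and return it."""
    items = event.get("items", [])
    total = sum(item.get("price", 0) * item.get("quantity", 1) for item in items)
    return {"statusCode": 200, "body": {"order_total": total}}
```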
There’s no free lunch with cloud computing. It’s easy to get things running, but optimizing workloads for scalability and cost is the new area where we need to grow talent. Reading this post is a good first step.