Automation is a great tool. Instead of solving a problem once, you can build a solution that adapts to changing needs automatically, no humans required.
Cloud scalability is the best example of this. We no longer need to manually provision finite, static resources such as storage and compute. Instead, we set up automation (typically provided for us) that provisions whatever resources are needed without developers or architects even thinking about it.
Automated scaling mechanisms vary a great deal in number and type, but serverless is the best example of automated scalability. Serverless computing is now part of standard infrastructure, such as storage and compute provisioning, and it extends to containers, databases, and networking as well. Many resources that used to be statically configured can now “auto-magically” configure and provision exactly the resources needed to do the job and then return them to the pool after use.
Pretty soon, it will be easier to list the resources that are not serverless, given that cloud providers are all in on serverless and serverless cloud services are increasing each month. The serverless computing market had an estimated value of $7.29 billion in 2020. Furthermore, it’s projected to maintain a compound annual growth rate of 21.71% from 2021 to 2028, reaching a value of $36.84 billion by 2028.
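As a quick sanity check on those figures, compounding the 2020 base at the quoted rate lands close to the stated 2028 number; the small gap against $36.84 billion is the kind of rounding you’d expect from a summarized forecast:

```python
# Sanity check on the market figures quoted above, using the article's numbers.
base_2020 = 7.29   # USD billions, estimated 2020 market size
cagr = 0.2171      # compound annual growth rate, 2021-2028
years = 8          # 2020 through 2028

projected_2028 = base_2020 * (1 + cagr) ** years
print(f"Projected 2028 market size: ${projected_2028:.2f} billion")  # ~$35.10 billion
```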
The question, then, is whether we are always being cost-effective and fully optimized in terms of spending and resource utilization when we leave scalability to automated processes such as serverless and cloud-native autoscaling.
Of course, this is a complex issue. There’s seldom one correct path, and automation around scalability is no exception.
The pushback on automated scalability, at least on reflexively attaching it to cloud-based systems to ensure that they never run out of resources, is that in many situations the resulting operations won’t be cost-effective or efficient. For example, an inventory control application for a retail store may need to support 10 times its normal processing load during the holidays. The easiest way to ensure that the system can automatically provision the extra capacity it needs around seasonal spikes is to leverage automated scaling, such as serverless or more traditional autoscaling services.
The issues arise when you look at the cost optimization of that specific solution. Say an inventory application exhibits behaviors that the scaling automation reads as a need for more compute or storage. Those resources are automatically provisioned to support the anticipated additional load. However, for this specific application, the behaviors that trigger provisioning don’t actually indicate a need for more resources. For instance, a momentary spike in CPU utilization is enough to bring 10 additional compute servers online to meet a demand that never materializes. You end up paying 5 to 10 times as much for resources that are never really utilized, even if they are returned to the pool moments after they are provisioned.
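To make that failure mode concrete, here’s a minimal sketch, with hypothetical thresholds and sample data rather than any provider’s actual autoscaler logic, contrasting an instantaneous CPU trigger with one that requires the threshold to hold across an evaluation window:

```python
# Hypothetical CPU samples (percent), one per minute: a single momentary spike.
samples = [22, 25, 24, 96, 23, 21, 26, 24]

THRESHOLD = 80       # scale-out trigger, percent CPU
WINDOW = 3           # minutes the threshold must hold before scaling out
SCALE_OUT_STEP = 10  # servers added per scale-out event (per the example above)

def naive_scale_events(cpu_samples):
    """Scale out the instant any single sample crosses the threshold."""
    return sum(1 for cpu in cpu_samples if cpu > THRESHOLD)

def sustained_scale_events(cpu_samples):
    """Scale out only if the threshold holds for WINDOW consecutive samples."""
    events, streak = 0, 0
    for cpu in cpu_samples:
        streak = streak + 1 if cpu > THRESHOLD else 0
        if streak == WINDOW:
            events, streak = events + 1, 0
    return events

print("Naive trigger adds:", naive_scale_events(samples) * SCALE_OUT_STEP, "servers")          # 10
print("Sustained trigger adds:", sustained_scale_events(samples) * SCALE_OUT_STEP, "servers")  # 0
```

Most managed autoscalers expose knobs along these lines (evaluation periods, cooldowns, stabilization windows), but they have to be tuned to the application’s actual behavior rather than left at defaults.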
The core point is that using autoscaling mechanisms to determine resource need is not always the best way to go. Leaving scalability entirely up to automation makes it far more likely that you will provision too many or too few resources than if resources are matched to the exact needs of the application.
So, we can turn on autoscaling, let the cloud provider decide, and end up spending 40% more but never worry about scalability. Or we can do more-detailed system engineering, match the resources to actual need, and provision them in a more accurate and cost-effective way.
There’s no one answer here. Some systems I build are much more reliable and cost-effective with automated scaling; they tend to be more dynamic in their use of resources, and it’s better to have an automated process attempt to keep up.
But we’re leaving money on the table in many of these use cases. Most system capacity calculations are well understood, and so the resources needed are also well understood. In these cases, we’ll often find that if we take back control of resource provisioning and de-provisioning, with back-of-the-envelope math like the sketch below, we end up with more cost-effective approaches to cloud-based application deployments that can save hundreds of thousands of dollars over the years. Just saying.
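For illustration, here’s what that capacity math can look like when the load profile is well understood, a minimal sketch with entirely hypothetical numbers: known peak demand plus deliberate headroom tells you exactly what to provision, no trigger-chasing required:

```python
import math

# Back-of-the-envelope capacity plan; all numbers here are hypothetical.
peak_requests_per_sec = 1200    # measured seasonal peak
requests_per_instance = 150     # benchmarked per-instance throughput
headroom = 1.25                 # 25% safety margin

peak_instances = math.ceil(peak_requests_per_sec / requests_per_instance * headroom)
print(f"Provision {peak_instances} instances ahead of the peak window")  # 10

# The same math with the everyday load tells you what to scale back to afterward.
normal_requests_per_sec = 120
baseline_instances = math.ceil(normal_requests_per_sec / requests_per_instance * headroom)
print(f"De-provision back to {baseline_instances} instance(s) after the peak")  # 1
```

Most cloud providers support scheduled scaling actions that implement exactly this pattern: provision ahead of the known peak and de-provision after it.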