When you embark on a transformational journey, you need to have an endpoint in sight—some form of target. The target can and probably will change, but today, as the team is working through changes and implementing new capabilities, its members should align them with measurable goals. They can then use key performance indicators (KPIs) to determine whether and how fast the activities of the team are helping to achieve these goals.
Setting measurable goals and communicating key performance indicators is critical for devops transformations. Devops drives a new set of practices, culture changes, and new ways for IT to collaborate on development and operational goals.
It may be a significant investment for larger organizations, and one that requires an ongoing review of priorities: For which applications should you develop continuous integration and delivery (CI/CD) pipelines? How do you prioritize which deployment jobs to automate? Which infrastructure patterns are worth automating with infrastructure as code (IaC)? What is the minimum standard for implementing application monitoring and continuous testing?
These are all questions of priority, but there are also questions of impact: Are you really delivering faster with devops? Is quality improving? Are applications more stable? Are you recovering from issues faster?
The art of defining KPIs is in selecting the ones that are most relevant to the goals. To get started, this article surveys 15 KPIs that you can use in your transformation program. I have divided the KPIs into four categories:
- Business KPIs, which directly impact business goals
- Change KPIs, which demonstrate IT’s ability to drive application, infrastructure, or service improvements
- Operating KPIs, which demonstrate operational excellence
- Cultural KPIs, which target how the IT organization collaborates
When devops has a direct business impact
If you can identify devops goals that directly affect strategic business goals and performance indicators, you can raise the importance of the transformation and potentially help fund it. Consider an e-commerce company with a slow or unstable application that improves its operating metrics by autoscaling cloud infrastructure based on user demand and by automating recovery procedures in response to application alerts. Another example may be a city government that regularly surveys constituents on their satisfaction with its technology services and sees those scores improve over time as the development team releases more features and capabilities faster. A third example could be a university that drives costs down by standing up a select set of computing stacks as infrastructure as code and demonstrating lower operating costs as departments and professors shift workloads to them from legacy environments.
These examples illustrate several business-impacting KPIs:
- Financial KPIs, where devops has a direct impact on revenue or costs
- Usage metrics, for customer-facing applications where there are defined business goals to increase usage
- User satisfaction KPIs, which can be affected by improving release frequency and quality or other operating metrics
The key factor in selecting these KPIs is whether there is a close correlation between the practices devops is targeting and the business KPI. If devops affects a business KPI only indirectly, or if it is just one of several factors driving changes to it, IT might want to consider other KPIs that have a more direct correlation.
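One rough way to check for that correlation is to compare a devops measure with a business KPI over the same periods. The sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson` and the sample numbers (monthly deployment counts and survey scores) are made up for illustration; they do not come from any real program.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical sample data: deployments per month and average survey score.
deployments_per_month = [2, 4, 6, 8]
satisfaction_score = [3.1, 3.4, 3.9, 4.2]

r = pearson(deployments_per_month, satisfaction_score)
print(f"correlation: {r:.2f}")
```

A value near 1 suggests the two measures move together, but it does not show that devops caused the change, so treat it as a screening step for choosing a KPI rather than as proof of impact.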
When devops drives faster and higher quality changes
One of the key devops objectives is to drive more frequent application releases, deliver more features and technical capabilities to users, and reduce the defects and operational issues that stem from changes.
Let’s face it: Many IT organizations are trying to improve on their reputation of being slow to respond and error-prone. How long does it take to get a fix to production? After you release a new feature, how many patch releases do you have to make to get the application stable again? How many critical defects are found in production?
Even well-performing application development teams strive to improve release frequency, application quality, and the velocity of new and improved features. To do this, they continually improve the automation in their CI/CD pipelines, automate more test cases, and shift left more application security practices.
It’s also not just about application development. How fast can IT spin up new Hadoop, Spark, and other data science platforms to run experiments? How fast are patches pushed out? How fast are firewall rules validated and implemented?
This second group of KPIs is designed to capture the speed and quality of changes:
- Change lead time. This defines the full duration from the time the change is requested to the time it’s fully implemented in production. Examples include making an application change, fulfilling a request for a computing environment, and implementing a critical systems patch.
- Features released per quarter. Business leaders think in quarters, so this is a KPI that they can interpret. You’ll need to have a reasonable definition of what is a feature and not get too caught up in differentiating what is a “small” or “large” feature. The main impact to demonstrate is that, with automation and fewer defects, the application development team is getting a lot more done.
- Deployment frequency. Are you deploying quarterly, monthly, weekly, daily, or hourly? The automation developed in CI/CD pipelines directly drives more frequent deployments.
- Test case automation. Three metrics worth tracking are the number of test cases that have been developed, the percentage of these that are automated, and the time it takes to run different tests. Some groups also find ways to measure test coverage: the percentage of application flows and application interfaces that have defined tests. The faster the automation runs, the more tests can be incorporated as continuous testing in the CI/CD pipeline.
- Defect escape rates. This counts the defects discovered in production. It can be reviewed by time period or as a ratio to the number of deployments.
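Several of these change KPIs can be computed from a simple log of change records. The following is a minimal sketch, assuming each record carries a request timestamp, a production deployment timestamp, and a count of defects later traced to that change. The `Change` class, its field names, and the sample data are hypothetical; a real team would pull these values from its ticketing and deployment tools.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Change:
    """One change record; the field names are hypothetical."""
    requested_at: datetime   # when the change was requested
    deployed_at: datetime    # when it was fully implemented in production
    escaped_defects: int     # defects later found in production for this change


def median_lead_time(changes: list[Change]) -> timedelta:
    """Change lead time from request to production, summarized as a median."""
    return median(c.deployed_at - c.requested_at for c in changes)


def deployments_per_week(changes: list[Change],
                         period_start: datetime,
                         period_end: datetime) -> float:
    """Deployment frequency over a reporting period."""
    in_period = [c for c in changes
                 if period_start <= c.deployed_at < period_end]
    weeks = (period_end - period_start) / timedelta(weeks=1)
    return len(in_period) / weeks


def defect_escape_rate(changes: list[Change]) -> float:
    """Escaped defects as a ratio to the number of deployments."""
    return sum(c.escaped_defects for c in changes) / len(changes)


# Hypothetical sample data covering a four-week period.
changes = [
    Change(datetime(2024, 1, 1), datetime(2024, 1, 3), 0),
    Change(datetime(2024, 1, 2), datetime(2024, 1, 8), 1),
    Change(datetime(2024, 1, 10), datetime(2024, 1, 14), 0),
    Change(datetime(2024, 1, 15), datetime(2024, 1, 25), 1),
]

lead_time = median_lead_time(changes)
frequency = deployments_per_week(
    changes, datetime(2024, 1, 1), datetime(2024, 1, 29))
escape_rate = defect_escape_rate(changes)
print(lead_time, frequency, escape_rate)
```

The median is used here instead of the mean so that one unusually slow change does not dominate the lead-time figure; that is a design choice, not a requirement of the KPI.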
Measure the devops impact on operational excellence
The next area that devops can impact is operations. IT department metrics such as uptime and application performance are often affected by devops programs. Going beyond these baseline KPIs, devops can also be measured in the following ways:
- Mean time to recovery (MTTR) and mean time to discovery (MTTD). These metrics better represent the business impact of operational improvements. All IT departments experience operational issues, and the key question is how fast they are discovered and resolved. Teams investing in monitoring, automated responses to alerts, application logging, and escalation procedures are likely to improve both MTTR and MTTD.
- Business disruption hours. This can be a tricky KPI to measure, but it speaks to the duration where the business, customers, or users were experiencing some form of technical disruption, including planned outages, unplanned outages, slow performance, delays in data processing, and workflow-impacting application defects.
- Technical debt closed. Technical debt captures a to-do backlog of technology improvements. Some technical debt comes from shortcuts that made it possible to release or fix something faster but were implemented less than optimally. Other technical debt comes from implementation or design flaws that are discovered only after users start working with the technology. This KPI speaks directly to how much technical debt is addressed. As technology teams improve their productivity through automation, there should be more time available to address technical debt issues.
- Devops-driven cost reductions. Some aspects of costs can be reduced through devops automation. Automating the shuttering or ramping down of unused or lightly used environments can generate savings. Monitoring applications and optimally selecting architecture and infrastructure can also generate savings.
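The first two operating KPIs can be derived from incident timestamps. The sketch below assumes each incident records when the problem started, when it was detected, and when it was resolved, and it measures recovery from the start of the problem. Some teams measure MTTR from detection instead, so that choice, along with the `Incident` class and the sample data, is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One operational incident; the field names are hypothetical."""
    started_at: datetime    # when users first experienced the disruption
    detected_at: datetime   # when IT discovered the issue
    resolved_at: datetime   # when service was restored


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    return sum(durations, timedelta()) / len(durations)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time to discovery: start of the issue to detection."""
    return mean_duration([i.detected_at - i.started_at for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to recovery, measured here from the start of the issue."""
    return mean_duration([i.resolved_at - i.started_at for i in incidents])


def disruption_hours(incidents: list[Incident]) -> float:
    """Business disruption hours: total time users were affected."""
    return sum((i.resolved_at - i.started_at) / timedelta(hours=1)
               for i in incidents)


# Hypothetical sample data: two incidents on the same day.
incidents = [
    Incident(datetime(2024, 3, 1, 9, 0),
             datetime(2024, 3, 1, 9, 10),
             datetime(2024, 3, 1, 10, 0)),
    Incident(datetime(2024, 3, 1, 14, 0),
             datetime(2024, 3, 1, 14, 30),
             datetime(2024, 3, 1, 16, 0)),
]

print(mttd(incidents), mttr(incidents), disruption_hours(incidents))
```

To cover planned outages and slow-performance periods in business disruption hours, the same `Incident` shape could be reused for those events, as long as the team agrees on what counts as a disruption.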
Measure the cultural impact of devops programs
If devops is designed to address the cultural gaps that have traditionally existed between developers and operational administrators and engineers, the question is whether this can be defined and measured with a KPI. The answer is yes, but it requires some thought about definitions and about easy ways to measure them. Some example KPIs:
- Team happiness. IT teams that were at each other’s throats were likely to blame each other for operational issues, missed deadlines, and poor communication. As devops aligns teams on objectives, drives automation, and reduces handoffs from one team to another, it should make everyone’s jobs easier and more pleasant. This can be measured using employee surveys and other employee-engagement metrics.
- Meeting efficiency. Organizations that aren’t aligned often require more and longer meetings to agree on priorities and resolve conflicts. Development and operational teams that become more aligned through devops should require fewer, shorter, and more punctual meetings.
- Learning KPIs. Most organizations have learning and development objectives, and devops programs are a great way to engage IT in learning new technologies and practices. Learning can best be measured by asking members of the organization to teach others what they’ve learned through events like lunch-and-learns, demos, discussions, and hackathons. Learning KPIs can aim to capture the frequency of these learning programs and their impact.
Pick the right devops KPIs that drive business objectives
No organization should try to implement 15 KPIs all at once. Smart organizations will select the appropriate ones that best align with medium-term goals and then will design lightweight approaches to measure them. They will weed out KPIs that are too hard to define or measure.
As with any transformational program, they will start with a small handful of metrics where they can demonstrate quick wins and then move to more challenging objectives.