Devops is primarily associated with the collaboration between developers and operations to improve the delivery and reliability of applications in production. The most common best practices aim to replace manual, error-prone procedures managed at the boundaries between dev and ops teams with more robust automations. These include automating the delivery pipeline with CI/CD (continuous integration and continuous delivery), standardizing configurations with containers, and configuring infrastructure as code. On the ops side, devops best practices to improve application reliability include improving apps’ observability, increasing monitoring, and automating cloud and infrastructure operations.
But what about improving the performance of applications, databases, data pipelines, and cloud infrastructure? For this post, I consulted with experts and identified seven areas where devops practices and methodologies can improve performance and the user experience.
1. Build security practices into apps from day one
The last thing devops teams need is to deploy new capabilities with security vulnerabilities. A security outage or degradation impacts user experiences and creates significant business issues. A devops best practice is to shift-left security by collaborating with infosec on requirements, testing for code vulnerabilities within CI/CD pipelines, and implementing other security practices in software development.
Mike Elissen, senior developer advocate at Akamai, says, “A critical component of app reliability is availability, and taking the appropriate measures to properly secure an app against web application attacks, DDoS attacks, and more can mean the difference between staying online and offline.”
Elissen says that shifting left is part of transforming from devops to devsecops. He continues, “We’re seeing the ‘shift left’ on adding security into devops become more and more pronounced, ultimately creating a stronger devsecops culture and making more developers aware and responsible for the security of their apps.”
2. Standardize architecture and infrastructure blueprints
Amir Rozenberg, vice president of product management at Quali, states a problem that impacts team performance. He says, “Many devops teams are finding themselves organizing the chaos of organically grown application infrastructure definitions, which were developed in good faith early on to enable team efficiencies in the software development life cycle.”
Rozenberg asks whether devops teams should apply a do-it-yourself approach to creating environments or whether the organization should create standards. He says, “The recommended approach is to establish a central team to model environments in the form of blueprints so that they are reliable, reusable, and compliant. They then need the ability to distribute those infrastructure definitions so they are available for the business constituents to consume via rapid self-service, whether integrated into the automated pipeline or in a manual fashion.”
3. Institute observability and continuous testing in the CI/CD pipeline
Matt Sollie, director of devops at 66degrees, believes that CI/CD can do more than just package and push code. He says, “Not all devops principles are as glamorous or visible as continuous delivery or building everything as code, but they are just as important. Continuous integration is one component of a mature devops posture that can add a great deal to the reliability of an application, but it takes purpose, vision, and time to build in a meaningful way.”
Sollie acknowledges that attaining reliability and performance objectives takes more than a vision; it also requires investing in practices and optimal architectures. He says, “Observability is a critical and expensive principle because reliability is not an on or off state and requires nuanced data gathering. With all the cloud computing services, selecting the right tool or service for the job can bring inherent reliability and performance benefits.”
What should agile dev teams implement in their pipelines to improve performance? Here are some recommendations:
- Implement continuous testing before increasing deployment frequencies
- Consider service virtualization to test microservices and third-party APIs
- Ensure observable CI/CD pipelines to improve fault detection and isolate pipeline issues
4. Control deployments with feature flags and canary releases
Deployments don’t have to be absolute cutovers where all users get all changes in one shot. Feature flags in the code help configure and control a feature’s availability, whereas canary release strategies enable devops teams to roll out new capabilities slowly and methodically.
John Kodumal, CTO and cofounder of LaunchDarkly, adds, “Feature management, specifically feature flags, are quickly becoming go-to devops practices that improve apps’ overall reliability and performance while allowing developers to innovate continuously. By employing feature flags, developers can test feature updates before production to troubleshoot issues before release.”
These controls not only improve reliability and performance but also help dev teams minimize disruptions. Kodumal says, “Feature flags improve performance while giving development teams the necessary controls to update applications without disruptions or downtime.”
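To make the idea concrete, here is a minimal Python sketch of a percentage-based feature flag, one simple way to approximate a canary-style rollout in application code. It is an illustration under stated assumptions, not how LaunchDarkly or any other product implements flags: the function names, the `new-checkout` flag, and the hashing scheme are all hypothetical.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percent: float) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hypothetical helper: hashing the flag name together with the user ID
    gives each user a stable bucket, so the same user sees the same
    behavior on every request while the percentage is unchanged.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:4], "big") % 10_000
    # percent is expressed as 0-100; 5.0 means buckets 0..499 are enabled.
    return bucket < percent * 100


def checkout(user_id: str, rollout_percent: float) -> str:
    """Route a user to the new or existing code path (illustrative only)."""
    if in_rollout("new-checkout", user_id, rollout_percent):
        return "new checkout flow"
    return "existing checkout flow"
```

Because a user's bucket never changes, raising the percentage from 5 to 25 to 100 only adds users to the new path, and setting it back to 0 turns the feature off for everyone without a redeployment.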
5. Establish rigorous observability and monitoring standards
On the ops side of devops responsibilities, teams should consider several best practices to improve app performance, including developing observability standards and improving monitoring.
Frédéric Harper, director of developer relationships at Mindee, says, “Devops must implement rigorous monitoring and observability processes to ensure that every piece of the application is working correctly and that server processes are running smoothly. By securing this element, the devops teams can gather valuable information to understand how users utilize applications, possibly prevent future issues, make it easier to support customers, and improve business or architecture decisions based on real data.”
6. Extend monitoring with AIops and automations
In the web 2.0 days, ops had just a handful of log files and monitoring tools to review when there was an outage or performance issue. Today, running microservices, serverless applications, and multicloud databases means significantly more data and tools to consult when resolving incidents and identifying root causes. AIops platforms that centralize monitoring data, use machine learning to correlate alerts, and help ops automate response and recovery across multiple platforms can help minimize performance impacts.
Mohan Kompella, vice president of product marketing at BigPanda, agrees, “AIops platforms can help devops teams preserve tooling autonomy and flexibility while also giving centralized incident responders the visibility they need to be the first line of defense for outages.”
7. Define SLOs and error budgets
Devops teams should weigh which practices yield the most benefit and best address risks. That requires teams to measure, learn, and collaborate on devops priorities, which isn’t easy when the benefits may not be realized until months or years after implementing the practices and tools.
One method to prioritize is adopting site reliability engineering practices, defining service-level objectives (SLOs), and using error budgets. When an app or microservice exceeds its error budget, it signals the devops team to identify causes and focus on solutions.
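As a rough illustration of the arithmetic, the Python sketch below treats an availability SLO as a fraction of successful requests over a measurement window and derives the error budget from it. The function and field names are hypothetical, and request-based accounting is only one of several ways teams define error budgets.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    allowed_failures: float  # failures the SLO tolerates in this window
    budget_consumed: float   # 1.0 means the budget is fully spent
    exceeded: bool           # True when failures exceed the budget


def error_budget_status(slo_target: float, total_requests: int,
                        failed_requests: int) -> BudgetStatus:
    """Compare observed failures with the budget implied by an SLO.

    Assumes a request-based SLO, e.g. slo_target=0.999 means 99.9% of
    requests in the window should succeed.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    allowed = (1.0 - slo_target) * total_requests
    if allowed > 0:
        consumed = failed_requests / allowed
    else:
        # A 100% target (or an empty window) leaves no budget at all.
        consumed = 0.0 if failed_requests == 0 else float("inf")
    return BudgetStatus(allowed, consumed, failed_requests > allowed)


# Example: a 99.9% SLO over 1,000,000 requests tolerates roughly 1,000
# failures, so 1,200 failures means the budget has been exceeded.
status = error_budget_status(0.999, 1_000_000, 1_200)
if status.exceeded:
    print("Error budget exceeded: prioritize reliability work")
```

A team might run a check like this per service and per window, and treat an exceeded budget as the trigger to pause feature work in favor of reliability fixes.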
Kit Merker, COO at Nobl9, says, “Service-level objectives set clear goals for engineering teams to make better decisions on how to prioritize their work. Devs and ITops can’t just be tech-centric but can move to service-centric.”
Devops teams have a lot on their plate, and devops best practices help teams balance their focus between accelerating dev and improving reliability and performance. The key to success may be in defining problem statements, debating approaches, iterating on solutions, and measuring impact.