As cloud native computing tooling and practices mature, there’s an increasing need to bring them into existing management and monitoring environments. That means integrating cloud native tooling, often built around the OpenTelemetry standard, with the monitoring frameworks and dashboards that you’re already using.
Building on those existing toolsets is key to adoption, as devops and cloudops teams have already built workflows around familiar interfaces. Getting monitoring and management in place is a priority, because time spent building new interfaces and integrations is time that could be better spent keeping applications and services running.
There’s an additional challenge here. Cloud native computing requires a new layer in your enterprise stack, comprising the new platforms that host cloud native applications. These bring their own management and monitoring issues, as you need to track resource usage and scaling, ensuring that new application nodes are operating correctly. While these tools, especially Kubernetes, come with their own monitoring services, it’s essential that these be integrated with your existing infrastructure and application monitoring.
Luckily for us, the adoption of OpenTelemetry and of time-series storage in monitoring tools like Prometheus makes it relatively easy to merge application, infrastructure, and platform telemetry stores, using tools like Azure Monitor to query across data and logs.
Azure Monitor for Prometheus
At Microsoft Build last month, Microsoft announced general availability for a managed Prometheus time-series store in Azure Monitor. Originally unveiled in the fall of 2022, this managed service brings Prometheus to Azure (working in both Azure Monitor and in Azure Managed Grafana), giving you access to the familiar open-source visualization tools along with Microsoft’s own container monitoring tools.
Bringing Prometheus to Azure Monitor is a no-brainer. While Microsoft has its “own” Kubernetes distribution in Azure Kubernetes Service, it’s a managed implementation of the open-source platform, so it has all the same APIs and support for all the standard Kubernetes tools. It has always been possible to run your own Prometheus instances in Azure, an approach that works well for relatively small systems, where you don’t need to think about scaling anything besides your application.
Things get harder with larger-scale AKS deployments, where you need to think about scaling storage and adding high availability. There are additional issues that come with running Kubernetes inside a regulated industry, as you need to consider how your data retention requirements affect your Prometheus store. Switching to a managed Prometheus service simplifies this, as Microsoft provides tools that automate much of the process of scaling and securing your data, as well as making sure that Prometheus is up-to-date and running with the latest patches. You don’t need to factor in the workload associated with managing Prometheus; all you need to do is write, read, and analyze the data stored in it.
You’re not limited to using Azure Monitor with managed Prometheus. Any existing PromQL tools and scripts can be used with Azure, ensuring any rules you’ve built around Prometheus data will still run. As far as your code is concerned, Azure’s managed Prometheus looks like any other Prometheus endpoint, with the same support for data ingestion and queries. This approach lets you migrate from other Kubernetes environments to Azure, and ensures that the metrics you consider important remain accessible.
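As a rough illustration of what “looks like any other Prometheus endpoint” means in practice, the sketch below builds an instant-query request against the standard Prometheus HTTP API path, /api/v1/query. The endpoint URL, the token, and the build_instant_query helper are all placeholders invented for this example, and the use of a bearer token for authentication is an assumption that should be checked against Microsoft’s documentation. The request is constructed but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_instant_query(endpoint: str, promql: str, token: str) -> Request:
    """Build (but do not send) a Prometheus instant-query request."""
    # /api/v1/query is the standard Prometheus HTTP API path for instant queries.
    query_string = urlencode({"query": promql})
    url = f"{endpoint.rstrip('/')}/api/v1/query?{query_string}"
    # Assumption: the managed endpoint expects an OAuth-style bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical endpoint and token, for illustration only.
request = build_instant_query(
    "https://prometheus.example.invalid/",
    "sum by (job) (up)",
    "<access-token>",
)
print(request.full_url)
```

Because only the base URL and the credential change, the same PromQL string could in principle be pointed at a self-hosted Prometheus server or at the managed service.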
Prometheus at Azure scale
Because Azure’s managed Prometheus is built on Azure storage, it can be used as extended storage for on-premises applications. This allows you to use Azure Monitor and Grafana as a single pane of glass for monitoring both on-prem and cloud-hosted Kubernetes clusters, while still supporting existing PromQL code. As managed Prometheus is designed to support multiple clusters, Microsoft notes that common usage is a separate instance per Azure region. Queries work across regions, allowing you to build custom dashboards in Grafana or in Azure Monitor.
Microsoft has designed its Prometheus service to be scalable and resilient, with a high availability mode that runs collectors on each node in your Kubernetes infrastructure. At the same time, much like other Azure managed services, data is stored in your chosen region and in another region in the same Azure geography. So, if your primary Prometheus store is in West US, your secondary will be somewhere like East US, ensuring that even if your default data center has an outage your metrics will be stored on the secondary.
Getting started with Azure Monitor for Prometheus
Enabling the Prometheus service for use with AKS is easy enough. Start by creating an Azure Monitor workspace to store your metrics. Then connect your Kubernetes instances to Prometheus, either directly or through Container Insights. Once you have a workspace, connect it to Azure Managed Grafana to set up dashboards and visualizations. Azure Monitor will host rules and alerts, which are written in PromQL and are used to trigger actions or send notifications. Usefully, Azure Managed Prometheus is a supported source of events for KEDA (Kubernetes Event-driven Autoscaling), so you can use rules to drive scaling outside of the basic Kubernetes resource-driven model.
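To make the contrast with resource-driven scaling concrete, here is a simplified sketch of the arithmetic a metric-driven autoscaler applies to a query result: divide the observed value by a per-replica target, round up, and clamp to a permitted range. This illustrates the general idea only and is not KEDA’s actual implementation; the desired_replicas function, its parameters, and the sample numbers are hypothetical.

```python
import math


def desired_replicas(metric_value: float, target_per_replica: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return a replica count that keeps each replica near its target load."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Round up so that no replica is asked to carry more than the target.
    wanted = math.ceil(metric_value / target_per_replica)
    # Clamp to the allowed range.
    return max(min_replicas, min(max_replicas, wanted))


# Example: a PromQL query reports 450 requests per second across the service,
# and each replica is expected to handle roughly 100 requests per second.
print(desired_replicas(450, 100))
```

The metric here could be anything a PromQL query can return, such as queue depth or request rate, rather than only CPU or memory.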
Configuring an AKS cluster to use the service is relatively simple. Both the direct delivery and Container Insights options install a containerized version of the Azure Monitor agent, which collects metrics from your cluster and any running nodes. There is one requirement for your cluster: It must use managed identity authentication. You should already have this in place, as it’s best practice if you’re using AKS with other Azure services.
Microsoft has automated much of the process of setting up the monitoring agent for Linux containers—Azure Monitor will configure and deploy it as necessary. If you’re using Windows containers with AKS, then you must (for now) configure much of the monitoring service manually, including running Microsoft-supplied YAML and configmaps. Once the agent has been deployed you can use kubectl to check that it is running on your node pools.
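One way to script that check is to ask kubectl for JSON output (for example, kubectl get pods with the -o json flag) and inspect each pod’s phase. The sketch below parses that JSON rather than calling kubectl itself. The “ama-metrics” name prefix and the kube-system namespace are assumptions about how the agent is deployed, so confirm them against your own cluster, and the sample pod names are invented.

```python
import json


def agent_pod_phases(pods_json: str, name_prefix: str = "ama-metrics") -> dict[str, str]:
    """Map each matching pod name to its phase, given `kubectl ... -o json` output."""
    items = json.loads(pods_json).get("items", [])
    return {
        pod["metadata"]["name"]: pod.get("status", {}).get("phase", "Unknown")
        for pod in items
        if pod["metadata"]["name"].startswith(name_prefix)
    }


def agent_is_healthy(phases: dict[str, str]) -> bool:
    """True only if at least one agent pod exists and all of them are Running."""
    return bool(phases) and all(phase == "Running" for phase in phases.values())


# Invented sample resembling the output of a kubectl pod listing in JSON form.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "ama-metrics-node-abc12"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "ama-metrics-5d9f7-xyz34"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "coredns-7c9b6-qwe56"}, "status": {"phase": "Running"}},
    ]
})

phases = agent_pod_phases(sample)
print(phases)
print(agent_is_healthy(phases))
```

Treating an empty result as unhealthy guards against the case where the agent was never deployed to the cluster at all.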
The default settings for metric collection should be enough for most applications. Microsoft provides a long list of available metrics and targets, along with automatically provisioned dashboards in Grafana (with source code in GitHub). You can then add your own dashboards and rules to manage your cloud native applications your way.
Prometheus monitoring for AKS and Azure Arc
Usefully, managed Prometheus can even operate as an endpoint for Azure’s new OpenCost support. It’s also available as part of the existing Container Insights tooling, so it can be quickly added to monitored clusters as a new source for Azure Managed Grafana. This way your account will be automatically provisioned with a set of sample dashboards that simplify getting started and can be used as a basis for your own dashboards.
Pricing is reasonable, with ingestion starting at $0.16 for 10 million samples and queries at $0.001 per 10 million samples processed. There are no additional storage charges, and data is retained for 18 months.
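To get a feel for what those rates mean, the sketch below turns them into a back-of-the-envelope estimate. It uses only the two starting prices quoted above and ignores any tiering, discounts, or later price changes; the workload figures (1,000 time series scraped every 30 seconds, and 500 million samples processed by queries over 30 days) are invented for illustration.

```python
INGEST_USD_PER_10M = 0.16   # starting ingestion rate quoted above
QUERY_USD_PER_10M = 0.001   # query rate quoted above
SAMPLES_PER_UNIT = 10_000_000


def estimate_cost(samples_ingested: int, samples_queried: int) -> float:
    """Estimate cost in US dollars from the two per-10-million-sample rates."""
    ingest = samples_ingested / SAMPLES_PER_UNIT * INGEST_USD_PER_10M
    query = samples_queried / SAMPLES_PER_UNIT * QUERY_USD_PER_10M
    return ingest + query


# Hypothetical workload: 1,000 series, one sample every 30 seconds, for 30 days.
samples_per_series = (60 // 30) * 60 * 24 * 30   # 86,400 samples per series
ingested = 1_000 * samples_per_series            # 86,400,000 samples in total
queried = 500_000_000                            # samples processed by queries

print(f"${estimate_cost(ingested, queried):.2f}")  # prints $1.43
```

At these rates ingestion dominates the bill, so the number of series and the scrape interval are the main levers on cost.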
In addition to working with AKS and your own self-hosted Kubernetes instances, Microsoft’s managed Prometheus works with Azure Arc-hosted Kubernetes, providing support on the edge as well as in the cloud. With Kubernetes’ edge role becoming increasingly important, support for Azure Arc is an attractive option for running managed Kubernetes on your own servers.
There’s a lot to like in this release. Microsoft continues to play to its enterprise strengths, while keeping close to the Kubernetes ecosystem’s open-source roots. The result is tooling that remains familiar but operates in a way that lets you focus on the results you want—not on managing your metrics platform. That’s a win for everyone.