Whether you are a developer, site reliability engineer, IT Ops specialist, program manager, or DevOps practitioner, monitoring is something you definitely care about! As modern applications move from an on-premises world to hybrid or microservices-based architectures, skill sets need to evolve too, along with best practices for a successful monitoring strategy on a hybrid or public cloud.
Azure Monitor is Microsoft’s unified monitoring solution that provides full-stack observability across applications and infrastructure. Depending on the hat you are wearing at the moment, you can start with end-to-end visibility across the health of your resources, drill down to the most probable root cause of a problem, even to actual lines of code, fix the issue in your app or infrastructure, and re-deploy in a matter of minutes. If you have a robust monitoring pipeline set up, you should be able to find and fix issues well before they start impacting your customers.
Continuous Monitoring
Many of you already know how Continuous Integration and Continuous Deployment (CI/CD), as a DevOps concept, can help you deliver software faster and more reliably to provide continuous value to your users. Continuous Monitoring (CM) is a new follow-up concept in which you incorporate monitoring across each phase of your DevOps and IT Ops cycles. This helps ensure the health, performance, and reliability of your apps and infrastructure continuously as they move from development to production and on to your customers.
Azure Monitor is one such tool that can enable Continuous Monitoring throughout your workflows. It works seamlessly with Visual Studio and Visual Studio Code during development and testing, while also integrating with Azure DevOps for release management and work item management during deployment and operations. It even integrates with the ITSM and SIEM tools of your choice to help track issues and incidents within your existing IT processes.
Seven best practices for Continuous Monitoring
Enable monitoring for all your apps
The first step toward full observability is to enable monitoring across all your web apps and services. If you are working in code, you should add Azure Monitor Application Insights SDKs to your apps written in .NET, Java, Node.js, or any other supported programming language. This is the recommended approach because it also lets you specify any custom events, metrics, or page views that are relevant to your app or business.
If you don’t have access to the code, there are many other mechanisms to enable monitoring: release templates in Azure Pipelines, Azure DevOps projects, the status monitor for .NET apps on Windows Servers, extensions in Azure VMs or Azure App Services, and more. Once you have monitoring enabled across all your apps, you can easily visualize end-to-end transactions and connections across all the components.
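To make "custom events and metrics" concrete, here is a minimal sketch of what an app might record. The `InMemoryTelemetry` class, its method names, and the event and property names are all hypothetical stand-ins written for illustration; this is not the Application Insights SDK or its schema, and it only buffers items locally.

```python
import json
import time
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class TelemetryItem:
    """One custom telemetry record (hypothetical shape, not the SDK's schema)."""
    kind: str                      # "event" or "metric"
    name: str
    value: float = 1.0             # metrics carry a measurement; events default to 1
    properties: Dict[str, str] = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)


class InMemoryTelemetry:
    """Stand-in for a real telemetry client; it only buffers items locally."""

    def __init__(self) -> None:
        self.items: List[TelemetryItem] = []

    def track_event(self, name: str, **properties: str) -> None:
        self.items.append(TelemetryItem("event", name, properties=dict(properties)))

    def track_metric(self, name: str, value: float, **properties: str) -> None:
        self.items.append(TelemetryItem("metric", name, value, dict(properties)))

    def flush(self) -> str:
        """Serialize buffered items as JSON lines and clear the buffer."""
        payload = "\n".join(json.dumps(asdict(item)) for item in self.items)
        self.items.clear()
        return payload


# Business-relevant signals recorded alongside whatever the platform collects.
telemetry = InMemoryTelemetry()
telemetry.track_event("checkout_completed", plan="premium", region="westus")
telemetry.track_metric("cart_value_usd", 129.50, plan="premium")
```

In a real app the same calls would go through the SDK for your language, which handles batching and delivery. Attaching properties such as `plan` or `region` to each item is what makes it possible to slice the data later.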
Enable monitoring for all relevant components of your infrastructure
It is usually difficult to predict which components of your application stack might have an issue, so it is important to monitor all the relevant components. Azure Monitor can help you track the health and performance of your entire hybrid infrastructure, be it VMs, containers, storage, network, or any other Azure services. You automatically get platform metrics, activity logs, and diagnostics logs from most of your Azure resources, and you can enable deeper monitoring for virtual machines or AKS clusters with a simple button click in the Azure portal or by installing an agent on your servers.
For scalability and infrastructure as code, it is recommended to take advantage of DevOps projects, Azure Policy, PowerShell, or ARM templates for enabling monitoring and configuring alerts over a large set of resources. Having monitoring enabled across your entire infrastructure will help you achieve full observability and make it easier to discover a potential root cause when something fails.
Bucket related resources in resource groups
A typical application on Azure today consists of monolithic apps deployed on VMs or App Services, or microservices hosted on Cloud Services, AKS clusters, ACS, or Service Fabric. These apps frequently rely on dependencies like Event Hubs, Storage, SQL, Service Bus, and more to complete tasks. Many customers use resource groups to bucket all the resources that make up their applications. If you do the same, Azure Monitor for resource groups provides a simple way to keep track of the health and performance of your entire full-stack application, while letting you drill down into individual components for investigation or debugging.
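The idea of rolling health up to the resource group and then drilling down can be sketched in a few lines. The inventory, resource names, and health state names below are made up for illustration; they are not an Azure API response or an Azure enum.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Ordered from best to worst; the state names are illustrative only.
SEVERITY = {"healthy": 0, "degraded": 1, "unavailable": 2}

# (resource group, resource name, health state): a hypothetical inventory.
resources: List[Tuple[str, str, str]] = [
    ("shop-prod-rg", "shop-web", "healthy"),
    ("shop-prod-rg", "shop-sql", "degraded"),
    ("shop-prod-rg", "shop-events", "healthy"),
    ("billing-prod-rg", "billing-api", "healthy"),
]


def rollup_by_group(items: List[Tuple[str, str, str]]) -> Dict[str, dict]:
    """Return each group's worst state and the resources that are in it."""
    grouped = defaultdict(list)
    for group, name, state in items:
        grouped[group].append((name, state))

    summary = {}
    for group, members in grouped.items():
        # A group is only as healthy as its least healthy member.
        worst = max((state for _, state in members), key=SEVERITY.__getitem__)
        culprits = [n for n, s in members if s == worst and worst != "healthy"]
        summary[group] = {"state": worst, "investigate": culprits}
    return summary


summary = rollup_by_group(resources)
```

Here `summary["shop-prod-rg"]` reports the group as degraded and points at `shop-sql` as the place to start investigating, while the billing group stays healthy with nothing to investigate.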
Ensure quality through Continuous Deployment
Azure Pipelines is a great way to set up Continuous Deployment, and you can automate the entire process from code commit to production if your CI/CD tests are successful. Integrating monitoring into your pre- or post-deployment Quality Gates helps ensure that your key health and performance metrics (KPIs) stay on track as your apps go from development to production, and that any differences in the infrastructure environment or scale do not negatively impact those KPIs.
It is also a good practice to maintain separate monitoring instances for the different deployment environments, such as development, test, canary, and production, so that the data remains relevant to the associated apps and infrastructure. If you need to correlate data across environments, you can always design multi-resource charts in Metrics Explorer or run cross-resource queries in Log Analytics.
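As a rough sketch of the logic behind a monitoring-based gate, the check below compares KPIs measured after a deployment against a baseline and fails when any of them regresses beyond a tolerance. The `KpiRule` type, the KPI names, and the numbers are assumptions made for this example; this is not how Azure Pipelines evaluates gates internally.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KpiRule:
    """A hypothetical gate rule: how far a KPI may regress from its baseline."""
    name: str
    max_regression_pct: float   # allowed worsening, in percent of the baseline
    higher_is_better: bool


def evaluate_gate(baseline: Dict[str, float],
                  candidate: Dict[str, float],
                  rules: List[KpiRule]) -> List[str]:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for rule in rules:
        before, after = baseline[rule.name], candidate[rule.name]
        # Express the change so that a positive number always means "worse".
        change = (before - after) if rule.higher_is_better else (after - before)
        regression_pct = 100.0 * change / before  # assumes a non-zero baseline
        if regression_pct > rule.max_regression_pct:
            failures.append(
                f"{rule.name}: regressed {regression_pct:.1f}% "
                f"(limit {rule.max_regression_pct:.1f}%)"
            )
    return failures


rules = [
    KpiRule("p95_latency_ms", max_regression_pct=10.0, higher_is_better=False),
    KpiRule("availability_pct", max_regression_pct=0.1, higher_is_better=True),
]
baseline = {"p95_latency_ms": 200.0, "availability_pct": 99.95}
candidate = {"p95_latency_ms": 250.0, "availability_pct": 99.94}

failures = evaluate_gate(baseline, candidate, rules)
```

With these sample numbers the availability dip stays within tolerance, but latency is 25% worse than the baseline against a 10% limit, so `failures` contains one entry and the release would be held back.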
Set up actionable alerts with notifications and/or remediation
Having a robust alerting pipeline is essential for any monitoring strategy, and it is recommended to set up actionable alerts for all predictable failure states. You can configure your alerts based on static or even dynamic thresholds and set up actions on top of them. These actions can be as simple as SMS messages, emails, push notifications, or voice calls. You can also connect to your existing ITSM tools or any other alert management systems through webhooks. Where possible, you can design remediation as well, using Azure Automation runbooks or autoscaling for elastic workloads.
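The difference between the two threshold styles, and the idea of attaching actions to an alert, can be sketched as follows. The dynamic check here is a deliberately simple stand-in (recent mean plus a multiple of the standard deviation), not the algorithm Azure Monitor uses, and the action functions and sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence


def static_breach(value: float, threshold: float) -> bool:
    """Fire when the value crosses a fixed limit."""
    return value > threshold


def dynamic_breach(history: Sequence[float], value: float,
                   sensitivity: float = 3.0) -> bool:
    """Fire when the value is far above what recent history suggests is normal."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    return value > mean(history) + sensitivity * stdev(history)


def dispatch(message: str, actions: List[Callable[[str], None]]) -> None:
    """Run every action attached to the alert."""
    for action in actions:
        action(message)


sent: List[str] = []


def notify_email(message: str) -> None:   # placeholder for a real notification
    sent.append(f"email: {message}")


def open_ticket(message: str) -> None:    # placeholder for an ITSM webhook call
    sent.append(f"ticket: {message}")


cpu_history = [41.0, 43.5, 40.2, 42.8, 44.1, 41.7]   # recent CPU %, sample data
latest = 58.0

if static_breach(latest, threshold=90.0):
    dispatch("CPU above 90%", [notify_email, open_ticket])
if dynamic_breach(cpu_history, latest):
    dispatch("CPU is unusually high for this host", [notify_email, open_ticket])
```

In this sample the static rule stays silent because 58% is well under the fixed 90% limit, while the dynamic rule fires because 58% is far outside the host's usual range of roughly 40 to 44%. That is the kind of failure state a fixed threshold alone tends to miss.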
Prepare role-based dashboards and workbooks for reporting
When I ask large groups of people at conference presentations whether any of them already use the same monitoring solutions across dev and ops in their companies, the answer is overwhelmingly negative! For successful Continuous Monitoring and fast MTTD and MTTR, it is imperative that your devs and ops have access to the same telemetry and the same tools. Azure Monitor is designed as a unified monitoring solution for the entire team, and you can easily prepare custom role-based dashboards based on common metrics and logs.
Workbooks are another useful capability that can help with knowledge sharing between devs and ops. They can be prepared as dynamic reports with metric charts and log queries, or as troubleshooting guides written by devs to help customer support or ops handle basic problems.
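MTTD (mean time to detect) and MTTR (mean time to resolve) are natural numbers to put on a shared dashboard or report. As a worked example, here is one way to compute them from incident records. The `Incident` type and the sample timestamps are invented for illustration, and this sketch assumes both values are measured from the moment the fault began; some teams measure MTTR from detection instead.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Incident(NamedTuple):
    started: datetime    # when the fault began
    detected: datetime   # when an alert fired or someone noticed
    resolved: datetime   # when service was restored


def mean_delta(deltas: List[timedelta]) -> timedelta:
    """Average a non-empty list of durations."""
    return sum(deltas, timedelta()) / len(deltas)


# Hypothetical incident history.
incidents = [
    Incident(datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 12),
             datetime(2024, 1, 3, 10, 0)),
    Incident(datetime(2024, 1, 9, 14, 30), datetime(2024, 1, 9, 14, 34),
             datetime(2024, 1, 9, 15, 10)),
    Incident(datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 2, 20),
             datetime(2024, 1, 20, 3, 20)),
]

mttd = mean_delta([i.detected - i.started for i in incidents])
mttr = mean_delta([i.resolved - i.started for i in incidents])
```

For this sample the detection delays are 12, 4, and 20 minutes, giving an MTTD of 12 minutes, and the resolution times are 60, 40, and 80 minutes, giving an MTTR of 60 minutes.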
Continuously optimize with “Build-Measure-Learn”
Building the right solution for your customers is rarely achieved on the first try; it is usually an iterative process. Monitoring is one of the cornerstones of the popular “Build-Measure-Learn” philosophy, which recommends continuously tracking your KPIs or user behavior metrics and striving to optimize them through planning iterations. With Azure Monitor, you can collect all the custom events, metrics, or logs relevant to your business, and it is simple to add a new data point in the next deployment if something seems to be missing.
You can track and optimize health, availability, performance, and reliability using the various tools in Azure Monitor, and you can even track end-user behavior and engagement to optimize your customer experience. Azure Monitor also provides Impact Correlation, which can help you prioritize the areas to focus on to improve your most important KPIs!
Next steps
A Continuous Monitoring pipeline across your apps and infrastructure throughout your DevOps and IT Ops processes can really help you reduce MTTD and MTTR while ensuring that you strive towards delivering the best solution for your customers.
Learn more about Azure Monitor in our documentation, and check out the tutorials and videos in the Azure Monitor Overview in the Azure portal, which can help you implement some of these practices in further detail.