Monitor Azure Virtual Machines

This document provides comprehensive guidance on monitoring your Azure Virtual Machines (VMs) to ensure optimal performance, availability, and security.

Why Monitor Azure VMs?

Effective monitoring is crucial for understanding your VMs' health and performance, and for keeping them available and secure.

Key Monitoring Tools and Services

Azure Monitor

Azure Monitor is the foundational service for collecting, analyzing, and acting on telemetry from your Azure and on-premises environments.

VM Insights

VM Insights is a feature of Azure Monitor that provides performance and health monitoring for your Azure VMs and Virtual Machine Scale Sets. It builds on Azure Monitor's data collection and analysis capabilities to offer prebuilt workbooks for common monitoring scenarios.

Collecting VM Data

Log Analytics Agent / Azure Monitor Agent

To collect logs and guest-level performance counters from your VMs, you need to install an agent. The Azure Monitor Agent (AMA) is the recommended agent for new deployments and replaces the legacy Log Analytics Agent.

Log Analytics Agent (legacy):

# Example: install the legacy Log Analytics agent extension on a Windows VM.
# <workspace-id> and <workspace-key> are placeholders for your Log Analytics workspace values.
az vm extension set --resource-group MyResourceGroup --vm-name MyVM \
  --name MicrosoftMonitoringAgent --publisher Microsoft.EnterpriseCloud.Monitoring \
  --settings '{"workspaceId": "<workspace-id>"}' \
  --protected-settings '{"workspaceKey": "<workspace-key>"}'

Azure Monitor Agent (recommended):

Configure data collection rules (DCRs) to specify which data to collect from your VMs and where to send it.
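
As a rough sketch of that setup, the function below installs the Azure Monitor Agent extension on a Windows VM and then associates an existing DCR with it. The function name `enable_ama`, the association name, and all arguments are hypothetical. The sketch assumes the DCR has already been created and that the `az monitor data-collection` commands (provided by the `monitor-control-service` CLI extension) are available. For a Linux VM, the extension name would be `AzureMonitorLinuxAgent`.

```shell
# enable_ama RESOURCE_GROUP VM_NAME DCR_ID
# Installs the Azure Monitor Agent on a Windows VM and links an existing DCR to it.
# All names here are illustrative, not taken from the original document.
enable_ama() {
  rg="$1"; vm="$2"; dcr_id="$3"

  # Install the agent as a VM extension and let Azure keep it up to date.
  az vm extension set \
    --resource-group "$rg" \
    --vm-name "$vm" \
    --name AzureMonitorWindowsAgent \
    --publisher Microsoft.Azure.Monitor \
    --enable-auto-upgrade true || return 1

  # Look up the VM's resource ID, then associate the DCR with the VM.
  vm_id="$(az vm show --resource-group "$rg" --name "$vm" --query id --output tsv)" || return 1
  az monitor data-collection rule association create \
    --name "${vm}-dcr-association" \
    --rule-id "$dcr_id" \
    --resource "$vm_id"
}
```

A call might look like `enable_ama MyResourceGroup MyVM "<dcr-resource-id>"`, where the DCR resource ID is a placeholder you supply.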

Key Metrics to Monitor

Focus on these critical metrics for a healthy VM:

| Metric | Description | Typical Threshold |
|---|---|---|
| CPU utilization (%) | Percentage of time the CPU is busy executing threads. | < 80% (sustained) |
| Disk Read/Write Bytes/sec | Rate of data being read from or written to the disk. | Depends on disk type and workload |
| Disk Read/Write Operations/sec | Number of read or write operations per second. | Depends on disk type and workload |
| Network In/Out (Bytes/sec) | Data transfer rate to and from the VM. | Depends on application needs |
| Available Memory (Bytes) | Amount of physical memory available to the operating system. | > 200 MB (Windows), > 100 MB (Linux) |
| RDP/SSH connection failures | Failed connection attempts to the VM. | 0 (ideally) |
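
To check a value from the table against its threshold outside the portal, one option is the Azure CLI. The sketch below reads the platform metric `Percentage CPU`, which corresponds to CPU utilization in the table above. The function name `vm_cpu_average` and its arguments are hypothetical.

```shell
# vm_cpu_average RESOURCE_GROUP VM_NAME
# Prints the average "Percentage CPU" in 5-minute intervals
# over the CLI's default time range.
vm_cpu_average() {
  rg="$1"; vm="$2"

  # Metrics are queried by resource ID, so resolve the VM's ID first.
  vm_id="$(az vm show --resource-group "$rg" --name "$vm" --query id --output tsv)" || return 1

  az monitor metrics list \
    --resource "$vm_id" \
    --metric "Percentage CPU" \
    --aggregation Average \
    --interval PT5M \
    --output table
}
```

For example, `vm_cpu_average MyResourceGroup MyVM` lists recent averages that you can compare with the 80% guideline.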

Configuring Alerts

Set up alerts to be notified immediately when issues arise. Azure Monitor allows you to create alerts based on metrics, log queries, and activity logs.

  1. Navigate to Azure Monitor in the Azure portal.
  2. Select 'Alerts' and then 'Create alert rule'.
  3. Define the scope (your VM or resource group).
  4. Configure the condition (e.g., CPU utilization > 90% for 15 minutes).
  5. Define the action group (e.g., send an email, trigger a webhook).
  6. Name and save the alert rule.

Tip: Start with common alerts like high CPU, low disk space, and network connectivity issues. Refine your alerts as you understand your workload better.
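
The portal steps above can also be approximated with the Azure CLI. The sketch below creates a metric alert matching the example condition (average CPU above 90% over 15 minutes). The function name `create_cpu_alert`, the alert name, and the 5-minute evaluation frequency are assumptions for illustration, and the action group is assumed to exist already.

```shell
# create_cpu_alert RESOURCE_GROUP VM_NAME ACTION_GROUP_ID
# Creates a metric alert that fires when average CPU exceeds 90% over 15 minutes.
create_cpu_alert() {
  rg="$1"; vm="$2"; action_group_id="$3"

  # The alert scope is the VM's resource ID.
  vm_id="$(az vm show --resource-group "$rg" --name "$vm" --query id --output tsv)" || return 1

  az monitor metrics alert create \
    --name "${vm}-high-cpu" \
    --resource-group "$rg" \
    --scopes "$vm_id" \
    --condition "avg Percentage CPU > 90" \
    --window-size 15m \
    --evaluation-frequency 5m \
    --action "$action_group_id" \
    --description "Average CPU above 90% over 15 minutes"
}
```

Scripting the rule this way makes it easier to apply the same alert to several VMs; the threshold and window should still be tuned to your workload.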

Using Dashboards and Workbooks

Visualize your VM data with Azure dashboards and Azure Monitor workbooks.

Accessing VM Insights workbooks:

  1. In the Azure portal, navigate to your VM.
  2. Under the 'Monitoring' section, select 'Insights'.
  3. Explore the 'Performance' and 'Map' tabs.
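
The performance data shown in VM Insights can also be queried from the command line. The sketch below assumes VM Insights is writing to a Log Analytics workspace, that the data lands in the `InsightsMetrics` table, and that the `az monitor log-analytics` commands (from the `log-analytics` CLI extension) are available. The function name `vm_insights_cpu` is hypothetical.

```shell
# vm_insights_cpu WORKSPACE_ID
# Summarizes average CPU utilization per computer in 5-minute bins for the last hour.
# WORKSPACE_ID is the workspace (customer) GUID, not the full resource ID.
vm_insights_cpu() {
  workspace_id="$1"

  # KQL query against the table that VM Insights is assumed to populate.
  query='InsightsMetrics
| where TimeGenerated > ago(1h)
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| summarize AvgCpu = avg(Val) by Computer, bin(TimeGenerated, 5m)
| order by TimeGenerated desc'

  az monitor log-analytics query \
    --workspace "$workspace_id" \
    --analytics-query "$query" \
    --output table
}
```

Invoke it as `vm_insights_cpu "<workspace-guid>"`, substituting your own workspace GUID.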

Monitoring VM Health with Azure Advisor

Azure Advisor provides personalized recommendations for optimizing your Azure resources, including VMs. It offers recommendations related to performance, cost, security, high availability, and operational excellence.

Regularly check Azure Advisor for recommendations specific to your VMs.
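
One way to make that check routine is to list recommendations with the Azure CLI. The sketch below is a minimal example; the function name `vm_advisor_recommendations` and the default category are assumptions. Note that it returns recommendations for every resource in the resource group, not only VMs.

```shell
# vm_advisor_recommendations RESOURCE_GROUP [CATEGORY]
# Lists Azure Advisor recommendations for a resource group.
# CATEGORY defaults to Performance; other values include Cost, Security, and HighAvailability.
vm_advisor_recommendations() {
  rg="$1"; category="${2:-Performance}"

  az advisor recommendation list \
    --resource-group "$rg" \
    --category "$category" \
    --output table
}
```

For example, `vm_advisor_recommendations MyResourceGroup Cost` shows cost recommendations for the group that contains your VMs.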
