This guide provides steps and common solutions for diagnosing and resolving disk-related problems with your Azure Virtual Machines (VMs).
Troubleshooting Steps
1. Check Disk Status in Azure Portal
Begin by verifying the status of your VM and its disks in the Azure portal.
- Navigate to your Virtual Machine resource in the Azure portal.
- Under the "Disks" section, check the status of attached disks (OS disk and data disks). Ensure they are attached and healthy.
- Review the VM's "Overview" page for any status messages or alerts.
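The same status check can be scripted with the Azure CLI. The following is a minimal sketch, assuming the az CLI is installed and signed in; the function names are hypothetical helpers, and the resource group and VM names in the usage line are placeholders.

```shell
# Print the VM's power state (for example "VM running" or "VM deallocated").
# $1 = resource group, $2 = VM name.
vm_power_state() {
  az vm show --resource-group "$1" --name "$2" --show-details \
    --query powerState --output tsv
}

# List the managed disks in a resource group with their state and owner.
# diskState is typically Attached, Unattached, or Reserved; managedBy holds
# the resource ID of the VM a disk is attached to (empty when unattached).
# $1 = resource group.
list_disk_states() {
  az disk list --resource-group "$1" \
    --query "[].{name:name, state:diskState, attachedTo:managedBy}" \
    --output table
}
```

For example, `vm_power_state myResourceGroup myVM` followed by `list_disk_states myResourceGroup` gives a quick view of whether the VM is running and whether each disk is attached where you expect.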
2. Monitor Disk Performance
Performance issues are often the first sign of disk problems. Use Azure Monitor to track key metrics.
- Key Metrics: Disk Read/Write Operations Per Second (IOPS), Disk Read/Write Bytes Per Second, Disk Latency, Disk Queue Depth.
- Tools: Azure Monitor Metrics, Log Analytics.
- Action: If metrics show high latency, IOPS or throughput pinned at the disk's limit, or a persistently high queue depth, investigate further. Likely causes are undersized disks, excessive workload, or underlying Azure platform issues.
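These metrics can also be pulled from the command line. The sketch below assumes the az CLI is installed and signed in; vm_disk_metric is a hypothetical helper name, and the metric names in the usage note are the ones I believe Azure exposes for VMs, so confirm them with az monitor metrics list-definitions before relying on them.

```shell
# Query one platform metric for a VM at 1-minute granularity over the CLI's
# default time window, averaged per interval.
# $1 = resource group, $2 = VM name, $3 = metric name.
vm_disk_metric() {
  vm_id="$(az vm show --resource-group "$1" --name "$2" --query id --output tsv)" || return 1
  az monitor metrics list --resource "$vm_id" --metric "$3" \
    --interval PT1M --aggregation Average --output table
}
```

For example, `vm_disk_metric myResourceGroup myVM "Data Disk Queue Depth"` or `vm_disk_metric myResourceGroup myVM "Data Disk IOPS Consumed Percentage"` (placeholder resource names). A consumed percentage that stays near 100 suggests the disk is being throttled at its limit.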
3. Disk Not Appearing or Accessible in OS
If a disk is attached in Azure but not visible in the VM's operating system:
For Windows:
- Connect to the VM via RDP.
- Open Disk Management (diskmgmt.msc).
- Check if the disk is listed. If it is "Offline" or "Unknown," right-click and bring it online.
- If it's uninitialized, initialize it and create a new volume.
- If a drive letter is missing, right-click the volume and change the drive letter.
For Linux:
- Connect to the VM via SSH.
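- List available disks: lsblk or sudo fdisk -l.
- If the disk is not partitioned, use fdisk or parted to create partitions.
- Format the new partition (e.g., sudo mkfs.ext4 /dev/sdc1). Formatting erases any existing data, so do this only on a new or empty disk.
- Create a mount point: sudo mkdir /mnt/mydatadisk.
- Mount the disk: sudo mount /dev/sdc1 /mnt/mydatadisk.
- Add an entry to /etc/fstab for automatic mounting on boot. Reference the partition by UUID (shown by blkid) rather than by device name, because names such as /dev/sdc can change between reboots.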
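The Linux steps above can be sketched as a pair of shell functions. This is an illustration under stated assumptions, not a definitive procedure: the function names are hypothetical, the disk is assumed to be new and empty, ext4 is assumed as the file system, and the first partition is assumed to be named by appending 1 to the device (true for /dev/sdc, not for NVMe devices).

```shell
# Build an /etc/fstab line that mounts by UUID. The "nofail" option lets the
# VM boot even if the disk is missing.
# $1 = file system UUID, $2 = mount point, $3 = file system type.
fstab_line() {
  printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$1" "$2" "$3"
}

# Partition, format, mount, and persist a NEW, EMPTY data disk.
# WARNING: this destroys any data already on the device.
# $1 = whole-disk device (e.g. /dev/sdc), $2 = mount point.
prepare_data_disk() {
  dev="$1"; mnt="$2"; part="${dev}1"
  sudo parted --script "$dev" mklabel gpt mkpart primary ext4 0% 100% || return 1
  sudo partprobe "$dev"                 # ask the kernel to re-read the partition table
  sudo mkfs.ext4 "$part" || return 1
  sudo mkdir -p "$mnt"
  sudo mount "$part" "$mnt" || return 1
  uuid="$(sudo blkid -s UUID -o value "$part")"
  fstab_line "$uuid" "$mnt" ext4 | sudo tee -a /etc/fstab > /dev/null
}
```

A typical call would be `prepare_data_disk /dev/sdc /mnt/mydatadisk`, after confirming with lsblk that /dev/sdc really is the new disk. Running `sudo mount -a` afterward is a reasonable way to check that the new fstab entry parses before the next reboot.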
4. Disk Full Errors
If your OS disk or a data disk is full:
- Identify large files/directories: Use tools like WinDirStat (Windows) or du -sh * | sort -rh | head (Linux) within the VM.
- Delete unnecessary files: Remove old logs, temporary files, or application data.
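- Resize the disk: If more space is permanently required, you can increase the size of a managed data disk or OS disk through the Azure portal (managed disks can be grown but not shrunk). Depending on the disk and VM configuration, you may have to stop (deallocate) the VM first. Afterward, extend the partition and file system within the OS to use the new space.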
- Add a new data disk: For significant space needs, attaching a new, larger data disk is a common solution.
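On Linux, the first two checks can be sketched as small helpers. The function names are hypothetical, the 90 percent threshold in the usage note is only an example, and the parsing assumes mount points without spaces.

```shell
# Read `df -P` output on stdin and print the mount points whose usage is at
# or above a threshold, together with their usage.
# $1 = threshold in percent (e.g. 90).
mounts_over_threshold() {
  awk -v t="$1" 'NR > 1 { pct = $5; sub(/%/, "", pct); if (pct + 0 >= t + 0) print $6, $5 }'
}

# Show the ten largest entries directly under a directory, biggest first.
# $1 = directory (defaults to the current directory).
largest_entries() {
  du -sh -- "${1:-.}"/* 2>/dev/null | sort -rh | head -n 10
}
```

For example, `df -P | mounts_over_threshold 90` lists the nearly full file systems, and `sudo largest_entries /var` (or running the du pipeline directly under sudo) shows where the space went.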
5. VM Performance Degradation
If your VM is experiencing slow disk I/O:
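- Verify disk caching: Ensure the appropriate host caching setting (Read-only, Read/write, None) is selected for your workload. Read-only is generally recommended for data disks with read-heavy workloads, such as database data files or frequently read files; None is usually the better choice for write-heavy disks such as database log files.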
- Check disk tier: Ensure you are using an appropriate disk type (e.g., Premium SSD, Standard SSD, Standard HDD) for your performance needs.
- Analyze workload: Is the VM experiencing an unusually high I/O load? Can the workload be optimized?
- Consider Availability Sets/Zones: In rare cases, issues might be related to underlying hardware. Distributing VMs across different fault domains or availability zones can mitigate this.
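To verify the caching setting without clicking through the portal, the VM's storage profile can be queried. This is a minimal sketch assuming the az CLI is installed and signed in; show_disk_caching is a hypothetical helper name.

```shell
# Show the host caching setting for the OS disk and each data disk of a VM.
# $1 = resource group, $2 = VM name.
show_disk_caching() {
  az vm show --resource-group "$1" --name "$2" \
    --query "storageProfile.{osDisk:osDisk.caching, dataDisks:dataDisks[].{lun:lun, name:name, caching:caching}}" \
    --output json
}
```

Calling `show_disk_caching myResourceGroup myVM` (placeholder names) returns one caching value for the OS disk and one per data disk, which can then be compared against what each disk's workload needs.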
6. OS Boot Issues (OS Disk)
If your VM fails to boot and the issue is suspected to be the OS disk:
- Capture the OS Disk: Detach the OS disk from the failing VM and attach it as a data disk to a healthy troubleshooting VM in Azure.
- Diagnose from Troubleshooting VM: Mount the captured disk and perform checks:
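- Windows: Run System File Checker against the attached disk rather than the troubleshooting VM's own OS (sfc /scannow with the /offbootdir and /offwindir options pointing at the attached volume), run chkdsk on that volume, and review its event logs.
- Linux: Check the file systems with fsck while the disk is unmounted, then mount it and examine the boot logs under the mount point (e.g., var/log/boot.log, or journalctl --directory pointed at the disk's var/log/journal).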
- Re-attach or Replace: Once issues are resolved, re-attach the OS disk to the original VM or create a new VM from a snapshot of the repaired disk.
7. Managed vs. Unmanaged Disks
Azure now strongly recommends using managed disks. If you are still using unmanaged disks, consider migrating.
- Managed disks offer better reliability, scalability, and simplified management.
- Migration tools and processes are available in the Azure portal.
8. Storage Service Limits
Be aware of the limits that apply to your chosen disk type and VM size: each disk has its own IOPS and throughput caps, and each VM size has aggregate caps across all attached disks. Exceeding either can cause performance throttling or errors.
Refer to Azure VM sizes documentation and Azure subscription and service limits for details.
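As a rough rule, the sustained IOPS a VM can reach is the lower of the VM-size cap and the sum of the caps of its disks. The helper below is a hypothetical illustration of that arithmetic only; it ignores host caching and bursting, and the figures in the usage note are examples to be checked against the current documentation for your VM size and disk tier.

```shell
# Effective IOPS ceiling = min(VM-level cap, sum of per-disk caps).
# $1 = VM uncached IOPS cap; remaining arguments = per-disk IOPS caps.
effective_iops() {
  vm_cap="$1"; shift
  total=0
  for d in "$@"; do
    total=$((total + d))
  done
  if [ "$total" -lt "$vm_cap" ]; then
    echo "$total"
  else
    echo "$vm_cap"
  fi
}
```

For example, `effective_iops 3200 5000 5000` prints 3200: two disks rated at 5000 IOPS each cannot deliver more than a VM capped at 3200, so upgrading the disks alone would not help in that case.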