Implementing Disaster Recovery for SQL Server on Azure Virtual Machines
This tutorial guides you through setting up a robust disaster recovery (DR) solution for your SQL Server instances hosted on Azure Virtual Machines (VMs). We will leverage Azure Site Recovery (ASR) and SQL Server Always On Availability Groups to ensure high availability and data resilience.
Prerequisites
- An active Azure subscription.
- Existing Azure VMs running SQL Server.
- Necessary Azure networking configured (Virtual Networks, Subnets, Load Balancers).
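- SQL Server Enterprise Edition for full Always On Availability Groups. Standard Edition (SQL Server 2016 and later) supports only Basic Availability Groups, which are limited to two replicas and one database per group.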
Step 1: Prepare Your Azure VMs
1.1. Ensure Network Connectivity
Verify that your primary and secondary Azure VMs can communicate with each other over the network. This typically involves ensuring they are in the same or peered virtual networks and that firewall rules and network security groups allow SQL Server traffic (default port 1433), the Availability Group endpoint (port 5022 by default), and outbound HTTPS (port 443) for ASR replication traffic.
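The connectivity check described above can be sketched as a small script run from one VM against the other. This is a minimal illustration, not part of the original procedure: the function names are hypothetical, and port 5022 is assumed because it is the default the Availability Group wizard uses for the database mirroring endpoint. A successful TCP connection only shows that the port is reachable, not that SQL Server authentication will succeed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_sql_ports(host: str, ports=(1433, 5022)) -> dict:
    """Map each port to its reachability from the machine running this script.

    1433 is the default SQL Server port; 5022 is assumed to be the
    Availability Group endpoint port (adjust if yours differs).
    """
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_sql_ports with the private IP address or host name of the secondary VM from the primary (and the reverse) gives a quick view of which ports a firewall or network security group may still be blocking.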
1.2. Configure Windows Failover Clustering
For Always On Availability Groups, a Windows Server Failover Cluster (WSFC) is required. Follow these steps:
- Install the Failover Clustering feature on both your primary and secondary SQL Server VMs.
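- Join both VMs to the same Active Directory domain. The cluster name object (CNO) is created in Active Directory when the cluster is built, provided the account creating the cluster has permission to create computer objects; otherwise a domain administrator can prestage it.
- Configure the cluster using PowerShell or Failover Cluster Manager, and set up a quorum witness (for example, a cloud witness) so that a two-node cluster can survive the loss of one node.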
Refer to Microsoft documentation on Windows Server Failover Clustering for detailed guidance.
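To see why the quorum configuration matters for a two-node cluster, the majority rule can be worked through in a few lines. This is a deliberately simplified model with hypothetical function names: it treats every node and witness as one fixed vote and ignores features such as dynamic quorum that adjust votes at run time.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Simplified majority rule: more than half of all votes must be online."""
    if not 0 <= votes_online <= total_votes:
        raise ValueError("votes_online must be between 0 and total_votes")
    return votes_online > total_votes // 2


def tolerated_failures(total_votes: int) -> int:
    """Number of votes that can be lost while a majority still remains."""
    return (total_votes - 1) // 2


# Two SQL Server nodes, no witness: losing either node loses the majority.
print(tolerated_failures(2))
# Two nodes plus one witness vote: one failure can be absorbed.
print(tolerated_failures(3))
```

Under this model, two nodes without a witness tolerate no failures, while adding a single witness vote lets the cluster stay online when one node is lost.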
1.3. Enable Always On Availability Groups
On both SQL Server instances:
- Open SQL Server Configuration Manager.
- Select 'SQL Server Services', then right-click the SQL Server instance service (for example, 'SQL Server (MSSQLSERVER)') and select 'Properties'.
- Go to the 'Always On High Availability' tab.
- Check the box for 'Enable Always On Availability Groups'.
- Restart the SQL Server service.
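After the restart, it can be worth confirming that the setting took effect on each instance. The sketch below assumes you already have a DB-API style cursor connected to the instance (for example from the third-party pyodbc package, which is an assumption and not something this tutorial installs); the function name is hypothetical. It reads the documented IsHadrEnabled server property, cast to int so the driver does not have to handle the sql_variant type.

```python
def is_hadr_enabled(cursor) -> bool:
    """Return True if the connected instance reports Always On AGs as enabled.

    `cursor` is any DB-API cursor connected to the SQL Server instance.
    SERVERPROPERTY('IsHadrEnabled') is 1 when the feature is enabled.
    """
    cursor.execute("SELECT CAST(SERVERPROPERTY('IsHadrEnabled') AS int)")
    row = cursor.fetchone()
    return row is not None and row[0] == 1
```

Run the check against both the primary and the secondary instance; a False result usually means the checkbox was not applied or the service has not been restarted yet.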
Step 2: Configure Azure Site Recovery
2.1. Create a Recovery Services Vault
In the Azure portal, create a new 'Recovery Services vault'. This vault will manage replication and failover for your VMs.
2.2. Enable Replication for SQL Server VMs
Within your Recovery Services vault:
- Under 'Getting Started', select 'Site Recovery'.
- Choose 'Azure virtual machines' for 'Protect, replicate and recover applications'.
- Click 'Enable replication'.
- Select the source region (where your primary VM is located) and target region.
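- Select the source resource group, then choose the VMs you want to protect.
- Configure the replication policy, including recovery point retention and app-consistent snapshot frequency. ASR replicates continuously, so the Recovery Point Objective (RPO) is something you monitor rather than set directly.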
Figure 1: Azure Site Recovery replication configuration.
2.3. Monitor Replication Health
Once replication is enabled, monitor the health status of your VMs in the ASR dashboard. Ensure the initial replication completes successfully and that ongoing replication is healthy.
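One way to make "healthy" concrete is to compare the age of the latest recovery point with the RPO your business has agreed to. The following sketch shows only that arithmetic; the function names are hypothetical, and how you obtain the latest recovery point timestamp (from the portal or from automation) is outside its scope.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rpo_lag(latest_recovery_point: datetime,
            now: Optional[datetime] = None) -> timedelta:
    """Time elapsed since the most recent recovery point.

    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - latest_recovery_point


def rpo_breached(latest_recovery_point: datetime,
                 target: timedelta,
                 now: Optional[datetime] = None) -> bool:
    """Return True if the current lag exceeds the agreed RPO target."""
    return rpo_lag(latest_recovery_point, now) > target
```

For example, with a 15-minute target, a latest recovery point that is 10 minutes old is within the objective, while one that is 20 minutes old is a breach worth investigating.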
Step 3: Configure SQL Server Always On Availability Group
3.1. Create the Availability Group
On your primary SQL Server instance:
- In SQL Server Management Studio (SSMS), expand 'Always On High Availability'.
- Right-click 'Availability Groups' and select 'New Availability Group Wizard...'.
- Follow the wizard, providing a name for your AG, selecting databases, and specifying replicas.
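- For replicas, add your secondary SQL Server instance and configure its availability mode and failover mode. Automatic failover requires synchronous commit. For a replica in a distant region, asynchronous commit with manual failover is the usual choice, because synchronous commit adds the cross-region network latency to every transaction.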
- Configure listener information, which will be used by applications to connect to the AG.
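The wizard ultimately issues a CREATE AVAILABILITY GROUP statement. As a rough sketch of what it produces, the helper below renders that T-SQL from a few inputs and enforces one rule from the replica settings: automatic failover is only valid with synchronous commit. The class and function names, the server names, and the endpoint URLs in the usage note are all hypothetical, and the statement omits optional clauses the wizard may add.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    server: str                                   # instance name, e.g. SQLVM1
    endpoint_url: str                             # e.g. TCP://host.domain:5022
    availability_mode: str = "SYNCHRONOUS_COMMIT"
    failover_mode: str = "AUTOMATIC"


def _quote_name(name: str) -> str:
    """Bracket-quote a T-SQL identifier."""
    return "[" + name.replace("]", "]]") + "]"


def _quote_str(value: str) -> str:
    """Quote a T-SQL Unicode string literal."""
    return "N'" + value.replace("'", "''") + "'"


def render_create_ag(name: str, databases: list, replicas: list) -> str:
    """Render a basic CREATE AVAILABILITY GROUP statement as text."""
    if not databases or not replicas:
        raise ValueError("at least one database and one replica are required")
    for r in replicas:
        if r.availability_mode not in ("SYNCHRONOUS_COMMIT", "ASYNCHRONOUS_COMMIT"):
            raise ValueError(f"unknown availability mode: {r.availability_mode}")
        if r.failover_mode not in ("AUTOMATIC", "MANUAL"):
            raise ValueError(f"unknown failover mode: {r.failover_mode}")
        if r.failover_mode == "AUTOMATIC" and r.availability_mode != "SYNCHRONOUS_COMMIT":
            raise ValueError("automatic failover requires synchronous commit")
    db_list = ", ".join(_quote_name(d) for d in databases)
    replica_sql = ",\n".join(
        f"    {_quote_str(r.server)} WITH (\n"
        f"        ENDPOINT_URL = {_quote_str(r.endpoint_url)},\n"
        f"        AVAILABILITY_MODE = {r.availability_mode},\n"
        f"        FAILOVER_MODE = {r.failover_mode}\n"
        f"    )"
        for r in replicas
    )
    return (
        f"CREATE AVAILABILITY GROUP {_quote_name(name)}\n"
        f"FOR DATABASE {db_list}\n"
        f"REPLICA ON\n{replica_sql};"
    )
```

Rendering a group with a synchronous primary and an asynchronous, manual-failover secondary yields a statement to review before running it on the primary. The secondary instance then still has to join the group and restore the databases, which the wizard normally handles for you.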
3.2. Configure Availability Group Listener
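The listener provides a single point of connection for your applications. Ensure it is configured with a dedicated IP address that can be accessed from both primary and secondary locations. On Azure VMs, a listener whose replicas share a single subnet needs an Azure load balancer (or a distributed network name listener on SQL Server versions that support it) to route traffic to the current primary; a multi-subnet deployment instead assigns the listener one IP address per subnet.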
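From the application side, connecting through the listener mostly comes down to the connection string. The sketch below builds an ODBC connection string that targets the listener name rather than an individual VM and sets MultiSubnetFailover, which helps clients reconnect faster after a failover. The function name, the driver version, and the use of Windows authentication are assumptions for illustration; adjust them to your environment.

```python
def listener_connection_string(listener: str,
                               database: str,
                               port: int = 1433,
                               driver: str = "ODBC Driver 18 for SQL Server") -> str:
    """Build an ODBC connection string that targets the AG listener."""
    parts = [
        f"Driver={{{driver}}}",            # braces are part of ODBC driver syntax
        f"Server=tcp:{listener},{port}",   # listener DNS name, not a VM name
        f"Database={database}",
        "Trusted_Connection=yes",          # assumes Windows authentication
        "Encrypt=yes",
        "MultiSubnetFailover=Yes",         # faster reconnect after failover
    ]
    return ";".join(parts) + ";"
```

Because the string names the listener, applications do not need to be reconfigured when the primary role moves to another replica.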
Step 4: Testing Failover
4.1. Perform a Manual Failover
Before an actual disaster, it's crucial to test your failover process:
- In SSMS, right-click on your Availability Group and select 'Failover...'.
- Follow the wizard to initiate a manual failover to the secondary replica.
- Verify that applications can still connect to the AG via the listener and that data is consistent.
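The verification step can be partly automated by reading replica state from the dynamic management views after the failover. In the sketch below, the query joins sys.dm_hadr_availability_replica_states to sys.availability_replicas, and a small function flags anything unexpected. The function name and the row layout are hypothetical, server names are compared exactly as returned, and the query should be run on the current primary, which is the replica that reports state for all members.

```python
# Query to run on the current primary replica.
REPLICA_STATE_QUERY = """
SELECT ar.replica_server_name, rs.role_desc, rs.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS rs
JOIN sys.availability_replicas AS ar ON ar.replica_id = rs.replica_id
"""


def failover_problems(rows, expected_primary: str) -> list:
    """Return human-readable problems found in replica state rows.

    Each row is (replica_server_name, role_desc, synchronization_health_desc).
    An empty list means the expected server is the only primary and every
    replica reports HEALTHY.
    """
    problems = []
    primaries = [server for server, role, _ in rows if role == "PRIMARY"]
    if primaries != [expected_primary]:
        problems.append(
            f"expected primary {expected_primary!r}, found {primaries!r}"
        )
    for server, role, health in rows:
        if health != "HEALTHY":
            problems.append(f"{server} ({role}) reports {health}")
    return problems
```

Feeding the rows returned by REPLICA_STATE_QUERY into failover_problems, with the name of the replica you failed over to, gives a quick pass or fail signal alongside the application-level checks.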
4.2. Test Failover with Azure Site Recovery
ASR allows you to perform test failovers without impacting your production environment:
- Navigate to your Recovery Services vault and select 'Replicated items'.
- Choose your SQL Server VM and click 'Test failover'.
- Select a recovery point and a target virtual network for the test.
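- Once the test failover is complete, ASR creates test VMs in the network you selected. Use a virtual network that is isolated from production so the test VMs cannot interfere with live systems, then connect to them to validate your data and application connectivity.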
- When testing is complete, remember to 'Clean up test failover'.
Figure 2: Performing a test failover in Azure Site Recovery.
Conclusion
By combining SQL Server Always On Availability Groups with Azure Site Recovery, you can achieve a highly available and resilient SQL Server solution on Azure VMs. Regular testing of failover procedures is essential to ensure your DR plan is effective.
For more advanced configurations, consider using Azure SQL Managed Instance for built-in high availability and disaster recovery capabilities, or explore ASR recovery plans, which orchestrate the failover of multiple VMs in a defined order.