Azure Documentation

High Availability with ExpressRoute

Ensuring high availability for your network connectivity is crucial for mission-critical applications. Azure ExpressRoute provides mechanisms to build resilient connections between your on-premises network and Azure. This document outlines strategies and configurations for achieving high availability with ExpressRoute.

Understanding High Availability Requirements

High availability in the context of network connectivity means minimizing downtime and ensuring continuous service. For ExpressRoute, this typically involves:

  • Redundant physical connections: Multiple ExpressRoute circuits to provide path diversity.
  • Redundant network devices: Ensuring your on-premises and Azure networking equipment has failover capabilities.
  • Geographic redundancy: Distributing critical resources across multiple Azure regions.
  • Route diversity: Ensuring traffic can find alternative paths in case of a failure.

ExpressRoute High Availability Design Patterns

1. Dual ExpressRoute Circuits

The most fundamental approach to high availability is provisioning two ExpressRoute circuits. These circuits should terminate at different peering locations to ensure geographic diversity and protection against single points of failure at the provider's edge.

[Figure: Dual ExpressRoute Circuits for Redundancy. Two ExpressRoute circuits connect the on-premises network to different Azure regions.]

  • Configuration: Establish two ExpressRoute circuits with different providers or at different locations with the same provider.
  • Routing: Run an independent Border Gateway Protocol (BGP) peering configuration on each circuit. Your on-premises Autonomous System Number (ASN) can be the same on both circuits; what matters is that each circuit advertises and accepts routes through its own sessions, so a fault on one does not affect the other.
  • Failover: In case of a circuit failure, BGP will automatically withdraw routes from the failed circuit, and traffic will reroute through the active circuit.
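
The failover behavior described above can be modeled in a few lines. This is a minimal sketch, not an Azure API: the circuit names, the prefixes, and the `RoutingTable` class are hypothetical, and real BGP convergence involves timers and path attributes that are left out here. It shows why a prefix stays reachable only if it is advertised over both circuits.

```python
from collections import defaultdict


class RoutingTable:
    """Toy model of routes learned over several ExpressRoute circuits."""

    def __init__(self):
        # prefix -> set of circuit names currently advertising it
        self._paths = defaultdict(set)

    def advertise(self, circuit, prefix):
        """Record that a prefix was learned over the given circuit."""
        self._paths[prefix].add(circuit)

    def withdraw_circuit(self, circuit):
        """Simulate a circuit failure: drop every route learned over it."""
        for circuits in self._paths.values():
            circuits.discard(circuit)

    def next_hops(self, prefix):
        """Return the circuits that can still carry traffic to the prefix."""
        return sorted(self._paths.get(prefix, set()))

    def unreachable(self):
        """Return prefixes that no remaining circuit advertises."""
        return sorted(p for p, c in self._paths.items() if not c)


table = RoutingTable()
for circuit in ("circuit-a", "circuit-b"):    # hypothetical circuit names
    table.advertise(circuit, "10.1.0.0/16")   # hypothetical prefix on both circuits
table.advertise("circuit-a", "10.2.0.0/16")   # advertised on one circuit only

table.withdraw_circuit("circuit-a")           # circuit-a fails
print(table.next_hops("10.1.0.0/16"))         # ['circuit-b']
print(table.unreachable())                    # ['10.2.0.0/16']
```

The second prefix becomes unreachable because it was never advertised over the surviving circuit, which is the gap a dual-circuit design is meant to close.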

2. ExpressRoute Premium and Global Reach

ExpressRoute Premium raises the route limit for private peering (from 4,000 to 10,000 routes) and lets a circuit reach virtual networks in Azure regions outside its own geopolitical region. ExpressRoute Global Reach links ExpressRoute circuits together so that your on-premises sites can exchange traffic with each other over the Microsoft backbone.

  • When combined with dual ExpressRoute circuits, Global Reach can provide a highly available and global private network.
  • Check that your BGP policies advertise the intended on-premises prefixes over every circuit, so that each region stays reachable when one circuit fails.

3. Redundant Azure Virtual Network Gateways

A virtual network connects to an ExpressRoute circuit through an Azure virtual network gateway. For end-to-end high availability, the gateway must be redundant as well.

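  • Multiple gateway instances: An ExpressRoute gateway runs as more than one instance, so the failure of a single instance does not take down connectivity. Active-active mode is a separate setting that applies to VPN gateways.
  • Zone-redundant gateways: In regions with Availability Zones, choose a zone-redundant gateway SKU (for example ErGw1AZ, ErGw2AZ, or ErGw3AZ) so that the gateway instances are spread across zones. A virtual network can hold only one ExpressRoute gateway, so zone resilience comes from the SKU rather than from deploying separate gateways.

Important: Gateway redundancy does not protect against the loss of a circuit. To cover that failure, connect the gateway to two circuits, each with its own configuration and BGP sessions.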

BGP Configuration for High Availability

BGP plays a pivotal role in ExpressRoute high availability. Proper configuration ensures that traffic is effectively routed and fails over seamlessly.

  • ASNs: Microsoft uses AS 12076 on its side of ExpressRoute peerings. Use your own public or private ASN on your on-premises routers, and avoid ASNs 65515 to 65520, which are reserved by Microsoft.
  • BGP Timers: Adjust BGP timers (Hold Time, Keepalive Interval) to influence failover speed. Shorter timers lead to quicker detection of failures but can increase overhead.
  • Route Selection: Understand how BGP selects the best path. Ensure that your policies on-premises and in Azure favor healthy, redundant paths.
  • Local Preference: Use BGP's Local Preference attribute to influence outbound traffic flow and guide it through a preferred ExpressRoute circuit.
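
The timer trade-off above can be made concrete with some arithmetic. This is a minimal sketch under stated assumptions: it uses the common convention that the keepalive interval is one third of the hold time, treats the hold time as the worst-case time to detect a silent failure, and the hold values in the loop are illustrative rather than values Azure is confirmed to accept. The function name `timer_profile` is hypothetical.

```python
def timer_profile(hold_time_s):
    """Return keepalive interval, worst-case detection time, and keepalive rate.

    Assumes keepalive = hold time / 3, a common convention rather than a requirement.
    """
    if hold_time_s < 3:
        # The BGP specification requires a nonzero hold time of at least 3 seconds.
        raise ValueError("a nonzero BGP hold time must be at least 3 seconds")
    keepalive_s = hold_time_s / 3
    return {
        "keepalive_s": keepalive_s,
        # A peer that goes silent is declared down when the hold timer expires.
        "worst_case_detection_s": hold_time_s,
        # Overhead proxy: keepalive messages sent per session per hour.
        "keepalives_per_hour": 3600 / keepalive_s,
    }


for hold in (180, 30, 9):  # illustrative hold times in seconds
    profile = timer_profile(hold)
    print(
        f"hold={hold:>3}s  keepalive={profile['keepalive_s']:.0f}s  "
        f"detection<={profile['worst_case_detection_s']}s  "
        f"keepalives/hour={profile['keepalives_per_hour']:.0f}"
    )
```

Cutting the hold time from 180 to 9 seconds shortens detection by a factor of 20 and raises the keepalive rate by the same factor, which is the overhead the bullet on timers refers to.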

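! Example BGP configuration for the on-premises edge router (conceptual, Cisco IOS-style syntax)
router bgp YOUR_ASN
  ! 10.0.0.1 stands in for the Microsoft-side peer address of circuit 1.
  ! Microsoft uses AS 12076 on its side of ExpressRoute peerings.
  neighbor 10.0.0.1 remote-as 12076
  neighbor 10.0.0.1 description ExpressRoute_Circuit_1
  ! Advertises a default route to Azure; keep only if forced tunneling is intended.
  neighbor 10.0.0.1 default-originate
  ! Apply the local-preference policy to routes learned over this circuit.
  neighbor 10.0.0.1 route-map SET_LOCAL_PREF in

route-map SET_LOCAL_PREF permit 10
  ! Higher than the usual default of 100, so outbound traffic prefers this circuit.
  set local-preference 200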
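
To see how a local-preference policy like the one above steers traffic and still allows failover, the selection logic can be sketched as follows. This is a deliberately simplified model, assuming only two of the BGP decision steps (highest local preference, then shortest AS path); the `Path` type, the `healthy` flag, and the circuit names are hypothetical and not part of any router or Azure API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Path:
    circuit: str
    local_pref: int = 100   # common default on many routers
    as_path: tuple = ()     # AS numbers the route has traversed
    healthy: bool = True    # False once the BGP session for this path is down


def best_path(paths):
    """Pick the preferred usable path.

    Simplified rule set: highest local preference wins, ties are broken by
    the shortest AS path. Real BGP applies further tie-breakers.
    """
    candidates = [p for p in paths if p.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (-p.local_pref, len(p.as_path)))


paths = [
    Path("circuit-1", local_pref=200, as_path=(12076,)),  # preferred by policy
    Path("circuit-2", local_pref=100, as_path=(12076,)),  # backup
]
print(best_path(paths).circuit)      # circuit-1

# Simulate the loss of circuit-1: its path is no longer a candidate.
degraded = [replace(paths[0], healthy=False), paths[1]]
print(best_path(degraded).circuit)   # circuit-2
```

The backup path needs no special handling: once the preferred path disappears, the same rule selects the remaining circuit.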
                    

Monitoring and Testing

Continuous monitoring and regular testing are essential to validate your high availability setup.

  • Azure Network Watcher: Utilize Network Watcher for connectivity checks, topology visualization, and performance monitoring.
  • BGP Monitoring: Monitor BGP peer status and route advertisements for both ExpressRoute circuits.
  • Traffic Flow Analysis: Regularly analyze network traffic to ensure it's flowing as expected and that failover mechanisms are operational.
  • Disaster Recovery Drills: Conduct simulated failover tests by temporarily disabling one of the ExpressRoute circuits or network devices to verify the automated failover process.
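
The BGP monitoring item above can be turned into a simple redundancy check. This is a minimal sketch that assumes you already collect BGP session states from your own tooling: the snapshot layout, the session names `primary` and `secondary`, and both function names are hypothetical. Only the state strings (`Established`, `Idle`) come from the standard BGP state machine.

```python
def assess_redundancy(peer_states):
    """Classify each circuit from the state of its BGP sessions.

    peer_states maps circuit name -> {session name: BGP state string}.
    """
    report = {}
    for circuit, sessions in peer_states.items():
        up = [name for name, state in sessions.items() if state == "Established"]
        if sessions and len(up) == len(sessions):
            report[circuit] = "healthy"    # every session is established
        elif up:
            report[circuit] = "degraded"   # at least one session is down
        else:
            report[circuit] = "down"       # no usable session
    return report


def overall(report):
    """Summarize the per-circuit report into one status line."""
    statuses = set(report.values())
    if statuses == {"healthy"}:
        return "fully redundant"
    if "healthy" in statuses or "degraded" in statuses:
        return "connected, redundancy reduced"
    return "outage"


# Hypothetical snapshot of session states for two circuits.
snapshot = {
    "circuit-1": {"primary": "Established", "secondary": "Established"},
    "circuit-2": {"primary": "Established", "secondary": "Idle"},
}
report = assess_redundancy(snapshot)
print(report)           # {'circuit-1': 'healthy', 'circuit-2': 'degraded'}
print(overall(report))  # connected, redundancy reduced
```

Running the same check during a failover drill confirms that the status drops to "connected, redundancy reduced" rather than "outage" while one circuit is disabled.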

Key Takeaways for High Availability

  • Always plan for at least two ExpressRoute circuits.
  • Terminate circuits at diverse locations or with different providers.
  • Ensure your network gateways in Azure are also redundant.
  • Configure BGP carefully to manage route advertisements and preferences.
  • Implement robust monitoring and conduct regular testing.

By following these best practices, you can build a resilient and highly available network connection to Azure using ExpressRoute, ensuring your applications remain accessible and performant.