Application Gateway Health Probes

Health probes determine which backend servers behind Azure Application Gateway receive traffic. Application Gateway uses health probes to monitor the health of backend servers and routes traffic only to healthy instances.

Note: Proper configuration of health probes is essential for a seamless user experience. Incorrect settings can lead to intermittent connectivity issues or the perceived unavailability of your application.

Configuring Health Probe Settings

When you create or update an Application Gateway, you define health probe configurations. These settings dictate how Application Gateway checks the health of your backend resources.

Default Health Probe

If you don't explicitly define custom health probe settings, Application Gateway uses a default health probe. The default probe uses the following settings:

  • Protocol: HTTP or HTTPS, inherited from the backend HTTP settings the probe is associated with
  • Host: 127.0.0.1
  • Path: /
  • Interval: 30 seconds
  • Timeout: 30 seconds
  • Unhealthy threshold: 3

Custom Health Probes

For more granular control, you can configure custom health probes. This allows you to tailor the probe to your application's specific needs.

[Figure: Example of custom health probe configuration in the Azure portal.]

Probe Protocol and Port

You can choose to probe your backend servers using either HTTP or HTTPS. The port used for probing typically matches the port your application listens on (e.g., 80 for HTTP, 443 for HTTPS).

  • HTTP: Suitable for applications that don't require encrypted communication for health checks.
  • HTTPS: Recommended for secure applications. You may need to configure trusted root certificates if your backend uses self-signed certificates.

Probe Path and Host

The Path specifies the URL path that the health probe will request from the backend server. It's crucial to set this to a path that reliably indicates the application's health. Often, a simple / is sufficient, but for more complex applications, you might point to a dedicated health check endpoint (e.g., /health or /status).

The Host setting in the health probe is used to override the host header sent in the probe request. This is particularly useful in scenarios where your backend servers host multiple websites or applications using the same IP address and port.

Example of a probe targeting a specific path:

GET /api/v1/healthcheck HTTP/1.1
Host: mywebapp.azurewebsites.net
Connection: close
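
On the backend side, a dedicated health endpoint for a request like the one above could look roughly like the following. This is a minimal sketch using only the Python standard library, not a recommended production server: the `dependencies_ok` function, the `make_server` helper, and the port 8080 are hypothetical names and values chosen for illustration, and the path simply mirrors the example request.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/api/v1/healthcheck"  # same path as the example probe request
BODIES = {200: b"OK", 404: b"not found", 503: b"unavailable"}


def dependencies_ok() -> bool:
    """Hypothetical readiness check; replace with real checks (database, cache, ...)."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the health path reports application health; other paths return 404.
        if self.path == HEALTH_PATH:
            status = 200 if dependencies_ok() else 503
        else:
            status = 404
        body = BODIES[status]
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the console log


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) a server that answers health probes on `port`."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_server().serve_forever()` starts the endpoint. Keeping the check cheap matters because the endpoint is requested once per probe interval by every Application Gateway instance.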

Probe Intervals and Timeouts

  • Interval: The time in seconds between successive health probes. A shorter interval provides faster detection of failures but can increase load on backend servers.
  • Timeout: The time in seconds within which the backend server must respond to a health probe. If the response time exceeds this value, the probe is considered failed.

Balance these settings against each other: a timeout that is too short can mark healthy servers as unhealthy during transient network latency, while an interval that is too long delays the detection of actual failures.

Probe Thresholds

  • Unhealthy Threshold: The number of consecutive failed probes required to mark a backend server as unhealthy.

Application Gateway does not expose a separate healthy threshold setting. Probing continues against a server that is marked unhealthy, and the server receives traffic again once it responds to a probe successfully.

A common configuration might involve an interval of 15-30 seconds, a timeout of 10-20 seconds, and an unhealthy threshold of 3. This balances fast failure detection against the risk of marking a healthy server as unhealthy after a single slow response.
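
To get a feel for how the interval, timeout, and unhealthy threshold interact, the sketch below estimates how long a failure can go unnoticed and shows how consecutive failures are counted. It is a simplified model rather than Application Gateway's actual scheduling logic: both function names are hypothetical, and the estimate assumes that probes start once per interval, that the backend fails right after a successful probe, and that each failed probe uses its full timeout.

```python
from typing import List, Optional


def worst_case_detection_seconds(interval: int, timeout: int, unhealthy_threshold: int) -> int:
    """Rough upper bound on the time between a backend failing and being marked unhealthy."""
    # The threshold-th failing probe starts after `unhealthy_threshold` intervals
    # and may take up to `timeout` seconds to be counted as failed.
    return interval * unhealthy_threshold + timeout


def probes_until_unhealthy(results: List[bool], unhealthy_threshold: int) -> Optional[int]:
    """Return the 1-based index of the probe that marks the server unhealthy, or None."""
    consecutive_failures = 0
    for index, succeeded in enumerate(results, start=1):
        # A success resets the count, because only consecutive failures matter.
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= unhealthy_threshold:
            return index
    return None


print(worst_case_detection_seconds(interval=30, timeout=20, unhealthy_threshold=3))  # 110
print(worst_case_detection_seconds(interval=15, timeout=10, unhealthy_threshold=3))  # 55
print(probes_until_unhealthy([True, False, True, False, False, False], 3))  # 6
```

Under these assumptions, moving from the upper to the lower end of the suggested ranges halves the estimated detection time, at the cost of twice as many probe requests.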

Understanding Health States

Application Gateway instances report the health of backend servers. The possible states are:

  • Healthy: The backend server is responding to probes successfully and is ready to receive traffic.
  • Unhealthy: The backend server is not responding to probes correctly (timeout, failed response code, etc.) and will not receive new traffic.
  • Unknown: The initial state before the first probe results are available, or if the probe configuration is invalid.

You can view the health status of your backend servers in the Azure portal under the Application Gateway's "Backend health" section.

Tip: By default, a probe is considered successful when the backend returns an HTTP status code between 200 and 399, so redirects (3xx) count as healthy, while client errors (4xx) and server errors (5xx) result in a probe failure. With a custom probe, you can use matching conditions to change the accepted status codes, for example to restrict them to 200-299.
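
The status code check itself is simple to reason about. The following sketch tests a response code against a list of accepted codes or ranges written as strings such as "200-299". The function name `status_matches` is hypothetical, and the accepted ranges are passed in explicitly rather than assumed.

```python
from typing import List


def status_matches(status_code: int, accepted: List[str]) -> bool:
    """Return True if status_code falls within any accepted code or range."""
    for spec in accepted:
        # "200-299" splits into ("200", "-", "299"); a single code like "404" has no high part.
        low, _, high = spec.partition("-")
        if int(low) <= status_code <= int(high or low):
            return True
    return False


print(status_matches(204, ["200-299"]))         # True
print(status_matches(301, ["200-299"]))         # False
print(status_matches(404, ["200-299", "404"]))  # True
```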

Troubleshooting Health Probe Failures

If your backend servers are consistently marked as unhealthy, consider the following:

  • Network Connectivity: Verify that Application Gateway can reach your backend servers. Check Network Security Groups (NSGs) and firewall rules.
  • Application Responsiveness: Ensure your application is running and accessible directly via its IP address or domain name.
  • Probe Configuration: Double-check the probe protocol, port, path, and host settings for accuracy.
  • Backend Server Logs: Review the logs on your backend servers for any errors that might be preventing them from responding to probes.
  • SSL Certificates: If using HTTPS probes, ensure the backend server's SSL certificate is valid and trusted by Application Gateway (if applicable, through trusted root certificates).
  • Timeouts and Intervals: Adjust probe timeouts and intervals if network latency or application response times are variable.

You can also use Application Gateway diagnostic logs to gain deeper insights into probe failures.
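
When working through the checklist above, it can help to reproduce a probe request by hand from a machine that has network access to the backend. The sketch below sends a single GET with an optional Host header override and a timeout. It is an approximation of a probe, not Application Gateway's own implementation: the `manual_probe` function and its parameters are hypothetical, and with `use_https=True` it applies Python's default certificate verification, which can differ from the gateway's trust configuration.

```python
import http.client
from typing import Optional, Tuple


def manual_probe(address: str, port: int, path: str = "/",
                 host_header: Optional[str] = None,
                 use_https: bool = False,
                 timeout: float = 10.0) -> Tuple[Optional[int], str]:
    """Send one probe-like GET; return (status, reason), or (None, error text) on failure."""
    connection_class = http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    conn = connection_class(address, port, timeout=timeout)
    headers = {"Connection": "close"}
    if host_header:
        # Overrides the Host header that http.client would otherwise derive from `address`.
        headers["Host"] = host_header
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        response.read()
        return response.status, response.reason
    except (OSError, http.client.HTTPException) as exc:
        # Covers refused connections, timeouts, TLS errors, and malformed responses.
        return None, f"{type(exc).__name__}: {exc}"
    finally:
        conn.close()
```

For example, `manual_probe("10.0.0.4", 80, "/api/v1/healthcheck", host_header="mywebapp.azurewebsites.net")` mirrors the earlier example request against a placeholder backend address. A `None` status points toward connectivity, timeout, or certificate problems, while an unexpected status code points toward the path, host, or application itself.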