Troubleshooting Azure Application Gateway
This guide walks through common troubleshooting steps for Azure Application Gateway. It covers connectivity problems, health probe failures, SSL/TLS errors, performance issues, and routing misconfiguration.
Common Issues and Solutions
1. Connectivity Problems
- Client cannot connect to the application gateway:
- Verify public IP address and DNS resolution.
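- Check Network Security Groups (NSGs) attached to the Application Gateway subnet and associated network interfaces. Ensure inbound traffic on the listener ports (typically 80 and 443) is allowed, and that the infrastructure ports Azure requires for the gateway are open (65200-65535 for the v2 SKU, 65503-65534 for v1).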
- Ensure the Application Gateway itself is running and healthy in the Azure portal.
- Check backend health to see if the gateway can reach your backend servers.
- Application gateway cannot connect to backend servers:
- Verify backend server IP addresses and ports are correctly configured in the backend pool.
- Check NSGs on the backend server's subnet. Ensure inbound traffic from the Application Gateway's subnet is allowed.
- Confirm that the backend servers are running and accessible directly (without the gateway).
- If using host headers, ensure they match the backend server's expected host.
- Check SSL certificate validity and configuration on backend servers if using HTTPS.
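The DNS and reachability checks above can be scripted from any machine with the relevant network access. The following is a minimal sketch using only the Python standard library; the function names `resolve` and `can_connect` are illustrative, not part of any Azure tooling, and a successful TCP handshake only shows that the port is open, not that the application behind it is healthy.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the distinct IP addresses that name resolution reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake with host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False


# Example usage (hypothetical names; substitute your own gateway or backend):
#   resolve("gateway.example.com")
#   can_connect("gateway.example.com", 443)
#   can_connect("10.0.1.4", 8080)   # backend IP, run from inside the VNet
```

Running the same check against the gateway's public name and against a backend address from inside the VNet helps narrow down which hop is blocked.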
2. Health Probe Failures
Health probes determine the availability of your backend servers. If probes to a backend server fail, the Application Gateway marks that server as unhealthy and stops sending traffic to it.
- Common causes for probe failures:
- Backend server is down or unreachable.
- Incorrect probe protocol (HTTP/HTTPS), port, or path.
- Firewall blocking probe requests.
- SSL/TLS certificate issues (for HTTPS probes).
- Response timeouts.
- Troubleshooting steps:
- Review the Backend health section in the Azure portal for detailed error messages for each unhealthy probe.
- Manually test the probe URL from a machine that has network access to the backend servers (e.g., a jump box in the same VNet).
- Ensure the health probe path (e.g., /health) is configured correctly and returns a 2xx or 3xx status code.
- For HTTPS probes, ensure the SSL certificate on the backend server is valid, trusted, and matches the hostname specified in the probe.
- Increase the Timeout value in the health probe configuration if backend servers are slow to respond.
Tip: Utilize the Application Gateway's diagnostic logs and metrics in Azure Monitor to gain deeper insights into traffic flow, probe status, and backend health.
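The manual probe test described in the troubleshooting steps can be approximated with a short script run from a jump box. This is a sketch only: `probe` and `is_healthy_status` are hypothetical helper names, the `/health` path and the 30-second timeout are assumptions to adjust to your probe settings, and the script does not reproduce every detail of how the gateway itself issues probes. It uses `http.client` because that module does not follow redirects, so a 3xx response is reported as-is.

```python
import http.client
from typing import Optional


def is_healthy_status(status: int) -> bool:
    """Treat 2xx and 3xx responses as healthy, as the guide describes."""
    return 200 <= status <= 399


def probe(host: str, port: int, path: str = "/health",
          host_header: Optional[str] = None, use_tls: bool = False,
          timeout: float = 30.0) -> int:
    """Send one GET request and return the HTTP status code.

    host_header overrides the Host header, which matters when the backend
    serves several sites and the probe is configured with a specific hostname.
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        headers = {"Host": host_header} if host_header else {}
        conn.request("GET", path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


# Example usage (hypothetical backend address and hostname):
#   status = probe("10.0.1.4", 80, "/health", host_header="app.example.com")
#   print(status, "healthy" if is_healthy_status(status) else "unhealthy")
```

If the script reports a healthy status but the portal still shows the backend as unhealthy, compare the probe's configured host, port, and path against the values used here.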
3. SSL/TLS Errors
- Client connection to gateway fails with SSL error:
- Verify the SSL certificate uploaded to the listener is valid, not expired, and its private key is correctly imported.
- Ensure the correct TLS policy is configured for the listener if you are enforcing specific protocol versions or ciphers.
- Check if the client's browser or tool trusts the certificate authority (CA) that issued your certificate.
- Gateway connection to backend fails with SSL error:
- If the gateway is configured to trust custom CA certificates for backend communication, ensure the CA certificate is correctly uploaded to the Application Gateway's trusted root certificate store.
- Verify the backend server's SSL certificate is valid and matches the hostname specified in the backend HTTP settings.
- Ensure the backend HTTP settings correctly specify the protocol (HTTPS) and port.
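Certificate validity and hostname matching can be checked outside the portal as well. The sketch below uses Python's `ssl` module with its default context, which verifies the chain against the local machine's trust store and checks the hostname, so an untrusted or mismatched certificate raises `ssl.SSLCertVerificationError` before any expiry date is read. The helper names are illustrative, and the local trust store may differ from what the gateway or your clients trust.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a certificate 'notAfter' string into days remaining (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Complete a verified TLS handshake and return the certificate's 'notAfter' field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname drives both SNI and hostname verification.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example usage (hypothetical hostname):
#   remaining = days_until_expiry(fetch_not_after("app.example.com"))
#   print(f"certificate expires in {remaining:.0f} days")
```

A verification error from `fetch_not_after` is itself useful: its message usually distinguishes an expired certificate, an unknown issuer, and a hostname mismatch.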
4. Performance Issues
- Slow response times:
- Check backend server CPU, memory, and network utilization.
- Monitor Application Gateway metrics such as failed requests, response status codes, backend response time, and current connections.
- Review Application Gateway SKU and instance count. Consider scaling up or out if under heavy load.
- Optimize backend application code for efficiency.
- Ensure network latency between the gateway and backend servers is minimal.
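One way to localize slowness is to time the same request through the gateway and directly against a backend, then compare. The following sketch shows the timing side only; `measure` and `summarize` are hypothetical helpers, and the URLs in the usage comment are placeholders. A handful of samples gives a rough indication, not a benchmark.

```python
import statistics
import time
from typing import Callable


def measure(call: Callable[[], object], samples: int = 5) -> list:
    """Run call() several times and return the elapsed seconds for each run."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: list) -> dict:
    """Reduce a list of timings to its median and worst case."""
    return {"median": statistics.median(timings), "max": max(timings)}


# Example usage (hypothetical URLs; run the direct test from inside the VNet):
#   from urllib.request import urlopen
#   via_gateway = summarize(measure(lambda: urlopen("https://gateway.example.com/").read()))
#   direct = summarize(measure(lambda: urlopen("http://10.0.1.4/").read()))
#   print(via_gateway, direct)
```

If the direct timings are already slow, the backend is the likely bottleneck; if only the gateway path is slow, look at gateway capacity and the network path between the gateway and the backend.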
5. Rule and Routing Problems
- Requests not reaching the intended backend:
- Review the order and priority of your request routing rules. On the v2 SKU each rule carries an explicit priority and lower values are evaluated first, so more specific rules should take precedence over broader ones.
- Verify the hostname and path matching criteria in your listeners and rules are accurate.
- Check if URL rewrite rules are inadvertently modifying the request path before it reaches the backend.
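To see why rule ordering matters, consider a toy model of first-match-wins routing. This is a deliberately simplified illustration, not the gateway's actual matching algorithm: the `Rule` type, the pool names, and the use of `fnmatch`-style wildcards are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    priority: int      # lower value is evaluated first in this model
    path_pattern: str  # e.g. "/api/*"
    backend: str       # hypothetical backend pool name


def route(rules: List[Rule], path: str) -> Optional[str]:
    """Return the backend of the first rule, in priority order, whose pattern matches."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(path, rule.path_pattern):
            return rule.backend
    return None


# A broad rule evaluated first shadows the specific one.
shadowed = [Rule(10, "/*", "default-pool"), Rule(20, "/api/*", "api-pool")]
# Giving the specific rule precedence fixes the routing.
fixed = [Rule(10, "/api/*", "api-pool"), Rule(20, "/*", "default-pool")]

print(route(shadowed, "/api/users"))  # default-pool
print(route(fixed, "/api/users"))     # api-pool
```

In the first configuration the catch-all pattern matches every request before the API rule is ever considered, which is the typical cause of requests landing on the wrong backend.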
Important: Always test configuration changes in a non-production environment before applying them to your live Application Gateway.
Diagnostic Tools and Logs
- Azure Portal: The primary interface for monitoring health, metrics, and basic troubleshooting.
- Diagnostic Settings: Configure Application Gateway to send logs (ApplicationGatewayAccessLog, ApplicationGatewayPerformanceLog, ApplicationGatewayFirewallLog) and metrics to Log Analytics, Storage Accounts, or Event Hubs.
- Azure Monitor: Analyze logs and metrics to identify patterns and root causes. Use Kusto Query Language (KQL) in Log Analytics for powerful data exploration.
- Application Gateway Diagnostics (Classic): For older deployments, access logs via the "Diagnostic logs" blade.
- Network Watcher: Tools like Connection Troubleshoot and IP Flow Verify can help diagnose network path issues.
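Access logs exported to a storage account can also be summarized offline. The sketch below assumes one JSON record per line with the status code and requested host under `properties.httpStatus` and `properties.host`; verify these field names and the file layout against your own exported logs before relying on it. The function name and sample records are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_server_errors(lines: Iterable[str]) -> Counter:
    """Count 5xx responses per requested host from JSON-lines access log records."""
    errors = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        props = json.loads(line).get("properties", {})
        status = int(props.get("httpStatus", 0))
        if 500 <= status <= 599:
            errors[props.get("host", "unknown")] += 1
    return errors


# Example usage with a downloaded log file (hypothetical file name):
#   with open("access-log.json", encoding="utf-8") as fh:
#       for host, count in count_server_errors(fh).most_common():
#           print(host, count)
```

A host with a disproportionate share of 5xx responses is a good starting point for checking the backend health and probe configuration of the pool that serves it.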
Troubleshooting Checklist
- Verify Application Gateway status in the Azure portal.
- Check backend health for all backend pools.
- Review listener configurations (ports, protocols, certificates).
- Examine request routing rules for correct matching and priority.
- Confirm NSG rules allow necessary traffic.
- Inspect health probe configurations (path, protocol, port, timeouts).
- Validate backend server responsiveness and availability.
- Analyze Application Gateway diagnostic logs for specific errors.
- Consider network latency and throughput.
- Check for any recent configuration changes.