Troubleshooting Azure Load Balancer
This section provides guidance on diagnosing and resolving common issues with Azure Load Balancer. A well-functioning load balancer is crucial for ensuring high availability and scalability of your applications.
Common Load Balancer Issues and Solutions
1. Health Probe Failures
Health probes are how the load balancer determines the health of backend instances. When an instance fails its probe, the load balancer stops sending new connections to it; if every instance fails, no traffic is delivered at all.
- Symptom: Instances are marked as unhealthy, or the load balancer stops sending traffic to certain instances.
- Diagnosis:
- Verify that the health probe configuration (port, protocol, path) matches the application listening on the backend instances.
- Ensure that the backend instances are running and responsive on the specified probe port.
- Check Network Security Groups (NSGs) associated with the backend subnet or network interfaces. Ensure they allow inbound traffic from the load balancer's probe IP (typically 168.63.129.16) on the probe port.
- Confirm that firewalls on the backend instances are not blocking the health probe traffic.
- Review application logs on the backend instances for any errors that might cause them to fail the probe.
- Resolution:
- Correct the health probe configuration in the Azure portal.
- Adjust NSGs to permit health probe traffic.
- Configure instance-level firewalls to allow probe traffic.
- Address application errors impacting probe responses.
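The diagnosis steps above can be approximated from another machine in the same virtual network. The following is a minimal sketch, not the platform's actual probe logic: the function names `tcp_probe` and `http_probe` are hypothetical, and it assumes an HTTP probe counts as healthy only on a 200 response. Real probes originate from 168.63.129.16, so a passing result here does not rule out an NSG or firewall rule that blocks that specific source.

```python
import http.client
import socket


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusals, timeouts, and unreachable hosts.
        return False


def http_probe(host: str, port: int, path: str = "/", timeout: float = 5.0) -> bool:
    """Return True if GET <path> answers with HTTP 200 within timeout."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

For example, `http_probe("10.0.0.4", 80, "/health")` checks a backend at a placeholder private address; substitute the address, port, and path from your own probe configuration.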
2. Connection Timeouts or Refusals
Users might experience connection timeouts or refusals when trying to access services behind the load balancer.
- Symptom: Clients cannot establish a connection, or connections are dropped.
- Diagnosis:
- Frontend IP and Port: Ensure the frontend IP configuration and the frontend port in the load balancing rule are correctly configured.
- Backend Pool: Verify that backend instances are registered in the backend pool and are healthy (check health probe status).
- Load Balancing Rules: Confirm that the load balancing rule is configured with the correct frontend IP, port, backend pool, and probe.
- NSGs: Check the NSGs on the backend subnet and network interfaces for rules blocking traffic. Azure Load Balancer passes connections through without terminating them, so packets reach the backend with the client's original source IP. The NSG must therefore allow inbound traffic from the client address range (for example, the Internet service tag for public clients) to the backend port.
- Instance Firewalls: Ensure that the firewalls on the backend virtual machines allow inbound traffic on the service port from client IP addresses, because the load balancer preserves the client's source IP.
- Application Responsiveness: Confirm that the application on the backend instances is actively listening and responding to requests on the correct port.
- Resolution:
- Update load balancer configuration if necessary.
- Modify NSGs to allow required traffic flows.
- Configure instance firewalls.
- Troubleshoot application-level issues.
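When working through the checks above, it often helps to know whether a connection is refused or silently dropped. The sketch below is a heuristic only, with a hypothetical function name (`classify_connection`): a refusal usually means the packet reached a host where nothing is listening on that port, while a timeout usually points to a rule dropping the traffic or to no healthy backend being available.

```python
import socket


def classify_connection(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and describe how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target answered with a reset: reachable, but no listener.
        return "refused"
    except socket.timeout:
        # No answer at all: often an NSG, firewall, or routing drop.
        return "timeout"
    except OSError as exc:
        # Anything else (DNS failure, unreachable network, ...).
        return f"error: {exc}"
```

Running it against the frontend IP from a client, and against a backend's private IP from inside the virtual network, can help narrow down which hop is blocking traffic.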
3. Uneven Traffic Distribution
Traffic might not be distributed evenly across backend instances, leading to some instances being overloaded while others are idle.
- Symptom: Some backend instances show high CPU or network utilization, while others are low.
- Diagnosis:
- Load Balancing Algorithm: By default, Azure Load Balancer uses a five-tuple hash-based distribution (source IP, source port, destination IP, destination port, protocol). The hash is computed per flow, not per request, so a small number of long-lived or high-volume connections can load backends unequally. Many clients sharing one source IP matter mainly when session persistence narrows the hash to the client IP.
- Session Persistence (Sticky Sessions): If session persistence is enabled, the hash uses only the client IP (optionally with the protocol), so all connections from the same client IP go to the same backend instance. If this is not desired, it can lead to uneven load.
- Health Probe Configuration: If health probes are too sensitive or not accurately reflecting application load, they might incorrectly mark instances as unhealthy or healthy, affecting distribution.
- Backend Instance Performance: Ensure all backend instances have similar capacity and performance.
- Resolution:
- If uneven distribution is a concern due to the hashing algorithm, consider using Azure Application Gateway for more advanced routing capabilities or ensuring a diverse set of source IPs if possible.
- Adjust session persistence settings if they are contributing to the issue.
- Fine-tune health probe configurations.
- Ensure backend instances are adequately provisioned.
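The effect of the hash inputs on distribution can be illustrated with a small simulation. This is a sketch under stated assumptions: Azure's real hash function is not reproduced here, SHA-256 is used purely as a stand-in, and the names `pick_backend`, `distribute`, and the mode labels are hypothetical.

```python
import hashlib
from collections import Counter

# Which flow fields feed the hash in each (hypothetically named) mode.
FIELDS = {
    "five_tuple": ("src_ip", "src_port", "dst_ip", "dst_port", "protocol"),
    "client_ip": ("src_ip", "dst_ip"),
    "client_ip_protocol": ("src_ip", "dst_ip", "protocol"),
}


def pick_backend(key: tuple, backend_count: int) -> int:
    """Map a hash key to a backend index using a stable stand-in hash."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % backend_count


def distribute(flows, backend_count: int, mode: str = "five_tuple") -> Counter:
    """Count how many flows land on each backend for the given mode."""
    counts = Counter({i: 0 for i in range(backend_count)})
    for flow in flows:
        key = tuple(flow[name] for name in FIELDS[mode])
        counts[pick_backend(key, backend_count)] += 1
    return counts


if __name__ == "__main__":
    # 1000 flows from one NAT'd client IP, differing only in source port.
    flows = [
        {"src_ip": "203.0.113.7", "src_port": 40000 + i,
         "dst_ip": "198.51.100.10", "dst_port": 443, "protocol": "tcp"}
        for i in range(1000)
    ]
    print("five_tuple:", dict(distribute(flows, 4, "five_tuple")))
    print("client_ip: ", dict(distribute(flows, 4, "client_ip")))
```

In this simulation the five-tuple mode spreads the flows across all backends because the source port varies, while the client-IP mode sends every flow from that address to a single backend.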
4. Slow Performance
Applications might exhibit slow response times when accessed through the load balancer.
- Symptom: Web pages load slowly, API calls take a long time to complete.
- Diagnosis:
- Latency: Measure latency between the client and the load balancer, and between the load balancer and the backend instances. Azure's 168.63.129.16 IP address can be used for connectivity tests.
- Backend Instance Performance: Check the CPU, memory, and network utilization of the backend virtual machines.
- Application Profiling: Profile the application code on the backend instances to identify performance bottlenecks.
- Network Path: Use tools like traceroute (or tracert on Windows) to diagnose potential network congestion or high latency hops.
- Load Balancer SKU: Ensure the Load Balancer SKU (Standard or Basic) meets your performance requirements. Standard Load Balancer offers higher throughput.
- Resolution:
- Optimize application code.
- Scale up or scale out backend instances.
- Choose the appropriate Load Balancer SKU.
- Address any identified network latency issues.
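One rough way to compare latency through the frontend with latency directly to a backend is to time TCP handshakes. The sketch below uses hypothetical function names (`tcp_connect_times_ms`, `summarize`) and measures only connection setup, not application response time, so treat the numbers as a first indication rather than a full benchmark.

```python
import socket
import statistics
import time


def tcp_connect_times_ms(host: str, port: int, attempts: int = 5,
                         timeout: float = 3.0) -> list:
    """Return handshake durations in milliseconds for successful attempts."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Failed attempts are skipped rather than counted as latency.
            pass
    return samples


def summarize(samples: list):
    """Return min, median, and max of the samples, or None if empty."""
    if not samples:
        return None
    return {
        "count": len(samples),
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Comparing `summarize(tcp_connect_times_ms(frontend_ip, 443))` from a client with the same call against a backend's private IP from inside the virtual network (both addresses being your own values) can suggest whether the delay sits in the network path or in the application.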
Troubleshooting Tools and Techniques
- Azure Network Watcher: Utilize Network Watcher's "Connection Troubleshoot" and "IP Flow Verify" features to diagnose connectivity issues between components.
- Azure Monitor: Monitor Load Balancer metrics (e.g., healthy/unhealthy host counts, data path availability, network in/out) for anomalies.
- Log Analytics: Query diagnostic logs for Azure Load Balancer to gain detailed insights into traffic flow and errors.
- tcpdump/Wireshark: Capture network traffic on backend instances to inspect incoming requests and outgoing responses.
- netcat (nc): Use netcat on backend instances to test if a port is open and listening.
- Azure CLI/PowerShell: Use these tools to inspect and modify your Load Balancer configuration programmatically.
The IP address 168.63.129.16 is used by Azure platform services for health probes, connectivity checks, and other essential functions.
If you continue to experience issues, consider opening a support ticket with Azure for further assistance.