Scaling Azure App Service

Azure App Service can scale your web applications automatically based on demand. This helps your application stay responsive and available, even during traffic spikes.

Understanding Scaling in App Service

App Service supports two primary scaling methods:

Scale Up: Increasing the compute resources (CPU, memory, disk space) available to your application by moving to a higher pricing tier. This lets each existing instance handle more load.
Scale Out: Increasing the number of compute instances running your application. This is ideal for distributing load across multiple machines.

Automatic Scaling (Scale Out)

Automatic scaling allows you to define rules that trigger the addition or removal of instances based on various metrics.
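To make the idea of a metric-based rule concrete, here is a minimal sketch of how a threshold rule could be evaluated over a window of samples. It is not the platform's actual implementation: the `Rule` class, the `rule_fires` function, and the sample values are hypothetical, and the 70%/30% thresholds are illustrative only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rule:
    """Hypothetical, simplified scaling rule (not an Azure type)."""
    threshold: float
    direction: str  # "Increase" (scale out) or "Decrease" (scale in)


def rule_fires(samples, rule):
    """Return True when the average of the window crosses the threshold."""
    if not samples:
        return False  # no data in the window: take no action
    average = mean(samples)
    if rule.direction == "Increase":
        return average > rule.threshold
    return average < rule.threshold


scale_out = Rule(threshold=70.0, direction="Increase")
scale_in = Rule(threshold=30.0, direction="Decrease")

# Five one-minute CPU samples forming a five-minute window (made-up values).
cpu_window = [65.0, 72.0, 80.0, 78.0, 75.0]

print(rule_fires(cpu_window, scale_out))  # True: the average is 74.0
print(rule_fires(cpu_window, scale_in))   # False
```

Averaging over a window, rather than reacting to a single sample, keeps a brief spike from adding an instance on its own.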

Metrics for Scaling

Common metrics used for triggering scaling actions include:

CPU Percentage: Scales out when average CPU utilization exceeds a defined threshold.
Memory Percentage: Scales out when average memory utilization exceeds a defined threshold.
Disk Queue Length: Scales out when the number of requests waiting for disk I/O increases.
HTTP Queue Length: Scales out when the number of pending HTTP requests grows.
Data In/Out: Scales out based on network traffic.

Configuring Scale Out Rules

You can configure scale-out rules in the Azure portal under the "Scale out (App Service plan)" section of your App Service.

Example configuration (simplified; a full autoscale setting nests these fields under metricTrigger and scaleAction):

Scale-out rule using CPU percentage:

{
  "name": "ScaleOutOnCPU",
  "description": "Scale out when CPU is above 70%",
  "metricName": "CpuPercentage",
  "metricNamespace": "Microsoft.Web/serverFarms",
  "statistic": "Average",
  "timeGrain": "PT1M",
  "timeWindow": "PT5M",
  "operator": "GreaterThan",
  "threshold": 70.0,
  "changeCount": 1,
  "direction": "Increase"
}

Scale-in rule using CPU percentage:

{
  "name": "ScaleInOnCPU",
  "description": "Scale in when CPU is below 30%",
  "metricName": "CpuPercentage",
  "metricNamespace": "Microsoft.Web/serverFarms",
  "statistic": "Average",
  "timeGrain": "PT1M",
  "timeWindow": "PT5M",
  "operator": "LessThan",
  "threshold": 30.0,
  "changeCount": 1,
  "direction": "Decrease"
}
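The flat rules above are a simplified view. As a rough sketch, the following Python converts one such rule into the nested metricTrigger/scaleAction layout used by Azure Monitor autoscale settings. The function name `to_autoscale_rule`, the placeholder resource ID, the default cooldown of PT5M, and the choice of "Average" for timeAggregation are assumptions made for illustration; check the field names against the current autoscale schema before relying on them.

```python
import json


def to_autoscale_rule(rule, plan_resource_id, cooldown="PT5M"):
    """Convert a flat rule dict into a nested autoscale-style rule (sketch)."""
    increase = rule["direction"] == "Increase"
    return {
        "metricTrigger": {
            "metricName": rule["metricName"],
            "metricResourceUri": plan_resource_id,
            "timeGrain": rule["timeGrain"],
            "statistic": rule["statistic"],
            "timeWindow": rule["timeWindow"],
            "timeAggregation": "Average",  # assumption for this sketch
            "operator": "GreaterThan" if increase else "LessThan",
            "threshold": rule["threshold"],
        },
        "scaleAction": {
            "direction": rule["direction"],
            "type": "ChangeCount",
            # The direction carries the sign, so the count is kept positive.
            "value": str(abs(rule["changeCount"])),
            "cooldown": cooldown,  # assumed default, not from the rules above
        },
    }


simple_rule = {
    "metricName": "CpuPercentage",
    "statistic": "Average",
    "timeGrain": "PT1M",
    "timeWindow": "PT5M",
    "threshold": 70.0,
    "changeCount": 1,
    "direction": "Increase",
}

# "<app-service-plan-resource-id>" is a placeholder, not a real resource ID.
nested = to_autoscale_rule(simple_rule, "<app-service-plan-resource-id>")
print(json.dumps(nested, indent=2))
```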

Instance Limits

You must define both a minimum and maximum number of instances. The service will not scale below the minimum or above the maximum.
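The effect of the limits can be sketched as a simple clamp. The function below is hypothetical and only illustrates the arithmetic: whatever change a rule requests, the resulting count stays within the configured bounds.

```python
def next_instance_count(current, change, minimum, maximum):
    """Apply a requested change, then clamp the result to [minimum, maximum]."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(maximum, current + change))


# With limits of 2 to 5 instances:
print(next_instance_count(5, +1, 2, 5))  # 5: already at the maximum
print(next_instance_count(2, -1, 2, 5))  # 2: already at the minimum
print(next_instance_count(3, +1, 2, 5))  # 4: change applied normally
```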

Scale Up vs. Scale Out Decision

Consider the following when choosing between scaling up and scaling out:

Scale Up: Good for applications that cannot be easily parallelized or that benefit from more powerful single instances.
Scale Out: Generally more cost-effective for handling high concurrency and distributing load. Most web applications benefit from scaling out.

Best Practice: Start with automatic scaling (scale out) and monitor your application's performance. If you encounter limitations that cannot be addressed by adding more instances (e.g., single-threaded processes), consider scaling up.

Manual Scaling

You can also manually set the number of instances for your App Service plan. This is useful for testing or when you have predictable load patterns.
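Outside the portal, one way to set a fixed instance count is the Azure CLI's `az appservice plan update` command with `--number-of-workers`. The sketch below only builds and prints that command rather than running it; the plan name `my-plan`, the resource group `my-rg`, and the helper `manual_scale_command` are hypothetical.

```python
import shlex


def manual_scale_command(plan, resource_group, instances):
    """Build the Azure CLI arguments that set a fixed instance count."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return [
        "az", "appservice", "plan", "update",
        "--name", plan,
        "--resource-group", resource_group,
        "--number-of-workers", str(instances),
    ]


cmd = manual_scale_command("my-plan", "my-rg", 3)
print(shlex.join(cmd))
# az appservice plan update --name my-plan --resource-group my-rg --number-of-workers 3
```

To execute the command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where the Azure CLI is installed and signed in.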

Monitoring Scalability

Regularly monitor your application's performance metrics in the Azure portal. Look for trends in CPU usage, memory, request queue length, and response times to fine-tune your scaling rules.
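As one possible way to turn exported metric samples into tuning input, the sketch below summarizes a series of CPU readings. The helper names, the nearest-rank percentile method, and the sample values are all assumptions for illustration; real data would come from your own metric export.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def fraction_above(samples, threshold):
    """Share of samples strictly above the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


cpu_samples = [20.0, 35.0, 50.0, 72.0, 90.0]  # made-up readings

print(percentile(cpu_samples, 50))        # 50.0
print(percentile(cpu_samples, 95))        # 90.0
print(fraction_above(cpu_samples, 70.0))  # 0.4
```

If a large share of samples sits above the scale-out threshold even at the maximum instance count, that suggests raising the maximum or scaling up; if almost none do, the threshold or the minimum may be set higher than needed.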
