App Service Scaling

Ensure your application can handle varying loads by implementing effective scaling strategies.

Understanding Scaling in Azure App Service

Azure App Service offers two primary scaling methods:

  • Manual scaling: you set the number of instances yourself, and it stays fixed until you change it.
  • Autoscale: rules based on performance metrics or a schedule adjust the number of instances for you.

Manual Scaling

Manual scaling provides direct control over the number of instances. You can adjust the instance count at any time.

Steps for Manual Scaling:

  1. Navigate to your App Service in the Azure portal.
  2. Under "Scale out (App Service plan)", select "Manual scale".
  3. Choose the desired number of instances.
  4. Click "Save".

When to use manual scaling: Predictable traffic patterns, testing, or for applications with fixed resource requirements.
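If you prefer to script the change instead of using the portal, the sketch below sets the plan's SKU capacity (the instance count) with the Azure.ResourceManager.AppService management SDK. The subscription, resource group, and plan names are placeholders, and the exact type and method names can vary between SDK versions, so treat this as a starting point rather than a drop-in script.

// Sketch: set the instance count of an App Service plan via the management SDK.
// Assumes the Azure.ResourceManager.AppService and Azure.Identity packages;
// the subscription, resource group, and plan names below are placeholders.
using Azure;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.AppService;
using Azure.ResourceManager.Resources;

const string subscriptionId = "<subscription-id>";
const string resourceGroupName = "<resource-group>";
const string planName = "<app-service-plan>";

var client = new ArmClient(new DefaultAzureCredential());

ResourceGroupResource resourceGroup = await client
    .GetResourceGroupResource(ResourceGroupResource.CreateResourceIdentifier(subscriptionId, resourceGroupName))
    .GetAsync();

// Read the current plan, change the SKU capacity (the instance count), and write it back.
AppServicePlanResource plan = await resourceGroup.GetAppServicePlans().GetAsync(planName);
AppServicePlanData data = plan.Data;
data.Sku.Capacity = 3; // scale out to three instances

await resourceGroup.GetAppServicePlans().CreateOrUpdateAsync(WaitUntil.Completed, planName, data);

The Azure CLI or an ARM/Bicep template can make the same change by updating the plan's worker count or SKU capacity.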

Autoscale

Autoscale allows your App Service to automatically adjust the number of instances based on performance metrics or a schedule.

Key Autoscale Features:

  • Metric-based rules that scale out or in when a monitored metric (for example, CPU percentage) crosses a threshold.
  • Schedule-based scaling that changes instance counts at set times.
  • Minimum, maximum, and default instance limits that keep scaling within bounds.
  • Cooldown periods that prevent rapid back-and-forth scaling after each action.

Configuring Autoscale Rules:

  1. Navigate to your App Service plan in the Azure portal.
  2. Under "Scale out (App Service plan)", select "Autoscale".
  3. Configure the default instance count (used when metric data is unavailable or no other rules apply).
  4. Add new scale-out or scale-in rules. For each rule, define:
    • The metric to monitor.
    • The threshold value.
    • The time grain (the interval at which metric values are aggregated) and the duration (the window over which the rule is evaluated).
    • The statistic (e.g., average, maximum).
    • The cooldown period (how long to wait before triggering another scale action).
    • The number of instances to add or remove.
  5. Configure schedule-based scaling if needed.
  6. Set the minimum and maximum number of instances.
  7. Click "Save".

Example Autoscale Rule: Scale out by 1 instance when CPU percentage is greater than 70% for 10 minutes. Scale in by 1 instance when CPU percentage is less than 30% for 15 minutes.
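To make those moving parts concrete, here is a purely illustrative sketch of the evaluation logic behind a rule like the one above. None of these types come from an Azure SDK, and this is not how Azure implements autoscale; it only shows how the statistic over a time window, the threshold, and the cooldown interact.

// Purely illustrative -- Azure Monitor evaluates autoscale rules for you.
// This sketch only shows how statistic, time window, threshold, and cooldown interact.
using System;
using System.Collections.Generic;
using System.Linq;

public record ScaleRule(double Threshold, bool ScaleOut, TimeSpan Window, TimeSpan Cooldown, int Change);

public class AutoscaleEvaluator
{
    private DateTime _lastActionUtc = DateTime.MinValue;

    // samples are (timestamp, CPU %) pairs collected at the metric's time grain.
    // Returns +N to scale out, -N to scale in, or 0 to do nothing.
    public int Evaluate(ScaleRule rule, IReadOnlyList<(DateTime TimeUtc, double Cpu)> samples, DateTime nowUtc)
    {
        if (nowUtc - _lastActionUtc < rule.Cooldown)
            return 0; // still inside the cooldown period of the previous scale action

        var window = samples.Where(s => nowUtc - s.TimeUtc <= rule.Window).ToList();
        if (window.Count == 0)
            return 0; // no metric data in the window

        double average = window.Average(s => s.Cpu); // the "statistic" (here: average) over the time window

        bool breached = rule.ScaleOut ? average > rule.Threshold : average < rule.Threshold;
        if (!breached)
            return 0;

        _lastActionUtc = nowUtc;
        return rule.ScaleOut ? rule.Change : -rule.Change;
    }
}

For the scale-out half of the example rule, you would evaluate something like new ScaleRule(Threshold: 70, ScaleOut: true, Window: TimeSpan.FromMinutes(10), Cooldown: TimeSpan.FromMinutes(5), Change: 1) against recent CPU samples (the 5-minute cooldown is just an example value); Azure Monitor applies the same kind of logic across all of your rules and profiles.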

Scaling Considerations

Code Example: Measuring Request Latency (Conceptual)

Azure App Service provides built-in platform metrics, but you may also want to emit custom metrics that better reflect your application's behavior. Here's a conceptual example of tracking request latency in your application code (e.g., using ASP.NET Core middleware):


// Example using ASP.NET Core Middleware
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

public class RequestLatencyMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<RequestLatencyMiddleware> _logger;

    public RequestLatencyMiddleware(RequestDelegate next, ILogger<RequestLatencyMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var stopwatch = Stopwatch.StartNew();
        await _next(context);
        stopwatch.Stop();

        // Log or send this latency metric to an Application Insights custom metric
        // This could be used for custom autoscale rules if configured correctly
        _logger.LogInformation("Request {Path} took {ElapsedMilliseconds} ms", context.Request.Path, stopwatch.ElapsedMilliseconds);

        // In a real scenario, you would push this to Azure Monitor/Application Insights
        // TelemetryClient.TrackMetric("RequestLatency", stopwatch.ElapsedMilliseconds);
    }
}

// In Program.cs (or Startup.Configure in older project templates):
// app.UseMiddleware<RequestLatencyMiddleware>();

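To wire the middleware into an application and actually ship the measurement, you could register it in Program.cs. The sketch below assumes the Microsoft.ApplicationInsights.AspNetCore package, which registers a TelemetryClient that the middleware could take as a constructor parameter and use via GetMetric("RequestLatency").TrackValue(elapsedMilliseconds).

// Minimal Program.cs wiring (assumes the Microsoft.ApplicationInsights.AspNetCore package).
var builder = WebApplication.CreateBuilder(args);

// Registers TelemetryClient in dependency injection and enables standard request telemetry.
builder.Services.AddApplicationInsightsTelemetry();

var app = builder.Build();

// Measure latency for every request with the middleware defined above.
app.UseMiddleware<RequestLatencyMiddleware>();

app.MapGet("/", () => "Hello from App Service!");

app.Run();

Once the latency metric is flowing into Application Insights, it can back a custom-metric autoscale rule in the same way as the built-in CPU metric, provided the autoscale setting is configured to read from that metric source.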
Best Practices