Upgrading Azure Kubernetes Service (AKS) Clusters

Keeping your Azure Kubernetes Service (AKS) clusters up to date is crucial for security, performance, and access to the latest features. This guide covers the methods and best practices for upgrading your AKS clusters.

Understanding AKS Upgrade Concepts

An AKS upgrade involves updating the Kubernetes version of your control plane and agent nodes. AKS provides managed upgrades for the control plane, while agent nodes can be upgraded through several mechanisms.

Control Plane Upgrades

The AKS control plane is a managed service: Microsoft operates and maintains it, but a Kubernetes version upgrade is applied only when you initiate it or when you have configured automatic upgrades. When a new Kubernetes version is released upstream, AKS makes it available for your cluster some time afterwards. You can initiate a control plane upgrade through the Azure portal, Azure CLI, or REST API.

Agent Node Upgrades

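Upgrading agent nodes updates the kubelet and the node image (including the operating system) on the virtual machines that make up your node pools. AKS supports different upgrade strategies for node pools:

  • In-place upgrade: The existing nodes are updated without adding extra capacity first, so the pool has fewer nodes available while each node is being upgraded.
  • Surge upgrade: Extra nodes running the upgraded version are added first, and then the old nodes are cordoned, drained, and replaced. The node pool's max surge setting controls how many extra nodes are added at a time. This minimizes downtime.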
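
As a rough illustration of how surge capacity is sized, the sketch below computes how many extra nodes a percentage-based max surge value yields, assuming that fractional nodes are rounded up. The `surge_nodes` and `set_max_surge` helpers and their arguments are hypothetical names used only for illustration; `az aks nodepool update --max-surge` is the CLI setting the second helper wraps.

```shell
#!/usr/bin/env bash

# surge_nodes NODE_COUNT PERCENT
# Prints the number of extra nodes a percentage max surge adds,
# rounding any fractional node up (integer ceiling).
surge_nodes() {
  local count="$1" percent="$2"
  echo $(( (count * percent + 99) / 100 ))
}

# set_max_surge RESOURCE_GROUP CLUSTER NODE_POOL MAX_SURGE
# Applies a max surge value (a count such as 2, or a percentage such as 33%).
# Defined but not called here, so loading this file changes nothing in Azure.
set_max_surge() {
  az aks nodepool update \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --max-surge "$4"
}
```

For example, `surge_nodes 5 50` prints 3: a five-node pool with a 50% max surge would add three extra nodes under the rounding assumption above. A larger surge value speeds up the upgrade but needs more spare quota.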

Methods for Upgrading AKS Clusters

1. Azure Portal

The Azure portal offers a user-friendly interface for managing AKS clusters. To upgrade:

  1. Navigate to your AKS cluster in the Azure portal.
  2. In the left-hand menu, under 'Settings', select 'Node pools'.
  3. Select the node pool you wish to upgrade.
  4. Click the 'Upgrade' button and choose the desired Kubernetes version.
  5. Follow the prompts to initiate the upgrade process.

2. Azure CLI

The Azure Command-Line Interface (CLI) is a powerful tool for automating and managing Azure resources. Use the following commands:


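# Set your subscription and default resource group
az account set --subscription "<subscription-id>"
az configure --defaults group="<resource-group>"

# List the Kubernetes versions this cluster can upgrade to
az aks get-upgrades --name "<cluster-name>" --output table

# Upgrade only the control plane to a specific version
# (without --control-plane-only, the node pools are upgraded as well)
az aks upgrade --name "<cluster-name>" --kubernetes-version "<version>" --control-plane-only

# Upgrade a specific node pool (the pool's max surge setting controls how many extra nodes are added)
az aks nodepool upgrade --resource-group "<resource-group>" --cluster-name "<cluster-name>" --name "<node-pool-name>" --kubernetes-version "<version>"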

3. Azure REST API / SDKs

For programmatic upgrades and integration into CI/CD pipelines, you can use the Azure REST API or the various Azure SDKs (e.g., Python, Go, .NET).
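
A pipeline step can also drive the Azure CLI directly. The sketch below only prints the commands for a staged upgrade (control plane first, then each node pool) so that they can be reviewed or logged before anything runs. The `print_upgrade_plan` name and its arguments are hypothetical; `--yes` is included on `az aks upgrade` to skip the interactive confirmation, which a non-interactive pipeline would otherwise wait on.

```shell
#!/usr/bin/env bash

# print_upgrade_plan RESOURCE_GROUP CLUSTER VERSION NODE_POOL...
# Prints (does not run) the CLI commands for a staged upgrade:
# control plane first, then each listed node pool in order.
print_upgrade_plan() {
  local rg="$1" cluster="$2" version="$3"
  shift 3

  echo "az aks upgrade --resource-group $rg --name $cluster" \
       "--kubernetes-version $version --control-plane-only --yes"

  local pool
  for pool in "$@"; do
    echo "az aks nodepool upgrade --resource-group $rg --cluster-name $cluster" \
         "--name $pool --kubernetes-version $version"
  done
}
```

Calling `print_upgrade_plan my-rg my-aks 1.29.2 systempool userpool` (all placeholder values) prints three commands. Printing rather than executing is a deliberate choice for this sketch: it keeps the ordering logic easy to inspect, and the pipeline can decide separately when to run each step.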

Best Practices for AKS Upgrades

  • Test upgrades in a staging environment: Before upgrading production clusters, test the upgrade process and application compatibility in a non-production environment.
  • Review release notes: Always check the Kubernetes and AKS release notes for breaking changes or deprecated features that might affect your applications.
  • Use surge upgrades for node pools: For production environments, leverage surge upgrades to minimize application disruption.
  • Monitor application health during and after upgrades: Keep a close eye on your applications for any unexpected behavior or errors.
  • Perform upgrades during off-peak hours: Schedule upgrades when traffic is minimal to reduce the impact on users.
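  • Keep node pools in sync: For consistency, it's generally recommended to upgrade all node pools to the same Kubernetes version. A node pool cannot run a newer Kubernetes version than the control plane, so upgrade the control plane first.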
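
One way to check that node pools are in sync is to compare each pool's version against the version you expect. The sketch below assumes input lines of the form `<pool-name> <version>`, such as the tab-separated output of the `az aks nodepool list` query shown in the comment; `pools_out_of_sync` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# pools_out_of_sync TARGET_VERSION
# Reads "<pool-name> <version>" lines on stdin and prints the names of
# pools whose version differs from TARGET_VERSION. Appending "" forces a
# string comparison, so "1.30" is never treated as the number 1.3.
pools_out_of_sync() {
  awk -v target="$1" '($2 "") != (target "") { print $1 }'
}

# Example wiring (not executed here); resource names are placeholders:
# az aks nodepool list --resource-group "<resource-group>" --cluster-name "<cluster-name>" \
#   --query "[].[name, orchestratorVersion]" --output tsv | pools_out_of_sync "<version>"
```

An empty result means every listed pool reports the target version.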

Important Note

AKS supports upgrading to a newer patch version of the same minor version and to the next minor version. You cannot skip minor versions: for example, a cluster on 1.25 cannot go directly to 1.27 and must be upgraded to 1.26 first. Always refer to the official AKS release schedule for supported versions.
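
The no-skipping rule can be checked before an upgrade is attempted. The following is a minimal sketch, assuming versions are written as `major.minor.patch`; the `minor_of` and `is_supported_step` helpers are hypothetical and do not query AKS, so the target version must still be one that AKS actually offers.

```shell
#!/usr/bin/env bash

# minor_of VERSION
# Prints the minor component, e.g. 27 for 1.27.3.
minor_of() {
  echo "$1" | cut -d. -f2
}

# is_supported_step CURRENT TARGET
# Succeeds (exit status 0) when TARGET has the same major version and is
# either the same minor version as CURRENT or exactly one minor newer.
# Patch ordering within the same minor is not checked in this sketch.
is_supported_step() {
  local cur_major tgt_major cur_minor tgt_minor
  cur_major="$(echo "$1" | cut -d. -f1)"
  tgt_major="$(echo "$2" | cut -d. -f1)"
  cur_minor="$(minor_of "$1")"
  tgt_minor="$(minor_of "$2")"

  [ "$cur_major" -eq "$tgt_major" ] || return 1
  local diff=$(( tgt_minor - cur_minor ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}
```

With these helpers, `is_supported_step 1.25.6 1.26.3` succeeds, while `is_supported_step 1.25.6 1.27.1` fails because it would skip 1.26.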

Troubleshooting Common Upgrade Issues

If you encounter issues during an upgrade, consider the following:

  • Check pod disruption budgets (PDBs): Ensure your PDBs allow at least one pod to be evicted at a time. A PDB that currently allows zero disruptions prevents a node from draining and can stall the upgrade.
  • Review cluster logs: Examine AKS diagnostic logs and Kubernetes event logs for error messages.
  • Verify resource availability: Ensure your subscription has enough vCPU quota, and your subnet enough free IP addresses, to accommodate the extra nodes created during surge upgrades.
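
The PDB check above can be scripted. The sketch below assumes input lines of the form `<namespace> <name> <disruptionsAllowed>`, which the `kubectl` JSONPath query in the comment is intended to produce; `blocking_pdbs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# blocking_pdbs
# Reads "<namespace> <name> <disruptionsAllowed>" lines on stdin and prints
# "<namespace>/<name>" for each PDB that currently allows zero disruptions.
# Lines with fewer than three fields are ignored.
blocking_pdbs() {
  awk 'NF >= 3 && $3 == 0 { print $1 "/" $2 }'
}

# Example wiring (not executed here):
# kubectl get pdb --all-namespaces \
#   -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.disruptionsAllowed}{"\n"}{end}' \
#   | blocking_pdbs
#
# Recent cluster events, oldest first, often show why a drain is stuck:
# kubectl get events --all-namespaces --sort-by=.lastTimestamp
```

Any PDB this prints is worth reviewing before retrying the upgrade, for example by scaling up the workload or relaxing the budget temporarily.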