Upgrading Apache Airflow

This guide describes how to upgrade an Apache Airflow installation and what to consider along the way. Keeping Airflow up to date matters for security, performance, and access to new features.

Note: Always back up your metadata database and Airflow configurations before performing an upgrade.

Before You Upgrade

Before initiating the upgrade process, ensure you have a current backup of your metadata database and your Airflow configuration, as noted above.

Upgrade Process Overview

The general upgrade process involves the following steps:

  1. Backup: Create a complete backup of your metadata database and Airflow configuration files.
  2. Environment Setup: Prepare your upgrade environment (e.g., staging).
  3. Install New Version: Install the new Airflow version and its dependencies.
  4. Upgrade Metadata Database: Run the database migration commands.
  5. Test: Thoroughly test your DAGs, UI, and critical functionalities in the upgraded environment.
  6. Deploy: Once satisfied with testing, deploy the new version to production.

Detailed Steps

1. Backup Your Metadata Database

This is the most critical step. The method for backing up your database depends on your specific database system (e.g., PostgreSQL, MySQL, SQLite).

For example, with PostgreSQL:

pg_dump airflow_db > airflow_db_backup.sql

Consult your database documentation for the exact commands.
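
A slightly more defensive variant of the pg_dump example above, as a minimal sketch. It timestamps the dump so repeated backups do not overwrite each other, and it fails if the dump is empty. The function name backup_metadata_db and the file naming scheme are hypothetical; connection details (host, user, password) are assumed to come from the usual PG* environment variables.

```shell
# Hypothetical helper: dump a PostgreSQL database to a timestamped file.
# Usage: backup_metadata_db <database> <backup_dir>
backup_metadata_db() {
    db_name=$1
    backup_dir=$2
    backup_file="$backup_dir/${db_name}_$(date +%Y%m%d_%H%M%S).sql"

    # Plain-SQL dump, the same format as `pg_dump airflow_db > file`.
    pg_dump "$db_name" > "$backup_file" || {
        echo "pg_dump failed for $db_name" >&2
        return 1
    }

    # An empty dump is never a usable backup.
    if [ ! -s "$backup_file" ]; then
        echo "backup file $backup_file is empty" >&2
        return 1
    fi

    # Print the path so callers can record where the backup went.
    echo "$backup_file"
}
```

For example, backup_metadata_db airflow_db /path/to/backups prints the path of the file it wrote.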

2. Backup Airflow Configurations

Copy your airflow.cfg file and any custom configuration files.

cp /path/to/airflow.cfg /path/to/airflow.cfg.backup
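
If you upgrade more than once, a timestamped copy avoids overwriting an earlier backup. The following is a sketch under that assumption; backup_config is a hypothetical helper name, not part of Airflow.

```shell
# Hypothetical helper: copy a config file next to itself with a timestamp suffix.
# Usage: backup_config /path/to/airflow.cfg
backup_config() {
    cfg_file=$1
    if [ ! -f "$cfg_file" ]; then
        echo "config file not found: $cfg_file" >&2
        return 1
    fi
    backup_copy="$cfg_file.backup.$(date +%Y%m%d_%H%M%S)"
    # -p preserves mode and timestamps so the copy can be restored as-is.
    cp -p "$cfg_file" "$backup_copy" || return 1
    # Print the path of the copy for later reference.
    echo "$backup_copy"
}
```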

3. Install the New Airflow Version

It's recommended to upgrade in a virtual environment.

First, uninstall the old version:

pip uninstall apache-airflow

Then install the new version, replacing <new_version> with the desired version number (for example, 2.8.1):

pip install apache-airflow==<new_version>

If you use specific extras, like postgres or cncf.kubernetes, include them:

pip install "apache-airflow[postgres,cncf.kubernetes]==<new_version>"

Install any necessary provider packages separately if they are not bundled or have been moved:

pip install apache-airflow-providers-cncf-kubernetes==<new_provider_version>
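
The Airflow project publishes a constraints file for each release and Python version, and installing against it helps pip resolve a dependency set known to work with that release. The sketch below builds the constraints URL and passes it to pip. The function names constraints_url and install_airflow are hypothetical, and the wrapper assumes that python and pip on your PATH belong to the virtual environment you are upgrading.

```shell
# Build the URL of the constraints file published for a given
# Airflow version and Python major.minor version.
constraints_url() {
    airflow_version=$1
    python_version=$2
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-${airflow_version}/constraints-${python_version}.txt"
}

# Hypothetical wrapper: install a pinned Airflow version with optional extras.
# Usage: install_airflow <airflow_version> [comma-separated extras]
install_airflow() {
    airflow_version=$1
    extras=${2:-}
    # Python major.minor of the interpreter in the active environment.
    python_version=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 1

    requirement="apache-airflow"
    if [ -n "$extras" ]; then
        requirement="${requirement}[${extras}]"
    fi

    # Quoting keeps the shell from interpreting the square brackets.
    pip install "${requirement}==${airflow_version}" \
        --constraint "$(constraints_url "$airflow_version" "$python_version")"
}
```

For example, install_airflow 2.8.1 postgres,cncf.kubernetes installs version 2.8.1 with both extras against the matching constraints file.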

4. Upgrade the Metadata Database

After installing the new Airflow version, you need to upgrade the metadata database schema.

Ensure your AIRFLOW_HOME environment variable is set correctly and points to your Airflow configuration directory.

export AIRFLOW_HOME=/path/to/your/airflow/home

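Run the database upgrade command:

airflow db upgrade

On Airflow 2.7 and later, airflow db upgrade is deprecated; use airflow db migrate instead:

airflow db migrate

Either command applies any pending schema migrations. Review the output for errors.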

Caution: If airflow db upgrade fails, do NOT proceed without resolving the errors. You may need to restore from your backup.
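
The caution above can be enforced in a script. The sketch below picks the migration subcommand from the installed version (airflow db migrate from 2.7 onward, airflow db upgrade for 2.0 through 2.6) and stops with a non-zero status if the migration fails. Both function names are hypothetical, and the sketch assumes that airflow version prints only the version number on standard output.

```shell
# Pick the schema migration subcommand for an installed Airflow 2.0 or later:
# "migrate" from 2.7 onward, "upgrade" for 2.0 through 2.6.
migration_subcommand() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 7 ]; }; then
        echo migrate
    else
        echo upgrade
    fi
}

# Hypothetical wrapper: run the migration and stop loudly on failure.
migrate_metadata_db() {
    installed_version=$(airflow version) || return 1
    subcommand=$(migration_subcommand "$installed_version")
    if ! airflow db "$subcommand"; then
        echo "airflow db $subcommand failed; fix the error or restore the backup before continuing" >&2
        return 1
    fi
}
```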

5. Update Airflow Configuration

After a successful database upgrade, review your airflow.cfg file. New versions might introduce new configuration options or change default values. Refer to the release notes and documentation for specific changes.

You may need to re-apply any custom settings you had in your old configuration file.
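
One way to spot new options is to compare the option names in two configuration files, for example your backed-up airflow.cfg and a file generated by the new version in a fresh AIRFLOW_HOME (assuming you have one available). The sketch below is a simplification: cfg_keys and new_cfg_keys are hypothetical helpers, and the parser ignores multi-line values and options written without an equals sign.

```shell
# List "section.option" names found in an INI-style config file.
cfg_keys() {
    awk '
        /^[ \t]*[#;]/ { next }                                     # skip comments
        /^\[/ { section = substr($0, 2, index($0, "]") - 2); next } # remember section
        /^[A-Za-z_]/ && /=/ {
            key = $0
            sub(/[ \t]*=.*/, "", key)                              # drop "= value"
            print section "." key
        }
    ' "$1" | sort -u
}

# Print options that appear in the new file but not in the old one.
# Usage: new_cfg_keys <old_cfg> <new_cfg>
new_cfg_keys() {
    old_keys=$(mktemp) || return 1
    cfg_keys "$1" > "$old_keys"
    cfg_keys "$2" | grep -Fxv -f "$old_keys"
    rm -f "$old_keys"
}
```

Swapping the two arguments lists options that exist only in the old file, which may have been removed or renamed.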

6. Restart Airflow Components

Start your Airflow services (scheduler, webserver, workers) with the newly installed version.

If you are running services via systemd, supervisor, or Docker Compose, restart them accordingly.

7. Test Thoroughly

Deploy the upgraded Airflow to your staging environment first, and confirm that your DAGs, the UI, and critical functionality behave as expected before you touch production.
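
A few CLI checks can serve as a first smoke test before manual testing. The sketch below is one possible set, not a complete test plan: smoke_test is a hypothetical helper, and it assumes an Airflow 2.x release recent enough to provide airflow dags list-import-errors (check airflow dags --help if unsure).

```shell
# Hypothetical smoke test for a staging environment.
# Usage: smoke_test <expected_version>
smoke_test() {
    expected_version=$1

    # Confirm the environment really runs the version you meant to install.
    actual_version=$(airflow version) || return 1
    if [ "$actual_version" != "$expected_version" ]; then
        echo "expected Airflow $expected_version, found $actual_version" >&2
        return 1
    fi

    # Confirms the metadata database is reachable with the current config.
    airflow db check || return 1

    # Lists DAG files that failed to parse; review the output for new errors.
    airflow dags list-import-errors
}
```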

Specific Considerations for Major Version Upgrades

Upgrading between major versions (e.g., from 1.x to 2.x, or 2.x to 3.x) often involves more significant changes.

Downgrading

Downgrading is generally not recommended and can be complex, especially if database schema changes have occurred. If you encounter critical issues that require downgrading:

  1. Restore your metadata database from the backup taken before the upgrade.
  2. Reinstall the previous version of Airflow.
  3. Restore your configuration files.
  4. Restart Airflow components.
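
For a PostgreSQL metadata database backed up as a plain-SQL dump, step 1 might look like the sketch below. It is destructive: it drops and recreates the database before loading the dump, so stop all Airflow components first. The function name restore_metadata_db is hypothetical, and the sketch assumes the connecting role is allowed to drop and create the database.

```shell
# Hypothetical rollback helper for a PostgreSQL metadata database.
# WARNING: drops and recreates the database before restoring the dump.
# Usage: restore_metadata_db <database> <backup_file>
restore_metadata_db() {
    db_name=$1
    backup_file=$2
    if [ ! -s "$backup_file" ]; then
        echo "backup file missing or empty: $backup_file" >&2
        return 1
    fi
    dropdb --if-exists "$db_name" || return 1
    createdb "$db_name" || return 1
    # ON_ERROR_STOP makes psql exit non-zero on the first failed statement.
    psql -v ON_ERROR_STOP=1 -d "$db_name" -f "$backup_file"
}
```

For example, restore_metadata_db airflow_db airflow_db_backup.sql reloads the dump taken before the upgrade.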

It is always best to thoroughly test upgrades in a non-production environment to avoid the need for downgrades.