Upgrading Apache Airflow
This guide provides detailed instructions and considerations for upgrading your Apache Airflow installation. Keeping Airflow up to date is crucial for security, performance, and access to new features.
Note: Always back up your metadata database and Airflow configurations before performing an upgrade.
Before You Upgrade
Before initiating the upgrade process, ensure you have:
- Reviewed the Release Notes for the target version. Pay close attention to any breaking changes, deprecations, or new features that might affect your workflows.
- Understood the changes in dependencies and Python versions required by the new release.
- Assessed the impact of the upgrade on your custom plugins, providers, and integrations.
- Configured a staging or testing environment that mirrors your production setup.
Upgrade Process Overview
The general upgrade process involves the following steps:
- Backup: Create a complete backup of your metadata database and Airflow configuration files.
- Environment Setup: Prepare your upgrade environment (e.g., staging).
- Install New Version: Install the new Airflow version and its dependencies.
- Upgrade Metadata Database: Run the database migration commands.
- Test: Thoroughly test your DAGs, UI, and critical functionalities in the upgraded environment.
- Deploy: Once satisfied with testing, deploy the new version to production.
Detailed Steps
1. Back Up Your Metadata Database
This is the most critical step. The method for backing up your database depends on your specific database system (e.g., PostgreSQL, MySQL, SQLite).
For example, with PostgreSQL:
pg_dump airflow_db > airflow_db_backup.sql
Consult your database documentation for the exact commands.
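As a minimal sketch for PostgreSQL, the dump can be wrapped in a small shell function that timestamps each backup so earlier ones are not overwritten. The function names and the default database name airflow_db are illustrative assumptions, and connection options such as host and user are left to your environment.

```shell
# Sketch: timestamped PostgreSQL backup of the Airflow metadata database.
# Function names and the default database name are illustrative assumptions.

# Print a backup file name such as airflow_db_backup_20240131_235959.sql
backup_file_name() {
    printf '%s_backup_%s.sql\n' "$1" "$(date +%Y%m%d_%H%M%S)"
}

# Dump the database to a timestamped plain-SQL file and print the file name.
backup_airflow_db() {
    db_name="${1:-airflow_db}"
    out_file="$(backup_file_name "$db_name")"
    if pg_dump "$db_name" > "$out_file"; then
        printf '%s\n' "$out_file"
    else
        # Remove the partial file so it is not mistaken for a valid backup.
        rm -f "$out_file"
        echo "pg_dump failed for $db_name" >&2
        return 1
    fi
}
```

Calling backup_airflow_db airflow_db writes the dump to the current directory. A plain-SQL dump like this one can later be restored with psql.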
2. Back Up Airflow Configurations
Copy your airflow.cfg file and any custom configuration files.
cp /path/to/airflow.cfg /path/to/airflow.cfg.backup
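If you back up configuration files regularly, a timestamped copy avoids overwriting an earlier backup. The following is a sketch only; backup_config is a hypothetical helper name, not part of Airflow.

```shell
# Sketch: copy a configuration file to a timestamped backup next to it.
# backup_config is a hypothetical helper name.
backup_config() {
    cfg_path="$1"
    if [ ! -f "$cfg_path" ]; then
        echo "No such file: $cfg_path" >&2
        return 1
    fi
    cfg_backup="${cfg_path}.backup.$(date +%Y%m%d_%H%M%S)"
    # -p preserves the file mode and timestamps of the original.
    cp -p "$cfg_path" "$cfg_backup" && printf '%s\n' "$cfg_backup"
}
```

For example, backup_config /path/to/airflow.cfg prints the path of the copy it created.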
3. Install the New Airflow Version
It's recommended to upgrade in a virtual environment.
First, uninstall the old version:
pip uninstall apache-airflow
Then, install the new version. Replace <new_version> with the desired version number (for example, 2.8.1):
pip install apache-airflow==<new_version>
If you use specific extras, like postgres or cncf.kubernetes, include them. Quote the requirement so that the shell does not interpret the square brackets:
pip install "apache-airflow[postgres,cncf.kubernetes]==<new_version>"
Install any necessary provider packages separately if they are not bundled or have been moved:
pip install apache-airflow-providers-cncf-kubernetes==<new_provider_version>
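The Airflow project publishes a constraints file for each release and supported Python version, and installing against it keeps dependencies at a tested set of versions. The sketch below builds the constraints URL following the pattern used in the official installation instructions. The function names are hypothetical, and it assumes python and pip resolve to the virtual environment you are upgrading.

```shell
# Sketch: install a pinned Airflow version using the published constraints file.
# Function names are hypothetical helpers, not part of Airflow.

# Print the constraints URL for an Airflow version and a Python minor version.
airflow_constraints_url() {
    printf 'https://raw.githubusercontent.com/apache/airflow/constraints-%s/constraints-%s.txt\n' "$1" "$2"
}

# Usage: install_airflow 2.8.1 "postgres,cncf.kubernetes"
# The second argument (extras) is optional.
install_airflow() {
    airflow_version="$1"
    extras="$2"
    # Detect the Python minor version of the active environment, e.g. 3.8.
    python_version="$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    package="apache-airflow"
    if [ -n "$extras" ]; then
        package="apache-airflow[$extras]"
    fi
    pip install "${package}==${airflow_version}" \
        --constraint "$(airflow_constraints_url "$airflow_version" "$python_version")"
}
```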
4. Upgrade the Metadata Database
After installing the new Airflow version, you need to upgrade the metadata database schema.
Ensure your AIRFLOW_HOME environment variable is set correctly and points to your Airflow configuration directory.
export AIRFLOW_HOME=/path/to/your/airflow/home
Run the database upgrade command:
airflow db upgrade
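This command applies any pending schema migrations. In Airflow 2.7.0 and later, airflow db upgrade is deprecated in favor of airflow db migrate, which performs the same migration. Review the output for any errors.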
Caution: If airflow db upgrade fails, do NOT proceed without resolving the errors. You may need to restore from your backup.
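Because the migration command was renamed in Airflow 2.7.0 (airflow db migrate replaces the deprecated airflow db upgrade), a script that serves several installations can pick the command from the installed version. This is a sketch that assumes Airflow 2.0 or later and a plain X.Y.Z version string; the function names are hypothetical.

```shell
# Sketch: choose and run the schema migration command for the installed version.
# Assumes Airflow 2.0+ and a version string of the form MAJOR.MINOR.PATCH.

# Print the migration command appropriate for the given version string.
db_migration_command() {
    major="${1%%.*}"
    rest="${1#*.}"
    minor="${rest%%.*}"
    if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 7 ]; }; then
        echo "airflow db migrate"
    else
        echo "airflow db upgrade"
    fi
}

# Run the migration for the installed Airflow and stop on failure.
migrate_metadata_db() {
    installed_version="$(airflow version)" || return 1
    migration_cmd="$(db_migration_command "$installed_version")"
    echo "Running: $migration_cmd"
    # Left unquoted on purpose so the string splits into command and arguments.
    $migration_cmd || {
        echo "Migration failed; resolve the errors or restore from backup." >&2
        return 1
    }
}
```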
5. Update Airflow Configuration
After a successful database upgrade, review your airflow.cfg file. New versions might introduce new configuration options or change default values. Refer to the release notes and documentation for specific changes.
You may need to re-apply any custom settings you had in your old configuration file.
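One way to see which settings differ between the backed-up file and the current one is to compare only the active lines, ignoring comments and blank lines. The helper names below are hypothetical, and the sketch compares the files textually rather than parsing sections.

```shell
# Sketch: diff the active settings of two configuration files.
# Helper names are hypothetical; comparison is line-based, not section-aware.

# Print the lines of a config file that are neither comments nor blank.
active_settings() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Show a unified diff of active settings.
# Returns 0 if they match and 1 if they differ.
compare_configs() {
    old_tmp="$(mktemp)"
    new_tmp="$(mktemp)"
    active_settings "$1" > "$old_tmp"
    active_settings "$2" > "$new_tmp"
    diff -u "$old_tmp" "$new_tmp"
    diff_status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$diff_status"
}
```

For example, compare_configs /path/to/airflow.cfg.backup /path/to/airflow.cfg lists the settings you may need to re-apply.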
6. Restart Airflow Components
Start your Airflow services (scheduler, webserver, workers) with the newly installed version.
If you are running services via systemd, supervisor, or Docker Compose, restart them accordingly.
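For systemd, a restart can be sketched as a loop over unit names. The unit names used in the usage note are assumptions, so substitute the units defined on your hosts. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Sketch: restart Airflow services managed by systemd.
# Pass the unit names as arguments; set DRY_RUN=1 to print without running.
restart_airflow_units() {
    for unit in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "systemctl restart $unit"
        else
            systemctl restart "$unit" || return 1
        fi
    done
}
```

A typical call might be restart_airflow_units airflow-scheduler airflow-webserver airflow-worker. With Docker Compose the equivalent is docker compose restart, and with supervisor it is supervisorctl restart followed by the program name.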
7. Test Thoroughly
This is a crucial step. Deploy the upgraded Airflow to your staging environment first.
- Verify that the UI loads correctly.
- Check that your DAGs are parsed and visible.
- Trigger a few sample DAG runs, including those with different operators and dependencies.
- Test integrations with external systems (databases, cloud services, etc.).
- Monitor logs for any new errors or warnings.
- Perform load testing if possible to confirm that performance has not regressed.
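Some of the checks above can be scripted as a quick smoke test. The sketch below runs each command it is given and reports whether it succeeded. run_checks is a hypothetical helper, and the example checks in the comment assume an Airflow 2.x webserver listening on localhost:8080.

```shell
# Sketch: run a list of check commands and report the result of each.
# Returns the number of failed checks (0 means all passed).
run_checks() {
    failures=0
    for check in "$@"; do
        if sh -c "$check" > /dev/null 2>&1; then
            echo "PASS: $check"
        else
            echo "FAIL: $check"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}

# Example (assumes an Airflow 2.x webserver on localhost:8080):
#   run_checks "airflow db check" \
#              "airflow dags list" \
#              "curl -fsS http://localhost:8080/health"
```

A passing smoke test only shows that the commands exit successfully. It does not replace triggering real DAG runs and reading the logs.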
Specific Considerations for Major Version Upgrades
Upgrading between major versions (e.g., from 1.x to 2.x, or 2.x to 3.x) often involves more significant changes.
- Provider Packages: Many integrations have been moved into separate provider packages. Ensure you install the correct provider packages for your version.
- Configuration Structure: The configuration file structure and options might change.
- API Changes: If you use the Airflow REST API, review any changes in endpoints or data formats.
- Executor Changes: Certain executors might have been deprecated or modified.
Downgrading
Downgrading is generally not recommended and can be complex, especially if database schema changes have occurred. If you encounter critical issues that require downgrading:
- Restore your metadata database from the backup taken before the upgrade.
- Reinstall the previous version of Airflow.
- Restore your configuration files.
- Restart Airflow components.
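The first three rollback steps can be sketched for PostgreSQL as follows. This assumes the backup is a plain-SQL dump created with pg_dump and that the target database has been emptied or recreated before the restore. The function names and argument order are hypothetical. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Sketch: restore the database, reinstall the previous version, restore config.
# Assumes a plain-SQL pg_dump backup and an emptied or recreated target database.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: rollback_airflow PREVIOUS_VERSION DB_NAME SQL_BACKUP CFG_BACKUP CFG_PATH
rollback_airflow() {
    previous_version="$1"
    db_name="$2"
    sql_backup="$3"
    cfg_backup="$4"
    cfg_path="$5"
    # ON_ERROR_STOP makes psql abort on the first failing statement.
    run psql -v ON_ERROR_STOP=1 -d "$db_name" -f "$sql_backup" || return 1
    run pip install "apache-airflow==${previous_version}" || return 1
    run cp "$cfg_backup" "$cfg_path" || return 1
}
```

After the function succeeds, restart the Airflow components as described in the last step.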
It is always best to thoroughly test upgrades in a non-production environment to avoid the need for downgrades.