Upgrade to Airflow 2
This guide explains how to upgrade an Astronomer Software Deployment from Airflow 1.10.15 to 2.3.
As a follow-up to Airflow 2, Airflow 2.3 was released in May 2022 with new features like dynamic task mapping and a Grid view in the Airflow UI. Given the significance of this release, Astronomer is providing full support for Airflow 2.3 until October 2023.
Astronomer strongly recommends upgrading any Astronomer Software Deployments currently running Airflow 1.10.15 to Airflow 2.3.
The benefits of Airflow 2
Airflow 2 was built to be fast, reliable, and infinitely scalable. Among the hundreds of new features both large and small, Airflow 2 includes:
- Refactored Airflow Scheduler for enhanced performance and high-availability.
- Full REST API that enables more opportunities for automation.
- TaskFlow API for a simpler way to pass information between tasks.
- Independent Providers for improved usability and a more agile release cadence.
- Simplified KubernetesExecutor for ultimate flexibility in configuration.
- UI/UX Improvements including a new Airflow UI and auto-refresh button in the Graph view.
Airflow 2.3 subsequently introduced several powerful features, the most notable of which is dynamic task mapping. For more information on Airflow 2.3, see "Apache Airflow 2.3.0 is here" and the Airflow 2.3.0 changelog.
Prerequisites
This setup requires:
- The Astro CLI.
- An Astro project running Airflow 1.10.15. If your Astro project uses Airflow 1.10.14 or earlier, upgrade to 1.10.15 using the standard upgrade process.
Step 1: Run the Airflow upgrade check script
Not all Airflow 1.10.15 DAGs work in Airflow 2. The Airflow 2 upgrade check script can check for compatibility issues in your DAG code.
To run the Airflow 2 upgrade check script and install the latest version of the apache-airflow-upgrade-check package at runtime, open your Astro project and run the following command:
astro dev upgrade-check
This command outputs the results of tests which check the compatibility of your DAGs with Airflow 2.
In the upgrade check output, you can ignore the following entries:
- Fernet is enabled by default
- Check versions of PostgreSQL, MySQL, and SQLite to ease upgrade to Airflow 2
- Users must set a kubernetes.pod_template_file value
For more information about upgrade check functionality, see Upgrade Check Script in Apache Airflow documentation.
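When the upgrade check reports many entries, it can help to separate the ones you can ignore from the ones that need work. The following is a minimal sketch, assuming you have copied the reported entry titles into a list yourself. The function name `actionable_entries` and the sample input are hypothetical, and the sketch does not parse the script's actual output format.

```python
# Entries that this guide says can be ignored in the upgrade check output.
IGNORABLE_ENTRIES = (
    "Fernet is enabled by default",
    "Check versions of PostgreSQL, MySQL, and SQLite to ease upgrade to Airflow 2",
    "Users must set a kubernetes.pod_template_file value",
)


def actionable_entries(reported_entries):
    """Return reported entries that are not on the ignore list.

    `reported_entries` is assumed to be a list of entry titles copied from
    the upgrade check output; the real output format is not parsed here.
    """
    return [
        entry
        for entry in reported_entries
        # str.startswith accepts a tuple of candidate prefixes.
        if not entry.strip().startswith(IGNORABLE_ENTRIES)
    ]


if __name__ == "__main__":
    # Hypothetical sample input, not real upgrade check output.
    sample = [
        "Fernet is enabled by default",
        "Users must set a kubernetes.pod_template_file value",
        "Example entry that still needs attention",
    ]
    for entry in actionable_entries(sample):
        print(entry)
```

Anything the function returns is an entry to review in Step 2.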
Step 2: Prepare Airflow 2 DAGs
Review the results from the Airflow upgrade check script and then update your import statements, DAGs, and configurations if necessary.
a. Import operators from backport providers
All Airflow 2 providers have a backported package version for Airflow 1.10.15. You can use backported provider packages to test your DAGs against Airflow 2 provider functionality in a 1.10.15 environment.
- Add all necessary backported providers to the requirements.txt file of the Astro project.
- Modify the import statements of your DAGs to reference the backported provider packages.
- Run your DAGs to test their compatibility with Airflow 2 providers.
For more information, see 1.10.15 Backport Providers in Apache Airflow documentation, or see the collection of Backport Providers on PyPI.
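To find the import statements that need updating, you can scan DAG source code for legacy module paths. The sketch below uses Python's standard `ast` module and never imports Airflow itself. The `LEGACY_TO_PROVIDER` mapping and the `find_legacy_imports` helper are illustrative assumptions: the mapping covers only three modules, so extend it and verify each entry against the relevant provider's documentation.

```python
import ast

# Illustrative, incomplete mapping from Airflow 1.10 module paths to
# provider module paths. Verify each entry before relying on it.
LEGACY_TO_PROVIDER = {
    "airflow.contrib.hooks.mongo_hook": "airflow.providers.mongo.hooks.mongo",
    "airflow.operators.postgres_operator": "airflow.providers.postgres.operators.postgres",
    "airflow.contrib.operators.kubernetes_pod_operator": "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
}


def find_legacy_imports(dag_source):
    """Return sorted (line_number, old_module, new_module) tuples.

    The source is only parsed, never executed, so Airflow does not need
    to be installed to run this check.
    """
    findings = []
    for node in ast.walk(ast.parse(dag_source)):
        if isinstance(node, ast.ImportFrom) and node.module in LEGACY_TO_PROVIDER:
            # Handles "from <module> import <name>".
            findings.append((node.lineno, node.module, LEGACY_TO_PROVIDER[node.module]))
        elif isinstance(node, ast.Import):
            # Handles "import <module>" and "import <module> as <alias>".
            for alias in node.names:
                if alias.name in LEGACY_TO_PROVIDER:
                    findings.append((node.lineno, alias.name, LEGACY_TO_PROVIDER[alias.name]))
    return sorted(findings)
```

For example, a DAG file containing `from airflow.contrib.hooks.mongo_hook import MongoHook` would be reported with the suggested replacement module `airflow.providers.mongo.hooks.mongo`.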
b. Modify Airflow DAGs
Depending on your DAGs, you might need to make the following changes to make sure your code is compatible with Airflow 2:
- Changes to undefined variable handling in templates.
- Changes to the KubernetesPodOperator.
- Changing the default value for dag_run_conf_overrides_params.
For other compatibility considerations, see Step 5: Upgrade Airflow DAGs in Apache Airflow documentation.
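Of the changes listed above, the new default for dag_run_conf_overrides_params is easy to overlook. According to the Apache Airflow upgrade documentation, key-value pairs passed in a DAG run configuration override matching entries in params by default in Airflow 2. The sketch below is a simplified model of that behavior, not Airflow's implementation. The function `effective_params` and its arguments are hypothetical names used only for illustration.

```python
def effective_params(dag_params, dag_run_conf, dag_run_conf_overrides_params=True):
    """Simplified model of how params and DAG run conf are combined.

    This is not Airflow's code. It only illustrates the documented effect
    of the dag_run_conf_overrides_params setting.
    """
    params = dict(dag_params)  # Copy so the DAG-level defaults stay untouched.
    if dag_run_conf_overrides_params and dag_run_conf:
        # Key-value pairs passed at trigger time replace matching params.
        params.update(dag_run_conf)
    return params


if __name__ == "__main__":
    defaults = {"env": "dev", "retries": 1}
    conf = {"env": "prod"}
    print(effective_params(defaults, conf))         # Airflow 2 default behavior
    print(effective_params(defaults, conf, False))  # Setting disabled
```

If your DAGs rely on params never being overridden at trigger time, review them before upgrading or set dag_run_conf_overrides_params to False in your Airflow configuration.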
Step 3: Upgrade to Airflow 2.3
If the upgrade check script didn't identify any issues with your existing DAGs and configurations, you're ready to upgrade to Airflow 2.3.
To upgrade to Airflow 2.3:
- Initialize the Airflow upgrade process via the Astronomer UI or CLI.
- Depending on what distribution of Airflow you want to use, add one of the following lines to your project's Dockerfile:
  FROM quay.io/astronomer/astro-runtime:5.4.0
  FROM quay.io/astronomer/ap-airflow:2.3.4-onbuild
- Replace all backport providers with fully supported provider packages. For example, if you were using the Mongo backport provider, replace apache-airflow-backport-providers-mongo with apache-airflow-providers-mongo in your requirements.txt file. For more information, see Airflow documentation on provider packages.
- Restart your local environment and open the Airflow UI to confirm that your upgrade was successful.
- Deploy your project to Astronomer.
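If your requirements.txt lists many backport providers, the package replacement in the steps above can be scripted. The following is a minimal sketch, assuming one requirement per line. The helper `replace_backport_providers` is a hypothetical name, and it deliberately drops version pins from backport lines because a backport pin is not expected to be valid for the regular provider package.

```python
import re

BACKPORT_PREFIX = "apache-airflow-backport-providers-"
PROVIDER_PREFIX = "apache-airflow-providers-"

# Matches a backport requirement and captures the provider name that
# follows the prefix, stopping before any version specifier or extras.
_BACKPORT_LINE = re.compile(
    r"^\s*" + re.escape(BACKPORT_PREFIX) + r"(?P<name>[A-Za-z0-9._-]+)"
)


def replace_backport_providers(requirements_text):
    """Swap backport provider requirements for regular provider packages.

    Version pins on backport lines are dropped on purpose (an assumption
    of this sketch), so pin the new packages yourself afterwards.
    All other lines are returned unchanged.
    """
    updated = []
    for line in requirements_text.splitlines():
        match = _BACKPORT_LINE.match(line)
        if match:
            updated.append(PROVIDER_PREFIX + match.group("name"))
        else:
            updated.append(line)
    return "\n".join(updated)
```

Review the rewritten file before you restart your local environment, and add version pins that are compatible with the Airflow version in your Dockerfile.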
Upgrade considerations
Airflow 2.3 includes changes to the schema of the Airflow metadata database. When you first upgrade to Airflow 2.3, consider the following:
- Upgrading to Airflow 2.3 can take 10 to 30 minutes or more depending on the number of task instances that have been recorded in the metadata database throughout the lifetime of your Deployment. During the upgrade, scheduled tasks will continue to execute but new tasks will not be scheduled.
- Once you upgrade successfully to Airflow 2.3, you might see errors in the Airflow UI that warn you of incompatible data in certain tables of the database. For example:
  Airflow found incompatible data in the `dangling_rendered_task_instance_fields` table in your metadata database, and moved...
  These warnings have no impact on your tasks or DAGs and can be ignored. If you want to remove these warning messages from the Airflow UI, contact Astronomer Support. If necessary, Astronomer can remove incompatible tables from your metadata database.