Develop your Astro project
An Astro project contains all of the files necessary to test and run dags in a local Airflow environment and on Astro. This guide provides information about adding and organizing Astro project files, including:
- Adding dags
- Setting environment variables
- Applying changes
- Running on-build commands
For information about adding Airflow Providers, Python, and OS-level packages, see Add Airflow providers and packages.
For information about running your Astro project in a local Airflow environment, see Run Airflow locally.
Prerequisites
- The Astro CLI
Create an Astro project
In an empty folder, run astro dev init to create an Astro project.
This command generates the following files in your directory:
You can use the --from-template option with astro dev init to initialize an Astro project based on a template. Each template has contents specific to its use case, but all templates share the same directory structure.
Use the rest of this document to understand how to interact with each of these folders and files.
Add dags
In Apache Airflow, data pipelines are defined in Python code as Directed Acyclic Graphs (dags). A dag is a collection of tasks and dependencies between tasks that are defined as code. See Introduction to Airflow dags for an introduction to dags, and Dag Writing on Astro for information on tools to help you write and develop dags.
Dags are stored in the dags folder of your Astro project. To add a dag to your project:
- Add the .py file to the dags folder.
- Save your changes. If you’re using a Mac, use Command-S.
- Refresh your Airflow browser.
Use the astro run <dag-id> command to run and debug a dag from the command line without starting a local Airflow environment. This is an alternative to testing your entire Astro project with the Airflow webserver and scheduler. See Test your Astro project locally.
Add utility files
Airflow dags sometimes require utility files to run workflows. This can include:
- SQL files.
- Custom Airflow operators.
- Python functions.
When more than one dag in your Astro project needs a certain function or query, creating a shared utility file helps make your dags idempotent and more readable, and it minimizes the amount of code you have in each dag.
In most cases, Astronomer recommends storing your utility files in the /dags directory of your Astro project and organizing them into sub-directories based on whether they’re needed for a single dag or for multiple dags.
In the following example, the dags folder includes both types of utility files:
- To add utility files which are shared between all your dags, create a folder named utils in the dags directory of your Astro project. To add utility files only for a specific dag, create a new folder in dags to store both your dag file and your utility file.
- Add your utility files to the folder you created.
- Reference your utility files in your dag code.
- Apply your changes. If you’re developing locally, refresh the Airflow UI in your browser.
Utility files in the /dags directory will not be parsed by Airflow, so you don’t need to specify them in .airflowignore to prevent parsing. If you’re using dag-only deploys on Astro, changes to this folder are deployed when you run astro deploy --dags and do not require rebuilding your Astro project into a Docker image and restarting your Deployment.
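As a minimal sketch of a shared utility file, the module below could live in the utils folder described above. The file name sql_helpers.py, the function name, and the load_date column are hypothetical and only illustrate the pattern. The import shown in the closing comment assumes that the dags folder is on the Python path, which is Airflow's default behavior.

```python
# Hypothetical shared utility module, for example dags/utils/sql_helpers.py.
from datetime import date


def build_daily_count_query(table: str, day: date) -> str:
    """Return a SQL query that counts the rows loaded into `table` on `day`."""
    # Accept only plain identifiers (letters, digits, underscores) so that
    # the table name cannot smuggle extra SQL into the query.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE load_date = '{day.isoformat()}'"
    )


# In a dag file, the helper would be imported from the shared folder:
#   from utils.sql_helpers import build_daily_count_query
```

Keeping the query builder in one place means that every dag that needs the query calls the same function, so a change to the query only has to be made once.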
Add Airflow connections, pools, variables
Airflow connections connect external applications such as databases and third-party services to Apache Airflow. See Manage connections in Apache Airflow or the Apache Airflow documentation.
To add Airflow connections, pools, and variables to your local Airflow environment, you have the following options:
- Use the Airflow UI. In Admin, click Connections, Variables or Pools, and then add your values. These values are stored in the metadata database and are deleted when you run the astro dev kill command, which can sometimes be used for troubleshooting.
- Modify the airflow_settings.yaml file of your Astro project. This file is included in every Astro project and permanently stores your values in plain-text. To prevent you from committing sensitive credentials or passwords to your version control tool, Astronomer recommends adding this file to .gitignore.
- Use the Astro UI to create connections that can be shared across Deployments in a Workspace. These connections are not visible in the Airflow UI. See Create Airflow connections in the Astro UI.
- Use a secret backend, such as AWS Secrets Manager, and access the secret backend locally. See Configure an external secrets backend on Astro.
When you add Airflow objects to the Airflow UI of a local environment or to your airflow_settings.yaml file, your values can only be used locally. When you deploy your project to a Deployment on Astro, the values in this file are not included.
Astronomer recommends using the airflow_settings.yaml file so that you don’t have to manually redefine these values in the Airflow UI every time you restart your project. To ensure the security of your data, Astronomer recommends configuring a secrets backend.
Add test data or files for local testing
Use the include folder of your Astro project to store files for testing locally, such as test data or a dbt project file. The files in your include folder are included in your deploys to Astro, but they are not parsed by Airflow. Therefore, you don’t need to specify them in .airflowignore to prevent parsing.
If you’re running Airflow locally, apply your changes by refreshing the Airflow UI.
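One way a dag could locate a file in the include folder is sketched below. It assumes the standard project layout in which dags and include are sibling folders and the dag file sits directly inside dags. The helper name include_file and the example file name are hypothetical.

```python
from pathlib import Path


def include_file(dag_file: str, relative_name: str) -> Path:
    """Resolve a file in the project's include folder from a dag file's path.

    Assumes the dag file sits directly in dags/ and that dags/ and include/
    are siblings in the Astro project root.
    """
    project_root = Path(dag_file).resolve().parent.parent
    path = project_root / "include" / relative_name
    if not path.is_file():
        # Fail early with the full path so a missing test file is easy to spot.
        raise FileNotFoundError(f"expected test file at {path}")
    return path
```

Inside a dag file, a call such as include_file(__file__, "sample.csv") would return the path to a hypothetical sample.csv stored in include. Resolving the path from the dag file's own location avoids hard-coding an absolute path that might differ between environments.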
Configure airflow_settings.yaml (Local development only)
The airflow_settings.yaml file includes a template with the default values for all possible configurations. To add a connection, variable, or pool, replace the default value with your own.
- Open the airflow_settings.yaml file and replace the default value with your own.
- Save the modified airflow_settings.yaml file in your code editor. If you use a Mac computer, for example, use Command-S.
- Import these objects to the Airflow UI. Run:
- In the Airflow UI, click Connections, Pools, or Variables to see your new or modified objects.
- Optional. To add another connection, pool, or variable, append it to this file within its corresponding section. To create another variable, add it under the existing variables section of the same file. For example:
Set environment variables locally
For local development, Astronomer recommends setting environment variables in your Astro project’s .env file. You can then push your environment variables from the .env file to a Deployment on Astro. To manage environment variables in the Astro UI, see Environment variables.
If your environment variables contain sensitive information or credentials that you don’t want to expose in plain text, add your .env file to .gitignore before you commit these changes to your version control tool.
- Open the .env file in your Astro project directory.
- Add your environment variables to the .env file or run astro deployment variable list --save to copy environment variables from an existing Deployment to the file.
  Use the following format when you set environment variables in your .env file:
  Environment variables should be in all-caps and not include spaces.
- Run the following command to confirm that your environment variables were applied locally:
  These commands output all environment variables that are running locally. This includes environment variables set on Astro Runtime by default.
- Optional. Run astro deployment variable create --load or astro deployment variable update --load to export environment variables from your .env file to a Deployment. You can view and modify the exported environment variables in the Astro UI page for your Deployment.
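The naming rule in the steps above (all caps, no spaces) can be checked with a short script before you start your local environment. The sketch below is not part of the Astro CLI. It assumes that each entry has the form KEY=VALUE and that blank lines and lines starting with # can be skipped, which is a common .env convention rather than something this guide states. The function name parse_env_lines is hypothetical.

```python
import re

# Assumption: keys are upper-case letters, digits, and underscores only.
_KEY_PATTERN = re.compile(r"^[A-Z_][A-Z0-9_]*$")


def parse_env_lines(lines):
    """Parse .env-style lines into a dict, raising on keys that break the rule."""
    variables = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        key, separator, value = line.partition("=")
        if not separator or not _KEY_PATTERN.match(key):
            raise ValueError(f"line {number}: invalid entry {raw!r}")
        variables[key] = value
    return variables
```

To check a real file, pass its lines to the function, for example parse_env_lines(open(".env").read().splitlines()). A ValueError points to the first line whose key is lower-case, contains a space, or has no equals sign.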
For local environments, the Astro CLI generates an airflow.cfg file at runtime based on the environment variables you set in your .env file. You can’t create or modify airflow.cfg in an Astro project.
To view your local environment variables in the context of the generated Airflow configuration, run:
These commands output the contents of the generated airflow.cfg file, which lists your environment variables as human-readable configurations with inline comments.
Use multiple .env files
The Astro CLI looks for .env by default, but if you want to specify multiple files, make .env a top-level directory and create sub-files within that folder.
A project with multiple .env files might look like the following:
Add Airflow plugins
If you need to build a custom view in the Airflow UI or build an application on top of the Airflow metadata database, you can use Airflow plugins. To use an Airflow plugin, add your plugin files to the plugins folder of your Astro project. To apply changes from this folder to a local Airflow environment, restart your local environment.
To learn more about Airflow plugins and how to build them, see Airflow Plugins in Airflow documentation or the Astronomer Airflow plugins guide.
Unsupported project configurations
You can’t use airflow.cfg or airflow_local_settings.py files in an Astro project. airflow_local_settings.py has no effect on Astro Deployments, and airflow.cfg has no effect on local environments or Astro Deployments.
An alternative to using airflow.cfg is to set Airflow environment variables in your .env file. See Set environment variables locally.
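Airflow reads configuration from environment variables named AIRFLOW__{SECTION}__{KEY}, with double underscores around the section name. The sketch below shows how such a name corresponds to an airflow.cfg section and option. The function name airflow_env_to_config is hypothetical, and the sketch ignores special suffixes such as _CMD and _SECRET.

```python
def airflow_env_to_config(name: str):
    """Split AIRFLOW__SECTION__KEY into ("section", "key").

    Returns None when the name does not follow the convention.
    """
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        return None
    # The first double underscore after the prefix separates section and key.
    section, separator, key = name[len(prefix):].partition("__")
    if not separator or not section or not key:
        return None
    return section.lower(), key.lower()
```

For example, airflow_env_to_config("AIRFLOW__CORE__PARALLELISM") returns ("core", "parallelism"), which corresponds to the parallelism option in the [core] section of airflow.cfg. A variable such as MY_VAR returns None because it is an ordinary environment variable and not an Airflow configuration override.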
See also
For more advanced project configurations, see: