Using the BashOperator
The BashOperator is one of the most commonly used operators in Airflow. It executes bash commands or a bash script from within your Airflow DAG.
In this guide you’ll learn:
- When to use the BashOperator.
- How to use the BashOperator and @task.bash decorator.
- How to use the BashOperator, including executing bash commands and bash scripts.
- How to run scripts in non-Python programming languages using the BashOperator.
Assumed knowledge
To get the most out of this guide, you should have an understanding of:
- Airflow operators. See Operators 101.
- Airflow decorators. See Introduction to the TaskFlow API and Airflow decorators.
- Basic bash commands. See the Bash Reference Manual.
How to use the BashOperator and @task.bash decorator
The BashOperator is part of core Airflow and can be used to execute a single bash command, a set of bash commands, or a bash script ending in .sh. The @task.bash decorator can be used to create bash statements using Python functions and is available as of Airflow 2.9.
The following parameters can be provided to the operator and decorator:
- bash_command: Defines a single bash command, a set of commands, or a bash script to execute. This parameter is required.
- env: Defines environment variables in a dictionary for the bash process. By default, the bash process sees only the variables in this dictionary and none of the existing environment variables from your Airflow environment. To change this behavior, you can set the append_env parameter. If you leave env unset, the BashOperator inherits the environment variables from your Airflow environment.
- append_env: Changes the behavior of the env parameter. If you set this to True, the environment variables you define in env are added to the existing environment variables instead of replacing them. The default is False.
- output_encoding: Defines the output encoding of the bash command. The default is utf-8.
- skip_exit_code: Defines which bash exit code should cause the BashOperator to enter a skipped state. The default is 99. Recent Airflow versions name this parameter skip_on_exit_code.
- cwd: Changes the working directory where the bash command is run. The default is None, in which case the bash command runs in a temporary directory.
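The interaction between env and append_env is easier to see in a small experiment. The following is a minimal plain-Python sketch that imitates the behavior described above with subprocess. It is not Airflow's implementation, and the names build_env, run_bash, DEMO_PARENT_VAR, and DEMO_CHILD_VAR are hypothetical.

```python
import os
import shutil
import subprocess

# Resolve bash once with the parent's PATH, so that a replaced
# environment without PATH can still launch it.
BASH = shutil.which("bash") or "/bin/bash"


def build_env(env=None, append_env=False):
    """Return the environment the bash process receives (simplified imitation)."""
    if env is None:
        # No env given: inherit everything from the parent process.
        return dict(os.environ)
    if append_env:
        # append_env=True: parent variables plus the provided ones.
        merged = dict(os.environ)
        merged.update(env)
        return merged
    # Default: only the provided variables are visible.
    return dict(env)


def run_bash(command, env=None, append_env=False):
    """Run a command with bash and return its stripped stdout."""
    completed = subprocess.run(
        [BASH, "-c", command],
        env=build_env(env, append_env),
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()


# A variable that exists only in the parent ("Airflow") environment.
os.environ["DEMO_PARENT_VAR"] = "from-parent"
probe = 'echo "${DEMO_PARENT_VAR:-unset} ${DEMO_CHILD_VAR:-unset}"'

inherited = run_bash(probe)
replaced = run_bash(probe, env={"DEMO_CHILD_VAR": "Ada"})
appended = run_bash(probe, env={"DEMO_CHILD_VAR": "Ada"}, append_env=True)
```

In this sketch, the parent variable is visible when env is unset or when append_env is True, and it disappears when env is provided on its own.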
The behavior of a BashOperator task is based on the status of the bash shell:
- Tasks succeed if the whole shell exits with an exit code of 0.
- Tasks are skipped if the exit code is 99 (unless otherwise specified in skip_exit_code).
- Tasks fail in case of all other exit codes.
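The exit code of the shell is the exit code of the last command it runs, so a failing sub-command followed by a successful one does not fail the task on its own. If you expect a non-zero exit from a sub-command, you can add the prefix set -e; to your bash command to make sure that the exit is captured as a task failure.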
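The exit code rules and the effect of set -e; can be checked outside Airflow. The sketch below assumes bash is installed and uses hypothetical helper names (task_state_for, run_and_classify). It maps exit codes to states in the way this section describes and does not reproduce Airflow's code.

```python
import shutil
import subprocess

BASH = shutil.which("bash") or "/bin/bash"


def task_state_for(exit_code, skip_exit_code=99):
    """Map a shell exit code to the task state described in this guide."""
    if exit_code == 0:
        return "success"
    if exit_code == skip_exit_code:
        return "skipped"
    return "failed"


def run_and_classify(command, skip_exit_code=99):
    """Run a command with bash and return (exit code, resulting state)."""
    completed = subprocess.run([BASH, "-c", command], capture_output=True, text=True)
    return completed.returncode, task_state_for(completed.returncode, skip_exit_code)


# Exit code 99 maps to the skipped state by default.
skipped = run_and_classify("exit 99")

# Without set -e, the failing `false` is masked by the successful `echo`.
masked = run_and_classify("false; echo done")

# With set -e, the shell stops at `false` and reports its exit code.
caught = run_and_classify("set -e; false; echo done")
```

The masked case is the one to watch for: the shell reports success even though a sub-command failed.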
Both the bash_command and the env parameter can accept Jinja templates. However, the input given through Jinja templates to bash_command is not escaped or sanitized. If you are concerned about potentially harmful user input, you can use the setup shown in the BashOperator documentation.
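To illustrate why unescaped input matters, the following plain-Python sketch compares placing an untrusted value directly into the command text with passing it through an environment variable. This is one common mitigation. Whether it matches the referenced setup exactly is an assumption, and the names run_bash, untrusted, and MESSAGE are hypothetical.

```python
import os
import shutil
import subprocess

BASH = shutil.which("bash") or "/bin/bash"


def run_bash(command, extra_env=None):
    """Run a command with bash and return its stdout as a list of lines."""
    env = dict(os.environ)
    env.update(extra_env or {})
    completed = subprocess.run(
        [BASH, "-c", command], env=env, capture_output=True, text=True
    )
    return completed.stdout.splitlines()


# Stand-in for a value that arrives through a template at runtime.
untrusted = "hello; echo INJECTED"

# Unsafe: the value becomes part of the command text, so the shell
# interprets the `;` and runs a second command.
unsafe_lines = run_bash(f"echo {untrusted}")

# Safer: the command text is fixed and the value arrives as data
# in an environment variable.
safe_lines = run_bash('echo "$MESSAGE"', extra_env={"MESSAGE": untrusted})
```

In the first call the shell runs the injected command. In the second, the whole value is printed as a single string.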
When to use the BashOperator
The following are common use cases for the BashOperator and @task.bash decorator in Airflow DAGs:
- Creating and running bash commands based on complex Python logic.
- Running a single or multiple bash commands in your Airflow environment.
- Running a previously prepared bash script.
- Running scripts in a programming language other than Python.
- Running commands to initialize tools that lack specific operator support, for example, Soda Core.
Example: Using Python to create bash commands
In Airflow 2.9+, you can use @task.bash to create bash statements using Python functions. This decorator is especially useful when you want to run bash commands based on complex Python logic, including inputs from upstream tasks. The following example demonstrates how to use the @task.bash decorator to conditionally run different bash commands based on the output of an upstream task.
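As a minimal sketch of that pattern, the function below holds only the Python logic that chooses the command. The function name, the threshold, and the messages are hypothetical. In a DAG on Airflow 2.9+, the assumption is that you would decorate such a function with @task.bash, so that the string it returns is executed as the bash command.

```python
def choose_bash_command(upstream_value: int) -> str:
    """Return the bash command to run for a given upstream result.

    In a DAG, this function would carry the @task.bash decorator and
    receive upstream_value from an upstream task (assumption).
    """
    if upstream_value > 5:
        return f'echo "The value {upstream_value} is greater than 5"'
    return f'echo "The value {upstream_value} is 5 or less"'


high_command = choose_bash_command(9)
low_command = choose_bash_command(2)
```

Keeping the branching in Python means the bash command itself stays a single, simple statement.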
Example: Execute two bash commands using one BashOperator
The BashOperator can execute any number of bash commands separated by &&.
In this example, you run two bash commands in a single task:
- echo Hello $MY_NAME! prints the environment variable MY_NAME to the console.
- echo $A_LARGE_NUMBER | rev 2>&1 | tee $AIRFLOW_HOME/include/my_secret_number.txt takes the environment variable A_LARGE_NUMBER, pipes it to the rev command, which reverses any input, and saves the result in a file called my_secret_number.txt located in the /include directory. The reversed number is also printed to the console.
The second command uses an environment variable from the Airflow environment, AIRFLOW_HOME. This is only possible because append_env is set to True.
It is also possible to use two separate BashOperators to run the two commands, which can be useful if you want to assign different dependencies to the tasks.
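One property of chaining with && is worth keeping in mind: each command runs only if the previous one succeeded. The following plain-Python sketch shows this with subprocess. It assumes bash is installed, the helper name run_chain is hypothetical, and the value of MY_NAME is a placeholder.

```python
import os
import shutil
import subprocess

BASH = shutil.which("bash") or "/bin/bash"


def run_chain(command, extra_env):
    """Run a chained command and return (stdout lines, exit code)."""
    env = dict(os.environ)
    env.update(extra_env)
    completed = subprocess.run(
        [BASH, "-c", command], env=env, capture_output=True, text=True
    )
    return completed.stdout.splitlines(), completed.returncode


# Both commands succeed, so both run.
all_ok = run_chain("echo Hello $MY_NAME! && echo second", {"MY_NAME": "Ada"})

# `false` fails, so the command after it is never executed and the
# chain exits with a non-zero code.
stops_early = run_chain(
    "echo Hello $MY_NAME! && false && echo never-printed", {"MY_NAME": "Ada"}
)
```

Because the chain exits with the code of the failing command, a failure anywhere in an && chain surfaces as a failed task.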
Example: Execute a bash script
The BashOperator can also be provided with a bash script (ending in .sh) to be executed.
For this example, you run a bash script which iterates over all files in the /include folder and prints their names to the console.
Make sure that your bash script (my_bash_script.sh in this example) is available to your Airflow environment. If you use the Astro CLI, you can make this file accessible to Airflow by placing it in the /include directory of your Astro project.
It is important to make the bash script executable, for example with chmod +x, before making the script available to your Airflow environment.
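To see what the executable bit changes, the sketch below creates a stand-in script in a temporary directory, sets the execute bits from Python (roughly what chmod +x does), and runs the script directly. The script body and the data.csv file are hypothetical stand-ins and not the script from this guide. The sketch assumes bash is installed and that the temporary directory allows execution.

```python
import stat
import subprocess
import tempfile
from pathlib import Path

# Hypothetical stand-in script: prints the name of every file in the
# directory passed as the first argument.
SCRIPT = """#!/usr/bin/env bash
for file in "$1"/*; do
  echo "$(basename "$file")"
done
"""

with tempfile.TemporaryDirectory() as tmp:
    include_dir = Path(tmp) / "include"
    include_dir.mkdir()
    script = include_dir / "my_bash_script.sh"
    script.write_text(SCRIPT)
    (include_dir / "data.csv").write_text("a,b\n")

    # A freshly written file has no execute bit.
    executable_before = bool(script.stat().st_mode & stat.S_IXUSR)

    # Add execute permission on top of the current mode.
    script.chmod(script.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    executable_after = bool(script.stat().st_mode & stat.S_IXUSR)

    # Run the script directly, which only works once it is executable.
    listing = subprocess.run(
        [str(script), str(include_dir)],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.split()
```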
If you use the Astro CLI, you can run this command before running astro dev start, or you can add it to your project’s Dockerfile as a RUN command.
Astronomer recommends running this command in your Dockerfile for production builds such as Astro Deployments or in production CI/CD pipelines.
After making the script available to Airflow, you only have to provide the path to the script in the bash_command parameter. Be sure to add a space character at the end of the filepath. Without it, Airflow treats a value ending in .sh as a Jinja template file to load, and the task fails with a Jinja exception.
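The trailing space works because Airflow decides whether to load a value as a template file based on how the string ends. The following is a simplified imitation of that check and not Airflow's code. It assumes .sh and .bash are the file extensions the BashOperator treats as templates. The function name and the script path are illustrative.

```python
# Extensions assumed to mark a bash_command value as a template file.
TEMPLATE_EXTENSIONS = (".sh", ".bash")


def is_treated_as_template_file(bash_command: str) -> bool:
    """Return True if the value would be looked up as a Jinja template file."""
    return bash_command.endswith(TEMPLATE_EXTENSIONS)


script_path = "$AIRFLOW_HOME/include/my_bash_script.sh"

# Ends in .sh: looked up as a template file, which can raise a Jinja exception.
without_space = is_treated_as_template_file(script_path)

# Ends in a space: treated as an ordinary command string.
with_space = is_treated_as_template_file(script_path + " ")
```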
Example: Run a script in another programming language
Using the BashOperator is a straightforward way to run a script in a non-Python programming language in Airflow. You can run a script in any language that can be run with a bash command.
In this example, you run some JavaScript to query a public API providing the current location of the International Space Station. The query result is pushed to XCom so that a second task can extract the latitude and longitude information in a script written in R and print the data to the console.
The following setup is required:
- Install the JavaScript and R language packages at the OS level.
- Write a JavaScript file.
- Write an R script file.
- Make the scripts available to the Airflow environment.
- Execute the files from within a DAG using the BashOperator.
If you use the Astro CLI, the programming language packages can be installed at the OS level by adding them to the packages.txt file of your Astro project.
The following JavaScript file contains code for sending a GET request to the /iss-now path at api.open-notify.org and returning the results to stdout, where the BashOperator both prints them to the console and pushes them to XCom.
The second task runs a script written in R that uses a regex to filter and print the longitude and latitude information from the API response.
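The regex step can be illustrated in Python for readers who don't use R. The sketch below is a Python analogue and not the R script from this guide. The shape of the sample response is an assumption about the API's JSON output, the coordinate values are made up, and the names COORD_PATTERN and extract_position are hypothetical.

```python
import re

# Made-up sample in the assumed shape of the API response.
SAMPLE_RESPONSE = (
    '{"message": "success", "timestamp": 1700000000, '
    '"iss_position": {"latitude": "12.3456", "longitude": "-65.4321"}}'
)

# Capture the key name and its quoted numeric value.
COORD_PATTERN = re.compile(r'"(latitude|longitude)":\s*"(-?\d+(?:\.\d+)?)"')


def extract_position(response_text):
    """Return (latitude, longitude) as floats from the raw response text."""
    found = dict(COORD_PATTERN.findall(response_text))
    return float(found["latitude"]), float(found["longitude"])


position = extract_position(SAMPLE_RESPONSE)
```

For a JSON response, a JSON parser would usually be more robust than a regex. The regex is used here only to mirror the approach the guide describes.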
To run these scripts using the BashOperator, ensure that they are accessible to your Airflow environment. If you use the Astro CLI, you can place these files in the /include directory of your Astro project.
The DAG uses the BashOperator to execute both files defined above sequentially.