You can use our Docker images however you'd like, subject to standard Apache 2.0 restrictions.
You may also be interested in our product, which comes as both a managed SaaS (Cloud Edition) and an on-prem offering that deploys to your Kubernetes cluster (Enterprise Edition). Both editions make it easy to deploy Apache Airflow clusters and Airflow DAGs in a multi-team, multi-user environment.
The only requirements to get our Docker images up and running are Docker Engine and Docker Compose. If you don't have these installed already, visit these links for more information:
To get up and running quickly, we provide several docker-compose files that spin up different components of the platform. Simply run `docker-compose up`. Some directories include additional scripts that wrap useful functionality around `docker-compose`; these are documented on their respective pages.

`docker-compose up` will download our pre-built images from DockerHub (see the Makefile for details) and spin up containers running the various platform systems.
Building the Images
All platform images are built from a minimal Alpine Linux base image to keep our footprint small and secure.
To build the images from scratch, run `make build` in your terminal. If you've already downloaded the images from DockerHub, this will replace them. These images will be used when running the platform.
Building the Documentation
Documentation is built on Jekyll and hosted on Google Cloud Storage.
Build the docs site locally:

```shell
cd docs
bundle install
bundle exec jekyll serve
```
The Astronomer Airflow module consists of seven components, and you must bring your own Postgres and Redis databases, as well as a container deployment strategy for your cloud.
```shell
git clone https://github.com/astronomer/astronomer.git
cd astronomer
```
We provide two examples for Apache Airflow. Each will spin up a handful of containers to mimic a live Astronomer environment.
Airflow Core vs Airflow Enterprise
Here's a comparison of the components included in the Airflow Core vs. Airflow Enterprise examples:
| Component | Airflow Core | Airflow Enterprise |
|-----------|--------------|--------------------|
To start the simple Airflow example:
```shell
cd examples/airflow-core
docker-compose up
```
To start the more sophisticated Airflow example:
```shell
cd examples/airflow-enterprise
docker-compose up
```
You're up and running with Apache Airflow. The following sections will help you get started with your first pipelines, or get your existing pipelines running on the Astronomer Platform.
Start from Scratch
You need to write your first DAG. Review:
We recommend managing your DAGs in a Git repo, but to get rolling, just make a directory on your machine containing a `dags` directory, and copy the sample DAG from the link above into that folder in a file named `test_dag.py`. We typically advise testing locally on your machine first, before pushing changes to your staging environment. Once fully tested, you can deploy to your production instance.
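If it helps to see the shape of one, here is a minimal sketch of what `test_dag.py` might look like, using Airflow's standard `BashOperator`. The owner, dates, schedule, and command are placeholders, not anything the platform requires:

```python
# dags/test_dag.py -- minimal example DAG; all values below are placeholders
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "airflow",
    "start_date": datetime(2018, 1, 1),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    dag_id="test_dag",
    default_args=default_args,
    schedule_interval="@daily",
)

# A single task is enough to confirm the scheduler picks up the DAG.
say_hello = BashOperator(
    task_id="say_hello",
    bash_command='echo "hello from airflow"',
    dag=dag,
)
```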
When you're ready to commit new source or destination hooks/operators, our best practice is to keep each plugin in its own repository.
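As a rough sketch of what one of those plugin repositories might contain (the operator, its fields, and the plugin name here are all hypothetical), a plugin module typically subclasses `AirflowPlugin` and registers the hooks/operators it ships:

```python
# my_service_plugin.py -- hypothetical plugin module; names are illustrative only
import logging

from airflow.models import BaseOperator
from airflow.plugins_manager import AirflowPlugin
from airflow.utils.decorators import apply_defaults


class MyServiceOperator(BaseOperator):
    """Toy operator standing in for a real source/destination operator."""

    @apply_defaults
    def __init__(self, endpoint, *args, **kwargs):
        super(MyServiceOperator, self).__init__(*args, **kwargs)
        self.endpoint = endpoint

    def execute(self, context):
        # A real operator would pull from / push to self.endpoint here.
        logging.info("Would talk to %s", self.endpoint)


class MyServicePlugin(AirflowPlugin):
    # Airflow exposes registered operators under airflow.operators.<plugin name>.
    name = "my_service_plugin"
    operators = [MyServiceOperator]
```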
Start from Existing Code
If you already have an Airflow project (an Airflow home directory), getting things running on Astronomer is straightforward. Within `examples/airflow`, we provide a `start` script that wires up a few things to help you develop on Airflow quickly.
You'll also notice a small `.env` file next to the `docker-compose.yml` file. This file is automatically sourced by docker-compose, and its variables are interpolated into the service definitions in the `docker-compose.yml` file. If you run `docker-compose up`, like we did above, we mount volumes into your `/tmp` directory for Postgres and Redis. This will automatically be cleaned up for you.
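To make that interpolation concrete (the variable name and host path below are invented for illustration; they are not the actual contents of our `.env`), docker-compose reads `KEY=value` lines from `.env` and substitutes `${KEY}` references in `docker-compose.yml`:

```
# .env (illustrative)
POSTGRES_DATA_DIR=/tmp/astro-postgres

# docker-compose.yml (illustrative excerpt)
services:
  postgres:
    volumes:
      - ${POSTGRES_DATA_DIR}:/var/lib/postgresql/data
```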
Mounting volumes into `/tmp` is also the behavior if you run `./start` with no arguments. If you want to load your own Airflow project into this system, just provide the project's path as an argument to `./start`.
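For example, assuming your existing Airflow project lives at `~/airflow-home` (a placeholder path; substitute your own):

```shell
./start ~/airflow-home
```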
Under the hood, a few things make this work:

- `.dockerignore` files are written into your project directory.
- An `.astro` directory is created.
- `Dockerfile.astro` just links to a special `onbuild` version of our Airflow image that will automatically add certain files within the `.astro` directory to the image.
- The `.astro` directory will contain a `data` directory, which is used for mapping Docker volumes for Postgres and Redis. This lets you persist your current Airflow state between shutdowns. These files are automatically ignored by the generated `.dockerignore`.
- The `.astro` directory will also contain a `requirements.txt` file where you can add Python packages to be installed using `pip`. We will automatically build and install them when the containers are restarted.
- In some cases, Python modules will need to compile native extensions and/or rely on other packages that exist outside of the Python ecosystem. For this, we also provide a `packages.txt` file in the `.astro` directory, where you can add Alpine packages. The format is similar to `requirements.txt`, with one package on each line (see the example after this list).
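As a rough illustration (the package names below are placeholders, not anything the platform depends on), `.astro/requirements.txt` takes one pip requirement per line:

```
requests==2.18.4
```

and `.astro/packages.txt` takes one Alpine (`apk`) package name per line, for example a build toolchain for compiling native extensions:

```
build-base
```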
With this configuration, you can point the `./start` script at any Airflow home directory and maintain distinct, separate environments for each, allowing you to easily test different Airflow projects in isolation.