Use Git submodules with an Astro project

When your Astro project depends on code maintained in a separate repository, Git submodules let you include that external repository as a nested folder inside your Astro project. This is useful when you want to keep shared code, such as a dbt project or a Python utility library, in its own repository while still deploying it as part of your Astro project.

This document covers how to add a Git submodule to an Astro project, configure CI/CD to deploy with submodules, and test your setup locally.

When to use Git submodules

Use Git submodules when:

  • A separate team maintains code that your Dags depend on, and that code lives in its own repository.
  • You want to pin your Astro project to a specific version of an external repository.
  • You need to combine code from multiple repositories into a single Astro project before deployment.

A common example is including an external dbt project as a submodule so that Cosmos can orchestrate dbt models as Airflow tasks.

Git submodules add complexity to your development and deployment workflows. If the external code changes infrequently or is small, consider copying the code directly into your Astro project instead.

Prerequisites

  • The Astro CLI.
  • An Astro project.
  • Git installed on your local computer.
  • A remote Git repository containing the code you want to include as a submodule.

Add a submodule to your Astro project

1. Add the submodule

Run the following command from your Astro project root folder to add the external repository as a submodule:

$ git submodule add <repository-url> <folder-name>

Replace <repository-url> with the URL of the external repository and <folder-name> with the name of the folder where the submodule code appears in your project. For example, to add a dbt project called jaffle-shop:

$ git submodule add https://github.com/jessicaschueler/jaffle-shop-classic jaffle-shop

This command creates a .gitmodules file in your project root and clones the external repository into the specified folder. The .gitmodules file tracks the submodule configuration:

.gitmodules
[submodule "jaffle-shop"]
    path = jaffle-shop
    url = https://github.com/jessicaschueler/jaffle-shop-classic
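
Because .gitmodules uses an INI-like layout, a script can read it to find out which folders are submodules. The following is a minimal sketch, assuming Python's standard configparser module handles the file as shown; the function name list_submodules is hypothetical and not part of the Astro CLI or Git.

```python
import configparser

# Sample .gitmodules content, matching the jaffle-shop example.
GITMODULES_TEXT = """\
[submodule "jaffle-shop"]
    path = jaffle-shop
    url = https://github.com/jessicaschueler/jaffle-shop-classic
"""


def list_submodules(gitmodules_text):
    """Return a dict mapping each submodule name to its path and URL."""
    # Disable interpolation so that '%' characters in URLs are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)

    submodules = {}
    prefix = 'submodule "'
    for section in parser.sections():
        # Section headers look like: submodule "jaffle-shop"
        if not section.startswith(prefix):
            continue
        name = section[len(prefix):-1]  # drop the prefix and the closing quote
        submodules[name] = {
            "path": parser.get(section, "path"),
            "url": parser.get(section, "url"),
        }
    return submodules


print(list_submodules(GITMODULES_TEXT))
```

This sketch covers only the simple layout that git submodule add writes. For anything more involved, git config -f .gitmodules is the more reliable way to read the file.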

2. Commit the submodule

After you add the submodule, commit the changes to your repository:

$ git add .gitmodules <folder-name>
$ git commit -m "Add <folder-name> as a submodule"

3. Verify the submodule

Confirm the submodule is correctly linked by running the following command:

$ git submodule status

The output displays the commit hash of the submodule and its folder path.
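
Git also prefixes each line of this output with a single character that describes the submodule's state: a space when the checked-out commit matches the recorded one, - when the submodule is not initialized, + when the checked-out commit differs from the recorded one, and U when there are merge conflicts. The sketch below shows one way to interpret a line of output. The function and class names are hypothetical, the commit hash is a placeholder, and the parsing assumes folder names without spaces.

```python
from typing import NamedTuple

# Meaning of the first character of each `git submodule status` line.
STATE_BY_PREFIX = {
    " ": "in sync",
    "-": "not initialized",
    "+": "checked-out commit differs from the recorded commit",
    "U": "merge conflicts",
}


class SubmoduleStatus(NamedTuple):
    state: str
    commit: str
    path: str


def parse_status_line(line):
    """Split one status line into its state, commit hash, and folder path."""
    prefix, rest = line[0], line[1:]
    # The remainder is "<commit> <path>", optionally followed by "(<describe>)".
    parts = rest.split()
    commit, path = parts[0], parts[1]
    return SubmoduleStatus(STATE_BY_PREFIX.get(prefix, "unknown"), commit, path)


# Placeholder hash, for illustration only.
sample = "-0123456789abcdef0123456789abcdef01234567 jaffle-shop"
print(parse_status_line(sample).state)
```

A line that starts with - usually means the submodule still needs to be initialized and fetched.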

Example: dbt project with Cosmos

The cosmos-demo-submod repository demonstrates using a Git submodule to include an external dbt project in an Astro project. In this example, the jaffle-shop dbt project is included as a submodule and orchestrated with Cosmos.

The project structure looks like the following:

astro-project/
├── .gitmodules
├── Dockerfile
├── dags/
│   └── basic_cosmos_dag.py
├── jaffle-shop/            # Git submodule (external dbt project)
│   ├── models/
│   ├── seeds/
│   └── dbt_project.yml
├── requirements.txt
└── packages.txt

The Dockerfile installs dbt into a virtual environment so that Cosmos can run dbt models:

Dockerfile
FROM quay.io/astronomer/astro-runtime:8.8.0

RUN python -m venv dbt_venv && source dbt_venv/bin/activate && \
    pip install --no-cache-dir dbt-postgres==1.5.4 && deactivate

The requirements.txt file includes the Cosmos package:

requirements.txt
astronomer-cosmos>=1.0.2

Clone an Astro project that contains submodules

When you clone a repository that contains submodules, the submodule folders are empty by default. To initialize and fetch the submodule contents, use one of the following methods:

  • Clone the repository with the --recurse-submodules flag:

    $ git clone --recurse-submodules <repository-url>
  • If you already cloned the repository without the flag, initialize the submodules manually:

    $ git submodule init
    $ git submodule update

Update a submodule to the latest commit

When the external repository has new changes that you want to pull into your Astro project, update the submodule:

1. Pull the latest changes

Run the following command from your Astro project root folder:

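$ git submodule update --remote <folder-name>

This updates the submodule to the latest commit on the branch it tracks. Unless you configure a branch in .gitmodules, Git uses the external repository's default branch.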

2. Commit the updated reference

After you update the submodule, your Astro project repository tracks a new commit hash for it. Commit this change:

$ git add <folder-name>
$ git commit -m "Update <folder-name> submodule to latest"

Configure CI/CD for submodules

When you deploy an Astro project that contains submodules, your CI/CD pipeline must clone the submodule contents before running astro deploy. Without this step, the submodule folders are empty and the deploy fails.

GitHub Actions

Add the submodules option to your checkout step:

.github/workflows/deploy.yml
steps:
  - name: Checkout repository
    uses: actions/checkout@v4
    with:
      submodules: recursive

If your submodule is in a private repository, the checkout step needs credentials that can read it, such as a personal access token (PAT) or a deploy key. The following example passes a PAT stored in a secret named GIT_PAT:

.github/workflows/deploy.yml
steps:
  - name: Checkout repository
    uses: actions/checkout@v4
    with:
      submodules: recursive
      token: ${{ secrets.GIT_PAT }}

GitLab CI/CD

Set the GIT_SUBMODULE_STRATEGY variable in your .gitlab-ci.yml file:

.gitlab-ci.yml
variables:
  GIT_SUBMODULE_STRATEGY: recursive

Other CI/CD tools

For other CI/CD tools, run the following commands after cloning the repository and before deploying:

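$ git submodule update --init --recursive

The --init flag initializes each submodule before fetching it, including submodules nested inside other submodules.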
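
Whichever CI/CD tool you use, a pipeline can check that the submodule folders actually contain files before it deploys. The following is a rough sketch of such a check, not an Astro CLI feature. The function name find_empty_submodules is hypothetical, and the sketch assumes the simple .gitmodules layout that git submodule add writes.

```python
import configparser
from pathlib import Path


def find_empty_submodules(project_root):
    """Return the submodule paths in .gitmodules that are missing or empty."""
    root = Path(project_root)
    gitmodules = root / ".gitmodules"
    if not gitmodules.is_file():
        return []  # The project declares no submodules.

    # Disable interpolation so that '%' characters in URLs are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(gitmodules, encoding="utf-8")

    empty = []
    for section in parser.sections():
        if not section.startswith("submodule "):
            continue
        path = parser.get(section, "path")
        folder = root / path
        # A submodule that was never fetched is an empty (or absent) folder.
        if not folder.is_dir() or not any(folder.iterdir()):
            empty.append(path)
    return empty
```

A pipeline step could call find_empty_submodules(".") from the project root and stop with a non-zero exit code when the returned list is not empty, so that a missing checkout is caught before the deploy runs.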

Pin a submodule to a specific branch

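By default, your Astro project records a specific commit for each submodule, and git submodule update --remote pulls from the external repository's default branch. To make it follow a different branch, run the following command:

$ git config -f .gitmodules submodule.<folder-name>.branch <branch-name>

This command adds a branch entry for the submodule to .gitmodules. After setting the branch, run git submodule update --remote to pull the latest commit from that branch, then commit the updated .gitmodules file and submodule reference. Your Astro project keeps using the recorded commit until you pull and commit a newer one.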

Troubleshoot submodules

Empty submodule folder after cloning

If the submodule folder is empty after you clone the Astro project, run:

$ git submodule init && git submodule update

Submodule points to the wrong commit

If the submodule shows an unexpected version of the code, check the commit hash with git submodule status. To restore the commit that your Astro project records, run git submodule update <folder-name>. To move the submodule to the latest commit in the external repository instead, run:

$ git submodule update --remote <folder-name>
$ git add <folder-name>
$ git commit -m "Update <folder-name> to latest commit"

CI/CD deploy fails with missing files

Ensure your CI/CD pipeline includes a recursive submodule checkout. See Configure CI/CD for submodules.
