Database architecture

Astro Private Cloud (APC) uses PostgreSQL as the primary database system for both platform management (Houston) and Apache Airflow deployments. The following sections describe the database architecture, provisioning, connection pooling, and high availability configuration.

Database components

Houston database

The Houston API uses a PostgreSQL database to store platform configuration and state.

Key tables:

  • User - System users with authentication credentials
  • Workspace - Organizational units for grouping deployments
  • Deployment - Airflow deployment configurations
  • Cluster - Data plane cluster properties including cloud provider, region, and configuration
  • RoleBinding - User and service account role assignments
  • ServiceAccount - API keys for programmatic access
  • DeployRevision - Deployment version history
  • TaskUsage - Task execution metrics

Airflow metadata database

Each Airflow Deployment gets its own isolated PostgreSQL database containing two schemas:

| Schema | Purpose |
| --- | --- |
| `airflow` | Airflow scheduler and webserver metadata |
| `celery` | Celery task results backend |

The database name is derived from the Deployment release name:

deployment-name → deployment_name_airflow

Each schema has a separate database user with permissions scoped to that schema. Commander provisions the database and both schemas automatically when you create a Deployment. For more information about what Airflow stores in the metadata database, see Understanding the Airflow metadata database.
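The naming rule above can be sketched in a few lines. The helper name is illustrative only and not part of Commander:

```python
def airflow_db_name(release_name: str) -> str:
    # Hyphens aren't valid in unquoted PostgreSQL identifiers, so the
    # release name's hyphens become underscores before the suffix is added.
    return release_name.replace("-", "_") + "_airflow"

print(airflow_db_name("deployment-name"))  # deployment_name_airflow
```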

Database provisioning

Automatic provisioning

By default, Commander automatically provisions databases for new Deployments. No additional configuration is required.

External database

To use pre-existing or managed databases, set skipAirflowDatabaseProvisioning to true in the upsertDeployment mutation:

```graphql
mutation {
  upsertDeployment(
    workspaceId: "<workspace-id>"
    label: "<my-deployment-label>"
    skipAirflowDatabaseProvisioning: true
  ) {
    id
  }
}
```

When using external databases, provide the connection string in your Deployment configuration. For complete setup steps with connection string examples, see Bring your own Airflow database. For supported PostgreSQL versions, see the Apache Airflow database backend documentation.

Connection pooling (PgBouncer)

APC uses PgBouncer for connection pooling to reduce database connection overhead. Airflow can open many database connections due to its distributed nature, and each PostgreSQL connection creates a dedicated OS process. PgBouncer reduces this overhead by maintaining a pool of reusable server connections. For more detail on why PgBouncer is recommended, see the Apache Airflow Helm chart production guide.

PgBouncer is only used with the embedded PostgreSQL database. When you provide a metadataConnectionString URI through the upsertDeployment mutation, Airflow connects directly to the external database and bypasses PgBouncer. If you use an external database and need connection pooling, configure it through your managed database service or deploy a separate PgBouncer instance.

PgBouncer operates in transaction pool mode by default, which means server connections are returned to the pool after each transaction completes. This mode does not support session-level features such as prepared statements or SET commands that persist across transactions.
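To make the transaction-pooling behavior concrete, the following toy simulation (not PgBouncer code) shows why session state set in one transaction is not guaranteed to be visible in the next: each transaction leases whichever server connection is free, so two consecutive transactions from the same client may run on different connections.

```python
class TransactionPool:
    """Toy model of PgBouncer's transaction pooling mode: a server
    connection is leased for one transaction, then returned to the pool."""

    def __init__(self, size: int):
        self.free = [{"id": i, "session_vars": {}} for i in range(size)]

    def run_transaction(self, fn):
        conn = self.free.pop(0)        # lease any free server connection
        try:
            return fn(conn)
        finally:
            self.free.append(conn)     # returned to the pool; another
                                       # transaction may get a different one

pool = TransactionPool(size=2)
# Transaction 1 sets a session-level variable on whichever connection it got.
pool.run_transaction(lambda c: c["session_vars"].update(search_path="myschema"))
# Transaction 2 lands on a different server connection, so the session
# state from transaction 1 is not visible.
seen = pool.run_transaction(lambda c: c["session_vars"].get("search_path"))
print(seen)  # None
```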

Configuration

```yaml
pgbouncer:
  enabled: true
  port: 6543
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 256Mi
      cpu: 250m
```

Pool sizes

Pool sizes are configured per Deployment in the Airflow chart values:

```yaml
pgbouncer:
  metadataPoolSize: 10      # Airflow metadata connections per pool
  resultBackendPoolSize: 5  # Celery result backend connections per pool
  maxClientConn: 100        # Maximum client connections to PgBouncer
```

These defaults match the Apache Airflow Helm chart defaults and work for most Deployments. To determine whether you need to adjust pool sizes, monitor PgBouncer’s cl_waiting metric. If clients consistently wait for connections, increase pool sizes or scale PgBouncer replicas.
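The guidance above can be expressed as a toy decision rule. The function and sampling approach are illustrative assumptions, not an Astronomer tool; the metric name mirrors PgBouncer's `SHOW POOLS` output, and collecting the samples is out of scope here:

```python
def suggest_pool_size(current_size: int, cl_waiting_samples: list) -> int:
    """Grow the pool only when clients are *consistently* waiting,
    i.e. every recent cl_waiting sample is non-zero."""
    if cl_waiting_samples and all(n > 0 for n in cl_waiting_samples):
        return current_size * 2
    return current_size

print(suggest_pool_size(10, [3, 1, 2]))  # 20 — sustained waiting, grow
print(suggest_pool_size(10, [0, 0, 1]))  # 10 — transient spike, keep as-is
```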

Airflow also maintains its own SQLAlchemy connection pool (default pool_size: 5, max_overflow: 10) between each Airflow component and PgBouncer. With these defaults, each Airflow process can open up to 15 simultaneous database connections. The full connection chain is: Airflow component → SQLAlchemy pool → PgBouncer → PostgreSQL.
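The arithmetic behind these limits is worth checking against `maxClientConn`. The component counts below are hypothetical, for illustration only:

```python
# SQLAlchemy defaults from the text: pool_size=5, max_overflow=10.
pool_size, max_overflow = 5, 10
per_process = pool_size + max_overflow  # up to 15 connections per process

# Hypothetical component counts for one Deployment.
components = {"scheduler": 1, "webserver": 1, "workers": 4}
peak_client_conns = sum(components.values()) * per_process
print(peak_client_conns)  # 90 — compare against pgbouncer.maxClientConn (100)
```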

Kerberos support

PgBouncer in APC includes Kerberos support:

```yaml
pgbouncer:
  image:
    repository: quay.io/astronomer/ap-pgbouncer-krb
    tag: 1.25.0-3
```

For complete Kerberos database setup instructions, see Configure Kerberos authentication for Airflow databases.

High availability

PostgreSQL replication

Astronomer doesn’t recommend using the internal PostgreSQL instance in production. If you need production database resiliency, use an externally managed database service such as Amazon RDS, Google Cloud SQL, or Azure Database for PostgreSQL. Managed services provide built-in replication, automated failover, and backup capabilities that are more reliable than running replication within the Helm chart.

Replication is disabled by default. The following example shows a recommended production configuration that enables streaming replication with synchronous commit:

```yaml
postgresql:
  replication:
    enabled: true
    slaveReplicas: 2
    synchronousCommit: "on"
    numSynchronousReplicas: 1
    user: repl_user
```

Persistence

```yaml
postgresql:
  persistence:
    enabled: true
    size: 8Gi
    storageClass: ""  # Use default storage class
```

The following table compares the chart default values with starting points that Astronomer recommends for production environments. The Default column reflects the values shipped in the Helm chart. The Recommended for production column reflects general guidance based on Astronomer operational experience. Adjust these values based on the number of Deployments, workload scale, and observed resource utilization in your environment.

For additional production guidance, see the Apache Airflow Helm chart production guide. Astronomer also recommends using a managed database service such as Amazon RDS, Google Cloud SQL, or Azure Database for PostgreSQL rather than the embedded PostgreSQL container for production workloads.

PostgreSQL

| Setting | Default | Recommended for production |
| --- | --- | --- |
| `postgresql.resources.requests.cpu` | 250m | 1000m. Adjust based on query load |
| `postgresql.resources.requests.memory` | 256Mi | 2Gi. Adjust based on active connection count |
| `postgresql.resources.limits.cpu` | 1000m | 2000m. Adjust based on query load |
| `postgresql.resources.limits.memory` | 2Gi | 4Gi. Adjust based on active connection count |
| `postgresql.persistence.enabled` | true | true |
| `postgresql.persistence.size` | 8Gi | 50Gi or more. Scale based on the number of Deployments and retention policy |
| `postgresql.replication.enabled` | false | true |
| `postgresql.replication.slaveReplicas` | 1 | 2 |
| `postgresql.replication.synchronousCommit` | "off" | "on" |
| `postgresql.replication.numSynchronousReplicas` | 0 | 1 |

PgBouncer (platform level)

| Setting | Default | Recommended for production |
| --- | --- | --- |
| `pgbouncer.resources.requests.cpu` | 250m | 250m |
| `pgbouncer.resources.requests.memory` | 256Mi | 256Mi |
| `pgbouncer.resources.limits.cpu` | 250m | 500m. Adjust based on connection throughput |
| `pgbouncer.resources.limits.memory` | 256Mi | 512Mi. Adjust based on connection count |

PgBouncer (per Deployment)

These values match the Apache Airflow Helm chart defaults. No official formula ties pool sizes to a specific workload metric such as Dag count. Instead, adjust pool sizes based on observed connection utilization.

| Setting | Default | Recommended for production |
| --- | --- | --- |
| `pgbouncer.metadataPoolSize` | 10 | 10. Increase if cl_waiting is consistently non-zero |
| `pgbouncer.resultBackendPoolSize` | 5 | 5. Increase for Deployments with high Celery task throughput |
| `pgbouncer.maxClientConn` | 100 | 100. Increase if Deployments run many concurrent Airflow components |

Backup and recovery

For database backup and restore procedures, including size estimation, pg_dump commands, and restore steps, see Access Airflow database.

Connection strings

Houston database

The Houston database connection depends on whether you use an external database or the in-cluster PostgreSQL. You configure this during control plane installation.

For external databases, provide the connection through houston.backendConnection in your Helm values:

1houston:
2 backendConnection:
3 user: <username>
4 pass: <password>
5 host: <external-database-host>
6 port: 5432
7 db: <database-name>

Alternatively, provide a pre-existing Kubernetes secret name through houston.backendSecretName. The secret must contain a connection key with the full connection URI.

To retrieve the active Houston database connection string, read the astronomer-bootstrap secret:

```bash
kubectl get secret -n astronomer astronomer-bootstrap \
  -o jsonpath='{.data.connection}' | base64 -d
```

The in-cluster PostgreSQL option (global.postgresqlEnabled: true) is only for development or proof-of-concept environments and is not supported in production.

Airflow database

Both the metadata and result backend connections use the same database with different schemas and credentials:

```text
# Metadata connection (airflow schema)
postgresql://<metadata-user>:<password>@<host>:5432/<deployment>_airflow

# Result backend connection (celery schema)
postgresql://<celery-user>:<password>@<host>:5432/<deployment>_airflow
```

When PgBouncer is enabled, Airflow components connect through PgBouncer on port 6543 instead of directly to PostgreSQL on port 5432.
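A small helper can assemble these URIs while percent-encoding credentials, which matters when passwords contain URI-reserved characters. The hostname, user, and deployment name below are made up for illustration:

```python
from urllib.parse import quote

def conn_uri(user: str, password: str, host: str,
             deployment: str, port: int = 5432) -> str:
    # Percent-encode credentials so reserved characters (@, :, etc.)
    # survive inside the URI; the database name follows the
    # release-name-to-underscores convention described earlier.
    db = deployment.replace("-", "_") + "_airflow"
    return f"postgresql://{quote(user)}:{quote(password)}@{host}:{port}/{db}"

# Port 6543 assumes the connection goes through PgBouncer.
print(conn_uri("meta_user", "p@ss", "pgbouncer.example.internal",
               "etl-prod", port=6543))
# postgresql://meta_user:p%40ss@pgbouncer.example.internal:6543/etl_prod_airflow
```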

To retrieve the connection string for a specific Deployment, see Access Airflow database.

Monitoring

Monitor the following metrics to track database and connection pool health:

| Metric | Description |
| --- | --- |
| `pg_stat_activity` | Active database connections |
| `pg_database_size` | Database size on disk |
| `pgbouncer_pools_*` | PgBouncer connection pool statistics, including cl_waiting |

Best practices

  • Enable replication for production Deployments.
  • Use PgBouncer to manage connection overhead.
  • Monitor connection pools for cl_waiting and database size growth.
  • Configure regular backups. See Access Airflow database.
  • Size pools based on observed connection utilization, not Dag count.
  • Use separate credentials for each Deployment.
  • Enable SSL for database connections in production.
  • Use a managed database service for production workloads instead of the in-cluster PostgreSQL container.