Debug an Astro Private Cloud migration

Use this guide when your Astro Private Cloud (APC) control plane or data plane Pods are not progressing to a healthy state after installation.

JetStream Pods stuck in Pending or CrashLoopBackOff

  • Cause: STAN volume claims or PVCs not deleted before migration.
  • Solution: Delete existing PVCs for STAN manually with the following code:
1kubectl delete pvc -l app.kubernetes.io/name=stan

prisma migrate deploy or database migration job timing out

  • Cause: Database locks or concurrent transactions, often because of Prometheus or monitoring jobs.
  • Solution:
    • Check for locks:
    SELECT * FROM pg_locks pl JOIN pg_stat_activity psa ON pl.pid = psa.pid;
    • Terminate long-running transactions:
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE state = 'active' AND now() - query_start > interval '5 minutes';

Houston Worker logs show NATS: connection timeout

  • Cause: Houston tries to connect to STAN before JetStream is ready.
  • Solution: Restart the Houston Worker Pod to reconnect to JetStream:
1kubectl rollout restart deploy/<release-name>-houston-worker

Airflow 3 Pods fail with ModuleNotFoundError: airflow.providers.cncf

  • Cause: Cluster configuration override missing or not applied.
  • Solution:
    • Reapply the override function in the Migration guide.
    • Ensure minimumAstroRuntimeVersion is set to 3.1-2 or higher in the override config.

Airflow deployments not coming up due to postgres connection errors after upgrade (missing database name)

  • Cause: The astronomer-bootstrap secret connection string does not include a database name suffix (for example, /main on AWS RDS or /postgres on AKS), causing connection failures. Verify the correct database name by logging in to your database.
  • Solution:
    • Patch the astronomer-bootstrap secret so connection ends with /<database_name>, then run a Helm upgrade.
    • The secret is in the astronomer namespace (or the namespace where you install APC).
    • In CP/DP mode, patch in both the Control Plane and Data Plane clusters.
$# Namespace where APC is installed
>NAMESPACE=astronomer
># Database name suffix; for AWS the default database is usually "main", for AKS "postgres"
>DB_NAME=main
>
># Read current connection string, append database name, and update the secret
>CURRENT=$(kubectl -n "$NAMESPACE" get secret astronomer-bootstrap -o jsonpath='{.data.connection}' | base64 -d)
>NEW="${CURRENT%/}/$DB_NAME"
>kubectl -n "$NAMESPACE" patch secret astronomer-bootstrap --type=merge -p "{\"data\":{\"connection\":\"$(printf '%s' "$NEW" | base64)\"}}"
>
># Apply the update with a Helm upgrade
>helm upgrade -f values.yaml -n "$NAMESPACE" astronomer astronomer/astronomer

Partially applied database migration

  • Cause: Prisma migration interrupted during the JetStream upgrade.
  • Solution:
    • Run migration manually.
    1npx prisma migrate deploy
    • Verify the Cluster and Deployment tables exist.