Debug an Astro Private Cloud migration

Use this guide when your Astro Private Cloud (APC) control plane or data plane Pods are not progressing to a healthy state after installation.

JetStream Pods stuck in Pending or CrashLoopBackOff

  • Cause: STAN volume claims or PVCs not deleted before migration.
  • Solution: Delete existing PVCs for STAN manually with the following code:
1kubectl delete pvc -l app.kubernetes.io/name=stan

prisma migrate deploy or database migration job timing out

  • Cause: Database locks or concurrent transactions, often because of Prometheus or monitoring jobs.
  • Solution:
    • Check for locks:
    SELECT * FROM pg_locks pl JOIN pg_stat_activity psa ON pl.pid = psa.pid;
    • Terminate long-running transactions:
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE state = 'active' AND now() - query_start > interval '5 minutes';

Houston Worker logs show NATS: connection timeout

  • Cause: Houston tries to connect to STAN before JetStream is ready.
  • Solution: Restart the Houston Worker Pod to reconnect to JetStream:
1kubectl rollout restart deploy/<release-name>-houston-worker

Airflow 3 Pods fail with ModuleNotFoundError: airflow.providers.cncf

  • Cause: Cluster configuration override missing or not applied.
  • Solution:
    • Reapply the override function in the Migration guide.
    • Ensure minimumAstroRuntimeVersion is set to 3.1-2 or higher in the override config.

Partially applied database migration

  • Cause: Prisma migration interrupted during the JetStream upgrade.
  • Solution:
    • Run migration manually.
    1npx prisma migrate deploy
    • Verify the Cluster and Deployment tables exist.