Run Airflow on Databricks with Astro, the fully managed data orchestration platform.
Designed and operated by the core developers behind Apache Airflow and OpenLineage, Astro brings best practices learned from supporting thousands of Airflow environments into a simple, fully managed platform on Databricks. With Astro, you can focus on building and optimizing your data pipelines, not managing Airflow.
Orchestrate every component of your Databricks data platform
When you’re just starting with Databricks, you may only use a handful of services. But as you scale your usage, the complexity grows. A tension can emerge between empowering data teams and centralizing control. Astro allows you to embrace distributed data services, orchestrating across complex, distributed environments with a single control plane. Airflow offers more than 240 modules to connect to the data services you run alongside Databricks, including RDS, S3, Redshift, EMR, and more. With asynchronous task support built into Astro Runtime, Astro’s hardened, cloud-native distribution of Airflow, you can orchestrate these services with increased resilience and less overhead, with no code changes required.
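To make the idea of asynchronous task support concrete, here is a minimal sketch in plain Python. It is not Astro Runtime's implementation; it only illustrates the general pattern (exposed in Airflow as deferrable operators) of waiting on several external jobs from a single event loop instead of holding one worker per wait. The job names and the `wait_for_job` helper are hypothetical.

```python
import asyncio
import time


async def wait_for_job(job_id: str, seconds: float) -> str:
    """Hypothetical stand-in for polling an external service until a job ends."""
    # asyncio.sleep hands control back to the event loop, so other waits proceed.
    await asyncio.sleep(seconds)
    return f"{job_id}: done"


async def run_all(jobs: dict[str, float]) -> list[str]:
    """Wait on every job concurrently from one event loop."""
    return await asyncio.gather(*(wait_for_job(j, s) for j, s in jobs.items()))


def main() -> tuple[list[str], float]:
    jobs = {"job_a": 0.2, "job_b": 0.2, "job_c": 0.2}  # hypothetical job names
    start = time.perf_counter()
    results = asyncio.run(run_all(jobs))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    print(main())
```

Because the three waits overlap, the total elapsed time is close to the longest single wait rather than the sum of all three, which is the source of the reduced overhead.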
Orchestrate across environments
Astro also makes it easy for teams to build, run, and observe data pipelines that integrate data from sources distributed across multiple cloud services, on-premises environments, and the network edge. It orchestrates the pipelines that feed timely, conditioned data to the KPIs, metrics, and measures that decision-makers depend on for detailed insights into business operations.
Spend less time managing Airflow
Astro enables you to be up and running in your cloud or ours in less than an hour. Push-button environments, in-place upgrades, and accelerated bug and security fixes free you from depending on limited DevOps resources. With optimized configuration and auto-scaling built in, you can count on your tasks running on time instead of getting stuck in the queue.
Understand complex ecosystems
As your data ecosystem grows, dependencies across distributed services and teams can lead to reduced availability of trusted data. Astro integrates OpenLineage into your orchestration environments, providing powerful visibility into the dependencies between datasets, as well as trends in their performance and quality over time.
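As a rough illustration of what visibility into dataset dependencies means, the sketch below builds a small lineage graph and lists every dataset downstream of a given one. It does not use the OpenLineage API; the job names, dataset names, and the `downstream` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical jobs: each reads input datasets and writes output datasets.
JOBS = {
    "ingest": {"inputs": ["raw_events"], "outputs": ["clean_events"]},
    "aggregate": {"inputs": ["clean_events"], "outputs": ["daily_metrics"]},
    "report": {"inputs": ["daily_metrics", "customers"], "outputs": ["kpi_dashboard"]},
}


def build_edges(jobs):
    """Map each dataset to the datasets produced directly from it."""
    edges = defaultdict(set)
    for job in jobs.values():
        for src in job["inputs"]:
            edges[src].update(job["outputs"])
    return edges


def downstream(dataset, jobs):
    """Return every dataset that depends, directly or indirectly, on `dataset`."""
    edges = build_edges(jobs)
    seen = set()
    queue = deque([dataset])
    while queue:  # breadth-first walk over the lineage graph
        current = queue.popleft()
        for nxt in edges.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


if __name__ == "__main__":
    print(downstream("raw_events", JOBS))
```

A query like this shows which downstream datasets are at risk when an upstream source is late or fails a quality check.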
Integrate securely and seamlessly
Astro’s hybrid architecture allows you to keep orchestration close to your data, with a single-tenant data plane in the region of your choice. Connect privately to your Databricks services with PrivateLink, or establish a peering connection with your VPC or Transit Gateway. Astro supports native IAM role authentication and authorization to your data services, or you can keep control of your secrets through the integration with Databricks Secrets Manager.
Purchase via Databricks Marketplace
Astronomer is an official Databricks Partner, and Astro can be purchased through the Databricks Marketplace. This approach can help speed up the procurement process and consolidate billing. And when purchased through the Databricks Marketplace, Astro even counts toward your committed spend for both your license and infrastructure.