About Google Dataproc

Google Dataproc is a managed, highly scalable Google Cloud service for running Apache Spark, Apache Flink, Presto, and many other open source tools. Use Google Dataproc to run the compute-intensive tasks in your Astro pipelines that process large amounts of data for data science and ETL.


Use Case

A common use case for orchestrating Google Dataproc jobs with Astro is gaining insights from large amounts of data using distributed machine learning. Astro offers specialized asynchronous operators for Google Dataproc: while a job runs on the cluster, the task waiting on it does not occupy a worker slot, which makes your pipeline more cost-effective.
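As a rough illustration of the async pattern described above, the sketch below submits a PySpark training job to Dataproc in deferrable mode. It assumes the apache-airflow-providers-google package, whose DataprocSubmitJobOperator accepts a deferrable flag in recent versions. The DAG name, task name, and all project, cluster, and bucket values are hypothetical placeholders, not values from this page.

```python
from datetime import datetime


def build_pyspark_job(project_id: str, cluster_name: str, main_python_file_uri: str) -> dict:
    """Return a Dataproc job definition for a PySpark script stored in Cloud Storage."""
    return {
        "reference": {"project_id": project_id},
        "placement": {"cluster_name": cluster_name},
        "pyspark_job": {"main_python_file_uri": main_python_file_uri},
    }


def build_dag(project_id: str, region: str, cluster_name: str, main_python_file_uri: str):
    """Build a one-task DAG that submits the job in deferrable (async) mode."""
    # Imported lazily so the job builder above can be used without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocSubmitJobOperator,
    )

    with DAG(
        dag_id="dataproc_ml_training",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule=None,  # assumption: triggered manually
        catchup=False,
    ) as dag:
        DataprocSubmitJobOperator(
            task_id="train_model",  # hypothetical task name
            project_id=project_id,
            region=region,
            job=build_pyspark_job(project_id, cluster_name, main_python_file_uri),
            # Assumption: the installed provider version supports deferrable mode.
            # The task then waits in the triggerer instead of holding a worker slot.
            deferrable=True,
        )
    return dag
```

In a DAG file, you would assign the result at module level so the scheduler can discover it, for example dag = build_dag("my-project", "us-central1", "my-cluster", "gs://my-bucket/train.py"). Deferrable mode also requires a running triggerer component in the Airflow deployment.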
