Batch Inference with Airflow and SageMaker

ML Ops poses many challenges for data scientists, from tool selection to orchestration. Add the requirements for reproducibility and management, and the complexity becomes hard to handle at scale.

By leveraging Airflow with SageMaker, you can have a dependable production ML Ops system with enhanced data lineage and full orchestration capabilities, all in one place.

We’ll walk through using the new SageMaker Async Operators, as well as the new SageMaker OpenLineage integration, for end-to-end ML Ops in batch inference use cases.

Key Takeaways

We’ll look at the new Airflow integrations with SageMaker and how they simplify the execution and management of core ML Ops requirements, including:

  • How to coordinate SageMaker Training and Models with Airflow for batch inference use cases.
  • How to implement Astronomer’s async SageMaker operators for cost and resource savings.
  • How to apply the OpenLineage integration with the SageMakerProcessingOperator and the SageMakerTransformOperator for increased observability through SageMaker ML Lineage.
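
As a rough illustration of the first two points, the sketch below builds the request that a batch transform task would submit. It is a minimal example, not the presenters' code: the helper `build_transform_config`, the model name `churn-model`, the `example-bucket` S3 paths, and the instance defaults are all hypothetical. The dictionary keys follow SageMaker's CreateTransformJob request shape, which is what the `config` argument of `SageMakerTransformOperator` expects.

```python
from datetime import datetime, timezone


def build_transform_config(model_name, input_s3_uri, output_s3_uri,
                           instance_type="ml.m5.large", instance_count=1,
                           run_ts=None):
    """Return a dict shaped like a SageMaker CreateTransformJob request.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    run_ts = run_ts or datetime.now(timezone.utc)
    # Job names must be unique per account and region, so add a timestamp.
    job_name = f"{model_name}-batch-{run_ts:%Y%m%d-%H%M%S}"
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": "text/csv",
            "SplitType": "Line",  # treat each line of the input as one record
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


config = build_transform_config(
    model_name="churn-model",                          # assumed model name
    input_s3_uri="s3://example-bucket/batch/input/",   # assumed bucket
    output_s3_uri="s3://example-bucket/batch/output/",
    run_ts=datetime(2023, 1, 15, 8, 30, 0, tzinfo=timezone.utc),
)
print(config["TransformJobName"])  # churn-model-batch-20230115-083000

# Inside a DAG, with apache-airflow-providers-amazon installed, the dict
# would be handed to the operator roughly like this:
#   from airflow.providers.amazon.aws.operators.sagemaker import (
#       SageMakerTransformOperator,
#   )
#   SageMakerTransformOperator(task_id="batch_inference", config=config)
```

An async or deferrable variant of the operator would take the same `config`. The difference is that the task releases its worker slot while SageMaker runs the job, which is where the cost and resource savings come from.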

Ganapathi Krishnamoorthi - Senior ML Solutions Architect, Amazon

Ganapathi Krishnamoorthi is a Senior ML Solutions Architect at Amazon. Ganapathi has more than 10 years of experience in AI/ML and data analytics. He is 7x AWS certified, speaks at AI/ML events, and has authored several white papers and blogs on AWS AI/ML services. Ganapathi provides prescriptive guidance to startups and enterprise customers, helping them to design and deploy cloud applications at scale. He is passionate about solving real-world customer problems with ML for better business outcomes.

Faisal Hoda - Enterprise Architect, Astronomer

Faisal Hoda is an Enterprise Architect focusing on ML Architectures and Ops at Astronomer. He has previous experience as a machine learning engineer, data scientist, and data engineer and has led large teams to develop and deploy ML-based platforms. He is also an Airflow contributor and has deployed Astronomer and Airflow in various companies at scale.

Astronomer Webinars are biweekly, real-time online sessions for data pipeline authors hosted by Astronomer’s Apache Airflow experts. During an hour-long meeting, participants have a chance to dive into the most important features and practices related to Apache Airflow and data orchestration — from Airflow 2+ feature highlights to DAG writing best practices. At the end of each webinar, we open the floor for a Q&A to ensure that participants leave the event confident about their newly acquired knowledge.

Hosted By

  • Ganapathi Krishnamoorthi, Senior ML Solutions Architect
  • Faisal Hoda, Enterprise Architect