Apache Airflow®, Airflow, and the Airflow logo are trademarks of the Apache Software Foundation. Copyright © Astronomer 2026. All rights reserved.


Strategies for custom XCom backends in Airflow


Airflow XComs allow you to pass data between tasks. By default, Airflow uses the metadata database to store XComs, which works well for local development but has limited performance. If you configure a custom XCom backend, you can define where and how Airflow stores XComs, as well as customize serialization and deserialization methods.

In this guide you’ll learn:

  • When you should use a custom XCom backend.
  • How to set up a custom XCom backend using the Object Storage XCom Backend.
  • How to use a custom XCom backend class with custom serialization and deserialization methods.

Warning

While a custom XCom backend allows you to store virtually unlimited amounts of data as XComs, you will need to scale other Airflow components to pass large amounts of data between tasks. For help running Airflow at scale, reach out to Astronomer.

Assumed Knowledge

To get the most out of this guide, you need an understanding of:

  • XCom basics. See Pass data between tasks.
  • A cloud-based object storage service like AWS S3, GCP Cloud Storage, or Azure Blob Storage.

Why use a custom XCom backend?

Common reasons to use a custom XCom backend include:

  • You need more storage space for XComs than the Airflow metadata database can offer.
  • You’re running a production environment where you require custom retention, deletion, and backup policies for XComs.
  • You want to access XComs without accessing the metadata database.
  • You want to restrict types of allowed XCom values.
  • You want to save XComs in multiple locations simultaneously.

You can also use custom XCom backends to define custom serialization and deserialization methods for XComs if you need to add a serialization method to a class, or if registering a custom serializer is not feasible. See Custom serialization and deserialization for more information.
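
As a rough illustration of restricting the types of allowed XCom values, the check below could be called at the start of a custom serialize_value() method. It is a minimal pure-Python sketch: the function name ensure_allowed_xcom_value and the ALLOWED_TYPES allow-list are hypothetical and not part of Airflow.

```python
import json

# Hypothetical allow-list (an assumption for this sketch); adjust it to the
# types your pipelines are expected to exchange through XCom.
ALLOWED_TYPES = (str, int, float, bool, list, dict, type(None))


def ensure_allowed_xcom_value(value, allowed_types=ALLOWED_TYPES):
    """Return the JSON string for value, or raise if the value is not allowed."""
    # Reject top-level types that are not on the allow-list.
    if not isinstance(value, allowed_types):
        raise TypeError(
            f"XCom values of type {type(value).__name__} are not allowed."
        )
    # Nested values can still be unserializable, so let json.dumps decide.
    try:
        return json.dumps(value)
    except TypeError as e:
        raise ValueError(f"XCom value is not JSON-serializable: {e}") from e
```

Checking the type before serializing keeps the error message specific: a disallowed top-level type raises TypeError, while an allowed container holding an unserializable item raises ValueError.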

How to set up a custom XCom backend

There are two main ways to set up a custom XCom backend:

  • Object Storage XCom Backend: Use this method to create a custom XCom backend when you want to store XComs in a cloud-based object storage service like AWS S3, GCP Cloud Storage, or Azure Blob Storage. This option is recommended if you need to store XComs in a single remote location, and the Object Storage XCom Backend threshold and compression options meet your requirements.
  • Custom XCom backend class: Use this method when you want to further customize how XComs are stored, for example to simultaneously store XComs in two different locations.

Additionally, some provider packages offer custom XCom backends that you can use out of the box. For example, the Snowpark provider contains a custom XCom backend for Snowflake.

Use the Object Storage XCom Backend

You can create a custom XCom backend using object storage. The Object Storage XCom Backend is part of the Common IO provider and can be defined using the following environment variables:

  • AIRFLOW__CORE__XCOM_BACKEND: The XCom backend to use. Set this to airflow.providers.common.io.xcom.backend.XComObjectStorageBackend to use the Object Storage XCom Backend.
  • AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_PATH: The path to the object storage where XComs are stored. The path should be in the format <your-scheme>://<your-connection-id>@<your-bucket>/xcom. For example, s3://my-s3-connection@my-bucket/xcom. The most common schemes are s3, gs, and abfs for Amazon S3, Google Cloud Storage, and Azure Blob Storage, respectively.
  • AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_THRESHOLD: The threshold in bytes for XComs to be stored in the object storage. All objects smaller than or equal to this threshold are stored in the metadata database. All objects larger than this threshold are stored in the object storage. The default value is -1, meaning all XComs are stored in the metadata database.
  • AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_COMPRESSION: Optional. The compression algorithm to use when storing XComs in the object storage, for example zip. The default value is None.
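
The variables above can be assembled programmatically, for example to generate the lines of a .env file. The sketch below is only an illustration: the helper names object_storage_xcom_env and stored_in_object_storage are hypothetical, and the second function simply restates the threshold rule as described in this guide. The exact handling of values that are exactly equal to the threshold may differ between provider versions, so treat it as an assumption.

```python
def object_storage_xcom_env(scheme, conn_id, bucket, threshold=-1, compression=None):
    """Build the environment variables described above as a dict of strings."""
    env = {
        "AIRFLOW__CORE__XCOM_BACKEND": (
            "airflow.providers.common.io.xcom.backend.XComObjectStorageBackend"
        ),
        # Format: <your-scheme>://<your-connection-id>@<your-bucket>/xcom
        "AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_PATH": (
            f"{scheme}://{conn_id}@{bucket}/xcom"
        ),
        "AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_THRESHOLD": str(threshold),
    }
    # Compression is optional, so the variable is only set when requested.
    if compression is not None:
        env["AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_COMPRESSION"] = compression
    return env


def stored_in_object_storage(size_in_bytes, threshold):
    """Restate the threshold rule from this guide (an illustrative assumption)."""
    # A negative threshold (the default of -1) keeps every XCom in the database.
    if threshold < 0:
        return False
    # Otherwise only values larger than the threshold go to object storage.
    return size_in_bytes > threshold


# Print the variables in .env format for an S3 bucket with a 1 KiB threshold.
for name, value in object_storage_xcom_env(
    "s3", "my-s3-connection", "my-bucket", threshold=1024, compression="zip"
).items():
    print(f"{name}={value}")
```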

For a step-by-step tutorial on how to set up a custom XCom backend using the Object Storage XCom Backend for Amazon S3, Google Cloud Storage, and Azure Blob Storage, see Set up a custom XCom backend using object storage.

Use a custom XCom backend class

To create a custom XCom backend, you need to define an XCom backend class which inherits from the BaseXCom class.

The code below shows an example MyCustomXComBackend class that only allows JSON-serializable XComs and stores them in both Amazon S3 and Google Cloud Storage using a custom serialize_value() method. The deserialize_value() method retrieves the XComs from the Amazon S3 bucket and returns the value.

The Airflow metadata database stores a reference string to the XCom, which is displayed in the XCom tab of the Airflow UI. The reference string is prefixed with s3_and_gs:// to indicate that the XCom is stored in both Amazon S3 and Google Cloud Storage. You can add any serialization and deserialization logic that you need to the serialize_value() and deserialize_value() methods. See Custom serialization and deserialization for more information.
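
The reference string round trip can be tried out without any cloud connection. The sketch below is a minimal illustration that reuses the s3_and_gs:// prefix and the key layout from the example class; the helper names build_reference and object_key_from_reference are hypothetical.

```python
PREFIX = "s3_and_gs://"


def build_reference(dag_id, run_id, task_id, map_index, key, filename):
    """Compose the reference string that is saved to the metadata database."""
    object_key = f"{dag_id}/{run_id}/{task_id}/{map_index}/{key}_{filename}"
    return PREFIX + object_key


def object_key_from_reference(reference):
    """Recover the object key from a reference string."""
    if not reference.startswith(PREFIX):
        raise ValueError(f"Not a remote XCom reference: {reference!r}")
    # Slice off only the leading prefix and leave the rest of the key untouched.
    return reference[len(PREFIX):]
```

Slicing off the leading prefix, rather than replacing every occurrence of it, leaves keys intact even if the prefix text happens to appear again later in the key.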

Full custom XCom backend class example code:

from airflow.sdk.bases.xcom import BaseXCom
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.providers.google.cloud.hooks.gcs import GCSHook
import json
import logging
import os
import tempfile
import uuid


class MyCustomXComBackend(BaseXCom):
    # the prefix is optional and used to make it easier to recognize
    # which reference strings in the Airflow metadata database
    # refer to an XCom that has been stored in remote storage
    PREFIX = "s3_and_gs://"
    S3_BUCKET_NAME = "s3-xcom-backend-example"
    GS_BUCKET_NAME = "gcs-xcom-backend-example"
    # Airflow connection IDs, defined once so that serialize_value()
    # and deserialize_value() always use the same connections
    AWS_CONN_ID = "my_aws_conn_id"
    GCP_CONN_ID = "my_gcs_conn_id"

    @staticmethod
    def serialize_value(
        value,
        key=None,
        task_id=None,
        dag_id=None,
        run_id=None,
        map_index=None,
        **kwargs,
    ):
        # make sure the value is JSON-serializable
        try:
            serialized_value = json.dumps(value)
        except TypeError as e:
            raise ValueError(f"XCom value is not JSON-serializable: {e}") from e

        # make sure the file name is unique, either by using combinations of
        # the task_id, run_id and map_index parameters or by using a uuid
        filename = "data_" + str(uuid.uuid4()) + ".json"
        # define the full object key under which the file is stored
        object_key = f"{dag_id}/{run_id}/{task_id}/{map_index}/{key}_{filename}"

        # write the value to a temporary JSON file; delete=False keeps the
        # file on disk after it is closed so that it can be uploaded
        with tempfile.NamedTemporaryFile(mode="w+", delete=False) as tmp_file:
            tmp_file.write(serialized_value)
            tmp_file_name = tmp_file.name

        try:
            # the connection to AWS is created by using the S3 hook
            s3_hook = S3Hook(aws_conn_id=MyCustomXComBackend.AWS_CONN_ID)
            # load the local JSON file into the S3 bucket
            s3_hook.load_file(
                filename=tmp_file_name,
                key=object_key,
                bucket_name=MyCustomXComBackend.S3_BUCKET_NAME,
                replace=True,
            )

            # the connection to GCS is created by using the GCS hook
            gcs_hook = GCSHook(gcp_conn_id=MyCustomXComBackend.GCP_CONN_ID)
            if gcs_hook.exists(MyCustomXComBackend.GS_BUCKET_NAME, object_key):
                logging.info(
                    f"File {object_key} already exists in the bucket "
                    f"{MyCustomXComBackend.GS_BUCKET_NAME}."
                )
            else:
                # load the local JSON file into the GCS bucket
                gcs_hook.upload(
                    filename=tmp_file_name,
                    object_name=object_key,
                    bucket_name=MyCustomXComBackend.GS_BUCKET_NAME,
                )
        finally:
            # clean up the temporary file, even if an upload failed
            os.remove(tmp_file_name)

        # define the string that will be saved to the Airflow metadata
        # database to refer to this XCom
        reference_string = MyCustomXComBackend.PREFIX + object_key

        # use JSON serialization to write the reference string to the
        # Airflow metadata database (like a regular XCom)
        return BaseXCom.serialize_value(value=reference_string)

    @staticmethod
    def deserialize_value(result):
        # read the reference string from the Airflow metadata database
        reference_string = BaseXCom.deserialize_value(result=result)
        object_key = reference_string.removeprefix(MyCustomXComBackend.PREFIX)
        s3_hook = S3Hook(aws_conn_id=MyCustomXComBackend.AWS_CONN_ID)

        # use a temporary directory to download the file; the file must be
        # read inside this block because the directory is deleted on exit
        with tempfile.TemporaryDirectory() as tmp_dir:
            local_file_path = s3_hook.download_file(
                key=object_key,
                bucket_name=MyCustomXComBackend.S3_BUCKET_NAME,
                local_path=tmp_dir,
            )

            # ensure the file is not empty and log its size
            file_size = os.path.getsize(local_file_path)
            logging.info(f"Downloaded file size: {file_size} bytes.")
            if file_size == 0:
                raise ValueError(
                    f"The downloaded file is empty. Check the content of the S3 object at {object_key}."
                )

            with open(local_file_path, "r") as file:
                try:
                    output = json.load(file)
                except json.JSONDecodeError as e:
                    logging.error(f"Error decoding JSON from the file: {e}")
                    raise

        return output

To use a custom XCom backend class, you need to save it in a Python file in the include directory of your Airflow project. Then, set the AIRFLOW__CORE__XCOM_BACKEND environment variable in your Airflow instance to the path of the custom XCom backend class. If you run Airflow locally with the Astro CLI, you can set the environment variable in the .env file of your Astro project. On Astro, you can set the environment variable in the Astro UI.

AIRFLOW__CORE__XCOM_BACKEND=include.<your-file-name>.MyCustomXComBackend
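
Before setting the variable, it can help to confirm that the dotted path points at an importable class. The helper below is a hypothetical local sanity check and not the mechanism Airflow itself uses to load the backend. It assumes you run it from an environment where the module, for example the include package of your project, is importable.

```python
import importlib


def resolve_class(dotted_path):
    """Import '<module path>.<ClassName>' and return the class object."""
    # Split the path at the last dot: module on the left, class on the right.
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected '<module>.<ClassName>', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    # Raises AttributeError if the module does not define the class.
    return getattr(module, class_name)
```

For example, calling resolve_class("include.<your-file-name>.MyCustomXComBackend") with your actual file name raises ModuleNotFoundError or AttributeError if the path in the environment variable is misspelled.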

If you want to further customize the functionality for your custom XCom backend, you can override additional methods of the XCom module (source code).

Custom serialization and deserialization

By default, Airflow includes serialization methods for common object types such as JSON-serializable values, pandas DataFrames, and NumPy types.

If you need to pass data objects through XCom that are not supported, you have several options:

  • Register a custom serializer. See Serialization.
  • Add a serialize() and deserialize() method to the class of the object you want to pass through XCom. See Serialization.
  • Use a custom XCom backend to define custom serialization and deserialization methods. See Use a custom XCom backend class.
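
The second option can be sketched as follows. The Measurement class is hypothetical, and the method signatures (a serialize() method returning a dict, a static deserialize(data, version) method, and a __version__ class attribute) follow the pattern described in the Airflow serialization documentation. Verify them against the Serialization page for your Airflow version before relying on this sketch.

```python
from typing import ClassVar


class Measurement:
    """Hypothetical data class that can describe itself as a plain dict."""

    # Version of the serialized format, checked again in deserialize().
    __version__: ClassVar[int] = 1

    def __init__(self, name, value, unit):
        self.name = name
        self.value = value
        self.unit = unit

    def serialize(self) -> dict:
        # Return only JSON-friendly primitives.
        return {"name": self.name, "value": self.value, "unit": self.unit}

    @staticmethod
    def deserialize(data: dict, version: int):
        # Refuse data written by a newer, unknown format version.
        if version > Measurement.__version__:
            raise TypeError(f"Unsupported Measurement version: {version}")
        return Measurement(data["name"], data["value"], data["unit"])
```

Keeping the serialized form to a flat dict of primitives means the same methods can also be reused inside the serialize_value() and deserialize_value() methods of a custom XCom backend.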