Snowflake data quality

Private Preview
This feature is in Private Preview. Please reach out to your account team to enable this feature.

Astro Observe data quality helps users monitor Snowflake tables to ensure data accuracy, completeness, and integrity across your pipelines. It automatically tracks key metrics such as column null percentages, schema changes, and table row counts to detect anomalies or unexpected shifts in your data.

Ensure the necessary permissions in Snowflake

Before connecting to Astro Observe, create a Snowflake service user and role and ensure the necessary Snowflake permissions are configured.

1

Log into Snowflake

Log into Snowflake using a high-privilege role such as ACCOUNTADMIN.

2

Create a dedicated role for Observe

Create Observe role
1CREATE ROLE IF NOT EXISTS ASTRO_OBSERVE_ROLE;
3

Create a read-only service user

Create a service user that Observe will use.

Create Observe service user
1CREATE USER IF NOT EXISTS ASTRO_OBSERVE_USER
2 DEFAULT_ROLE = ASTRO_OBSERVE_ROLE
3 TYPE = SERVICE;
4

Assign the role to the user

Grant role to user
1GRANT ROLE ASTRO_OBSERVE_ROLE TO USER ASTRO_OBSERVE_USER;
5

Grant the role the privileges Observe requires.

Replace YOUR_DB and other example names to match your Snowflake environment.

Example privileges for ASTRO_OBSERVE_ROLE
1-- Grant warehouse access (replace COMPUTE_WH with your warehouse)
2GRANT USAGE ON WAREHOUSE "COMPUTE_WH" TO ROLE ASTRO_OBSERVE_ROLE;
3ALTER USER ASTRO_OBSERVE_USER SET DEFAULT_WAREHOUSE = 'COMPUTE_WH';
4
5-- Metadata and object access:
6GRANT USAGE, MONITOR ON DATABASE YOUR_DB TO ROLE ASTRO_OBSERVE_ROLE;
7GRANT USAGE, MONITOR ON ALL SCHEMAS IN DATABASE YOUR_DB TO ROLE ASTRO_OBSERVE_ROLE;
8
9-- Read access:
10GRANT SELECT ON ALL TABLES IN DATABASE YOUR_DB TO ROLE ASTRO_OBSERVE_ROLE;
11GRANT SELECT ON ALL VIEWS IN DATABASE YOUR_DB TO ROLE ASTRO_OBSERVE_ROLE;
12GRANT SELECT ON ALL EXTERNAL TABLES IN DATABASE YOUR_DB TO ROLE ASTRO_OBSERVE_ROLE;
13
14-- Optional: access to Snowflake usage views:
15GRANT SELECT ON SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY TO ROLE ASTRO_OBSERVE_ROLE;
16GRANT SELECT ON SNOWFLAKE.ACCOUNT_USAGE.TABLE_DML_HISTORY TO ROLE ASTRO_OBSERVE_ROLE;
17
18-- Or use:
19-- GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE ASTRO_OBSERVE_ROLE;
These are example grants. Replace database, schema, and warehouse names with values appropriate for your account and security policies.

All Snowflake integrations require that the Observe role has access to both ACCOUNT_USAGE and INFORMATION_SCHEMA system tables. The service user must have a default warehouse configured for all discovery and monitoring operations.

Setup key-pair authentication in Snowflake

Astronomer recommends key-pair authentication for Snowflake service users. Generate an RSA key pair, then assign the public key to the Observe service user to enable secure authentication.

1

Generate a password-protected private key and a public key

Run the following commands on a secure host to create an encrypted private key and a public key:

Generate RSA key pair (example)
$# Create encrypted private key
>openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 des3 -inform PEM -out rsa_key.p8
>
># Create public key
>openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
2

Validate the key

$openssl pkey -in rsa_key.p8 -check -noout
3

Assign the public key to the Snowflake user

Remove the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- headers and newlines so the key is a compact single string. Assign the cleaned public key to the service user:

Assign public key to Snowflake user
1ALTER USER ASTRO_OBSERVE_USER SET RSA_PUBLIC_KEY = '<your_cleaned_public_key>';

Connect Snowflake to Astro Observe

After you configure Snowflake permissions and key-pair auth, create the Observe connection.

1

Open Connections

In the Observe UI navigate to Connections and click + Connection.

2

Fill in connection details

Complete the following fields:

  • Name: A name for the connection.
  • Description: Optional description.
  • Connection Type: Snowflake.
  • Polling Schedule: How frequently Observe polls Snowflake for metrics (examples: every 1 hour, 6 hours, 1 day). Polling frequency is the maximum rate at which Observe updates data quality metrics and monitors; more frequent polling may increase Snowflake compute costs.
  • Account Identifier: Your Snowflake account identifier (for example, FY02423-GP2141). Observe maps assets to a connection by account identifier.
  • Username: The Snowflake service user (ASTRO_OBSERVE_USER).
  • Private Key: Paste your private key for key-pair authentication if using key-pair auth.

Only one Observe connection is allowed per Snowflake account identifier. If you have multiple Snowflake accounts, create a separate connection for each account identifier.

3

Save and start discovery

Click Create. Observe begins the metadata extraction process and will discover databases, schemas, and tables within the configured account and surface discovered tables in the Asset Catalog.

All connections require that the Observe role has access to both the ACCOUNT_USAGE and INFORMATION_SCHEMA system tables. The service user must have a default warehouse configured to support discovery and ongoing data quality monitoring.

Asset Catalog

Navigate to Asset Catalog, filter by Snowflake tables, and select the desired table.

You can sort tables by popularity to quickly identify frequently used tables. Popularity rankings are based on query frequency and the number of unique users accessing each table.

Schema

The Schema tab shows table structure details:

  • Column names
  • Data types
  • Completeness status
  • Nullability
  • Default values

You can enable monitoring for specific columns to actively track completeness.

Event Timeline

The Event Timeline tab shows data quality events for a selected timeframe. Events are color-coded by severity: Success, Neutral, and Failure. Click an event to view details, historical patterns, and affected metrics.

Data quality

The data quality tab provides visualizations for monitored metrics:

  • Table Volume: track changes in row counts and percent change over time to identify unexpected fluctuations.
  • Completeness: visualize column null percentages against thresholds to surface completeness problems.

Monitors

The Monitors tab lists all configured data quality monitors (Column Null Percentage, Table Schema Change, Row Volume Change). Each monitor’s schedule and modification history are shown for management.

Set up a data quality monitor

Follow these steps to create a monitor for a Snowflake table or column.

1

Open the table

In the Asset Catalog, select the Snowflake table you want to monitor.

2

Open monitors

Click the Monitors tab.

3

Create a monitor

  • If no monitors exist, click Monitor (or + Monitor) to create your first monitor.
  • Select the monitor type: Column Null Percentage, Row Volume Change, or Schema Change.

Volume checks

  • Specify thresholds based on percentage changes or absolute row-count changes.
  • Monitors execute checks according to the schedule you define. If you set the monitor to run every 6 hours, it evaluates whether row counts exceed the configured thresholds within that interval.

Column null percentage checks

  • Select the column to monitor and define the null percentage threshold.
  • The monitor evaluates the column at the interval you specify. If the null percentage exceeds the threshold, the monitor triggers.
4

Define monitor settings

Set alert thresholds and notification preferences.

5

Create monitor

Click Create Monitor to activate the monitor.