Amazon S3 Data Source


Private preview: The Amazon S3 integration is available to select accounts. Contact your Immuta representative for details.

Requirement

CREATE_S3_DATA_SOURCE Immuta permission

Prerequisite

Create a data source

  1. Navigate to the Data Sources list page in Immuta.

  2. Click Register Data Source.

  3. Select the S3 tile in the data platform section.

  4. Select your AWS Account/Region from the dropdown menu.

  5. Opt to select a default domain to which data sources will be assigned.

  6. Opt to add default tags to the data sources.

  7. Click Next.

  8. The prefix field is populated with the base path. Add to this prefix to create a data source for a prefix, bucket, or object.

     • If the data source prefix ends in a wildcard (*), it protects all items starting with that prefix. For example, a base location of s3:// and a data source prefix of surveys/2024* would protect paths like s3://surveys/2024-internal/research-dept.txt or s3://surveys/2024-customer/april/us.csv.

     • If the data source prefix ends without a wildcard (*), it protects a single object. For example, a base location of s3:// and a data source prefix of research-data/demographics would protect only the object that exactly matches s3://research-data/demographics.

  9. Click Add Prefix, and then click Next.

  10. Verify that your prefixes are correct and click Complete Setup.

  11. Configure the Amazon S3 integration.
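The two prefix-matching rules above can be sketched in a few lines. This is an illustrative model of the documented behavior, not Immuta code:

```python
def prefix_protects(data_source_prefix: str, object_path: str) -> bool:
    """Model of the documented S3 prefix rules: a trailing * protects every
    object starting with the prefix; otherwise only the exact object."""
    if data_source_prefix.endswith("*"):
        return object_path.startswith(data_source_prefix[:-1])
    return object_path == data_source_prefix


# The examples from the documentation:
assert prefix_protects("s3://surveys/2024*", "s3://surveys/2024-internal/research-dept.txt")
assert prefix_protects("s3://surveys/2024*", "s3://surveys/2024-customer/april/us.csv")
assert prefix_protects("s3://research-data/demographics", "s3://research-data/demographics")
assert not prefix_protects("s3://research-data/demographics", "s3://research-data/demographics-2024")
```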

    Snowflake Data Source


    Deprecation notice

    Support for registering Snowflake data sources using this legacy workflow has been deprecated. Instead, register your data using connections.

    Requirements

    • CREATE_DATA_SOURCE Immuta permission

    • The Snowflake user registering data sources must have the following privileges on all securables:

      • USAGE on all databases and schemas with registered data sources

      • REFERENCES on all tables and views registered in Immuta

      • SELECT on all tables and views registered in Immuta


    Snowflake imported databases

    Immuta does not support Snowflake tables from imported databases. Instead, create a view of the table and register that view as a data source.

    Enter connection information


    Use SSL

    Although not required, it is recommended that all connections use SSL. Additional connection string arguments may also be provided.

    Note: Only Immuta uses the connection you provide and injects all policy controls when users query the system. In other words, users always connect through Immuta with policies enforced and have no direct association with this connection.

    1. Navigate to the Data Sources list page and click Register Data Source.

    2. Select the Snowflake tile in the Data Platform section.

    3. Complete these fields in the Connection Information box:


    Considerations

    • Immuta pushes down joins to be processed on the remote database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, ssl, username, and password. You will see performance degradation on joins against the same database if this information doesn't match.


    File naming convention

    If you are uploading more than one file, ensure the certificate used for the OAuth authentication has the key name "oauth client certificate."

    Select virtual population

    Decide how to virtually populate the data source by selecting one of the options:

    • Create sources for all tables in this database: This option will create data sources and keep them in sync for every table in the database. New tables will be automatically detected and new Immuta views will be created.

    • Schema / Table: This option will allow you to specify tables or datasets that you want Immuta to register.

      1. Opt to click Edit in the table selection box that appears.

    Enter basic information

    1. Enter the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

    2. Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. If you enter a name that already exists, the name will automatically be incremented. For example, if the schema project Customer table already exists and you enter that name in this field, the name for this second schema project will automatically become Customer table 2 when you create it.

    Enable or disable schema monitoring


    Schema monitoring best practices

    Schema monitoring is a powerful tool that ensures tables are all governed by Immuta.

    • Consider using schema monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.

    When selecting the Schema/Table option, opt to enable by selecting the checkbox in this section.

    Note: This step will only appear if all tables within a server have been selected for creation.

    Opt to configure advanced settings

    Although not required, completing these steps will help maximize the utility of your data source. Otherwise, click Create to save the data source.

    Column detection

    This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies Data Owners of these changes.

    To enable, select the checkbox in this section.

    See the column detection documentation to learn more.

    Event time

    An Event Time column denotes the time associated with records returned from this data source. For example, if your data source contains news articles, the time that the article was published would be an appropriate Event Time column.

    1. Click the Edit button in the Event Time section.

    2. Select the column(s).

    3. Click Apply.

    Selecting an Event Time column will enable

    • more statistics to be calculated for this data source including the most recent record time, which is used for determining the freshness of the data source.

    • the creation of time-based restrictions in the policy builder.

    Latency

    1. Click Edit in the Latency section.

    2. Complete the Set Time field, and then select MINUTES, HOURS, or DAYS from the subsequent dropdown menu.

    3. Click Apply.

    This setting impacts how often Immuta checks for new values in a column that is driving row-level redaction policies. For example, if you are redacting rows based on a country column in the data, and you add a new country, it will not be seen by the Immuta policy until this period expires.

    Sensitive data discovery

    Data owners can disable identification for their data sources in this section.

    1. Click Edit in this section.

    2. Select Enabled or Disabled in the window that appears, and then click Apply.

    Data source tags

    Adding tags to your data source allows users to search for the data source using the tags and allows governors to apply global policies to the data source. Note that if schema detection is enabled, any tags added now will also be added to the tables that are detected.

    To add tags,

    1. Click the Edit button in the Data Source Tags section.

    2. Begin typing in the Search by Tag Name box to select your tag, and then click Add.

    Tags can also be added after you create your data source from the data source details page on the overview tab or the data dictionary tab.

    Create the data source

    Click Create to register your data source.


  • Server: hostname or IP address

  • Port: port configured for Snowflake, typically port 443

  • SSL: when enabled, ensures communication between Immuta and the remote database is encrypted

  • Warehouse: Snowflake warehouse that contains the remote database

  • Database: remote database

  • From the Select Authentication Method dropdown, select Username and Password, Key Pair Authentication, or Snowflake External OAuth:

    • Username and Password

      1. Enter a Username. This username will be used to connect to the remote database and retrieve records for this data source.

      2. Enter a Password. This password will be used with the above username to connect to the remote database.

      3. You can then choose to enter Additional Connection String Options or Upload Certificates to connect to the database.

    • Key Pair Authentication

      1. Enter a Username. This username will be used to connect to the remote database and retrieve records for this data source.

      2. If using an encrypted private key, enter the private key file password in the Additional Connection String Options. Use the following format: PRIV_KEY_FILE_PWD=<your_pw>.
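Options like PRIV_KEY_FILE_PWD are passed as key=value pairs in the Additional Connection String Options field. As a sketch of how such a string decomposes into individual options (a hypothetical helper for illustration, not Immuta's actual parser):

```python
def parse_connection_options(options: str) -> dict:
    """Split a connection-options string such as
    "PRIV_KEY_FILE_PWD=my_pw;CLIENT_SESSION_KEEP_ALIVE=true"
    into a {key: value} dict. Assumes ;- or &-separated pairs."""
    pairs = options.replace("&", ";").split(";")
    return dict(pair.split("=", 1) for pair in pairs if pair)


opts = parse_connection_options("PRIV_KEY_FILE_PWD=my_pw;CLIENT_SESSION_KEEP_ALIVE=true")
# opts["PRIV_KEY_FILE_PWD"] == "my_pw"
```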

    • Snowflake External OAuth

      1. Fill out the Token Endpoint, which is where the generated token is sent.

      2. Fill out the Client ID, which is the subject of the generated token.

  • Click the Test Connection button.

  • If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section.

    By default, all schemas and tables are selected. Select and deselect by clicking the checkbox for the schemas in the Import Schemas/Tables modal. You can create multiple data sources at one time by selecting an entire schema or multiple tables.

  • After making your selection(s), click Apply.

  • When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro.

  • When selecting Schema/Table, this field is prepopulated with the recommended project name, and you can edit it freely.

  • Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

    • <Tablename>: The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

    • <Schema><Tablename>: The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    • Custom: Enter a custom template for the Data Source Name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

  • Consider using Immuta’s API to either run the schema monitoring job when your ETL process adds new tables or to register new tables directly.

  • Activate the New Column Added templated global policy to protect potentially sensitive data. This policy nulls new columns until a data owner reviews them, preventing data leaks from columns that are added without review.
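The schema monitoring API call mentioned above could be wired into an ETL job along these lines. The host, API key, and endpoint path below are all placeholders, not the documented Immuta API; consult your Immuta API reference for the actual route:

```python
import urllib.request

IMMUTA_URL = "https://your-immuta-host"      # placeholder host
IMMUTA_API_KEY = "your-immuta-api-key"       # placeholder key
# Placeholder route, not the documented Immuta endpoint:
ENDPOINT = "/dataSource/detectRemoteChanges"


def schema_monitoring_request() -> urllib.request.Request:
    """Build (but do not send) the POST that an ETL job could issue to
    trigger schema monitoring after new tables are added."""
    return urllib.request.Request(
        url=IMMUTA_URL + ENDPOINT,
        method="POST",
        headers={"Authorization": IMMUTA_API_KEY,
                 "Content-Type": "application/json"},
    )

# urllib.request.urlopen(schema_monitoring_request()) would send it.
```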


    Click Select a File, and upload a Snowflake key pair file.

    To use a certificate, keep the Use Certificate checkbox enabled and complete the steps below. You cannot pass a client secret if you use this method for obtaining the access token.
    1. Opt to fill out the Resource field with a URI of the resource where the requested token will be used.

    2. Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called sub (Subject).

    3. Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.

  • To pass a client secret, uncheck the Use Certificate checkbox and complete the fields below. You cannot use a certificate if you use this method for obtaining the access token.

    1. Scope (string): The scope limits the operations and roles allowed in Snowflake by the access token. See the Snowflake documentation for details about creating scopes for External OAuth.

    2. Client Secret (string): Immuta uses this secret to authenticate with the authorization server when it requests a token.
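Under the client-secret flow described above, the request sent to the token endpoint is a standard OAuth 2.0 client-credentials request. A sketch of what that request body looks like (field names follow RFC 6749; the endpoint, client ID, and scope values are made up for illustration):

```python
import urllib.parse


def build_token_request(token_endpoint: str, client_id: str,
                        scope: str, client_secret: str):
    """Return the (URL, urlencoded form body) of a client-credentials
    token request like the External OAuth flow described above."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_secret": client_secret,  # omitted in the certificate flow
    }
    return token_endpoint, urllib.parse.urlencode(form)


url, body = build_token_request(
    "https://idp.example.com/oauth/token",  # made-up endpoint
    "immuta-client", "session:role-any", "s3cr3t")
```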


    Azure Synapse Analytics Data Source

    Prerequisites

    If you are using the OAuth authentication method,

    • Ensure that Microsoft Entra ID is on the same account as the Azure Synapse Analytics workspace and dedicated SQL pool.

    • Set up OAuth via Microsoft Entra ID app registration with a client secret.

    • Select Accounts in this organizational directory only as the account type.

    Enter connection information

    1. Navigate to the Data Sources list page and click Register Data Source.

    2. Select the Azure Synapse Analytics tile in the Data Platform section.

    3. Complete these fields in the Connection Information box:


    Use SSL

    Although not required, it is recommended that all connections use SSL. Additional connection string arguments may also be provided.

    Note: Only Immuta uses the connection you provide and injects all policy controls when users query the system. In other words, users always connect through Immuta with policies enforced and have no direct association with this connection.


    Considerations

    • Immuta pushes down joins to be processed on the remote database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, ssl, and credentials. You will see performance degradation on joins against the same database if this information doesn't match.

    Select virtual population

    Decide how to virtually populate the data source by selecting one of the options:

    • Create sources for all tables in this database: This option will create data sources and keep them in sync for every table in the database. New tables will be automatically detected and new Immuta views will be created.

    • Schema / Table: This option will allow you to specify tables or datasets that you want Immuta to register.

      1. Opt to click Edit in the table selection box that appears.

    Enter basic information

    1. Enter the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

    2. Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. If you enter a name that already exists, the name will automatically be incremented. For example, if the schema project Customer table already exists and you enter that name in this field, the name for this second schema project will automatically become Customer table 2 when you create it.

    Enable or disable schema monitoring


    Schema monitoring best practices

    Schema monitoring is a powerful tool that ensures tables are all governed by Immuta.

    • Consider using schema monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.

    When selecting the Schema/Table option, you can opt to enable by selecting the checkbox in this section.

    Note: This step will only appear if all tables within a server have been selected for creation.

    Opt to configure advanced settings

    Although not required, completing these steps will help maximize the utility of your data source. Otherwise, click Create to save the data source.

    Column detection

    This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies Data Owners of these changes.

    To enable, select the checkbox in this section.

    See the column detection documentation to learn more.

    Data source tags

    Adding tags to your data source allows users to search for the data source using the tags and allows governors to apply global policies to the data source. Note that if schema detection is enabled, any tags added now will also be added to the tables that are detected.

    To add tags,

    1. Click the Edit button in the Data Source Tags section.

    2. Begin typing in the Search by Tag Name box to select your tag, and then click Add.

    Tags can also be added after you create your data source from the data source details page on the overview tab or the data dictionary tab.

    Create the data source

    Click Create to save the data source(s).

    Server: hostname or IP address

  • Port: port configured for Azure Synapse Analytics

  • SSL: when enabled, ensures communication between Immuta and the remote database is encrypted

  • Database: the remote database

  • Select the authentication method:

    1. Username and Password:

      1. Username: The username to use to connect to the remote database and retrieve records for this data source

      2. Password: The password to use with the above username to connect to the remote database

    2. Entra ID OAuth Client Secret: The values below can be found on the overview page of the application you created in Microsoft Entra ID. Before you enter this information, ensure you have completed the prerequisites for OAuth authentication listed above.

      1. Tenant ID

      2. Client ID

      3. Client Secret: Enter the Value of the secret, not the secret ID.

  • You can then choose to enter Additional Connection String Options or Upload Certificates to connect to the database.

  • Click the Test Connection button.

  • If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section.

    By default, all schemas and tables are selected. Select and deselect by clicking the checkbox for the schemas in the Import Schemas/Tables modal. You can create multiple data sources at one time by selecting an entire schema or multiple tables.

  • After making your selection(s), click Apply.

  • When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro.

  • When selecting Schema/Table, this field is prepopulated with the recommended project name, and you can edit it freely.

  • Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

    • <Tablename>: The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

    • <Schema><Tablename>: The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    • Custom: Enter a custom template for the Data Source Name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

  • Enter the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

  • Consider using Immuta’s API to either run the schema monitoring job when your ETL process adds new tables or to register new tables directly.

  • Activate the New Column Added templated global policy to protect potentially sensitive data. This policy nulls new columns until a data owner reviews them, preventing data leaks from columns that are added without review.


    Databricks Data Source


    Deprecation notice

    Support for registering Databricks Unity Catalog data sources using this legacy workflow has been deprecated. Instead, register your data using connections.

    Requirements

    Databricks Spark integration

    When exposing a table or view from an Immuta-enabled Databricks cluster, be sure that at least one of these traits is true:

    • The user exposing the tables has READ_METADATA and SELECT permissions on the target views/tables (specifically if Table ACLs are enabled).

    • The user exposing the tables is listed in the immuta.spark.acl.allowlist configuration on the target cluster.

    • The user exposing the tables is a Databricks workspace administrator.

    Databricks Unity Catalog integration

    When registering Databricks Unity Catalog securables in Immuta, use the service principal from the integration configuration and ensure it has the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

    • USE CATALOG and MANAGE on all catalogs containing securables registered as Immuta data sources.

    • USE SCHEMA on all schemas containing securables registered as Immuta data sources.

    • MODIFY and SELECT on all securables you want registered as Immuta data sources. The MODIFY privilege is not required for materialized views registered as Immuta data sources, since MODIFY is not a supported privilege on that object type.


    MANAGE and MODIFY are required so that the service principal can apply row filters and column masks on the securable; to do so, the service principal must also have SELECT on the securable as well as USE CATALOG on its parent catalog and USE SCHEMA on its parent schema. Since privileges are inherited, you can grant the service principal the MODIFY and SELECT privilege on all catalogs or schemas containing Immuta data sources, which automatically grants the service principal the MODIFY and SELECT privilege on all current and future securables in the catalog or schema. The service principal also inherits MANAGE from the parent catalog for the purpose of applying row filters and column masks, but that privilege must be set directly on the parent catalog in order for grants to be fully applied.


    Azure Databricks Unity Catalog limitation

    Set all table-level ownership on your Unity Catalog data sources to an individual user or service principal instead of a Databricks group before proceeding. Otherwise, Immuta cannot apply data policies to the table in Unity Catalog.

    Enter connection information


    Performance recommendations

    • Register entire databases with Immuta and run jobs through the Python script provided during data source registration.

    • Use a Databricks administrator account to register data sources with Immuta using the UI or API; however, you should not test Immuta policies using a Databricks administrator account, as administrators are able to bypass controls.

    1. Navigate to the Data Sources list page and click Register Data Source.

    2. Select the Databricks tile in the Data Platform section.


    Further considerations

    • Immuta pushes down joins to be processed on the remote database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, ssl, username, and password. You will see performance degradation on joins against the same database if this information doesn't match.

    Select virtual population

    Decide how to virtually populate the data source by selecting one of the options:

    • Create sources for all tables in this database: This option will create data sources and keep them in sync for every table in the database. New tables will be automatically detected and new Immuta views will be created.

    • Schema / Table: This option will allow you to specify tables or datasets that you want Immuta to register.

      1. Opt to click Edit in the table selection box that appears.

    Enter basic information

    1. Enter the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

    2. Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. If you enter a name that already exists, the name will automatically be incremented. For example, if the schema project Customer table already exists and you enter that name in this field, the name for this second schema project will automatically become Customer table 2 when you create it.

    Enable or disable schema monitoring

    Note: This step will only appear if all tables within a server have been selected for creation.


    Schema monitoring best practices

    Schema monitoring is a powerful tool that ensures tables are all governed by Immuta.

    • Consider using schema monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.

    1. Generate your Immuta API Key from your user profile page. The Immuta API key used in the Databricks notebook job for schema detection must either belong to an Immuta admin or the user who owns the schema detection groups that are being targeted.

    2. On the data source creation page, click the checkbox to enable Schema Monitoring or Detect Column Changes.

    3. Click Download Schema Job Detection Template and then the Click Here To Download text.

    Opt to configure advanced settings

    Although not required, completing these steps will help maximize the utility of your data source. Otherwise, click Create to save the data source.

    Column detection

    This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies Data Owners of these changes.

    To enable, select the checkbox in this section.

    See the column detection documentation to learn more.

    Event time

    An Event Time column denotes the time associated with records returned from this data source. For example, if your data source contains news articles, the time that the article was published would be an appropriate Event Time column.

    1. Click the Edit button in the Event Time section.

    2. Select the column(s).

    3. Click Apply.

    Selecting an Event Time column will enable

    • more statistics to be calculated for this data source including the most recent record time, which is used for determining the freshness of the data source.

    • the creation of time-based restrictions in the policy builder.

    Latency

    1. Click Edit in the Latency section.

    2. Complete the Set Time field, and then select MINUTES, HOURS, or DAYS from the subsequent dropdown menu.

    3. Click Apply.

    This setting impacts how often Immuta checks for new values in a column that is driving row-level redaction policies. For example, if you are redacting rows based on a country column in the data, and you add a new country, it will not be seen by the Immuta policy until this period expires.

    Sensitive data discovery

    Data owners can disable identification for their data sources in this section.

    1. Click Edit in this section.

    2. Select Enabled or Disabled in the window that appears, and then click Apply.

    Data source tags

    Adding tags to your data source allows users to search for the data source using the tags and allows governors to apply global policies to the data source. Note that if schema detection is enabled, any tags added now will also be added to the tables that are detected.

    To add tags,

    1. Click the Edit button in the Data Source Tags section.

    2. Begin typing in the Search by Tag Name box to select your tag, and then click Add.

    Tags can also be added after you create your data source from the data source details page on the overview tab or the data dictionary tab.

    Create the data source

    Click Create to save the data source(s).



  • Complete the first four fields in the Connection Information box:

    • Server: hostname or IP address

    • Port: port configured for Databricks, typically port 443

    • SSL: when enabled, ensures communication between Immuta and the remote database is encrypted. Immuta recommends that all connections use SSL. Additional connection string arguments may also be provided below. Only Immuta uses the connection you provide and injects all policy controls when users query the system. Users always connect through Immuta with policies enforced and have no direct association with this connection.

    • Database: the remote database

  • Select your authentication method from the dropdown:

    • Access Token:

      1. Enter your Databricks API Token. Use a non-expiring token so that access to the data source is not lost unexpectedly.

      2. Enter the HTTP Path of your Databricks cluster or SQL warehouse.

    • OAuth machine-to-machine (M2M):

      1. Enter the HTTP Path of your Databricks cluster or SQL warehouse.

      2. Fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.cloud.databricks.com/oidc/v1/token.

  • If you are using a proxy server with Databricks, specify it in the Additional Connection String Options.

  • Click Test Connection.

  • If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section.

    By default, all schemas and tables are selected. Select and deselect by clicking the checkbox for the schemas in the Import Schemas/Tables modal. You can create multiple data sources at one time by selecting an entire schema or multiple tables.

  • After making your selection(s), click Apply.

  • When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro.

  • When selecting Schema/Table, this field is prepopulated with the recommended project name, and you can edit it freely.

  • Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

    • <Tablename>: The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

    • <Schema><Tablename>: The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    • Custom: Enter a custom template for the Data Source Name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

  • Enter the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

  • Consider using Immuta’s API either to run the schema monitoring job when your ETL process adds new tables or to register the new tables directly.

  • Activate the new column added templated global policy to protect potentially sensitive data. This policy masks new columns with NULL until a data owner reviews them, preventing data leaks from newly added columns that have not yet been reviewed.

  • Before you can run the script, follow the Databricks documentation to create the scope and secret using the Immuta API Key generated on your user profile page.

  • Import the Python script you downloaded into a Databricks workspace as a notebook. Note: The job template has commented-out lines for specifying a particular database or table. With those two lines commented out, the schema detection job will run against ALL databases and tables in Databricks. Additionally, if you need to add proxy configuration to the job template, the template uses the Python requests library, which has a simple mechanism for configuring proxies for a request.

  • Schedule the script as part of a notebook job to run as often as required. Each time the job runs, it will make an API call to Immuta to trigger schema detection queries, and these queries will run on the cluster from which the request was made. Note: Use the api_immuta cluster for this job. The job in Databricks must use an Existing All-Purpose Cluster so that Immuta can connect to it over ODBC. Job clusters do not support ODBC connections.
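    As a sketch of how the notebook job's API call and proxy configuration could fit together: the snippet below assembles the keyword arguments for a requests call to the /dataSource/detectRemoteChanges endpoint (the Immuta URL, API key, and proxy address are placeholders, not real values).

    ```python
    def build_request_kwargs(immuta_url: str, api_key: str, proxy: str) -> dict:
        """Assemble keyword arguments for requests.post(**kwargs)."""
        return {
            "url": f"{immuta_url}/dataSource/detectRemoteChanges",
            "headers": {"Authorization": f"Bearer {api_key}"},
            "json": {},                   # empty payload: scan all databases/tables
            "proxies": {"https": proxy},  # requests' proxy mechanism
        }

    kwargs = build_request_kwargs(
        "https://your-immuta-url.com", "<api-key>", "http://my.host.com:6789"
    )
    # Inside the notebook job you would then run:
    # import requests
    # requests.post(**kwargs)
    print(kwargs["url"])
    ```

    The proxies mapping is passed straight through to requests, which is the same mechanism the job template uses.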


    Fill out the Client ID. This is a combination of letters, numbers, or symbols used as a public identifier; it is the same as the service principal's application ID.

  • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

  • Enter the Client Secret. Immuta uses this secret to authenticate with the authorization server when it requests a token.
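    The OAuth machine-to-machine flow above exchanges the client ID and client secret for a token at the token endpoint. The sketch below builds that request using the standard client_credentials grant; it is an illustration only, and the workspace name, credentials, and scope are placeholders you must replace with your own values.

    ```python
    from urllib.parse import urlencode

    def build_token_request(workspace: str, client_id: str,
                            client_secret: str, scope: str) -> dict:
        """Build a client_credentials token request for the default
        Databricks token endpoint format described above."""
        return {
            "url": f"https://{workspace}.cloud.databricks.com/oidc/v1/token",
            "auth": (client_id, client_secret),  # HTTP basic auth
            "body": urlencode({"grant_type": "client_credentials",
                               "scope": scope}),
        }

    req = build_token_request("my-workspace", "<client-id>",
                              "<client-secret>", "<scope>")
    print(req["url"])
    ```

    A library such as requests could then POST req["body"] to req["url"] with req["auth"] to obtain the token.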

  • Manage Data Sources

    As a data owner, you can edit your data source settings and disable, delete, and re-enable a data source.

    For other guides related to data source members and management, see the Related guides section.

    hashtag
    Bulk edit data sources

    Data owners can bulk edit data sources.

    1. Navigate to the Data Sources page.

    2. Select the checkboxes for the data sources you want to edit. Note that when editing a connection string using bulk edit, all data sources from that connection must be selected.

    3. Select the action you want or click More Actions for additional options.

    4. Confirm your edits by following the prompts in the modals that appear.

    hashtag
    Resync policies

    hashtag
    Resync data policies

    Users can manually resync data policies for all Immuta integrations.

    1. Navigate to the Data Sources page.

    2. Select the checkboxes for the data sources you want to sync data policies for.

    3. Click More Actions and select Sync Data Policies.

    hashtag
    Resync grants and data policies

    For Databricks Unity Catalog integrations, users can manually resync both subscription and data policies through the data source health check.

    1. Navigate to the data source you would like to resync policies for.

    2. Click the health status in the top corner.

    3. Click Sync All Policies.

    To sync grants and data policies using the API, see the Immuta API documentation.

    hashtag
    Disable a data source

    Disabling a data source hides it and its data from all users except the data owner. While in this state, the data source will display as disabled in the console for the data owner and other users will not be able to see it at all.

    1. Navigate to the data source.

    2. Click on the more actions icon and select Disable.

    A label will appear next to the data source indicating it is now disabled, and a notification will be sent to all users of the data source informing them that the data source has been disabled.

    circle-info

    Disabled data sources and Immuta policies

    Disabling a data source for one of the integrations below removes subscription and data policies from that data source; policies will not be applied until the data source is re-enabled:

    • Azure Synapse Analytics

    • Databricks Unity Catalog

    • Google BigQuery

    • Redshift

    • Snowflake

    hashtag
    Enable a disabled data source

    1. Navigate to the data source.

    2. Click on the more actions icon and select Enable.

    A notification will be sent out to all users of the data source informing them that the data source has been enabled. After you enable a data source, existing policies on that data source will take effect.

    hashtag
    Delete a data source

    Deleting a data source permanently removes it from Immuta. Data sources must first be disabled before they can be deleted.

    1. Disable the data source.

    2. Navigate to the data source, click the more actions icon, and select Delete.

    3. Confirm that the data source should be deleted by clicking Delete.

    A notification will be sent out to all users of the data source informing them that the data source has been deleted.

    hashtag
    Related guides

    hashtag
    Reference guides

    For information about data sources and policies, see the following guides:

    • Data sources in Immuta overview

    • Policies in Immuta overview

    hashtag
    How-to guides

    In addition to adding and managing data source settings as outlined above, data owners can manage data source column tags, data dictionaries, policies, and members.

    • Subscribe to and manage data sources guide

    Run Schema Monitoring and Column Detection Jobs

    hashtag
    Manually Run Schema Monitoring Jobs

    hashtag
    Manually Run Schema Monitoring Job for All Data Sources

    Requirement: Immuta permission USER_ADMIN

    You can manually run a schema monitoring job globally using the /dataSource/detectRemoteChanges endpoint of the Immuta API with an empty payload.

    hashtag
    Manually Run Schema Monitoring Job as a Data Owner

    You can manually run a schema monitoring job for all data sources that you own using the /dataSource/detectRemoteChanges endpoint of the Immuta API with a payload containing the hostname for your data sources or their individual IDs.

    hashtag
    Manually Run Schema Monitoring Job as a Data User

    You can manually run a schema monitoring job for data sources you are subscribed to using the /dataSource/detectRemoteChanges endpoint of the Immuta API with a payload containing the hostname for your data source and the table name or data source ID.
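    The three cases above differ only in the payload sent to the endpoint. A sketch of the payloads follows; the hostname and table values are placeholders, and the exact field names (hostname, tableName) are assumptions based on the descriptions above rather than a verified schema.

    ```python
    import json

    # Global run (requires USER_ADMIN): an empty payload.
    global_payload = {}

    # Data owner: all owned data sources on a connection, by hostname.
    owner_payload = {"hostname": "example.us-east-1.redshift.amazonaws.com"}

    # Data user: a single table on the connection (field name assumed).
    user_payload = {"hostname": "example.us-east-1.redshift.amazonaws.com",
                    "tableName": "public.claims"}

    for payload in (global_payload, owner_payload, user_payload):
        print(json.dumps(payload))
    ```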

    hashtag
    Manually Run a Column Detection Job

    1. Navigate to the data source overview page.

    2. Click on the health check icon.

    3. Scroll to Column Detection, and click Trigger Detection.


    Bulk Create Snowflake Data Sources

    circle-info

    Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

    hashtag
    Requirements

    • Snowflake Enterprise Edition

    • Snowflake X-Large or Large warehouse is strongly recommended

    hashtag
    Create Snowflake data sources

    Make a request to the Immuta V2 API create data source endpoint, as the Immuta UI does not support creating more than 1000 data sources. The following options must be specified in your request to ensure the maximum performance benefits of bulk data source creation. The Skip Stats Job tag is only required if you are using specific policies that require stats; otherwise, Snowflake data sources automatically skip the stats job.

    Specifying disableSensitiveDataDiscovery as true ensures that identification will not be applied when the new data sources are created in Immuta, regardless of how it is configured for the Immuta tenant. Disabling identification improves performance during data source creation.

    Applying the Skip Stats Job tag using the tableTags value will ensure that some jobs that are not vital to data source creation are skipped, specifically the fingerprint and high cardinality check jobs.

    When the Snowflake bulk data source creation feature is configured, the create data source endpoint operates asynchronously and responds immediately with a bulkId that can be used for monitoring progress.

    hashtag
    Monitor progress

    To monitor the progress of the background jobs for the bulk data source creation, make the following request using the bulkId from the response of the previous step:

    The response will contain a list of job states and the number of jobs currently in each state. If errors were encountered during processing, a list of errors will be included in the response:

    With these recommended configurations, bulk creating 100,000 Snowflake data sources will take between six and seven hours for all associated jobs to complete.

    ```json
    "options": {
        "disableSensitiveDataDiscovery": true,
        "tableTags": [
            "Skip Stats Job"
        ]
    }
    ```
    curl \
        --request GET \
        --header "Content-Type: application/json" \
        --header "Authorization: Bearer dea464c07bd07300095caa8" \
        https://your-immuta-url.com/jobs?bulkId=<your-bulkId>
        {
          "total":"99893",
          "completed":"99892",
          "failed":"0",
          "pending":"1",
          "errors":null
        }
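    As a sketch, the options fragment above can be assembled programmatically before it is merged into the create-data-source payload, and the polling URL follows the /jobs?bulkId= format shown in the monitoring request. The base URL and bulkId below are placeholders.

    ```python
    import json

    def bulk_create_options() -> dict:
        """Options that skip identification and the stats jobs, as the
        bulk-create instructions above require."""
        return {
            "disableSensitiveDataDiscovery": True,
            "tableTags": ["Skip Stats Job"],
        }

    def monitor_url(base_url: str, bulk_id: str) -> str:
        """URL for polling background-job progress with the returned bulkId."""
        return f"{base_url}/jobs?bulkId={bulk_id}"

    payload_fragment = {"options": bulk_create_options()}
    print(json.dumps(payload_fragment))
    print(monitor_url("https://your-immuta-url.com", "<your-bulkId>"))
    ```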

    Register Data Sources

    When a data source is exposed, policies are dynamically enforced on the data, appropriately redacting and masking information depending on the attributes or groups of the user accessing the data. Once the data source is exposed and subscribed to, the data can be accessed in a consistent manner, allowing reproducibility and collaboration.

    This section includes how-to guides for registering data sources in Immuta:

    • Amazon Redshift Spectrum data source

    • Amazon S3 data source

    • Azure Synapse Analytics data source

    • Databricks data source

    • Google BigQuery data source

    • Snowflake data source

    • Starburst (Trino) data source

    Manage Data Dictionary Descriptions

    The data dictionary provides information about the columns within the data source, including column names and value types.

    As a data owner, you can manage data dictionary descriptions and column tags. For other guides related to the data dictionary, see the Related guides section.

    hashtag
    Manage data dictionary descriptions

    1. Navigate to the Data Dictionary tab.

    2. To add or edit column descriptions, click the menu icon in the Actions column next to the entry you want to change and select Edit.

    3. Complete the fields in the form that appears, and then click Save.

    hashtag
    Related guides

    hashtag
    Reference guide

    For information about the data dictionary, see the Data sources in Immuta overview guide.

    hashtag
    How-to guide

    In addition to managing data dictionary descriptions as outlined above, data owners or experts can also manage column tags.


    Schema Projects

    Schema projects are automatically created and managed by Immuta. They group all the data sources of the schema, and when new data sources are created, manually or with schema monitoring, they are automatically added to the schema project. They work as a tool to organize all the data sources within a schema, which is particularly helpful with schema monitoring enabled.

    Schema projects are created when tables are registered as data sources in Immuta. The user creating the data source does not need the CREATE_PROJECT permission for the project to be auto-created, because the owner cannot add data sources to the project themselves; instead, new data sources are managed by Immuta. The user can manage Subscription policies for schema projects, but they cannot apply Data policies or purposes to them.

    The schema settings, such as schema evolution and connection information, can be edited from the project overview tab. Note: Deleting the project will delete all of the data sources within it as well.

    hashtag
    Schema Project Actions

    Schema settings are edited from the project overview tab:

    • Schema Project Connection Details: Editing these details will update them for all the data sources within the schema project.

    • Data Source Naming Convention: When schema monitoring is enabled, new data sources will be automatically detected and added to the schema project. Updating the naming convention will change how these newly detected data sources are named by Immuta.

    • Schema Detection Owner: When schema monitoring is enabled, a user is assigned to be the owner of any detected and Immuta-created data source.

    • Disable or delete your schema project: Deleting the project will delete all of the data sources within it as well.


    How-to Guides

    Data Source Health Checks Reference Guide

    When an Immuta data source is created, background jobs use the connection information provided to compute health checks, which depend on the type of data source created and how it was configured. These data source health checks include the following:

    • blob crawl status: indicates whether the blob was successfully crawled.

    • column detection status: indicates whether the job run to determine if a column was added or removed from the remote table registered as an Immuta data source was successful.

    • external catalog link status: indicates whether or not the external catalog was successfully linked to the data source.

    • fingerprint generation status: indicates whether or not the data source fingerprint was successfully generated. Fingerprints are only available for Snowflake data sources.

    • framework classification status: indicates whether classification was successfully run on the data source to determine the sensitivity of the data source.

    • global policy applied status: indicates whether global policies were successfully applied to the data source.

    • high cardinality calculation status: indicates whether the data source's high cardinality column was successfully calculated.

    • SQL sync status (for Snowflake data sources): indicates whether Snowflake governance policies have been successfully synced.

    • SQL view creation status (for Amazon Redshift Spectrum, Azure Synapse Analytics, or Google BigQuery data sources): indicates whether views were properly created for tables registered in Immuta.

    • row count status: indicates whether the number of rows in the data source was successfully calculated.

    • schema detection status: indicates whether the job run to determine if a remote table was added or removed from the schema was successful.

    • sensitive data discovery status: indicates whether identification was successfully run on the data source.

    After these jobs complete, the health status for each is updated to indicate whether the status check passed, was skipped, is unknown, or failed.

    These background jobs can be disabled during data source creation by adding a specific tag to prevent automatic table statistics. This prevent statistics tag can be set on the app settings page by a system administrator. However, with automatic table statistics disabled, these policies will be unavailable until the data source owner manually generates the fingerprint (available only for Snowflake data sources):

    • Masking with format preserving masking

    • Masking using randomized response

    hashtag
    Unhealthy Databricks data sources

    Unhealthy data sources may fail their row count queries if they run against a cluster that has the Databricks query watchdog enabled.

    hashtag
    Limitations

    Data sources with over 1600 columns will not have health checks run, but will still appear as healthy. The health check cannot be run automatically or manually.


    How-to Guides

    Why Use Schema Monitoring Concept Guide

    Immuta is a live metadata aggregator - metadata about your data and your users. With data metadata specifically, Immuta can monitor changes in your database and reflect those changes in your Immuta tenant through schema monitoring.

    When schema monitoring is enabled, Immuta monitors your organization's servers to identify when new tables or columns are created or deleted, and automatically registers (or disables) those tables in Immuta. The newly updated data sources then have global policies and tags applied to them, and the Immuta data dictionary is updated with column changes.

    Schema monitoring keeps Immuta in sync with your data environment, helping you remain compliant without having to manually update individual data sources.

    hashtag
    Anti-patterns: Using Immuta without schema monitoring

    Without schema monitoring, data owners have to manually add and remove Immuta data sources when users add or remove tables from databases in their data platforms. At worst, data owners are not aware of these changes; at best they are aware of the changes and have to manually update Immuta with those changes, which is a time-consuming, error-prone process.

    Beyond draining data owners' time, manually updating data sources to reflect the state of the data platform also complicates the process: not only must they understand when a new table is present, but they then must remember to tag it and protect it appropriately. This leaves organizations ripe for data leaks as new data is created across the business, perhaps daily.

    Schema monitoring, by contrast, is scalable and accounts for the evolution of your schemas and policies. Instead of manually managing access to these tables or adding and removing data sources, you are empowered to register a schema, create policies, and allow Immuta to manage those policies and changes to your schema for you to keep your data in sync and restrict access appropriately.

    hashtag
    Business value

    Both monitoring for new data and discovering and tagging sensitive data align with the concepts of scalability and evolvability, removing redundant and arduous work. Once tables are registered and tagged, policies can immediately be applied - this means humans can be completely removed from the process by creating tag-based policies that dynamically apply themselves to new tables.

    Then, your business reaps the following benefits:

    • Increased revenue: Accelerate data access and time-to-data access because where sensitive data lives is well understood.

    • Decreased cost: Operate efficiently and move with agility at scale.

    • Decreased risk: Discover and protect sensitive data immediately.

    hashtag
    What features does it pair with?

    Schema monitoring pairs with the following features:

    • Column detection: Column detection identifies when a column has been added to or removed from a table and adds or removes that column from the data source in Immuta.

    • New column added templated global policy: When paired with column detection or schema monitoring, this policy locks down access to those newly added columns and tables to prevent data leaks.

    • Identification: When the tables are discovered through the registration process, Immuta evaluates the table data for sensitive information and tags it as such. These tags are critical for scaling tag-based policies.

    • Global data and subscription policies: Global data and subscription policies can be created using tags so that they immediately enforce appropriate access restrictions on tables and columns when they are added.


    Data Source Settings

    Once a data source is created, the data owner can manage data source policies, members, data dictionary, and tags.

    The reference and how-to guides in this section cover topics related to managing existing data sources.

    hashtag
    How-to guides

    • Manage data sources: Edit data source settings or disable and delete a data source.

    • Manage data source members: Add, remove, or modify users on a data source.

    • Manage data source access requests: Approve or deny subscription requests on a data source.

    • Data dictionary: Manage the data dictionary descriptions and tags.

    • Disable data sampling: Disable metadata collection that requires sampling data.

    hashtag
    Reference guide

    Data source health checks: This reference guide defines the data source health check jobs that are run when a data source is created.


    Google BigQuery Data Source

    circle-info

    Public preview: The Google BigQuery integration is available to all accounts.

    hashtag
    Requirements

    • CREATE_DATA_SOURCE Immuta permission

    • Google BigQuery roles:

      • roles/bigquery.metadataViewer on the source table (if managed at that level) or dataset

      • roles/bigquery.dataViewer (or higher) on the source table (if managed at that level) or dataset

      • roles/bigquery.jobUser on the project

    hashtag
    Prerequisite

    hashtag
    Create a Google Cloud service account for creating Google BigQuery data sources

    Google BigQuery data sources in Immuta must be created using a Google Cloud service account rather than a Google Cloud user account. If you do not currently have a service account for the Google Cloud project separate from the Google Cloud service account you created when configuring the Google BigQuery integration, you must create a Google Cloud service account with privileges to view and run queries against the tables you are protecting.

    You have two options to create the required Google Cloud service account:

    • Create a service account by using Google Cloud Console.

    • Create a service account by using gcloud.

    hashtag
    Create a service account using the Google Cloud web console

    1. Using the Google Cloud console, create a service account with the following roles:

      • BigQuery User

      • BigQuery Data Viewer

    hashtag
    Create a service account using gcloud

    1. Copy the script below and update the SERVICE_ACCOUNT, PROJECT_ID, and IMMUTA_GCP_KEY_FILE values.

      • SERVICE_ACCOUNT is the name for the new service account.

      • PROJECT_ID is the project ID for the Google Cloud Project that is integrated with Immuta.

      • IMMUTA_GCP_KEY_FILE is the path to a new output file for the private key.

    hashtag
    Register data sources in Immuta

    circle-info

    Required Google BigQuery roles

    Ensure that the user creating the data source has these Google BigQuery roles:

    • roles/bigquery.metadataViewer on the source table (if managed at that level) or dataset

    • roles/bigquery.dataViewer (or higher) on the source table (if managed at that level) or dataset

    • roles/bigquery.jobUser on the project

    1. Navigate to the Data Sources list page.

    2. Click Register Data Source.

    3. Select the Google BigQuery tile in the Data Platform section.

    hashtag
    Next steps

    With data sources registered in Immuta, your organization can now start

    • building global subscription and data policies to govern data.

    • creating projects to collaborate.

    Using the Google Cloud documentation, generate a service account key for the account you just created.

  • Use the script below in the gcloud command line. This script is a template; change values as necessary:

  • Complete these fields in the Connection Information box:
    • Account Email Address: Enter the email address of a user with access to the dataset and tables. This is the account created in the Google BigQuery configuration guide.

    • Project: Enter the name of the project that has been integrated with Immuta.

    • Dataset: Enter the name of the dataset with the tables you want Immuta to ingest.

  • Upload a BigQuery Key File in the modal. Note that the account in the key file must match the account email address entered in the previous step.

  • Click the Test Connection button. If the connection is successful, a check mark and successful connection notification will appear and you will be able to proceed. If an error occurs when attempting to connect, the error will be displayed in the UI. In order to proceed to the next step of data source creation, you must be able to connect to this data source using the connection information that you just entered.

  • Decide how to virtually populate the data source by selecting one of the options:

    • Create sources for all tables in this database: This option will create data sources and keep them in sync for every table in the dataset. New tables will be automatically detected and new Immuta views will be created.

    • Schema / Table: This option will allow you to specify tables or datasets that you want Immuta to register.

  • Provide basic information about your data source to make it discoverable to users.

    • Enter the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. For BigQuery the schema will be the BigQuery dataset. The format must include a schema macro, but you may personalize it using lowercase letters, numbers, and underscores. It can have up to 255 characters.

    • Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. This is an Immuta project that will hold all of the metadata for the tables in a single dataset.

      • When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro to represent the dataset name.

      • When selecting Schema/Table, this field is pre-populated with the recommended project name and you can edit freely.

    • Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

      • <Tablename>: The Immuta data source will have the same name as the original table.

      • <Schema><Tablename>: The Immuta data source will have both the dataset and original table name.

      • Custom: This is a template you create to make the data source name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

    • Enter the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

  • When selecting the Schema/Table option, you can opt to enable schema monitoring by selecting the checkbox in this section. This step will only appear if all tables within a server have been selected for creation.

  • Optional Advanced Settings:

    • Column Detection: To enable, select the checkbox in this section. This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies data owners of these changes. See schema projects overview to learn more about column detection.

    • Data Source Tags: Adding tags to your data source allows users to search for the data source using the tags and governors to apply global policies to the data source. Note that if schema detection is enabled, any tags added now will also be added to the tables that are detected.

      • Click the Edit button in the Data Source Tags section.

      • Begin typing in the Search by Tag Name box to select your tag, and then click Add.

  • Click Create to save the data source(s).
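    The macro casing rules described above can be illustrated with a small hypothetical helper; it is not part of Immuta, just a sketch of the documented behavior, simplified to capitalize only the first letter of the substituted value.

    ```python
    import re

    def expand_name_template(template: str, schema: str, table: str) -> str:
        """Expand <schema>/<tablename> macros; the casing of the macro
        sets the casing of the substituted value."""
        values = {"schema": schema, "tablename": table}

        def substitute(match):
            macro = match.group(1)
            value = values[macro.lower()]
            if macro.isupper():          # <TABLENAME> -> "ORDERS"
                return value.upper()
            if macro[0].isupper():       # <Tablename> -> "Orders"
                return value.capitalize()
            return value.lower()         # <tablename> -> "orders"

        return re.sub(r"<(schema|tablename)>", substitute, template,
                      flags=re.IGNORECASE)

    print(expand_name_template("<Schema>_<tablename>", "Sales", "Orders"))
    ```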

  • Configure the Google BigQuery integration
    # Fill these out
    # Please use .json extension for key
    export SERVICE_ACCOUNT=datasource-account
    export PROJECT_ID=project123
    export IMMUTA_GCP_KEY_FILE=~/GCP_${SERVICE_ACCOUNT}_key.json
    
    # Create service account for creating data sources
    gcloud iam service-accounts create ${SERVICE_ACCOUNT} --project ${PROJECT_ID}
    
    # Generate keyfile
    gcloud iam service-accounts keys create ${IMMUTA_GCP_KEY_FILE} --iam-account=${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
    
    # Allow account to execute queries
    #gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    #--member="serviceAccount:${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com" --role=projects/${PROJECT_ID}/roles/bigquery.user
    gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com" --role=roles/bigquery.user
    
    # Allow account to view data
    gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com" --role=roles/bigquery.dataViewer
    
    echo if something went wrong and you want to delete the service account, run:
    echo gcloud iam service-accounts delete ${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com --project ${PROJECT_ID}

  • Amazon Redshift Spectrum Data Source

    You can create an Amazon Redshift Spectrum data source using the Immuta CLI or the Immuta V2 API.

    hashtag
    Requirements

    • The enable_case_sensitive_identifier parameter must be set to false (default setting) for your Redshift cluster.

    • CREATE_DATA_SOURCE Immuta permission

    • The Redshift user registering data sources must have the following privileges on all securables:

      • USAGE on all schemas with registered data sources

      • SELECT on all tables within those schemas

    hashtag
    Create a data source using the Immuta CLI

    1. Copy the snippet above and save it as a YAML file. Replace the configuration values with your own, where

      1. hostname is the URL of your Redshift account.

      2. database is the name of the that the Immuta system user will manage and store metadata in.

    For additional configuration options, see the connection object details for Redshift Spectrum data sources on the .

    circle-exclamation

    Avoid schema name conflicts

    Your nativeSchemaFormat must contain _immuta to avoid schema name conflicts.

    hashtag
    Create a data source using the Immuta V2 API

    1. Copy the request example. The example provided uses JSON format, but the request also accepts YAML.

    2. Replace the Immuta URL and with your own.

    3. Change the config values to your own, where

    circle-exclamation

    Avoid schema name conflicts

    Your nativeSchemaFormat must contain _immuta to avoid schema name conflicts.

    hashtag
    Path parameters

    The endpoint accepts the optional dryRun and wait parameters described below.

    hashtag
    Body parameters

    The request accepts a JSON or YAML payload with the parameters outlined below.

      3. username and password are the credentials for the system account that can act on Redshift objects and configure the integration.

      4. schema is the name of the external schema in the database.

  • Run immuta datasource save <filepath> [--wait int] [--dryRun], referencing the file you just created. The options you can specify include

    • -d or --dryRun: No updates will actually be made to the data source(s).

    • -h or --help: Get more information about the command.

    • -w or --wait int: Specify how long to wait for data source creation.

  • hostname is the URL of your Redshift account.

  • database is the name of the existing or new database that the Immuta system user will manage and store metadata in.

  • username and password are the credentials for the system account that can act on Redshift objects and configure the integration.

  • schema is the name of the external schema in the database.

  • owners (object): Specify owners for all data sources created. See the owners object description for attribute details. Optional.

  • sources (array): Configure which data sources are created. If not provided, all objects from the given connection will be created. See the sources array description for attribute details. Optional.

  • dryRun (boolean): If true, no updates will actually be made. Optional; default false.

  • wait (number): The number of seconds to wait for data sources to be created before returning. Anything less than 0 will wait indefinitely. Optional; default 0.

  • connectionKey (string): A key/name to uniquely identify this collection of data sources. Required.

  • connection (object): Connection information. See the connection object description for attribute details. Required.

  • nameTemplate (object): A template to override naming conventions. If not provided, system defaults will be used. See the nameTemplate object description for attribute details. Optional.

  • options (object): Override options for these data sources. If not provided, system defaults will be used. See the options object description for attribute details. Optional.

    connectionKey: redshift
    connection:
      hostname: your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com
      port: 5439
      ssl: true
      database: your_database_with_external_schema
      username: awsuser
      password: your_password
      handler: Redshift
      schema: external_schema
    nameTemplate:
      dataSourceFormat: <Tablename>
      schemaFormat: <schema>
      tableFormat: <tablename>
      schemaProjectNameFormat: <Schema>
      nativeSchemaFormat: <schema>_immuta
      nativeViewFormat: <tablename>
    sources:
      - all: true
    curl -X 'POST' \
      'https://www.organization.immuta.com/api/v2/data' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer b64dbdcd29e24ae88a5b3ce0507df019' \
      -d '{
      "connectionKey": "redshift",
      "connection": {
          "hostname": "your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com",
          "port": "5439",
          "ssl": true,
          "database": "your_database_with_external_schema",
          "username": "awsuser",
          "password": "your_password",
          "handler": "Redshift",
          "schema": "external_schema"
      },
      "nameTemplate": {
          "dataSourceFormat": "<Tablename>",
          "schemaFormat": "<schema>",
          "tableFormat": "<tablename>",
          "schemaProjectNameFormat": "<Schema>",
          "nativeSchemaFormat": "<schema>_immuta",
          "nativeViewFormat": "<tablename>"
      },
      "sources": [
        {
          "all": true
        }
      ]
    }'
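The same request body can be assembled programmatically before sending it with any HTTP client. This is a sketch mirroring the curl example above; the hostname, credentials, and API key are the same placeholders, and `build_redshift_spectrum_payload` is a hypothetical helper, not part of any Immuta SDK.

```python
import json

def build_redshift_spectrum_payload(hostname, database, username, password,
                                    schema, connection_key="redshift"):
    """Assemble the V2 /api/v2/data payload shown in the curl example."""
    return {
        "connectionKey": connection_key,
        "connection": {
            "hostname": hostname,
            "port": "5439",
            "ssl": True,
            "database": database,
            "username": username,
            "password": password,
            "handler": "Redshift",
            "schema": schema,
        },
        "nameTemplate": {
            "dataSourceFormat": "<Tablename>",
            "schemaFormat": "<schema>",
            "tableFormat": "<tablename>",
            "schemaProjectNameFormat": "<Schema>",
            "nativeSchemaFormat": "<schema>_immuta",  # must contain _immuta
            "nativeViewFormat": "<tablename>",
        },
        "sources": [{"all": True}],
    }

body = json.dumps(build_redshift_spectrum_payload(
    "your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com",
    "your_database_with_external_schema", "awsuser", "your_password",
    "external_schema"))
# POST `body` to https://<your-immuta>/api/v2/data with your
# Authorization: Bearer <API key> header, e.g. via requests.post(...).
```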

    Reference Guides

    Schema Monitoring

    With schema monitoring enabled, Immuta monitors your organization's servers to find when new tables or columns are created or deleted and automatically registers (or disables) those tables in Immuta.

    hashtag
    How-to guides

    • Manage schema monitoring: Edit connection information, schema project owner, or the naming conventions of data registered in the schema.

    • Run schema monitoring jobs: Manually trigger schema monitoring.

    hashtag
    Reference guides

    • Schema monitoring: This reference guide describes the design and components of schema monitoring.

    • Schema projects: This reference guide describes schema projects, which group all the data sources of a schema.

    hashtag
    Concept guide

    Why use schema monitoring?: This explanatory guide provides a conceptual overview of schema monitoring. It offers a discussion of the benefits of the feature, context for why it was developed, and insights into the features schema monitoring pairs with. This guide is designed to deepen your understanding of schema monitoring's purpose as you implement it.


    Registering Metadata

    A data source is how data owners expose their data across their organization to other Immuta users. Throughout this process, the data is not copied. Instead, Immuta uses metadata from the data source to determine how to expose the data. An Immuta data source is a virtual representation of data that exists in a remote data platform.

    This section includes reference and how-to guides for registering and managing data sources.

    hashtag
    Data sources in Immuta

    This reference guide describes Immuta data sources and their major components.

    hashtag
    Register data sources

    These how-to guides illustrate how to register data in Immuta.

    hashtag
    Data source settings

    The guides in this section illustrate how to manage and edit data sources and data dictionaries.

    hashtag
    Schema monitoring

    The reference and how-to guides in this section describe schema monitoring and illustrate how to configure it for your integration.


    Manage Schema Monitoring

    hashtag
    Edit schema project connection

    Requirement: Must be an owner of the schema project

    1. Navigate to the Project Overview tab.

    2. Click Edit Connection.

    3. Use the modal to edit the connection information or column detection.

    4. Click Save.

    hashtag
    Edit schema monitoring

    Requirement: Must be an owner of the schema project

    1. Navigate to the Project Overview tab.

    2. Click Edit Schema Monitoring.

    3. Use the modal to make any necessary changes:

    Basic Information: Use this section to edit the format that data sources will be named with when a new table is discovered in your platform by schema monitoring.
    1. Opt to edit the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

    2. The Schema Project Name Format cannot be changed in the schema monitoring settings.

    3. Opt to edit the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

      • <Tablename>: The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

      • <Schema><Tablename>: The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    4. Opt to edit the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

  • Schema Monitoring:

    1. Select the toggle to enable or disable schema monitoring.

    2. If schema monitoring is enabled, there must be a schema detection owner. Use the dropdown menu in the modal to select a new schema detection owner. The new owner must be an owner of one or more of the data sources belonging to that schema.

  • Click Save.

  • Custom: Enter a custom template for the data source name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").
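The macro casing rule described above can be sketched as a small helper. This is illustrative only, assuming the three documented macro spellings; `apply_name_template` is a hypothetical name, not Immuta's implementation.

```python
import re

def apply_name_template(template: str, tablename: str) -> str:
    """Render a name template; the case of the tablename macro decides
    the case of the substituted name, per the documented rule."""
    def substitute(match):
        macro = match.group(0)
        if macro == "<TABLENAME>":
            return tablename.upper()
        if macro == "<tablename>":
            return tablename.lower()
        return tablename.title()  # <Tablename>
    return re.sub(r"<tablename>", substitute, template, flags=re.IGNORECASE)

# "<Tablename>" title-cases, "<tablename>" lower-cases,
# "<TABLENAME>" upper-cases the remote table name.
```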

    Schema Monitoring

    Schema monitoring allows organizations to monitor their data environments. When it is enabled, Immuta monitors the organization's servers to detect when new tables or columns are created or deleted and automatically registers (or disables) those tables in Immuta. These newly updated data sources will then have any global policies and tags that are set in Immuta applied to them. The Immuta data dictionary will be updated with any column changes, keeping the Immuta environment in sync with the organization's data environment. This automated process helps organizations stay compliant without having to manually keep data sources up to date.

    Schema monitoring is enabled while creating or editing a data source and only registers new tables and columns within known schemas. It does not register new schemas. Data owners or governors can edit the naming convention for newly detected data sources and the schema detection owner from the schema project page after it has been enabled.

    See the Register a data source guides for instructions on enabling schema monitoring or Manage schema monitoring for instructions on editing the schema monitoring settings.

    hashtag
    Column detection

    Column detection is a part of schema monitoring, but can also be enabled on its own to detect the column changes of a select group of tables. Column detection monitors when columns are added or removed from a table and when column types are changed and updates those changes in the appropriate Immuta data source's data dictionary.

    See one of the Register a data source guides for instructions on enabling column detection.

    hashtag
    Tracking new data sources and columns

    When new data sources and columns are detected and added to Immuta, or when column types have changed, they will always automatically be tagged with the New tag. This allows governors to use the seeded New Column Added global policy to mask columns with the New tag, since they could contain sensitive data.

    The New Column Added global policy is staged (inactive) by default.

    See the Clone, activate, or stage a global policy guide to activate this seeded global policy if you want any columns with the New tag to be automatically masked.

    hashtag
    Data source requests

    When schema monitoring is enabled and there is an active policy that targets the New tag, Immuta sends validation requests to data owners for the following changes made in the remote data platform:

    • Column added: Immuta applies the New tag on the column that has been added and sends a request to the data owner to validate if the new column contains sensitive data. Once the data owner confirms they have validated the content of the column, Immuta removes the New tag from it and as a result any policy that targets the New column tag no longer applies.

    • Column data type changed: Immuta applies the New tag on the column where the data type has been changed and sends a request to the data owner to validate if the column contains sensitive data. Once the data owner confirms they have validated the content of the column, Immuta removes the New tag from it and, as a result, any policy that targets the New column tag no longer applies.

    For instructions on how to view and manage your assigned tasks in the Immuta UI, see the Manage data source requests guide. To view and manage your assigned tasks via the Immuta API, see the Manage data source requests section of the API documentation.

    hashtag
    Workflow

    1. An Immuta user registers a data source with schema monitoring enabled.

    2. Every 24 hours, at 12:30 a.m. UTC by default, Immuta checks the servers for any changes to tables and columns.

    3. If Immuta finds a change, it will update the appropriate Immuta data source or column:

    To run schema monitoring or column detection manually, see the Run schema monitoring and column detection jobs page.

    hashtag
    Schedule

    The default schedule for schema monitoring to run is every 24 hours. Some organizations may need to schedule it to run more often; however, this needs careful consideration as it can impact performance and compute costs.

    hashtag
    Schema monitoring best practices

    • Manually trigger schema monitoring (filtered down to the database) after your dbt or other transform workflows run. For more information, see the dbt and transform workflow for limited policy downtime guide.

    • When manually triggering schema monitoring, specify a table or database for maximum performance efficiency and to reduce data or policy downtime. For more information on triggering schema monitoring, see the Manually run schema monitoring guide.

    • If you are manually managing data tags, activate the "New Column Added" global policy to protect newly found and potentially sensitive data. This policy sets all columns with the New tag to NULL until a data owner reviews and validates their content. Using this workflow protects your data and avoids data leaks on new columns being automatically added.

  • Column deleted: Immuta deletes the column from the data source's data dictionary in Immuta. Then, Immuta sends a request to the data owner to validate the deleted column.

  • Data source created: Immuta applies the New tag on the data source that has been newly created and sends a request to the data owner to validate if the new data source contains sensitive data. Once the data owner confirms they have validated the content of the data source, Immuta removes the New tag from it and as a result any policy that targets the New data source tag no longer applies.

  • If Immuta finds a new table, then Immuta creates an Immuta data source for that table and tags it New.
  • If Immuta finds a table has been deleted, then Immuta disables that table's data source.

  • If Immuta finds a previously deleted table has been re-created, then Immuta restores that table's data source and tags it New.

  • If Immuta finds that the backing object type of a data source has been changed (for example, from a TABLE to a VIEW) in Snowflake or Databricks Unity Catalog, Immuta will reapply existing policies on the data source. Note that because of policy limitations on Unity Catalog views, changing a Databricks Unity Catalog object type from a table to a view could result in some types of data policies being removed. See the Databricks Unity Catalog integration reference guide for a list of data policies that are not supported for views.

  • If Immuta finds a new column within a table, then Immuta adds that column to the data dictionary and tags it New.

  • If Immuta finds a column has been deleted, then Immuta deletes that column from the data dictionary.

  • If Immuta finds a column type has changed, then Immuta updates the column type in the data dictionary and tags it New.

  • Active policies that target the New data source or column tag will be applied until a data owner validates the changes.
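The detection outcomes above amount to diffing the remote schema between two monitoring runs. The sketch below is illustrative only, assuming tables are represented as a mapping of table name to column list; it is not Immuta's implementation.

```python
def diff_schema(previous: dict, current: dict):
    """Compare a schema's tables/columns between two monitoring runs and
    decide what to register, disable, or tag New (illustrative sketch)."""
    changes = {"new_tables": [], "disabled_tables": [], "new_columns": []}
    for table, columns in current.items():
        if table not in previous:
            changes["new_tables"].append(table)        # register + tag New
        else:
            for col in columns:
                if col not in previous[table]:
                    changes["new_columns"].append((table, col))  # tag New
    for table in previous:
        if table not in current:
            changes["disabled_tables"].append(table)   # disable data source
    return changes

before = {"orders": ["id", "total"], "users": ["id"]}
after = {"orders": ["id", "total", "coupon"], "invoices": ["id"]}
```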

    This recommendation is unnecessary for users leveraging identification or using an external data catalog.

    Data Sources in Immuta

    Data owners expose their data across their organization to other users by registering that data in Immuta as a data source. Data sources are collections of metadata about your tables or data objects and allow for Immuta actions like the following:

    • Apply tags to data sources to enforce access controls

    • Apply data policies to a data source's columns

    • Restrict the users who can query a data source with subscription policies

    • Gather your data sources into various domains for delegation

    • Publish data products containing your data sources in the Request app

    When data is registered, Immuta does not affect existing policies on those tables in the remote system for non-Immuta users, so users who had access to a table before it was registered can still access that data without interruption. However, this behavior differs for Immuta users on an integration-by-integration basis, so see the integration reference guides for more details.

    hashtag
    Integrations

    For policies to properly apply to data sources, there must be an integration configured in Immuta, and that integration's connection details must match the data source's connection details. This allows Immuta to natively enforce policies on that table in your data platform.

    • Connections combine integration configuration and data source registration, ensuring details match.

    • For all other technologies, integration configuration and data source registration happen separately.

    hashtag
    Connections data platforms

    Use connections to create the integration and data objects with the same credentials. Then, to create the data sources, enable the data object for your tables, views, and other objects.

    hashtag
    Non-connection data platforms

    For all other technologies, configure your integration and then create data sources. Ensure that the host, port, and other integration details match the data source details so that policies will properly apply. These platforms include MariaDB, Oracle, PostgreSQL, Snowflake, SQL Server, Teradata, Databricks, Google BigQuery, Starburst, Amazon Redshift Spectrum, Amazon S3, and Azure Synapse Analytics.

    hashtag
    Data sources with nested columns

    You can create Databricks data sources with nested columns when you enable complex data types. When complex types are enabled, Databricks data sources can have columns that are arrays, maps, or structs that can be nested. These columns get parsed into a nested data dictionary.

    hashtag
    Data source user roles

    There are various roles users and groups can play relating to each data source. These roles are managed through the members tab of the data source. Roles include the following types:

    • Owners: Those who create and manage new data sources and their users, documentation, and data dictionaries.

    • Subscribers: Those who have access to the data source data. With the appropriate data accesses and attributes, these users and groups can view files, run queries, and generate analytics against the data source data. All users and groups granted access to a data source have subscriber status.

    • Experts: Those who are knowledgeable about the data source data and can elaborate on it. They are responsible for managing the data source's documentation, tags, and descriptions.

    See Manage data source members for a tutorial on modifying user roles.

    hashtag
    Data dictionary

    The data dictionary provides information about the columns within the data source, including column names and value types.

    Dictionary columns are automatically generated when the data source is created. However, data owners and experts can tag columns in the data dictionary and add descriptions to these entries.

    hashtag
    Data dictionary column icons

    The data dictionary displays icons on columns that have a masking policy applied to them. The appearance of these icons varies depending on the permission of the user.

    Governors and data owners

    If you have the GOVERNANCE permission or are the data source owner, the data dictionary column icons will appear in these ways:

    • No icon: No masking policy applies to the column.

    • Yellow eye: A masking policy applies to the column, but the column is unmasked for the current user because they meet the exception criteria for the policy.

    • Red eye: A policy on the column masks it for the current user.

    All other users

    The data dictionary column icons will appear in these ways for all other users:

    • No icon: Either no masking policy applies to the column or a masking policy applies to the column, but the column is unmasked for the current user because they meet the exception criteria for the policy.

    • Red eye: A policy on the column masks it for the current user.

    hashtag
    Audit

    The following events related to data sources are audited and can be found on the audit page in the UI:

    • DatasourceCreated: A data source is created.

    • DatasourceDeleted: A data source is deleted.

    • DatasourceDisabled: A data source is disabled.

  • DatasourceUpdated: A data source is updated.

  • DatasourceAppliedToProject: A data source is added to a project.

  • DatasourceRemovedFromProject: A data source is removed from a project.

  • DatasourceCatalogSynced: An external catalog is linked and synced for the data source.

  • DatasourceGlobalPolicyApplied: A global policy is applied to a data source.

  • DatasourceGlobalPolicyConflictResolved: A policy conflict between two global policies on a data source is resolved.

  • DatasourceGlobalPolicyDisabled: A global policy is disabled on a data source.

  • DatasourceGlobalPolicyRemoved: A global policy is removed from a data source.

  • LocalPolicyCreated: A local policy is created on a data source.

  • LocalPolicyUpdated: A local policy is updated on a data source.

  • SubscriptionCreated: A user is subscribed to a data source or project.

  • SubscriptionDeleted: A user's subscription to a data source or project is removed.

  • SubscriptionRequestApproved: A user's request to subscribe to a data source or project is approved.

  • SubscriptionRequestDenied: A user's request to subscribe to a data source or project is denied.

  • SubscriptionRequested: A user requests to subscribe to a data source or project.

  • SubscriptionUpdated: A user's subscription to a data source or project is updated.


    Manage Data Source Members

    In addition to creating and managing data sources, data owners can add and manage data source members manually. While this is supported, it is not recommended; managing user access through subscription policies is far more scalable.

    For other guides related to data source members and management, see the Related guides section.

    hashtag
    Add members to a data source

    1. Navigate to the data source and click the Members tab.

    2. Click Add Members and enter the group name or username.

    3. Select their Role:

      • Subscriber: The role can have read or write access to the table. This role is only available if there are read access policies on the data source.

      • Owner: The role can manage data source members and policies and have read or write access to the table.

    4. Select Read or Write from the Access Grant dropdown. This option is only available if write policies have been enabled.

    5. Click Add.

    hashtag
    Bulk add users to multiple data sources

    1. Navigate to the data sources list page.

    2. Select the data sources you want to add users to by clicking the checkbox next to the data source.

    3. Select Add Users.

    hashtag
    Set user access expiration date for a data source

    As a data owner, you can limit the amount of time a user or group has access to your data source by setting an access expiration date.

    1. Navigate to the Members tab.

    2. Adjust the number of days under the Expires column for the user or group whose access you want to limit. The limit counts from today: 0 days left means access will be revoked by the end of today, and 1 day left means access will be revoked by the end of tomorrow.

    3. Save your changes.

    To remove the limit (or set the limit to Never), delete the number from the field and save your changes.
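The counting rule above is simple but worth pinning down. This sketch assumes the behavior as documented (revocation at the end of the returned day); `access_revoked_on` is a hypothetical helper name.

```python
from datetime import date, timedelta

def access_revoked_on(days_remaining: int, today: date) -> date:
    """Return the day at whose end access is revoked: 0 days left means
    end of today, 1 day left means end of tomorrow."""
    return today + timedelta(days=days_remaining)

# With 1 day remaining on 2024-05-01, access lasts through 2024-05-02.
```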

    hashtag
    Modify user or group roles within a data source

    1. Navigate to the Members tab.

    2. Click the drop-down arrow under the Role column next to the user/group whose role you’d like to change.

    3. Select another role (subscriber, expert, owner, or ingest user, if applicable).

    Notifications about the change will be sent to the affected users and groups (as well as alternative Owners).

    hashtag
    View user or group subscription history

    1. Navigate to the Members tab.

    2. Click the Name of the user or group whose history you want to review.

    hashtag
    Remove users or groups from a data source

    As a data owner, you can deny access to any users or groups at any time.

    1. Navigate to the Members tab.

    2. To remove a user or group from a data source, click Deny in the Actions column next to the user or group you want to remove.

    3. Complete the Deny Access form, including a reason for revoking the access.

    This action will immediately update users' or groups' subscription status, and they will no longer have any access to the data source. Notifications will be sent to the affected users (as well as alternative data owners) informing them of the change in subscription status.

    hashtag
    Related guides

    hashtag
    Reference guide

    For information about data source members and subscriptions, see the data source user roles section.

    hashtag
    How-to guides

    In addition to adding and managing data source members as outlined above, data owners can manage data source column tags, data dictionaries, and settings.

    Expert: The role can manage the data dictionary descriptions and have read or write access to the table. This role is only available if there are read access policies on the data source.

    You can also opt to specify an expiration date for when the user's access should expire.

    In the modal, type the user name or group name and select the user or group you want to add from the dropdown menu.
  • Opt to set an Expiration for the users' subscriptions. Additionally, you can change the role from Subscriber to Expert or Owner for the users or groups using the dropdown menu in the Role column.

  • Click Add. All users and groups will be added to the data sources you selected.


    Disable Immuta from Sampling Raw Data

    If you want to disable the metadata collection that requires sampling data, you must

    1. Stop all data source health checks.

    2. Add the Skip Stats Job tag to all data sources.

    These steps will ensure that Immuta queries no data, under any circumstances. Without this sample data, some Immuta features will be unavailable. Identification cannot be used to automatically detect sensitive data in your data sources, and the following masking policies (which are only available in the Snowflake integration) will not work:

    • Masking with format preserving masking

    • Masking using randomized response

    hashtag
    Data Source Health Checks

    Reach out to your Immuta representative to disable health checks on all data sources.

    hashtag
    Skip Stats Job Tag

    Tag each data source with the seeded Skip Stats Job tag to stop Immuta from collecting a sample and running table stats on the sample. You can tag data sources as you create them in the UI or via the Immuta API.

    Note that data sources automatically skip the stats job upon registration, without the Skip Stats Job tag, as long as there are no active policies requiring them. The following policies require stats:

    • Column masking with randomized response

    • Column masking with format preserving masking

    • Column masking with rounding

    • Column masking with reversibility

    • Row minimization


    Create a Starburst (Trino) Data Source

    circle-exclamation

    Using OAuth authentication to create Starburst (Trino) data sources

    If you are using OAuth or asynchronous authentication to create Starburst (Trino) data sources and you encounter one of the scenarios described on the Starburst (Trino) reference guide, work with your Immuta representative to configure the globalAdminUsername property.

    hashtag
    Enter connection information

    1. Navigate to the Data Sources list page and click Register Data Source.

    2. Select the Starburst (Trino) tile in the Data Platform section.

    3. Complete these fields in the Connection Information box:

    circle-info

    Use SSL

    Although not required, it is recommended that all connections use SSL. Additional connection string arguments may also be provided.

    Note: Only Immuta uses the connection you provide and injects all policy controls when users query the system. In other words, users always connect through Immuta with policies enforced and have no direct association with this connection.

    circle-info

    Considerations

    • Immuta pushes down joins to be processed on the remote database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, ssl, username, and password. You will see performance degradation on joins against the same database if this information doesn't match.
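The matching rule in the consideration above can be sketched as a simple comparison. This is illustrative only; `joins_push_down` is a hypothetical helper, and the exact fields Immuta compares may differ.

```python
def joins_push_down(conn_a: dict, conn_b: dict) -> bool:
    """Sketch: joins are only pushed down to the remote database when the
    two data sources' connection details line up on every key field."""
    keys = ("hostname", "port", "ssl", "username", "password")
    return all(conn_a.get(k) == conn_b.get(k) for k in keys)

a = {"hostname": "trino.example.com", "port": 443, "ssl": True,
     "username": "svc", "password": "s3cret"}
b = dict(a, username="other_user")  # mismatched credentials: no pushdown
```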

    hashtag
    Select virtual population

    Decide how to virtually populate the data source by selecting one of the options:

    • Create sources for all tables in this database: This option will create data sources and keep them in sync for every table in the dataset. New tables will be automatically detected and new Immuta views will be created.

    • Schema / Table: This option will allow you to specify tables or datasets that you want Immuta to register.

      1. Opt to Edit in the table selection box that appears.

    hashtag
    Enter basic information

    1. Enter the SQL Schema Name Format to be the SQL name that the data source exists under in Immuta. It must include a schema macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

    2. Enter the Schema Project Name Format to be the name of the schema project in the Immuta UI. If you enter a name that already exists, the name will automatically be incremented. For example, if the schema project Customer table already exists and you enter that name in this field, the name for this second schema project will automatically become Customer table 2 when you create it.

    hashtag
    Enable or disable schema monitoring

    circle-info

    Schema monitoring best practices

    Schema monitoring is a powerful tool that ensures tables are all governed by Immuta.

    • Consider using schema monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.

    When selecting the Schema/Table option, you can opt to enable schema monitoring by selecting the checkbox in this section.

    Note: This step will only appear if all tables within a server have been selected for creation.

    hashtag
    Opt to configure advanced settings

    Although not required, completing these steps will help maximize the utility of your data source. Otherwise, click Create to save the data source.

    hashtag
    Column detection

    This setting monitors when remote tables' columns have been changed, updates the corresponding data sources in Immuta, and notifies Data Owners of these changes.

    To enable, select the checkbox in this section.

    See the schema monitoring page to learn more about column detection.

    hashtag
    Event time

    An Event Time column denotes the time associated with records returned from this data source. For example, if your data source contains news articles, the time that the article was published would be an appropriate Event Time column.

    1. Click the Edit button in the Event Time section.

    2. Select the column(s).

    3. Click Apply.

    Selecting an Event Time column will enable

    • more statistics to be calculated for this data source including the most recent record time, which is used for determining the freshness of the data source.

    • the creation of time-based restrictions in the policy builder.

    hashtag
    Latency

    1. Click Edit in the Latency section.

    2. Complete the Set Time field, and then select MINUTES, HOURS, or DAYS from the subsequent dropdown menu.

    3. Click Apply.

    This setting impacts how often Immuta checks for new values in a column that is driving row-level redaction policies. For example, if you are redacting rows based on a country column in the data, and you add a new country, it will not be seen by the Immuta policy until this period expires.
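    As a rough sketch of this behavior, the latency setting can be read as the interval after which Immuta would next check for new policy-driving values. The function below is hypothetical and mirrors only the MINUTES/HOURS/DAYS dropdown described above:

    ```python
    from datetime import datetime, timedelta

    def next_policy_refresh(last_check: datetime, amount: int, unit: str) -> datetime:
        """Earliest time a newly added value (e.g., a new country) could be
        picked up by a row-level policy, given the latency setting (sketch)."""
        unit_map = {
            "MINUTES": timedelta(minutes=1),
            "HOURS": timedelta(hours=1),
            "DAYS": timedelta(days=1),
        }
        return last_check + amount * unit_map[unit]

    last = datetime(2024, 5, 1, 9, 0)
    print(next_policy_refresh(last, 6, "HOURS"))  # 2024-05-01 15:00:00
    ```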

    hashtag
    Sensitive data discovery

    Data owners can disable sensitive data discovery for their data sources in this section.

    1. Click Edit in this section.

    2. Select Enabled or Disabled in the window that appears, and then click Apply.

    hashtag
    Data source tags

    Adding tags to your data source allows users to search for the data source using the tags and allows governors to apply global policies to the data source. Note that if schema monitoring is enabled, any tags added now will also be added to the tables that are detected.

    To add tags,

    1. Click the Edit button in the Data Source Tags section.

    2. Begin typing in the Search by Tag Name box to select your tag, and then click Add.

    Tags can also be added after you create your data source from the data source details page on the overview tab or the data dictionary tab.

    hashtag
    Create the data source

    Click Create to save the data source(s).

  • Server: hostname or IP address

  • Port: port configured for Starburst (Trino)

  • SSL: when enabled, ensures communication between Immuta and the remote database is encrypted

  • Catalog: the remote catalog

  • Username: the username to use to connect to the remote database and retrieve records for this data source

  • Password: the password to use with the above username to connect to the remote database

  • If you are using a proxy server with Starburst (Trino), specify it in the Additional Connection String Options. For example:

    UseProxy=1;ProxyHost=my.host.com;ProxyUID=your-username;ProxyPort=6789;ProxyPwd=your-password

  • Opt to Upload Certificates to connect to the database.

  • Click the Test Connection button.

  • If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section.

    By default, all schemas and tables are selected. Select and deselect by clicking the checkbox for the schemas in the Import Schemas/Tables modal. You can create multiple data sources at one time by selecting an entire schema or multiple tables.

  • After making your selection(s), click Apply.

  • When selecting Create sources for all tables in this database and monitor for changes, you may personalize this field as you wish, but it must include a schema macro.

  • When selecting Schema/Table this field is prepopulated with the recommended project name and you can edit freely.

  • Select the Data Source Name Format, which will be the format of the name of the data source in the Immuta UI.

    • <Tablename>: The data source name will be the name of the remote table, and the case of the data source name will match the case of the macro.

    • <Schema><Tablename>: The data source name will be the name of the remote schema followed by the name of the remote table, and the case of the data source name will match the cases of the macros.

    • Custom: Enter a custom template for the Data Source Name. You may personalize this field as you wish, but it must include a tablename macro. The case of the macro will apply to the data source name (i.e., <Tablename> will result in "Data Source Name," <tablename> will result in "data source name," and <TABLENAME> will result in "DATA SOURCE NAME").

  • Enter the SQL Table Name Format, which will be the format of the name of the table in Immuta. It must include a table name macro, but you may personalize the format using lowercase letters, numbers, and underscores. It may have up to 255 characters.

  • Consider using Immuta’s API to either run the schema monitoring job when your ETL process adds new tables or to add new tables.

  • Activate the new column added templated global policy to protect potentially sensitive data. This policy nulls new columns until a data owner reviews them, preventing data leaks from newly added columns that have not yet been reviewed.

  • Schema Monitoring
    Schema projects overview
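    The macro-case behavior described in the naming bullets above can be sketched as follows. This is an illustrative approximation, not Immuta's implementation; the function name is invented for this example.

    ```python
    import re

    def apply_name_format(template: str, schema: str, table: str) -> str:
        """Expand <Schema>/<Tablename> macros, applying the macro's own case
        to the substituted value (sketch of the behavior described above)."""
        def convert(value: str, macro: str) -> str:
            if macro.isupper():      # <TABLENAME> -> "ORDERS"
                return value.upper()
            if macro.islower():      # <tablename> -> "orders"
                return value.lower()
            return value.title()     # <Tablename> -> "Orders"

        out = template
        for name, value in (("schema", schema), ("tablename", table)):
            for m in set(re.findall(rf"<({name})>", out, flags=re.IGNORECASE)):
                out = out.replace(f"<{m}>", convert(value, m))
        return out

    print(apply_name_format("<Schema> <Tablename>", "public", "orders"))  # Public Orders
    print(apply_name_format("<TABLENAME>", "public", "orders"))           # ORDERS
    ```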

    Manage Access Requests and Tasks

    Your outgoing and incoming requests are consolidated on the requests tab on your user profile page. Similar to notifications, a red dot displays on the request icon whenever you have pending requests. The sections below guide you through managing these requests.

    hashtag
    Manage access requests

    1. Navigate to your Profile page, and then click the Requests tab. The names of the users who have submitted requests are displayed in the Requests section. Once a user is selected, the corresponding Pending Requests are displayed.

    2. To view more information about the request, click the Details button in the Actions column of a request.

    3. Click the Approve or Deny button in the Actions column of the request.

    hashtag
    Bulk approvals

    To approve or deny multiple access requests simultaneously,

    1. Navigate to your Profile page, and then click the Requests tab.

    2. Select the checkbox next to each request you want to address, and then click the Approve Selected or Deny Selected button.

    hashtag
    Manage data source requests

    If a policy that includes the New tag is active and schema monitoring is enabled or you have registered a connection, Immuta applies a New tag to new data sources, new columns, or changed columns and sends data owners a request to validate those changes.

    1. Navigate to your Profile page, and then click the Requests tab.

    2. Click the approvals count in the Request Information column to view information about the change to the data source. The change will be one of the following:

      • Column added

      • Column changed

      • Column deleted

      • Data source created

    3. After verifying the change, click Validate.

    For more information about these requests, see the schema monitoring guide or the connections guide.

    hashtag
    Related guides

    hashtag
    Reference guides

    • Data sources in Immuta overview

    • Connections

    • Schema monitoring

    hashtag
    How-to guides

    In addition to managing data source requests as outlined above, data owners can manage data source column tags, data dictionaries, policies, members, and settings.