Upgrading to Connections
Public preview: This feature is available to all accounts. Contact your Immuta representative to enable this feature.
Connections allow you to register your data objects in a technology through a single connection, making data registration more scalable for your organization. Instead of registering schemas and databases individually, you can register them all at once and allow Immuta to monitor your data platform for changes, so that data sources are added and removed automatically to reflect the state of data on your data platform.
Exceptions
Do not upgrade to Connections if you meet any of the criteria below:
You are using the Databricks Spark integration
You are using the workspace-catalog binding capability with Databricks Unity Catalog
You are not on SaaS
Integrations
Integrations are now connections. Once the upgrade is complete, you will control most integration settings at the connection level via the Connections tab in Immuta.
With integrations, setup happens from the Immuta app settings page or via the API. These integrations establish a relationship between Immuta and your data platform for policy orchestration; tables are then registered as data sources in an additional step with separate credentials, and schemas and databases are not reflected in the UI.
With connections, integration and data source setup happen together through a single connection per account between Immuta and your data platform. Based on the privileges granted to the Immuta system user, metadata from databases, schemas, and tables is automatically pulled into Immuta and continuously monitored for changes.
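As a rough illustration of the new model, a single request to the /data endpoint (discussed in the APIs section below) registers a connection whose objects Immuta then discovers automatically. The sketch below is not the documented request: the hostname, credential handling, and payload field names are assumptions, so consult the API changes page for the exact schema.

```python
# Illustrative sketch only: register a Snowflake connection through the Immuta
# /data endpoint. Payload field names are assumptions, not the documented schema.
import requests

IMMUTA_URL = "https://your-immuta-instance.immuta.com"  # hypothetical hostname
API_KEY = "your-api-key"                                 # hypothetical credential

payload = {
    "connectionKey": "snowflake-prod",                   # assumed identifier field
    "connection": {
        "technology": "Snowflake",
        "hostname": "your-account.snowflakecomputing.com",
        "warehouse": "IMMUTA_WH",
        "role": "IMMUTA_SYSTEM_ROLE",
        "authenticationType": "keyPair",
    },
}

response = requests.post(
    f"{IMMUTA_URL}/data",
    json=payload,
    headers={"Authorization": API_KEY, "Content-Type": "application/json"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```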
Supported technology and authorization methods
Snowflake
Snowflake OAuth
Username and password
Key pair
Databricks
Personal Access Token
M2M OAuth
Unsupported technologies
The following technologies are not yet supported with connections:
Azure Synapse Analytics
Databricks Spark
Google BigQuery
Redshift
S3
Starburst (Trino)
Additional connection string options
When registering data sources using the legacy method, there is a field for Additional Connection String Options that your Immuta representative may have instructed you to use. If you entered any additional connection information there, check that it is supported with connections. Only the following Additional Connection String Options input is supported:
Snowflake data sources with the private key file password set using Additional Connection String Options.
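For context, the key pair passphrase is normally supplied as a Snowflake connection parameter. The entry typically looks something like the following; the exact parameter name depends on your Snowflake driver configuration and is shown here as an assumption:

```
private_key_file_pwd=<your-private-key-passphrase>
```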
Supported features
The tables below outline Immuta features, their availability with integrations, and their availability with connections.
Snowflake
| Feature | Integrations | Connections |
|---|---|---|
| User impersonation | | |
| Project workspaces | | |
| Snowflake lineage | Supported | Supported |
| Query audit | Supported | Supported |
| Tag ingestion | Supported | Supported |
Databricks Unity Catalog
| Feature | Integrations | Connections |
|---|---|---|
| User impersonation | Not supported | Not supported |
| Project workspaces | Not supported | Not supported |
| Query audit | Supported | Supported |
| Tag ingestion | Supported | Supported |
| Workspace-catalog binding | Supported | Not supported |
Data sources
There will be no policy downtime on your data sources while performing the upgrade.
Supported object types
The supported object types for Snowflake and Databricks Unity Catalog connections are listed below. When applying read and write access policies to these data sources, the privileges granted by Immuta vary depending on the object type. See an outline of privileges granted by Immuta on Snowflake and Databricks Unity Catalog object types on the Subscription policy access types page.
Snowflake
Table
View
Materialized view
External table
Event table
Iceberg table
Dynamic table
Databricks Unity Catalog
Table
View
Materialized view
Streaming table
External table
Foreign table
Volumes (external and managed)
Hierarchy
With connections, your data sources are ingested and presented to reflect the infrastructure hierarchy of your connected data platform. For example, this is how the hierarchy for a Snowflake connection compares to the flat integration model:

| Integration | Connection |
|---|---|
| - | Database |
| - | Schema |
| Data source | Data source (once enabled, becomes available for policy enforcement) |
Tags
Connections will not change any tags currently applied to your data sources.
Tag ingestion
If you want all data objects from connections to have Snowflake data tags ingested into Immuta, ensure the credentials provided on the Immuta app settings page for the external catalog feature can access all of those data objects. Any data objects the credentials cannot access will not be tagged in Immuta. In practice, we recommend using the same credentials for the connection and for tag ingestion.
Consideration
If you previously ingested data sources using the V2 /data endpoint, this limitation applies to you.
The V2 /data endpoint allows users to register data sources and automatically attach a tag when the data sources are registered in Immuta. This endpoint is not supported with a connection, and there is no substitute for this behavior at this time. If you require default tags for newly onboarded data sources, reach out to your Immuta support professional before upgrading.
Users and permissions
With integrations
| Permission or role | Action | Applies to |
|---|---|---|
| APPLICATION_ADMIN | Configure integration | Integration |
| CREATE_DATA_SOURCE | Register tables | Data source |
| Data owner | Manage data sources | Data source |
With connections
| Permission or role | Action | Applies to |
|---|---|---|
| APPLICATION_ADMIN | Register the connection | Connection, database, schema, data source |
| GOVERNANCE or APPLICATION_ADMIN | Manage all connections | Connection, database, schema, data source |
| Data owner | Manage data objects | Connection, database, schema, data source |
Schema monitoring
Schema monitoring is renamed to object sync with connections, as it can also monitor for changes at the database and connection levels.
During object sync, Immuta crawls your connection to ingest metadata for every database, schema, and table that the Snowflake role or Databricks account credentials you provided during configuration have access to. Upon completion of the upgrade, the tables' states depend on your previous schema monitoring settings:
If you had schema monitoring enabled on a schema: All tables from that schema will be registered in Immuta as enabled data sources.
If you had schema monitoring disabled on a schema: All tables from that schema (that were not already registered in Immuta) will be registered as disabled data objects. They are visible from the Data Objects tab in Immuta, but are not listed as data sources until they are enabled.
After the initial upgrade, object sync runs on your connection every 24 hours (at 1:00 AM UTC) to keep your tables in Immuta in sync. Users can also run object sync manually via the UI or API.
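For teams that script this, a manual trigger might look like the sketch below. The crawl route and connection key are assumptions rather than the documented API, so verify the exact path in the Immuta API reference before using it.

```python
# Illustrative sketch: manually trigger object sync for a connection.
# The route and connection key below are assumptions, not the documented API.
import requests

IMMUTA_URL = "https://your-immuta-instance.immuta.com"  # hypothetical hostname
API_KEY = "your-api-key"                                 # hypothetical credential
CONNECTION_KEY = "snowflake-prod"                        # hypothetical connection key

response = requests.post(
    f"{IMMUTA_URL}/data/{CONNECTION_KEY}/crawl",  # assumed object sync trigger route
    headers={"Authorization": API_KEY},
    timeout=30,
)
response.raise_for_status()
print("Object sync triggered:", response.status_code)
```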
Additional settings
Object sync provides additional controls compared to schema monitoring:
Object status: Connections, databases, schemas, and tables can be marked as enabled or disabled; enabling a table makes it appear as a data source. These statuses are inherited by all lower objects by default, but the inheritance can be overridden. For example, if you disable a database, all schemas and tables within that database inherit the disabled status. However, if you want one of those tables to be a data source, you can manually enable it (a sketch of doing this programmatically follows this list).
Enable new data objects: This setting controls the state in which new objects found by object sync are registered in Immuta.
Enable: New data objects found by object sync will automatically be enabled and tables will be registered as data sources.
Disable: This is the default. New data objects found by object sync will be disabled.
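If you manage object status programmatically, enabling a single disabled object might look like the following sketch. The per-object route and the enabled field are hypothetical and shown only to illustrate the idea; the supported paths are the Data Objects tab in the UI and the documented /data endpoints.

```python
# Illustrative sketch: enable one disabled data object so it becomes a data source.
# Both the per-object route and the "enabled" field are assumptions.
import requests

IMMUTA_URL = "https://your-immuta-instance.immuta.com"  # hypothetical hostname
API_KEY = "your-api-key"                                 # hypothetical credential
OBJECT_PATH = "snowflake-prod/ANALYTICS/PUBLIC/ORDERS"   # hypothetical object path

response = requests.patch(
    f"{IMMUTA_URL}/data/{OBJECT_PATH}",   # assumed per-object route
    json={"enabled": True},               # assumed field controlling object status
    headers={"Authorization": API_KEY},
    timeout=30,
)
response.raise_for_status()
```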
Comparison
| Name | Schema monitoring and column detection | Object sync |
|---|---|---|
| Where to turn on | Enable (optionally) when configuring a data source | Enabled by default |
| Where to update the feature | Enable or disable from the schema project | Object sync cannot be disabled |
| Default schedule | Every 24 hours | Every 24 hours (at 1:00 AM UTC) |
| Can you adjust the default schedule? | No | No |
| New tags applied automatically | New tags are applied automatically for a data source being created, a column being added, or a column type being updated on an existing data source | New tags are applied automatically for a column being added or a column type being updated on an existing data source |
Performance
Connections use a new architectural pattern that improves performance when monitoring for changes in your data platform, particularly with large numbers of data sources. The following scenarios are regularly tested in an isolated environment to provide a benchmark. Note that these numbers can vary based on a number of factors, including (but not limited to) the number and type of policies applied, overall API and user activity in the system, and connection latency to your data platform.
| Scenario | Result |
|---|---|
| Scenario 1: Running object sync on a schema with 10,000 data sources with 50 columns each | 172.2 seconds on average |
| Scenario 2: Running object sync on a schema with 1,000 data sources with 10 columns each | 9.38 seconds on average |
| Scenario 3: Running object sync on a schema with 1 data source with 50 columns | 0.512 seconds on average |
Databricks Unity Catalog
Data sources registered with integrations required users to manually create the schema monitoring job in Databricks. With connections, this job is fully automated and the manual step is no longer necessary.
APIs
Consolidating integration setup and data source registration into a single connection significantly simplifies programmatic interaction with the Immuta API. Actions that used to be managed through multiple endpoints can now be performed through a single, standardized endpoint. As a result, several API endpoints are blocked once a user has upgraded their connection.
Blocked endpoints return an error indicating "400 Bad Request - [...]. Use the /data endpoint." This error means you need to update any processes that call the Immuta APIs to use the new /data endpoint instead. For details, see the API changes page.
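One low-effort way to find the calls you need to migrate is to watch for that error in your existing scripts. The sketch below assumes a requests-based Python client; the hostname, credential, and example path are placeholders rather than documented values.

```python
# Minimal sketch: flag Immuta API calls that are blocked after upgrading to
# connections. Hostname, credential, and the example path are placeholders.
import requests

IMMUTA_URL = "https://your-immuta-instance.immuta.com"  # hypothetical hostname
API_KEY = "your-api-key"                                 # hypothetical credential


def call_immuta(method: str, path: str, **kwargs):
    """Send a request and report whether the endpoint was blocked post-upgrade."""
    response = requests.request(
        method,
        f"{IMMUTA_URL}{path}",
        headers={"Authorization": API_KEY},
        timeout=30,
        **kwargs,
    )
    if response.status_code == 400 and "Use the /data endpoint" in response.text:
        print(f"{method} {path} is blocked after the upgrade; migrate it to /data.")
    else:
        response.raise_for_status()
    return response


# Example: check one of your existing calls (the path here is a placeholder).
call_immuta("GET", "/dataSource")
```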