Upgrading to Connections

Connections let you register all the data objects in a technology through a single connection, which makes data registration more scalable for your organization. Instead of registering schemas and databases individually, you register them all at once and let Immuta monitor your data platform for changes, so that data sources are added and removed automatically to reflect the current state of your data platform.

This document guides you through upgrading from a configured integration to connections. If you are a new user without any current integrations, see the Connections reference guide instead.

Exceptions

Do not upgrade to connections if you meet any of the criteria below:

You are using the Databricks Spark integration
You are using the workspace-catalog binding capability with Databricks Unity Catalog
You are using the V2 /data endpoint to register data sources and attach tags automatically

Integrations

Integrations are now connections. Once the upgrade is complete, you will control most integration settings at the connection level via the Connections tab in Immuta.

Integrations (existing): Integrations are set up from the Immuta app settings page or via the API. These integrations establish a relationship between Immuta and your data platform for policy orchestration. Then tables are registered as data sources through an additional step with separate credentials. Schemas and databases are not reflected in the UI.

Connections (new): Integrations and data sources are set up together with a single connection per account between Immuta and your data platform. Based on the privileges granted to the Immuta system user, metadata from databases, schemas, and tables is automatically pulled into Immuta and continuously monitored for any changes.

Supported technology and authorization methods

Snowflake

Snowflake OAuth
Username and password
Key pair

Databricks

Personal Access Token
M2M OAuth

Unsupported technologies

The following technologies are not yet supported with connections:

Amazon S3
Azure Synapse Analytics
Databricks Spark
Google BigQuery
Redshift
Starburst (Trino)
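
As a rough illustration, the supported and unsupported lists above can be encoded as a small lookup for a pre-upgrade check. The names `SUPPORTED_AUTH`, `UNSUPPORTED`, and `can_use_connections` are hypothetical helpers invented for this sketch and are not part of any Immuta API. The sketch also does not cover the criteria listed under Exceptions.

```python
# Hypothetical pre-upgrade check built only from the lists in this guide.
# None of these names come from an Immuta API.
SUPPORTED_AUTH = {
    "Snowflake": {"Snowflake OAuth", "Username and password", "Key pair"},
    "Databricks": {"Personal Access Token", "M2M OAuth"},
}

UNSUPPORTED = {
    "Amazon S3",
    "Azure Synapse Analytics",
    "Databricks Spark",
    "Google BigQuery",
    "Redshift",
    "Starburst (Trino)",
}


def can_use_connections(technology: str, auth_method: str) -> bool:
    """Return True if both the technology and the auth method are listed as supported."""
    if technology in UNSUPPORTED:
        return False
    return auth_method in SUPPORTED_AUTH.get(technology, set())


print(can_use_connections("Snowflake", "Key pair"))
print(can_use_connections("Redshift", "Username and password"))
```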

Additional connection string options

When registering data sources using the legacy method, there is a field for Additional Connection String Options that your Immuta representative may have instructed you to use. If you did enter any additional connection information there, check to ensure the information you included is supported with connections. Only the following Additional Connection String Options input is supported:

Snowflake data sources with the private key file password set using Additional Connection String Options.

Supported features

The tables below outline Immuta features, their availability with integrations, and their availability with connections.

Snowflake

Feature

Integrations (existing)

Connections (new)

Snowflake lineage

Supported

Query audit

Supported

Tag ingestion

Supported

Connection tags

Not supported

Project workspaces

User impersonation

Databricks Unity Catalog

Feature

Integrations (existing)

Connections (new)

Query audit

Supported

Tag ingestion

Supported

Connection tags

Not supported

Workspace-catalog binding

Supported

Not supported

Project workspaces

Not supported

User impersonation

Not supported

Data sources

There will be no policy downtime on your data sources while performing the upgrade.

Supported object types

See each integration's reference guide for the object types supported for that technology.

Data source names

Data source names will change when migrating from integrations to connections. The new data source names will contain the connection, database, schema, and object name. For example, on Snowflake this will typically mean that my_table will become My Connection.MY_DATABASE.MY_SCHEMA.MY_TABLE.
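
The naming pattern described above can be sketched as a simple join of the four parts. The function name `connection_data_source_name` is hypothetical, and the sketch assumes the database, schema, and object names are passed exactly as your data platform stores them (for example, already upper-cased on Snowflake); it performs no case conversion of its own.

```python
def connection_data_source_name(connection: str, database: str, schema: str, obj: str) -> str:
    """Join connection, database, schema, and object name with dots (hypothetical helper)."""
    return ".".join([connection, database, schema, obj])


name = connection_data_source_name("My Connection", "MY_DATABASE", "MY_SCHEMA", "MY_TABLE")
print(name)  # My Connection.MY_DATABASE.MY_SCHEMA.MY_TABLE
```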

Having multiple objects with the same name within the same schema is currently unsupported and will lead to object uniqueness violations in Immuta. To work around this, do one of the following:

Ensure every object within the same schema has a unique name, or
Remove the visibility of one of the objects from the Immuta system account. This will ensure only one of the objects is seen by the system account and ingested in Immuta.
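
Before upgrading, it may help to look for such name collisions in an inventory of your objects. The sketch below is a minimal example, assuming you have already exported a list of (database, schema, name, object type) tuples from your data platform by some other means. The function name and the sample inventory are made up for illustration, and names are compared by exact match, which may not be how Immuta compares them.

```python
from collections import defaultdict


def find_name_collisions(objects):
    """Return {(database, schema, name): [object types]} for names that occur more than once."""
    by_name = defaultdict(list)
    for database, schema, name, object_type in objects:
        by_name[(database, schema, name)].append(object_type)
    return {key: types for key, types in by_name.items() if len(types) > 1}


# Made-up sample inventory, not real data.
inventory = [
    ("SALES", "PUBLIC", "ORDERS", "table"),
    ("SALES", "PUBLIC", "ORDERS", "function"),
    ("SALES", "PUBLIC", "CUSTOMERS", "table"),
]
collisions = find_name_collisions(inventory)
print(collisions)
```

Each key in the result identifies a schema and name that would need either a rename or a visibility change for the Immuta system account.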

Hierarchy

With connections, your data sources are ingested and presented to reflect the infrastructure hierarchy of your connected data platform. For example, this is what the new hierarchy will look like for a Snowflake connection:

Integrations (existing): Integration > Data source

Connections (new): Connection > Database > Schema > Data source (once enabled, becomes available for policy enforcement)

Users and permissions

With integrations

Permission

Action

Object

APPLICATION_ADMIN

Configure integration

Integration

CREATE_DATA_SOURCE

Data source

Data owner

Manage data sources

Data source

With connections

Permission

Action

Object

APPLICATION_ADMIN

Connection, database, schema, data source

GOVERNANCE or APPLICATION_ADMIN

Manage all connections

Connection, database, schema, data source

Data owner

Manage data objects

Connection, database, schema, data source

Schema monitoring

With connections, schema monitoring is renamed to object sync, because it can also monitor for changes at the database and connection levels.

During object sync, Immuta crawls your connection to ingest metadata for every database, schema, and table that the Snowflake role or Databricks account credentials you provided during configuration have access to. After the upgrade completes, each table's state depends on your previous schema monitoring settings:

If you had schema monitoring enabled on a schema: All tables from that schema will be registered in Immuta as enabled data sources.
If you had schema monitoring disabled on a schema: All tables from that schema (that were not already registered in Immuta) will be registered as disabled data objects. They are visible from the Data Objects tab in Immuta, but are not listed as data sources until they are enabled.
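
The two rules above can be summarized as a small decision function. This is only a sketch: the function name and the returned labels are hypothetical, and the third outcome (a table that was already registered in a schema with monitoring disabled keeps its existing data source) is an assumption inferred from the parenthetical above rather than something this guide states outright.

```python
def post_upgrade_state(schema_monitoring_enabled: bool, already_registered: bool) -> str:
    """Describe a table's state after the upgrade (hypothetical labels)."""
    if schema_monitoring_enabled:
        # All tables in a monitored schema become enabled data sources.
        return "enabled data source"
    if already_registered:
        # Assumption: not stated explicitly in this guide.
        return "existing data source (assumed unchanged)"
    # Unregistered tables in an unmonitored schema are visible but disabled.
    return "disabled data object"


print(post_upgrade_state(schema_monitoring_enabled=False, already_registered=False))
```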

After the initial upgrade, object sync runs on your connection every 24 hours (at 1:00 AM UTC) to keep your tables in Immuta in sync. Users can also run object sync manually via the UI or API.
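
To work out when the next scheduled run falls relative to a local time, the 1:00 AM UTC schedule can be computed as follows. This is an illustrative calculation only; `next_object_sync` is a hypothetical helper, and it expects a timezone-aware datetime.

```python
from datetime import datetime, time, timedelta, timezone


def next_object_sync(now: datetime) -> datetime:
    """Return the next 1:00 AM UTC strictly after `now` (which must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), time(1, 0), tzinfo=timezone.utc)
    if candidate <= now_utc:
        # Today's run has already passed, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_object_sync(datetime.now(timezone.utc)))
```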

Schema projects

With integrations, many settings and the connection details for data sources were controlled in the schema project, including schema monitoring. With connections, schema projects are no longer needed for this, and you control connection details in one central place.

Schema project owners

With integrations, schema project owners can become schema monitoring owners, control connection settings, and manage subscription policies on the schema project.

These schema project owners will not be represented in connections. If you want them to have similar abilities, you must make them Data Owner on the schema.

Additional settings

Object sync provides additional controls compared to schema monitoring:

Object status: Connections, databases, schemas, and tables can be marked enabled or disabled; an enabled table appears as a data source. By default, a status is inherited by all lower-level objects, but this can be overridden. For example, if you disable a database, all schemas and tables within that database inherit the disabled status. However, if you want one of those tables to be a data source, you can manually enable it.
Enable new data objects: This setting controls the state in which new objects found by object sync are registered in Immuta.
- Enable: New data objects found by object sync will automatically be enabled and tables will be registered as data sources.
- Disable: This is the default. New data objects found by object sync will be disabled.
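
The inheritance-with-override behavior described above can be modeled as a walk up the hierarchy until an explicit status is found. This is a conceptual sketch, not Immuta's implementation: the `DataObject` class is hypothetical, and falling back to disabled when no status is set anywhere is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataObject:
    name: str
    status: Optional[bool] = None  # True = enabled, False = disabled, None = inherit
    parent: Optional["DataObject"] = None

    def effective_enabled(self) -> bool:
        """Use the nearest explicit status, starting at this object and moving upward."""
        node = self
        while node is not None:
            if node.status is not None:
                return node.status
            node = node.parent
        return False  # assumption: treat "nothing set anywhere" as disabled


database = DataObject("MY_DATABASE", status=False)            # database disabled
schema = DataObject("MY_SCHEMA", parent=database)             # inherits disabled
table_a = DataObject("TABLE_A", parent=schema)                # inherits disabled
table_b = DataObject("TABLE_B", status=True, parent=schema)   # manually enabled override

print(table_a.effective_enabled(), table_b.effective_enabled())
```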

Comparison

Name
- Integrations (existing): Schema monitoring and column detection
- Connections (new): Object sync

Where to turn on
- Integrations (existing): Enable (optionally) when configuring a data source
- Connections (new): Enabled by default

Where to update the feature
- Integrations (existing): Enable or disable from the schema project
- Connections (new): Object sync cannot be disabled

Default schedule
- Integrations (existing): Every 24 hours
- Connections (new): Every 24 hours (at 1:00 AM UTC)

Can you adjust the default schedule?

New tags applied automatically
- Integrations (existing): New tags are applied automatically for a data source being created, a column being added, or a column type being updated on an existing data source
- Connections (new): New tags are applied automatically for a column being added or a column type being updated on an existing data source

Performance

Connections use a new architectural pattern that improves performance when monitoring for changes in your data platform, particularly with large numbers of data sources. The following scenarios are regularly tested in an isolated environment to provide a benchmark. Note that these numbers can vary based on factors such as (but not limited to) the number and type of policies applied, overall API and user activity in the system, and connection latency to your data platform.

Scenario 1: Running object sync on a schema with 10,000 data sources with 50 columns each
- 172.2 seconds on average

Scenario 2: Running object sync on a schema with 1,000 data sources with 10 columns each
- 9.38 seconds on average

Scenario 3: Running object sync on a schema with 1 data source with 50 columns
- 0.512 seconds on average
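
For a rough sense of scale, the benchmark figures above can be converted into throughput. The calculation below uses only the three published averages; the derived rates are simple arithmetic and carry the same caveats about environment and workload as the benchmarks themselves.

```python
# (label, data sources, columns per data source, average seconds) taken from the scenarios above
benchmarks = [
    ("Scenario 1", 10_000, 50, 172.2),
    ("Scenario 2", 1_000, 10, 9.38),
    ("Scenario 3", 1, 50, 0.512),
]


def throughput(data_sources: int, columns_each: int, seconds: float):
    """Return (data sources per second, columns per second)."""
    return data_sources / seconds, data_sources * columns_each / seconds


for label, n, cols, secs in benchmarks:
    ds_per_s, cols_per_s = throughput(n, cols, secs)
    print(f"{label}: {ds_per_s:.1f} data sources/s, {cols_per_s:.0f} columns/s")
```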

Databricks Unity Catalog

With integrations, users had to manually create the schema monitoring job in Databricks. With connections, this job is fully automated, and the manual step is no longer necessary.

APIs

Consolidating integration setup and data source registration into a single connection significantly simplifies programmatic interaction with the Immuta APIs. Actions that used to be managed through multiple different endpoints can now be achieved through a single standardized endpoint. As a result, multiple API endpoints are blocked once a user has upgraded their connection.

All blocked APIs return an error of the form "400 Bad Request - [...]. Use the /data endpoint." This error means you must update any processes that call the Immuta APIs to use the new /data endpoint instead. For details, see the API changes page.
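
Scripts that call the older endpoints may want to detect this specific error so they can flag the call for migration instead of retrying it. The check below is a minimal sketch: `is_blocked_by_upgrade` is a hypothetical helper, it matches only the trailing hint quoted above because the full message is elided in this guide, and where the message appears in the response is an assumption.

```python
def is_blocked_by_upgrade(status_code: int, body_text: str) -> bool:
    """Return True if a response looks like the blocked-endpoint error quoted in this guide."""
    # Assumption: the hint text appears somewhere in the response body.
    return status_code == 400 and "Use the /data endpoint" in body_text


sample = "400 Bad Request - [...]. Use the /data endpoint."
print(is_blocked_by_upgrade(400, sample))
```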
