# Upgrading to Connections

Connections allow you to register your data objects in a technology through a single connection, making data registration more scalable for your organization. Instead of registering schemas and databases individually, you can register them all at once and allow Immuta to monitor your data platform for changes, so that data sources are added and removed automatically to reflect the state of data on your data platform.

This document guides you through upgrading from a configured integration to connections. If you are a new user without any current integrations, see the [Connections reference guide](https://documentation.immuta.com/SaaS/configuration/integrations/data-and-integrations/registering-a-connection/reference-guides/connections-overview) instead.

{% hint style="danger" %}
**Exceptions**

Do not upgrade to connections if you meet any of the criteria below:

* You are using the Databricks Spark integration
* You are using the [workspace-catalog binding](https://documentation.immuta.com/SaaS/configuration/databricks/databricks-unity-catalog/unity-catalog-overview#workspace-catalog-binding) capability with Databricks Unity Catalog
* You are using [the V2 /data endpoint to register data sources and attach tags automatically](#consideration)
  {% endhint %}

## Integrations

{% hint style="warning" %}
Integrations are now **connections**. Once the upgrade is complete, you will control most integration settings at the connection level via the Connections tab in Immuta.
{% endhint %}

| Integrations (existing)                                                                                                                                                                                                                                                                                                                                | Connections (new)                                                                                                                                                                                                                                                                                                                  |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p>Integrations are set up from the Immuta app settings page or via the API. These integrations establish a relationship between Immuta and your data platform for policy orchestration.</p><p>Then tables are registered as data sources through an additional step with separate credentials. Schemas and databases are not reflected in the UI.</p> | <p>Integrations and data sources are set up together with a single connection per account between Immuta and your data platform.</p><p>Based on the privileges granted to the Immuta system user, metadata from databases, schemas, and tables is automatically pulled into Immuta and continuously monitored for any changes.</p> |

### Supported technology and authorization methods

#### Snowflake <a href="#snowflake" id="snowflake"></a>

* Snowflake OAuth
* Username and password
* Key pair

#### Databricks

* Personal Access Token
* M2M OAuth

#### Trino

* Username and password
* OAuth 2.0

{% hint style="danger" %}
**Unsupported technologies**

The following technologies are not yet supported with connections:

* Amazon S3
* Azure Synapse Analytics
* Databricks Spark
* Google BigQuery
  {% endhint %}

{% hint style="warning" %}
**Additional connection string options**

When registering data sources using the legacy method, there is a field for **Additional Connection String Options** that your Immuta representative may have instructed you to use. If you entered any additional connection information there, check that it is supported with connections. *Only the following **Additional Connection String Options** inputs are supported*:

* Snowflake data sources with the private key file password set using **Additional Connection String Options**
* Trino data sources with proxy set using **Additional Connection String Options**
* Trino data sources with SSL/TLS enabled and certificate validation disabled using **Additional Connection String Options**
  {% endhint %}

### Supported features

The tables below outline Immuta features, their availability with integrations, and their availability with connections.

{% tabs %}
{% tab title="Databricks Unity Catalog" %}

<table><thead><tr><th>Feature</th><th width="239">Integrations (existing)</th><th>Connections (new)</th></tr></thead><tbody><tr><td>Query audit</td><td>Supported</td><td>Supported</td></tr><tr><td>Tag ingestion</td><td>Supported</td><td>Supported</td></tr><tr><td><a href="../connections-overview#connection-tags">Connection tags</a></td><td>Not supported</td><td><a data-footnote-ref href="#user-content-fn-1">Supported</a></td></tr><tr><td>Workspace-catalog binding</td><td>Supported</td><td>Not supported</td></tr><tr><td>Project workspaces</td><td>Not supported</td><td>Not supported</td></tr><tr><td>User impersonation</td><td>Not supported</td><td>Not supported</td></tr></tbody></table>
{% endtab %}

{% tab title="Snowflake" %}

<table><thead><tr><th>Feature</th><th width="223">Integrations (existing)</th><th>Connections (new)</th></tr></thead><tbody><tr><td>Snowflake lineage</td><td>Supported</td><td>Supported</td></tr><tr><td>Query audit</td><td>Supported</td><td>Supported</td></tr><tr><td>Tag ingestion</td><td>Supported</td><td>Supported</td></tr><tr><td><a href="../connections-overview#connection-tags">Connection tags</a></td><td>Not supported</td><td><a data-footnote-ref href="#user-content-fn-1">Supported</a></td></tr><tr><td>Project workspaces</td><td><a data-footnote-ref href="#user-content-fn-2">Not supported</a></td><td><a data-footnote-ref href="#user-content-fn-2">Not supported</a></td></tr><tr><td>User impersonation</td><td><a data-footnote-ref href="#user-content-fn-3">Not supported</a></td><td><a data-footnote-ref href="#user-content-fn-3">Not supported</a></td></tr></tbody></table>
{% endtab %}

{% tab title="Trino" %}

<table><thead><tr><th>Feature</th><th width="223">Integrations (existing)</th><th>Connections (new)</th></tr></thead><tbody><tr><td>Query audit</td><td>Supported</td><td>Supported</td></tr><tr><td>Tag ingestion</td><td>Not supported</td><td>Not supported</td></tr><tr><td><a href="../connections-overview#connection-tags">Connection tags</a></td><td>Not supported</td><td><a data-footnote-ref href="#user-content-fn-1">Supported</a></td></tr><tr><td>User impersonation</td><td>Supported</td><td>Supported</td></tr><tr><td>Multi-cluster support</td><td>Not supported</td><td>Supported</td></tr></tbody></table>
{% endtab %}
{% endtabs %}

## Data sources

{% hint style="success" %}
There will be no policy downtime on your data sources while performing the upgrade.
{% endhint %}

### Supported object types

See the integration's reference guide for the supported object types for each technology:

* [Databricks Unity Catalog](https://documentation.immuta.com/SaaS/configuration/databricks/databricks-unity-catalog/unity-catalog-overview#supported-object-types)
* [Snowflake](https://documentation.immuta.com/SaaS/configuration/snowflake/reference-guides/snowflake-overview#supported-object-types)
* [Trino](https://documentation.immuta.com/SaaS/configuration/starburst-trino/reference-guides/trino-overview#supported-object-types)

### Data source names

Data source names will change when migrating from integrations to connections. The new data source names will contain the connection, database, schema, and object name. For example, on Snowflake this will typically mean that `my_table` will become `My Connection.MY_DATABASE.MY_SCHEMA.MY_TABLE`.

Having multiple objects with the same name within the same schema is currently unsupported and will lead to object uniqueness violations in Immuta. To work around this limitation, do one of the following:

* Ensure every object within the same schema has a unique name, or
* Remove the visibility of one of the objects from the Immuta system account. This will ensure only one of the objects is seen by the system account and ingested in Immuta.
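The naming rule and the uniqueness requirement above can be illustrated with a minimal sketch. The function names and the sample inventory are hypothetical and not part of any Immuta API. The sketch also assumes that names are compared by exact match, which may not reflect how your data platform or Immuta treats letter case.

```python
from collections import Counter


def connection_data_source_name(connection, database, schema, obj):
    """Join the four hierarchy levels with dots, as in the example above."""
    return ".".join([connection, database, schema, obj])


def find_name_collisions(objects):
    """Return (database, schema, name) tuples that appear more than once.

    `objects` is a hypothetical inventory of what the system account can see.
    Exact string matching is an assumption made for this sketch.
    """
    counts = Counter(objects)
    return sorted(key for key, n in counts.items() if n > 1)


inventory = [
    ("MY_DATABASE", "MY_SCHEMA", "MY_TABLE"),
    ("MY_DATABASE", "MY_SCHEMA", "MY_TABLE"),  # second object with the same name
    ("MY_DATABASE", "MY_SCHEMA", "OTHER_TABLE"),
]

print(connection_data_source_name("My Connection", *inventory[0]))
print(find_name_collisions(inventory))
```

Running a check like this against an inventory of your own data platform before upgrading can show which objects need to be renamed or hidden from the Immuta system account.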

### Hierarchy

With connections, your data sources are ingested and presented to reflect the infrastructure hierarchy of your connected data platform. For example, this is what the new hierarchy will look like for a Snowflake connection:

| Integrations (existing) | Connections (new)                                                    |
| ----------------------- | -------------------------------------------------------------------- |
| Integration             | Connection                                                           |
| -                       | Database                                                             |
| -                       | Schema                                                               |
| Data source             | Data source (once enabled, becomes available for policy enforcement) |

## Tags

{% hint style="success" %}
Connections will not change any tags currently applied on your data sources.
{% endhint %}

### Tag ingestion

When [supported](#supported-features), use tag ingestion to automatically apply tags from your data platform onto your Immuta data sources.

If you want all data objects from connections to have data tags ingested from the data provider into Immuta, ensure the credentials provided on the Immuta app settings page for the external catalog feature can access all the data objects. Any data objects the credentials cannot access will not be tagged in Immuta. In practice, use the same credentials for the connection and tag ingestion.

### Consideration

{% hint style="warning" %}
If you previously ingested data sources using the V2 `/data` endpoint, this limitation applies to you.
{% endhint %}

The V2 `/data` endpoint allows users to register data sources and attach a tag automatically when the data sources are registered in Immuta.

The V2 `/data` endpoint is not supported with a connection, and there is no substitute for this behavior at this time. If you require default tags for newly onboarded data sources, reach out to your Immuta support professional before upgrading.

## Users and permissions

### With integrations

| Permission           | Action                | Object      |
| -------------------- | --------------------- | ----------- |
| APPLICATION\_ADMIN   | Configure integration | Integration |
| CREATE\_DATA\_SOURCE | Register tables       | Data source |
| Data owner           | Manage data sources   | Data source |

### With connections

| Permission                       | Action                  | Object                                    |
| -------------------------------- | ----------------------- | ----------------------------------------- |
| APPLICATION\_ADMIN               | Register the connection | Connection, database, schema, data source |
| GOVERNANCE or APPLICATION\_ADMIN | Manage all connections  | Connection, database, schema, data source |
| Data owner                       | Manage data objects     | Connection, database, schema, data source |

## Schema monitoring

{% hint style="success" %}
Schema monitoring is renamed to **object sync** with connections, as it can also monitor for changes at the database and connection level.
{% endhint %}

During object sync, Immuta crawls your connection to ingest metadata for every database, schema, and table that the Snowflake role or Databricks account credentials you provided during configuration can access. Upon completion of the upgrade, the tables' states depend on your previous schema monitoring settings:

* **If you had schema monitoring enabled on a schema**: All tables from that schema will be registered in Immuta as enabled data sources.
* **If you had schema monitoring disabled on a schema**: All tables from that schema (that were not already registered in Immuta) will be registered as disabled data objects. They are visible from the Data Objects tab in Immuta, but are not listed as data sources until they are enabled.
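The two cases above can be summarized as a small decision function. This is only a sketch: the function name and return strings are hypothetical, and it assumes that a table already registered as a data source stays an enabled data source when schema monitoring was disabled, which the text implies but does not state outright.

```python
def state_after_upgrade(schema_monitoring_enabled: bool, already_registered: bool) -> str:
    """Return the expected state of a table once the upgrade completes."""
    if schema_monitoring_enabled:
        # All tables in the schema become enabled data sources.
        return "enabled data source"
    if already_registered:
        # Assumption: previously registered tables remain data sources.
        return "enabled data source"
    # Newly discovered tables in an unmonitored schema start out disabled.
    return "disabled data object"


print(state_after_upgrade(schema_monitoring_enabled=False, already_registered=False))
```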

After the initial upgrade, object sync runs on your connection every 24 hours (at 1:00 AM UTC) to keep your tables in Immuta in sync. Users can also [manually run object sync](https://documentation.immuta.com/SaaS/configuration/integrations/data-and-integrations/registering-a-connection/how-to-guides/crawl-a-connection) via the UI or API.
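To plan around the daily schedule, it can help to translate 1:00 AM UTC into a concrete timestamp. The following sketch computes the next scheduled run after a given moment. The function name is hypothetical, and the input is assumed to be a timezone-aware `datetime`.

```python
from datetime import datetime, time, timedelta, timezone


def next_object_sync(now: datetime) -> datetime:
    """Return the next 1:00 AM UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), time(1, 0), tzinfo=timezone.utc)
    if candidate <= now_utc:
        # Today's run has already happened, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_object_sync(datetime.now(timezone.utc)).isoformat())
```

Convert the result with `astimezone()` to see when the scheduled run falls in your local time zone.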

### Schema projects

With integrations, many settings and the connection details for data sources were controlled in the schema project, including schema monitoring. With connections, this functionality is no longer needed, and you can control connection details in one central place.

{% hint style="warning" %}
**Schema project owners**

With integrations, schema project owners can become schema monitoring owners, control connection settings, and manage subscription policies on the schema project.

These schema project owners will not be represented in connections, and if you want them to have similar abilities, [you must make them **Data Owner** on the schema](https://documentation.immuta.com/SaaS/configuration/integrations/data-and-integrations/how-to-guides/manage-connection-settings#assign-domain-permissions-2).
{% endhint %}

### Additional settings

Object sync provides additional controls compared to schema monitoring:

* **Object status**: Connections, databases, schemas, and tables can be marked enabled or disabled. Enabling a table makes it appear as a data source. By default, all lower-level objects inherit the status of their parent, but you can override this. For example, if you disable a database, all schemas and tables within that database inherit the disabled status. However, if you want one of those tables to be a data source, you can manually enable it.
* **Enable new data objects**: This setting controls the state in which new objects found by object sync are registered in Immuta.
  * **Enable**: New data objects found by object sync will automatically be enabled and tables will be registered as data sources.
  * **Disable**: This is the default. New data objects found by object sync will be disabled.
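The inheritance behavior of object status can be pictured as a tree in which each object either carries its own status or falls back to its parent. The following is a simplified model under stated assumptions, not Immuta's implementation: the `DataObject` class and its fields are hypothetical, and an object with no explicit status and no parent is treated as disabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataObject:
    name: str
    parent: Optional["DataObject"] = None
    status_override: Optional[bool] = None  # None means "inherit from parent"

    def is_enabled(self) -> bool:
        if self.status_override is not None:
            return self.status_override
        if self.parent is not None:
            return self.parent.is_enabled()
        return False  # assumption: nothing set anywhere means disabled


connection = DataObject("My Connection", status_override=True)
database = DataObject("MY_DATABASE", parent=connection, status_override=False)
schema = DataObject("MY_SCHEMA", parent=database)
inherited_table = DataObject("TABLE_A", parent=schema)
enabled_table = DataObject("TABLE_B", parent=schema, status_override=True)

print(inherited_table.is_enabled())  # False: inherits the disabled database
print(enabled_table.is_enabled())    # True: manually enabled
```

In this model, `TABLE_B` is the only object under the disabled database that would appear as a data source.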

### Comparison

|                                          | Integrations (existing)                                                                                                                               | Connections (new)                                                                                                       |
| ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| **Name**                                 | Schema monitoring and column detection                                                                                                                | Object sync                                                                                                             |
| **Where to turn on**                     | Enable (optionally) when configuring a data source                                                                                                    | Enabled by default                                                                                                      |
| **Where to update the feature**          | Enable or disable from the schema project                                                                                                             | Object sync cannot be disabled                                                                                          |
| **Default schedule**                     | Every 24 hours                                                                                                                                        | Every 24 hours (at 1:00 AM UTC)                                                                                         |
| **Can you adjust the default schedule?** | No                                                                                                                                                    | No                                                                                                                      |
| **`New` tags applied automatically**     | `New` tags are applied automatically for a data source being created, a column being added, or a column type being updated on an existing data source | `New` tags are applied automatically for a column being added or a column type being updated on an existing data source |

### Performance

Connections use a new architectural pattern that improves performance when monitoring for [metadata changes](#user-content-fn-4)[^4] in your data platform, particularly with large numbers of data sources. The following scenarios are regularly tested in an isolated environment to provide a benchmark. These numbers can vary based on factors such as (but not limited to) the number and type of policies applied, overall API and user activity in the system, and connection latency to your data platform.

<table data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Scenario 1</strong><br><br>Running object sync on a single schema with 1 data source with 50 columns</td><td><strong>0.512</strong> seconds on average</td></tr><tr><td><strong>Scenario 2</strong><br><br>Running object sync on a single schema with 1,000 data sources with 10 columns each</td><td><strong>9.38</strong> seconds on average</td></tr><tr><td><strong>Scenario 3</strong><br><br>Running object sync on a single schema with 10,000 data sources with 50 columns each</td><td><strong>172.2</strong> seconds on average</td></tr><tr><td><strong>Scenario 4</strong><br><br>Running object sync on 15 databases with 5,500 schemas (100,000 tables)</td><td><strong>0.67</strong> hours on average</td></tr><tr><td><strong>Scenario 5</strong><br><br>Running object sync on 125 databases with 3,000 schemas (200,000 tables)</td><td><strong>1.32</strong> hours on average</td></tr><tr><td><strong>Scenario 6</strong><br><br>Running object sync on 1,000 databases with 17,000 schemas (500,000 tables)</td><td><strong>5.43</strong> hours on average</td></tr></tbody></table>
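For a rough sense of scale, the larger benchmarks can be converted into tables processed per second. This is simple arithmetic on the averages listed above, not an additional measurement, and the helper name is hypothetical. The results work out to roughly 41, 42, and 26 tables per second for scenarios 4, 5, and 6.

```python
# (tables, average hours) taken from scenarios 4 to 6 above
BENCHMARKS = {
    "Scenario 4": (100_000, 0.67),
    "Scenario 5": (200_000, 1.32),
    "Scenario 6": (500_000, 5.43),
}


def tables_per_second(tables: int, hours: float) -> float:
    """Average number of tables processed per second of object sync."""
    return tables / (hours * 3600)


for name, (tables, hours) in BENCHMARKS.items():
    print(f"{name}: {tables_per_second(tables, hours):.1f} tables/second")
```

Because the benchmarks vary with the factors listed above, treat these rates as an order-of-magnitude guide rather than a prediction for your environment.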

#### Databricks Unity Catalog

Data sources with integrations required users to [manually create the schema monitoring job in Databricks](https://documentation.immuta.com/SaaS/configuration/integrations/registering-metadata/register-data-sources/databricks-tutorial#enable-or-disable-schema-monitoring). However, this job has been fully automated on data sources with connections, and this step is no longer necessary.

## APIs

Consolidating integration setup and data source registration into a single connection significantly simplifies programmatic interaction with the Immuta APIs. Actions that used to be managed through multiple endpoints can now be performed through a single standardized endpoint. As a result, multiple API endpoints are blocked once a user has upgraded their connection.

All blocked APIs return an error indicating "400 Bad Request - \[...]. Use the /data endpoint." This error means that you need to update the processes that call the Immuta APIs to use the new `/data` endpoint instead. For details, see the [API changes](https://documentation.immuta.com/SaaS/configuration/integrations/data-and-integrations/registering-a-connection/reference-guides/upgrading-to-connections/api-changes) page.
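Existing automation can watch for this error to find calls that still need to be migrated. The check below is a minimal sketch: the function name is hypothetical, and it assumes the hint text quoted above appears in the error message your HTTP client surfaces. It makes no network calls and works on a status code and message you have already captured.

```python
BLOCKED_HINT = "Use the /data endpoint"


def is_blocked_after_upgrade(status_code: int, message: str) -> bool:
    """Return True if a response looks like the blocked-endpoint error."""
    return status_code == 400 and BLOCKED_HINT in message


# Message text as quoted in this guide; the elided part is left as-is.
sample = "400 Bad Request - [...]. Use the /data endpoint."
if is_blocked_after_upgrade(400, sample):
    print("This call must be migrated to the /data endpoint.")
```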

[^1]: When you upgrade to connections, connection tags will automatically be applied to all data sources belonging to a connection.

[^2]: Currently supported only if table grants is disabled.

[^3]: Currently supported only if low row access policy mode is disabled.

[^4]: Immuta monitors your data platform to identify when data objects are created, updated or deleted, and automatically registers those changes by updating the corresponding data source's columns in Immuta. This guarantees that policies continue to be applied without requiring manual user intervention and ensures continuous compliance with your organization's access requirements.
