# Register a Databricks Unity Catalog Connection

{% hint style="info" %}
[Connections](https://documentation.immuta.com/latest/configuration/integrations/registering-metadata/connections/reference-guides/connections-reference-guide) allow you to register your data objects in a technology through a single connection, instead of registering data sources and an integration separately.

This feature is available to all 2025.1+ tenants. Contact your Immuta representative to enable this feature.
{% endhint %}

## **Requirements**

* **Immuta user** with the `APPLICATION_ADMIN` Immuta permission
* [**Databricks service principal**](#user-content-fn-1)[^1] with the following privileges. For instructions on setting up this service principal, see the [Creating the Databricks service principal section](#setting-up-the-required-databricks-service-principal):
  * `USE CATALOG` and `MANAGE` on all catalogs containing securables you want registered as Immuta data sources.
  * `USE SCHEMA` on all schemas containing securables you want registered as Immuta data sources.
  * [`MODIFY`](#user-content-fn-2)[^2] and `SELECT` on all securables you want registered as Immuta data sources.
  * Additional privileges are required for query audit:
    * `USE CATALOG` on the `system` catalog
    * `USE SCHEMA` on the `system.access` and `system.query` schemas
    * `SELECT` on the following system tables:
      * `system.access.table_lineage`
      * `system.access.column_lineage`
      * `system.access.audit`
      * `system.query.history`
* **Databricks user to run the registration script** with the following privileges:
  * [`Metastore admin` and `account admin`](#user-content-fn-3)[^3]
  * `CREATE CATALOG` privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables

See the [Databricks documentation](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/privileges.html) for more details about Unity Catalog privileges and securable objects.

### **Prerequisites**

* Unity Catalog [metastore created](https://docs.databricks.com/data-governance/unity-catalog/create-metastore.html) and attached to a Databricks workspace.
* Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.
* No Databricks Unity Catalog integration configured in Immuta. If your Databricks Unity Catalog integration is already configured on the app settings page, follow the [Use the connection upgrade manager guide](https://documentation.immuta.com/latest/configuration/integrations/registering-metadata/connections/how-to-guides/use-the-connection-upgrade-manager).

## Register a connection

{% hint style="warning" %}
**Create a separate Immuta catalog for each Immuta tenant**

If multiple Immuta tenants are connected to your Databricks environment, create a separate **Immuta catalog** for each of those tenants. Having multiple Immuta tenants use the same Immuta catalog causes failures in policy enforcement.
{% endhint %}

1. Click <i class="fa-database">:database:</i> **Data** and select the **Connections** tab in the navigation menu.
2. Click the **+ Add Connection** button.
3. Select the Databricks data platform tile.
4. Enter the connection information:
   * **Host**: The hostname of your Databricks workspace.
   * **Port**: The port of your Databricks cluster or SQL warehouse (typically `443`).
   * **HTTP Path**: The HTTP path of your Databricks cluster or SQL warehouse.
   * **Immuta Catalog**: The name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
   * **Display Name**: The display name is the unique name of your connection. It is used as a prefix in the names of all data objects associated with this connection, appears as the display name in the UI, and is used in all API calls made to update or delete the connection. Avoid periods (`.`) and [restricted words](#user-content-fn-4)[^4] in your connection name.
5. Click **Next**.
6. Select your authentication method from the dropdown:
   * **Access Token**: Enter the **Access Token** in the Immuta System Account Credentials section. This is the access token for the Immuta service principal, which can be [an on-behalf token created in Databricks](https://docs.databricks.com/api/workspace/tokenmanagement/createobotoken). This service principal must have the metastore [privileges listed above](#requirements) for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the connection to continue to function. This authentication information will be included in the script populated later on the page.
   * **OAuth M2M**:
     * **AWS Databricks**:
       * Follow [Databricks documentation to create a client secret](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html) for the Immuta service principal and assign this service principal the [privileges listed above](#requirements) for the metastore associated with the Databricks workspace.
       * Fill out the **Token Endpoint** with the full URL of the identity provider. This is where the generated token is sent. The default value is `https://<your workspace name>.cloud.databricks.com/oidc/v1/token`.
       * Fill out the **Client ID**. This is a combination of letters, numbers, or symbols, used as a public identifier and is the [client ID displayed in Databricks when creating the client secret for the service principal](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#step-3-create-an-oauth-secret-for-a-service-principal).
       * Enter the **Scope** (string). The scope limits the operations and roles allowed in Databricks by the access token. See the [OAuth 2.0 documentation](https://oauth.net/2/scope/) for details about scopes.
       * Enter the **Client Secret** you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.
     * **Azure Databricks:**
       * Follow [Databricks documentation](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/oauth-m2m) to create a service principal within Azure and then add it to your Databricks account and workspace.
       * Assign this service principal the [privileges listed above](#requirements) for the metastore associated with the Databricks workspace.
       * Within Databricks, [create an OAuth client secret for the service principal](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/oauth-m2m#step-5-create-an-azure-databricks-oauth-secret-for-the-service-principal). This completes your Databricks-based service principal setup.
       * Within Immuta, fill out the **Token Endpoint** with the full URL of the identity provider. This is where the generated token is sent. The default value is `https://<your workspace name>.azuredatabricks.net/oidc/v1/token`.
       * Fill out the **Client ID**. This is a combination of letters, numbers, or symbols, used as a public identifier and is the [client ID displayed in Databricks when creating the client secret for the service principal](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#step-3-create-an-oauth-secret-for-a-service-principal). Note that Azure Databricks uses the client ID of the Azure service principal; the two values are identical.
       * Enter the **Scope** (string). The scope limits the operations and roles allowed in Databricks by the access token. See the [OAuth 2.0 documentation](https://oauth.net/2/scope/) for details about scopes.
       * Enter the **Client Secret** you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.
7. Copy the provided script and run it in Databricks as a user with the [privileges listed in the requirements section](#requirements).
8. Click **Validate Connection**.
9. If the connection is successful, click **Next**. If there are any errors, check the connection details and credentials to ensure they are correct and try again.
10. Ensure all the details are correct in the summary and click **Complete Setup**.
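The OAuth M2M option above follows the standard OAuth 2.0 client-credentials flow: a token request is POSTed to the token endpoint with the client ID, client secret, and scope you entered. As a rough, unofficial sketch of that exchange (workspace URL, client credentials, and scope below are placeholders, not values to copy):

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(endpoint: str, client_id: str,
                        client_secret: str, scope: str) -> Request:
    """Build a standard OAuth 2.0 client-credentials token request.

    Mirrors what a client does with the Token Endpoint, Client ID,
    Client Secret, and Scope fields described above. All values
    passed in here are illustrative placeholders.
    """
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical workspace and credentials; the token endpoint shape matches
# the default value shown in the form above.
req = build_token_request(
    "https://my-workspace.cloud.databricks.com/oidc/v1/token",
    "my-client-id",
    "my-client-secret",
    "all-apis",  # a common Databricks scope; confirm the right scope for your setup
)
```

The response to such a request contains the bearer token Immuta then uses against the Databricks APIs; token acquisition and refresh are handled by Immuta, so this is only to clarify what the form fields feed into.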

{% hint style="warning" %}
**Databricks Unity Catalog behavior**

If you register a connection and a data object has no subscription policy set on it, Immuta will REVOKE access to the data in Databricks for all Immuta users, even if they had been directly granted access to the table in Unity Catalog.

If you disable a Unity Catalog data source in Immuta, all existing grants and policies on that object will be removed in Databricks for all Immuta users, regardless of whether they were set in Immuta or in Unity Catalog directly.

If a user is not registered in Immuta, Immuta will have no effect on that user's access to data in Unity Catalog.

See the [Databricks Unity Catalog reference guide](https://documentation.immuta.com/latest/configuration/integrations/databricks/unity-catalog-overview#user-permissions-immuta-revokes) for more details about permissions Immuta revokes and how to configure this behavior for your connection.
{% endhint %}

## Setting up the required Databricks service principal

If you need instructions for setting up your Databricks service principal **before registering your connection**, see the steps below.

### Creating the Databricks service principal

In Databricks, [create a service principal](https://docs.databricks.com/en/admin/users-groups/service-principals.html#manage-service-principals-in-your-account) with the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

* `USE CATALOG` and `MANAGE` on all catalogs containing securables you want registered as Immuta data sources.
* `USE SCHEMA` on all schemas containing securables you want registered as Immuta data sources.
* `MODIFY` and `SELECT` on all securables you want registered as Immuta data sources. *The* `MODIFY` *privilege is not required for materialized views registered as Immuta data sources, since* `MODIFY` *is not a supported privilege on that object type in* [*Databricks*](https://docs.databricks.com/aws/en/data-governance/unity-catalog/manage-privileges/privileges#privilege-types-by-securable-object-in-unity-catalog)*.*

{% hint style="info" %}
`MANAGE` and `MODIFY` are required so that the service principal can apply row filters and column masks on the securable; to do so, the service principal must also have `SELECT` on the securable as well as `USE CATALOG` on its parent catalog and `USE SCHEMA` on its parent schema. Since privileges are inherited, you can grant the service principal the `MODIFY` and `SELECT` privileges on all catalogs or schemas containing Immuta data sources, which automatically grants those privileges on all current and future securables in the catalog or schema. The service principal also inherits `MANAGE` from the parent catalog for the purpose of applying row filters and column masks, but `MANAGE` must be granted directly on the parent catalog in order for grants to be fully applied.
{% endhint %}

See the [Databricks documentation](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/privileges.html) for more details about Unity Catalog privileges and securable objects.
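Because privileges are inherited, the simplest setup grants everything at the catalog level. As a convenience sketch (not an official Immuta script; the catalog and principal names below are placeholders), the helper generates the catalog-level `GRANT` statements for the service principal:

```python
def catalog_grants(catalog: str, principal: str) -> list[str]:
    """Generate catalog-level GRANT statements for the Immuta service
    principal.

    Relies on Unity Catalog privilege inheritance: granting SELECT and
    MODIFY on the catalog covers all current and future securables in
    it. MANAGE is included because it must be granted directly on the
    catalog for grants to be fully applied.
    """
    privileges = ["USE CATALOG", "MANAGE", "USE SCHEMA", "SELECT", "MODIFY"]
    return [
        f"GRANT {p} ON CATALOG {catalog} TO `{principal}`;"
        for p in privileges
    ]


# Placeholder names -- review the output, then run it in a Databricks
# SQL editor as a user allowed to grant on the catalog.
for stmt in catalog_grants("sales_catalog", "immuta-service-principal"):
    print(stmt)
```

If you instead scope grants per schema or per securable, remember the materialized-view caveat above: `MODIFY` is not a supported privilege on that object type.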

### Configuring query audit privileges

{% hint style="info" %}
Audit is enabled by default on all Databricks Unity Catalog connections. If you need to turn audit off, [create the connection with the connections API](https://documentation.immuta.com/latest/developer-guides/api-intro/connections-api/how-to-guides/register-a-connection/register-a-databricks-unity-catalog-connection) and set `audit` to `false` in the payload.
{% endhint %}

[Grant the service principal access to the Databricks Unity Catalog system tables](https://docs.databricks.com/en/administration-guide/system-tables/index.html#grant-access-to-system-tables). For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access:

* `USE CATALOG` on the `system` catalog
* `USE SCHEMA` on the `system.access` and `system.query` schemas
* `SELECT` on the following system tables:
  * `system.access.table_lineage`
  * `system.access.column_lineage`
  * `system.access.audit`
  * `system.query.history`

Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user that is both a metastore admin and an account admin must grant the `USE SCHEMA` and `SELECT` privileges on the system schemas to the service principal. See [Manage privileges in Unity Catalog](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/index.html).
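As with the catalog grants, these audit privileges can be expressed as a short list of `GRANT` statements. The sketch below is unofficial and the principal name is a placeholder; it simply enumerates the minimum access listed above:

```python
# The system tables Immuta reads for query audit, per the list above.
AUDIT_TABLES = [
    "system.access.table_lineage",
    "system.access.column_lineage",
    "system.access.audit",
    "system.query.history",
]


def audit_grants(principal: str) -> list[str]:
    """Generate the minimum GRANT statements for query audit:
    USE CATALOG on system, USE SCHEMA on the two system schemas,
    and SELECT on each audited system table.
    """
    stmts = [f"GRANT USE CATALOG ON CATALOG system TO `{principal}`;"]
    stmts += [
        f"GRANT USE SCHEMA ON SCHEMA {schema} TO `{principal}`;"
        for schema in ("system.access", "system.query")
    ]
    stmts += [
        f"GRANT SELECT ON TABLE {table} TO `{principal}`;"
        for table in AUDIT_TABLES
    ]
    return stmts


# Placeholder principal -- must be run by a user who is both a
# metastore admin and an account admin.
for stmt in audit_grants("immuta-service-principal"):
    print(stmt)
```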

[^1]: A Databricks user authorized to create a Databricks service principal must create one for Immuta. This service principal is used continuously by Immuta to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

[^2]: *The* `MODIFY` *privilege is not required for materialized views registered as Immuta data sources, since* `MODIFY` *is not a supported privilege on that object type in* [*Databricks*](https://docs.databricks.com/aws/en/data-governance/unity-catalog/manage-privileges/privileges#privilege-types-by-securable-object-in-unity-catalog)*.*

[^3]: These privileges are required to enable query audit. See [Manage privileges in Unity Catalog](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/index.html) for details.

[^4]: Your display name cannot be any of the following words: `data`, `connection`, `object`, `crawl`, `search`, `settings`, `metadata`, `permission`, `sync`, `bulk`, and `upgrade`.
