Register a Databricks Unity Catalog Connection

Requirements

  • APPLICATION_ADMIN Immuta permission

  • The Databricks user registering the connection and running the script must have the following privileges:

    • Metastore admin and account admin

    • CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables

See the Databricks documentation for more details about Unity Catalog privileges and securable objects.

Prerequisites

  • Unity Catalog metastore created and attached to a Databricks workspace.

  • Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.

Create the Databricks service principal

In Databricks, create a service principal with the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

  • USE CATALOG and MANAGE on all catalogs containing securables you want registered as Immuta data sources.

  • USE SCHEMA on all schemas containing securables you want registered as Immuta data sources.

  • MODIFY and SELECT on all securables you want registered as Immuta data sources. MODIFY is not required for materialized views registered as Immuta data sources, because Databricks does not support that privilege on materialized views.

MANAGE and MODIFY are required so that the service principal can apply row filters and column masks on the securable; to do so, the service principal must also have SELECT on the securable, USE CATALOG on its parent catalog, and USE SCHEMA on its parent schema. Because privileges are inherited, you can grant the service principal MODIFY and SELECT on every catalog or schema containing Immuta data sources, which automatically grants it MODIFY and SELECT on all current and future securables in that catalog or schema. The service principal also inherits MANAGE from the parent catalog for the purpose of applying row filters and column masks, but MANAGE must be granted directly on the parent catalog for grants to be fully applied.

See the Databricks documentation for more details about Unity Catalog privileges and securable objects.
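
The privileges listed above can be turned into GRANT statements to run in Databricks. The following is a minimal sketch, not an Immuta-provided script: the function name `data_source_grants`, the principal identifier, and the catalog, schema, and table names are all hypothetical. It assumes the service principal is referenced by its application ID and that every securable is a table (materialized views, which do not take MODIFY, are left out).

```python
def quote(identifier: str) -> str:
    """Backtick-quote a SQL identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def data_source_grants(principal: str, catalog: str, schema: str,
                       tables: list[str]) -> list[str]:
    """Build GRANT statements for one schema of tables (hypothetical helper).

    Grants USE CATALOG and MANAGE on the catalog, USE SCHEMA on the schema,
    and SELECT and MODIFY on each listed table.
    """
    who = quote(principal)
    cat = quote(catalog)
    sch = f"{cat}.{quote(schema)}"
    statements = [
        # MANAGE is granted directly on the parent catalog.
        f"GRANT USE CATALOG, MANAGE ON CATALOG {cat} TO {who};",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who};",
    ]
    for table in tables:
        statements.append(
            f"GRANT SELECT, MODIFY ON TABLE {sch}.{quote(table)} TO {who};"
        )
    return statements


if __name__ == "__main__":
    # Placeholder values; substitute your own principal and securables.
    for stmt in data_source_grants("sp-app-id", "sales", "emea", ["orders"]):
        print(stmt)
```

Because privileges are inherited, granting SELECT and MODIFY at the catalog or schema level instead of per table would shorten the output; the per-table form is used here only to mirror the list above.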

Set up query audit

Audit is enabled by default on all Databricks Unity Catalog connections. If you need to turn audit off, create the connection with the connections API and set audit to false in the payload.

Grant the service principal access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.

  • USE CATALOG on the system catalog

  • USE SCHEMA on the system.access and system.query schemas

  • SELECT on the following system tables:

    • system.access.table_lineage

    • system.access.column_lineage

    • system.access.audit

    • system.query.history

    Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user who is both a metastore admin and an account admin must grant the USE SCHEMA and SELECT privileges on the system schemas to the service principal. See Manage privileges in Unity Catalog.
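
The minimum audit access listed above can be expressed as a short series of GRANT statements. The sketch below only builds the statement text; the function name `audit_grants` and the principal identifier are hypothetical, and it assumes the service principal is referenced by its application ID. A user who is both a metastore admin and an account admin would still need to run the resulting statements in Databricks.

```python
# System tables named in the audit requirements above.
SYSTEM_TABLES = [
    "system.access.table_lineage",
    "system.access.column_lineage",
    "system.access.audit",
    "system.query.history",
]


def audit_grants(principal: str) -> list[str]:
    """Build the minimum GRANT statements for query audit (hypothetical helper)."""
    who = "`" + principal.replace("`", "``") + "`"
    # Derive the parent schemas (system.access, system.query) from the table names.
    schemas = sorted({table.rsplit(".", 1)[0] for table in SYSTEM_TABLES})
    statements = [f"GRANT USE CATALOG ON CATALOG system TO {who};"]
    statements += [f"GRANT USE SCHEMA ON SCHEMA {s} TO {who};" for s in schemas]
    statements += [f"GRANT SELECT ON TABLE {t} TO {who};" for t in SYSTEM_TABLES]
    return statements


if __name__ == "__main__":
    # Placeholder principal; substitute your service principal's identifier.
    print("\n".join(audit_grants("sp-app-id")))
```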

Register a connection

  1. Click Data and select the Connections tab in the navigation menu.

  2. Click the + Add Connection button.

  3. Select the Databricks data platform tile.

  4. Enter the connection information:

    • Host: The hostname of your Databricks workspace.

    • Port: Your Databricks port.

    • HTTP Path: The HTTP path of your Databricks cluster or SQL warehouse.

    • Immuta Catalog: The name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog should be readable only by the Immuta service principal; do not grant access to it to other users. The catalog name may contain only letters, numbers, and underscores and cannot start with a number.

    • Display Name: The unique name of your connection. It is used as a prefix in the names of all data objects associated with this connection, appears as the display name in the UI, and is used in all API calls made to update or delete the connection.

  5. Click Next.

  6. Select your authentication method from the dropdown.

  7. Copy the provided script and run it in Databricks as a user with the privileges listed in the requirements section.

  8. Click Validate Connection.

  9. If the connection is successful, click Next. If there are any errors, check the connection details and credentials to ensure they are correct and try again.

  10. Ensure all the details are correct in the summary and click Complete Setup.
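
The Immuta Catalog naming rule in step 4 (only letters, numbers, and underscores, and no leading number) can be checked before you start the registration. The following is a minimal sketch under the assumption that "letters" means ASCII letters; the function name `is_valid_immuta_catalog_name` is hypothetical and is not part of any Immuta or Databricks API.

```python
import re

# Letters, digits, and underscores only; the first character must not be a digit.
# Assumption: "letters" is read as ASCII A-Z and a-z.
_CATALOG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_immuta_catalog_name(name: str) -> bool:
    """Return True if the name satisfies the rule stated in step 4."""
    return _CATALOG_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["immuta_catalog", "1immuta", "immuta-catalog"]:
        print(candidate, is_valid_immuta_catalog_name(candidate))
```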
