Connect a Databricks Unity Catalog Host with Personal Access Token
This page details how to use the /data v1 API to connect a Databricks Unity Catalog host to Immuta using a service principal with a personal access token (PAT). This connection works with a single set of credentials rather than configuring an integration and registering data sources separately. To manage your host, see the Manage a host reference guide.
Requirements
To complete this guide, you must be a user with the following:
Immuta permissions:
- APPLICATION_ADMIN
- CREATE_DATA_SOURCE
Databricks authorizations:
- Account or workspace admin
- CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables
Prerequisites
Unity Catalog metastore created and attached to a Databricks workspace. See the Databricks Unity Catalog reference guide for information on workspaces and catalog isolation support with Immuta.
Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.
To connect a Databricks host, you must do the following:
1. Create a service principal in Databricks Unity Catalog with the proper Databricks permissions for Immuta to use to manage policies in Unity Catalog.
2. Enable Databricks Unity Catalog in Immuta.
3. Set up Unity Catalog system tables for native query audit.
4. Use the /integrations/scripts/create endpoint to receive a script.
5. Run the script in Databricks Unity Catalog.
6. Use the /data/connection endpoint to finish creating the connection between your host and Immuta.
Step 1: Create your service principal
Create a Databricks service principal that is set up with personal access token (PAT) authentication and grant it the Databricks permissions outlined below.
The Immuta service principal requires the following Databricks privileges to connect to Databricks to create the integration catalog, configure the necessary procedures and functions, and maintain state between Databricks and Immuta:
- OWNER permission on the Immuta catalog you configure.
- OWNER permission on catalogs with schemas and tables registered as Immuta data sources so that Immuta can administer Unity Catalog row-level and column-level security controls. This permission can be applied by granting OWNER on a catalog to a Databricks group that includes the Immuta service principal to allow for multiple owners. If the OWNER permission cannot be applied at the catalog or schema level, each table registered as an Immuta data source must individually have the OWNER permission granted to the Immuta service principal.
- USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta service principal can interact with those tables.
- SELECT and MODIFY on all tables registered as Immuta data sources so that the Immuta service principal can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
- USE CATALOG on the system catalog for native query audit.
- USE SCHEMA on the system.access schema for native query audit.
- SELECT on the following system tables for native query audit:
  - system.access.audit
  - system.access.table_lineage
  - system.access.column_lineage
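These grants can be applied in a Databricks notebook or SQL editor, or programmatically. The sketch below uses the databricks-sql-connector Python library to apply a representative subset of the grants above; the service principal name, catalog, schema, table, and connection values are hypothetical placeholders, and the complete set of statements depends on which objects you register as Immuta data sources. The system-table grants for native query audit are shown in step 3.

```python
# Minimal sketch: applying a subset of the grants above with the
# databricks-sql-connector (pip install databricks-sql-connector).
# The principal, catalog, schema, and table names are hypothetical placeholders.
from databricks import sql

IMMUTA_PRINCIPAL = "`immuta-service-principal`"  # or a group that contains it

statements = [
    # Ownership of a catalog that holds tables registered as Immuta data sources
    f"ALTER CATALOG analytics OWNER TO {IMMUTA_PRINCIPAL}",
    # Allow the principal to reach and administer registered tables
    f"GRANT USE CATALOG ON CATALOG analytics TO {IMMUTA_PRINCIPAL}",
    f"GRANT USE SCHEMA ON SCHEMA analytics.sales TO {IMMUTA_PRINCIPAL}",
    f"GRANT SELECT, MODIFY ON TABLE analytics.sales.orders TO {IMMUTA_PRINCIPAL}",
]

# Run the statements as a user with permission to grant these privileges.
with sql.connect(
    server_hostname="<workspace-host>",       # e.g. <your-workspace>.cloud.databricks.com
    http_path="<sql-warehouse-http-path>",
    access_token="<admin-personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        for statement in statements:
            cursor.execute(statement)
```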
Step 2: Enable Unity Catalog in Immuta
Enable Databricks Unity Catalog on the Immuta app settings page:
Click the App Settings icon in the left sidebar.
Scroll to the Global Integrations Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox.
Step 3: Set up native query audit
Enable native query audit by completing these steps in Unity Catalog:
Grant the service principal from step 1 access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access:
- USE CATALOG on the system catalog
- USE SCHEMA on the system.access schema
- SELECT on the following system tables:
  - system.access.audit
  - system.access.table_lineage
  - system.access.column_lineage
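As with step 1, these grants can be run in a notebook or SQL editor, or applied programmatically. The sketch below uses the databricks-sql-connector Python library; the service principal name and connection details are hypothetical placeholders.

```python
# Minimal sketch: granting the Immuta service principal read access to the
# Unity Catalog system tables used for native query audit.
from databricks import sql

IMMUTA_PRINCIPAL = "`immuta-service-principal`"  # hypothetical placeholder

audit_grants = [
    f"GRANT USE CATALOG ON CATALOG system TO {IMMUTA_PRINCIPAL}",
    f"GRANT USE SCHEMA ON SCHEMA system.access TO {IMMUTA_PRINCIPAL}",
    f"GRANT SELECT ON TABLE system.access.audit TO {IMMUTA_PRINCIPAL}",
    f"GRANT SELECT ON TABLE system.access.table_lineage TO {IMMUTA_PRINCIPAL}",
    f"GRANT SELECT ON TABLE system.access.column_lineage TO {IMMUTA_PRINCIPAL}",
]

with sql.connect(
    server_hostname="<workspace-host>",
    http_path="<sql-warehouse-http-path>",
    access_token="<admin-personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        for statement in audit_grants:
            cursor.execute(statement)
```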
Step 4: Generate the script
POST /integrations/scripts/create
Copy the request and update the <placeholder_values> with your connection details. Then submit the request. A sketch of the request appears after the table below.
Find descriptions of the editable attributes in the table below and of the full payload in the Integration configuration payload reference guide. All values should be included; those you should not edit are noted.
Payload parameters
Attribute | Description | Required |
---|---|---|
config.workspaceUrl | Your Databricks workspace URL. | Yes |
config.httpPath | The HTTP path of your Databricks cluster or SQL warehouse. | Yes |
config.token | The Databricks personal access token for the service principal created for Immuta in step 1. | Yes |
config.catalog | The name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable by the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number. | Yes |
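As an illustration, the sketch below submits the request with Python's requests library. The Immuta URL, the bearer-token Authorization header, and all placeholder values are assumptions, and the payload shows only the attributes documented in the table above; consult the Integration configuration payload reference guide for the complete set of required attributes.

```python
# Minimal sketch: requesting the Databricks Unity Catalog setup script from
# the Immuta v1 API. The Immuta URL, authentication header, and placeholder
# values are assumptions for illustration only.
import requests

IMMUTA_URL = "https://<your-immuta-host>"
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <immuta-api-token>",  # assumes bearer-token auth
}

payload = {
    # Only the attributes documented in the table above are shown; the full
    # payload (see the Integration configuration payload reference guide)
    # includes additional required attributes.
    "config": {
        "workspaceUrl": "<your-workspace>.cloud.databricks.com",
        "httpPath": "/sql/1.0/warehouses/<warehouse-id>",
        "token": "<service-principal-pat>",
        "catalog": "immuta",  # the catalog Immuta will create
    },
}

response = requests.post(
    f"{IMMUTA_URL}/integrations/scripts/create", headers=HEADERS, json=payload
)
response.raise_for_status()
print(response.text)  # the response contains the script to run in step 5
```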
Step 5: Run the script in Databricks Unity Catalog
The request in step 4 returns a script. Copy the script and run it in your Databricks Unity Catalog environment as a user with the permissions listed in the requirements section.
The script uses the service principal, which authenticates with the personal access token (PAT) you specified in step 4. Additionally, the script creates the catalog you specified in step 4.
Step 6: Create the host in Immuta
POST /data/connection
Copy the request and update the <placeholder_values> with your connection details. Note that the connection details here should match the ones used in step 4. Then submit the request. A sketch of the request appears after the payload parameters table below.
Find descriptions of the editable attributes in the table below and of the full payload in the Databricks Unity Catalog host payload table. All values should be included; those you should not edit are noted.
Test run
Opt to test and validate the create connection payload using a dry run:
POST /data/connection/test
Payload parameters
Attribute | Description | Required |
---|---|---|
connectionKey | A unique name for the host connection. | Yes |
connection | Configuration attributes that should match the values used when getting the script from the integration endpoint. | Yes |
connection.hostname | Your Databricks workspace URL. This is the same as config.workspaceUrl from the script request in step 4. | Yes |
connection.port | The port to use when connecting to your Databricks account host. Defaults to 443. | Yes |
connection.httpPath | The HTTP path of your Databricks cluster or SQL warehouse. | Yes |
connection.token | The Databricks personal access token for the service principal created for Immuta in step 1. | Yes |
nativeIntegration | Configuration attributes that should match the values used when getting the script from the integration endpoint. | Yes |
nativeIntegration.config.token | Same as connection.token. | Yes |
nativeIntegration.config.host | Same as connection.hostname. | Yes |
nativeIntegration.config.port | Same as connection.port. | Yes |
nativeIntegration.config.catalog | The name of the Databricks catalog created with the script from step 4. | Yes |
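The sketch below shows both the optional dry run and the create request using Python's requests library. The Immuta URL, the bearer-token Authorization header, the connectionKey, and all placeholder values are assumptions for illustration; only the attributes documented in the table above are shown, and the connection details must match those used for the script request in step 4.

```python
# Minimal sketch: validating and creating the Databricks Unity Catalog host
# connection. Placeholder values and the authentication header are assumptions.
import requests

IMMUTA_URL = "https://<your-immuta-host>"
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <immuta-api-token>",  # assumes bearer-token auth
}

payload = {
    "connectionKey": "databricks-prod",  # any unique name for the host connection
    "connection": {
        "hostname": "<your-workspace>.cloud.databricks.com",
        "port": 443,
        "httpPath": "/sql/1.0/warehouses/<warehouse-id>",
        "token": "<service-principal-pat>",
        # additional connection attributes omitted; see the host payload table
    },
    "nativeIntegration": {
        "config": {
            "host": "<your-workspace>.cloud.databricks.com",
            "port": 443,
            "token": "<service-principal-pat>",
            "catalog": "immuta",  # catalog created by the script in step 5
            # additional config attributes omitted; see the host payload table
        },
        # additional nativeIntegration attributes omitted
    },
}

# Optional dry run: test and validate the payload without creating anything.
test = requests.post(f"{IMMUTA_URL}/data/connection/test", headers=HEADERS, json=payload)
test.raise_for_status()

# Create the connection.
response = requests.post(f"{IMMUTA_URL}/data/connection", headers=HEADERS, json=payload)
response.raise_for_status()
print(response.json())  # includes objectPath and bulkId (see the response schema below)
```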
Response schema
Attribute | Description |
---|---|
objectPath | The list of names that uniquely identify the path to a data object in the remote platform's hierarchy. The first element should be the associated connectionKey. |
bulkId | A bulk ID that can be used to search for the status of background jobs triggered by this request. |
Example response
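An illustrative response, based on the response schema above; the connectionKey and bulkId values are placeholders:

```json
{
  "objectPath": ["databricks-prod"],
  "bulkId": "a3f1c2d4-5678-90ab-cdef-1234567890ab"
}
```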