Register a Databricks Unity Catalog Connection
Requirements
APPLICATION_ADMIN Immuta permission
The Databricks user registering the connection and running the script must have the following privileges:
CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables
See the Databricks documentation for more details about Unity Catalog privileges and securable objects.
Prerequisites
Unity Catalog metastore created and attached to a Databricks workspace.
Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.
Create the Databricks service principal
In Databricks, create a service principal with the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.
USE CATALOG and MANAGE on all catalogs containing securables you want registered as Immuta data sources.
USE SCHEMA on all schemas containing securables you want registered as Immuta data sources.
MODIFY and SELECT on all securables you want registered as Immuta data sources.
See the Databricks documentation for more details about Unity Catalog privileges and securable objects.
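The privileges above can be expressed as Unity Catalog GRANT statements. The following is a minimal sketch that only builds the SQL text; it does not connect to Databricks. The function names, the example principal, and the catalog, schema, and table names are hypothetical, and it assumes the securables are tables. Review the generated statements against the Databricks documentation before running them.

```python
def quote(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def service_principal_grants(principal, catalogs, schemas, tables):
    """Build GRANT statements for the privileges listed above.

    principal: identifier of the service principal (hypothetical value here).
    catalogs:  catalog names.
    schemas:   (catalog, schema) pairs.
    tables:    (catalog, schema, table) triples; assumes table securables.
    """
    to = f"TO {quote(principal)}"
    statements = []
    for catalog in catalogs:
        statements.append(
            f"GRANT USE CATALOG, MANAGE ON CATALOG {quote(catalog)} {to}"
        )
    for catalog, schema in schemas:
        statements.append(
            f"GRANT USE SCHEMA ON SCHEMA {quote(catalog)}.{quote(schema)} {to}"
        )
    for catalog, schema, table in tables:
        target = f"{quote(catalog)}.{quote(schema)}.{quote(table)}"
        statements.append(f"GRANT SELECT, MODIFY ON TABLE {target} {to}")
    return statements


if __name__ == "__main__":
    for statement in service_principal_grants(
        "immuta-sp",
        catalogs=["sales"],
        schemas=[("sales", "raw")],
        tables=[("sales", "raw", "orders")],
    ):
        print(statement + ";")
```

Generating the statements as text keeps the grants reviewable before anyone with sufficient privileges runs them in a Databricks SQL editor or notebook.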
Set up query audit
Enable query audit by completing these steps in Unity Catalog:
Grant the service principal access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT privileges on the system schemas to the service principal. See Manage privileges in Unity Catalog. The system.access schema must also be enabled on the metastore before it can be used.
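As a quick way to confirm nothing was missed, the minimum audit access listed above can be written down as data and compared against what has been granted. This is a sketch under stated assumptions: the `granted` input is a hypothetical list of (privilege, securable) pairs that you collect yourself, for example by reading the output of SHOW GRANTS, and the check does not account for privileges inherited from a parent securable.

```python
# Minimum access Immuta needs for Unity Catalog query audit, as listed above.
REQUIRED_AUDIT_ACCESS = {
    ("USE CATALOG", "system"),
    ("USE SCHEMA", "system.access"),
    ("SELECT", "system.access.audit"),
    ("SELECT", "system.access.table_lineage"),
    ("SELECT", "system.access.column_lineage"),
}


def missing_audit_access(granted):
    """Return the required (privilege, securable) pairs not present in `granted`.

    `granted` is an iterable of (privilege, securable) pairs. Privileges are
    compared case-insensitively; inherited privileges are not considered.
    """
    normalized = {
        (privilege.strip().upper(), securable.strip().lower())
        for privilege, securable in granted
    }
    return sorted(REQUIRED_AUDIT_ACCESS - normalized)


if __name__ == "__main__":
    granted = [
        ("USE CATALOG", "system"),
        ("USE SCHEMA", "system.access"),
        ("SELECT", "system.access.audit"),
    ]
    for privilege, securable in missing_audit_access(granted):
        print(f"Missing: {privilege} on {securable}")
```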
Register a connection
Click Data and select the Connections tab in the navigation menu.
Click the + Add Connection button.
Select the Databricks data platform tile.
Enter the connection information:
Host: The hostname of your Databricks workspace.
Port: Your Databricks port.
HTTP Path: The HTTP path of your Databricks cluster or SQL warehouse.
Immuta Catalog: The name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will be readable only by the Immuta service principal, and access to it should not be granted to other users. The catalog name may contain only letters, numbers, and underscores and cannot start with a number.
Display Name: The unique name of your connection. It is used as a prefix in the names of all data objects associated with this connection, appears as the display name in the UI, and is used in all API calls made to update or delete the connection.
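The Immuta Catalog naming rule above can be checked before submitting the form. The sketch below is an illustration only: the function name is hypothetical, and it assumes "letters" means ASCII letters and that a leading underscore is acceptable, since the rule only forbids a leading number.

```python
import re

# Letters, numbers, and underscores only; must not start with a number.
# Assumption: "letters" is taken to mean ASCII letters A-Z and a-z.
_CATALOG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_immuta_catalog_name(name: str) -> bool:
    """Return True if `name` satisfies the catalog naming rule described above."""
    return _CATALOG_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["immuta_catalog", "1immuta", "immuta-catalog"]:
        print(candidate, is_valid_immuta_catalog_name(candidate))
```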
Click Next.
Select your authentication method from the dropdown:
Access Token: Enter the Access Token in the Immuta System Account Credentials section. This is the access token for the Immuta service principal, which can be an on-behalf token created in Databricks. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field before each expiration so the connection continues to function. This authentication information will be included in the script populated later on the page.
OAuth M2M:
AWS Databricks:
Follow Databricks documentation to create a client secret for the Immuta service principal and assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.
Fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.cloud.databricks.com/oidc/v1/token.
Fill out the Client ID. This is a combination of letters, numbers, or symbols used as a public identifier. It is the client ID displayed in Databricks when creating the client secret for the service principal.
Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.
Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.
Azure Databricks:
Follow Databricks documentation to create a service principal within Azure and then add it to your Databricks account and workspace.
Assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.
Within Databricks, create an OAuth client secret for the service principal. This completes your Databricks-based service principal setup.
Within Immuta, fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.azuredatabricks.net/oidc/v1/token.
Fill out the Client ID. This is a combination of letters, numbers, or symbols used as a public identifier. It is the client ID displayed in Databricks when creating the client secret for the service principal. Note that Azure Databricks uses the Azure service principal's client ID, so the two values are identical.
Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.
Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.
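To see how the Token Endpoint, Client ID, Scope, and Client Secret fit together, the sketch below builds a standard OAuth 2.0 client-credentials token request of the kind an M2M client sends. It is an illustration, not a description of how Immuta implements the exchange. The function name, the placeholder credentials, and the default scope value `all-apis` are assumptions not taken from this page. The request is only constructed, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope="all-apis"):
    """Build (but do not send) an OAuth 2.0 client-credentials token request.

    The client ID and secret go in an HTTP Basic Authorization header, and the
    grant type and scope go in a form-encoded body. The default scope is an
    assumption; use the scope your deployment requires.
    """
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(token_endpoint, data=body, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


if __name__ == "__main__":
    # Placeholder values; substitute your own workspace and credentials.
    request = build_token_request(
        "https://example-workspace.cloud.databricks.com/oidc/v1/token",
        client_id="my-client-id",
        client_secret="my-client-secret",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
```

Passing the request to `urllib.request.urlopen` would perform the exchange; a successful response from an OAuth 2.0 token endpoint is a JSON document containing the access token.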
Copy the provided script and run it in Databricks as a user with the privileges listed in the requirements section.
Click Validate Connection.
If the connection is successful, click Next. If there are any errors, check the connection details and credentials to ensure they are correct and try again.
Ensure all the details are correct in the summary and click Complete Setup.