Immuta’s integration with Unity Catalog allows you to manage multiple Databricks workspaces through Unity Catalog while protecting your data with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and enforce Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouses.
A service principal with the following privileges is required for Immuta to manage permissions on objects in the Unity Catalog metastore:
CREATE CATALOG on the Unity Catalog metastore to create an Immuta-owned catalog and tables
OWNERSHIP on the Immuta catalog you configure
USE CATALOG and USE SCHEMA on all catalogs and schemas with tables managed by Immuta
SELECT and MODIFY on all tables in the metastore managed my Immuta
Native query audit:
USE CATALOG on the system catalog for native query audit
USE SCHEMA on the system.access schema for native query audit
SELECT on the following system tables for native query audit:
system.access.audit
system.access.table_lineage
system.access.column_lineage
Authentication:
Access token authentication: If using this method, generate a personal access token for the service principal that Immuta will use to manage policies in Unity Catalog. This service principal must have the privileges listed above for the metastore associated with the Databricks workspace.
OAuth machine-to-machine (M2M) authentication: If using this method, follow Databricks documentation to create a client secret for the Immuta service principal. This service principal must have the privileges listed above for the metastore associated with the Databricks workspace.
Prerequisite
Enable Databricks Unity Catalog on the Immuta app settings page:
Click the App Settings icon in the left sidebar.
Scroll to the Global Integrations Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox.
Configure the integration
You have two options for configuring your Databricks Unity Catalog integration:
Automatic setup: When performing an automatic setup, the Databricks personal access token you configure below must be attached to an account with these Databricks permissions for the metastore associated with the specified Databricks workspace. Immuta creates the catalogs, schemas, tables, and functions using the integration's configured personal access token.
Manual setup: Run the Immuta script in Databricks yourself to create the catalog. You can also modify the script to customize your storage location for tables, schemas, or catalogs. The user running the script needs to have the CREATE CATALOG permission on the workspace metastore. The Databricks personal access token you configure must be attached to an account with the Databricks permissions listed in the requirements section.
Automatic setup
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the validation tests statuses.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{"statusCode":409,"error":"Conflict","message":"Databricks Unity Catalog integration already exists on www.example-workspace.cloud.databricks.com (id = 123456789)"}
Manual setup
To manually configure the integration, complete the following steps:
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Run the script returned in the response in your Databricks environment.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Run the script returned in the response in your Databricks environment.
Response
The response returns the script for you to run in your environment.
Configure the integration in Immuta
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML. The payload you provide must match the payload sent when generating the script.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the validation tests statuses.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{"statusCode":409,"error":"Conflict","message":"Databricks Unity Catalog integration already exists on www.example-workspace.cloud.databricks.com (id = 123456789)"}
Replace the {id} request parameter with the unique identifier of the integration you want to get. Alternatively, you can get a list of all integrations and their IDs with the GET /integrationsendpoint.
Response
The response returns a Databricks Unity Catalog integration configuration. See the response schema reference for details about the response schema. An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
The response returns the configuration for all integrations. See the response schema reference for details about the response schema. An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
Replace the {id} request parameter with the unique identifier of the integration you want to update.
Change the config values to your own, where
workspaceUrl is your Databricks workspace URL.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the validation tests statuses.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{"statusCode":409,"error":"Conflict","message":"Unable to edit integration with ID 123456789 in current state editing."}
Manual update
To manually update the integration, complete the following steps:
Copy the request example, and replace the values with your own as directed to generate the script. The example provided uses JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
Replace the {id} request parameter with the unique identifier of the integration you want to update.
Change the config values to your own, where
workspaceUrl is your Databricks workspace URL.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Run the script returned in the response in your Databricks environment.
Response
The response returns the script for you to run in your Databricks environment.
Update the integration in Immuta
Copy the request example, and replace the values with your own as directed to update the integration settings. The example provided uses JSON format, but the request also accepts YAML. The payload you provide must match the payload sent when generating the script.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the validation tests statuses.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{"statusCode":409,"error":"Conflict","message":"Unable to edit integration with ID 123456789 in current state editing."}
Replace the {id} request parameter with the unique identifier of the integration you want to delete.
Response
The response returns the status of the Databricks Unity Catalog integration configuration that has been deleted. See the response schema reference for details about the response schema. An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.