Immuta’s integration with Unity Catalog allows you to manage multiple Databricks workspaces through Unity Catalog while protecting your data with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and enforce Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouses.
Several different accounts are used to set up and maintain the Databricks Unity Catalog integration. The permissions required for each are outlined below.
Immuta account (required): This user configures the integration on the app settings page in Immuta. To access the app settings page, this user needs the following permission:
APPLICATION_ADMIN Immuta permission
Databricks service principal (required): This service principal is used continuously by Immuta to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks. In the automatic setup option, Immuta also uses this service principal to create the Immuta-managed catalog. This service principal needs the following Databricks privileges:
CREATE CATALOG privilege on the Unity Catalog metastore. This is only required if you have Immuta automatically configure the integration in Databricks for you. If a separate user will run the Immuta script in Databricks to manually configure the integration, that Databricks user account needs this privilege instead.
OWNER permission on the Immuta catalog you configure.
OWNER privilege on one of the securables below so that Immuta can administer Unity Catalog row-level and column-level security controls.
on catalogs with schemas and tables registered as Immuta data sources. This permission could also be applied by granting OWNER on a catalog to a Databricks group that includes the Immuta service principal to allow for multiple owners.
on schemas with tables registered as Immuta data sources.
on all tables registered as Immuta data sources, if the OWNER permission cannot be applied at the catalog or schema level. In this case, each table registered as an Immuta data source must individually have the OWNER permission granted to the Immuta service principal.
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta service principal can SELECT and MODIFY securables within the parent catalog and schema.
SELECT and MODIFY on all tables registered as Immuta data sources so that the Immuta service principal can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
For native query audit (optional)
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
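The data source privileges listed above can be written as Unity Catalog SQL statements. The following is a minimal sketch, not the script Immuta generates: the helper names `quote` and `data_source_grants`, the principal name, and the table names are hypothetical, and it assumes standard Databricks `GRANT ... ON ... TO` and `ALTER TABLE ... OWNER TO` syntax. It uses the table-level OWNER fallback; if OWNER is granted at the catalog or schema level instead, the `ALTER TABLE` statement would be dropped.

```python
def quote(identifier: str) -> str:
    """Backtick-quote one identifier part, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def data_source_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Return SQL statements covering the privileges listed above for one table."""
    cat = quote(catalog)
    sch = f"{cat}.{quote(schema)}"
    tbl = f"{sch}.{quote(table)}"
    who = quote(principal)  # hypothetical service principal name
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who};",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who};",
        f"GRANT SELECT, MODIFY ON TABLE {tbl} TO {who};",
        # Table-level OWNER, used only when catalog- or schema-level OWNER is not possible.
        f"ALTER TABLE {tbl} OWNER TO {who};",
    ]
```

Calling `data_source_grants("sales", "emea", "orders", "immuta-sp")` returns four statements that can be reviewed before running them in a Databricks SQL editor.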
Databricks account (recommended): This user account can manually configure the integration in Databricks to create the Immuta-managed catalog. To do so, this account requires the following Databricks privileges:
CREATE CATALOG on the Unity Catalog metastore
ACCOUNT ADMIN on the Unity Catalog metastore for native query audit (optional)
Authentication
Access token authentication: If using this method, generate a personal access token for the service principal that Immuta will use to manage policies in Unity Catalog. This service principal must have the privileges listed above for the metastore associated with the Databricks workspace.
If you will configure the integration using the manual setup option, the Immuta script you will generate includes the SQL statements for granting required privileges to the service principal, so you can skip this step and continue to the manual setup section. Otherwise, manually grant the Immuta service principal access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, the service principal must have the following access at minimum:
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
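If you grant the system table access manually, the minimum access listed above maps to five statements. This is a sketch under assumptions: the function name `audit_grants` and the principal name are hypothetical, and the statements assume standard Databricks `GRANT` syntax rather than reproducing the Immuta-generated script.

```python
# System tables named in the list above.
SYSTEM_TABLES = (
    "system.access.audit",
    "system.access.table_lineage",
    "system.access.column_lineage",
)


def audit_grants(principal: str) -> list[str]:
    """Return the GRANT statements for the minimum audit access listed above."""
    who = "`" + principal.replace("`", "``") + "`"  # backtick-quoted principal
    statements = [
        f"GRANT USE CATALOG ON CATALOG system TO {who};",
        f"GRANT USE SCHEMA ON SCHEMA system.access TO {who};",
    ]
    statements += [f"GRANT SELECT ON TABLE {table} TO {who};" for table in SYSTEM_TABLES]
    return statements
```

For example, `print("\n".join(audit_grants("immuta-service-principal")))` prints the statements one per line for review.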
You have two options for configuring your Databricks Unity Catalog integration. Select the method you prefer below to navigate to configuration instructions:
Automatic setup: Immuta creates the catalogs, schemas, tables, and functions using the service principal you created.
Manual setup: Run the Immuta script in Databricks yourself to create the catalog. You can also modify the script to customize your storage location for tables, schemas, or catalogs. The user running the script must have the Databricks privileges listed above.
Automatic setup
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
additionalWorkspaceConnections.workspaceURL: The Databricks workspace URL.
additionalWorkspaceConnections.HTTPpath: The HTTP path of the compute for the workspace.
additionalWorkspaceConnections.authenticationType: Specifies the authentication type to use to access the workspace. The additional workspace credentials will be used when processing objects in bound catalogs that are not accessible via the default workspace. See the additionalWorkspaceConnections section for details about additional authentication types and required and child attributes.
additionalWorkspaceConnections.catalogs: The catalogs to use for the additional workspace connection.
If the integration tries to process an object that is in a bound catalog and none of the specified additional workspaces have access to that catalog, the operation will fail and an error will be reported.
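As an illustration of the access token parameters described above, the following sketch assembles a request body and checks the catalog naming rule before anything is sent. It is not the official request example: the helper names are hypothetical, the outer `type` and `autoBootstrap` keys are assumptions that should be checked against the config object description, and "letters" is interpreted here as ASCII letters.

```python
import json
import re

# Rule from the text: letters, numbers, and underscores only; no leading number.
# Assumption: "letters" means ASCII letters.
CATALOG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_config(workspace_url: str, http_path: str, token: str, catalog: str) -> dict:
    """Build the config object for access token authentication."""
    if not CATALOG_NAME.fullmatch(catalog):
        raise ValueError(f"invalid catalog name: {catalog!r}")
    return {
        "workspaceUrl": workspace_url,
        "httpPath": http_path,
        "token": token,
        "catalog": catalog,
    }


def build_request_body(config: dict, automatic: bool = True) -> str:
    """Wrap the config in an outer body; 'type' and 'autoBootstrap' are assumed keys."""
    body = {"type": "Databricks", "autoBootstrap": automatic, "config": config}
    return json.dumps(body, indent=2)
```

Validating the catalog name locally surfaces a naming mistake such as `1catalog` before the request is submitted.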
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the statuses of the validation tests.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{
  "statusCode": 409,
  "error": "Conflict",
  "message": "Databricks Unity Catalog integration already exists on www.example-workspace.cloud.databricks.com (id = 123456789)"
}
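A client can turn an error body like the one above into an exception. This is a minimal sketch: the class and function names are hypothetical, and because the shape of a successful response is not shown here, any body without a `statusCode` of 400 or higher is treated as a success.

```python
import json


class IntegrationError(Exception):
    """Carries the statusCode, error, and message fields of an error body."""

    def __init__(self, status_code: int, error: str, message: str):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.error = error
        self.message = message


def raise_for_error(body: str) -> None:
    """Raise IntegrationError if the JSON body reports an error status code."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        return  # not an error object; assumed to be a successful response
    status = payload.get("statusCode")
    if isinstance(status, int) and status >= 400:
        raise IntegrationError(status, payload.get("error", ""), payload.get("message", ""))
```

With the 409 body shown above, `raise_for_error` raises an `IntegrationError` whose `status_code` is 409 and whose `error` is `Conflict`.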
Manual setup
To manually configure the integration, complete the following steps:
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Run the script returned in the response in your Databricks environment.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
Run the script returned in the response in your Databricks environment.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
additionalWorkspaceConnections.workspaceURL: The Databricks workspace URL.
additionalWorkspaceConnections.HTTPpath: The HTTP path of the compute for the workspace.
additionalWorkspaceConnections.authenticationType: Specifies the authentication type to use to access the workspace. The additional workspace credentials will be used when processing objects in bound catalogs that are not accessible via the default workspace. See the additionalWorkspaceConnections section for details about additional authentication types and required and child attributes.
additionalWorkspaceConnections.catalogs: The catalogs to use for the additional workspace connection.
Run the script returned in the response in your Databricks environment.
If the integration tries to process an object that is in a bound catalog and none of the specified additional workspaces have access to that catalog, the operation will fail and an error will be reported.
Response
The response returns the script for you to run in your environment.
Configure the integration in Immuta
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML. The payload you provide must match the payload sent when generating the script.
See the config object description for parameter definitions, value types, and additional configuration options.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters. The additional workspace credentials will be used when processing objects in bound catalogs that are not accessible via the default workspace.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
additionalWorkspaceConnections.workspaceURL: The Databricks workspace URL.
additionalWorkspaceConnections.HTTPpath: The HTTP path of the compute for the workspace.
additionalWorkspaceConnections.authenticationType: Specifies the authentication type to use to access the workspace. The additional workspace credentials will be used when processing objects in bound catalogs that are not accessible via the default workspace. See the additionalWorkspaceConnections section for details about additional authentication types and required and child attributes.
additionalWorkspaceConnections.catalogs: The catalogs to use for the additional workspace connection.
If the integration tries to process an object that is in a bound catalog and none of the specified additional workspaces have access to that catalog, the operation will fail and an error will be reported.
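Because the payload sent in this step must match the one used to generate the script, it can help to compare the two before submitting. The helper below is a hypothetical sketch that lists the dotted paths whose values differ; how strictly Immuta compares the payloads is not specified here.

```python
_MISSING = object()  # marks a key that is absent from one of the payloads


def config_differences(first: dict, second: dict, prefix: str = "") -> list[str]:
    """Return the dotted paths whose values differ between two payloads."""
    differences = []
    for key in sorted(set(first) | set(second)):
        path = f"{prefix}{key}"
        a = first.get(key, _MISSING)
        b = second.get(key, _MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse so nested objects such as "config" report the exact field.
            differences.extend(config_differences(a, b, prefix=f"{path}."))
        elif a != b:
            differences.append(path)
    return differences
```

An empty list means the two payloads are identical; otherwise each entry, such as `config.catalog`, names a field to reconcile.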
Response
The response returns the status of the Databricks Unity Catalog integration configuration connection. See the response schema reference for details about the response schema.
A successful response includes the statuses of the validation tests.
An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
{
  "statusCode": 409,
  "error": "Conflict",
  "message": "Databricks Unity Catalog integration already exists on www.example-workspace.cloud.databricks.com (id = 123456789)"
}
Replace the {id} request parameter with the unique identifier of the integration you want to get. Alternatively, you can get a list of all integrations and their IDs with the GET /integrations endpoint.
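The two lookups can be sketched as request objects built with the Python standard library. This is an illustration under assumptions: the function name and base URL are hypothetical, the single-integration path `/integrations/{id}` is inferred from the `{id}` parameter, and the `Authorization: Bearer` header scheme is assumed rather than taken from this page. The request is only constructed, not sent.

```python
import urllib.request
from typing import Optional


def get_integration_request(
    base_url: str, api_key: str, integration_id: Optional[int] = None
) -> urllib.request.Request:
    """Build a GET request for one integration, or for all when no ID is given."""
    path = "/integrations" if integration_id is None else f"/integrations/{integration_id}"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header scheme
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call against a real Immuta tenant.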
Response
The response returns a Databricks Unity Catalog integration configuration. See the response schema reference for details about the response schema. An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
The response returns the configuration for all integrations. See the response schema reference for details about the response schema. An unsuccessful request returns the status code and an error message. See the HTTP status codes and error messages for a list of statuses, error messages, and troubleshooting guidance.
Copy the request example, and replace the values with your own as directed to configure the integration settings. The examples provided use JSON format, but the request also accepts YAML.
See the config object description for parameter definitions, value types, and additional configuration options.
Replace the {id} request parameter with the unique identifier of the integration you want to update.
Change the config values to your own, where:
workspaceUrl is your Databricks workspace URL.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
token is the Databricks personal access token. This is the access token for the Immuta service principal.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
This example adds additional workspace connections to an existing configuration.
Replace the {id} request parameter with the unique identifier of the integration you want to update.
Change the config values to your own, where:
workspaceUrl is your Databricks workspace URL.
httpPath is the HTTP path of your Databricks cluster or SQL warehouse.
oAuthClientConfig specifies your client ID, client secret, and authority URL. See the object description for details about child parameters.
catalog is the name of the Databricks catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
additionalWorkspaceConnections.workspaceURL: The Databricks workspace URL.
additionalWorkspaceConnections.HTTPpath: The HTTP path of the compute for the workspace.
additionalWorkspaceConnections.authenticationType: Specifies the authentication type to use to access the workspace. The additional workspace credentials will be used when processing objects in bound catalogs that are not accessible via the default workspace. Note: The credentials themselves can be omitted from the payload if they are not being updated.
additionalWorkspaceConnections.catalogs: The catalogs to use for the additional workspace connection.
If the integration tries to process an object that is in a bound catalog and none of the specified additional workspaces have access to that catalog, the operation will fail and an error will be reported.
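The failure condition described above can be checked before submitting an update. The sketch below models only that condition, not Immuta's actual access checks: the function name is hypothetical, and the `workspaceURL` key spelling is copied from the text above and should be verified against the config object description. Credentials are left out of the example entries, which the note above permits when they are not being updated.

```python
def workspace_for_catalog(catalog: str, connections: list[dict]) -> dict:
    """Return the first additional workspace connection that lists the catalog."""
    for connection in connections:
        if catalog in connection.get("catalogs", []):
            return connection
    # Mirrors the described behavior: no covering workspace means the operation fails.
    raise LookupError(f"no additional workspace connection covers catalog {catalog!r}")


# Hypothetical connections; credentials omitted because they are not being updated.
connections = [
    {"workspaceURL": "www.example-workspace-a.cloud.databricks.com", "catalogs": ["finance"]},
    {"workspaceURL": "www.example-workspace-b.cloud.databricks.com", "catalogs": ["hr"]},
]
```

Here `workspace_for_catalog("hr", connections)` returns the second entry, while a catalog that no entry lists raises `LookupError`.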