OAuth Token Passthrough for Databricks Unity Catalog
Limited availability
This feature is in preview and only available to select accounts.
You can authenticate with OAuth token passthrough when setting up your Databricks Unity Catalog integration or registering Databricks data sources. Immuta's OAuth authentication method uses the Client Credentials Flow. When a user configures the integration or connects a data source, Immuta uses the token credentials to craft an authenticated access token to connect with Databricks.
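As a rough illustration of the Client Credentials Flow mentioned above, the sketch below builds the form fields a client might POST to a token endpoint when it authenticates with a certificate-signed assertion. It is not Immuta's implementation. The function name, the placeholder values, and the handling of the optional `resource` field are assumptions; the field names `grant_type`, `client_id`, `client_assertion_type`, and `client_assertion` come from the OAuth 2.0 specifications (RFC 6749 and RFC 7523).

```python
from urllib.parse import urlencode

# Assertion type defined by RFC 7523 for JWT-based client authentication.
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_form(client_id, signed_assertion, resource=None):
    """Return the form fields for a client credentials token request.

    `signed_assertion` is assumed to be a JWT that was already signed with
    the private key of the client certificate (signing is out of scope here).
    """
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_assertion_type": JWT_BEARER,
        "client_assertion": signed_assertion,
    }
    if resource:
        # Optional: URI of the resource where the token will be used.
        form["resource"] = resource
    return form


# Hypothetical values, for illustration only.
body = urlencode(build_token_form("example-client-id", "header.payload.signature"))
print(body)
```

The encoded body would be sent to the Token Endpoint with the `application/x-www-form-urlencoded` content type; the endpoint's response carries the access token.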
Requirements
Immuta SaaS tenant or Immuta v2023.4 or newer
Unity Catalog metastore created and attached to a Databricks workspace. Immuta supports configuring a single metastore for each configured integration, and that metastore may be attached to multiple Databricks workspaces.
Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.
Personal access token generated for the user that Immuta will use to manage policies in Unity Catalog.
No Databricks Spark integrations with Unity Catalog support are configured in your Immuta tenant. Immuta does not support that integration and the Databricks Unity Catalog integration concurrently. See the Unity Catalog overview for supported cluster configurations.
Best practices
Ensure your integration with Unity Catalog goes smoothly by following these guidelines:
Use a Databricks SQL warehouse to configure the integration. Databricks SQL warehouses are faster to start than traditional clusters, require less management, and can run all the SQL that Immuta requires for policy administration. A serverless warehouse provides nearly instant startup time and is the preferred option for connecting to Immuta.
Move all data into Unity Catalog before configuring Immuta with Unity Catalog. The default catalog used once Unity Catalog support is enabled in Immuta is the hive_metastore, which is not supported by the Unity Catalog native integration. Data sources in the Hive Metastore must be managed by the Databricks Spark integration. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured.
Permissions
APPLICATION_ADMIN Immuta permission for the user configuring the integration in Immuta.
CREATE_DATA_SOURCE Immuta permission for the user registering Databricks data sources in Immuta.
Databricks privileges:
An account with the CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables. For automatic setups, this privilege must be granted to the Immuta system account user. For manual setups, the user running the Immuta script must have this privilege.
An Immuta service principal requires the following Databricks privileges:
OWNERSHIP on the Immuta catalog you configure.
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta service principal can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
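The Databricks privileges listed above could be granted with SQL along the lines of the statements this sketch generates. It is an illustration only: the helper names and the example principal, catalog, schema, and table names are hypothetical, and the generated statements should be checked against the Databricks documentation for your workspace before they are run.

```python
def quote(identifier):
    """Backtick-quote one identifier part, escaping embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def grants_for_table(principal, catalog, schema, table):
    """Statements for one table that will be registered as an Immuta data source."""
    who = quote(principal)
    cat = quote(catalog)
    sch = f"{cat}.{quote(schema)}"
    tbl = f"{sch}.{quote(table)}"
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who}",
        f"GRANT SELECT, MODIFY ON TABLE {tbl} TO {who}",
    ]


def grants_for_immuta_catalog(principal, immuta_catalog):
    """Statements covering catalog creation and ownership of the Immuta catalog."""
    who = quote(principal)
    return [
        f"GRANT CREATE CATALOG ON METASTORE TO {who}",
        # Ownership is transferred with ALTER rather than GRANT.
        f"ALTER CATALOG {quote(immuta_catalog)} OWNER TO {who}",
    ]


# Hypothetical names, for illustration only.
for statement in grants_for_table("immuta-sp", "sales", "public", "orders"):
    print(statement + ";")
for statement in grants_for_immuta_catalog("immuta-sp", "immuta"):
    print(statement + ";")
```

Generating the statements per table keeps the grants scoped to the parent catalog and schema of each registered table, which matches the wording of the requirements above.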
Migrate data to Unity Catalog
Disable existing Databricks SQL and Databricks Spark with Unity Catalog Support integrations.
Ensure that all Databricks clusters that have Immuta installed are stopped and the Immuta configuration is removed from the cluster. Immuta-specific cluster configuration is no longer needed with the Databricks Unity Catalog integration.
Move all data into Unity Catalog before configuring Immuta with Unity Catalog. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured. If you don't move all data before configuring the integration, metastore magic will protect your existing data sources throughout the migration process.
Enable OAuth token passthrough
To enable OAuth token passthrough, reach out to your Immuta representative.
Configure the Databricks Unity Catalog integration using OAuth token passthrough
You have two options for configuring your Databricks Unity Catalog integration:
Automatic setup: Immuta creates the catalogs, schemas, tables, and functions using the integration's configured personal access token.
Manual setup: Run the Immuta script in Databricks yourself to create the catalog. You can also modify the script to customize your storage location for tables, schemas, or catalogs.
Automatic setup
Required permissions
When performing an automatic setup, the Databricks personal access token you configure below must be attached to an account with the following permissions for the metastore associated with the specified Databricks workspace:
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta service principal can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
OWNERSHIP on the Immuta catalog created below.
CREATE CATALOG on the workspace metastore.
Click the App Settings icon in the left sidebar.
Scroll to the Global Integrations Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox. The additional settings in this section are only relevant to the Databricks Spark with Unity Catalog integration and will not have any effect on the Unity Catalog integration. These can be left with their default values.
Scroll to the Integration Settings section, and click + Add Native Integration.
Select Databricks Unity Catalog from the dropdown menu.
Complete the following fields:
Server Hostname is the hostname of your Databricks workspace.
HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.
Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.
Opt to fill out the Exemption Group field with the name of a group in Databricks that will be excluded from having data policies applied. This field must not be changed from the default value. Before configuring the integration in Immuta, create this account-level group for privileged users and service accounts that require an unmasked view of data.
Opt to scope the query audit ingestion by entering Unity Catalog Workspace IDs. Enter a comma-separated list of the workspace IDs that you want Immuta to ingest audit records for. If left empty, Immuta will audit all tables and users in Unity Catalog.
Unity Catalog query audit is enabled by default; you can disable it by clicking the Enable Native Query Audit checkbox. Ensure you have enabled system tables in Unity Catalog and provided the required access to the Immuta system account.
Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.
Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.
Continue with your integration configuration.
Select OAuth Token Passthrough as the authentication method from the dropdown menu. To enable this authentication method, reach out to your Immuta representative:
Fill out the Token Endpoint. This is where the generated token is sent.
Fill out the Client ID. This is the subject of the generated token.
Check the Use Certificate checkbox.
Opt to fill out the Resource field with a URI of the resource where the requested token will be used.
Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called sub (subject).
Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.
Click Test Databricks Unity Catalog Connection.
Save and Confirm your changes.
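Several of the fields in the steps above have stated constraints: the Immuta Catalog name, the comma-separated workspace IDs, and the audit sync interval. The sketch below shows one way to check values before entering them in the form. The function names are hypothetical, and treating "letters" as ASCII letters only is an assumption; Immuta may validate these fields differently.

```python
import re

# Letters, numbers, and underscores; the first character cannot be a number.
# Restricting "letters" to ASCII is an assumption made for this sketch.
CATALOG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_catalog_name(name):
    """Check the Immuta Catalog naming rule described in the steps above."""
    return CATALOG_NAME.fullmatch(name) is not None


def parse_workspace_ids(raw):
    """Split a comma-separated list of workspace IDs, dropping blanks.

    An empty result corresponds to leaving the field empty, in which case
    audit ingestion is not scoped to specific workspaces.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


def is_valid_audit_interval(hours):
    """The audit sync schedule is an integer number of hours from 1 to 24."""
    return isinstance(hours, int) and not isinstance(hours, bool) and 1 <= hours <= 24


print(is_valid_catalog_name("immuta_catalog"))
print(parse_workspace_ids("1234567890, 9876543210"))
print(is_valid_audit_interval(6))
```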
Manual setup
Required permissions
When performing a manual setup, the following Databricks permissions are required:
The user running the script must have the CREATE CATALOG permission on the workspace metastore.
The Databricks personal access token you configure below must be attached to an account with the following permissions:
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta service principal can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
OWNERSHIP on the Immuta catalog created below.
Click the App Settings icon in the left sidebar.
Scroll to the Global Integrations Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox. The additional settings in this section are only relevant to the Databricks Spark with Unity Catalog integration and will not have any effect on the Unity Catalog integration. These can be left with their default values.
Scroll to the Integration Settings section, and click + Add Native Integration.
Select Databricks Unity Catalog from the dropdown menu.
Complete the following fields:
Server Hostname is the hostname of your Databricks workspace.
HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.
Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.
Opt to fill out the Exemption Group field with the name of a group in Databricks that will be excluded from having data policies applied. This field must not be changed from the default value. Before configuring the integration in Immuta, create this account-level group for privileged users and service accounts that require an unmasked view of data.
Opt to scope the query audit ingestion by entering Unity Catalog Workspace IDs. Enter a comma-separated list of the workspace IDs that you want Immuta to ingest audit records for. If left empty, Immuta will audit all tables and users in Unity Catalog.
Unity Catalog query audit is enabled by default; you can disable it by clicking the Enable Native Query Audit checkbox. Ensure you have enabled system tables in Unity Catalog and provided the required access to the Immuta system account.
Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.
Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.
Continue with your integration configuration.
Select OAuth Token Passthrough as the authentication method from the dropdown menu. To enable this authentication method, reach out to your Immuta representative:
Fill out the Token Endpoint. This is where the generated token is sent.
Fill out the Client ID. This is the subject of the generated token.
Check the Use Certificate checkbox.
Opt to fill out the Resource field with a URI of the resource where the requested token will be used.
Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called sub (subject).
Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.
Select the Manual toggle and copy or download the script. You can modify the script to customize your storage location for tables, schemas, or catalogs.
Run the script in Databricks.
Click Test Databricks Unity Catalog Connection.
Save and Confirm your changes.
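The x509 Certificate Thumbprint requested in the steps above is derived from the certificate itself. The sketch below computes the SHA-1 thumbprint of a PEM certificate in two common encodings: uppercase hexadecimal, and the base64url form that RFC 7515 defines for the `x5t` header. The function name is hypothetical, and which encoding the Immuta form expects is not stated here, so treat the choice between them as something to confirm.

```python
import base64
import hashlib
import ssl


def certificate_thumbprints(pem_text):
    """Return the SHA-1 thumbprint of a PEM certificate as (hex, x5t).

    The thumbprint is taken over the DER encoding of the certificate,
    so the PEM armor is stripped and the body base64-decoded first.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_text)
    digest = hashlib.sha1(der).digest()
    hex_form = digest.hex().upper()
    # x5t is the base64url encoding of the digest without '=' padding.
    x5t = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return hex_form, x5t
```

For example, reading the same PEM file that is uploaded in the form (the path `client-cert.pem` is hypothetical) and passing its text to `certificate_thumbprints` yields both encodings for comparison with the value your identity provider displays.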
Enable native query audit for Unity Catalog
To enable native query audit for Unity Catalog, complete the following steps before configuring the integration:
Grant your Immuta service principal access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
Use the Databricks Personal Access Token in the configuration above for the account you just granted system table access. This account will be the Immuta system account user.
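The minimum system table access described above could be granted with statements like the ones this sketch generates. The function name and the example principal are hypothetical, and the statements are an illustration to verify against the Databricks documentation rather than a script provided by Immuta.

```python
# System tables named in the audit requirements above.
AUDIT_TABLES = ("audit", "table_lineage", "column_lineage")


def audit_grants(principal):
    """Return GRANT statements for the minimum Unity Catalog audit access."""
    who = "`" + principal.replace("`", "``") + "`"
    statements = [
        f"GRANT USE CATALOG ON CATALOG system TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA system.access TO {who}",
    ]
    statements += [
        f"GRANT SELECT ON TABLE system.access.{table} TO {who}"
        for table in AUDIT_TABLES
    ]
    return statements


# Hypothetical principal name, for illustration only.
for statement in audit_grants("immuta-sp"):
    print(statement + ";")
```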
Register data
Follow the steps in the enter connection section of the create a data source how-to guide to begin registering your data in Immuta.
Select OAuth Token Passthrough as the authentication method from the dropdown menu. To enable this authentication method, reach out to your Immuta representative:
Fill out the Token Endpoint. This is where the generated token is sent.
Fill out the Client ID. This is the subject of the generated token.
Check the Use Certificate checkbox.
Opt to fill out the Resource field with a URI of the resource where the requested token will be used.
Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called sub (subject).
Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.
If you are using a proxy server with Databricks, specify it in the Additional Connection String Options:
Click Test Connection.
Follow the rest of the instructions in the create a data source how-to guide to finish registering your data.
Considerations
Immuta pushes down joins to be processed on the native database when possible. To ensure this happens, make sure the connection information matches between data sources, including host, port, ssl, username, and password. You will see performance degradation on joins against the same database if this information doesn't match.
If a client certificate is required to connect to the source database, you can add it in the Upload Certificates section at the bottom of the form.
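To check the join pushdown consideration above, you could compare the connection details of two data sources before relying on joins between them. The sketch below is a minimal illustration: the dictionary layout and the function name are assumptions, not an Immuta API, and the compared fields are the ones named in the consideration.

```python
# Fields that must match for joins to be pushed down, per the note above.
MATCH_KEYS = ("host", "port", "ssl", "username", "password")


def mismatched_fields(source_a, source_b):
    """Return the connection fields whose values differ between two sources."""
    return [key for key in MATCH_KEYS if source_a.get(key) != source_b.get(key)]


# Hypothetical connection details, for illustration only.
orders = {"host": "dbc.example.invalid", "port": 443, "ssl": True}
customers = {"host": "dbc.example.invalid", "port": 8443, "ssl": True}
print(mismatched_fields(orders, customers))
```

An empty result means the compared fields agree; any field listed is a candidate cause of degraded join performance.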