When you enable Unity Catalog, Immuta automatically migrates your existing Databricks data sources to reference the legacy hive_metastore catalog to account for Unity Catalog's three-level hierarchy. New data sources will reference the Unity Catalog metastore you create and attach to your Databricks workspace.
Because the hive_metastore catalog is not managed by Unity Catalog, existing data sources in the hive_metastore cannot have Unity Catalog access controls applied to them. Data sources in the Hive metastore must be managed by the Databricks Spark integration. To allow Immuta to administer Unity Catalog access controls on that data, move the data to Unity Catalog and re-register those tables in Immuta by completing the steps below. If you don't move all data before configuring the integration, metastore magic will protect your existing data sources throughout the migration process.
Disable all existing Databricks Spark with Unity Catalog support integrations and Databricks SQL integrations. Note: Immuta supports running the standard Databricks Spark integration concurrently with the Unity Catalog integration, so Databricks Spark integrations without Unity Catalog support do not have to be disabled before migrating to Unity Catalog.
Ensure that all Databricks clusters that have Immuta installed are stopped and the Immuta configuration is removed from the cluster. Immuta-specific cluster configuration is no longer needed with the Databricks Unity Catalog integration.
Move all data into Unity Catalog before configuring Immuta with Unity Catalog. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured.
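If your existing tables are Delta tables, one way to move them is Databricks' DEEP CLONE; the catalog, schema, and table names below are hypothetical. A minimal sketch, assuming a target catalog and schema already exist in Unity Catalog:

```sql
-- Copy a legacy Hive metastore table into Unity Catalog (Delta tables only).
-- 'main.sales' is a hypothetical target catalog and schema.
CREATE TABLE main.sales.orders
DEEP CLONE hive_metastore.sales.orders;
```

After the copy, re-register main.sales.orders as an Immuta data source and retire the old two-level table once consumers have migrated.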
Databricks Unity Catalog allows you to manage and access data in your Databricks account across all of your workspaces. With Immuta’s Databricks Unity Catalog integration, you can write your policies in Immuta and have them enforced automatically by Databricks across data in your Unity Catalog metastore.
APPLICATION_ADMIN Immuta permission for the user configuring the integration in Immuta.
Databricks privileges:
An account with the CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables. For automatic setups, this privilege must be granted to the Immuta system account user. For manual setups, the user running the Immuta script must have this privilege.
An Immuta system account user requires the following Databricks privileges:
OWNER permission on the Immuta catalog you configure.
OWNER permission on catalogs with schemas and tables registered as Immuta data sources so that Immuta can administer Unity Catalog row-level and column-level security controls. This permission can be applied by granting OWNER on a catalog to a Databricks group that includes the Immuta system account user to allow for multiple owners. If the OWNER permission cannot be applied at the catalog or schema level, each table registered as an Immuta data source must individually have the OWNER permission granted to the Immuta system account user.
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta system account user can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
USE CATALOG on the system catalog for native query audit.
USE SCHEMA on the system.access schema for native query audit.
SELECT on the following system tables for native query audit:
system.access.audit
system.access.table_lineage
system.access.column_lineage
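Expressed as Databricks SQL, the privileges above might be granted as follows; the catalog main, schema sales, table orders, system account immuta_system, and group immuta_owners are all hypothetical names, not values Immuta requires:

```sql
-- Let the system account see and traverse registered catalogs and schemas.
GRANT USE CATALOG ON CATALOG main TO `immuta_system`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `immuta_system`;

-- Let it read tables and administer grants and row/column controls.
GRANT SELECT, MODIFY ON TABLE main.sales.orders TO `immuta_system`;

-- Granting OWNER to a group that contains the system account
-- allows multiple owners, as described above.
ALTER CATALOG main OWNER TO `immuta_owners`;
```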
Before you configure the Databricks Unity Catalog integration, ensure that you have fulfilled the following requirements:
Unity Catalog metastore created and attached to a Databricks workspace. Immuta supports configuring a single metastore for each configured integration, and that metastore may be attached to multiple Databricks workspaces.
Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta instance rather than a cluster for both performance and availability reasons.
Personal access token generated for the user that Immuta will use to manage policies in Unity Catalog.
No Databricks SQL integrations are configured in your Immuta instance. The Databricks Unity Catalog integration replaces the Databricks SQL integration entirely and cannot coexist with it. If there are configured Databricks SQL integrations, remove them and add a Databricks Unity Catalog integration in their place. Databricks data sources will also need to be migrated if they are defined in the hive_metastore catalog.
No Databricks Spark integrations with Unity Catalog support are configured in your Immuta instance. Immuta does not support running that integration concurrently with the Databricks Unity Catalog integration. See the Unity Catalog overview for supported cluster configurations.
Unity Catalog system tables enabled for native query audit.
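A quick way to confirm the system tables are enabled and that the Immuta system account can read them is to run, as that account, a query against each table; these will fail with permission errors if any of the grants described on this page are missing:

```sql
-- Each query errors out if USE CATALOG on system, USE SCHEMA on
-- system.access, or SELECT on the table itself is missing.
SELECT count(*) FROM system.access.audit;
SELECT count(*) FROM system.access.table_lineage;
SELECT count(*) FROM system.access.column_lineage;
```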
Best practices
Ensure your integration with Unity Catalog goes smoothly by following these guidelines:
Use a Databricks SQL warehouse to configure the integration. Databricks SQL warehouses are faster to start than traditional clusters, require less management, and can run all the SQL that Immuta requires for policy administration. A serverless warehouse provides nearly instant startup time and is the preferred option for connecting to Immuta.
Move all data into Unity Catalog before configuring Immuta with Unity Catalog. The default catalog used once Unity Catalog support is enabled in Immuta is the hive_metastore, which is not supported by the Unity Catalog native integration. Data sources in the Hive metastore must be managed by the Databricks Spark integration. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured.
Existing data source migration
If you have existing Databricks data sources, complete these migration steps before proceeding:
Disable existing Databricks SQL and Databricks Spark with Unity Catalog Support integrations.
Ensure that all Databricks clusters that have Immuta installed are stopped and the Immuta configuration is removed from the cluster. Immuta-specific cluster configuration is no longer needed with the Databricks Unity Catalog integration.
Move all data into Unity Catalog before configuring Immuta with Unity Catalog. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured. If you don't move all data before configuring the integration, metastore magic will protect your existing data sources throughout the migration process.
You have two options for configuring your Databricks Unity Catalog integration:
Automatic setup: Immuta creates the catalogs, schemas, tables, and functions using the integration's configured personal access token.
Manual setup: Run the Immuta script in Databricks yourself to create the catalog. You can also modify the script to customize your storage location for tables, schemas, or catalogs.
Required permissions
When performing an automatic setup, the Databricks personal access token you configure below must be attached to an account with the following permissions for the metastore associated with the specified Databricks workspace:
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta system account user can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
OWNER permission on the Immuta catalog created below.
OWNER permission on catalogs with schemas and tables registered as Immuta data sources so that Immuta can administer Unity Catalog row-level and column-level security controls. This permission can be applied by granting OWNER on a catalog to a Databricks group that includes the Immuta system account user to allow for multiple owners. If the OWNER permission cannot be applied at the catalog or schema level, each table registered as an Immuta data source must individually have the OWNER permission granted to the Immuta system account user.
CREATE CATALOG on the workspace metastore.
USE CATALOG on the system catalog for native query audit.
USE SCHEMA on the system.access schema for native query audit.
SELECT on the following system tables for native query audit:
system.access.audit
system.access.table_lineage
system.access.column_lineage
Click the App Settings icon in the left sidebar.
Scroll to the Global Integration Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox. The additional settings in this section are only relevant to the Databricks Spark with Unity Catalog integration and will not have any effect on the Unity Catalog integration. These can be left with their default values.
Click the Integrations tab.
Click + Add Native Integration and select Databricks Unity Catalog from the dropdown menu.
Complete the following fields:
Server Hostname is the hostname of your Databricks workspace.
HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.
Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will be readable only by the Immuta service principal, and access to it should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.
Opt to fill out the Exemption Group field with the name of a group in Databricks that will be excluded from having data policies applied. Create this account-level group for privileged users and service accounts that require an unmasked view of data before configuring the integration in Immuta.
Unity Catalog query audit is enabled by default; you can disable it by unchecking the Enable Native Query Audit checkbox. Ensure you have enabled system tables in Unity Catalog and provided the required access to the Immuta system account.
Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.
Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.
Continue with your integration configuration.
Enter a Databricks Personal Access Token. This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.
Click Test Databricks Unity Catalog Connection.
Save and Confirm your changes.
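As a sanity check after saving, you can confirm from Databricks that the Immuta-owned catalog was created; a sketch assuming you entered immuta in the Immuta Catalog field (the two schema names come from the integration description later on this page):

```sql
-- The catalog Immuta created should contain its two internal schemas.
SHOW SCHEMAS IN immuta;  -- expect immuta_system and immuta_policies
```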
Required permissions
When performing a manual setup, the following Databricks permissions are required:
The user running the script must have the CREATE CATALOG permission on the workspace metastore.
The Databricks personal access token you configure below must be attached to an account with the following permissions:
USE CATALOG and USE SCHEMA on parent catalogs and schemas of tables registered as Immuta data sources so that the Immuta system account user can interact with those tables.
SELECT and MODIFY on all tables registered as Immuta data sources so that the system account user can grant and revoke access to tables and apply Unity Catalog row- and column-level security controls.
OWNER permission on the Immuta catalog created below.
OWNER permission on catalogs with schemas and tables registered as Immuta data sources so that Immuta can administer Unity Catalog row-level and column-level security controls. This permission can be applied by granting OWNER on a catalog to a Databricks group that includes the Immuta system account user to allow for multiple owners. If the OWNER permission cannot be applied at the catalog or schema level, each table registered as an Immuta data source must individually have the OWNER permission granted to the Immuta system account user.
USE CATALOG on the system catalog for native query audit.
USE SCHEMA on the system.access schema for native query audit.
SELECT on the following system tables for native query audit:
system.access.audit
system.access.table_lineage
system.access.column_lineage
Click the App Settings icon in the left sidebar.
Scroll to the Global Integration Settings section and check the Enable Databricks Unity Catalog support in Immuta checkbox. The additional settings in this section are only relevant to the Databricks Spark with Unity Catalog integration and will not have any effect on the Unity Catalog integration. These can be left with their default values.
Click the Integrations tab.
Click + Add Native Integration and select Databricks Unity Catalog from the dropdown menu.
Complete the following fields:
Server Hostname is the hostname of your Databricks workspace.
HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.
Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will be readable only by the Immuta service principal, and access to it should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.
If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.
Opt to fill out the Exemption Group field with the name of a group in Databricks that will be excluded from having data policies applied. Create this account-level group for privileged users and service accounts that require an unmasked view of data before configuring the integration in Immuta.
Unity Catalog query audit is enabled by default; you can disable it by unchecking the Enable Native Query Audit checkbox. Ensure you have enabled system tables in Unity Catalog and provided the required access to the Immuta system account.
Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.
Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.
Continue with your integration configuration.
Enter a Databricks Personal Access Token. This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.
Select the Manual toggle and copy or download the script. You can modify the script to customize your storage location for tables, schemas, or catalogs.
Run the script in Databricks.
Click Test Databricks Unity Catalog Connection.
Save and Confirm your changes.
To enable native query audit for Unity Catalog, complete the following steps before configuring the integration:
Grant your Immuta system account user access to the Databricks Unity Catalog system tables. For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
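In Databricks SQL, those minimum audit grants might look like this, where immuta_system is a placeholder for your Immuta system account user:

```sql
GRANT USE CATALOG ON CATALOG system TO `immuta_system`;
GRANT USE SCHEMA ON SCHEMA system.access TO `immuta_system`;
GRANT SELECT ON TABLE system.access.audit TO `immuta_system`;
GRANT SELECT ON TABLE system.access.table_lineage TO `immuta_system`;
GRANT SELECT ON TABLE system.access.column_lineage TO `immuta_system`;
```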
Use the Databricks Personal Access Token in the configuration above for the account you just granted system table access. This account will be the Immuta system account user.
Register Unity Catalog securables as Immuta data sources.
External data connectors and query-federated tables are preview features in Databricks. See the Databricks documentation for details about the support and limitations of these features before registering them as data sources in the Unity Catalog integration.
Map Databricks usernames to Immuta to ensure Immuta properly enforces policies and audits user queries.
Build global policies in Immuta to enforce table-, column-, and row-level security.
Databricks Unity Catalog allows you to manage and access data in your Databricks account across all of your workspaces and introduces fine-grained access controls in Databricks.
Immuta’s integration with Unity Catalog allows you to manage multiple Databricks workspaces through Unity Catalog while protecting your data with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and enforce Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouses:
Subscription policies: Immuta subscription policies automatically grant and revoke access to Databricks tables.
Data policies: Immuta data policies enforce row- and column-level security without creating views, so users can query tables as they always have without their workflows being disrupted.
Unity Catalog uses the following hierarchy of data objects:
Metastore: Created at the account level and is attached to one or more Databricks workspaces. The metastore contains metadata of all the catalogs, schemas, and tables available to query. All clusters on that workspace use the configured metastore and all workspaces that are configured to use a single metastore share those objects.
Catalog: A catalog sits on top of schemas (also called databases) and tables to manage permissions across a set of schemas.
Schema: Organizes tables and views.
Table: Tables can be managed or external.
For details about the Unity Catalog object model, see the Databricks documentation.
The Databricks Unity Catalog integration supports:
applying column masking and row-redaction policies on tables
applying subscription polices on tables and views
enforcing Unity Catalog access controls, even if Immuta becomes disconnected
Delta and Parquet files
allowing non-Immuta reads and writes
using Photon
using a proxy server
Immuta uses this Immuta system account user to run queries that set up all the tables, user-defined functions (UDFs), and other data necessary for policy enforcement. Upon enabling the native integration, Immuta will create a catalog named after your provided workspaceName that contains two schemas:
immuta_system: Contains internal Immuta data.
immuta_policies: Contains policy UDFs.
When policies require changes to be pushed to Unity Catalog, Immuta updates the internal tables in the immuta_system schema with the updated policy information. If necessary, new UDFs are pushed to replace any out-of-date policies in the immuta_policies schema, and any row filters or column masks are updated to point at the new policies. Many of these operations require compute on the configured Databricks cluster or SQL endpoint, so compute must be available for these policies to succeed.
Immuta’s Unity Catalog integration applies Databricks table-, row-, and column-level security controls that are enforced natively within Databricks. Immuta's management of these Databricks security controls is automated and ensures that they synchronize with Immuta policy or user entitlement changes.
Row-level security: Immuta applies SQL UDFs to restrict access to rows for querying users.
Column-level security: Immuta applies column-mask SQL UDFs to tables for querying users. These column-mask UDFs run for any column that requires masking.
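Immuta generates and manages these UDFs itself; the hand-written sketch below only illustrates the underlying Unity Catalog primitives, with hypothetical function, table, and group names rather than anything Immuta actually creates:

```sql
-- Row filter UDF: a row is returned only when the predicate is true.
CREATE OR REPLACE FUNCTION main.sales.us_rows_only(region STRING)
RETURN region = 'US';

ALTER TABLE main.sales.orders
SET ROW FILTER main.sales.us_rows_only ON (region);

-- Column mask UDF: rewrites the column value at query time.
CREATE OR REPLACE FUNCTION main.sales.mask_ssn(ssn STRING)
RETURN CASE WHEN is_account_group_member('privileged') THEN ssn
            ELSE '***' END;

ALTER TABLE main.sales.orders
ALTER COLUMN ssn SET MASK main.sales.mask_ssn;
```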
The Unity Catalog integration supports the following policy types:
Conditional masking
Constant
Custom masking
Hashing
Null
Rounding (date and numeric rounding)
Matching (only show rows where)
Custom WHERE
Never
Where user
Where value in column
Minimization
Time-based restrictions
Some users may need to be exempt from masking and row-level policy enforcement. When you add user accounts to the configured exemption group in Databricks, Immuta will not enforce policies for those users. Exemption groups are created when the Unity Catalog integration is configured, and no policies will apply to these users' queries, despite any policies enforced on the tables they query.
The principal used to register data sources in Immuta will be automatically added to this exemption group for that Databricks table. Consequently, users added to this list and used to register data sources in Immuta should be limited to service accounts.
hive_metastore
When enabling Unity Catalog support in Immuta, the catalog for all Databricks data sources will be updated to point at the default hive_metastore catalog. Internally, Databricks exposes this catalog as a proxy to the workspace-level Hive metastore that schemas and tables were kept in before Unity Catalog. Since this catalog is not a real Unity Catalog catalog, it does not support any Unity Catalog policies. Therefore, Immuta will ignore any data sources in the hive_metastore in any Databricks Unity Catalog integration, and policies will not be applied to tables there.
Access requirements
For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.
USE CATALOG on the system catalog
USE SCHEMA on the system.access schema
SELECT on the following system tables:
system.access.audit
system.access.table_lineage
system.access.column_lineage
The table below outlines the integrations supported for various Databricks cluster configurations. For example, the only integration available to enforce policies on a cluster configured to run on Databricks Runtime 9.1 is the Databricks Spark integration.
Row access policies with more than 1023 columns are unsupported. This is an underlying limitation of UDFs in Databricks. Immuta will only create row access policies with the minimum number of referenced columns. This limit will therefore apply to the number of columns referenced in the policy and not the total number in the table.
If you disable table grants, Immuta revokes the grants. Therefore, if users had access to a table before enabling Immuta, they’ll lose access.
You must use the global regex flag (g) when creating a regex masking policy in this integration, and you cannot use the case insensitive regex flag (i). See the examples below for guidance:
regex with a global flag (supported): /^ssn|social ?security$/g
regex without a global flag (unsupported): /^ssn|social ?security$/
regex with a case insensitive flag (unsupported): /^ssn|social ?security$/gi
regex without a case insensitive flag (supported): /^ssn|social ?security$/g
If a registered data source is owned by a Databricks group at the table level, then the Unity Catalog integration cannot apply data masking policies to that table in Unity Catalog.
Therefore, set all table-level ownership on your Unity Catalog data sources to an individual user or service principal instead of a Databricks group. Catalogs and schemas can still be owned by a Databricks group, as ownership at that level doesn't interfere with the integration.
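Reassigning ownership is a one-line statement per table; the table and principal below are hypothetical:

```sql
-- Move table ownership off a Databricks group so masking
-- policies can be applied to the table.
ALTER TABLE main.sales.orders OWNER TO `svc-immuta-owner@example.com`;
```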
The following features are currently unsupported:
Databricks change data feed support
Immuta projects
Multiple IAMs on a single cluster
Column masking policies on views
Mixing masking policies on the same column
Row-redaction policies on views
R and Scala cluster support
Scratch paths
User impersonation
Policy enforcement on raw Spark reads
Python UDFs for advanced masking functions
Direct file-to-SQL reads
Data policies on ARRAY, MAP, or STRUCT type columns
Snippets for Databricks data sources may be empty in the Immuta UI.
Unity Catalog supports managing permissions at the Databricks account level through controls applied directly to objects in the metastore. To interact with the metastore and apply controls to any table, Immuta requires a personal access token (PAT) for an Immuta system account user with permissions to manage all data protected by Immuta. See the permissions section above for a list of specific Databricks privileges.
Table-level security: Immuta manages GRANT and REVOKE privileges on securable objects in Databricks through subscription policies. When you create a subscription policy in Immuta, Immuta uses the Unity Catalog API to issue GRANT or REVOKE statements against the catalog, schema, or table in Databricks for every user affected by that subscription policy.
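The effect of those API calls is equivalent to statements like the following, with a hypothetical table and user:

```sql
-- User meets the subscription policy: Immuta grants access.
GRANT SELECT ON TABLE main.sales.orders TO `analyst@example.com`;

-- User falls out of the policy: Immuta revokes it.
REVOKE SELECT ON TABLE main.sales.orders FROM `analyst@example.com`;
```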
Regex: You must use the global regex flag (g) when creating a regex masking policy in this integration, and you cannot use the case insensitive regex flag (i). See the examples above.
However, with metastore magic you can use the hive_metastore and enforce subscription and data policies with the Databricks Spark integration.
The Databricks Unity Catalog integration supports the access token method to configure the integration and create data sources in Immuta. This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed in the permissions section above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.
The Unity Catalog data object model introduces a 3-tiered namespace, as outlined above. Consequently, your Databricks tables registered as data sources in Immuta will reference the catalog, schema (also called a database), and table.
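In practice, queries against registered tables therefore use three-part names; for example, with hypothetical names:

```sql
-- catalog.schema.table rather than the legacy two-level schema.table form.
SELECT * FROM main.sales.orders LIMIT 10;
```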
External data connectors and query-federated tables are preview features in Databricks. See the Databricks documentation for details about the support and limitations of these features before registering them as data sources in the Unity Catalog integration.
The Databricks Unity Catalog integration audits user queries run in clusters or SQL warehouses in deployments configured with the integration. The audit ingest frequency is set when configuring the integration, and the audit logs can be scoped to only ingest specific workspaces if needed.
See the audit documentation for details about manually prompting ingest of audit logs and the contents of the logs.
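To inspect the raw events available for ingest, you can query the audit system table directly; a sketch using standard system.access.audit columns:

```sql
-- Recent Unity Catalog audit events, newest first.
SELECT event_time, user_identity.email, action_name
FROM system.access.audit
WHERE service_name = 'unityCatalog'
ORDER BY event_time DESC
LIMIT 50;
```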
See the access requirements section above for a list of requirements.
Legend: ✓ = the feature or integration is enabled; ✗ = the feature or integration is disabled.

Example cluster | Databricks Runtime | Unity Catalog in Databricks | Databricks Spark integration | Databricks Spark with Unity Catalog support | Databricks Unity Catalog integration
---|---|---|---|---|---
Cluster 1 | 9.1 | ✗ | ✓ | Unavailable | Unavailable
Cluster 2 | 10.4 | ✗ | ✓ | Unavailable | Unavailable
Cluster 3 | 11.3 | ✗ | ✓ | Unavailable | ✗
Cluster 4 | 11.3 | ✓ | ✗ | ✓ | ✗
Cluster 5 | 11.3 | ✓ | ✗ | ✗ | ✓

Note: Unity Catalog row- and column-level security controls are unsupported for single-user clusters. See the Databricks documentation for details about this limitation.