
Databricks Unity Catalog Integration Reference Guide


Immuta’s integration with Unity Catalog allows you to enforce fine-grained access controls on Unity Catalog securable objects with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and orchestrate Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouses:

  • Subscription policies: Immuta subscription policies automatically grant and revoke access to specific Databricks securable objects.

  • Data policies: Immuta data policies enforce row- and column-level security.

Unity Catalog object model

Unity Catalog uses the following hierarchy of data objects:

  • Metastore: Created at the account level and is attached to one or more Databricks workspaces. The metastore contains metadata of all the catalogs, schemas, and tables available to query. All clusters on that workspace use the configured metastore and all workspaces that are configured to use a single metastore share those objects.

  • Catalog: Sits on top of schemas (also called databases) and tables to manage permissions across a set of schemas.

  • Schema: Organizes tables and views.

  • Table-etc: Tables (managed or external), views, volumes, models, and functions.

For details about the Unity Catalog object model, see the Databricks Unity Catalog documentation.

Feature support

The Databricks Unity Catalog integration supports

  • enforcing Unity Catalog row-, column-, and table-level access controls on Databricks clusters and SQL warehouses:

    • applying column masks and row filters on specific securable objects

    • applying subscription policies on tables and views

  • enforcing Unity Catalog access controls, even if Immuta becomes disconnected

  • allowing non-Immuta reads and writes

  • using Photon

  • using a proxy server

What does Immuta do in my Databricks environment?

Unity Catalog supports managing permissions account-wide in Databricks through controls applied directly to objects in the metastore. To establish a connection with Databricks and apply controls to securable objects within the metastore, Immuta requires a service principal with privileges to manage all data protected by Immuta. Databricks OAuth for service principals (OAuth M2M) or a personal access token (PAT) can be provided for Immuta to authenticate as the service principal. See the Databricks Unity Catalog privileges section for a list of specific Databricks privileges.

Immuta uses this service principal to run queries that set up user-defined functions (UDFs) and other data necessary for policy enforcement. Upon enabling the integration, Immuta will create a catalog that contains these schemas:

  • immuta_system: Contains internal Immuta data.

  • immuta_policies_n: Contains policy UDFs.

When policies require changes to be pushed to Unity Catalog, Immuta updates the internal tables in the immuta_system schema with the updated policy information. If necessary, new UDFs are pushed to replace any out-of-date policies in the immuta_policies_n schemas and any row filters or column masks are updated to point at the new policies. Many of these operations require compute on the configured Databricks cluster or SQL warehouse, so compute must be available for these policies to succeed.
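To make this mechanism concrete, the sketch below shows the kind of Unity Catalog objects this orchestration results in, expressed as the Databricks SQL that a policy push is roughly equivalent to. All names (the Immuta catalog, the immuta_policies_0 schema, the UDFs, and the main.sales.customers table) are hypothetical placeholders; Immuta generates and runs its own statements, so treat this only as an illustration of the pattern.

```python
# Illustrative sketch only: Immuta creates and maintains its own UDFs and bindings.
# Assumes a Databricks notebook or cluster context where `spark` is available.

# A masking UDF pushed into a hypothetical immuta_policies_n schema.
spark.sql("""
    CREATE OR REPLACE FUNCTION my_immuta_catalog.immuta_policies_0.mask_email(email STRING)
    RETURNS STRING
    RETURN CASE WHEN is_account_group_member('analysts') THEN email ELSE '[REDACTED]' END
""")

# The column mask on the registered table is pointed at the current policy UDF.
spark.sql("""
    ALTER TABLE main.sales.customers
    ALTER COLUMN email
    SET MASK my_immuta_catalog.immuta_policies_0.mask_email
""")

# Row filters are attached the same way when a row-level policy applies
# (filter_region is another hypothetical policy UDF returning BOOLEAN).
spark.sql("""
    ALTER TABLE main.sales.customers
    SET ROW FILTER my_immuta_catalog.immuta_policies_0.filter_region ON (region)
""")
```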

Workspace-catalog binding

Workspace-catalog binding allows users to leverage Databricks’ catalog isolation mode to limit catalog access to specific Databricks workspaces. The default isolation mode is OPEN, meaning all workspaces can access the catalog (with the exception of the automatically-created workspace catalog), provided they are in the metastore attached to the catalog. Setting this mode to ISOLATED allows the catalog owner to specify a workspace-catalog binding, which means the owner can dictate which workspaces are authorized to access the catalog. This prevents other workspaces from accessing the specified catalogs. To bind a catalog to a specific workspace in Databricks Unity Catalog, see the Databricks documentation.

Use cases

Typical use cases for binding a catalog to specific workspaces include

  1. Ensuring users can only access production data from a production workspace environment.

    For example, you may have production data in a prod_catalog, as well as a production workspace you are introducing to your organization. Binding the prod_catalog to the prod_workspace ensures that workspace admins and users can only access prod_catalog from the prod_workspace environment.

  2. Ensuring users can only process sensitive data from a specific workspace. Limiting the environments from which users can access sensitive data helps better secure your organization’s data. Limiting access to one workspace also simplifies any monitoring, auditing, and understanding of which users are accessing specific data. This would entail a similar setup as the example above.

  3. Giving users read-only access to production data from a developer workspace.

    This enables your organization to effectively conduct development and testing, while minimizing risk to production data. All user access to this catalog from this workspace can be specified as read-only, ensuring developers can access the data they need for testing without risk of any unwanted updates.
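As a rough sketch of how the binding itself is established on the Databricks side, the example below marks a catalog as ISOLATED and binds it to a single workspace. The endpoint paths and payload fields reflect the Databricks catalogs and workspace-bindings REST APIs as commonly documented, but treat them as assumptions and confirm the current paths and fields in the Databricks documentation; the host, token, catalog name, and workspace ID are placeholders.

```python
import requests

# Placeholders: supply your own workspace URL, token, catalog, and workspace ID.
HOST = "https://my-workspace.cloud.databricks.com"
TOKEN = "dapi-REDACTED"  # a token with privileges to manage the catalog
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# 1. Switch the catalog from the default OPEN isolation mode to ISOLATED.
#    (Endpoint and field names are assumptions based on the Unity Catalog catalogs API.)
requests.patch(
    f"{HOST}/api/2.1/unity-catalog/catalogs/prod_catalog",
    headers=HEADERS,
    json={"isolation_mode": "ISOLATED"},
).raise_for_status()

# 2. Bind the catalog to the production workspace only.
#    (Endpoint and field names are assumptions based on the workspace-bindings API;
#    a read-only binding type may also be available for the dev-workspace use case above.)
requests.patch(
    f"{HOST}/api/2.1/unity-catalog/bindings/catalog/prod_catalog",
    headers=HEADERS,
    json={"add": [{"workspace_id": 1234567890, "binding_type": "BINDING_TYPE_READ_WRITE"}]},
).raise_for_status()
```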

Additional workspace connections

Immuta’s Databricks Unity Catalog integration allows users to configure additional workspace connections to support using Databricks' workspace-catalog binding feature. Users can configure additional workspace connections in their Immuta integrations to be consistent with the workspace-catalog bindings that are set up in Databricks. Immuta will use each additional workspace connection to govern the catalog(s) that workspace is bound to in Databricks. If desired, each set of bound catalogs can also be configured to run on its own compute.

To use this feature, you should first set up a workspace-catalog binding in your Databricks account. Once that is configured, you can use Immuta's Integrations API to configure an additional workspace connection. This can be added when you initially set up the integration or by updating your existing integration configuration.

Additional workspace connections in Databricks Unity Catalog are not currently supported in Immuta's connections.

Limitations

  • Each additional workspace connection must be in the same metastore as the primary workspace used to set up the integration.

  • No two additional workspace connections can be responsible for the same catalog.

Databricks Unity Catalog privileges

The privileges the Databricks Unity Catalog integration requires align with the least privilege security principle. The list below describes each privilege required in Databricks Unity Catalog for the setup user and the Immuta service principal.

  • Account admin (setup user): This privilege allows the setup user to grant the Immuta service principal the necessary permissions to orchestrate Unity Catalog access controls and maintain state between Immuta and Databricks Unity Catalog.

  • CREATE CATALOG on the Unity Catalog metastore (setup user): This privilege allows the setup user to create an Immuta-owned catalog and tables.

  • Metastore admin (setup user): This privilege is required only if enabling query audit, which requires granting access to system tables to the Immuta service principal. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT permissions on the system schemas to the service principal. See Manage privileges in Unity Catalog for more details.

  • USE CATALOG and MANAGE on all catalogs containing securables registered as Immuta data sources, and USE SCHEMA on all schemas containing securables registered as Immuta data sources (Immuta service principal): These privileges allow the service principal to apply row filters and column masks on the securable.

  • MODIFY and SELECT on all securables registered as Immuta data sources (Immuta service principal): These privileges allow the service principal to apply row filters and column masks on the securable. Additionally, they are required for sensitive data discovery to run on the securable.

  • OWNER on the Immuta catalog (Immuta service principal): The Immuta service principal must own the catalog Immuta creates during setup that stores the Immuta policy information. The Immuta setup script grants ownership of this catalog to the Immuta service principal when you configure the integration.

  • USE CATALOG on the system catalog, USE SCHEMA on the system.access schema, and SELECT on the system.access.audit, system.access.table_lineage, and system.access.column_lineage tables (Immuta service principal): These privileges allow Immuta to audit user queries in Databricks Unity Catalog.
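As an illustration of what these grants look like in Databricks SQL, the sketch below shows a setup user granting the service-principal privileges from the list above. The catalog, schema, table, and service principal application ID are placeholders; in practice the Immuta setup script issues the grants it needs, so this is only a reference for what is being granted.

```python
# Illustrative only. Assumes a Databricks notebook or cluster where `spark` is available.
immuta_sp = "`11111111-2222-3333-4444-555555555555`"  # placeholder service principal application ID

grants = [
    f"GRANT USE CATALOG, MANAGE ON CATALOG main TO {immuta_sp}",
    f"GRANT USE SCHEMA ON SCHEMA main.sales TO {immuta_sp}",
    f"GRANT SELECT, MODIFY ON TABLE main.sales.customers TO {immuta_sp}",
    # Required only if query audit is enabled:
    f"GRANT USE CATALOG ON CATALOG system TO {immuta_sp}",
    f"GRANT USE SCHEMA ON SCHEMA system.access TO {immuta_sp}",
    f"GRANT SELECT ON TABLE system.access.audit TO {immuta_sp}",
]

for statement in grants:
    spark.sql(statement)
```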

Policy enforcement

Immuta’s Unity Catalog integration applies Databricks table-, row-, and column-level security controls that are enforced natively within Databricks. Immuta's management of these Databricks security controls is automated and ensures that they synchronize with Immuta policy or user entitlement changes.

  • Table-level security: Immuta manages GRANT and REVOKE privileges on securable objects in Databricks through subscription policies. When you create a subscription policy in Immuta, Immuta uses the Unity Catalog API to issue GRANTS or REVOKES against the catalog, schema, or table in Databricks for every user affected by that subscription policy.

  • Row-level security: Immuta applies SQL UDFs to restrict access to rows for querying users.

  • Column-level security: Immuta applies column-mask SQL UDFs to tables for querying users. These column-mask UDFs run for any column that requires masking.

The Unity Catalog integration supports the following policy types:

  • Select masking policies:

    • Conditional masking

    • Constant

    • Custom masking

    • Hashing

    • Null (including on ARRAY, MAP, and STRUCT type columns)

    • Regex: You must use the global regex flag (g) when creating a regex masking policy in this integration, and you cannot use the case insensitive regex flag (i). See the Unity Catalog caveats section below for examples.

    • Rounding (date and numeric rounding)

  • Row-level policies:

    • Matching (only show rows where)

      • Custom WHERE

      • Never

      • Where user

      • Where value in column

    • Minimization

    • Time-based restrictions

Project-scoped purpose exceptions for Databricks Unity Catalog

Project-scoped purpose exceptions for Databricks Unity Catalog integrations allow you to apply purpose-based policies to Databricks data sources in a project. As a result, users can only access that data when they are working within that specific project.

Databricks Unity Catalog views

If you are using views in Databricks Unity Catalog, one of the following must be true for project-scoped purpose exceptions to apply to the views in Databricks:

  • The view and underlying table are registered as Immuta data sources and added to a project: If a view and its underlying table are both added as Immuta data sources, both of these assets must be added to the project for the project-scoped purpose exception to apply. If a view and underlying table are both added as data sources but the table is not added to an Immuta project, the purpose exception will not apply to the view because Databricks does not support fine-grained access controls on views.

  • Only the underlying table is registered as an Immuta data source and added to a project: If only the underlying table is registered as an Immuta data source but the view is not registered, the purpose exception will apply to both the table and corresponding view in Databricks. Views are the only Databricks object that will have Immuta policies applied to them even if they're not registered as Immuta data sources (as long as their underlying tables are registered).

Masked joins for Databricks Unity Catalog

This feature allows masked columns to be joined across data sources that belong to the same project. When data sources do not belong to a project, Immuta uses a unique salt per data source for hashing to prevent masked values from being joined. (See the Why use masked joins? guide for an explanation of that behavior.) However, once you add Databricks Unity Catalog data sources to a project and enable masked joins, Immuta uses a consistent salt across all the data sources in that project to allow the join.

For more information about masked joins and enabling them for your project, see the Masked joins section of documentation.

Policy exemption groups

Some users may need to be exempt from masking and row-level policy enforcement. When you add user accounts to the configured exemption group in Databricks, Immuta will not enforce policies for those users. Exemption groups are created when the Unity Catalog integration is configured, and no policies will apply to these users' queries, despite any policies enforced on the tables they query.

The principal used to register data sources in Immuta will be automatically added to this exemption group for those Databricks tables. Consequently, the accounts added to this group and used to register data sources in Immuta should be limited to service accounts.

Policy support with hive_metastore

When enabling Unity Catalog support in Immuta, the catalog for all Databricks data sources will be updated to point at the default hive_metastore catalog. Internally, Databricks exposes this catalog as a proxy to the workspace-level Hive metastore that schemas and tables were kept in before Unity Catalog. Since this catalog is not a real Unity Catalog catalog, it does not support any Unity Catalog policies. Therefore, Immuta will ignore any data sources in the hive_metastore in any Databricks Unity Catalog integration, and policies will not be applied to tables there.

However, with Databricks metastore magic you can use hive_metastore and enforce subscription and data policies with the Databricks Spark integration.

Authentication methods

The Databricks Unity Catalog integration supports the following authentication methods to configure the integration and create data sources:

  • Personal access token (PAT): This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed in the Databricks Unity Catalog privileges section for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.

  • OAuth machine-to-machine (M2M): Immuta uses the Client Credentials Flow to integrate with Databricks OAuth machine-to-machine authentication, which allows Immuta to authenticate with Databricks using a client secret. Once Databricks verifies the Immuta service principal’s identity using the client secret, Immuta is granted a temporary OAuth token to perform token-based authentication in subsequent requests. When that token expires (after one hour), Immuta requests a new temporary token. See the Databricks OAuth machine-to-machine (M2M) authentication page for more details.
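To make the M2M flow concrete, the sketch below requests a short-lived OAuth token from a Databricks workspace using a service principal's client ID and secret, then uses it as a bearer token. The /oidc/v1/token endpoint and the all-apis scope follow Databricks' documented OAuth M2M flow, but verify them against the current Databricks documentation; the host and credentials are placeholders, and this only illustrates the exchange Immuta performs on your behalf.

```python
import requests

HOST = "https://my-workspace.cloud.databricks.com"   # placeholder workspace URL
CLIENT_ID = "11111111-2222-3333-4444-555555555555"    # service principal application ID
CLIENT_SECRET = "dose-REDACTED"                       # service principal OAuth secret

# Exchange the client credentials for a temporary access token (client credentials flow).
resp = requests.post(
    f"{HOST}/oidc/v1/token",
    auth=(CLIENT_ID, CLIENT_SECRET),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
access_token = resp.json()["access_token"]

# The token is then used as a bearer token until it expires (about an hour),
# at which point a new one is requested the same way.
me = requests.get(
    f"{HOST}/api/2.0/preview/scim/v2/Me",
    headers={"Authorization": f"Bearer {access_token}"},
)
print(me.status_code)
```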

Integration health status

The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

The definitions for each status and the state of configured data platform integrations are available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

Immuta data sources in Unity Catalog

The Unity Catalog data object model introduces a 3-tiered namespace, as outlined above. Consequently, your Databricks tables registered as data sources in Immuta will reference the catalog, schema (also called a database), and table.

Supported object types

The supported object types for Databricks Unity Catalog are listed below. When applying read and write access policies to these data sources, the privileges granted by Immuta vary depending on the object type. See an outline of privileges granted by Immuta on the Subscription policy access types page.

  • Table

  • View

  • Materialized view

  • Streaming table

  • External table

  • Foreign table

External data connectors and query-federated tables

External data connectors and query-federated tables are preview features in Databricks. See the Databricks documentation for details about the support and limitations of these features before registering them as data sources in the Unity Catalog integration.

Query audit

The Databricks Unity Catalog integration audits all user queries run in the integration's clusters or SQL warehouses. See the Databricks Unity Catalog audit page for details about the contents of the logs.

The audit ingest is set when configuring the integration and can be scoped to only ingest specific workspaces if needed. The default ingest frequency is every hour, but this can be configured to a different frequency on the Immuta app settings page. Additionally, audit ingestion can be manually requested at any time from the Immuta audit page. When manually requested, it will only search for new queries that were created since the last query that had been audited. The job is run in the background, so the new queries will not be immediately available.

Access requirements

For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.

  • USE CATALOG on the system catalog

  • USE SCHEMA on the system.access schema

  • SELECT on the following system tables:

    • system.access.audit

    • system.access.table_lineage

    • system.access.column_lineage
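For context, audit records land in the system tables listed above, which can also be queried directly in Databricks. The sketch below is a minimal example; the column names (event_time, user_identity, service_name, action_name, event_date) reflect the system.access.audit schema as commonly documented and should be verified against your workspace before use.

```python
# Assumes a Databricks notebook or cluster context where `spark` is available,
# and that the querying principal has the system-table privileges listed above.
recent_events = spark.sql("""
    SELECT event_time,
           user_identity.email AS user_email,
           service_name,
           action_name
    FROM system.access.audit
    WHERE event_date >= date_sub(current_date(), 7)
    ORDER BY event_time DESC
    LIMIT 100
""")
recent_events.show(truncate=False)
```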

Tag ingestion

Design partner preview: This feature is available to select accounts. Reach out to your Immuta representative to enable this feature.

You can enable tag ingestion to allow Immuta to ingest Databricks Unity Catalog table and column tags so that you can use them in Immuta policies to enforce access controls. When you enable this feature, Immuta uses the credentials and connection information from the Databricks Unity Catalog integration to pull tags from Databricks and apply them to data sources as they are registered in Immuta. If Databricks data sources were registered before Databricks Unity Catalog tag ingestion was enabled, those data sources will automatically sync to the catalog and the tags will be applied. Immuta checks for changes to tags in Databricks and syncs Immuta data sources to those changes every 24 hours.

Once external tags are applied to Databricks data sources, those tags can be used to create subscription and data policies.

To enable Databricks Unity Catalog tag ingestion, see the Configure a Databricks Unity Catalog integration page.

Syncing tag changes

When syncing data sources to Databricks Unity Catalog tags, Immuta pulls the following information:

  • Table tags: These tags apply to the table and appear on the data source details tab. Databricks tags' key and value pairs are reflected in Immuta as a hierarchy with each level separated by a . delimiter. For example, the Databricks Unity Catalog tag Location: US would be represented as Location.US in Immuta.

  • Column tags: These tags are applied to data source columns and appear on the columns listed in the data dictionary tab. Databricks tags' key and value pairs are reflected in Immuta as a hierarchy with each level separated by a . delimiter. For example, the Databricks Unity Catalog tag Location: US would be represented as Location.US in Immuta.

  • Table comments field: This content appears as the data source description on the data source details tab.

  • Column comments field: This content appears as dictionary column descriptions on the data dictionary tab.

After making changes to tags in Databricks, you can manually sync the catalog so that the changes immediately apply to the data sources in Immuta. Otherwise, tag changes will automatically sync within 24 hours.
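To illustrate the mapping described above, the sketch below sets a key-value tag on a table and on a column in Databricks; once ingested, each would surface in Immuta as a hierarchical tag such as Location.US. The catalog, schema, table, and tag values are placeholders; SET TAGS is standard Databricks Unity Catalog SQL.

```python
# Assumes a Databricks notebook or cluster context where `spark` is available.

# Table tag: ingested by Immuta as the hierarchical tag Location.US
spark.sql("ALTER TABLE main.sales.customers SET TAGS ('Location' = 'US')")

# Column tag: ingested by Immuta as Sensitivity.High on the email column
spark.sql("""
    ALTER TABLE main.sales.customers
    ALTER COLUMN email
    SET TAGS ('Sensitivity' = 'High')
""")
```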

Limitations

  • Only tags that apply to Databricks data sources in Immuta are available to build policies in Immuta. Immuta will not pull tags in from Databricks Unity Catalog unless those tags apply to registered data sources.

  • Cost implications: Tag ingestion in Databricks Unity Catalog requires compute resources. Therefore, having many Databricks data sources or frequently manually syncing data sources to Databricks Unity Catalog may incur additional costs.

  • Databricks Unity Catalog tag ingestion only supports tenants with fewer than 2,500 data sources registered.

Configuration requirements

See the Enable Unity Catalog guide for a list of requirements.

Unity Catalog caveats

  • Row access policies with more than 1023 columns are unsupported. This is an underlying limitation of UDFs in Databricks. Immuta will only create row access policies with the minimum number of referenced columns. This limit will therefore apply to the number of columns referenced in the policy and not the total number in the table.

  • If you disable table grants, Immuta revokes the grants. Therefore, if users had access to a table before enabling Immuta, they’ll lose access.

  • You must use the global regex flag (g) when creating a regex masking policy in this integration, and you cannot use the case insensitive regex flag (i). See the examples below for guidance:

    • regex with a global flag (supported): /^ssn|social ?security$/g

    • regex without a global flag (unsupported): /^ssn|social ?security$/

    • regex with a case insensitive flag (unsupported): /^ssn|social ?security$/gi

    • regex without a case insensitive flag (supported): /^ssn|social ?security$/g

Azure Databricks Unity Catalog limitation

If a registered data source is owned by a Databricks group at the table level, then the Unity Catalog integration cannot apply data masking policies to that table in Unity Catalog.

Therefore, set all table-level ownership on your Unity Catalog data sources to an individual user or service principal instead of a Databricks group. Catalogs and schemas can still be owned by a Databricks group, as ownership at that level doesn't interfere with the integration.
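If you need to move table ownership off a group, the ownership change itself is a single Databricks SQL statement, sketched below with placeholder names; SET OWNER TO is standard Unity Catalog syntax.

```python
# Assumes a Databricks notebook or cluster context where `spark` is available.
# Placeholder table and principal: reassign ownership from a group to a
# service principal (or an individual user) so masking policies can be applied.
spark.sql("""
    ALTER TABLE main.sales.customers
    SET OWNER TO `11111111-2222-3333-4444-555555555555`
""")
```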

Feature limitations

The following features are currently unsupported:

  • Databricks change data feed support

  • Immuta project workspaces

  • Multiple IAMs on a single cluster

  • Column masking policies on views

  • Mixing masking policies on the same column

  • Row-redaction policies on views

  • R and Scala cluster support

  • Scratch paths

  • User impersonation

  • Policy enforcement on raw Spark reads

  • Python UDFs for advanced masking functions

  • Direct file-to-SQL reads

  • Data policies (except for masking with NULL) on ARRAY, MAP, or STRUCT type columns

  • Shallow clones

Known issue

Snippets for Databricks data sources may be empty in the Immuta UI.

Next


Configure the Databricks Unity Catalog integration.
