

Guides

The following guides offer practical guidance for handling common challenges and configurations.

  • Ingress configuration: Configure Ingress to complete your installation and access your Immuta application.

  • TLS configuration: Configure TLS termination for an Ingress resource.

  • Cosign verification: Verify artifacts hosted on the ocir.immuta.com OCI registry.

  • Production best practices: Follow these best practices when deploying Immuta in your production environment.

  • Rotating credentials: Update the credentials referenced in the Immuta Enterprise Helm chart.

  • External cache configuration: Configure an external key-value cache (such as Redis or Memcached) with the Immuta Enterprise Helm chart.

  • Enabling legacy query engine: Enable this legacy service for your deployment if you are using any of the legacy data platforms.

  • Private container registries: Configure pulling images from a private registry.

  • Air-gapped environments: Tips when installing Immuta without internet access.

Immuta Documentation - 2025.1

One platform to optimize how you access and control data.

Immuta gives everyone fast, governed access to data with the built-in controls, collaboration workflows, automated provisioning, and continuous monitoring you need to keep risk low and compliance high.

Configure Immuta

Explore Immuta

Upgrade

Introduced in 2024.2, the Immuta Enterprise Helm chart (IEHC) is an entirely new Helm chart used to deploy Immuta. Unlike the previous Immuta Helm chart (IHC), the IEHC shares the same version as the Immuta product: each version of the chart supports a single version of Immuta. Upgrading the Immuta version now entails upgrading the underlying Helm chart; failing to do so leads to an unsupported configuration.
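For example, moving to a newer Immuta release is a single helm upgrade that pins the matching chart version, using the same command shown throughout this documentation:

    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9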

Chart name | Common name | Immuta versions | Registry | Description
immuta | Immuta Helm chart (IHC) | <2024.2 | ocir.immuta.com | Version independent of the Immuta product
immuta-enterprise | Immuta Enterprise Helm chart (IEHC) | ≥2024.2 | ocir.immuta.com | Version shared with the Immuta product

Helm chart deprecation notice

As of Immuta version 2024.2, the IHC has been deprecated in favor of the IEHC. The immuta-values.yaml Helm values files are not cross-compatible.

Upgrade guides

Upgrading from Immuta 2024.1.x or older

If you're upgrading from 2024.1.x or older, you must first migrate to the new Helm chart, as these versions were all installed using the legacy IHC. This migration process includes upgrading Immuta.

  • Migrating to the new Helm chart: Migrate an existing deployment installed with the legacy IHC to the IEHC; the migration includes upgrading Immuta.

  • Upgrading (IEHC): Upgrade from v2024.2 LTS using the Immuta Enterprise Helm chart.

Azure Synapse Analytics

In this integration, Immuta generates policy-enforced views in a schema in your configured Azure Synapse Analytics Dedicated SQL pool for tables registered as Immuta data sources.

Getting started

This guide outlines how to integrate Azure Synapse Analytics with Immuta.

How-to guide

  • Azure Synapse Analytics configuration: Configure the integration in Immuta.

Reference guides

  • Azure Synapse Analytics integration reference guide: This guide describes the design and components of the integration.

  • Azure Synapse Analytics pre-configuration details: This guide describes the prerequisites, supported features, and limitations of the integration.

Deploy Immuta

This section illustrates how to install Immuta on Kubernetes using the Immuta Enterprise Helm chart.

  • Requirements: This reference guide provides an overview of the Immuta Enterprise Helm chart version requirements and infrastructure recommendations.

  • Install: The guides in this section illustrate how to install and deploy Immuta in your Kubernetes environment.

  • Upgrade: The guides in this section illustrate how to upgrade Immuta.

  • Guides: The guides in this section illustrate how to configure your Immuta Enterprise Helm chart for various scenarios, including optimizing your deployment for production environments.

  • Disaster recovery: This guide provides links to additional resources for disaster recovery strategies.

  • Troubleshooting: This page provides troubleshooting guidance and outlines frequently asked questions for the Immuta installation.

  • Conventions: This page introduces the core concepts and terminology essential for understanding the installation material.

Private Container Registries

This guide demonstrates how to configure a private container registry with the Immuta Enterprise Helm chart (IEHC).

Image availability

This guide assumes that you have already copied all Immuta container images to your private registry. The process of copying images to a private registry can vary significantly depending on your specific environment and tools and is therefore outside the scope of this document.
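The copy itself can be as simple as one command per image with a tool such as skopeo; this is only a sketch, and the exact image references should come from the DIGESTS.md file described later in this documentation:

    # Sketch only: copy one Immuta image from ocir.immuta.com into your private registry
    skopeo copy docker://<immuta-image-reference> docker://<private-registry-fqdn>/<prefix>/immuta-service:<tag>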

Install

The guides in this section illustrate how to install and deploy Immuta in your Kubernetes environment.

Prerequisites

Helm installation

The following guides use the helm command to manage Kubernetes resources; ensure it's installed before proceeding. Refer to the Helm documentation for further assistance.

Cosign Verification

This guide demonstrates how to verify signed artifacts (i.e., container images, Helm charts) hosted on ocir.immuta.com using Cosign from Sigstore.

Cosign installation

This guide utilizes the cosign command to verify artifacts; ensure it's installed before proceeding. Refer to the Cosign documentation for further assistance.

Setting Up OpenSearch User Permissions for Username and Password Authentication

If you're using AWS OpenSearch in your Immuta installation, use this how-to to set up the proper permissions needed for username and password authentication.

  1. In the AWS console, create an OpenSearch domain and create a master user. This user will set up the permissions for the audit user.

  2. In the OpenSearch console, create a new user. This user will be the audit user. You will enter the username and password for this user when installing Immuta.

  3. Create a role for the audit user.




1 - Deploy Immuta

Install Immuta and optimize the deployment in your Kubernetes environment.

2 - Connect your data platform

Before you can create policies or share data products, Immuta must be connected to your data platform to manage controls and grant access.

3 - Add metadata and users

For automated policies, tag data and use those tags in policy. Tag your data by connecting Immuta to data catalogs you already use or run identification.

4 - Organize data into domains

Just like your organization compartmentalizes ownership of data to different teams and within different platforms, your data in Immuta should be organized into domains.

Governance

Unify data access control across multiple data platforms.

Configuration

Connect your data, metadata, and users.

Developer guides

Interact with Immuta through the Immuta CLI and API.


OpenSearch Authentication

Reference Guides

Helm values

Image repository overrides

Each image.repository field defined in the default Helm values must be overridden. For the purposes of this guide, only the configuration for Secure is shown.

  1. Examine the default Helm values in the chart; this will include all relevant values required to override the registry and images.

    helm show values oci://ocir.immuta.com/stable/immuta-enterprise --version 2025.1.9
  2. Edit the immuta-values.yaml to include the following Helm values. Update all placeholder values with your own values.

Download public key

The provided key is used to sign the Helm chart and container images.

Identify container images

A DIGESTS.md markdown file comes bundled in the Helm chart and contains a comprehensive list of images and digests referenced. To view the file, follow these steps:

  1. Download and extract the Helm chart into the working directory.

  2. Open file immuta-enterprise/DIGESTS.md

Verify signature

Verify an artifact's signature by referencing Immuta's public key.

global:
  imageRegistry: <private-registry-fqdn>
secure:
  backgroundWorker:
    image:
      repository: <prefix>/immuta-service
  web:
    image:
      repository: <prefix>/immuta-service
helm pull oci://ocir.immuta.com/stable/immuta-enterprise --destination . --untar --version 2025.1.9
cosign verify --key ./immuta-cosign.pub <image>

Checklist

Encountering issues?

Ensure you can communicate with all required services from within the Kubernetes cluster. Consult the troubleshooting section for solutions to common problems.

Quickstart

Get started quickly with these essential guides. For a more comprehensive understanding and advanced configurations, explore the full suite of guides.

  1. Complete the guide that corresponds with your Kubernetes cluster's distribution.

    • Managed public cloud: This guide includes instructions for

      • Amazon Elastic Kubernetes Service (EKS)

      • Google Kubernetes Engine (GKE)

      • Microsoft Azure Kubernetes Service (AKS)

    • Red Hat OpenShift

  2. Complete the Ingress configuration guide.

  3. Complete the Production best practices guide.


Add the audit user to the role.

  • Create two additional roles to hold the following permission groups. Immuta recommends using permission groups for a smooth configuration, but the key requirement is that the user ultimately has all of the following permissions.

    • immuta_cluster_permission_grp with the following permissions:

      • cluster:monitor/health

      • indices:data/write/bulk

      • indices:data/write/bulk*

      • indices:data/read/scroll

      • indices:data/read/scroll/clear

      • indices:monitor/settings/get

    • immuta_index_permission_grp with the following permissions for the * index:

      • indices:admin/exists

      • indices:admin/create

      • indices:admin/delete

      • indices:admin/settings/update

      • indices:admin/get

      • indices:admin/refresh

      • indices:admin/refresh*

      • indices:admin/mapping/put

      • indices:data/read/search

      • indices:data/read/scroll

      • indices:data/read/scroll/clear

      • indices:data/write/delete

      • indices:data/write/delete/byquery

      • indices:data/write/index

      • indices:data/write/bulk

      • indices:data/write/bulk*

      • indices:monitor/settings/get

  • Add the additional roles with the permissions above to the audit user's role.

  • After these steps are complete, your audit user should have the required permissions and you can complete the Immuta install using the user's username and password.
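If you prefer to script this setup, the same roles can be created through the OpenSearch Security plugin's REST API. The following is a minimal sketch only: the endpoint, master credentials, and audit username are placeholders, and the permission lists simply mirror the groups above.

    # Create the cluster-level permission group
    curl --user '<master-username>:<master-password>' --request PUT \
      "https://<opensearch-endpoint>/_plugins/_security/api/roles/immuta_cluster_permission_grp" \
      --header 'Content-Type: application/json' \
      --data '{"cluster_permissions": ["cluster:monitor/health", "indices:data/write/bulk", "indices:data/write/bulk*", "indices:data/read/scroll", "indices:data/read/scroll/clear", "indices:monitor/settings/get"]}'

    # Create the index-level permission group for the * index
    curl --user '<master-username>:<master-password>' --request PUT \
      "https://<opensearch-endpoint>/_plugins/_security/api/roles/immuta_index_permission_grp" \
      --header 'Content-Type: application/json' \
      --data '{"index_permissions": [{"index_patterns": ["*"], "allowed_actions": ["indices:admin/exists", "indices:admin/create", "indices:admin/delete", "indices:admin/settings/update", "indices:admin/get", "indices:admin/refresh", "indices:admin/refresh*", "indices:admin/mapping/put", "indices:data/read/search", "indices:data/read/scroll", "indices:data/read/scroll/clear", "indices:data/write/delete", "indices:data/write/delete/byquery", "indices:data/write/index", "indices:data/write/bulk", "indices:data/write/bulk*", "indices:monitor/settings/get"]}]}'

    # Map the roles to the audit user (repeat for immuta_index_permission_grp)
    curl --user '<master-username>:<master-password>' --request PUT \
      "https://<opensearch-endpoint>/_plugins/_security/api/rolesmapping/immuta_cluster_permission_grp" \
      --header 'Content-Type: application/json' \
      --data '{"users": ["<audit-username>"]}'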


    Conventions

    The following conventions are used throughout the installation material.

    Angle brackets ( < and > )

    Phrases wrapped in angle brackets (i.e., < and >) are placeholders used to indicate values that must be substituted with user-provided values. Placeholders are typically written in either kebab case or snake case; the following placeholders are equivalent:

    • <the-quick-brown-fox>

    • <the_quick_brown_fox>

    Example

    Input

    Output

    Disaster Recovery

    Planning a disaster recovery strategy

    As of 2024.2 LTS, there is no longer a backup/restore mechanism built into the Immuta Enterprise Helm chart. Organizations are now solely responsible for creating and enacting an effective disaster recovery strategy for their installation.

    All application state is stored in the PostgreSQL metadata database; therefore, recovering from a disaster event only entails restoring that PostgreSQL database. Consult each cloud provider's point-in-time recovery (PITR) documentation for guidance:

    • Amazon RDS for PostgreSQL

    • Azure Database for PostgreSQL

    • Google Cloud SQL for PostgreSQL

    For more details about point-in-time recovery, see the PostgreSQL documentation.
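    For example, with Amazon RDS for PostgreSQL a point-in-time restore can be initiated from the AWS CLI; the instance identifiers below are placeholders, and this is only a sketch of one possible recovery path:

    aws rds restore-db-instance-to-point-in-time \
      --source-db-instance-identifier <immuta-metadata-db> \
      --target-db-instance-identifier <immuta-metadata-db-restored> \
      --use-latest-restorable-time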

    External Cache Configuration

    This guide demonstrates how to configure an external key-value cache (such as Redis or Memcached) with the Immuta Enterprise Helm chart (IEHC).

    Kubernetes namespace

    The following section(s) presume the IEHC was deployed into namespace immuta and that the current namespace is immuta.

    Rotating Credentials

    This guide demonstrates how to update credentials referenced in the Immuta Enterprise Helm chart (IEHC).

    Kubernetes namespace

    The following section(s) presume the IEHC was deployed into namespace immuta and that the current namespace is immuta.

    Troubleshooting

    Frequently asked questions

    How can I ensure the fully qualified domain name (FQDN) is resolvable from within the Kubernetes cluster?




    Prerequisite

    The Production best practices guide must be completed before proceeding.

    Redis

    1. Edit secret immuta-secret that was created in the Immuta in production guide.

    2. Add key-value IMMUTA_SERVER_CACHE_PROVIDER_OPTIONS_PASSWORD=<cache-password>.
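    If you prefer to script the change rather than use an interactive editor, one possible sketch (the secret name matches the production guide; the password value is a placeholder):

    kubectl patch secret immuta-secret --type merge --patch \
      '{"stringData": {"IMMUTA_SERVER_CACHE_PROVIDER_OPTIONS_PASSWORD": "<cache-password>"}}'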

    Edit Helm values

    Edit the immuta-values.yaml file to include the relevant Helm values listed below. Update all placeholder values with your own values.

    Redis

    TLS configuration

    TLS must be configured both client-side and server-side. The following Helm values demonstrate connecting to Redis with TLS enabled.

    Memcached

    Apply Helm values

    Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Create a pod named debug-dns and spawn an interactive shell.

  • Install package bind-utils.

  • Perform DNS lookups on a given FQDN.

  • I'm unsure which Kubernetes namespace or Helm release is associated with my Immuta installation. How can I find this out?

    I no longer have my immuta-values.yaml Helm values file. How do I recover this file?

    I don't want to keep passing option --namespace every time I run a Helm command. How do I set a default?

    PostgreSQL

    How do I determine if the database is accepting connections?

    1. Create a pod named debug-postgres and spawn an interactive shell.

    2. Validate that the database is listening.

    Redis

    How can a TCP connection be established without using Redis CLI?

    1. Create a pod named debug-redis and spawn an interactive shell.

    2. Send a raw TCP message to the database using Netcat.

    How do I establish a TCP connection?

    1. Create a pod named debug-redis and spawn an interactive shell.

    2. Establish a connection to the database using the Redis client. If a connection can be established with Netcat and the redis-cli command does not return, then Redis could be expecting a TLS connection. Pass option --tls.

    Elasticsearch

    How do I query the API using cURL?

    1. Create a pod named debug-elasticsearch and spawn an interactive shell.

    2. Install package curl.

    3. Check the cluster health.

    Basic authentication

    Depending on the cluster's configuration, it might be necessary to use basic auth. Pass option --header "Authorization: Basic $token", where token equals $(printf '%s:%s' "<username>" "<password>" | base64).
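    Putting it together, the cluster health check above might look like the following with basic authentication (username and password are placeholders):

    token=$(printf '%s:%s' "<username>" "<password>" | base64)
    curl --fail --header "Authorization: Basic $token" --request GET "http://<elasticsearch-fqdn>:9200/_cluster/health?pretty"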

    Helm

    When installing the Helm chart from ocir.immuta.com, I get the error scheme "oci" not supported. What's going on?

    The Immuta Enterprise Helm chart (IEHC) is distributed as an OCI artifact, and your current Helm version might not support it. Refer to the Helm documentation for further assistance.

    1. Determine your Helm version.

    2. If your Helm version is older than 3.8.0, you'll need to upgrade; before that version, OCI support wasn't enabled by default.

    computerScientists:
    - Alan Turing
    - Grace Hopper
    - Donald Knuth
    - Tim Berners-Lee
    - John McCarthy
    - <first-name> <last-name>
    computerScientists:
    - Alan Turing
    - Grace Hopper
    - Donald Knuth
    - Tim Berners-Lee
    - John McCarthy
    - Margaret Hamilton
    kubectl edit secret/immuta-secret
    cache:
      enabled: false
    
    secure:
      extraConfig:
        server:
          cache:
            provider:
              constructor: catbox-redis
              options:
                host: <redis-fqdn>
                port: <port>
                # Setting options.tls to an empty dict enables TLS without configuring any other options.
                tls: {}
    
                # Dict representation of TLS config options json-object for package ioredis
                # https://github.com/redis/ioredis
                #
                # tls:
                #   ca:
                #   key:
                #   cert:
    
      extraEnvVars:
      - name: IMMUTA_SERVER_CACHE_PROVIDER_OPTIONS_PASSWORD
        valueFrom:
          secretKeyRef:
            key: IMMUTA_SERVER_CACHE_PROVIDER_OPTIONS_PASSWORD
            name: immuta-secret
    cache:
      enabled: false
    
    secure:
      extraConfig:
        server:
          cache:
            provider:
              constructor: catbox-memcached
              options:
                host: <memcached-fqdn>
                port: <port>
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    kubectl run debug-dns --stdin --tty --rm --image docker.io/rockylinux/rockylinux:9 -- sh
    dnf install bind-utils
    dig <fqdn>
    kubectl run debug-postgres --stdin --tty --rm --image docker.io/bitnami/postgresql:latest -- sh
    pg_isready --host <postgres-fqdn> --port 5432
    kubectl run debug-redis --stdin --tty --rm --image docker.io/rockylinux/rockylinux:9 -- sh
    nc -zv <redis-fqdn> 6379
    kubectl run debug-redis --stdin --tty --rm --image docker.io/bitnami/redis:latest -- sh
    redis-cli -h <redis-fqdn> -p 6379
    kubectl run debug-elasticsearch --stdin --tty --rm --image docker.io/rockylinux/rockylinux:9 -- sh
    dnf install curl
    curl --fail --request GET "http://<elasticsearch-fqdn>:9200/_cluster/health?pretty"
    helm version
    helm list --all-namespaces --output json | jq '.[]|select(.chart | startswith("immuta"))'
    helm get values <release-name> > immuta-values.yaml
    kubectl config set-context --current --namespace=<name>
    Kubernetes secrets

    Edit secrets

    Using an alternative editor

    Set environment variable KUBE_EDITOR to specify an alternative text editor.
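    For example, to make a one-off edit with nano (any installed editor works):

    KUBE_EDITOR="nano" kubectl edit secret/immuta-secret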

    1. Validate that secret immuta-secret exists in the current namespace.

    2. Edit secret immuta-secret in place.

    3. Edit secret immuta-legacy-secret in place. Skip this step if the legacy query engine is disabled (the default).

    4. Restart pods.

    Legacy query engine

    Considerations when using the legacy query engine

    The following section is only necessary if the legacy query engine service has been enabled.

    1. Validate that secret immuta-legacy-secret exists in the current namespace.

    2. Get the query engine replica count; this value will be referenced in subsequent step(s).

    3. Scale the replica count down to 1.

    4. Get the query engine pod name; this value will be referenced in subsequent step(s).

    5. Update the query engine superuser password.

    6. Update the query engine replication password.

    7. Update the query engine feature password.

    8. Scale the replica count back up to the previous value by updating the StatefulSet.

    Apply Helm values

    1. Update credentials in the immuta-values.yaml file.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml. Update the placeholder value with your own release name.

    Migrating to the New Helm Chart

    This guide demonstrates how to upgrade an existing Immuta deployment installed with the older Immuta Helm chart (IHC) to v2024.2 LTS using the Immuta Enterprise Helm chart (IEHC).

    Helm chart deprecation notice

    As of Immuta version 2024.2, the IHC has been deprecated in favor of the IEHC. Their respective immuta-values.yaml Helm values files are not compatible.

    Prerequisites

    Create a PostgreSQL database

    1. The PostgreSQL instance has been provisioned and is actively running.

    2. The PostgreSQL instance's hostname/FQDN is resolvable from within the Kubernetes cluster.

    3. The PostgreSQL instance is accepting connections.

    For additional information, consult the Deployment requirements.

    Validate the Helm release

    1. Fetch the metadata for the Helm release associated with Immuta.

    2. Review the output from the previous step and verify the following:

      • The Immuta version (appVersion) is the last LTS (2022.5.x) or 2024.1 or newer, and less than 2024.2

      • The Immuta Helm chart version (version) is greater than or equal to 4.13.5

      • The Immuta Helm chart name (chart) is immuta

    If any of the criteria is not met, it's first necessary to perform a Helm upgrade using the IHC. Contact your Immuta representative for guidance.

    Metadata database

    The new IEHC no longer supports deploying a Metadata database (PostgreSQL) inside the Kubernetes cluster. Before transitioning to the new IEHC, it's first necessary to externalize the Metadata database.

    Built-in

    The following demonstrates how to take a database backup and import the data into each cloud provider's managed PostgreSQL service.

    Create backup of old database

    1. Get the metadata database pod name.

    2. Spawn a shell inside the running metadata database pod.

    3. Perform a database backup.

    4. Type exit, and then press Enter to exit the shell prompt.

    5. Copy file bometadata.dump from the pod to the host's working directory.

    Setup new database

    1. Create a pod named immuta-setup-db and spawn a shell.

    2. Connect to the new PostgreSQL database as a superuser. Depending on the cloud provider, the default superuser name (postgres) might differ.

    3. Create the immuta, temporal, and temporal_visibility databases and an immuta role.

    4. Type \q, and then press Enter to exit the psql prompt.

    Restore backup to new database

    1. Create a pod named immuta-restore-db and spawn a shell.

    2. Copy file bometadata.dump from the host's working directory to pod immuta-restore-db.

    3. Spawn a shell inside pod immuta-restore-db.

    4. Perform a database restore while authenticated as role immuta. Refer to the value substituted for <postgres-password> when prompted to enter a password.

    5. Type exit, and then press Enter to exit the shell prompt.

    6. Delete pod immuta-restore-db that was previously created.

    External

    No additional work is required. The existing database can be reused with the new IEHC.

    Helm values

    Helm values file compatibility

    The immuta-values.yaml Helm values file used by the IHC is not compatible with the new IEHC.

    1. Rename the existing immuta-values.yaml Helm values file used by the IHC.

    2. Follow the installation guide for your Kubernetes distribution of choice.

    TLS Configuration

    This guide demonstrates how to configure TLS termination for an Ingress resource.

    Kubernetes namespace

    The following section(s) presume the Immuta Enterprise Helm chart was deployed into namespace immuta and that the current namespace is immuta.

    Prerequisite

    The Ingress configuration guide must be completed before proceeding.

    Ingress-NGINX Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Create a TLS secret from a given public/private PEM formatted key pair.

    3. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Ingress-Nginx Controller documentation for further assistance.

    GKE Ingress Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the GKE Ingress Controller documentation for further assistance.

    AWS Load Balancer Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the AWS Load Balancer Controller documentation for further assistance.

    AKS Application Gateway Ingress Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Application Gateway Ingress Controller documentation for further assistance.

    Traefik

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Create a TLS secret from a given public/private PEM formatted key pair.

    3. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Traefik documentation for further assistance.

    Ingress Configuration

    This guide demonstrates how to configure Ingress. Ingress can be configured in numerous ways. Configurations for the most popular controllers are outlined below.

    Kubernetes namespace

    The following section(s) presume the Immuta Enterprise Helm chart was deployed into namespace immuta and that the current namespace is immuta.

    The Immuta web service listens on the following ports:

    Port | Protocol | Description | Optional
    443 | TCP | HTTPS | False
    80 | TCP | HTTP (redirects to HTTPS) | True

    Ingress hostname

    This is the fully qualified domain name (FQDN) as defined by RFC 3986 used to access the Immuta UI. If a FQDN has yet to be determined set Secure's ingress hostname to immuta.local.

    Ingress NGINX Controller

    1. Edit the immuta-values.yaml file to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Ingress-Nginx Controller documentation for further assistance.

    GKE Ingress Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Create a file named frontendconfig.yaml with the following content.

    3. Apply the FrontendConfig CRD.

    4. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Google Cloud documentation for further assistance.

    AWS Load Balancer Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the AWS Load Balancer Controller documentation for further assistance.

    AKS Application Gateway Ingress Controller

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Application Gateway Ingress Controller documentation for further assistance.

    Traefik

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Create a file named middleware.yaml with the following content.

    3. Apply the Middleware CRD.

    4. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Traefik documentation for further assistance.

    OpenShift Ingress Operator

    1. Edit immuta-values.yaml to include the following Helm values. Because the Ingress resource will be managed by the OpenShift route you create and not the Immuta Enterprise Helm chart, ingress is set to false below.

    2. Get the service name for Secure.

    3. Create a file named route.yaml with the following content. Update all placeholder values with your own values.

    4. Apply the Route CRD.

    5. Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    Refer to the Red Hat OpenShift documentation for further assistance.

    Connect Integrations

    Immuta integrates with your data platforms and external catalogs so you can register your data and effectively manage access controls on that data.

    This section includes concept, reference, and how-to guides for configuring your data platform integration, registering data sources, and connecting your external catalog so that you can discover, monitor, and protect sensitive data.

    Integrations overview

    This reference guide outlines the features, policies, and audit capabilities supported by each integration.

    Integrations

    The guides in these sections include information about how to connect your data platform to Immuta:

    • Databricks Unity Catalog

    • Google BigQuery

    • Snowflake

    • Starburst (Trino)

    • Amazon Redshift

    • Amazon S3

    • Azure Synapse Analytics

    • Databricks Spark

    Queries Immuta runs in remote platforms

    This reference guide outlines the actions and features that trigger Immuta queries in your remote platform that may incur cost.

    Connect your data

    Immuta integrates with your data platforms so you can register your data and effectively manage access controls on that data. This section includes concept, reference, and how-to guides for registering and managing data sources and your connections.

    Databricks Spark Integration Configuration

    The Databricks Spark integration is one of two integrations Immuta offers for Databricks.

    In this integration, Immuta installs an Immuta-maintained Spark plugin on your Databricks cluster. When a user queries data that has been registered in Immuta as a data source, the plugin injects policy logic into the plan Spark builds so that the results returned to the user only include data that specific user should see.

    The reference guides in this section are written for Databricks administrators who are responsible for setting up the integration, securing Databricks clusters, and setting up users:

    • Installation and compliance: This guide includes information about what Immuta creates in your Databricks environment and securing your Databricks clusters.

    • Customizing the integration: Consult this guide for information about customizing the Databricks Spark integration settings.

    • Setting up users: Consult this guide for information about connecting data users and setting up user impersonation.

    • Spark environment variables: This guide provides a list of Spark environment variables used to configure the integration.

    • Ephemeral overrides: This guide describes ephemeral overrides and how to configure them to reduce the risk that a user has overrides set to a cluster (or multiple clusters) that aren't currently up.

    Redshift

    In this integration, Immuta generates policy-enforced views in your configured Redshift schema for tables registered as Immuta data sources.

    Getting started

    This guide outlines how to integrate Redshift with Immuta.

    How-to guides

    • Redshift integration configuration: Configure the integration in Immuta.

    • Redshift Spectrum configuration: Configure Redshift Spectrum in Immuta.

    Reference guides

    • Redshift integration reference guide: This guide describes the design and components of the integration.

    • Redshift pre-configuration details: This guide describes the prerequisites, supported features, and limitations of the integration.

    Upgrading IEHC

    This guide demonstrates how to upgrade an existing 2024.2+ Immuta deployment installed with the Immuta Enterprise Helm chart (IEHC) to the latest Immuta release.

    Requirements

    Temporal

    Enabling Legacy Query Engine

    The query engine is no longer installed by default. This guide demonstrates how to enable the query engine using the Immuta Enterprise Helm chart (IEHC).

    If you are using any of the legacy data platforms, you must enable the query engine.

    Kubernetes namespace

    The following section(s) presume the IEHC was deployed into namespace immuta, and that the current namespace is immuta.

    Setting Up OpenSearch User Permissions for an AWS Role

    If you're using AWS OpenSearch in your Immuta installation, use this how-to to set up the proper access and permissions needed for AWS role authentication.

    Requirements

    • An OpenSearch domain

    Air-Gapped Environments

    This guide demonstrates how to download and package the Immuta Enterprise Helm chart and its dependencies for consumption on a separate network with no internet access.

    Prerequisite

    Skopeo installation
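    With helm and skopeo installed, a minimal packaging sketch might look like the following; the image reference is illustrative and should be taken from the chart's DIGESTS.md file:

    # Download the Helm chart to the working directory
    helm pull oci://ocir.immuta.com/stable/immuta-enterprise --destination . --version 2025.1.9

    # Copy a container image into a local directory archive for transfer to the offline network
    skopeo copy docker://<immuta-image-reference> dir:./immuta-service

    # From the offline network, push the image into the private registry
    skopeo copy dir:./immuta-service docker://<private-registry-fqdn>/<prefix>/immuta-service:<tag>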

    Getting Started with Databricks Spark

    The how-to guides linked on this page illustrate how to integrate Databricks Spark with Immuta.

    Requirements

    • If Databricks Unity Catalog is enabled in a Databricks workspace, you must use an when you set up the Databricks Spark integration to create an Immuta-enabled cluster.

    • If Databricks Unity Catalog is not enabled in your Databricks workspace, you must disable Unity Catalog in your Immuta tenant before proceeding with your configuration of Databricks Spark:

    DBFS Access

    This page outlines how to enable access to DBFS in Databricks for non-sensitive data. Databricks administrators should place the desired configuration in the Spark environment variables.

    DBFS FUSE mount

    This Databricks feature mounts DBFS to the local cluster filesystem at /dbfs. Although disabled when using process isolation, this feature can safely be enabled if raw, unfiltered data is not stored in DBFS and all users on the cluster are authorized to see each other's files. When enabled, the entirety of DBFS essentially becomes a scratch path where users can read and write files in /dbfs/path/to/my/file as though they were local files.

    Setting Up Users

    When the Databricks Spark plugin is running on a Databricks cluster, all Databricks users running jobs or queries are either a privileged user or a non-privileged user:

  • Privileged users: Privileged users can effectively read from and write to any table or view in the cluster metastore, or any file path accessible by the cluster, without restriction. Privileged users are either Databricks workspace admins or users specified in IMMUTA_SPARK_ACL_ALLOWLIST. Any user writing queries or jobs impersonating another user is a non-privileged user, even if they are impersonating a privileged user.

      Privileged users have effective authority to read from and write to any securable in the cluster metastore or file path because, in almost all cases, Databricks clusters running with the Immuta Spark plugin installed have disabled Hive metastore table access control. However, if Hive metastore table access control is enabled on the cluster, privileged users will have the authority granted to them by table access control.

    Getting Started with Azure Synapse Analytics

    The how-to guides linked on this page illustrate how to integrate Azure Synapse Analytics with Immuta. See the reference guides for information about the Azure Synapse Analytics integration.

    Requirement: A running Dedicated SQL pool

    1 - Connect your technology

    These guides provide instructions on getting your data set up in Immuta.

    1. Azure Synapse Analytics configuration: Configure an Azure Synapse Analytics integration with Immuta so that Immuta can create policy-protected views for your users to query.

    Databricks Unity Catalog

    This integration allows you to manage and access data in your Databricks account across all of your workspaces. With Immuta’s Databricks Unity Catalog integration, you can write your policies in Immuta and have them enforced automatically by Databricks across data in your Unity Catalog metastore.

    This getting started guide outlines how to integrate Databricks Unity Catalog with Immuta.

    kubectl get secret/immuta-secret
    kubectl edit secret/immuta-secret
    kubectl edit secret/immuta-legacy-secret
    kubectl get secret/immuta-legacy-secret
    kubectl get statefulset --all --selector "app.kubernetes.io/component=query-engine" --output template='{{ .status.replicas }}'
    kubectl scale statefulset --all --replicas 1 --selector "app.kubernetes.io/component=query-engine"
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    Non-privileged users: Non-privileged users are any users who are not privileged users, and all authorization for non-privileged users is determined by Immuta policies.

    Whether a user is a privileged user or a non-privileged user for a given query or job is cached once it is first determined, based on the IMMUTA_SPARK_ACL_PRIVILEGED_TIMEOUT_SECONDS environment variable. This caching can be disabled entirely by setting the value of that environment variable to 0.

    Mapping Databricks users to Immuta

    Usernames in Databricks must match the usernames in the connected Immuta tenant. By default, the Immuta Spark plugin checks the Databricks username against the username within Immuta's internal IAM to determine access. However, you can integrate your existing IAM with Immuta and use that instead of the default internal IAM. Ideally, you should use the same identity manager for Immuta that you use for Databricks. See the Immuta support matrix page for a list of supported identity providers and protocols.

    It is possible within Immuta to have multiple users share the same username if they exist within different IAMs. In this case, the cluster can be configured to look up users from a specified IAM. To do this, set the IMMUTA_USER_MAPPING_IAMID Spark environment variable to the targeted IAM ID configured within the Immuta tenant. The targeted IAM ID can be found on the App settings page. Each Databricks cluster can only be mapped to one IAM.
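    For example, in the cluster's Spark environment variables (the IAM ID is a placeholder taken from the App settings page):

    IMMUTA_USER_MAPPING_IAMID=<iam-id>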

    User impersonation

    Databricks user impersonation allows a Databricks user to impersonate an Immuta user. With this feature,

    • the Immuta user who is being impersonated does not have to have a Databricks account, but they must have an Immuta account.

    • the Databricks user who is impersonating an Immuta user does not have to be associated with Immuta. For example, this could be a service account.

    When acting under impersonation, the Databricks user loses their privileged access, so they can only access the tables the Immuta user has access to and only perform DDL commands when that user is acting under an allowed circumstance (such as workspaces, scratch paths, or non-Immuta reads/writes).

    Use the IMMUTA_SPARK_DATABRICKS_ALLOWED_IMPERSONATION_USERS Spark environment variable to enable user impersonation.
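    For example, a BI service account could be allowed to impersonate Immuta users by adding it to the cluster's Spark environment variables; the account name below is hypothetical:

    IMMUTA_SPARK_DATABRICKS_ALLOWED_IMPERSONATION_USERS=bi-service-account@example.com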

    Scala clusters

    Immuta discourages use of this feature with Scala clusters, as the proper security mechanisms were not built to account for user isolation limitations in Scala clusters. Instead, this feature was developed for the BI tool use case in which service accounts connecting to the Databricks cluster need to impersonate Immuta users so that policies can be enforced.

    How-to guides
    • Databricks Unity Catalog configuration: Configure the Databricks Unity Catalog integration.

    • Migrate to Databricks Unity Catalog: Migrate from the legacy Databricks Spark integrations to the Databricks Unity Catalog integration.

    Reference guide

    Databricks Unity Catalog integration reference guide: This guide describes the design and components of the integration.

    Prerequisites

    When migrating from the IHC to IEHC, query engine state is not retained. You must enable query engine rehydration to restore existing data source tables. If SQL credentials are used, they must be recreated by using LDAP sync or manually with the following command executed in the bometadata database:

    TRUNCATE bometadata."profile-sql";

    • The Immuta in production guide must be completed before proceeding.

    • Validate that secret immuta-secret exists in the current namespace.

    Create Kubernetes secret

    1. Create a file named secret-data.env with the following content.

    2. Create secret named immuta-legacy-secret from file secret-data.env

    3. Delete file secret-data.env, as it's no longer needed.

    Edit Helm values

    1. Edit the immuta-values.yaml file to include the following Helm values.

    2. Update all placeholder values in the immuta-values.yaml file.

    Avoid these special characters in generated passwords

    whitespace, $, &, :, \, /, '

    Apply Helm values

    Perform a Helm upgrade to apply the changes made to immuta-values.yaml.


    DBFS FUSE mount limitation: This feature cannot be used in environments with E2 Private Link enabled.

    For example,

    In Python,

    Note: This solution also works in R and Scala.

    Enable DBFS FUSE mount

    To enable the DBFS FUSE mount, set this configuration in the Spark environment variables: IMMUTA_SPARK_DATABRICKS_DBFS_MOUNT_ENABLED=true.

    Mounting a bucket

    • Users can mount additional buckets to DBFS that can also be accessed using the FUSE mount.

    • Mounting a bucket is a one-time action, and the mount will be available to all clusters in the workspace from that point on.

    • Mounting must be performed from a non-Immuta cluster.

    Scala DBUtils (and %fs magic) with scratch paths

    Scratch paths will work when performing arbitrary remote filesystem operations with %fs magic or Scala dbutils.fs functions. For example,

    Configure Scala DBUtils (and %fs magic) with scratch paths

    To support %fs magic and Scala DBUtils with scratch paths, configure the immuta.spark.databricks.scratch.paths property.

    Configure DBUtils in Python

    To use dbutils in Python, set this configuration: immuta.spark.databricks.py4j.strict.enabled=false.

    Example workflow

    This section illustrates the workflow for getting a file from a remote scratch path, editing it locally with Python, and writing it back to a remote scratch path.

    1. Get the file from remote storage:

    2. Make a copy if you want to explicitly edit localScratchFile, as it will be read-only and owned by root:

    3. Write the new file back to remote storage:

    kubectl rollout restart deployment --all --selector "app.kubernetes.io/component=audit,app.kubernetes.io/component=secure"
    kubectl get pod --selector "app.kubernetes.io/component=query-engine"
    kubectl exec pod/<query-engine-pod-name> -- \
        psql -d immuta -c \
        "ALTER USER postgres WITH ENCRYPTED PASSWORD '<new-patroni-superuser-password>'"
    kubectl exec pod/<query-engine-pod-name> -- \
        psql -d immuta -c \
        "ALTER USER replicator WITH ENCRYPTED PASSWORD '<new-patroni-replication-password>'"
    kubectl exec pod/<query-engine-pod-name> -- \
        psql -d immuta -c \
        "ALTER USER feature_service WITH ENCRYPTED PASSWORD '<new-immuta-feature-password>'"
    kubectl scale statefulset --all --replicas <query-engine-previous-replica-count> --selector "app.kubernetes.io/component=query-engine"
    secure:
      ingress:
        hostname: <immuta-fqdn>
        annotations:
          nginx.ingress.kubernetes.io/auth-tls-secret: <namespace>/<secret-name>
    kubectl create secret tls <secret-name> --cert=path/to/tls.cert --key=path/to/tls.key
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        hostname: <immuta-fqdn>
        annotations:
          ingress.gcp.kubernetes.io/pre-shared-cert: <certificate-name>
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        hostname: <immuta-fqdn>
        annotations:
          alb.ingress.kubernetes.io/certificate-arn: <certificate-arn>
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        hostname: <immuta-fqdn>
        annotations:
          appgw.ingress.kubernetes.io/appgw-ssl-certificate: <certificate-name>
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        annotations:
          traefik.ingress.kubernetes.io/router.tls: "true"
        hostname: <immuta-fqdn>
        tls: true
        # If left unset the TLS secret name defaults to <hostname>-tls
        secretName: <secret-name>
    kubectl create secret tls <secret-name> --cert=path/to/tls.cert --key=path/to/tls.key
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    kubectl cp <metadata-database-pod-name>:/tmp/bometadata.dump .
    pg_restore --host=<postgres-fqdn> --port=5432 --username=immuta --password --dbname=immuta --no-owner --role=immuta < /tmp/bometadata.dump
    kubectl delete pod/immuta-restore-db
    helm get metadata --output yaml <helm-release-name>
    kubectl get pod --selector "app.kubernetes.io/component=database" --output name
    kubectl exec --stdin --tty <metadata-database-pod-name> -- sh
    pg_dump --dbname=bometadata --file=/tmp/bometadata.dump --format=custom --no-owner --no-privileges
    kubectl run immuta-setup-db --stdin --tty --rm --image docker.io/bitnami/postgresql:latest -- sh
    psql --host <postgres-fqdn> --username postgres --port 5432 --password
    kubectl run immuta-restore-db --image docker.io/bitnami/postgresql:latest -- sleep infinity
    kubectl cp bometadata.dump immuta-restore-db:/tmp
    kubectl exec immuta-restore-db --stdin --tty -- sh
    mv immuta-values.yaml immuta-values.ihc.yaml
    CREATE ROLE immuta with login encrypted password '<postgres-password>';
    GRANT immuta TO CURRENT_USER;
    
    CREATE DATABASE immuta OWNER immuta;
    CREATE DATABASE temporal OWNER immuta;
    CREATE DATABASE temporal_visibility OWNER immuta;
    
    GRANT all ON DATABASE immuta TO immuta;
    GRANT all ON DATABASE temporal TO immuta;
    GRANT all ON DATABASE temporal_visibility TO immuta;
    ALTER ROLE immuta SET search_path TO bometadata,public;
    REVOKE immuta FROM CURRENT_USER;
    
    \c immuta
    CREATE EXTENSION pgcrypto;
    
    \c temporal
    GRANT CREATE ON SCHEMA public TO immuta;
    
    \c temporal_visibility
    GRANT CREATE ON SCHEMA public TO immuta;
    CREATE EXTENSION btree_gin;
    oc apply -f route.yaml
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        enabled: true
        hostname: <immuta-fqdn>
        ingressClassName: nginx
        annotations:
          nginx.ingress.kubernetes.io/proxy-body-size: '64m'
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        enabled: true
        hostname: <immuta-fqdn>
        annotations:
          # Determines which type of load balancer is provisioned
          #   gce-internal
          #   gce
          kubernetes.io/ingress.class: gce
          # Listen on both 80 and 443
          kubernetes.io/ingress.allow-http: 'true'
          # Redirect traffic from 80 to 443
          cloud.google.com/frontend-config: immuta
    apiVersion: networking.gke.io/v1beta1
    kind: FrontendConfig
    metadata:
      name: immuta
    spec:
      redirectToHttps:
        enabled: true
        responseCodeName: RESPONSE_CODE
    kubectl apply -f frontendconfig.yaml
    secure:
      ingress:
        enabled: true
        hostname: <immuta-fqdn>
        ingressClassName: alb
        annotations:
          # Determines which type of load balancer is provisioned
          #   internal
          #   internet-facing
          alb.ingress.kubernetes.io/scheme: internet-facing
          alb.ingress.kubernetes.io/target-type: ip
          # Listen on both 80 and 443
          alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
          # Redirect traffic from 80 to 443
          alb.ingress.kubernetes.io/ssl-redirect: '443'
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        enabled: true
        hostname: <immuta-fqdn>
        ingressClassName: webapprouting.kubernetes.azure.com
        # https://azure.github.io/application-gateway-kubernetes-ingress/annotations/
        annotations:
          appgw.ingress.kubernetes.io/ssl-redirect: 'true'
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    secure:
      ingress:
        enabled: true
        hostname: <immuta-fqdn>
        ingressClassName: traefik
        annotations:
          # Listen on ports 80 and 443
          traefik.ingress.kubernetes.io/router.entrypoints: web,websecure
          # Redirect HTTP to HTTPS
          # When referencing middleware you must prefix the name with its namespace
          # <namespace>-<middleware-name>@kubernetescrd
          traefik.ingress.kubernetes.io/router.middlewares: immuta-https-redirectscheme@kubernetescrd
    apiVersion: traefik.containo.us/v1alpha1
    kind: Middleware
    metadata:
      name: https-redirectscheme
    spec:
      redirectScheme:
        scheme: https
        permanent: true
    kubectl apply -f middleware.yaml
    secure:
      ingress:
        enabled: false
    oc get service --selector "app.kubernetes.io/component=secure" --output template='{{ .metadata.name }}'
    apiVersion: route.openshift.io/v1
    kind: Route
    metadata:
      name: immuta
    spec:
      host: <immuta-fqdn>
      to:
        kind: Service
        name: immuta-secure
      port:
        targetPort: http
      tls:
        termination: edge
        insecureEdgeTerminationPolicy: Redirect
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    kubectl get secret/immuta-secret
    # query-engine
    IMMUTA_FEATURE_PASSWORD=<immuta-feature-password>
    PATRONI_SUPERUSER_PASSWORD=<patroni-superuser-password>
    PATRONI_REPLICATION_PASSWORD=<patroni-replication-password>
    PATRONI_RESTAPI_PASSWORD=<patroni-api-password>
    kubectl create secret generic immuta-legacy-secret --from-env-file=secret-data.env
    rm -i secret-data.env
    legacy:
      enabled: true
    
      queryEngine:
        statefulset:
          extraEnvVars:
          - name: IMMUTA_FEATURE_PASSWORD
            valueFrom:
              secretKeyRef:
                name: immuta-legacy-secret
                key: IMMUTA_FEATURE_PASSWORD
          - name: PATRONI_SUPERUSER_PASSWORD
            valueFrom:
              secretKeyRef:
                name: immuta-legacy-secret
                key: PATRONI_SUPERUSER_PASSWORD
          - name: PATRONI_REPLICATION_PASSWORD
            valueFrom:
              secretKeyRef:
                name: immuta-legacy-secret
                key: PATRONI_REPLICATION_PASSWORD
          - name: PATRONI_RESTAPI_PASSWORD
            valueFrom:
              secretKeyRef:
                name: immuta-legacy-secret
                key: PATRONI_RESTAPI_PASSWORD
    
        postgres:
          # Query Engine feature user
          # Instead use queryEngine.statefulset.extraEnvVars[].name[IMMUTA_FEATURE_PASSWORD]
          # password: <immuta-feature-password>
    
          # Query Engine superuser user
          # Instead use queryEngine.statefulset.extraEnvVars[].name[PATRONI_SUPERUSER_PASSWORD]
          # superuserPassword: <patroni-superuser-password>
    
          # Query Engine replication user
          # Instead use queryEngine.statefulset.extraEnvVars[].name[PATRONI_REPLICATION_PASSWORD]
          # replicationPassword: <patroni-replication-password>
    
          # Query Engine patroni api user
          # Instead use queryEngine.statefulset.extraEnvVars[].name[PATRONI_RESTAPI_PASSWORD]
          # patroniApiPassword: <patroni-api-password>
        immutaSecurity:
          # Each Kubernetes Service has a DNS record associated with it. See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
          # The anatomy of a domain name is as followed:
          #   <service>.<namespace>.svc.<cluster-domain>
          #
          # Where the default cluster domain is: cluster.local
          authEndpoint: "http://immuta-secure.immuta.svc.cluster.local:8823"
    
    secure:
      extraEnvVars:
      - name: IMMUTA_DATABASES_IMMUTA_CONNECTIONS_FEATURESTOREDB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: immuta-legacy-secret
            key: IMMUTA_FEATURE_PASSWORD
    
      extraConfig:
        queryEngineRehydration:
          enabled: true
        disableFeatureStore: false
        databases:
          immuta:
            connections:
              featureStoreDb:
                # Each Kubernetes Service has a DNS record associated with it. See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
                # The anatomy of a domain name is as followed:
                #   <service>.<namespace>.svc.<cluster-domain>
                #
                # Where the default cluster domain is: cluster.local
                host: "immuta-legacy-query-engine-service.immuta.svc.cluster.local"
                port: 5432
                ssl: false
                # Query Engine feature user
                # Instead use secure.extraEnvVars[].name[IMMUTA_DATABASES_IMMUTA_CONNECTIONS_FEATURESTOREDB_PASSWORD]
                # password: <immuta-feature-password>
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    dbutils.fs.cp(s3ScratchFile, "file://{}".format(localScratchFile))
    shutil.copy(localScratchFile, localScratchFileCopy)
    with open(localScratchFileCopy, "a") as f:
        f.write("Some appended file content")
    dbutils.fs.cp("file://{}".format(localScratchFileCopy), s3ScratchFile)
    %sh echo "I'm creating a new file in DBFS" > /dbfs/my/newfile.txt
    %python
    with open("/dbfs/my/newfile.txt", "w") as f:
      f.write("I'm creating a new file in DBFS")
    %fs put -f s3://my-bucket/my/scratch/path/mynewfile.txt "I'm creating a new file in S3"
    %scala dbutils.fs.put("s3://my-bucket/my/scratch/path/mynewfile.txt", "I'm creating a new file in S3")
    <property>
       <name>immuta.spark.databricks.scratch.paths</name>
       <value>s3://my-bucket/my/scratch/path</value>
    </property>
    %python
    import os
    import shutil
    
    s3ScratchFile = "s3://some-bucket/path/to/scratch/file"
    localScratchDir = os.environ.get("IMMUTA_LOCAL_SCRATCH_DIR")
    localScratchFile = "{}/myfile.txt".format(localScratchDir)
    localScratchFileCopy = "{}/myfile_copy.txt".format(localScratchDir)

    Temporal's upgrade mechanism utilizes SQL command CREATE EXTENSION when managing database schema changes. However, in cloud-managed PostgreSQL offerings, this command is typically restricted to roles with elevated privileges to protect the database and maintain the stability of the cloud environment.

    To ensure Temporal can successfully manage its schema, an administrator role must be granted temporarily. The role name varies depending on the cloud-managed service:

    • Amazon RDS: rds_superuser

    • Azure Database: azure_pg_admin

    • Google Cloud SQL: cloudsqlsuperuser
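
    For illustration only, the grant described in step 1 below might look like the following on Amazon RDS, run as your RDS master user (the host, user, and role names are placeholders):

    # Illustrative sketch: temporarily grant the RDS administrator role to the Immuta Postgres role.
    psql "host=<postgres-fqdn> port=5432 dbname=postgres user=<rds-master-user> sslmode=require" \
      -c 'GRANT rds_superuser TO <postgres-user>;'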

    Starting in IEHC 2024.3, a Temporal server is included in the chart and requires two databases to store state. You can expand the existing PostgreSQL database in use for Immuta by creating Temporal databases like so:

    1. Grant administrator privileges to the Postgres database role. Upon successfully completing this installation guide, you can optionally revoke this role grant:

    2. Grant the Postgres user role to the current user. Upon successfully completing this installation guide, you can optionally revoke this role grant:

    3. Create the new temporal databases and additional privileges for the Postgres user specified:

    4. Connect to the new Temporal databases and run the following GRANT statements:

    Enabling Temporal

    To enable the Temporal deployment, set the following values. Include the tls settings if you are using a cloud-managed database that requires TLS:

    Helm values

    To improve the experience of using the IEHC, two Helm value changes have been introduced. Apply the changes below before deploying the IEHC.

    PostgreSQL configuration

    IEHC 2024.3.x and newer support global and component-level PostgreSQL connection details. This means you only need to specify the PostgreSQL connection information once in the global scope and apply overrides (if necessary) at the component level.

    If you installed IEHC 2024.2 LTS using our install guides, your immuta-values.yaml file probably looks something like this to configure your PostgreSQL connection for multiple components:

    Now, with PostgreSQL configuration in the global scope, your immuta-values.yaml file can look like this to specify the PostgreSQL connection:

    Feature flags

    Feature flags have also moved from environment variables to Helm values in IEHC 2024.3.x and newer. You can now set feature flags globally, and the IEHC will configure all applications for you. Migrate all feature flags from secure.extraEnvVars to global.featureFlags.

    If you fail to migrate the values from secure.extraEnvVars to global.featureFlags, Helm will display warnings similar to the following:

    Upgrade Immuta

    After updating your immuta-values.yaml file to include any of the changes for the updates above, you can upgrade Immuta with the following command:

    The user performing this setup must have the following permissions:
    • Administrative access to your AWS account

    • Master user on your OpenSearch domain

    Create an IAM role for Immuta audit

    Create an AWS IAM role that Immuta will assume to send audit events to your OpenSearch domain.

    1. Create an AWS IAM role with credentials using the AWS SDK's default credential provider chain. This method requires a configured IAM role for a service account (IRSA). Contact your Immuta representative to customize your deployment and set up an IAM role for a service account that can give Immuta the credentials to set up the integration.

    2. Create the access policy for this role. It should include at least the permissions provided in the example below, but might need additional permissions depending on other local setup factors. You can find the full list of permissions in the AWS docs on the AWS actions, resources, and condition keys page. Note: If you use this example, replace the content in angle brackets with your region, AWS account ID, and domain.

    3. Name the role and save.

    Set up domain access policy

    Amazon OpenSearch Service domains are controlled by a resource-based access policy that determines which IAM users or roles can connect to the domain and perform operations.

    1. Follow AWS documentation for updating a resource-based policy. Immuta supports two options for domain access:

      1. Only use fine-grained access control: If you select this option, no additional actions are required for this step.

      2. Configure domain access policy:

        1. Edit the access policy to allow the role OpenSearch access and management operations. Grant your AWS IAM role es:* access through a resource-based policy. Note: If you use this example, replace the content in angle brackets with your region, AWS account ID, and domain.

    2. Save your changes.

    Configure OpenSearch permissions

    In the OpenSearch console, you must create or edit a role that grants Immuta’s IAM role access to the appropriate cluster and index permissions.

    1. Follow AWS documentation for creating a new OpenSearch role for your audit IAM role.

    2. Grant the following permissions to the new role.

      1. Cluster-level permissions:

      2. Index-level permissions for * index:

    Map the IAM role to the OpenSearch role

    Once the IAM role is created and OpenSearch permissions are updated, allow the role to assume permissions inside OpenSearch.

    1. Follow AWS documentation for updating a backend role in OpenSearch.

    2. Select the OpenSearch role created for Immuta audit access.

    3. Under Backend roles, add the ARN of your IAM role:

    4. Save your changes.

    Optionally create a trust relationship

    If Immuta is deployed in a different AWS account than OpenSearch, you must configure a trust relationship between the Immuta role and an OpenSearch role. Follow the AWS documentation for creating IAM policies.

    Once configured, set SEARCH_AWS_ROLE_ARN in the immuta-values.yaml file to the ARN of the role for Immuta to assume.

    After these steps are complete, your audit role should have the required permissions, and you can complete the Immuta install using the IAM role.

    This guide utilizes the skopeo command to copy container images; ensure it's installed before proceeding. Refer to the skopeo documentation for further assistance.

    Checklist

    • Skopeo

    • Helm

    Download artifacts

    This section demonstrates how to download the Helm chart and container images to your local machine. These artifacts will be packaged and transferred to the air-gapped environment later.

    Upon completion of these steps, the saved artifacts can be found in local directory offline-kit.

    1. Create a directory named offline-kit.

    2. Download the Helm chart into directory offline-kit.

    3. Extract file DIGESTS.md from the Helm chart archive.

    4. Open file ./offline-kit/DIGESTS.md. This file includes the name and digest of every container image referenced by the Helm chart.

    5. Download each image listed in DIGESTS.md using skopeo. Each image will be saved to the offline-kit directory with the filename <name>-<tag>.tar.

    Transfer artifacts

    This section demonstrates how to push the previously archived container images to a private registry that's accessible from within your air-gapped environment.

    The exact process for transferring files into an air-gapped network can vary significantly depending on your specific security policies and infrastructure.

    1. Transfer directory offline-kit (created in the previous section) onto a machine that's within your air-gapped environment.

    2. Push each image to your private registry using skopeo.

    Chart installation

    A Helm chart can be referenced from a local file path instead of remotely. When referring to the documentation, substitute any references to oci://ocir.immuta.com/stable/immuta-enterprise with the path to the downloaded chart archive (.tgz) file.
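
    For example, a minimal install command referencing the local chart archive might look like the following sketch (the filename depends on the chart version you pulled; the release name and namespace are placeholders):

    helm install <release-name> ./offline-kit/immuta-enterprise-2025.1.9.tgz \
      --values immuta-values.yaml \
      --namespace immuta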

    Edit the immuta-values.yaml to reference the private container registry and images.

    Navigate to the App Settings page and click Integration Settings.

  • Uncheck the Enable Unity Catalog checkbox.

  • Click Save.

  • 1

    Connect your technology

    These guides provide instructions for getting your data set up in Immuta.

    1. Configure your Databricks Spark integration.

    2. Register Databricks securable objects in Immuta as data sources.

    3. Organize your data sources into domains and assign domain permissions to accountable teams (recommended): Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and identification.

    2

    Register your users

    These guides provide instructions on setting up your users in Immuta.

    1. Integrate an IAM with Immuta: Connect the IAM your organization already uses and allow Immuta to register your users for you.

    2. Map external user IDs from Databricks to Immuta: Ensure the user IDs in Immuta, Databricks, and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta for use in policies.

    1. Connect an external catalog: Connect the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

    2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain data patterns.

    4

    Protect and monitor data access

    These guides provide instructions on authoring policies and auditing data access.

    • Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

    • Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and identification applied tags you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    Immuta cluster policy
  • Register Azure Synapse Analytics data sources: This will register your data objects into Immuta and allow you to start dictating access through global policies.

  • Organize your data sources into domains and assign domain permissions to accountable teams: Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies.

  • 2

    Register your users

    These guides provide instructions on getting your users set up in Immuta.

    1. Connect an IAM: Bring the IAM your organization already uses and allow Immuta to register your users for you.

    2. Map external user IDs from Azure Synapse Analytics to Immuta: Ensure the user IDs in Immuta, Azure Synapse Analytics, and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta.

    1. Connect an external catalog: Bring the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

    2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain column names.

    4

    Start using the Governance app

    These guides provide instructions on using the Governance app for the first time.

    1. Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

    2. Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog tags you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    3. Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from policy changes and tagging updates.

    reference guide
    Configure your Azure Synapse Analytics integration

    Databricks

    Immuta offers two integrations for Databricks:

    • Databricks Unity Catalog integration: This integration supports working with database objects registered in Unity Catalog.

    • Databricks Spark integration: This integration supports working with database objects registered in the legacy Hive metastore.

    Which integration should you use?

    To determine which integration you should use, evaluate the following elements:

    • Cluster runtime

      • Databricks Runtime 11.3 and newer: See the list below to determine which integration is supported for your data's location.

    • Location of data: Where is your data?

    Metastore magic

    Databricks metastore magic allows you to migrate your data from the Databricks legacy Hive metastore to the Unity Catalog metastore while protecting data and maintaining your current processes in a single Immuta instance.

    Databricks metastore magic is for organizations that intend to use the Databricks Unity Catalog integration but must still protect tables in the Hive metastore until they can migrate all of their data to Unity Catalog.

    Requirement

    Unity Catalog support is enabled in Immuta.

    Databricks metastores and Immuta policy enforcement

    Databricks has two built-in metastores that contain metadata about your tables, views, and storage credentials:

    • Legacy Hive metastore: Created at the workspace level. This metastore contains metadata of the registered securables in that workspace available to query.

    • Unity Catalog metastore: Created at the account level and is attached to one or more Databricks workspaces. This metastore contains metadata of the registered securables available to query. All clusters on that workspace use the configured metastore and all workspaces that are configured to use a single metastore share those securables.

    Databricks allows you to use the legacy Hive metastore and the Unity Catalog metastore simultaneously. However, Unity Catalog does not support controls on the Hive metastore, so you must attach a Unity Catalog metastore to your workspace and move existing databases and tables to the attached Unity Catalog metastore to use the governance capabilities of Unity Catalog.

    Immuta's Databricks Spark integration and Unity Catalog integration enforce access controls on the Hive and Unity Catalog metastores, respectively. However, because these metastores have two distinct security models, users were discouraged from using both in a single Immuta instance before metastore magic; the Databricks Spark integration and Unity Catalog integration were unaware of each other, so using both concurrently caused undefined behavior.

    Databricks metastore magic solution

    Metastore magic reconciles the distinct security models of the legacy Hive metastore and the Unity Catalog metastore, allowing you to use multiple metastores (specifically, the Hive metastore or AWS Glue Data Catalog alongside Unity Catalog metastores) within a Databricks workspace and a single Immuta instance and keep policies enforced on all your tables as you migrate them. The diagram below shows Immuta enforcing policies on registered tables across workspaces.

    In clusters A and D, Immuta enforces policies on data sources in each workspace's Hive metastore and in the Unity Catalog metastore shared by those workspaces. In clusters B, C, and E (which don't have Unity Catalog enabled in Databricks), Immuta enforces policies on data sources in the Hive metastores for each workspace.

    Enforce policies as you migrate

    With metastore magic, the Databricks Spark integration enforces policies only on data in the Hive metastore, while the Unity Catalog integration enforces policies on tables in the Unity Catalog metastore.

    To enforce plugin-based policies on Hive metastore tables and Unity Catalog native controls on Unity Catalog metastore tables, enable the Databricks Spark integration and the Databricks Unity Catalog integration. Note that some Immuta policies are not supported in the Databricks Unity Catalog integration. See the Databricks Unity Catalog integration reference guide for details.

    Enforcing policies on Databricks SQL

    Databricks SQL cannot run the Databricks Spark plugin to protect tables, so Hive metastore data sources will not be policy enforced in Databricks SQL.

    To enforce policies on data sources in Databricks SQL, use Hive metastore table access controls to manually lock down Hive metastore data sources and the Databricks Unity Catalog integration to protect tables in the Unity Catalog metastore. Table access control is enabled by default on SQL warehouses, and any Databricks cluster without the Immuta plugin must have table access control enabled.

    Ephemeral Overrides

    In the context of the Databricks Spark integration, Immuta uses the term ephemeral to describe data sources where the associated compute resources can vary over time. This means that the compute bound to these data sources is not fixed and can change. All Databricks data sources in Immuta are ephemeral.

    Ephemeral overrides are specific to each data source and user. They effectively bind cluster compute resources to a data source for a given user. Immuta uses these overrides to determine which cluster compute to use when connecting to Databricks for various maintenance operations.

    The operations that use the ephemeral overrides include

    • Visibility checks on the data source for a particular user. These checks assess how to apply row-level policies for specific users.

    • Stats collection triggered by a specific user.

    • Validating a custom WHERE clause policy against a data source. When owners or governors create custom WHERE clause policies, Immuta uses compute resources to validate the SQL in the policy. In this case, the ephemeral overrides for the user writing the policy are used to contact a cluster for SQL validation.

    • High cardinality column detection. Certain advanced policy types (e.g., minimization) in Immuta require a high cardinality column, and that column is computed on data source creation. It can be recomputed on demand and, if so, will use the ephemeral overrides for the user requesting computation.

    Triggering an ephemeral override request

    An ephemeral override request can be triggered when a user queries the securable corresponding to a data source in a Databricks cluster with the Spark plug-in configured. The actual triggering of this request depends on the configuration settings.

    Ephemeral overrides can also be set for a data source in the Immuta UI by navigating to the data source page, clicking on the data source actions button, and selecting Ephemeral overrides from the dropdown menu.

    Ephemeral override requests made from a cluster for data sources and users where ephemeral overrides were set in the UI will not be successful.

    If ephemeral overrides are never set (either through the user interface or the cluster configuration), the system will continue to use the connection details directly associated with the data source, which are set during data source registration.

    Configuring overrides in Immuta-enabled clusters

    Ephemeral overrides can be problematic in environments that have a dedicated cluster to handle maintenance activities, since ephemeral overrides can cause these operations to execute on a different cluster than the dedicated one.

    To reduce the risk that a user has overrides set to a cluster (or multiple clusters) that aren't currently up, complete one of the following actions:

    • Direct all clusters' HTTP paths for overrides to a cluster dedicated for metadata queries using the IMMUTA_EPHEMERAL_HOST_OVERRIDE_HTTPPATH Spark environment variable.

    • Disable ephemeral overrides completely by setting the IMMUTA_EPHEMERAL_HOST_OVERRIDE Spark environment variable to false.
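
    As an illustration, these settings go in the cluster's Spark environment variables section; the values below are placeholders, not recommendations:

    # Option A (illustrative): route override requests to a dedicated metadata cluster's HTTP path
    IMMUTA_EPHEMERAL_HOST_OVERRIDE_HTTPPATH=<dedicated-cluster-http-path>
    # Option B (illustrative): disable ephemeral overrides entirely
    IMMUTA_EPHEMERAL_HOST_OVERRIDE=false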

    Ephemeral overrides best practices

    1. Disable ephemeral overrides for clusters when using multiple workspaces and dedicate a single cluster to serve queries from Immuta in a single workspace.

    2. If you use multiple E2 workspaces without disabling ephemeral overrides, avoid applying the where user row-level policy to data sources.

    Databricks Spark

    This integration enforces policies on Databricks securables registered in the legacy Hive metastore. Once these securables are registered as Immuta data sources, users can query policy-enforced data on Databricks clusters.

    The guides in this section outline how to integrate Databricks Spark with Immuta.

    Getting started

    This getting started guide outlines how to integrate Databricks with Immuta.

    How-to guides

    • Manually update your Databricks cluster: Manually update your cluster to reflect changes in the Immuta init script or cluster policies.

    • Install a trusted library: Register a Databricks library with Immuta as a trusted library to avoid Immuta Security Manager errors when using third-party libraries.

    • Project UDFs cache settings: Raise the caching on-cluster and lower the cache timeouts for the Immuta web service to allow use of project UDFs in Spark jobs.

    Reference guides

    • Databricks Spark integration configuration: This guide describes the design and components of the integration.

    • Security and compliance: This guide provides an overview of the Immuta features that provide security for your users and Databricks clusters and that allow you to prove compliance and monitor for anomalies.

    • Registering and protecting data: This guide provides an overview of registering Databricks securables and protecting them with Immuta policies.

    • Accessing data: This guide provides an overview of how Databricks users access data registered in Immuta.

    Starburst (Trino)

    In this integration, Immuta policies are translated into Starburst rules and permissions and applied directly to tables within users’ existing catalogs.

    Getting started

    This guide outlines how to integrate Starburst with Immuta.

    How-to guides

    • Starburst (Trino) integration configuration guide: Configure the integration in Immuta.

    • Map read and write access policies to Starburst (Trino) privileges: Configure how read and write access subscription policies translate to Starburst (Trino) privileges and apply to Starburst (Trino) data sources.

    Reference guide

    Starburst (Trino) integration reference guide: This guide describes the design and components of the integration.

    Snowflake Table Grants Migration

    To migrate from the private preview version of table grants (available before September 2022) to the GA version, complete the steps below.

    1. Navigate to the App Settings page.

    2. Click Integration Settings in the left panel, and scroll to the Global Integrations Settings section.

    3. Uncheck the Snowflake Table Grants checkbox to disable the feature.

    4. Click Save. Wait for about 1 minute per 1000 users. This gives time for Immuta to drop all the previously created user roles.

    5. Use the Enable Snowflake table grants tutorial to re-enable the feature.

    Requirements

    Immuta comprises three core services: Secure, Discover, and Detect. These services rely on PostgreSQL and Elasticsearch to store their state, on a caching layer, and on Temporal for job execution. The illustration below shows the relationships among these services.

    The Immuta Enterprise Helm chart (IEHC) does not include the deployment of PostgreSQL or Elasticsearch, so you must deploy them separately.

    Although Immuta recommends using Elasticsearch because it supports all audit features, you can deploy Immuta without Elasticsearch. The table below outlines the Immuta features supported with and without Elasticsearch and the dependencies you must deploy and manage yourself.

    Immuta with Elasticsearch
    Immuta without Elasticsearch

    Production Best Practices

    This guide highlights best practices when deploying Immuta in a production environment.

    Kubernetes namespace

    The following section(s) presume the Immuta Enterprise Helm chart was deployed into namespace immuta and that the current namespace is immuta.

    Configure Azure Synapse Analytics Integration

    This page provides a tutorial for enabling the Azure Synapse Analytics integration on the Immuta app settings page. To configure this integration via the Immuta API, see the Integrations API getting started guide.

    For an overview of the integration, see the Azure Synapse Analytics overview documentation.

    Requirement

    A running Dedicated SQL pool is required.

    Edit or Remove Your Snowflake Integration

    To edit or remove a Snowflake integration, you have two options:

    • Automatic: Grant Immuta one-time use of credentials with the following privileges to automatically edit or remove the integration:

      • CREATE DATABASE ON ACCOUNT WITH GRANT OPTION

    Getting Started with Snowflake

    The how-to guides linked on this page illustrate how to integrate Snowflake with Immuta. See the reference guide for information about the Snowflake integration.

    Requirements

    • Snowflake enterprise edition

    • Access to a Snowflake account that can create a Snowflake user

    1

    Snowflake Low Row Access Policy Mode

    The Snowflake low row access policy mode improves query performance in Immuta's Snowflake integration by decreasing the number of Snowflake row access policies Immuta creates and by using table grants to manage user access.

    Immuta manages access to Snowflake tables by administering Snowflake row access policies and column masking policies on those tables, allowing users to query them directly in Snowflake while policies are enforced.

    Without Snowflake low row access policy mode enabled, row access policies are created and administered by Immuta in the following scenarios:

    • Table grants are disabled and a subscription policy that does not automatically subscribe everyone to the data source is applied. Immuta administers Snowflake row access policies to filter out all the rows to restrict access to the entire table when the user doesn't have privileges to query it. However, if table grants are disabled and a subscription policy is applied that grants everyone access to the data source automatically, Immuta does not create a row access policy in Snowflake. See the subscription policies page for details about these policy types.

    Warehouse Sizing Recommendations

    The warehouse you select when configuring the Snowflake integration uses compute resources to set up the integration, register data sources, orchestrate policies, and run jobs like identification. Snowflake credit charges are based on the size of the warehouse and the amount of time it is active, not the number of queries run.

    This document prescribes how and when to adjust the size and scale of clusters for your warehouse to manage workloads so that you can use Snowflake compute resources most cost-effectively.

    In general, increase the size of and number of clusters for the warehouse to handle heavy workloads and multiple queries. Workloads are typically lighter after data sources are onboarded and policies are established in Immuta, so compute resources can be reduced after those workloads complete.

    Integration and data source registration warehouse use

    Project UDFs Cache Settings

    This page outlines the configuration for setting up project UDFs, which allow users to set their current project in Immuta through Spark. For details about the specific functions available and how to use them, see the Use Project UDFs (Databricks) page.

    Use project UDFs in Databricks Spark

    Currently, caches are not all invalidated outside of Databricks because Immuta caches information pertaining to a user's current project. Consequently, this feature should only be used in Databricks.

    1. Lower the web service cache timeout in Immuta:

    -- 1. Grant administrator privileges to the Postgres database role
    GRANT <admin-role> TO <postgres-user>;
    -- 2. Grant the Postgres user role to the current user
    GRANT <postgres-user> TO CURRENT_USER;
    -- 3. Create the Temporal databases and grant privileges to the Postgres user
    CREATE DATABASE temporal WITH OWNER <postgres-user>;
    CREATE DATABASE temporal_visibility WITH OWNER <postgres-user>;
    
    GRANT ALL PRIVILEGES ON DATABASE temporal TO <postgres-user>;
    GRANT ALL PRIVILEGES ON DATABASE temporal_visibility TO <postgres-user>;
    -- 4. Connect to each Temporal database and grant the required schema privileges
    \c temporal
    GRANT CREATE ON SCHEMA public TO <postgres-user>;
    
    \c temporal_visibility
    GRANT CREATE ON SCHEMA public TO <postgres-user>;
    CREATE EXTENSION btree_gin;
    temporal:
      enabled: true
      schema:
        createDatabase:
          enabled: false
      server:
        config:
          persistence:
            default:
              sql:
                database: temporal
                tls:
                  # Set to true if Postgres Database uses TLS
                  enabled: true
            visibility:
              sql:
                database: temporal_visibility
                tls:
                  # Set to true if Postgres Database uses TLS
                  enabled: true
    global:
      imageRepositoryMap:
        immuta/immuta-service: stable/immuta-service
        immuta/immuta-db: stable/immuta-db
        immuta/audit-service: stable/audit-service
        immuta/audit-export-cronjob: stable/audit-export-cronjob
        immuta/classify-service: stable/classify-service
        immuta/cache: stable/cache
    #...
    audit:
      config:
        databaseConnectionString: postgres://immuta:<postgres-password>@<postgres-fqdn>:5432/immuta?schema=audit
        elasticsearchEndpoint: <elasticsearch-endpoint>
        elasticsearchUsername: <elasticsearch-username>
        elasticsearchPassword: <elasticsearch-password>
    #...
    secure:
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        database: immuta
        username: immuta
        password: <postgres-password>
        ssl: true
    #...
    global:
    #...
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        username: immuta
        password: <postgres-password>
    #...
    audit:
      postgresql:
        database: immuta
      config:
        elasticsearchEndpoint: <elasticsearch-endpoint>
        elasticsearchUsername: <elasticsearch-username>
        elasticsearchPassword: <elasticsearch-password>
    #...
    secure:
      postgresql:
        database: immuta
        ssl: true
    # Feature Flags may now be set as global boolean values
    global:
    #...
      featureFlags:
        exampleFlag: false
    
    # Remove flags being set via extraEnvVars
    #
    # secure:
    #  extraEnvVars:
    #    - name: exampleFlag
    #      value: "false"
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    cluster:monitor/health
    indices:data/write/bulk*
    indices:data/write/bulk
    indices:monitor/settings/get
    indices:data/read/scroll
    indices:data/read/scroll/clear
    indices:data/read/search
    indices:admin/exists
    indices:admin/create
    indices:admin/delete
    indices:admin/settings/update
    indices:admin/get
    indices:data/write/delete/byquery
    indices:data/write/index
    indices:admin/mapping/put
    indices:data/write/bulk
    indices:data/write/bulk*
    indices:monitor/settings/get
    indices:data/write/delete
    indices:data/read/scroll
    indices:data/read/scroll/clear
    indices:admin/refresh
    indices:admin/refresh*
    arn:aws:iam::<Your AWS Account ID>:role/ImmutaAuditRole
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "es:ESHttpDelete",
                    "es:ESHttpGet",
                    "es:ESHttpHead",
                    "es:ESHttpPatch",
                    "es:ESHttpPost",
                    "es:ESHttpPut",
                    "es:DescribeDomains",
                    "es:ListDomainNames"
                ],
                "Resource": "arn:aws:es:<region>:<Your AWS Account ID>:domain/<your-domain>/*"
            }
        ]
    }
    mkdir ./offline-kit
    helm pull oci://ocir.immuta.com/stable/immuta-enterprise --destination ./offline-kit --version 2025.1.9
    tar --extract --gzip --strip-components=1 --directory=./offline-kit --file=./immuta-enterprise-*.tgz immuta-enterprise/DIGESTS.md
    skopeo copy docker-archive:offline-kit/<name>-<tag>.tar docker://<private-registry-fqdn>/immuta/<name>:<tag>
  • Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from user queries, policy changes, and tagging updates.

  • Organize your data sources into domains and assign domain permissions to accountable teams (recommended)
    Integrate an IAM with Immuta
    Map external user IDs from Databricks to Immuta
    Connect an external catalog
    Run identification
    Author a global subscription policy
    Author a global data policy
    Configure audit
    configuration settings
    data source registration
    IMMUTA_EPHEMERAL_HOST_OVERRIDE_HTTPPATH Spark environment variable
    IMMUTA_EPHEMERAL_HOST_OVERRIDE Spark environment variable

    Run R and Scala spark-submit jobs on Databricks: Run R and Scala spark-submit jobs on your Databricks cluster.

  • DBFS access: Access DBFS in Databricks for non-sensitive data.

  • Troubleshooting: Resolve errors in the Databricks Spark configuration.

  • Configure a Databricks Spark integration
    Manually update your Databricks cluster
    Install a trusted library
    Project UDFs cache settings
    Databricks Spark integration configuration
    Security and compliance
    Registering and protecting data
    Accessing data
    Starburst (Trino) integration configuration guide
    Map read and write access policies to Starburst (Trino) privileges
    Starburst (Trino) integration reference guide
    Enable Snowflake table grants tutorial
    Add an Azure Synapse Analytics integration
    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click the +Add Integration button and select Azure Synapse Analytics from the dropdown menu.

    4. Complete the Host, Port, Immuta Database, and Immuta Schema fields.

    5. Opt to check the Enable Impersonation box and customize the Impersonation Role name as needed. This will allow users to natively impersonate another user.

    6. Opt to update the User Profile Delimiters. This will be necessary if any of the provided symbols are used in user profile information.

    Select your configuration method

    You have two options for configuring your Azure Synapse Analytics environment:

    • Automatic setup: Grant Immuta one-time use of credentials to automatically configure your environment and the integration.

    • Manual setup: Run the Immuta script in your Azure Synapse Analytics environment yourself to configure the integration.

    Automatic setup

    Enter the username and password in the Privileged User Credentials section.

    Manual setup

    1. Select Manual.

    2. Download, fill out the appropriate fields, and run the bootstrap master script and bootstrap script linked in the Setup section.

    3. Enter the username and password in the Immuta System Account Credentials section. The username and password provided must be the credentials that were set in the bootstrap master script when you created the user.

    Save the configuration

    Click Save.

    Register data

    Register Azure Synapse Analytics data in Immuta.

    Edit an Azure Synapse Analytics integration

    1. Click the App Settings icon in the left sidebar.

    2. Navigate to the Integrations tab and click the down arrow next to the Azure Synapse Analytics Integration.

    3. Edit the field you want to change. Note that any shadowed field is not editable, and the integration must be disabled and re-installed to change it.

    4. Enter Username and Password.

    5. Click Save.

    Immuta requires temporary, one-time use of credentials with specific permissions.

    When performing edits to an integration, Immuta requires temporary, one-time use of credentials from a Superuser or a user with the Manage GRANTS permission.

    Alternatively, you can download the Edit Script from your Azure Synapse Analytics configuration on the Immuta app settings page and run it in Azure Synapse Analytics.

    Remove an Azure Synapse Analytics integration

    1. Click the App Settings icon in the left sidebar.

    2. Navigate to the Integrations tab and click the down arrow next to the Azure Synapse Analytics Integration.

    3. Click the checkbox to disable the integration.

    4. Enter the username and password that were used to initially configure the integration.

    5. Click Save.

    Integrations API getting started guide
    Azure Synapse Analytics overview

    Purpose-based policy is applied to a data source. A row access policy filters out all the rows of the table if users aren't acting under the purpose specified in the policy when they query the table.

  • Row-level security policy is applied to a data source. A row access policy filters out rows querying users don't have access to.

  • User impersonation is enabled. A row access policy is created for every Snowflake table registered in Immuta.

  • Deprecation notice

    Support for using the Snowflake integration with low row access policy mode disabled has been deprecated. You must enable this feature and table grants for your integration to continue working in future releases. See the Deprecations page for EOL dates.

    Reducing row access policies

    Snowflake low row access policy mode is enabled by default to reduce the number of row access policies Immuta creates and improve query performance. Snowflake low row access policy mode requires

    • table grants to be enabled.

    • user impersonation to be disabled. User impersonation diminishes the performance of interactive queries because of the number of row access policies Immuta creates when it's enabled.

    Requirements

    • Snowflake integration enabled

    • Snowflake table grants enabled

    Project-scoped purpose exceptions for Snowflake with low row access policy mode enabled

    Project-scoped purpose exceptions for Snowflake integrations allow you to apply purpose-based policies to Snowflake data sources in a project. As a result, users can only access that data when they are working within that specific project.

    Masked joins for Snowflake with low row access policy mode enabled

    This feature allows masked columns to be joined across data sources that belong to the same project. When data sources do not belong to a project, Immuta uses a unique salt per data source for hashing to prevent masked values from being joined. (See the Why use masked joins? guide for an explanation of that behavior.) However, once you add Snowflake data sources to a project and enable masked joins, Immuta uses a consistent salt across all the data sources in that project to allow the join.

    For more information about masked joins and enabling them for your project, see the Masked joins section of documentation.

    Limitations and considerations

    • Project workspaces are not compatible with this feature.

    • Impersonation is not supported when the Snowflake low row access policy mode is enabled.

    Snowflake row access policies
    Snowflake row access policies
    column masking policies
    Table grants
    subscription policies page
  • Click the App Settings icon and scroll to the HDFS Cache Settings section.

  • Lower the Cache TTL of HDFS user names (ms) to 0.

  • Click Save.

  • Raise the cache timeout on your Databricks cluster: In the Spark environment variables section, set the IMMUTA_CURRENT_PROJECT_CACHE_TIMEOUT_SECONDS and IMMUTA_PROJECT_CACHE_TIMEOUT_SECONDS to high values (like 10000).

    Note: These caches will be invalidated on cluster when a user calls immuta.set_current_project, so they can effectively be cached permanently on cluster to avoid periodically reaching out to the web service.
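
    For example, the Spark environment variables section of the cluster configuration might include entries like the following (the variable names come from the step above; the values are illustrative):

    IMMUTA_CURRENT_PROJECT_CACHE_TIMEOUT_SECONDS=10000
    IMMUTA_PROJECT_CACHE_TIMEOUT_SECONDS=10000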

  • Use Project UDFs (Databricks) page
    Legacy Hive metastore: Databricks recommends that you migrate all data from the legacy Hive metastore to Unity Catalog. However, when this migration is not possible, use the Databricks Spark integration to protect securables registered in the Hive metastore.
  • Unity Catalog: To protect securables registered in the Unity Catalog metastore, use the Databricks Unity Catalog integration.

  • Legacy Hive metastore and Unity Catalog: If you need to work with database objects registered in both the legacy Hive metastore and in Unity Catalog, metastore magic allows you to use both integrations.

  • Databricks Unity Catalog integration
    AWS Glue Data Catalog
    Databricks Spark integration
    Databricks Unity Catalog integration reference guide
    Hive metastore table access controls

    How-to Guides

    resource-based policy
    skopeo
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::<Your AWS Account ID>:role/ImmutaAuditRole"
                },
                "Action": "es:*",
                "Resource": "arn:aws:es:<region>:<Your AWS Account ID>:domain/<your-domain>/*"
            }
        ]
    }
    read -r -p "Enter the container image to download (e.g., docker.io/hello-world:latest):" image && \
    skopeo copy docker://"$image" docker-archive:"offline-kit/$(sed 's#.*/##; s#:#-#g' <<< "$image").tar"

    • Immuta Detect: ✅ with Elasticsearch / ❌ without Elasticsearch

    • Audit of Immuta and data platform events: ✅ with Elasticsearch / ❌ without Elasticsearch

    • Legacy audit: ❌ with Elasticsearch / ❌ without Elasticsearch

    • Immuta Monitors: ✅ with Elasticsearch / ❌ without Elasticsearch

    • Identification: ✅ with Elasticsearch / ✅ without Elasticsearch

    For information about legacy databases and services no longer enabled in the recommended deployment of Immuta, see the Legacy databases section.

    Version requirements

    Kubernetes versions

    • Kubernetes 1.29 - 1.32

    Metadata database (PostgreSQL)

    PostgreSQL incompatibilities

    Immuta is not compatible with PostgreSQL abstraction layers, such as Amazon Aurora.

    • PostgreSQL 15.0 or newer

    • The pgcrypto and btree_gin extensions must be enabled

    Elasticsearch

    • Elasticsearch v7 API or newer

    • AWS OpenSearch Service compatible with Elasticsearch v7 API or newer

      • AWS OpenSearch Serverless is not supported

    OpenSearch user

    The user provided during the install must have the following permissions:

    • Cluster permissions:

      • cluster:monitor/health

      • indices:data/write/bulk*

      • indices:data/write/bulk

    • Index permissions:

      • indices:data/read/search

      • indices:admin/exists

      • indices:admin/create

      • indices:admin/delete

    Follow OpenSearch documentation to create the user and add permissions, or see the OpenSearch authentication guides.

    Cache (Redis/Memcached)

    Built-in cache

    The IEHC manages its own Memcached deployment inside the cluster. The key-value cache can optionally be externalized post installation.

    • Redis 7.0 or newer

    • Memcached 1.6 or newer

    Temporal

    Built-in Temporal server

    The IEHC deploys a Temporal server and its requisite components. However, you may choose to use your own Temporal instance.

    • Temporal 1.24.2 or newer

    Infrastructure recommendations

    Kubernetes distribution
    Ingress
    External metadata database
    External Elasticsearch

    Amazon Elastic Kubernetes Service (EKS)

    AWS Load Balancer Controller

    Azure Kubernetes Service (AKS)

    Azure Application Gateway Ingress Controller

    Google Kubernetes Engine (GKE)

    GKE Ingress Controller

    Legacy databases

    Some legacy databases are no longer available when deploying Immuta using the recommended configuration of the IEHC. See the Enable the legacy query engine guide to enable support for these databases.

    Dependencies

    Database sizing recommendations

    Provisioning an appropriately resourced PostgreSQL database for Immuta is critical to application performance. The recommendations below are based on the number of data sources registered multiplied (*) by the number of users on the deployment:

    • Small (<100k data sources * users): 2 CPU, 8 GB memory, 100 GB SSD storage

    • Normal: 4 CPU, 16 GB memory, 100 GB SSD storage

    • Large (>1M data sources * users): 8 CPU, 32 GB memory

    Elasticsearch/OpenSearch sizing recommendations

    This recommendation assumes approximately 1 million events per day with a 90-day data retention policy:

    • 2 nodes

    • 2 CPUs/node

    • 4GB RAM/node

    • Storage 100GB SSD/node

    Helm values

    Back up or source control your immuta-values.yaml Helm values file.

    Kubernetes resource requests and limits

    Assign memory resource limits to pods.

    Edit Helm values

    Edit immuta-values.yaml to include the following recommended resource requests and limits for most Immuta deployments.

    Increase replica count to 3 on web and backgroundWorker for large deployments.

    Kubernetes secrets

    Use Kubernetes secrets in the immuta-values.yaml file instead of passwords and tokens. The following section demonstrates how to create a secret and reference it in the Helm values file. For guidance on updating these credentials based on your specific security policies, refer to the Rotating credentials guide.

    Create secret

    1. Create a file named secret-data.env with the following content.

    2. Create secret named immuta-secret from file secret-data.env.

    3. Delete file secret-data.env, as it's no longer needed.
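
    For example, steps 2 and 3 might look like the following, assuming the immuta namespace used elsewhere in this guide:

    # Create the Kubernetes secret from the env file, then remove the plaintext file.
    kubectl create secret generic immuta-secret \
      --namespace immuta \
      --from-env-file=secret-data.env
    rm secret-data.env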

    Edit Helm values

    1. Edit immuta-values.yaml to include the following Helm values.

    2. Remove any sensitive key-value pairs from the immuta-values.yaml Helm values that were made redundant after the secret was created.

    Apply Helm values

    Perform a Helm upgrade to apply the changes made to immuta-values.yaml.

    CREATE ROLE ON ACCOUNT WITH GRANT OPTION
  • CREATE USER ON ACCOUNT WITH GRANT OPTION

  • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

  • Manual: Run the Immuta script in your Snowflake environment as a user with the following privileges to edit or remove the integration:

    • CREATE DATABASE ON ACCOUNT WITH GRANT OPTION

    • CREATE ROLE ON ACCOUNT WITH GRANT OPTION

    • CREATE USER ON ACCOUNT WITH GRANT OPTION

    • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

    • APPLY MASKING POLICY ON ACCOUNT WITH GRANT OPTION

    • APPLY ROW ACCESS POLICY ON ACCOUNT WITH GRANT OPTION

  • Edit a Snowflake integration

    Select one of the following options for editing your integration:

    • Automatic: Grant Immuta one-time use of credentials to automatically edit the integration.

    • Manual: Run the Immuta script in your Snowflake environment yourself to edit the integration.

    Automatic edit

    1. Click the App Settings icon in the navigation menu.

    2. Click the Integrations tab and click the down arrow next to the Snowflake integration.

    3. Edit the field you want to change or check a checkbox of a feature you would like to enable. Note that any shadowed field is not editable, and the integration must be disabled and re-installed to change it.

    4. From the Select Authentication Method Dropdown, select either Username and Password or Key Pair Authentication:

      • Username and Password option: Complete the Username, Password, and Role fields.

      • Key Pair Authentication option:

        1. Complete the Username field.

    5. Click Save.

    Manual edit

    1. Click the App Settings icon in the navigation menu.

    2. Click the Integrations tab and click the down arrow next to the Snowflake integration.

    3. Edit the field you want to change or check a checkbox of a feature you would like to enable. Note that any shadowed field is not editable, and the integration must be disabled and re-installed to change it.

    4. Click edit script to download the script, and then run it in Snowflake.

    5. Click Save.

    Remove a Snowflake integration

    Select one of the following options for deleting your integration:

    • Automatic: Grant Immuta one-time use of credentials to automatically remove the integration and Immuta-managed resources from your Snowflake environment.

    • Manual: Run the Immuta script in your Snowflake environment yourself to remove Immuta-managed resources and policies from Snowflake.

    Automatic removal

    1. Click the App Settings icon in the navigation menu.

    2. Click the Integrations tab and click the down arrow next to the Snowflake integration.

    3. Click the checkbox to disable the integration.

    4. Enter the Username, Password, and Role that were entered when the integration was configured.

    5. Click Save.

    Manual removal

    1. Click the App Settings icon in the navigation menu.

    2. Click the Integrations tab and click the down arrow next to the Snowflake integration.

    3. Click the checkbox to disable the integration.

    4. Click cleanup script to download the script.

    5. Click Save.

    6. Run the cleanup script in Snowflake.

    edit
    remove
    Connect your technology

    These guides provide instructions on getting your data set up in Immuta.

    1. Register your Snowflake connection: Using a single setup process, connect Snowflake to Immuta. This will register your data objects into Immuta and allow you to start dictating access through global policies.

    2. Organize your data sources into domains and assign domain permissions to accountable teams: Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and identification.

    Connections are generally available on all 2025.1+ tenants. If you do not have connections enabled on your tenant, configure Snowflake and register data sources using the legacy workflow.

    2

    Register your users

    These guides provide instructions on getting your users set up in Immuta.

    1. Connect an IAM: Bring the IAM your organization already uses and allow Immuta to register your users for you.

    2. Map external user IDs from Snowflake to Immuta: Ensure the user IDs in Immuta, Snowflake, and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta.

    1. Connect an external catalog: Bring the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

    2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain data patterns.

    4

    Start using the Governance app

    These guides provide instructions on using the Governance app for the first time.

    1. Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

    2. Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and identification applied tags you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    3. Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from user queries, policy changes, and tagging updates.

    reference guide
    The Snowflake integration uses warehouse compute resources to sync policies created in Immuta to the Snowflake objects registered as data sources and, if enabled, to run identification and schema monitoring. Follow the guidelines below to adjust the warehouse size and scale according to your needs.
    • Increase the size of and number of clusters for the warehouse during large policy syncs, updates, and changes.

    • Enable auto-suspend and auto-resume to optimize resource use in Snowflake. In the Snowflake UI, the lowest auto-suspend time setting is 5 minutes. However, through a SQL query, you can set auto_suspend to 61 seconds (since the minimum uptime for a warehouse is 60 seconds). For example:

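      A sketch of such a statement (the warehouse name is a placeholder; shown here run through the SnowSQL CLI, though any Snowflake client works):

      snowsql -q "ALTER WAREHOUSE <warehouse_name> SET AUTO_SUSPEND = 61 AUTO_RESUME = TRUE;"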

    • Identification uses compute resources for each table it runs on. Consider whether you need to run identification when registering data sources if you have an external catalog or a tagging strategy in place.

    • Register data before creating global policies. Immuta does not apply a subscription policy on registered data unless an existing global policy applies to it, which allows Immuta to only pull metadata instead of also applying policies when data sources are created. Registering data before policies are created reduces the workload and the Snowflake compute resources needed.

    • Begin onboarding with a small dataset of tables, and then review and monitor query performance in the Snowflake query history. Adjust the virtual warehouse accordingly to handle heavier loads.

    • Schema monitoring uses the compute warehouse that was employed during the initial ingestion to periodically monitor the schema for changes. If you expect a low number of new tables or minimal changes to the table structure, consider scaling down the warehouse size.

    • Resize the warehouse after data sources are registered and policies are established. For example:
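
      One way to do this, assuming the SnowSQL CLI and a placeholder warehouse name:

      snowsql -q "ALTER WAREHOUSE <warehouse_name> SET WAREHOUSE_SIZE = XSMALL;"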

    For more details and guidance about warehouse sizing, see the Snowflake Warehouse Considerations documentation.

    Identifying bulk jobs and heavy workloads

    Even after your integration is configured, data sources are registered, and policies are established, changes to those data sources or policies may initiate heavy workloads. Follow the guidelines below to adjust your warehouse size and scale according to your needs.

    • Review your Snowflake query history to identify query performance and bottlenecks.

    • Check how many credits queries have consumed:
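
      For instance, a query like the following (not necessarily the one referenced in the original guide) summarizes credit consumption by warehouse over the last week, assuming access to the SNOWFLAKE.ACCOUNT_USAGE schema:

      snowsql -q "SELECT warehouse_name, SUM(credits_used) AS credits_used
                  FROM snowflake.account_usage.warehouse_metering_history
                  WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
                  GROUP BY warehouse_name
                  ORDER BY credits_used DESC;"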

    • After reviewing query performance and cost, implement strategies above to adjust your warehouse.

    identification
    schema monitoring

    Security and Compliance

    Immuta offers several features to provide security for your users and Databricks clusters and to prove compliance and monitor for anomalies.

    Authentication

    Configuring the integration and registering data

    Immuta supports the following authentication methods to configure the Databricks Spark integration and register data sources:

    • OAuth machine-to-machine (M2M): Immuta uses the Client Credentials Flow to integrate with Databricks OAuth machine-to-machine authentication, which allows Immuta to authenticate with Databricks using a client secret. Once Databricks verifies the Immuta service principal’s identity using the client secret, Immuta is granted a temporary OAuth token to perform token-based authentication in subsequent requests. When that token expires (after one hour), Immuta requests a new temporary token. See the Databricks OAuth machine-to-machine (M2M) authentication documentation for more details.

    • Personal access token (PAT): This token gives Immuta temporary permission to push the cluster policies to the configured Databricks workspace and overwrite any cluster policy templates previously applied to the workspace when configuring the integration or to register securables as Immuta data sources.

    User authentication

    The built-in Immuta IAM can be used as a complete solution for authentication and fine-grained user entitlement. However, you can connect your existing identity management provider to Immuta to use that system for authentication and fine-grained user entitlement instead.

    Each of the supported identity providers includes a specific set of configuration options that enable Immuta to communicate with the IAM system and map the users, permissions, groups, and attributes into Immuta.

    See the guide for a list of supported providers and details.

    See the for details and instructions on mapping Databricks user accounts to Immuta.

    Cluster security

    Data processing and encryption

    See the for more information about transmission of policy decision data, encryption of data in transit and at rest, and encryption key management.

    Protecting the Immuta configuration

    Non-administrator users on an Immuta-enabled Databricks cluster must not have access to view or modify the Immuta configuration, as this poses a security loophole around Immuta policy enforcement. Databricks secrets allow you to securely apply environment variables to Immuta-enabled clusters.

    Databricks secrets can be used in the environment variables configuration section for a cluster by referencing the secret path instead of the actual value of the environment variable.

    See the for details and instructions on using Databricks secrets.

    Scala cluster security

    There are limitations to isolation among users in Scala jobs on a Databricks cluster. When data is broadcast, cached (spilled to disk), or otherwise saved to SPARK_LOCAL_DIR, it's impossible to distinguish which user’s data is contained in each file or block. To address this vulnerability, Immuta suggests that you

    • limit Scala clusters to Scala jobs only and

    • require equalized projects, which will force all users to act under the same set of attributes, groups, and purposes with respect to their data access. This requirement guarantees that data being dropped into SPARK_LOCAL_DIR will have policies enforced and that those policies will be homogeneous for all users on the cluster. Since each user will have access to the same data, if they attempt to manually access other users' cached/spilled data, they will only see what they have access to via equalized permissions on the cluster. If project equalization is not turned on, users could dig through that directory and find data from another user with heightened access, which would result in a data leak.

See the Installation and compliance guide for more details and configuration instructions.
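As a minimal sketch, assuming you apply these recommendations through the cluster's Spark configuration and the Immuta Spark environment variables described later in this document (the exact language list you allow is your choice):

# Spark configuration: restrict the cluster to Scala (and optionally SQL) jobs only
spark.databricks.repl.allowedLanguages scala,sql

# Immuta Spark environment variable: require all users to act through a single equalized project
IMMUTA_SPARK_REQUIRE_EQUALIZATION=true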

    Auditing and compliance

    Immuta provides auditing features and governance reports so that data owners and governors can monitor users' access to data and detect anomalies in behavior.

You can view the information in these audit logs on dashboards or export the full audit logs to S3 and ADLS for long-term backup and processing with log data processors and tools. This capability fosters convenient integrations with log monitoring services and data pipelines.

See the Audit documentation for details about these capabilities and how they work with the Databricks Spark integration.

    Databricks query audit

    Immuta captures the code or query that triggers the Spark plan in Databricks, making audit records more useful in assessing what users are doing.

    To audit what triggers the Spark plan, Immuta hooks into Databricks where notebook cells and JDBC queries execute and saves the cell or query text. Then, Immuta pulls this information into the audits of the resulting Spark jobs.

Immuta will audit queries that come from interactive notebooks, notebook jobs, and JDBC connections, but will not audit Scala or R submit jobs. Furthermore, Immuta only audits Spark jobs that are associated with Immuta tables. Consequently, Immuta will not audit a query in a notebook cell that does not trigger a Spark job, unless IMMUTA_SPARK_AUDIT_ALL_QUERIES is set to true.

See the Databricks Spark query audit logs page for examples of saved queries and the resulting audit records. To exclude query text from audit events, see the App settings page.

    Auditing all queries

    Immuta supports auditing all queries run on a Databricks cluster, regardless of whether users touch Immuta-protected data or not.

See the Installation and compliance guide for details and instructions.

    Auditing queries run while impersonating another user

    When a query is run by a user impersonating another user, the extra.impersonationUser field in the audit log payload is populated with the Databricks username of the user impersonating another user. The userId field will return the Immuta username of the user being impersonated:

See the Setting up users guide for details about user impersonation.

    Governance reports

    Immuta governance reports allow users with the GOVERNANCE Immuta permission to use a natural language builder to instantly create reports that delineate user activity across Immuta. These reports can be based on various entity types, including users, groups, projects, data sources, purposes, policy types, or connection types.

See the Governance report types page for a list of report types and guidance.

    Register a Snowflake Connection

    Connections allow you to register your data objects in a technology through a single connection, instead of registering data sources and an integration separately.

    This feature is available to all 2025.1+ tenants. Contact your Immuta representative to enable this feature.

    Requirements

    • APPLICATION_ADMIN Immuta permission

    • The Snowflake user registering the connection and running the script must have the following privileges:

      • CREATE DATABASE ON ACCOUNT WITH GRANT OPTION

    Prerequisites

No Snowflake integration configured in Immuta. If your Snowflake integration is already configured on the app settings page, follow the connection upgrade manager guide instead.

    Set up the Immuta system account

    Complete the following actions in Snowflake:

1. Create a new user in Snowflake to be the Immuta system account. Immuta will use this system account continuously to orchestrate Snowflake policies and maintain state between Immuta and Snowflake.

2. Create a Snowflake role with a minimum of the following privileges:

      • USAGE on all databases and schemas with registered data sources.
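A minimal sketch of these two steps in Snowflake SQL, assuming illustrative names (IMMUTA_SYSTEM_ACCOUNT, IMMUTA_ROLE, and the ANALYTICS database are placeholders) and showing only a subset of the required grants listed in this section:

CREATE ROLE IMMUTA_ROLE;

-- Account-level privilege from the list above (repeat for each required account-level privilege)
GRANT CREATE DATABASE ON ACCOUNT TO ROLE IMMUTA_ROLE WITH GRANT OPTION;

-- Object-level privileges on databases and schemas containing registered data sources
GRANT USAGE ON DATABASE ANALYTICS TO ROLE IMMUTA_ROLE;
GRANT USAGE ON ALL SCHEMAS IN DATABASE ANALYTICS TO ROLE IMMUTA_ROLE;

-- Create the Immuta system account and grant it the role
CREATE USER IMMUTA_SYSTEM_ACCOUNT PASSWORD = '<password>' DEFAULT_ROLE = IMMUTA_ROLE;
GRANT ROLE IMMUTA_ROLE TO USER IMMUTA_SYSTEM_ACCOUNT;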

    Register a connection

    To register a Snowflake connection, follow the instructions below.

    1. Click Data and select the Connections tab in the navigation menu.

    2. Click the + Add Connection button.

    3. Select the Snowflake data platform tile.

    4. Enter the connection information:

    Accessing Data

Once a Databricks securable is registered in Immuta as a data source and you are subscribed to that data source, you must access that data through SQL:

Python:

df = spark.sql("select * from immuta.table")

Scala:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .config("spark.some.config.option", "some-value")
  .getOrCreate()
val sqlDF = spark.sql("SELECT * FROM immuta.table")

SQL:

%sql
select * from immuta.table

R:

library(SparkR)
df <- SparkR::sql("SELECT * from immuta.table")

    With R, you must load the SparkR library in a cell before accessing the data.

    See the sections below for more guidance on accessing data using Delta Lake, direct file reads in Spark for file paths, and user impersonation.

    Delta Lake

    When using Delta Lake, the API does not go through the normal Spark execution path. This means that Immuta's Spark extensions do not provide protection for the API. To solve this issue and ensure that Immuta has control over what a user can access, the Delta Lake API is blocked.

Spark SQL can be used instead to give the same functionality with all of Immuta's data protections. See the Delta API reference guide for a list of corresponding Spark SQL calls to use.

    Spark direct file reads

    In addition to supporting direct file reads through workspace and scratch paths, Immuta allows direct file reads in Spark for file paths. As a result, users who prefer to interact with their data using file paths or who have existing workflows revolving around file paths can continue to use these workflows without rewriting those queries for Immuta.

    When reading from a path in Spark, the Immuta Databricks Spark plugin queries the Immuta Web Service to find Databricks data sources for the current user that are backed by data from the specified path. If found, the query plan maps to the Immuta data source and follows existing code paths for policy enforcement.

    Users can read data from individual parquet files in a sub-directory and partitioned data from a sub-directory (or by using a where predicate). Expand the blocks below to view examples of reading data using these methods.

    Read data from an individual parquet file

    To read from an individual file, load a partition file from a sub-directory:

    Read partitioned data from a sub-directory

    To read partitioned data from a sub-directory, load a parquet partition from a sub-directory:

    Alternatively, load a parquet partition using a where predicate:

    Limitations

    • Direct file reads for Immuta data sources only apply to data sources created from tables, not data sources created from views or queries.

    • If more than one data source has been created for a path, Immuta will use the first valid data source it finds. It is therefore not recommended to use this integration when more than one data source has been created for a path.

    • In Databricks, multiple input paths are supported as long as they belong to the same data source.

    • CSV-backed tables are not currently supported.

    User impersonation

User impersonation allows Databricks users to query data as another Immuta user. To impersonate another user, see the User impersonation page.

    Configure Redshift Integration

This page illustrates how to configure the Redshift integration on the Immuta app settings page. To configure this integration via the Immuta API, see the Integrations API getting started guide.

For instructions on configuring Redshift Spectrum, see the Redshift Spectrum guide.

    Requirements

• A Redshift cluster with an RA3 node is required for the multi-database integration. You must use a Redshift RA3 instance type because Immuta requires cross-database views, which are only supported in Redshift RA3 instance types. For other instance types, you may configure a single-database integration using one of the Redshift Spectrum options.

    Snowflake

Immuta manages access to Snowflake tables by administering Snowflake row access policies and column masking policies on those tables, allowing users to query tables directly in Snowflake while dynamic policies are enforced.
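For context only, a native Snowflake row access policy and its attachment look like the following; the policies Immuta administers are generated and maintained automatically, so you should not create or edit them by hand (all object names here are illustrative):

-- Illustrative native row access policy (not one generated by Immuta)
CREATE ROW ACCESS POLICY ANALYTICS.PUBLIC.REGION_POLICY
  AS (REGION STRING) RETURNS BOOLEAN ->
  CURRENT_ROLE() = 'ANALYST' AND REGION = 'US';

-- Attach the policy to a table column
ALTER TABLE ANALYTICS.PUBLIC.CUSTOMERS
  ADD ROW ACCESS POLICY ANALYTICS.PUBLIC.REGION_POLICY ON (REGION);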

    This getting started guide outlines how to integrate your Snowflake account with Immuta.

    Manually Update Your Databricks Cluster

    If a Databricks cluster needs to be manually updated to reflect changes in the Immuta init script or cluster policies, you can remove and set up your integration again to get the updated policies and init script.

    1. Log in to Immuta as an Application Admin.

    2. Click the App Settings icon in the left sidebar and scroll to the Integration Settings section.

3. Your existing Databricks Spark integration should be listed here; expand it and note the configuration values. Now select Remove to remove your integration.

    Use Snowflake Data Sharing with Immuta

Immuta is compatible with Snowflake Secure Data Sharing. Using both Immuta and Snowflake, organizations can share the policy-protected data of their Snowflake database with other Snowflake accounts with Immuta policies enforced in real time.

    Prerequisites:

    Upgrade Snowflake Low Row Access Policy Mode

    Prerequisites

    This upgrade step is necessary if you meet both of the following criteria:

    • You have the Snowflake low row access policy mode enabled in private preview.

    # audit
    ELASTICSEARCH_USERNAME=<elasticsearch-username>
    ELASTICSEARCH_PASSWORD=<elasticsearch-password>
    
    # PostgreSQL connection string used by audit for the metadata database
    #   postgresql://<user>:<password>@<postgres-fqdn>:5432/<database>?schema=audit
    #
    # More info
    #   https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING
    DATABASE_CONNECTION_STRING=postgresql://immuta:<postgres-password>@<postgres-fqdn>:5432/immuta?schema=audit
    
    # secure
    IMMUTA_DATABASES_IMMUTA_CONNECTIONS_IMMUTADB_PASSWORD=<postgres-password>
    kubectl create secret generic immuta-secret --from-env-file=secret-data.env
    rm -i secret-data.env
    audit:
      deployment:
        existingSecret: immuta-secret
      export:
        cronJob:
          existingSecret: immuta-secret
    
    secure:
      existingSecret:
        name: immuta-secret
        # Optional. Map expected keys with keys in existing secret
        # keyMapping: {}
    audit:
      worker:
        replicaCount: 1
        resources:
          requests:
            cpu: 1000m
            memory: 1024Mi
          limits:
            cpu: 1000m
            memory: 2048Mi  
      deployment:
        replicaCount: 1
        resources:
          requests:
            cpu: 1000m
            memory: 4096Mi
          limits:
            cpu: 3000m
            memory: 8192Mi
    secure:
      backgroundWorker:
        replicaCount: 2
        resources:
          requests:
            cpu: 1000m
            memory: 4096Mi
          limits:
            cpu: 4000m
            memory: 4096Mi  
      web:
        replicaCount: 2 
        resources:
          requests:
            cpu: 1000m
            memory: 4096Mi
          limits:
            cpu: 4000m
            memory: 4096Mi
    discover:
      deployment:
        replicaCount: 1
        resources:
          requests:
            cpu: 500m
            memory: 4096Mi
          limits:
            cpu: 3000m
            memory: 4096Mi
    cache:
      deployment:
        replicaCount: 1
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 512Mi
    helm upgrade <release-name> oci://ocir.immuta.com/stable/immuta-enterprise --values immuta-values.yaml --version 2025.1.9
    ALTER WAREHOUSE "WH_NAME" SET WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 61 AUTO_RESUME = TRUE MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 2 SCALING_POLICY = 'STANDARD' COMMENT = '';
    SELECT h.* FROM "SNOWFLAKE"."ACCOUNT_USAGE"."QUERY_HISTORY" h
    INNER JOIN "SNOWFLAKE"."ACCOUNT_USAGE"."SESSIONS" s
    ON s.session_id = h.session_id
    WHERE GET(parse_json(s.client_environment), 'APPLICATION') = 'IMMUTA' limit 25;
    ALTER WAREHOUSE "INTEGRATION_WH" SET WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 120 AUTO_RESUME = TRUE MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 2 SCALING_POLICY = 'STANDARD'; 
    When using a private key, enter the private key file password in the Additional Connection String Options. Use the following format: PRIV_KEY_FILE_PWD=<your_pw>
  • Click Key Pair (Required), and upload a Snowflake key pair file.

  • Complete the Role field.

  • You have user impersonation enabled.

If you do not meet these criteria, follow the instructions on the configuration guide.

    Upgrade to Snowflake low row access policy mode

    To upgrade to the generally available version of the feature, disable your Snowflake integration on the app settings page and then re-enable it.

    100 GB SSD

    Configure audit
    turning off autoscanning for your domains with identifiers and dynamic assignment
    external catalog available
    Snowflake Query Monitor
    Schema monitoring

• indices:admin/settings/update

  • indices:admin/get

  • indices:data/write/delete/byquery

  • indices:data/write/index

  • indices:admin/mapping/put

  • indices:data/write/bulk

  • indices:data/write/bulk*

  • Google Cloud SQL for PostgreSQL

    Elastic Cloud on Google Cloud

    Red Hat OpenShift

    OpenShift Ingress Operator

    Cloud-managed PostgreSQL

    Cloud-managed Elasticsearch

    Externalized PostgreSQL
    Elasticsearch / OpenSearch
    Externalized PostgreSQL
    Amazon RDS for PostgreSQL
    Amazon OpenSearch
    Azure Database for PostgreSQL
    Elastic Cloud on Azure

    How-to Guides

  • For automated installations, the credentials provided must be a Superuser or have the ability to create databases and users and modify grants.

  • The enable_case_sensitive_identifier parameter must be set to false (default setting) for your Redshift cluster.

  • Add a Redshift integration

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click the +Add Integration button and select Redshift from the dropdown menu.

    4. Complete the Host and Port fields.

5. Enter an Immuta Database. This is a new database where all secure schemas and Immuta-created views will be stored.

    6. Opt to check the Enable Impersonation box and customize the Impersonation Role name as needed. This will allow users to natively impersonate another user.

    Select your configuration method

    You have two options for configuring your Redshift environment:

    • Automatic setup: Grant Immuta one-time use of credentials to automatically configure your Redshift environment and the integration.

    • Manual setup: Run the Immuta script in your Redshift environment yourself to configure your environment and the integration.

    Automatic setup

    Immuta requires temporary, one-time use of credentials with specific privileges

    When performing an automated installation, Immuta requires temporary, one-time use of credentials with the following privileges:

    • CREATE DATABASE

    • CREATE USER

    • REVOKE ALL PRIVILEGES ON DATABASE

    • GRANT TEMP ON DATABASE

    • MANAGE GRANTS ON ACCOUNT

    These privileges will be used to create and configure a new IMMUTA database within the specified Redshift instance. The credentials are not stored or saved by Immuta, and Immuta doesn’t retain access to them after initial setup is complete.

    You can create a new account for Immuta to use that has these privileges, or you can grant temporary use of a pre-existing account. By default, the pre-existing account with appropriate privileges is a Superuser. If you create a new account, it can be deleted after initial setup is complete.

    Alternatively, you can create the IMMUTA database within the specified Redshift instance without giving Immuta user credentials for a Superuser using the manual setup option.
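If you choose to create a dedicated account for the automatic setup, a minimal sketch in Redshift SQL looks like the following (the username and password are illustrative; CREATEUSER makes the account a superuser, which covers the privileges listed above, and the account can be dropped once setup is complete):

-- Temporary setup account with superuser-level privileges
CREATE USER immuta_setup PASSWORD 'ChangeMe1234' CREATEDB CREATEUSER;

-- After the integration is configured, the account can be removed
DROP USER immuta_setup;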

    1. Select Automatic.

    2. Enter an Initial Database from your Redshift integration for Immuta to use to connect.

    3. Use the dropdown menu to select your Authentication Method.

      1. Username and Password: Enter the Username and Password of the privileged user.

      2. AWS Access Key: Enter the Database User, Access Key ID, and Secret Key. Opt to enter in the Session Token.

    Manual setup

    Required privileges

    The specified role used to run the bootstrap needs to have the following privileges:

    • CREATE DATABASE

    • CREATE USER

    • REVOKE ALL PRIVILEGES ON DATABASE

    • GRANT TEMP ON DATABASE

    • MANAGE GRANTS ON ACCOUNT

    1. Select Manual and download both of the bootstrap scripts from the Setup section.

    2. Run the bootstrap script (initial database) in the Redshift initial database.

    3. Run the bootstrap script (Immuta database) in the new Immuta Database in Redshift.

    4. Choose your authentication method, and enter the information of the newly created account.

    Save the configuration

    Click Save.

    Register data

    Register Redshift data in Immuta.

    Edit a Redshift integration

    1. Click the App Settings icon in the left sidebar.

    2. Navigate to the Integrations tab and click the down arrow next to the Redshift Integration.

3. Edit the field you want to change. Note that any shadowed field is not editable; the integration must be disabled and re-installed to change it.

    4. Enter Username and Password.

    5. Click Save.

    Required privileges

    When performing edits to an integration, Immuta requires temporary, one-time use of credentials of a Superuser or a user with the following permissions:

    • Create Databases

    • Create users

    • Modify grants

    Alternatively, you can download the Edit Script from your Redshift configuration on the Immuta app settings page and run it in Redshift.

    Remove a Redshift integration

    Disabling Redshift Spectrum

    Disabling the Redshift integration is not supported when you set the fields nativeWorkspaceName, nativeViewName, and nativeSchemaName to create Redshift Spectrum data sources. Disabling the integration when these fields are used in metadata ingestion causes undefined behavior.

    1. Click the App Settings icon in the left sidebar.

    2. Navigate to the Integrations tab and click the down arrow next to the Redshift Integration.

    3. Click the checkbox to disable the integration.

    4. Enter the username and password that were used to initially configure the integration.

    5. Click Save.

    Redshift integration
    Integrations API getting started guide
    Redshift Spectrum
    Redshift Spectrum options

    Loading a delta partition from a sub-directory is not recommended by Spark and is not supported in Immuta. Instead, use a where predicate:

    Delta API reference guide
    User impersonation page
spark.read.format("parquet").load("s3://my_bucket/path/to/my_parquet_table/partition_column=01/my_file.parquet")
spark.read.format("parquet").load("s3://my_bucket/path/to/my_parquet_table/partition_column=01")
spark.read.format("parquet").load("s3://my_bucket/path/to/my_parquet_table").where("partition_column=01")
    Create Immuta Policies to Protect the Data

    Required Permission: Immuta: GOVERNANCE

    Build Immuta data policies to fit your organization's compliance requirements.

It's important to understand that subscription policies are not relevant to Snowflake data shares, because the act of sharing the data is the subscription policy. Data policies can be enforced on the consuming account from the producer account on a share by following these instructions.

    Register the Snowflake Data Consumer with Immuta

    Required Permission: Immuta: USER_ADMIN

    To register the Snowflake data consumer in Immuta,

    1. Create a new Immuta user.

    2. Update the Immuta user's Snowflake username to match the account ID for the data consumer. This value is the output on the data consumer side when SELECT CURRENT_ACCOUNT() is run in Snowflake.

    3. Give the Immuta user the appropriate attributes and groups for your organization's policies.

    4. Subscribe the Immuta user to the data sources.

    Create the Snowflake Data Share

    Required Permission: Snowflake ACCOUNTADMIN

    To share the policy-protected data source,

    1. Create a Snowflake Data Share of the Snowflake table that has been registered in Immuta.

    2. Grant reference usage on the Immuta database to the share you created:

      Replace the content in angle brackets above with the name of your Immuta database and Snowflake data share.
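A sketch of step 1 in Snowflake SQL, using illustrative object and account names (the database, schema, table, and consumer account identifier will differ in your environment):

-- Create the share and grant it access to the registered table
CREATE SHARE IMMUTA_PROTECTED_SHARE;
GRANT USAGE ON DATABASE ANALYTICS TO SHARE IMMUTA_PROTECTED_SHARE;
GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO SHARE IMMUTA_PROTECTED_SHARE;
GRANT SELECT ON TABLE ANALYTICS.PUBLIC.CUSTOMERS TO SHARE IMMUTA_PROTECTED_SHARE;

-- Make the share available to the data consumer's account
ALTER SHARE IMMUTA_PROTECTED_SHARE ADD ACCOUNTS = <consumer_account>;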

    Snowflake Secure Data Sharing
    Snowflake integration enabled
    Snowflake tables registered in Immuta as data sources
    Client Credentials Flow
    Databricks OAuth machine-to-machine authentication
    Databricks OAuth machine-to-machine (M2M) authentication page
    Identity managers
    Setting up users guide
    Data processing, encryption, and masking practices guide
    Databricks secrets
    Installation and compliance guide
    Installation and compliance guide
    dashboards
    Audit documentation
    Scala or R submit jobs
IMMUTA_SPARK_AUDIT_ALL_QUERIES is set to true.
    Databricks Spark query audit logs
    App settings
    Installation and compliance guide
    Setting up users guide
    Governance report types
# Not recommended by Spark and not supported in Immuta
spark.read.format("delta").load("s3://my_bucket/path/to/my_delta_table/partition_column=01")

# Recommended by Spark and supported in Immuta.
spark.read.format("delta").load("s3://my_bucket/path/to/my_delta_table").where("partition_column=01")
    
    GRANT REFERENCE_USAGE ON DATABASE "<Immuta database of the provider account>" TO SHARE "<DATA_SHARE>";
    {
      "id": "query-a20e-493e-id-c1ada0a23a26",
      [...]
      "userId": "<immuta_username>",
      [...]
      "extra": {
        [...]
        "impersonationUser": "<databricks_username>"
      }
      [...]
    }
• CREATE ROLE ON ACCOUNT WITH GRANT OPTION
  • CREATE USER ON ACCOUNT WITH GRANT OPTION

  • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

  • APPLY MASKING POLICY ON ACCOUNT WITH GRANT OPTION

  • APPLY ROW ACCESS POLICY ON ACCOUNT WITH GRANT OPTION

  • REFERENCES on all tables and views registered in Immuta.
  • SELECT on all tables and views registered in Immuta.

  • Grant the new Snowflake role to the system account you just created.

  • Host: The URL of your Snowflake account.

  • Port: Your Snowflake port.

  • Warehouse: The warehouse the Immuta system account user will use to run queries and perform Snowflake operations.

  • Immuta Database: The new, empty database for Immuta to manage. This is where system views, user entitlements, row access policies, column-level policies, procedures, and functions managed by Immuta will be created and stored.

  • Display Name: The display name represents the unique name of your connection and will be used as prefix in the name for all data objects associated with this connection. It will also appear as the display name in the UI and will be used in all API calls made to update or delete the connection.

  • Click Next.

  • Select an authentication method from the dropdown menu and enter the authentication information for the Immuta system account you created. Enter the Role with the listed privileges, then continue to enter the authentication information:

    1. Username and password (Not recommended): Choose one of the following options.

      1. Select Immuta Generated to have Immuta populate the system account name and password.

      2. Select User Provided to enter your own name and password for the Immuta system account.

    2. Snowflake External OAuth:

      1. Fill out the Token Endpoint, which is where the generated token is sent. It is also known as aud (audience) and iss (issuer).

      2. Fill out the Client ID, which is the subject of the generated token. It is also known as sub (subject).

3. Key Pair Authentication:

1. Complete the Username field. This user must be assigned the public key in Snowflake.

      2. If using an encrypted private key, enter the Private Key Password.

      3. Click Select a File, and upload the Snowflake private key pair file.

  • Copy the provided script and run it in Snowflake as a user with the privileges listed in the requirements section.

  • Click Test Connection.

  • If the connection is successful, click Next. If there are any errors, check the connection details and credentials to ensure they are correct and try again.

  • Ensure all the details are correct in the summary and click Complete Setup.

  • Use the connection upgrade manager guide
    Create a new user in Snowflake to be the Immuta system account
    Create a Snowflake role
    How-to guides
    • Configure a Snowflake integration: Configure the Snowflake integration.

    • Snowflake table grants migration: Migrate to using Snowflake table grants in your Snowflake integration.

    • Edit or remove an existing integration: Manage integration settings or delete your existing Snowflake integration.

    • Integration settings:

• Enable Snowflake table grants: Enable Snowflake table grants and configure the Snowflake role prefix.

• Use Snowflake data sharing with Immuta: Use Snowflake data sharing with table grants or project workspaces.

• Snowflake low row access policy mode: Enable Snowflake low row access policy mode.

• Snowflake lineage tag propagation: Configure your Snowflake integration to automatically apply tags added to a Snowflake table to its descendant data source columns in Immuta.

    Reference guides

    • Snowflake integration reference guide: This reference guide describes the design and features of the Snowflake integration.

    • Snowflake data sharing with Immuta: Organizations can share the policy-protected data of their Snowflake database with other Snowflake accounts with Immuta policies enforced in real time. This guide describes the components of using Immuta with Snowflake data shares.

    • Snowflake lineage tag propagation: Snowflake column lineage specifies how data flows from source tables or columns to the target tables in write operations. When Snowflake lineage tag propagation is enabled in Immuta, Immuta automatically applies tags added to a Snowflake table to its descendant data source columns in Immuta so you can build policies using those tags to restrict access to sensitive data.

• Snowflake low row access policy mode: The Snowflake low row access policy mode improves query performance in Immuta's Snowflake integration. To do so, this mode decreases the number of Snowflake row access policies Immuta creates and uses table grants to manage user access. This guide describes the design and requirements of this mode.

• Snowflake table grants: Snowflake table grants simplifies the management of privileges in Snowflake when using Immuta. Instead of manually granting users access to tables registered in Immuta, you allow Immuta to manage privileges on your Snowflake tables and views according to subscription policies. This guide describes the components of Snowflake table grants and how they are used in Immuta's Snowflake integration.

• Warehouse sizing recommendations: Adjust the size and scale of clusters for your warehouse to manage workloads so that you can use Snowflake compute resources the most cost effectively.

    Explanatory guide

    Phased Snowflake onboarding: A phased onboarding approach to configuring the Snowflake integration ensures that your users will not be immediately affected by changes as you add data sources and policies. This guide describes the settings and requirements for implementing this phased approach.

    row access policies
    column masking policies
    Getting started
    to remove your integration.
  • Click Add Integration and select Databricks Integration to add a new integration.

  • Enter your Databricks Spark integration settings again as configured previously.

  • Click Add Integration to add the integration, and then select Configure Cluster Policies to set up the updated cluster policies and init script.

  • Select the cluster policies you wish to use for your Immuta-enabled Databricks clusters.

  • Automatically push cluster policies and the init script (recommended) or manually update your cluster policies.

    • Automatically push cluster policies

      1. Select Automatically Push Cluster Policies and enter your privileged Databricks access token. This token must have privileges to write to cluster policies.

      2. Select Apply Policies to push the cluster policies and init script again.

      3. Click Save and Confirm to deploy your changes.

    • Manually update cluster policies

      1. Download the init script and the new cluster policies to your local computer.

      2. Click Save and Confirm to save your changes in Immuta.

      3. Log in to your Databricks workspace with your administrator account to set up cluster policies.

  • Restart any Databricks clusters using these updated policies for the changes to take effect.

  • Configure a Databricks Spark Integration

    Permissions

    • APPLICATION_ADMIN Immuta permission

    • CAN MANAGE Databricks privilege on the cluster

    Requirements

    • A Databricks workspace with the Premium tier, which includes cluster policies (required to configure the Spark integration)

    • A cluster that uses one of these supported Databricks Runtimes:

      • 11.3 LTS

      • 14.3 (private preview)

    Prerequisites

    • Enable (recommended) or .

• Disable Photon by setting runtime_engine to STANDARD using the Databricks Clusters API, as sketched after this list. Immuta does not support clusters with Photon enabled. Photon is enabled by default on compute running Databricks Runtime 9.1 LTS or newer and must be manually disabled before setting up the integration with Immuta.

• Restrict the set of Databricks principals who have CAN MANAGE on clusters where the Spark plugin is installed. This is to prevent editing
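A sketch of the Photon-related fragment of a Clusters API edit request body, assuming the 2.1 Clusters API (the edit call requires the full cluster specification; only the relevant fields are shown here):

{
  "cluster_id": "<cluster-id>",
  "runtime_engine": "STANDARD"
}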

    Add the integration on the app settings page

    1. Click the App Settings icon in Immuta.

    2. Navigate to HDFS > System API Key and click Generate Key.

    3. Click Save and then Confirm. If you do not save and confirm, the system API key will not be saved.

4. Scroll to the Integration Settings section.

    Behavior change in Immuta v2025.1 and newer

    If a table is registered in Immuta and does not have a subscription policy applied to it, that data will be visible to users in Databricks, even if the Protected until made available by policy setting is enabled.

If you have enabled this setting, author an "Allow individually selected users" subscription policy that applies to all data sources.

    1. Select the Storage Access Type from the dropdown menu.

    2. Opt to add any Additional Hadoop Configuration Files.

    3. Click Add Native Integration, and then click Save and Confirm. This will restart the application and save your Databricks Spark integration. (It is normal for this restart to take some time.)

    The Databricks Spark integration will not do anything until your cluster policies are configured, so even though your integration is saved, continue to the next section to configure your cluster policies so the Spark plugin can manage authorization on the Databricks cluster.

    Configure cluster policies

    1. Click Configure Cluster Policies.

2. Select one or more cluster policies in the matrix. Clusters running Immuta with Databricks Runtime 14.3 can only use Python and SQL. You can make changes to the policy by clicking Additional Policy Changes and editing the environment variables in the text field or by downloading it. See the Spark environment variables reference guide for information about each variable and its default value. Some common settings are linked below:
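For illustration, an environment variable override in the downloaded cluster policy JSON takes the same form as the impersonation example later in this document; for instance, enabling auditing of all queries (the value shown is an assumption for this example):

"spark_env_vars.IMMUTA_SPARK_AUDIT_ALL_QUERIES": {
  "type": "fixed",
  "value": "true"
}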

    Map users and grant them access to the cluster

    1. Give users the Can Attach To permission on the cluster.

    Register a Databricks Unity Catalog Connection

    Connections allow you to register your data objects in a technology through a single connection, instead of registering data sources and an integration separately.

    This feature is available to all 2025.1+ tenants. Contact your Immuta representative to enable this feature.

    Requirements

    • APPLICATION_ADMIN Immuta permission

    • The Databricks user registering the connection and running the script must have the following privileges:

      • Metastore admin and account admin

See the Databricks documentation for more details about Unity Catalog privileges and securable objects.

    Prerequisites

• Unity Catalog metastore created and attached to a Databricks workspace.

    • Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.

• No Databricks Unity Catalog integration configured in Immuta. If your Databricks Unity Catalog integration is already configured on the app settings page, follow the connection upgrade manager guide instead.

    Create the Databricks service principal

In Databricks, create a service principal with the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

    • USE CATALOG and MANAGE on all catalogs containing securables you want registered as Immuta data sources.

    • USE SCHEMA on all schemas containing securables you want registered as Immuta data sources.

    • MODIFY and SELECT on all securables you want registered as Immuta data sources.

MANAGE and MODIFY are required so that the service principal can apply row filters and column masks on the securable; to do so, the service principal must also have SELECT on the securable as well as USE CATALOG on its parent catalog and USE SCHEMA on its parent schema. Since privileges are inherited, you can grant the service principal the MODIFY and SELECT privileges on all catalogs or schemas containing Immuta data sources, which automatically grants the service principal the MODIFY and SELECT privileges on all current and future securables within them.

See the Databricks documentation for more details about Unity Catalog privileges and securable objects.
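A minimal sketch of these grants in Databricks SQL, assuming an illustrative catalog named analytics, a schema named sales, and a service principal named immuta-service-principal:

-- Catalog-level privileges; SELECT and MODIFY are inherited by all securables in the catalog
GRANT USE CATALOG, MANAGE ON CATALOG analytics TO `immuta-service-principal`;
GRANT SELECT, MODIFY ON CATALOG analytics TO `immuta-service-principal`;

-- Schema-level privilege for each schema containing registered securables
GRANT USE SCHEMA ON SCHEMA analytics.sales TO `immuta-service-principal`;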

    Set up query audit

Audit is enabled by default on all Databricks Unity Catalog connections. If you need to turn audit off, update the connection via the Immuta API and set audit to false in the payload.

    Enable query audit by completing these steps in Unity Catalog:

    1. .

    2. . For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.

      • USE CATALOG on the system catalog

    Register a connection

    1. Click Data and select the Connections tab in the navigation menu.

    2. Click the + Add Connection button.

    3. Select the Databricks data platform tile.

    4. Enter the connection information:

    Create a separate Immuta catalog for each Immuta tenant

    If multiple Immuta tenants are connected to your Databricks environment, create a separate Immuta catalog for each of those tenants. Having multiple Immuta tenants use the same Immuta catalog causes failures in policy enforcement.

    Getting Started with Databricks Unity Catalog

    The how-to guides linked on this page illustrate how to integrate Databricks Unity Catalog with Immuta. See the reference guide for information about the Databricks Unity Catalog integration.

    Requirements:

    • Unity Catalog metastore created and attached to a Databricks workspace. Immuta supports configuring a single metastore for each configured integration, and that metastore may be attached to multiple Databricks workspaces.

    • Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore.

    1

    Connect your technology

    These guides provide instructions on getting your data set up in Immuta.

1. Register your Databricks Unity Catalog connection: Using a single setup process, connect Databricks Unity Catalog to Immuta. This will register your data objects into Immuta and allow you to start dictating access through global policies.

2. Organize your data sources into domains and assign domain permissions to accountable teams: Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and identification.

    2

    Register your users

    These guides provide instructions on getting your users set up in Immuta.

1. Connect an IAM: Bring the IAM your organization already uses and allow Immuta to register your users for you.

2. Map external user IDs from Databricks to Immuta: Ensure the user IDs in Immuta, Databricks, and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta.

1. Connect an external catalog: Bring the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain data patterns.

    4

    Start using the Governance app

    These guides provide instructions on using the Governance app for the first time.

1. Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

2. Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog- and identification-applied tags, you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    Migrate to Unity Catalog

    When you enable Unity Catalog, Immuta automatically migrates your existing Databricks data sources in Immuta to reference the legacy hive_metastore catalog to account for Unity Catalog's three-level hierarchy. New data sources will reference the Unity Catalog metastore you create and attach to your Databricks workspace.

    Because the hive_metastore catalog is not managed by Unity Catalog, existing data sources in the hive_metastore cannot have Unity Catalog access controls applied to them. Data sources in the Hive Metastore must be managed by the Databricks Spark integration.

    To allow Immuta to administer Unity Catalog access controls on that data, move the data to Unity Catalog and re-register those tables in Immuta by completing the steps below. If you don't move all data before configuring the integration, metastore magic will protect your existing data sources throughout the migration process.

    1. Ensure that all Databricks clusters that have Immuta installed are stopped and the Immuta configuration is removed from the cluster. Immuta-specific cluster configuration is no longer needed with the Databricks Unity Catalog integration.

2. Move all data into Unity Catalog before configuring Immuta with Unity Catalog (a sketch of one way to do this follows this list). Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured.

3. Enable Unity Catalog.
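Relating to step 2, one common way to copy a Delta table out of the legacy Hive metastore is a deep clone into a Unity Catalog table; this is a sketch with illustrative names, and other approaches (such as CREATE TABLE ... AS SELECT) work as well:

-- Copy a hive_metastore table into a Unity Catalog catalog and schema
CREATE TABLE my_catalog.my_schema.customers
  DEEP CLONE hive_metastore.my_schema.customers;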

    Enable Snowflake Low Row Access Policy Mode

    If you have Snowflake low row access policy mode enabled in private preview and have impersonation enabled, see these upgrade instructions. Otherwise, query performance will be negatively affected.

    1. Click the App Settings icon in the sidebar and scroll to the Global Integration Settings section.

    2. Click the Enable Snowflake Low Row Access Policy Mode checkbox to enable the feature.

    3. Confirm to allow Immuta to automatically disable impersonation for the Snowflake integration. If you do not confirm, you will not be able to enable Snowflake low row access policy mode.

    4. Click Save.

    Configure your Snowflake integration

If you already have a Snowflake integration configured, you don't need to reconfigure your integration. Your Snowflake policies automatically refresh when you enable Snowflake low row access policy mode.

1. Configure your Snowflake integration. Note that you will not be able to enable project workspaces or user impersonation with Snowflake low row access policy mode enabled.

    2. Click Save and Confirm your changes.

    Registering and Protecting Data

    In the Databricks Spark integration, Immuta installs an Immuta-maintained Spark plugin on your Databricks cluster. When a user queries data that has been registered in Immuta as a data source, the plugin injects policy logic into the plan Spark builds so that the results returned to the user only include data that specific user should see.

    The sequence diagram below breaks down this process of events when an Immuta user queries data in Databricks.

    Registering data

    When data owners register Databricks securables in Immuta, the securable metadata is registered and Immuta creates a corresponding

    Spark Environment Variables

    This page outlines configuration details for Immuta-enabled Databricks clusters. Databricks administrators should place the desired configuration in the Spark environment variables.

    IMMUTA_INIT_ADDITIONAL_CONF_URI

    If you add additional Hadoop configuration during the integration setup, this variable sets the path to that file.

    The additional Hadoop configuration is where sensitive configuration goes for remote filesystems (if you are using a secret key pair to access S3, for example).

    Snowflake Lineage Tag Propagation

    Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

    Snowflake column lineage specifies how data flows from source tables or columns to the target tables in write operations. When Snowflake lineage tag propagation is enabled in Immuta, Immuta automatically applies tags added to a Snowflake table to its descendant data source columns in Immuta so you can build policies using those tags to restrict access to sensitive data.

    Snowflake Access History tracks user read and write operations. Snowflake column lineage extends this Access History to specify how data flows from source columns to the target columns in write operations, allowing data stewards to understand how sensitive data moves from ancestor tables to target tables so that they can

    Configure Snowflake Lineage Tag Propagation

    Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

    Contact your Immuta representative to enable this feature in your Immuta tenant.

    Configure the Snowflake integration

    Troubleshooting

    This page provides guidelines for troubleshooting issues with the Databricks Spark integration and resolving Py4J security and Databricks trusted library errors.

    Debugging the integration

    For easier debugging of the Databricks Spark integration, follow the recommendations below.

    • Enable cluster init script logging:

    Getting Started with Starburst (Trino)

The how-to guides linked on this page illustrate how to integrate Starburst (Trino) with Immuta. See the reference guide for information about the Starburst (Trino) integration.

    1

    Connect your technology

    These guides provide instructions on getting your data set up in Immuta.

    1. : Install the Immuta Starburst (Trino) plugin in Starburst or Trino so that policies can be applied to data objects.

    Install a Trusted Library

    Databricks Libraries API: Installing trusted libraries outside of the Databricks Libraries API (e.g., ADD JAR ...) is not supported.

    1. In the Databricks Clusters UI, install your third-party library .jar or Maven artifact with Library Source Upload, DBFS, DBFS/S3

    Get the path you will upload the init script (`immuta_cluster_init_script_proxy.sh`) to by opening one of the cluster policy `.json` files and looking for the `defaultValue` of the field `init_scripts.0.dbfs.destination`. This should be a DBFS path in the form of `dbfs:/immuta-plugin/hostname/immuta_cluster_init_script_proxy.sh`.

  • Click Data in the left pane to upload your init script to DBFS to the path you found above.

  • To find your existing cluster policies you need to update, click Compute in the left pane and select the Cluster policies tab.

  • Edit each of these cluster policies that were configured before and overwrite the contents of the JSON with the new cluster policy JSON you downloaded.

  • In the cluster page in Databricks for the target cluster, navigate to Advanced Options -> Logging.

  • Change the Destination from NONE to DBFS and change the path to the desired output location. Note: The unique cluster ID will be added onto the end of the provided path.

  • View the Spark UI on your target Databricks cluster: On the cluster page, click the Spark UI tab, which shows the Spark application UI for the cluster. If you encounter issues creating Databricks data sources in Immuta, you can also view the JDBC/ODBC Server portion of the Spark UI to see the result of queries that have been sent from Immuta to Databricks.

  • Using the validation and debugging notebook

    The validation and debugging notebook is designed to be used by or under the guidance of an Immuta support professional. Reach out to your Immuta representative for assistance.

    1. Import the notebook into a Databricks workspace by navigating to Home in your Databricks instance.

    2. Click the arrow next to your name and select Import.

    3. Once you have executed commands in the notebook and populated it with debugging information, export the notebook and its contents by opening the File menu, selecting Export, and then selecting DBC Archive.

    Py4J security error

    • Error Message: py4j.security.Py4JSecurityException: Constructor <> is not allowlisted

    • Explanation: This error indicates you are being blocked by Py4J security rather than the Immuta Security Manager. Py4J security is strict and generally ends up blocking many ML libraries.

    • Solution: Turn off Py4J security on the offending cluster by setting IMMUTA_SPARK_DATABRICKS_PY4J_STRICT_ENABLED=false in the environment variables section. Additionally, because there are limitations to the security mechanisms Immuta employs on-cluster when Py4J security is disabled, ensure that all users on the cluster have the same level of access to data, as users could theoretically see (policy-enforced) data that other users have queried.

    Databricks trusted library errors

    Check the driver logs for details. Some possible causes of failure include

    • One of the Immuta-configured trusted library URIs does not point to a Databricks library. Check that you have configured the correct URI for the Databricks library.

    • For trusted Maven artifacts, the URI must follow this format: maven:/group.id:artifact-id:version.

    • Databricks failed to install a library. Any Databricks library installation errors will appear in the Databricks UI under the Libraries tab.

    Opt to fill out the Resource field with a URI of the resource where the requested token will be used.

  • Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as x5t or is called kid (key identifier).

  • Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.

  • Key Pair Authentication
    assigned the public key in Snowflake
    Enable Snowflake table grants
    Use Snowflake data sharing with Immuta
    Snowflake low row access policy mode
    Snowflake lineage tag propagation
    Snowflake low row access policy mode
    Snowflake table grants
    Warehouse sizing recommendations
    Connections are generally available on all 2025.1+ tenants. If you do not have connections enabled on your tenant, configure Databricks Unity Catalog and register data sources using the legacy workflow.
  • Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from user queries, policy changes, and tagging updates.

  • Register your Databricks Unity Catalog connection
    Organize your data sources into domains and assign domain permissions to accountable teams
    Connect an IAM
    Map external user IDs from Databricks to Immuta
    Connect an external catalog
    Run identification
    Author a global subscription policy
    Author a global data policy
    Enable Unity Catalog
    Snowflake integration
    Configure your Snowflake integration
    IMMUTA_EPHEMERAL_HOST_OVERRIDE

    Default value: true

    Set this to false if ephemeral overrides should not be enabled for Spark. When true, this will automatically override ephemeral data source httpPaths with the httpPath of the Databricks cluster running the user's Spark application.

    IMMUTA_EPHEMERAL_HOST_OVERRIDE_HTTPPATH

    This configuration item can be used if automatic detection of the Databricks httpPath should be disabled in favor of a static path to use for ephemeral overrides.

    IMMUTA_EPHEMERAL_TABLE_PATH_CHECK_ENABLED

    Default value: true

    When querying Immuta data sources in Spark, the metadata from the Metastore is compared to the metadata for the target source in Immuta to validate that the source being queried exists and is queryable on the current cluster. This check typically validates that the target (database, table) pair exists in the Metastore and that the table’s underlying location matches what is in Immuta. This configuration can be used to disable location checking if that location is dynamic or changes over time. Note: This may lead to undefined behavior if the same table names exist in multiple workspaces but do not correspond to the same underlying data.

    IMMUTA_INIT_ALLOWED_CALLING_CLASSES_URI

    A URI that points to a valid calling class file, which is an Immuta artifact you download during the Databricks Spark configuration process.

    IMMUTA_SPARK_ACL_ALLOWLIST

    This is a comma-separated list of Databricks users who can access any table or view in the cluster metastore without restriction.
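For example (the usernames are illustrative):

IMMUTA_SPARK_ACL_ALLOWLIST=admin1@example.com,admin2@example.com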

    IMMUTA_SPARK_ACL_PRIVILEGED_TIMEOUT_SECONDS

    Default value: 3600

    The number of seconds to cache privileged user status for the Immuta ACL. A privileged Databricks user is an admin or is allowlisted in IMMUTA_SPARK_ACL_ALLOWLIST.

    IMMUTA_SPARK_AUDIT_ALL_QUERIES

    Default value: false

    Enables auditing all queries run on a Databricks cluster, regardless of whether users touch Immuta-protected data or not.

    IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_READS

    Default value: false

    Allows non-privileged users to SELECT from tables that are not protected by Immuta. See the Customizing the integration guide for details about this feature.

    IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_WRITES

    Default value: false

    Allows non-privileged users to run DDL commands and data-modifying commands against tables or spaces that are not protected by Immuta. See the Customizing the integration guide for details about this feature.

    IMMUTA_SPARK_DATABRICKS_ALLOWED_IMPERSONATION_USERS

    This is a comma-separated list of Databricks users who are allowed to impersonate Immuta users:

    IMMUTA_SPARK_DATABRICKS_DBFS_MOUNT_ENABLED

    Default value: false

    Exposes the DBFS FUSE mount located at /dbfs. Granular permissions are not possible, so all users will have read/write access to all objects therein. Note: Raw, unfiltered source data should never be stored in DBFS.

    IMMUTA_SPARK_DATABRICKS_DISABLED_UDFS

Block one or more Immuta user-defined functions (UDFs) from being used on an Immuta cluster. This should be a Java regular expression that matches the set of UDFs to block by name (excluding the immuta database). For example, to block all project UDFs, you may configure this to be ^.*_projects?$. For a list of functions, see the project UDFs page.

    IMMUTA_SPARK_DATABRICKS_JAR_URI

    Default value: file:///databricks/jars/immuta-spark-hive.jar

    The location of immuta-spark-hive.jar on the filesystem for Databricks. This should not need to change unless a custom initialization script that places immuta-spark-hive in a non-standard location is necessary.

    IMMUTA_SPARK_DATABRICKS_LOCAL_SCRATCH_DIR_ENABLED

    Default value: true

    Creates a world-readable or writable scratch directory on local disk to facilitate the use of dbutils and 3rd party libraries that may write to local disk. Its location is non-configurable and is stored in the environment variable IMMUTA_LOCAL_SCRATCH_DIR. Note: Sensitive data should not be stored at this location.
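As a short illustration in a Python notebook cell, assuming the default configuration, the scratch location can be resolved from the environment variable before writing temporary, non-sensitive files:

import os

# IMMUTA_LOCAL_SCRATCH_DIR holds the non-configurable scratch location described above
scratch_dir = os.environ["IMMUTA_LOCAL_SCRATCH_DIR"]

# Illustrative only -- never write sensitive data to this world-readable/writable location
with open(os.path.join(scratch_dir, "tmp_output.txt"), "w") as f:
    f.write("scratch example")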

    IMMUTA_SPARK_DATABRICKS_LOG_LEVEL

    Default value: INFO

    The SLF4J log level to apply to Immuta's Spark plugins.

    IMMUTA_SPARK_DATABRICKS_LOG_STDOUT_ENABLED

    Default value: false

    If true, writes logging output to stdout/the console as well as the log4j-active.txt file (default in Databricks).

    IMMUTA_SPARK_DATABRICKS_SCRATCH_DATABASE

    This configuration is a comma-separated list of additional databases that will appear as scratch databases when running a SHOW DATABASE query. This configuration increases performance by circumventing the Metastore to get the metadata for all the databases to determine what to display for a SHOW DATABASE query; it won't affect access to the scratch databases. Instead, use IMMUTA_SPARK_DATABRICKS_SCRATCH_PATHS to control read and write access to the underlying database paths.

    Additionally, this configuration will only display the scratch databases that are configured and will not validate that the configured databases exist in the Metastore. Therefore, it is up to the Databricks administrator to properly set this value and keep it current.

    IMMUTA_SPARK_DATABRICKS_SCRATCH_PATHS

    Comma-separated list of remote paths that Databricks users are allowed to directly read/write. These paths amount to unprotected "scratch spaces." You can create a scratch database by configuring its specified location (or configure dbfs:/user/hive/warehouse/<db_name>.db for the default location).

    To create a scratch path to a location or a database stored at that location, configure

    To create a scratch path to a database created using the default location,

    IMMUTA_SPARK_DATABRICKS_SCRATCH_PATHS_CREATE_DB_ENABLED

    Default value: false

    Enables non-privileged users to create or drop scratch databases.

    IMMUTA_SPARK_DATABRICKS_SINGLE_IMPERSONATION_USER

    Default value: false

    When true, this configuration prevents users from changing their impersonation user once it has been set for a given Spark session. This configuration should be set when the BI tool or other service allows users to submit arbitrary SQL or issue SET commands.

    IMMUTA_SPARK_DATABRICKS_SUBMIT_TAG_JOB

    Default value: true

Denotes whether Immuta will run the Spark job that "tags" a Databricks cluster as being associated with Immuta.

    IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS

    A comma-separated list of Databricks trusted library URIs.

    IMMUTA_SPARK_NON_IMMUTA_TABLE_CACHE_SECONDS

    Default value: 3600

    The number of seconds Immuta caches whether a table has been exposed as a data source in Immuta. This setting only applies when IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_WRITES or IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_READS is enabled.

    IMMUTA_SPARK_REQUIRE_EQUALIZATION

    Default value: false

    Requires that users act through a single, equalized project. A cluster should be equalized if users need to run Scala jobs on it, and it should be limited to Scala jobs only via spark.databricks.repl.allowedLanguages.

    IMMUTA_SPARK_RESOLVE_RAW_TABLES_ENABLED

    Default value: true

    Enables use of the underlying database and table name in queries against a table-backed Immuta data source. Administrators or allowlisted users can set IMMUTA_SPARK_RESOLVE_RAW_TABLES_ENABLED to false to bypass resolving raw databases or tables as Immuta data sources. This is useful if an admin wants to read raw data but is also an Immuta user. By default, data policies will be applied to a table even for an administrative user if that admin is also an Immuta user.

    IMMUTA_SPARK_SESSION_RESOLVE_RAW_TABLES_ENABLED

    Default value: true

    Same as the IMMUTA_SPARK_RESOLVE_RAW_TABLES_ENABLED variable, but this is a session property that allows users to toggle this functionality. If users run set immuta.spark.session.resolve.raw.tables.enabled=false, they will see raw data only (not Immuta data policy-enforced data). Note: This property is not set in immuta_conf.xml.

    IMMUTA_SPARK_SHOW_IMMUTA_DATABASE

    Default value: true

This shows the immuta database in the configured Databricks cluster. When set to false, Immuta will no longer show this database when a SHOW DATABASES query is performed. However, queries can still be performed against tables in the immuta database using the Immuta-qualified table name (e.g., immuta.my_schema_my_table) regardless of whether this feature is enabled.

    IMMUTA_SPARK_VERSION_VALIDATE_ENABLED

    Default value: true

    Immuta checks the versions of its artifacts to verify that they are compatible with each other. When set to true, if versions are incompatible, that information will be logged to the Databricks driver logs and the cluster will not be usable. If a configuration file or the jar artifacts have been patched with a new version (and the artifacts are known to be compatible), this check can be set to false so that the versions don't get logged as incompatible and make the cluster unusable.

    IMMUTA_USER_MAPPING_IAMID

    Default value: bim

    Denotes which IAM in Immuta should be used when mapping the current Spark user's username to a userid in Immuta. This defaults to Immuta's internal IAM (bim) but should be updated to reflect an actual production IAM.
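    For example, if your production identity manager is registered in Immuta with the ID okta (a hypothetical IAM ID), the cluster would set:

    IMMUTA_USER_MAPPING_IAMID=okta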

    "spark_env_vars.IMMUTA_SPARK_DATABRICKS_ALLOWED_IMPERSONATION_USERS": {
      "type": "fixed",
      "value": "[email protected],[email protected]"
    }

    Supported languages

    • Python

    • R (not supported for Databricks Runtime 14.3)

    • Scala (not supported for Databricks Runtime 14.3)

    • SQL

  • A Databricks cluster that is one of these supported compute types:

    • All-purpose compute

    • Job compute

  • Custom access mode

  • A Databricks workspace and cluster with the ability to directly make HTTP calls to the Immuta web service. The Immuta web service also must be able to connect to and perform queries on the Databricks cluster, and to call Databricks workspace APIs.

  • Avoid actions such as editing cluster policies or removing the Spark plugin from the cluster, all of which would cause the Spark plugin to stop working.
  • If Databricks Unity Catalog is enabled in a Databricks workspace, you must use an Immuta cluster policy when you set up the Databricks Spark integration to create an Immuta-enabled cluster. See the configure cluster policies section below for guidance.

  • If Databricks Unity Catalog is not enabled in your Databricks workspace, you must disable Unity Catalog in your Immuta tenant before proceeding with your configuration of Databricks Spark:

    1. Navigate to the App Settings page and click Integration Settings.

    2. Uncheck the Enable Unity Catalog checkbox.

    3. Click Save.

  • Navigate to the App Settings page and scroll to the Integration Settings section.

  • Click + Add Native Integration and select Databricks Spark Integration from the dropdown menu.

  • Complete the Hostname field.

  • Enter a Unique ID for the integration. The unique ID is used to name cluster policies clearly, which is important when managing several Databricks Spark integrations. As cluster policies are workspace-scoped, but multiple integrations might be made in one workspace, this ID lets you distinguish between different sets of cluster policies.

  • Select the identity manager that should be used when mapping the current Spark user to their corresponding identity in Immuta from the Immuta IAM dropdown menu. This should be set to reflect the identity manager you use in Immuta (such as Entra ID or Okta).

  • Choose an Access Model. The Protected until made available by policy option disallows reading and writing tables not protected by Immuta, whereas the Available until protected by policy option allows it.

  • Scratch paths
  • User impersonation (you can also prevent users from changing impersonation in a session)

  • Select your Databricks Runtime.

  • Use one of the two installation types described below to apply the policies to your cluster:

    • Automatically push cluster policies: This option allows you to automatically push the cluster policies to the configured Databricks workspace. This will overwrite any cluster policy templates previously applied to this workspace.

      1. Select the Automatically Push Cluster Policies radio button.

      2. Enter your Admin Token. This token must be for a user who has the required Databricks privileges. This will give Immuta temporary permission to push the cluster policies to the configured Databricks workspace and overwrite any cluster policy templates previously applied to the workspace.

      3. Click Apply Policies.

    • Manually push cluster policies: Enabling this option allows you to manually push the cluster policies and the init script to the configured Databricks workspace.

      1. Select the Manually Push Cluster Policies radio button.

      2. Click Download Init Script and set the Immuta plugin init script as a cluster-scoped init script in Databricks by following the Databricks documentation.

  • Click Close, and then click Save and Confirm.

  • Apply the cluster policy generated by Immuta to the cluster with the Spark plugin installed by following the Databricks documentation.

  • CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables

  • The MODIFY privilege is not required for materialized views registered as Immuta data sources, since MODIFY is not a supported privilege on that object type in Databricks.

  • The privilege must be granted on all current and future securables in the catalog or schema. The service principal also inherits MANAGE from the parent catalog for the purpose of applying row filters and column masks, but that privilege must be set directly on the parent catalog in order for grants to be fully applied.

  • USE SCHEMA on the system.access schema
  • SELECT on the following system tables:

    • system.access.audit

    • system.access.table_lineage

    • system.access.column_lineage

    Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT privileges on the system schemas to the service principal (see the Databricks documentation); a sketch of these grants follows this list. The system.access schema must also be enabled on the metastore before it can be used.

  • Enable verbose audit logs in Unity Catalog.
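    A minimal sketch of these grants, run by a user who is both a metastore admin and an account admin, assuming the service principal is named immuta-service-principal (a hypothetical name):

    -- USE CATALOG on the system catalog may also be required in your workspace
    GRANT USE SCHEMA ON SCHEMA system.access TO `immuta-service-principal`;
    GRANT SELECT ON TABLE system.access.audit TO `immuta-service-principal`;
    GRANT SELECT ON TABLE system.access.table_lineage TO `immuta-service-principal`;
    GRANT SELECT ON TABLE system.access.column_lineage TO `immuta-service-principal`;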

  • Host: The hostname of your Databricks workspace.

  • Port: Your Databricks port.

  • HTTP Path: The HTTP path of your Databricks cluster or SQL warehouse.

  • Immuta Catalog: The name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.

  • Display Name: The display name represents the unique name of your connection and will be used as a prefix in the name for all data objects associated with this connection. It will also appear as the display name in the UI and will be used in all API calls made to update or delete the connection.

  • Click Next.

  • Select your authentication method from the dropdown:

    • Access Token: Enter the Access Token in the Immuta System Account Credentials section. This is the access token for the Immuta service principal, which can be an on-behalf token created in Databricks. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the connection to continue to function. This authentication information will be included in the script populated later on the page.

    • OAuth M2M:

      • AWS Databricks:

        • Follow the Databricks documentation to create a service principal for the Immuta integration and assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.cloud.databricks.com/oidc/v1/token.

        • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier, and is the client ID displayed in Databricks when creating the client secret for the service principal.

        • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

        • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.

      • Azure Databricks:

        • Follow the Databricks documentation to create a service principal within Azure and then add that service principal to your Databricks account and workspace.

        • Assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Within Databricks, create an OAuth client secret for the service principal. This completes your Databricks-based service principal setup.

        • Within Immuta, fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.azuredatabricks.net/oidc/v1/token.

        • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier and is the client ID displayed in Databricks when creating the client secret for the service principal (note that Azure Databricks uses the Azure SP Client ID; it will be identical).

        • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

        • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.

  • Copy the provided script and run it in Databricks as a user with the privileges listed in the requirements section.

  • Click Validate Connection.

  • If the connection is successful, click Next. If there are any errors, check the connection details and credentials to ensure they are correct and try again.

  • Ensure all the details are correct in the summary and click Complete Setup.

    A data owner registers Databricks securables in Immuta, creating a data source for each of those securables. The data source metadata is stored in the Immuta Metadata Database so that it can be referenced in policy definitions.

    The image below illustrates what happens when a data owner registers the Accounts, Claims, and Customers securables in Immuta.

    Users who are subscribed to the data source in Immuta can then query the corresponding securable directly in their Databricks notebook or workspace.

    Authentication methods

    See the Installation and compliance page for details about the authentication methods supported for registering data.

    Schema monitoring

    When schema monitoring is enabled, Immuta monitors your servers to detect when new tables or columns are created or deleted, and automatically registers (or disables) those tables in Immuta. These newly updated data sources will then have any global policies and tags that are set in Immuta applied to them. The Immuta data dictionary will be updated with any column changes, and the Immuta environment will be in sync with your data environment.

    For Databricks Spark, the automatic schema monitoring job is disabled because of the ephemeral nature of Databricks clusters. Immuta requires you to download a schema detection job template (a Python script) and import that into your Databricks workspace.

    See the Register a Databricks data source guide for instructions on enabling schema monitoring.

    Ephemeral overrides

    In Immuta, a Databricks data source is considered ephemeral, meaning that the compute resources associated with that data source will not always be available.

    Ephemeral data sources allow the use of ephemeral overrides, user-specific connection parameter overrides that are applied to Immuta metadata operations.

    When a user runs a Spark job in Databricks, the Immuta plugin submits ephemeral overrides for that user to Immuta. Consequently, subsequent metadata operations for that user will use the current cluster as compute.

    See the Ephemeral overrides page for more details about ephemeral overrides and how to configure or disable them.

    Ephemeral override requests

    The Spark plugin has the capability to send ephemeral override requests to Immuta. These requests are distinct from ephemeral overrides themselves. Ephemeral overrides cannot be turned off, but the Spark plugin can be configured to not send ephemeral override requests.

    Tag ingestion

    Tags can be used in Immuta in a variety of ways:

    • Use tags for global subscription or data policies that will apply to all data sources in the organization. In doing this, company-wide data security restrictions can be controlled by the administrators and governors, while the users and data owners need only to worry about tagging the data correctly.

    • Generate Immuta reports from tags for insider threat surveillance or data access monitoring.

    • Filter search results with tags in the Immuta UI.

    The Databricks Spark integration cannot ingest tags from Databricks, but you can connect any of these supported external catalogs to work with your integration.

    You can also manage tags in Immuta by manually adding tags to your data sources and columns. Alternatively, you can use identification to automatically tag your sensitive data.

    Protecting data

    Immuta allows you to author subscription and data policies to automate access controls on your Databricks data.

    • Subscription policies: After registering data sources in Immuta, you can control who has access to specific securables in Databricks through Immuta subscription policies or by manually adding users to the data source. Data users will only see the immuta database with no tables until they are granted access to those tables as Immuta data sources. See the Subscription policy access types page for a list of policy types supported.

    • Data policies: You can create data policies to apply fine-grained access controls (such as restricting rows or masking columns) to manage what users can see in each table after they are subscribed to a data source. See the Data policy types page for details about specific types of data policies supported.

    The image below illustrates how Immuta enforces a subscription policy that only allows users in the Analysts group to access the yellow-table.

    See the Automate data access control decisions page for details about the benefits of using Immuta subscription and data policies.

    Policy enforcement in Databricks

    Once a Databricks user who is subscribed to the data source in Immuta queries the corresponding securable directly in their workspace, Spark Analysis initiates and the following events take place:

    1. Spark calls down to the Metastore to get table metadata.

    2. Immuta intercepts the call to retrieve table metadata from the Metastore.

    3. Immuta modifies the Logical Plan to enforce policies that apply to that user.

    4. Immuta wraps the Physical Plan with specific Java classes to signal to the Security Manager that it is a trusted node and is allowed to scan raw data.

    5. The Physical Plan is applied and filters out and transforms raw data coming back to the user.

    6. The user sees policy-enforced data.

    The image below illustrates what happens when an Immuta user who is subscribed to the Customers data source queries the securable in Databricks.

    Users who can read raw tables on-cluster

    Regardless of the policies on the data source, the users will be able to read raw data on the cluster if they meet one of the criteria listed below:

    • Databricks administrator is tied to an Immuta account

    • A Databricks user is listed as an ignored user (Users can be specified in the IMMUTA_SPARK_ACL_ALLOWLIST Spark environment variable to become ignored users.)
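    For example, ignored users can be listed in the cluster configuration like this (hypothetical account names):

    IMMUTA_SPARK_ACL_ALLOWLIST=service.account@example.com,databricks.admin@example.com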

    Protected and unprotected tables

    Generally, Immuta prevents users from seeing data unless they are explicitly given access, which blocks access to raw sources in the underlying databases.

    Databricks non-admin users will only see sources to which they are subscribed in Immuta, and this can present problems if organizations have a data lake full of non-sensitive data and Immuta removes access to all of it. To address this challenge, Immuta allows administrators to change this default setting when configuring the integration so that Immuta users can access securables that are not registered as a data source. Although this is similar to how privileged users in Databricks operate, non-privileged users cannot bypass Immuta controls.

    See the Customizing the integration guide for details about this setting.

    Restricting users' access to data with Immuta projects

    Immuta projects combine users and data sources under a common purpose. Sometimes this purpose is simply for a single user to organize their data sources or to control an entire schema of data sources through a single project screen; most often, however, the project represents an Immuta purpose for which the data has been approved to be used, restricting access to data and streamlining team collaboration. Consequently, data owners can restrict access to data for a specified purpose through projects.

    When a user is working within the context of a project, they will only see the data in that project. This helps to prevent data leaks when users collaborate. Users can switch project contexts to access various data sources while acting under the appropriate purpose.

    When users change project contexts (either through the Immuta UI or with project UDFs), queries reflect users as acting under the purposes of that project, which may allow additional access to data if there are purpose restrictions on the data source(s). This process also allows organizations to track not just whether a specific data source is being used, but why.

    See the Customizing the integration page for details about how to prevent users from switching project contexts in a session.

    Project workspaces

    Users can have additional write access in their integration using project workspaces. Users can integrate a single or multiple workspaces with a single Immuta tenant.

    See the Project workspaces page for more details.

    Immuta intercepts Spark calls to the Metastore. Immuta then modifies the logical plan so that policies are applied to the data for the querying user.
    Snowflake column lineage allows you to:

  • trace data back to its source to validate the integrity of dashboards and reports,
  • identify who performed write operations to meet compliance requirements,

  • evaluate data quality and pinpoint points of failure, and

  • tag sensitive data on source tables without having tag columns on their descendant tables.

    However, tagging sensitive data doesn’t innately protect that data in Snowflake; users need Immuta to disseminate these lineage tags automatically to descendant tables registered in Immuta so data stewards can build policies using the semantic and business context captured by those tags to restrict access to sensitive data. When Snowflake lineage tag propagation is enabled, Immuta propagates tags applied to a data source to its descendant data source columns in Immuta, which keeps your data inventory in Immuta up-to-date and allows you to protect your data with policies without having to manually tag every new Snowflake data source you register in Immuta.

    Data flow

    1. An application administrator enables the feature on the Immuta app settings page.

    2. Snowflake lineage metadata (column names and tags) for the Snowflake tables is stored in the metadata database.

    3. A data owner creates a new data source (or adds a new column to a Snowflake table) that initiates a job that applies all tags for each column from its ancestor columns.

    4. A data owner or governor adds a tag to a column in Immuta that has descendants, which initiates a job that propagates the tag to all descendants.

    5. An audit record is created that includes which tags were applied and from which columns those tags originated.

    Snowflake access history view and Immuta lineage job

    The Snowflake Account Usage ACCESS_HISTORY view contains column lineage information.

    To appropriately propagate tags to descendant data sources, Immuta fetches Access History metadata to determine what column tags have been updated, stores this metadata in the Immuta metadata database, and then applies those tags to relevant descendant columns of tables registered in Immuta.

    Consider the following example using the Customer, Customer 2, and Customer 3 tables that were all registered in Immuta as data sources.

    • Customer: source table

    • Customer 2: descendant of Customer

    • Customer 3: descendant of Customer 2

    If the Discovered.Electronic Mail Address tag is added to the Customer data source in Immuta, that tag will propagate through lineage to the Customer 2 and Customer 3 data sources.

    Data source registration

    After an application administrator has enabled Snowflake lineage tag propagation, data owners can register data in Immuta and have tags in Snowflake propagated from ancestor tables to descendant data sources. Whenever new tags are added to those tables in Immuta, those upstream tags will propagate to descendant data sources.

    By default, all tags are propagated, but these tags can be filtered on the app settings page or using the Immuta API.

    Managing tags

    Lineage tag propagation works with any tag added to the data dictionary. Tags can be manually added, synced from an external catalog, or discovered by identification.

    Consider the following example using the tables that were all registered in Immuta as data sources:

    Data source | Parent table | Tag applied | Application type
    Customer | None, source table | Discovered.Electronic Mail Address | Manually applied
    Customer 2 | Customer | Discovered.Electronic Mail Address | Propagated through lineage
    Customer 3 | Customer 2 | Discovered.Electronic Mail Address | Propagated through lineage

    Immuta added the Discovered.Electronic Mail Address tag to the Customer data source, and that tag propagated through lineage to the Customer 2 and Customer 3 data sources.

    Deleting tags

    Removing the tag from the Customer 2 table soft deletes it from the Customer 2 data source.

    When a tag is deleted, downstream lineage tags are removed, unless another parent data source still has that tag. The tag remains visible, but it will not be re-added if a future propagation event specifies the same tag again. Immuta prevents you from removing Snowflake object tags from data sources. You can only remove Immuta-managed tags. To remove Snowflake object tags from tables, you must remove them in Snowflake.

    However, the Discovered.Electronic Mail Address tag still applies to the Customer 3 data source because Customer still has the tag applied.

    Data source | Parent table | Tag applied | Application type
    Customer | None, source table | Discovered.Electronic Mail Address | Manually applied
    Customer 2 | Customer | None, manually removed | Manually removed
    Customer 3 | Customer 2 | Discovered.Electronic Mail Address | Propagated through lineage

    The only way a tag will be removed from descendant data sources is if no other ancestor of the descendant still prescribes the tag.

    If the Snowflake lineage tag propagation feature is disabled, tags will remain on Immuta data sources.

    Identification

    Identification will still run on data sources and can be manually triggered. Tags applied through identification will propagate as tags added through lineage to descendant Immuta data sources.

    Snowflake lineage audit

    Immuta audit records include Snowflake lineage tag events when a tag is added or removed.

    The example audit record below illustrates the SNOWFLAKE_TAGS.pii tag successfully propagating from the Customer table to Customer 2:

    Limitations

    • Without tableFilter set, Immuta will ingest lineage for every table on the Snowflake instance.

    • Tag propagation based on lineage is not retroactive. For example, if you add a table, add tags to that table, and then run the lineage ingestion job, tags will not get propagated. However, if you add a table, run the lineage ingestion job, and then add tags to the table, the tags will get propagated.

    • The lineage job needs to pull in lineage data before any tag is applied in Immuta. When Immuta gets new lineage information from Snowflake, Immuta does not update existing tags in Immuta.

    • There can be up to a 3-hour delay in Snowflake for a lineage event to make it into the ACCESS_HISTORY view.

    • Immuta does not ingest lineage information for views.

    • Snowflake only captures lineage events for CTAS, CLONE, MERGE, and INSERT write operations. Snowflake does not capture lineage events for DROP, RENAME, ADD, or SWAP. Instead of using these latter operations, you need to recreate a table with the same name if you need to make changes.

    • Immuta cannot enforce coherence of your Snowflake lineage. If a column, table, or schema in the middle of the lineage graph gets dropped, Immuta will not do anything unless a table with that same name gets recreated. This means a table that gets dropped but not recreated could live in Immuta’s system indefinitely.

    Navigate to the App Settings page and click the Integration tab.

  • Click +Add Integration and select Snowflake from the dropdown menu.

  • Complete the Host, Port, and Default Warehouse fields.

  • Enable Query Audit.

  • Enable Lineage and complete the following fields:

    • Ingest Batch Sizes: This setting configures the number of rows Immuta ingests per batch when streaming Access History data from your Snowflake instance.

    • Table Filter: This filter determines which tables Immuta will ingest lineage for. Enter a regular expression that excludes / from the beginning and end to filter tables. Without this filter, Immuta will attempt to ingest lineage for every table on your Snowflake instance.

    • Tag Filter: This filter determines which tags to propagate using lineage. Enter a regular expression that excludes / from the beginning and end to filter tags. Without this filter, Immuta will ingest lineage for every tag on your Snowflake instance.

  • Select Manual or Automatic Setup and follow the steps in this guide to configure the Snowflake integration

  • Trigger Snowflake lineage sync job

    Prerequisite

    Authenticate with the Immuta API.

    Trigger the lineage job

    The Snowflake lineage sync endpoint triggers the lineage ingestion job that allows Immuta to propagate Snowflake tags added through lineage to Immuta data sources.

    1. Copy the example and replace the Immuta URL and API key with your own.

    2. Change the payload attribute values to your own, where

      • tableFilter (string): This regular expression determines which tables Immuta will ingest lineage for. Enter a regular expression that excludes / from the beginning and end to filter tables. Without this filter, Immuta will attempt to ingest lineage for every table on your Snowflake instance.

      • batchSize (integer): This parameter configures the number of rows Immuta ingests per batch when streaming Access History data from your Snowflake instance. Minimum 1.

      • lastTimestamp (string): Setting this parameter will only return lineage events later than the value provided. Use a format like 2022-06-29T09:47:06.012-07:00.

    Next steps

    Once the sync job is complete, you can complete the following steps:

    • Register Snowflake data sources

    • Build policies

    Register Starburst (Trino) data sources: This will register your data objects into Immuta and allow you to start dictating access through global policies.

  • Organize your data sources into domains and assign domain permissions to accountable teams: Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and identification.

  • 2

    Register your users

    These guides provide instructions on getting your users set up in Immuta.

    1. Connect an IAM: Bring the IAM your organization already uses and allow Immuta to register your users for you.

    2. Map external user IDs from Starburst (Trino) to Immuta: Ensure the user IDs in Immuta, Starburst (Trino), and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta.

    1. Connect an external catalog: Bring the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

    2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain data patterns.

    4

    Start using the Governance app

    These guides provide instructions on using the Governance app for the first time.

    1. Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

    2. Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and identification applied tags you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    3. Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from user queries, policy changes, and tagging updates.

  • Install the library on your cluster (for example, from DBFS, S3, or Maven). Alternatively, use the Databricks libraries API.
  • In the Databricks Clusters UI, add the IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS property as a Spark environment variable and set it to your artifact's URI. To specify more than one trusted library, comma delimit the URIs:

  • For Maven artifacts, the URI is maven:/<maven_coordinates>, where <maven_coordinates> is the Coordinates field found when clicking on the installed artifact on the Libraries tab in the Databricks Clusters UI. Here's an example of an installed artifact:

    In this example, you would add the following Spark environment variable:

    IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS=maven:/com.github.immuta.hadoop.immuta-spark-third-party-maven-lib-test:2020-11-17-144644

  • For jar artifacts, the URI is the Source field found when clicking on the installed artifact on the Libraries tab in the Databricks Clusters UI. For artifacts installed from DBFS or S3, this ends up being the original URI to your artifact. For uploaded artifacts, Databricks will rename your .jar and put it in a directory in DBFS. Here's an example of an installed artifact:

    In this example, you would add the following Spark environment variable:

    1. Restart the cluster.

    2. Once the cluster is up, execute a command in a notebook. If the trusted library installation is successful, you should see driver log messages like this:

    Managed Public Cloud

    This is a guide on how to deploy Immuta on Kubernetes in the following managed public cloud providers:

    • Amazon Web Services (AWS)

    • Microsoft Azure

    • Google Cloud Platform (GCP)

    Prerequisites

    The following managed services must be provisioned and running before proceeding. For further assistance, consult the documentation for your respective cloud provider.

    Feature availability

    If deployed without Elasticsearch/OpenSearch, several core services and features will be unavailable. See the deployment requirements for details.

    • PostgreSQL (for example, Amazon RDS for PostgreSQL, Azure Database for PostgreSQL, or Google Cloud SQL for PostgreSQL)

    • (Optional) Elasticsearch/OpenSearch (for example, Amazon OpenSearch or Elastic Cloud)

    Checklist

    This checklist outlines the necessary prerequisites for successfully deploying Immuta.

    Credentials

    PostgreSQL

    Elasticsearch

    Setup

    Helm

    Authenticate with OCI registry

    Kubernetes

    Creating a dedicated namespace ensures a logically isolated environment for your Immuta deployment, preventing resource conflicts with other applications.

    Create namespace

    1. Create a Kubernetes namespace named immuta.

    2. Switch to namespace immuta. All subsequent kubectl commands will default to this namespace.

    Create registry secret

    Create a container registry pull secret. Your credentials to authenticate with ocir.immuta.com can be viewed in your user profile at support.immuta.com.

    PostgreSQL

    Connecting a client

    There are numerous ways to connect to a PostgreSQL database. This step demonstrates how to connect with psql by creating an ephemeral Kubernetes pod.

    Connect to the database

    Connect to the database as an admin (e.g., postgres) by creating an ephemeral container inside the Kubernetes cluster. A shell prompt will not be displayed after executing the kubectl run command outlined below. Wait 5 seconds, and then proceed by entering a password.

    Create role

    Temporal's upgrade mechanism utilizes SQL command CREATE EXTENSION when managing database schema changes. However, in cloud-managed PostgreSQL offerings, this command is typically restricted to roles with elevated privileges to protect the database and maintain the stability of the cloud environment.

    To ensure Temporal can successfully manage its schema, an administrator role must be granted temporarily. The role name varies depending on the cloud-managed service:

    • Amazon RDS: rds_superuser

    • Azure Database: azure_pg_admin

    • Google Cloud SQL: cloudsqlsuperuser

    1. Create the immuta role.

    2. Grant administrator privileges to the immuta role. Upon successfully completing this installation guide, you can optionally revoke this role grant.

    3. Grant the immuta role to the current user. Upon successfully completing this installation guide, you can optionally revoke this role grant.

    Create databases

    1. Create databases.

    2. Grant role immuta additional privileges. Refer to the PostgreSQL documentation for further details on database roles and privileges.

    3. Configure the immuta database.

    4. Configure the temporal database.

    5. Configure the temporal_visibility database.

    6. Exit the interactive prompt. Type \q, and then press Enter.

    Install Immuta

    This section demonstrates how to deploy Immuta using the Immuta Enterprise Helm chart once the prerequisite cloud-managed services are configured.

    Feature availability

    If deployed without Elasticsearch/OpenSearch, several core services and features will be unavailable. See the deployment requirements for details.

    Audit record retention

    Immuta defaults to keeping audit records for 7 days. To change this duration, set the following values in the immuta-values.yaml file. The example below configures audit records to be kept for 90 days:

    1. Create a file named immuta-values.yaml with the example content shown later in this section, making sure to update all placeholder values.

    Avoid these special characters in generated passwords

    whitespace, $, &, :, \, /, '

    1. Deploy Immuta.

    2. Wait for all pods to become ready.

    Validation

    This section helps you validate your Immuta installation by temporarily accessing the application locally. However, this access is limited to your own computer. To enable access for other devices, you must proceed with configuring Ingress as outlined in the Next steps section.

    1. Determine the name of the Secure service.

    2. Listen on local port 8080, forwarding TCP traffic to the Secure service's port named http.

    3. In a web browser, navigate to localhost:8080 to ensure the Immuta application loads.

    4. Press Control+C to stop port forwarding.

    Next steps

    • Configure Ingress for EKS, AKS, or GKE, depending on your cloud provider (required).

    • Configure TLS certificates for your cluster.

    • Learn about best practices for running Immuta in production.

    Azure Synapse Analytics Integration

    This page describes the Azure Synapse Analytics integration, through which Immuta applies policies directly in Azure Synapse Analytics. For a tutorial on configuring Azure Synapse Analytics see the Azure Synapse Integration page.

    Overview

    The Azure Synapse Analytics integration is a policy push integration that allows Immuta to apply policies directly in Azure Synapse Analytics Dedicated SQL pools without the need for users to go through a proxy. Instead, users can work within their existing Synapse Studio and have per-user policies dynamically applied at query time.

    Architecture

    This integration works on a per-Dedicated-SQL-pool basis: all of Immuta's policy definitions and user entitlements data need to be in the same pool as the target data sources because Dedicated SQL pools do not support cross-database joins. Immuta creates schemas inside the configured Dedicated SQL pool that contain policy-enforced views that users query.

    When the integration is configured, the Application Admin specifies the

    • Immuta database: This is the pre-existing database Immuta uses. Immuta will create views from the tables contained in this database, and all schemas and views created by Immuta will exist in this database, such as the immuta_system, immuta_functions, and immuta_procedures schemas that contain the tables, views, UDFs, and stored procedures that support the integration.

    • Immuta schema: The schema that Immuta manages. All views generated by Immuta for tables registered as data sources will be created in this schema.

    • User profile delimiters: Since Azure Synapse Analytics Dedicated SQL pools do not support array or hash objects, certain user access information is stored as delimited strings; the Application Admin can modify those delimiters to ensure they do not conflict with possible characters in strings.

    For a tutorial on configuring the integration, see the Azure Synapse Integration page.

    Data source naming convention

    Synapse data sources are represented as views and are under one schema instead of a database, so their view names are a combination of their schema and table name, separated by an underscore.

    For example, with a configuration that uses IMMUTA as the schema in the database dedicated_pool, the view name for the data source dedicated_pool.tpc.case would be dedicated_pool.IMMUTA.tpc_case.
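    For example, a subscribed user could query that view directly in Synapse Studio (a sketch using the names from above):

    SELECT TOP 10 * FROM dedicated_pool.IMMUTA.tpc_case;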

    You can see the view information on the data source details page under Connection Information.

    Policy enforcement

    This integration uses webhooks to keep views up-to-date with the corresponding Immuta data sources. When a data source or policy is created, updated, or disabled, a webhook is called that creates, modifies, or deletes the dynamic view in the Immuta schema. Note that only standard views are available because Azure Synapse Analytics Dedicated SQL pools do not support secure views.

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations are available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

    Data flow

    1. An Immuta Application Administrator configures the Azure Synapse Analytics integration, registering their initial Synapse Dedicated SQL pool with Immuta.

    2. Immuta creates Immuta schemas inside the configured Synapse Dedicated SQL pool.

    3. A Data Owner registers Azure Synapse Analytics tables in Immuta as data sources. A Data Owner, Data Governor, or Administrator creates or changes a policy or user in Immuta.

    4. Data source metadata, tags, user metadata, and policy definitions are stored in Immuta's Metadata Database.

    5. The Immuta Web Service calls a stored procedure that modifies the user entitlements or policies and updates data source view definitions as necessary.

    6. An Azure Synapse Analytics user who is subscribed to the data source in Immuta queries the corresponding data source view in Azure Synapse Analytics and sees policy-enforced data.

    Redshift Pre-Configuration Details

    This page describes the Redshift integration, configuration options, and features. For a tutorial to enable this integration, see the installation guide.

    Feature Availability

    Project Workspaces | Tag Ingestion | User Impersonation | Query Audit | Multiple Integrations
    ❌ | ❌ | ✅ | ❌ | ✅

    Prerequisite

    For automated installations, the credentials provided must be a Superuser or have the ability to create databases and users and modify grants.

    Supported Features

    • Redshift datashares

    • Redshift Serverless

    • For configuration and data source registration instructions, see the configuration page.

    Authentication Methods

    The Redshift integration supports the following authentication methods to configure the integration and create data sources:

    • Username and Password: Users can authenticate with their Redshift username and password.

    • AWS Access Key: Users can authenticate with an AWS access key.

    Tag Ingestion

    Immuta cannot ingest tags from Redshift, but you can connect any of these supported external catalogs to work with your integration.

    User Impersonation

    Required Redshift privileges

    Setup User:

    • OWNERSHIP ON GROUP IMMUTA_IMPERSONATOR_ROLE

    • CREATE GROUP

    Immuta System Account:

    • GRANT EXECUTE ON PROCEDURE grant_impersonation

    • GRANT EXECUTE ON PROCEDURE revoke_impersonation

    Impersonation allows users to query data as another Immuta user in Redshift. To enable user impersonation, see the User Impersonation page.

    Multiple Integrations

    Users can enable multiple Redshift integrations with a single Immuta tenant.

    Redshift Limitations

    • The host of the data source must match the host of the connection for the view to be created.

    • When using multiple Redshift integrations, a user has to have the same user account across all hosts.

    • Case sensitivity of database, table, and column identifiers is not supported. The enable_case_sensitive_identifier parameter must be set to false (the default setting) for your Redshift cluster to configure the integration and register data sources; a quick check is sketched below.
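    A quick way to check the current session value (a sketch; the parameter can also be managed through the cluster parameter group):

    SHOW enable_case_sensitive_identifier;   -- expected value: false (the default)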

    Python UDF Specific Limitations

    For most policy types in Redshift, Immuta uses SQL clauses to implement enforcement logic; however Immuta uses Python UDFs in the Redshift integration to implement the following masking policies:

    • Masking using a regular expression

    • Reversible masking

    • Format-preserving masking

    • Randomized response

    The number of Python UDFs that can run concurrently per Redshift cluster is limited to one-fourth of the total concurrency level for the cluster. For example, if the Redshift cluster is configured with a concurrency of 15, a maximum of three Python UDFs can run concurrently. After the limit is reached, Python UDFs are queued for execution within workload management queues.

    The SVL_QUERY_QUEUE_INFO view in Redshift, which is visible to a Redshift superuser, summarizes details for queries that spent time in a workload management (WLM) query queue. Queries must be completed in order to appear as results in the SVL_QUERY_QUEUE_INFO view.
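    A sketch of how a Redshift superuser might look for recently queued queries in that view:

    SELECT *
    FROM svl_query_queue_info
    ORDER BY queue_start_time DESC
    LIMIT 20;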

    If you find that queries on Immuta-built views are spending time in the workload management (WLM) query queue, you should either edit your Redshift cluster configuration to increase concurrency, or use fewer of the masking policies which leverage Python UDFs. For more information on increasing concurrency, see the Redshift docs on implementing workload management.

    Enable Snowflake Table Grants

    1. Navigate to the App Settings page.

    2. Scroll to the Global Integrations Settings section.

    3. Opt to change the Role Prefix. Snowflake table grants creates a new Snowflake role for each Immuta user. To ensure these Snowflake role names do not collide with existing Snowflake roles, each Snowflake role created for Snowflake table grants requires a common prefix. When using multiple Immuta accounts within a single Snowflake account, the Snowflake table grants role prefix should be unique for each Immuta account. The prefix must adhere to Snowflake identifier requirements and be less than 50 characters. Once the configuration is saved, the prefix cannot be modified; however, the Snowflake table grants feature can be disabled and re-enabled to change the prefix.

    4. Finish configuring your integration by following one of these guidelines:

      • New Snowflake integration: Set up a new Snowflake integration by following the configuration guide.

      • Existing Snowflake integration (automatic setup): You will be prompted to enter connection information for a Snowflake user. Immuta will execute the migration to Snowflake table grants using a connection established with this Snowflake user. The Snowflake user you provide here must have Snowflake privileges to run these privilege grants.

      • Existing Snowflake integration (manual setup): Immuta will display a link to a migration script you must run in Snowflake and a link to a rollback script for use in the event of a failed migration. Important: Execute the migration script in Snowflake before clicking Save on the app settings page.

    Snowflake table grants private preview migration

    To migrate from the private preview version of Snowflake table grants (available before September 2022) to the generally available version of Snowflake table grants, follow the steps in the migration guide.

    Azure Synapse Analytics Pre-Configuration Details

    This page describes the Azure Synapse integration, configuration options, and features. See the Azure Synapse integration page for a tutorial on enabling the integration and these features through the app settings page.

    Feature Availability

    Project Workspaces | Tag Ingestion | User Impersonation | Query Audit | Multiple Integrations
    ❌ | ❌ | ✅ | ❌ | ✅

    Prerequisite

    A running Dedicated SQL pool

    Authentication Method

    Immuta only supports the SQL authentication option for Azure Synapse Analytics to configure the integration and create data sources. The Microsoft Entra authentication option is unsupported. See the SQL authentication in Azure Synapse Analytics documentation for details.

    Tag Ingestion

    Immuta cannot ingest tags from Synapse, but you can connect any of these supported external catalogs to work with your integration.

    User Impersonation

    Impersonation allows users to query data as another Immuta user in Synapse. To enable user impersonation, see the User Impersonation page.

    Multiple Integrations

    A user can configure multiple integrations of Synapse to a single Immuta tenant.

    Limitations

    • Immuta does not support the following masking types in this integration because of limitations with Dedicated SQL pools (linked below). Any column assigned one of these masking types will be masked to NULL:

      • Reversible Masking: Synapse UDFs currently only support SQL, but Immuta needs to execute code (such as JavaScript or Python) to support this masking feature. See the Synapse documentation for details.

      • Format Preserving Masking: Synapse UDFs currently only support SQL, but Immuta needs to execute code (such as JavaScript or Python) to support this masking feature. See the Synapse documentation for details.

      • Regex: The built-in string replace function does not support full regex. See the Synapse documentation for details.

    • The delimiters configured when enabling the integration cannot be changed once they are set. To change the delimiters, the integration has to be disabled and re-enabled.

    • If the generated view name is more than 128 characters, then the view name is shortened to 128 characters. This could cause collisions between view names if the shortened version is the same for two different data sources.

    • For proper updates, the Dedicated SQL pools have to be running when changes are made to users or data sources in Immuta.

    Run R and Scala spark-submit Jobs on Databricks

    This guide illustrates how to run R and Scala spark-submit jobs on Databricks, including prerequisites and caveats.

    R spark-submit

    Configure Redshift Spectrum

    Allow Immuta to create secure views of your external tables through one of these methods:

    • Use the existing database that contains the external tables: Instead of creating an immuta database that manages all schemas and views created when Redshift data is registered in Immuta, the integration adds the Immuta-managed schemas and views to an existing database in Redshift.

    • Create a new database and re-create all of your external tables in that database.

    For an overview of the integration, see the documentation.

    Getting Started with Redshift

    The how-to guides linked on this page illustrate how to integrate Redshift with Immuta. See the Redshift reference guide for information about the Redshift integration.

    Requirement: A Redshift cluster with an RA3 node is required for the multi-database integration. For other instance types, you may configure a single-database integration using one of the single-database configuration options.

    1

    Connect your technology

    These guides provide instructions on getting your data set up in Immuta.

    {
      "id": "c8e020cb-232c-4ba9-a0d8-f3a84ba6808d",
      "dateTime": "1670355170336",
      "month": 1475,
      "profileId": 1,
      "userId": "immuta_system_account",
      "dataSourceId": 2,
      "dataSourceName": "Customer 2",
      "count": 1,
      "recordType": "nativeLineageDataSourceTagUpdate",
      "success": true,
      "component": "dataSource",
      "extra": {
        "sourceColumn": {
          "nativeColumnName": "\"MY_DATABASE\".\"PUBLIC\".\"CUSTOMER\".\"C_FIRST_NAME\"",
          "dataSourceId": 1,
          "columnName": "c_first_name"
        },
        "dataSourceId": 2,
        "columnName": "c_first_name",
        "tagPropagationDirection": "downstream",
        "tags": [
          {
            "name": "SNOWFLAKE_TAGS.pii",
            "source": "immuta-us-east-1"
          }
        ]
      },
      "newAuditServiceFields": {
        "actorIp": null,
        "sessionId": null
      },
      "createdAt": "2022-12-06T19:32:50.372Z",
      "updatedAt": "2022-12-06T19:32:50.372Z"
    }
    IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS=maven:/my.group.id:my-package-id:1.2.3
    IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS=dbfs:/immuta/bstabile/jars/immuta-spark-third-party-lib-test.jar
    TrustedLibraryUtils: Successfully found all configured Immuta configured trusted libraries in Databricks.
    TrustedLibraryUtils: Wrote trusted libs file to [/databricks/immuta/immutaTrustedLibs.json]: true.
    TrustedLibraryUtils: Added trusted libs file with 1 entries to spark context.
    TrustedLibraryUtils: Trusted library installation complete.
    Click Download Policies, and then manually add this cluster policy to your Databricks workspace.
    1. Ensure that the init_scripts.0.workspace.destination in the policy matches the file path to the init script you configured above.

    2. The Immuta cluster policy references Databricks Secrets for several of the sensitive fields. These secrets must be manually created if the cluster policy is not automatically pushed. Use the Databricks API or CLI to push the proper secrets.











    curl -X 'POST' \
        'https://www.organization.immuta.com/lineage/ingest/snowflake' \
        -H 'accept: application/json' \
        -H 'Content-Type: application/json' \
        -H 'Authorization: 846e9e43c86a4ct1be14290d95127d13f' \
        -d '{
        "tableFilter": "MY_DATABASE\\MY_SCHEMA\\..*",
        "batchSize": 1,
        "lastTimestamp": "2022-06-29T09:47:06.012-07:00"
        }'
    \c temporal_visibility
    GRANT CREATE ON SCHEMA public TO immuta;
    CREATE EXTENSION btree_gin;
    echo <token> | helm registry login --password-stdin --username <username> ocir.immuta.com
    kubectl create namespace immuta
    kubectl config set-context --current --namespace=immuta
    kubectl create secret docker-registry immuta-oci-registry \
        --docker-server=https://ocir.immuta.com \
        --docker-username="<username>" \
        --docker-password="<token>" \
        --docker-email=<email>
    kubectl run pgclient \
        --stdin \
        --tty \
        --rm \
        --image docker.io/bitnami/postgresql -- \
        psql --host <postgres-fqdn> --username <postgres-admin> --dbname postgres --port 5432 --password
    CREATE ROLE immuta with LOGIN ENCRYPTED PASSWORD '<postgres-password>';
    ALTER ROLE immuta SET search_path TO bometadata,public;
    GRANT <admin-role> TO immuta;
    GRANT immuta TO CURRENT_USER;
    CREATE DATABASE immuta OWNER immuta;
    CREATE DATABASE temporal OWNER immuta;
    CREATE DATABASE temporal_visibility OWNER immuta;
    GRANT ALL ON DATABASE immuta TO immuta;
    GRANT ALL ON DATABASE temporal TO immuta;
    GRANT ALL ON DATABASE temporal_visibility TO immuta;
    \c immuta
    CREATE EXTENSION pgcrypto;
    audit:
      deployment:
          extraEnvVars:
            - name: AUDIT_RETENTION_POLICY_IN_DAYS
              value: "90"
    immuta-values.yaml
    global:
      imageRegistry: ocir.immuta.com
      imagePullSecrets:
        - name: immuta-oci-registry
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        username: immuta
        password: <postgres-password>
    
    audit:
      config:
        elasticsearchEndpoint: <elasticsearch-endpoint>
        searchAuthenticationType: <'UsernamePassword' or 'AWS'>
      # If you use OpenSearch and authenticate with username and password, uncomment the lines below by deleting the hash symbols
        #elasticsearchUsername: <elasticsearch-username>
        #elasticsearchPassword: <elasticsearch-password>
      # If you use OpenSearch and authenticate with AWS role, uncomment the lines below by deleting the hash symbols
        #searchAwsRegion: '<deployment-OS-region>'
      # If Immuta is deployed in an AWS account that is different than OpenSearch, then you must configure a trust relationship between the Immuta role and an OpenSearch role 
        #searchAwsRoleArn: '<assumed-role-arn>'
      postgresql:
        database: immuta
    
    secure:
      ingress:
        enabled: false
      postgresql:
        database: immuta
        ssl: true
    
    temporal:
      enabled: true
      schema:
        createDatabase:
          enabled: false
      server:
        config:
          persistence:
            default:
              sql:
                database: temporal
                tls: 
                  enabled: true
            visibility:
              sql:
                database: temporal_visibility
                tls:
                  enabled: true
    immuta-values.yaml
    global:
      imageRegistry: ocir.immuta.com
      imagePullSecrets:
        - name: immuta-oci-registry
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        username: immuta
        password: <postgres-password>
    
    audit:
      enabled: false
    
    secure:
      ingress:
        enabled: false
      postgresql:
        database: immuta
        ssl: true
      extraEnvVars:
        - name: FeatureFlag_AuditService
          value: "false"
        
    temporal:
      enabled: true
      schema:
        createDatabase:
          enabled: false
      server:
        config:
          persistence:
            default:
              sql:
                database: temporal
                tls: 
                  enabled: true
            visibility:
              sql:
                database: temporal_visibility
                tls:
                  enabled: true
    helm install immuta oci://ocir.immuta.com/stable/immuta-enterprise \
        --values immuta-values.yaml \
        --version 2025.1.9
    kubectl wait --for=condition=Ready pods --all
    kubectl get service --selector "app.kubernetes.io/component=secure" --output name
    kubectl port-forward <service-name> 8080:http
    \c temporal
    GRANT CREATE ON SCHEMA public TO immuta;
    Prerequisites

    Before you can run spark-submit jobs on Databricks, complete the following steps.

    1. Initialize the Spark session:

      1. Enter these settings into the R submit script to allow the R script to access Immuta data sources, scratch paths, and workspace tables: immuta.spark.acl.assume.not.privileged="true" and spark.hadoop.immuta.databricks.config.update.service.enabled="false".

      2. Once the script is written, upload the script to a location in dbfs/S3/ABFS to give the Databricks cluster access to it.

    2. Because of how some user properties are populated in Databricks, load the SparkR library in a separate cell before attempting to use any SparkR functions.
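    A minimal sketch of the top of such an R submit script, using the SparkR API and purely illustrative names, might look like the following:

    library(SparkR)

    # Start the Spark session with the Immuta settings described above
    sparkR.session(sparkConfig = list(
      "immuta.spark.acl.assume.not.privileged" = "true",
      "spark.hadoop.immuta.databricks.config.update.service.enabled" = "false"
    ))

    # Query an Immuta data source (hypothetical name)
    df <- sql("SELECT * FROM immuta.my_schema_my_table")
    head(df)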

    Create the R spark submit Job

    To create the R spark-submit job,

    1. Go to the Databricks jobs page.

    2. Create a new job, and select Configure spark-submit.

    3. Set up the parameters:

      Note: The path dbfs:/path/to/script.R can be in S3 or ABFS (on Azure Databricks), assuming the cluster is configured with access to that path.

    4. Edit the cluster configuration, and change the Databricks Runtime to be a supported version.

    5. Configure the Environment Variables section as you normally would for an Immuta cluster.

    Scala spark-submit

    Prerequisites

    Before you can run spark-submit jobs on Databricks you must initialize the Spark session with the settings outlined below.

    1. Configure the Spark session with immuta.spark.acl.assume.not.privileged="true" and spark.hadoop.immuta.databricks.config.update.service.enabled="false".

      Note: Stop your Spark session (spark.stop()) at the end of your job or the cluster will not terminate.

    2. The spark submit job needs to be launched using a different classloader which will point at the designated user JARs directory. The following Scala template can be used to handle launching your submit code using a separate classloader:

    Create the Scala spark-submit Job

    To create the Scala spark-submit job,

    1. Build and upload your JAR to dbfs/S3/ABFS where the cluster has access to it.

    2. Select Configure spark-submit, and configure the parameters:

      Note: In the --class parameter, provide the fully qualified name of the class whose main function will be used as the entry point for your code.

      Note: The path dbfs:/path/to/code.jar can be in S3 or ABFS (on Azure Databricks) assuming the cluster is configured with access to that path.

    3. Edit the cluster configuration, and change the Databricks Runtime to a supported version.

    4. Include IMMUTA_INIT_ADDITIONAL_JARS_URI=dbfs:/path/to/code.jar in the "Environment Variables" (where dbfs:/path/to/code.jar is the path to your jar) so that the jar is uploaded to all the cluster nodes.

    Caveats

    • The user mapping works differently from notebooks because spark-submit clusters are not configured with access to the Databricks SCIM API. The cluster tags are read to get the cluster creator and match that user to an Immuta user.

    • Privileged users (Databricks admins and allowlisted users) must be tied to an Immuta user and given access through Immuta to access data through spark-submit jobs because the setting immuta.spark.acl.assume.not.privileged="true" is used.

    • There is an option of using the immuta.api.key setting with an Immuta API key generated on the Immuta profile page.

    • Currently when an API key is generated it invalidates the previous key. This can cause issues if a user is using multiple clusters in parallel, since each cluster will generate a new API key for that Immuta user. To avoid these issues, manually generate the API key in Immuta and set the immuta.api.key on all the clusters or use a specified job user for the submit job.
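    For example, the cluster's Spark configuration could include a line like the following (a sketch; the value is a placeholder for a key generated from an Immuta user profile page):

    immuta.api.key <your-immuta-api-key>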

    Requirements
    • A Redshift cluster with an AWS row-level security patch applied. Contact Immuta for guidance.

    • An AWS IAM role for Redshift that is associated with your Redshift cluster.

    • The enable_case_sensitive_identifier parameter must be set to false (default setting) for your Redshift cluster.

    • The Redshift role used to run the Immuta bootstrap script must have the following privileges when configuring the integration to

      • Use an existing database:

        • ALL PRIVILEGES ON DATABASE for the database you configure the integration with, as you must manage grants on that database.

        • CREATE USER

        • GRANT TEMP ON DATABASE

      • Create a new database:

        • CREATE DATABASE

        • CREATE USER

        • GRANT TEMP ON DATABASE

        • REVOKE ALL PRIVILEGES ON DATABASE

    • A Redshift database that contains an external schema and external tables.

    Use an existing database

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click the +Add Integration button and select Redshift from the dropdown menu.

    4. Complete the Host and Port fields.

    5. Enter the name of the database you created the external schema in as the Immuta Database. This database will store all secure schemas and Immuta-created views.

    6. Opt to check the Enable Impersonation box and customize the Impersonation Role name as needed. This will allow users to natively impersonate another user.

    7. Select Manual and download both of the bootstrap scripts from the Setup section. The specified role used to run the bootstrap needs to have the following privileges:

      • ALL PRIVILEGES ON DATABASE for the database you configure the integration with, as you must manage grants on that database.

      • CREATE USER

    8. Run the bootstrap script (Immuta database) in the Redshift database that contains the external schema.

    9. Choose your authentication method, and enter the credentials from the bootstrap script for the Immuta_System_Account.

    10. Click Save.

    Register data

    Register Redshift data in Immuta.

    Create a new Immuta database

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click the +Add Integration button and select Redshift from the dropdown menu.

    4. Complete the Host and Port fields.

    5. Enter an Immuta Database. This is a new database where all secure schemas and Immuta created views will be stored.

    6. Opt to check the Enable Impersonation box and customize the Impersonation Role name as needed. This will allow users to natively impersonate another user.

    7. Select Manual and download both of the bootstrap scripts from the Setup section. The specified role used to run the bootstrap needs to have the following privileges:

      • ALL PRIVILEGES ON DATABASE for the database you configure the integration with, as you must manage grants on that database.

      • CREATE DATABASE

    8. Run the bootstrap script (initial database) in the Redshift initial database.

    9. Run the bootstrap script (Immuta database) in the new Immuta Database in Redshift.

    10. Choose your authentication method, and enter the credentials from the bootstrap script for the Immuta_System_Account.

    11. Click Save.

    Then, add your external tables to the Immuta database.

    Register data

    Register Redshift data in Immuta.

  • Configure your Redshift integration: Configure a Redshift integration with Immuta so that Immuta can create policy-protected views for your users to query.
  • Register Redshift data sources: This will register your data objects into Immuta and allow you to start dictating access through global policies.

  • Organize your data sources into domains and assign domain permissions to accountable teams: Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies and identification.

  • 2

    Register your users

    These guides provide instructions on getting your users set up in Immuta.

    1. Connect an IAM: Bring the IAM your organization already uses and allow Immuta to register your users for you.

    2. Map external user IDs from Redshift to Immuta: Ensure the user IDs in Immuta, Redshift, and your IAM are aligned so that the right policies impact the right users.

    3

    Add data metadata

    These guides provide instructions on getting your data metadata set up in Immuta.

    1. Connect an external catalog: Bring the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.

    2. Run identification: Identification allows you to automate data tagging using identifiers that detect certain data patterns.

    4

    Start using the Governance app

    These guides provide instructions on using the Governance app for the first time.

    1. Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that utilize your tags and apply to your tables. Subscription policies can be created to dictate access to data sources.

    2. Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and identification applied tags you can create proactive policies, knowing that they will apply to data sources as they are added to Immuta with the automated tagging.

    3. Configure audit: Once you have your data sources and users, and policies granting them access, you can set up audit export. This will export the audit logs from policy changes and tagging updates.


    Installation and Compliance

    In the Databricks Spark integration, Immuta installs an Immuta-maintained Spark plugin on your Databricks cluster. When a user queries data that has been registered in Immuta as a data source, the plugin injects policy logic into the plan Spark builds so that the results returned to the user only include data that specific user should see.

    The sequence diagram below breaks down this process of events when an Immuta user queries data in Databricks.

    Immuta intercepts Spark calls to the Metastore. Immuta then modifies the logical plan so that policies are applied to the data for the querying user.

    System requirements

    • A Databricks workspace with the Premium tier, which includes cluster policies (required to configure the Spark integration)

    • A cluster that uses one of these supported Databricks Runtimes:

      • 11.3 LTS

      • 14.3 (private preview) - Requires Immuta version 2025.1.x or newer

    • Supported languages

      • Python

      • R (not supported for Databricks Runtime 14.3)

      • Scala (not supported for Databricks Runtime 14.3)

    • A Databricks cluster that is one of these supported compute types:

      • All-purpose compute

      • Job compute

    • Custom access mode

    • A Databricks workspace and cluster with the ability to directly make HTTP calls to the Immuta web service. The Immuta web service also must be able to connect to and perform queries on the Databricks cluster, and to call Databricks workspace APIs.

    • The Databricks Spark integration only works with Spark 3.

    What does Immuta do in my Databricks environment?

    When an administrator configures the Databricks Spark integration, Immuta generates a cluster policy that the administrator then applies to the Databricks cluster. When the cluster starts after the cluster policy has been applied, the init script that Immuta provides downloads the Spark plugin artifacts onto the cluster and puts them in the appropriate locations on local disk for use by Spark.

    Once the init script runs, the Spark application running on the Databricks cluster will have the appropriate artifacts on its CLASSPATH to use Immuta for authorization and policy enforcement.

    Immuta adds the following artifacts to your Databricks environment:

    Immuta-maintained Spark plugin

    The Databricks Spark integration injects this Immuta-maintained Spark plugin into the SparkSQL stack at cluster startup time. Policy determinations are obtained from the connected Immuta tenant and applied before returning results to the user. The plugin includes wrappers and Immuta analysis hook plan rewrites to enforce policies.

    Immuta Security Manager

    Note: The Security Manager is disabled for Databricks Runtime 14.3.

    The Immuta Security Manager ensures users can't perform unauthorized actions when using Scala and R, since those languages have features that allow users to circumvent policies without the Security Manager enabled. The Immuta Security Manager blocks users from executing code that could allow them to gain access to sensitive data by only allowing select code paths to access sensitive files and methods. These select code paths provide Immuta's code access to sensitive resources while blocking end users from these sensitive resources directly.

    Performance

    The Security Manager must inspect the call stack every time a permission check is triggered, which adds overhead to queries. To improve Immuta's query performance on Databricks, Immuta disables the Security Manager when Scala and R are not being used.

    The cluster init script checks the cluster’s configuration and automatically removes the Security Manager configuration when

    immuta database

    When a table is registered in Immuta as a data source, users can see that table in the native Databricks database and in the immuta database. This allows for an option to use a single database (immuta) for all tables.

    The immuta database on Immuta-enabled clusters allows Immuta to track Immuta-managed data sources separately from remote Databricks tables so that policies and other security features can be applied. However, Immuta supports raw tables in Databricks, so table-backed queries do not need to reference this database.

    When configuring a Databricks cluster, you can hide immuta from any calls to

    Once the Immuta-enabled cluster is running, the following user actions spur various processes. The list below provides an overview of each process:

    • Data source is registered: When a data owner registers a Databricks securable as a data source, the data source metadata (column type, securable name, column names, etc.) is retrieved from the Metastore and stored in the Immuta Metadata Database. If tags are then applied to the data source, Immuta stores this metadata in the Metadata Database as well.

    • Data source is deleted: When a data source is deleted, the data source metadata is deleted from the Metadata Database. Depending on the settings configured for the integration, users will either be able to query that data now that it is no longer registered in Immuta, or access to the securable will be revoked for all users. See the Protected and unprotected tables section for details about this setting.

    • Policy is created or edited on a data source: Information about the policy and the columns or securables it applies to is stored in the Metadata Database. When a user queries the data in Databricks, the Spark plugin retrieves the policy information, the user metadata, and the data source metadata from the Metadata Database and injects this information as policy logic into the Spark logical plan. Immuta caches policy information and data source definitions in memory on the Spark application to reduce calls to the Metadata Database and boost performance.

    The image below illustrates these processes and how they interact.

    Supported policies

    The Databricks Spark integration allows users to author subscription and data policies to enforce access controls. See the Subscription policy access types and Data policy types pages for details about the specific types of policies supported.

    Databricks Runtime 14.3

    Private preview: Support for this Databricks Runtime is available to select accounts. Contact your Immuta representative for details.

    Immuta supports clusters on Databricks Runtime 14.3. The integration for this Databricks Runtime differs from the integration for other supported Runtimes in the following ways:

    • Security Manager is disabled: The Security Manager is disabled for Databricks Runtime 14.3. Because the Security Manager is used to prevent users from circumventing access controls when using R and Scala, those languages are unsupported. Only Python and SQL clusters are supported.

    • Py4J security and process isolation automatically enabled: Immuta relies on Databricks process isolation and Py4J security to prevent user code from performing unauthorized actions. After selecting Runtime 14.3 during configuration, Immuta will automatically enable process isolation and Py4J security.

    • dbutils is unsupported: Immuta relies on Databricks process isolation and Py4J security to prevent user code from performing unauthorized actions. This means that dbutils is not supported for Databricks Spark integrations using Runtime 14.3.

    Cluster security and compliance

    Authentication methods

    The Databricks Spark integration supports the following authentication methods to configure the integration:

    • OAuth machine-to-machine (M2M): Immuta uses the Client Credentials Flow to integrate with Databricks OAuth machine-to-machine authentication, which allows Immuta to authenticate with Databricks using a client secret. Once Databricks verifies the Immuta service principal’s identity using the client secret, Immuta is granted a temporary OAuth token to perform token-based authentication in subsequent requests. When that token expires (after one hour), Immuta requests a new temporary token. See the Databricks OAuth machine-to-machine (M2M) authentication page for more details.

    • Personal access token (PAT): This token gives Immuta temporary permission to push the cluster policies to the configured Databricks workspace and overwrite any cluster policy templates previously applied to the workspace when configuring the integration or to register securables as Immuta data sources.

    Audit

    Immuta captures the code or query that triggers the Spark plan in Databricks, making audit records more useful in assessing what users are doing. To audit what triggers the Spark plan, Immuta hooks into Databricks where notebook cells and JDBC queries execute and saves the cell or query text. Then, Immuta pulls this information into the audits of the resulting Spark jobs.

    Immuta supports auditing all queries run on a Databricks cluster, regardless of whether users touch Immuta-protected data or not. To configure Immuta to do so, set the IMMUTA_SPARK_AUDIT_ALL_QUERIES environment variable in the Spark cluster configuration when configuring your integration, as shown below.
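    A sketch of the corresponding entry in the cluster's environment variables section:

    IMMUTA_SPARK_AUDIT_ALL_QUERIES=true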

    See the Security and compliance guide for more details about the audit capabilities in the Databricks Spark integration.

    Protecting the Immuta configuration

    Non-administrator users on an Immuta-enabled Databricks cluster must not have access to view or modify Immuta configuration or the immuta-spark-hive.jar file, as this poses a security loophole around Immuta policy enforcement. Databricks secrets allow you to securely apply environment variables to Immuta-enabled clusters.

    Databricks secrets can be used in the environment variables configuration section for a cluster by referencing the secret path instead of the actual value of the environment variable. For example, if a user wanted to make the MY_SECRET_ENV_VAR=abcd_1234 value secret, they could instead create a Databricks secret and reference it as the value of that variable by following these steps:

    1. Create the secret scope my_secrets and add a secret with the key my_secret_env_var containing the sensitive environment variable.

    2. Reference the secret in the environment variables section as MY_SECRET_ENV_VAR={{secrets/my_secrets/my_secret_env_var}}.

    At runtime, {{secrets/my_secrets/my_secret_env_var}} would be replaced with the actual value of the secret if the owner of the cluster has access to that secret.
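    As a sketch of steps 1 and 2 using the legacy Databricks CLI (your CLI version and exact syntax may differ), the scope and secret could be created like this:

    databricks secrets create-scope --scope my_secrets
    databricks secrets put --scope my_secrets --key my_secret_env_var --string-value "abcd_1234"

    The cluster's environment variables section would then contain MY_SECRET_ENV_VAR={{secrets/my_secrets/my_secret_env_var}} instead of the plaintext value.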

    Scala clusters

    There are limitations to isolation among users in Scala jobs on a Databricks cluster, even when using Immuta’s Security Manager. When data is broadcast, cached (spilled to disk), or otherwise saved to SPARK_LOCAL_DIR, it's impossible to distinguish between which user’s data is composed in each file/block. If you are concerned about this vulnerability, Immuta suggests that you

    • limit Scala clusters to Scala jobs only and

    • require equalized projects, which will force all users to act under the same set of attributes, groups, and purposes with respect to their data access. To require that Scala clusters be used in equalized projects and avoid the risk described above, set the IMMUTA_SPARK_REQUIRE_EQUALIZATION Spark environment variable to true, as shown below. Once this configuration is complete, users on the cluster will need to switch to an Immuta equalized project before running a job. Once the first job is run using that equalized project, all subsequent jobs, no matter the user, must also be run under that same equalized project. If you need to change a cluster's project, you must restart the cluster.
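    A sketch of the corresponding entry in the cluster's environment variables section:

    IMMUTA_SPARK_REQUIRE_EQUALIZATION=true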

    When data is read in Spark using an Immuta policy-enforced plan, the masking and redaction of rows is performed at the leaf level of the physical Spark plan, so a policy such as "Mask using hashing the column social_security_number for everyone" would be implemented as an expression on a project node right above the FileSourceScanExec/LeafExec node at the bottom of the plan. This process prevents raw data from being shuffled in a Spark application and, consequently, from ending up in SPARK_LOCAL_DIR.

    This policy implementation coupled with an equalized project guarantees that data being dropped into SPARK_LOCAL_DIR will have policies enforced and that those policies will be homogeneous for all users on the cluster. Since each user will have access to the same data, if they attempt to manually access other users' cached data, they will only see what they have access to via equalized permissions on the cluster. If project equalization is not turned on, users could dig through that directory and find data from another user with heightened access, which would result in a data leak.

    Troubleshooting the installation

    The Troubleshooting page has guidance for resolving issues with your installation.

    Snowflake Table Grants

    Snowflake table grants simplifies the management of privileges in Snowflake when using Immuta. Instead of having to manually grant users access to tables registered in Immuta, you allow Immuta to manage privileges on your Snowflake tables and views according to subscription policies. Then, users subscribed to a data source in Immuta can view and query the Snowflake table, while users who are not subscribed to the data source cannot view or query the Snowflake table.

    Snowflake privileges

    Enabling Snowflake table grants gives the following privileges to the Immuta Snowflake role:

    • MANAGE GRANTS ON ACCOUNT allows the Immuta Snowflake role to grant and revoke SELECT privileges on Snowflake tables and views that have been added as data sources in Immuta.

    • CREATE ROLE ON ACCOUNT allows for the creation of a Snowflake role for each user in Immuta, enabling fine-grained, attribute-based access controls to determine which tables are available to which individuals.

    Table grants role

    Since table privileges are granted to roles and not to users in Snowflake, Immuta's Snowflake table grants feature creates a new Snowflake role for each Immuta user. This design allows Immuta to manage table grants through fine-grained access controls that consider the individual attributes of users.

    Each Snowflake user with an Immuta account will be granted a role that Immuta manages. The naming convention for this role is <IMMUTA>_USER_<username>, where

    • <IMMUTA> is the prefix you specified when enabling the feature on the Immuta app settings page.

    • <username> is the user's Immuta username.

    Querying Snowflake tables managed by Immuta

    Users are granted access to each Snowflake table or view automatically when they are subscribed to the corresponding data source in Immuta.

    Users have two options for querying Snowflake tables that are managed by Immuta:

    • Use the role that Immuta creates and manages. (For example, USE ROLE IMMUTA_USER_<username>. See the section above for details about the role and name conventions.) If the current active primary role is used to query tables, USAGE on a Snowflake warehouse must be granted to the Immuta-managed Snowflake role for each user.

    • USE SECONDARY ROLES ALL, which allows users to use the privileges from all roles that they have been granted, including IMMUTA_USER_<username>, in addition to the current active primary role. Users may also set a value for DEFAULT_SECONDARY_ROLES as an object property on a Snowflake user. To learn more about primary roles and secondary roles in Snowflake, see the Snowflake documentation.
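    For illustration, a user whose role prefix is IMMUTA and whose Immuta username is alice (hypothetical values, as is the table name) could query with either approach:

    -- Option 1: switch to the Immuta-managed role
    USE ROLE IMMUTA_USER_alice;
    SELECT * FROM analytics.public.orders;

    -- Option 2: keep the current primary role and enable secondary roles
    USE SECONDARY ROLES ALL;
    SELECT * FROM analytics.public.orders;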

    Applying GRANTs and REVOKEs at scale

    Immuta uses an algorithm to determine the most optimal way to group users in a role hierarchy in order to optimize the number of GRANTs (or REVOKES) executed in Snowflake. This is done by determining the least amount of possible permutations of access across tables and users based on the policies in place; then, those become intermediate roles in the hierarchy that each user is added to, based on the intermediate roles they belong to.

    As an example, take the users below and the data sources they have access to. Granting every user access to their tables individually (the naive approach) would result in 37 grants:

    Conversely, using the Immuta algorithm, we can optimize the number of grants in the same scenario down to 29:

    It’s important to consider a few things here:

    1. If the permutations of access are small, there will be a huge optimization realized (very few intermediate roles). If every user has their own unique permutation of access, the optimization will be negligible (an intermediate role per user). It is most common that the number of permutations of access will be many multiples smaller than the actual user count, so there should be large optimizations. In other words, a much smaller number of intermediate roles and the number of total overall grants reduced, since the tables are granted to roles and roles to users.

    2. This only happens once up front. After that, changes are incremental based on policy changes and user attribute changes (smaller updates), unless there’s a policy that makes a sweeping change across all users. The addition of new users who have access becomes much more straightforward also due to the fact above. User’s access will be granted via the intermediate role, and, therefore, a lot of the work is front loaded in the intermediate role creation.

    Limitations

    • Project workspaces are not supported when Snowflake table grants is enabled.

    • If an Immuta tenant is connected to an external IAM and that external IAM has a username identical to another username in Immuta's built-in IAM, those users will have the same Snowflake role, leading both to see the same data.

    • Sometimes the role generated can contain special characters such as @ because it's based on the user name configured from your identity manager. Because of this, it is recommended that any code references to the Immuta-generated role be enclosed with double quotes.
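    For example, granting warehouse usage to a generated role that contains special characters might look like the following (hypothetical role and warehouse names):

    GRANT USAGE ON WAREHOUSE analytics_wh TO ROLE "IMMUTA_USER_first.last@example.com";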

    Delta Lake API

    Delta Lake API reference guide

    When using Delta Lake, the API does not go through the normal Spark execution path. This means that Immuta's Spark extensions do not provide protection for the API. To solve this issue and ensure that Immuta has control over what a user can access, the Delta Lake API is blocked.

    Spark SQL can be used instead to give the same functionality with all of Immuta's data protections.

    Requests

    Below is a table of the Delta Lake API with the Spark SQL that may be used instead.

    Delta Lake API
    Spark SQL

    See the Delta Lake documentation for a complete list of the Delta SQL commands.

    Merging tables in workspaces

    When a table is created in a project workspace, you can merge a different Immuta data source from that workspace into that table you created.

    1. Create a table in the project workspace.

    2. Create a temporary view of the Immuta data source you want to merge into that table.

    3. Use that temporary view as the data source you add to the project workspace.

    4. Run the MERGE INTO command, using the temporary view as the source (see the example below).

    Redshift Integration

    This page provides an overview of the Redshift integration in Immuta. For a tutorial detailing how to enable this integration, see the installation guide.

    Overview

    Redshift is a policy push integration that allows Immuta to apply policies directly in Redshift. This allows data analysts to query Redshift views directly instead of going through a proxy and have per-user policies dynamically applied at query time.

     [
     "--conf","spark.driver.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///databricks/immuta/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service",
     "--conf","spark.executor.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///databricks/immuta/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service",
     "--conf","spark.databricks.repl.allowedLanguages=python,sql,scala,r",
     "dbfs:/path/to/script.R",
     "arg1", "arg2", "..."
     ]
    package com.example.job
    
    import java.net.URLClassLoader
    import java.io.File
    
    import org.apache.spark.sql.SparkSession
    
    object ImmutaSparkSubmitExample {
    def main(args: Array[String]): Unit = {
        val jarDir = new File("/databricks/immuta/jars/")
        val urls = jarDir.listFiles.map(_.toURI.toURL)
    
        // Configure a new ClassLoader which will load jars from the additional jars directory
        val cl = new URLClassLoader(urls)
        val jobClass = cl.loadClass(classOf[ImmutaSparkSubmitExample].getName)
        val job = jobClass.newInstance
        jobClass.getMethod("runJob").invoke(job)
    }
    }
    
    class ImmutaSparkSubmitExample {
    
    def getSparkSession(): SparkSession = {
        SparkSession.builder()
        .appName("Example Spark Submit")
        .enableHiveSupport()
        .config("immuta.spark.acl.assume.not.privileged", "true")
        .config("spark.hadoop.immuta.databricks.config.update.service.enabled", "false")
        .getOrCreate()
    }
    
    def runJob(): Unit = {
        val spark = getSparkSession
        try {
        val df = spark.table("immuta.<YOUR DATASOURCE>")
    
        // Run Immuta Spark queries...
    
        } finally {
        spark.stop()
        }
    }
    }
     [
     "--conf","spark.driver.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///databricks/immuta/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service",
     "--conf","spark.executor.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///databricks/immuta/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service",
     "--conf","spark.databricks.repl.allowedLanguages=python,sql,scala,r",
     "--class","org.youorg.package.MainClass",
     "dbfs:/path/to/code.jar",
     "arg1", "arg2", "..."
     ]

    DeltaTable.convertToDelta

    CONVERT TO DELTA parquet.`/path/to/parquet/`

    DeltaTable.delete

    DELETE FROM [table_identifier delta.`/path/to/delta/`] WHERE condition

    DeltaTable.generate

    GENERATE symlink_format_manifest FOR TABLE [table_identifier delta.`/path/to/delta`]

    DeltaTable.history

    DESCRIBE HISTORY [table_identifier delta.`/path/to/delta`] (LIMIT x)

    DeltaTable.merge

    MERGE INTO

    DeltaTable.update

    UPDATE [table_identifier delta.`/path/to/delta/`] SET column = value WHERE (condition)

    DeltaTable.vacuum

    VACUUM [table_identifier delta.`/path/to/delta`]

  • spark.databricks.repl.allowedlanguages is a subset of {python, sql}

  • IMMUTA_SPARK_DATABRICKS_PY4J_STRICT_ENABLED is true

  • When the cluster is configured this way, Immuta can rely on Databricks' process isolation and Py4J security to prevent user code from performing unauthorized actions.

    Note: Immuta still expects the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions to be set and pointing at the Security Manager.

    Beyond disabling the Security Manager, Immuta will skip several startup tasks that are required to secure the cluster when Scala and R are configured, and fewer permission checks will occur on the Driver and Executors in the Databricks cluster, reducing overhead and improving performance.

    Caveats

    • There are still cases that require the Security Manager; in those instances, Immuta creates a fallback Security Manager to check the code path, so the IMMUTA_INIT_ALLOWED_CALLING_CLASSES_URI environment variable must always point to a valid calling class file.

    • Databricks’ dbutils is blocked by their Py4J security; therefore, it can’t be used to access scratch paths.

    SHOW DATABASES so that users are not confused or misled by that database. Hiding the database does not disable access to it. Queries can still be performed against tables in the immuta database using the Immuta-qualified table name (e.g., immuta.my_schema_my_table) regardless of whether or not this database is hidden.

    To hide the immuta database, use the following environment variable in the Spark cluster configuration when configuring your integration:

    Then, Immuta will not show this database when a SHOW DATABASES query is performed.

  • Policy is deleted: When a policy is deleted, the policy information is deleted from the Metadata Database. If users were granted access to the data source by that policy, their access is revoked.

  • Databricks user is mapped to Immuta: When a Databricks user is mapped to Immuta, their metadata is stored in the Metadata Database.

  • Databricks user queries data: When a user queries the data in Databricks, Immuta intercepts the call from Spark down to the Metastore. Then, the Immuta-maintained Spark plugin retrieves the policy information, the user metadata, and the data source metadata from the Metadata Database and injects this information as policy logic into the Spark logical plan. Once the physical plan is applied, Databricks returns policy-enforced data to the user.

  • Databricks Connect is unsupported: Databricks Connect is unsupported because Py4J security must be enabled to use it.

    MERGE INTO delta_native.target_native as target
    USING immuta_temp_view_data_source as source
    ON target.dr_number = source.dr_number
    WHEN MATCHED THEN
    UPDATE SET target.date_reported = source.date_reported
    IMMUTA_SPARK_SHOW_IMMUTA_DATABASE=false
    Architecture

    The Redshift integration will create views from the tables within the database specified when configured. Then, the user can choose the name for the schema where all the Immuta generated views will reside. Immuta will also create the schemas immuta_system, immuta_functions, and immuta_procedures to contain the tables, views, UDFs, and stored procedures that support the integration. Immuta then creates a system role and gives that system account the following privileges:

    • ALL PRIVILEGES ON DATABASE IMMUTA_DB

    • ALL PRIVILEGES ON ALL SCHEMAS IN DATABASE IMMUTA_DB

    • USAGE ON FUTURE PROCEDURES IN SCHEMA IMMUTA_DB.IMMUTA_PROCEDURES

    • USAGE ON LANGUAGE PLPYTHONU

    Additionally the PUBLIC role will be granted the following privileges:

    • USAGE ON DATABASE IMMUTA_DB

    • TEMP ON DATABASE IMMUTA_DB

    • USAGE ON SCHEMA IMMUTA_DB.IMMUTA_PROCEDURES

    • USAGE ON SCHEMA IMMUTA_DB.IMMUTA_FUNCTIONS

    • USAGE ON FUTURE FUNCTIONS IN SCHEMA IMMUTA_DB.IMMUTA_FUNCTIONS

    • USAGE ON SCHEMA IMMUTA_DB.IMMUTA_SYSTEM

    • SELECT ON TABLES TO public

    Integration type

    Immuta supports the Redshift integration as both multi-database and single-database integrations. In either integration type, Immuta supports a single integration with secure views in a single database per cluster.

    Multi-database integration

    If using a multi-database integration, you must use a Redshift cluster with an RA3 node because Immuta requires cross-database views.

    Single-database integration

    If using a single-database integration, all Redshift cluster types are supported. However, because cross-database queries are not supported in any types other than RA3, Immuta's views must exist in the same database as the raw tables. Consequently, the steps for configuring the integration for Redshift clusters with external tables differ slightly from those that don't have external tables. Allow Immuta to create secure views of your external tables through one of these methods:

    • configure the integration with an existing database that contains the external tables: Instead of creating an immuta database that manages all schemas and views created when Redshift data is registered in Immuta, the integration adds the Immuta-managed schemas and views to an existing database in Redshift.

    • configure the integration by creating a new immuta database and re-create all of your external tables in that database.

    Policy enforcement

    SQL statements are used to create all views, including a join to the secure view: immuta_system.user_profile. This secure view is a select from the immuta_system.profile table (which contains all Immuta users and their current groups, attributes, projects, and a list of valid tables they have access to) with a constraint immuta__userid = current_user() to ensure it only contains the profile row for the current user. The immuta_system.user_profile view is readable by all users, but will only display the data that corresponds to the user executing the query.
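    Conceptually, a generated view resembles the following. This is a simplified sketch with hypothetical object names, not the exact DDL Immuta generates:

    CREATE VIEW immuta_db.immuta_schema.my_table AS
    SELECT t.*
    FROM original_schema.my_table AS t
    JOIN immuta_system.user_profile AS p ON p.immuta__userid = current_user
    WHERE <row-level policy predicates for the current user>;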

    The Redshift integration uses webhooks to keep views up-to-date with Immuta data sources. When a data source or policy is created, updated, or disabled, a webhook will be called that will create, modify, or delete the dynamic view. The immuta_system.profile table is updated through webhooks when a user's groups or attributes change, they switch projects, they acknowledge a purpose, or when their data source access is approved or revoked. The profile table can only be read and updated by the Immuta system account.

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations are available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

    Data flow

    1. An Immuta Application Administrator configures the Redshift integration and registers Redshift warehouse and databases with Immuta.

    2. Immuta creates a database inside the configured Redshift ecosystem that contains Immuta policy definitions and user entitlements.

    3. A Data Owner registers Redshift tables in Immuta as data sources.

    4. A Data Owner, Data Governor, or Administrator creates or changes a policy or user in Immuta.

    5. Data source metadata, tags, user metadata, and policy definitions are stored in Immuta's Metadata Database.

    6. The Immuta Web Service calls a stored procedure that modifies the user entitlements or policies.

    7. A Redshift user who is subscribed to the data source in Immuta queries the corresponding table directly in Redshift through the immuta database and sees policy-enforced data.

    Redshift Spectrum

    Redshift Spectrum (Redshift external tables) allows Redshift users to query external data directly from files on Amazon S3. Because cross-database queries are not supported in Redshift Spectrum, Immuta's views must exist in the same database as the raw tables. Consequently, the steps for configuring the integration for Redshift clusters with external tables differ slightly from those that don't have external tables. Allow Immuta to create secure views of your external tables through one of these methods:

    • configure the integration with an existing database that contains the external tables: Instead of creating an immuta database that manages all schemas and views created when Redshift data is registered in Immuta, the integration adds the Immuta-managed schemas and views to an existing database in Redshift

    • configure the integration by creating a new immuta database and re-create all of your external tables in that database.

    Once the integration is configured, Data Owners must register Redshift Spectrum data sources using the Immuta CLI or V2 API.


    Red Hat OpenShift

    This is a guide on how to deploy Immuta on OpenShift.

    Prerequisites

    The following managed services must be provisioned and running before proceeding. For further assistance consult the recommendations table for your respective cloud provider.

    Feature availability

    If deployed without Elasticsearch/OpenSearch, several core services and features will be unavailable. See the deployment requirements for details.

    • PostgreSQL

    • (Optional) Elasticsearch/OpenSearch Service

    Checklist

    This checklist outlines the necessary prerequisites for successfully deploying Immuta.

    Credentials

    PostgreSQL

    Elasticsearch

    Setup

    Helm

    Authenticate with OCI registry

    Kubernetes

    Creating a dedicated namespace ensures a logically isolated environment for your Immuta deployment, preventing resource conflicts with other applications.

    Create project

    1. Create an OpenShift project named immuta.

    2. Get the UID range allocated to the project. Each running container's UID must fall within this range. This value will be referenced later on.

    3. Get the GID range allocated to the project. Each running container's GID must fall within this range. This value will be referenced later on.

    4. Switch to project immuta

    Create registry secret

    Create a container registry pull secret. Your credentials to authenticate with ocir.immuta.com can be viewed in your user profile at support.immuta.com.

    PostgreSQL

    Connecting a client

    There are numerous ways to connect to a PostgreSQL database. This step demonstrates how to connect with psql by creating an ephemeral Kubernetes pod.

    Connect to the database

    Connect to the database as an admin (e.g., postgres) by creating an ephemeral container inside the Kubernetes cluster. A shell prompt will not be displayed after executing the kubectl run command outlined below. Wait 5 seconds, and then proceed by entering a password.

    Create Role

    Temporal's upgrade mechanism utilizes SQL command CREATE EXTENSION when managing database schema changes. However, in cloud-managed PostgreSQL offerings, this command is typically restricted to roles with elevated privileges to protect the database and maintain the stability of the cloud environment.

    To ensure Temporal can successfully manage its schema, a pre-defined administrator role must be granted. The role name varies depending on the cloud-managed service:

    • Amazon RDS: rds_superuser

    • Azure Database: azure_pg_admin

    • Google Cloud SQL: cloudsqlsuperuser

    1. Create the immuta role.

    2. Grant administrator privileges to the immuta role. Upon successfully completing this installation guide, you can optionally revoke this role grant.
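    For example, on Amazon RDS the temporary grant could later be revoked with the following (a sketch; substitute the administrator role for your provider):

    REVOKE rds_superuser FROM immuta;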

    Create databases

    1. Create databases.

    2. Grant role immuta additional privileges. Refer to the PostgreSQL documentation for further details on database roles and privileges.

    3. Configure the immuta database.

    4. Configure the temporal database.

    Install Immuta

    This section demonstrates how to deploy Immuta using the Immuta Enterprise Helm chart once the prerequisite cloud-managed services are configured.

    Why disable Ingress?

    In OpenShift, Ingress resources are managed by OpenShift Routes. These routes provide a more integrated and streamlined way to handle external access to your applications. To avoid conflicts and ensure proper functionality, it's necessary to disable the pre-defined Ingress resource in the Helm chart.

    Feature availability

    If deployed without Elasticsearch/OpenSearch, several core services and features will be unavailable. See the deployment requirements for details.

    Audit record retention

    Immuta defaults to keeping audit records for 7 days. To change this duration, set the following values in the immuta-values.yaml file. The example below configures audit records to be kept for 90 days:

    1. Create a file named immuta-values.yaml with the above content, making sure to update all placeholder values.

    Avoid these special characters in generated passwords

    whitespace, $, &, :, \, /, '

    1. Deploy Immuta.

    2. Wait for all pods to become ready.

    Validation

    This section helps you validate your Immuta installation by temporarily accessing the application locally. However, this access is limited to your own computer. To enable access for other devices, you must proceed with configuring Ingress as outlined in the Next steps section.

    1. Determine the name of the Secure service.

    2. Listen on local port 8080, forwarding TCP traffic to the Secure service's port named http.

    3. In a web browser, navigate to localhost:8080 to ensure the Immuta application loads.
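    A sketch of steps 1 and 2 using the OpenShift CLI (kubectl works equally well here):

    oc get service --selector "app.kubernetes.io/component=secure" --output name
    oc port-forward <service-name> 8080:http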

    Next steps

    • Configure Ingress for OpenShift (required).

    • Learn about best practices for running Immuta in production.

    Snowflake Data Sharing

    Immuta is compatible with Snowflake Secure Data Sharing. Using both Immuta and Snowflake, organizations can share the policy-protected data of their Snowflake database with other Snowflake accounts with Immuta policies enforced in real time. This integration gives data consumers a live connection to the data and relieves data providers of the legal and technical burden of creating static data copies that leave their Snowflake environment.

    Requirements:

    • Snowflake Enterprise Edition or higher

    • Immuta's table grants feature

    Configuration

    This method requires that the data consumer account is registered as an Immuta user with the Snowflake user name equal to the consuming account.

    At that point, the user that represents the account being shared with can have the appropriate attributes and groups assigned to them, relevant to the data policies that need to be enforced. Once that user has access to the share in the consuming account (not managed by Immuta), they can query the share with the data policies from the producer account enforced because Immuta is treating that account as if they are a single user in Immuta.

    For a tutorial on this workflow, see the Using Snowflake Data Sharing page.

    Benefits

    Using Immuta with Snowflake Data Sharing allows the sharer to

    • Only need limited knowledge of the context or goals of the existing policies in place: Because the sharer is not editing or creating policies to share their data, they only need a limited knowledge of how the policies work. Their main responsibility is making sure they properly represent the attributes of the data consumer (the account being shared to).

    • Leave policies untouched.

    Configure a Snowflake Integration

    Deprecation notice

    Support for configuring the Snowflake integration using this legacy workflow has been deprecated. Instead, configure your integration and register your data using .

    Customize Read and Write Access Policies for Starburst (Trino)

    Private preview: Write policies are available to select accounts. Contact your Immuta representative to enable this feature.

    Requirements

    Phased Snowflake Onboarding

    While you're onboarding Snowflake data sources and designing policies, you don't want to disrupt your Snowflake users' existing workflows. Instead, you want to gradually onboard Immuta through a series of successive changes that will not impact your existing Snowflake users.

    A phased onboarding approach to configuring the Snowflake integration ensures that your users will not be immediately affected by changes as you add data sources and configure policies.

    Several features allow you to gradually onboard data sources and policies in Immuta:

    • No subscription policies are applied at the time of data registration: No policy is applied at registration time unless an existing global policy applies to it; the table is registered in Immuta and waits for a policy to be applied, if ever.

      There are several benefits to this design:

  • All existing roles maintain access to the data, and registration of the table or view with Immuta has zero impact on your data platform.

  • It gives you time to configure tags on the Immuta registered tables and views, either manually or through automatic means, such as Immuta’s identification, or an external catalog integration to include Snowflake tags.

  • It gives you time to assess and validate the sensitive data tags that were applied.

  • You can build only row and column controls with Immuta and let your existing roles manage table access instead of using Immuta subscription policies for table access.

  • Snowflake table grants coupled with Snowflake low row access policy mode: With these features enabled, Immuta manages access to tables (subscription policies) through GRANTs. This works by assigning each user their own unique role created by Immuta and all table access is managed using that single role.

    Without these two features enabled, Immuta uses a Snowflake row access policy (RAP) to manage table access. A RAP only allows users to access rows in the table if they were explicitly granted access through an Immuta subscription policy; otherwise, the user sees no rows. This behavior means all existing Snowflake roles lose access to the table contents until explicitly granted access through Immuta subscription policies. Essentially, roles outside of Immuta don't control access anymore.

    By using table grants and the low row access policy mode, users and roles outside Immuta continue to work.

    There are two benefits to this approach:

    • All pre-existing Snowflake roles retain access to the data until you explicitly revoke access (outside Immuta).

    • It provides a way to test that Immuta GRANTs are working without impacting production workloads.

  • Requirements

    The following configuration is required for phased Snowflake onboarding:

    • Impersonation is disabled

    • Project workspaces are disabled

    If either of these capabilities is necessary for your use case, you cannot do phased Snowflake onboarding as described below.

    Next

    See the Getting started page for step-by-step guidance to implement phased Snowflake onboarding.

  • Configure the temporal_visibility database.

  • Exit the interactive prompt. Type \q, and then press Enter.

  • Press Control+C to stop port forwarding.
    \c temporal_visibility
    GRANT CREATE ON SCHEMA public TO immuta;
    CREATE EXTENSION btree_gin;
    echo <token> | helm registry login --password-stdin --username <username> ocir.immuta.com
    oc new-project immuta
    oc get project immuta --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
    oc get project immuta --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
    oc create secret docker-registry immuta-oci-registry \
        --docker-server=https://ocir.immuta.com \
        --docker-username="<username>" \
        --docker-password="<token>" \
        [email protected]
    oc run pgclient \
        --stdin \
        --tty \
        --rm \
        --image docker.io/bitnami/postgresql -- \
        psql --host <postgres-fqdn> --username <postgres-admin> --dbname postgres --port 5432 --password
    CREATE ROLE immuta with LOGIN ENCRYPTED PASSWORD '<postgres-password>';
    ALTER ROLE immuta SET search_path TO bometadata,public;
    GRANT <admin-role> TO immuta;
    CREATE DATABASE immuta OWNER immuta;
    CREATE DATABASE temporal OWNER immuta;
    CREATE DATABASE temporal_visibility OWNER immuta;
    GRANT ALL ON DATABASE immuta TO immuta;
    GRANT ALL ON DATABASE temporal TO immuta;
    GRANT ALL ON DATABASE temporal_visibility TO immuta;
    \c immuta
    CREATE EXTENSION pgcrypto;
    audit:
      deployment:
        extraEnvVars:
          - name: AUDIT_RETENTION_POLICY_IN_DAYS
            value: "90"
    immuta-values.yaml
    global:
      imageRegistry: ocir.immuta.com
      imagePullSecrets:
        - name: immuta-oci-registry
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        username: immuta
        password: <postgres-password>
    
    audit:
      config:
        elasticsearchEndpoint: <elasticsearch-endpoint>
        searchAuthenticationType: <'UsernamePassword' or 'AWS'>
      # If you use OpenSearch and authenticate with username and password, uncomment the lines below by deleting the hash symbols
        #elasticsearchUsername: <elasticsearch-username>
        #elasticsearchPassword: <elasticsearch-password>
      # If you use OpenSearch and authenticate with AWS role, uncomment the lines below by deleting the hash symbols
        #searchAwsRegion: '<deployment-OS-region>'
      # If Immuta is deployed in an AWS account that is different than OpenSearch, then you must configure a trust relationship between the Immuta role and an OpenSearch role 
        #searchAwsRoleArn: '<assumed-role-arn>'
      postgresql:
        database: immuta 
    
      deployment:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
          
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
      worker:
        podSecurityContext:  
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
          
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL 
    
    discover:
      deployment:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
          
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
    
    secure:
      ingress:
        enabled: false
    
      postgresql:
        database: immuta
        ssl: false
    
      web:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
      
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
    
      backgroundWorker:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
          
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
      
    temporal:
      enabled: true
      server:
        podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
        config:
          persistence:
            default:
              sql:
                database: temporal
                tls:
                  enabled: true
            visibility:
              sql:
                database: temporal_visibility
                tls:
                  enabled: true
        frontend:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        history:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        matching:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        worker:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
      schema:
        createDatabase:
          enabled: false
        podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
      proxy:
        deployment:
          podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
          containerSecurityContext:
            enabled: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
    immuta-values.yaml
    global:
      imageRegistry: ocir.immuta.com
      imagePullSecrets:
        - name: immuta-oci-registry
      postgresql:
        host: <postgres-fqdn>
        port: 5432
        username: immuta
        password: <postgres-password>
    
    audit:
      enabled: false
    
    discover:
      deployment:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
    
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
    
    secure:
      ingress:
        enabled: false
    
      extraEnvVars:
        - name: FeatureFlag_AuditService
          value: "false"
    
      postgresql:
        database: immuta
        ssl: true
    
      web:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
    
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
    
      backgroundWorker:
        podSecurityContext:
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
          runAsUser: <user-id>
          # A number that is within the project range:
          #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
          runAsGroup: <group-id>
          seccompProfile:
            type: RuntimeDefault
    
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
    
    temporal:
      enabled: true
      server:
        podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
        config:
          persistence:
            default:
              sql:
                database: temporal
                tls:
                  enabled: true
            visibility:
              sql:
                database: temporal_visibility
                tls:
                  enabled: true
        frontend:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        history:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        matching:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
        worker:
          containerSecurityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
      schema:
        createDatabase:
          enabled: false
        podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
        containerSecurityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
      proxy:
        deployment:
          podSecurityContext:
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.uid-range"}}{{"\n"}}'
            runAsUser: <user-id>
            # A number that is within the project range:
            #   oc get project <project-name> --output template='{{index .metadata.annotations "openshift.io/sa.scc.supplemental-groups"}}{{"\n"}}'
            runAsGroup: <group-id>
            seccompProfile:
              type: RuntimeDefault
          containerSecurityContext:
            enabled: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
    helm install immuta oci://ocir.immuta.com/stable/immuta-enterprise \
        --values immuta-values.yaml \
        --version 2025.1.9
    oc wait --for=condition=Ready pods --all
    oc get service --selector "app.kubernetes.io/component=secure" --output name
    oc port-forward <service-name> 8080:http
    oc project immuta
    \c temporal
    GRANT CREATE ON SCHEMA public TO immuta;
    Permissions

    The permissions outlined in this section are the Snowflake privileges required for a basic configuration. See the Snowflake reference guide for a list of privileges necessary for additional features and settings.

    • APPLICATION_ADMIN Immuta permission

    • The Snowflake user running the installation script must have the following privileges:

      • CREATE DATABASE ON ACCOUNT WITH GRANT OPTION

      • CREATE ROLE ON ACCOUNT WITH GRANT OPTION

      • CREATE USER ON ACCOUNT WITH GRANT OPTION

      • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

      • APPLY MASKING POLICY ON ACCOUNT WITH GRANT OPTION

      • APPLY ROW ACCESS POLICY ON ACCOUNT WITH GRANT OPTION

    • The Snowflake user must have the following privileges on all securables:

      • USAGE on all databases and schemas with registered data sources

      • REFERENCES on all tables and views registered in Immuta

      • SELECT on all tables and views registered in Immuta

    Different accounts

    The setup account used to enable the integration must be different from the account used to register data sources in Immuta.

    Configure the integration

    Snowflake resource names: Use uppercase for the names of the Snowflake resources you create below.

    1. Click the App Settings icon in the navigation panel.

    2. Click the Integrations tab.

    3. Click the +Add Integration button and select Snowflake from the dropdown menu.

    4. Complete the Host, Port, and Default Warehouse fields.

    5. Opt to check the Enable Project Workspace box. This will allow for managed write access within Snowflake. Note: Project workspaces still use Snowflake views, so the default role of the account used to create the data sources in the project must be added to the Excepted Roles List. This option is unavailable when table grants is enabled.

    6. Opt to check the Enable Impersonation box and customize the Impersonation Role to allow users to natively impersonate another user. You cannot edit this choice after you configure the integration.

    7. Snowflake query audit is enabled by default.

      1. Configure the audit frequency by scrolling to Integrations Settings and finding the Snowflake Audit Sync Schedule section.

      2. Enter how often, in hours, you want Immuta to ingest audit events from Snowflake as an integer between 1 and 24.

      3. Continue with your integration configuration.

    Select your configuration method

    Altering parameters in Snowflake at the account level may cause unexpected behavior of the Snowflake integration in Immuta

    The QUOTED_IDENTIFIERS_IGNORE_CASE parameter must be set to false (the default setting in Snowflake) at the account level. Changing this value to true causes unexpected behavior of the Snowflake integration.
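
    As a quick check before configuring the integration, you can confirm the account-level value. This is a sketch only, assuming a role that can view and set account parameters (for example, ACCOUNTADMIN):

    SHOW PARAMETERS LIKE 'QUOTED_IDENTIFIERS_IGNORE_CASE' IN ACCOUNT;
    -- Only run the following if the value was changed from the default:
    ALTER ACCOUNT SET QUOTED_IDENTIFIERS_IGNORE_CASE = FALSE;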

    You have two options for configuring your Snowflake environment:

    • Automatic setup: Grant Immuta one-time use of credentials to automatically configure your Snowflake environment and the integration.

    • Manual setup: Run the Immuta script in your Snowflake environment yourself to configure your Snowflake environment and the integration.

    Automatic setup

    Required permissions: When performing an automatic setup, the credentials provided must have the permissions listed above.

    The setup will use the provided credentials to create a user called IMMUTA_SYSTEM_ACCOUNT and grant the following privileges to that user:

    • CREATE ROLE ON ACCOUNT WITH GRANT OPTION

    • APPLY MASKING POLICY ON ACCOUNT WITH GRANT OPTION

    • APPLY ROW ACCESS POLICY ON ACCOUNT WITH GRANT OPTION

    • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

    Alternatively, you can use the manual setup method and edit the provided script to grant the Immuta system account OWNERSHIP on the objects that Immuta will secure, instead of granting MANAGE GRANTS ON ACCOUNT. The current role that has OWNERSHIP on the securables will need to be granted to the Immuta system role. However, if granting OWNERSHIP instead of MANAGE GRANTS ON ACCOUNT, Immuta will not be able to manage the role that is granted to the account, so it is recommended to run the script as-is, without changes.

    These credentials will be used to create and configure a new IMMUTA database within the specified Snowflake instance. The credentials are not stored or saved by Immuta, and Immuta doesn’t retain access to them after initial setup is complete.

    You can create a new account for Immuta to use that has these privileges, or you can grant temporary use of a pre-existing account. By default, the pre-existing account with appropriate privileges is ACCOUNTADMIN. If you create a new account, it can be deleted after initial setup is complete.
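
    For reference, the privileges above correspond to Snowflake statements along these lines. This is an illustrative sketch only (the IMMUTA_SYSTEM role name and user attributes are placeholders); the bootstrap script generated by Immuta is the authoritative source:

    CREATE USER IMMUTA_SYSTEM_ACCOUNT;
    CREATE ROLE IMMUTA_SYSTEM;
    GRANT ROLE IMMUTA_SYSTEM TO USER IMMUTA_SYSTEM_ACCOUNT;
    GRANT CREATE ROLE ON ACCOUNT TO ROLE IMMUTA_SYSTEM WITH GRANT OPTION;
    GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE IMMUTA_SYSTEM WITH GRANT OPTION;
    GRANT APPLY ROW ACCESS POLICY ON ACCOUNT TO ROLE IMMUTA_SYSTEM WITH GRANT OPTION;
    GRANT MANAGE GRANTS ON ACCOUNT TO ROLE IMMUTA_SYSTEM WITH GRANT OPTION;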

    From the Select Authentication Method Dropdown, select one of the following authentication methods:

    • Username and Password (Not recommended): Complete the Username, Password, and Role fields.

    • Key Pair Authentication:

      1. Complete the Username field. This user must be assigned the public key in Snowflake.

      2. When using an encrypted private key, enter the private key file password in the Additional Connection String Options. Use the following format: PRIV_KEY_FILE_PWD=<your_pw>

      3. Click Key Pair (Required), and upload a Snowflake private key pair file.

      4. Complete the Role field.

    Manual setup

    Required permissions: When performing a manual setup, the Snowflake user running the script must have the permissions listed above.

    The script will create a user called IMMUTA_SYSTEM_ACCOUNT and grant the following privileges to that user:

    • CREATE ROLE ON ACCOUNT WITH GRANT OPTION

    • APPLY MASKING POLICY ON ACCOUNT WITH GRANT OPTION

    • APPLY ROW ACCESS POLICY ON ACCOUNT WITH GRANT OPTION

    • MANAGE GRANTS ON ACCOUNT WITH GRANT OPTION

    Alternatively, you can grant the Immuta system account OWNERSHIP on the objects that Immuta will secure, instead of granting MANAGE GRANTS ON ACCOUNT. The current role that has OWNERSHIP on the securables will need to be granted to the Immuta system role. However, if granting OWNERSHIP instead of MANAGE GRANTS ON ACCOUNT, Immuta will not be able to manage the role that is granted to the account, so it is recommended to run the script as-is, without changes.

    Run the script

    1. Select Manual.

    2. Use the Dropdown Menu to select your Authentication Method:

      • Username and password (Not recommended): Enter the Username and Password and set them in the bootstrap script for the Immuta system account credentials.

      • Key Pair Authentication: Upload the Key Pair file and, when using an encrypted private key, enter the private key file password in the Additional Connection String Options. Use the following format: PRIV_KEY_FILE_PWD=<your_pw>

      • Snowflake External OAuth:

        1. Create a security integration for your Snowflake External OAuth. Note that if you have an existing security integration, the Immuta system role must be added to the existing EXTERNAL_OAUTH_ALLOWED_ROLES_LIST. The Immuta system role will be the Immuta database provided above with _SYSTEM appended. If you used the default database name, it will be IMMUTA_SYSTEM.

        2. Fill out the Token Endpoint. This is where the generated token is sent.
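
        3. Fill out the Client ID. This is the subject of the generated token.

        4. Select the method Immuta will use to obtain an access token:

          • Certificate

            1. Keep the Use Certificate checkbox enabled.

            2. Opt to fill out the Resource field with a URI of the resource where the requested token will be used.

            3. Enter the x509 Certificate Thumbprint. This identifies the corresponding key to the token and is often abbreviated as `x5t` or is called `sub` (Subject).

            4. Upload the PEM Certificate, which is the client certificate that is used to sign the authorization request.

          • Client secret

            1. Uncheck the Use Certificate checkbox.

            2. Enter the Scope (string). The scope limits the operations and roles allowed in Snowflake by the access token. See the OAuth 2.0 scopes documentation for details about scopes.

            3. Enter the Client Secret (string). Immuta uses this secret to authenticate with the authorization server when it requests a token.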

    3. In the Setup section, click bootstrap script to download the script. Then, fill out the appropriate fields and run the bootstrap script in Snowflake.

    Select available warehouses (optional)

    If you enabled Snowflake project workspaces, use the Warehouses dropdown menu to select the warehouses that will be available to project owners when they create Snowflake workspaces. The list shows all warehouses available to the privileged account entered above. Note that any warehouse accessible by the PUBLIC role does not need to be explicitly added.

    Select excepted roles and users

    Enter the Excepted Roles/User List. Each role or username (both case-sensitive) in this list should be separated by a comma. Wildcards are unsupported.

    Excepted roles/users will have no policies applied to queries

    Any user with the username or acting under the role in this list will have no policies applied to them when querying Immuta protected Snowflake tables in Snowflake. Therefore, this list should be used for service or system accounts and the default role of the account used to create the data sources in the Immuta projects (if you have Snowflake workspace enabled).

    Save the configuration

    Click Save.

    Opt to enable Snowflake tag ingestion

    To allow Immuta to automatically import table and column tags from Snowflake, enable Snowflake tag ingestion in the external catalog section of the Immuta app settings page.

    Requirements:

    • A configured Snowflake integration or connection

    • The Snowflake user configuring the Snowflake tag ingestion must have the following privileges and should be able to access all securables registered as data sources (see the example grants after this list):

      • IMPORTED PRIVILEGES ON DATABASE snowflake

      • APPLY TAG ON ACCOUNT
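
    For illustration only, these requirements could be satisfied with grants along these lines, assuming a dedicated role named CATALOG_ROLE for the cataloging user (a placeholder; adapt it to your environment):

      GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE CATALOG_ROLE;
      GRANT APPLY TAG ON ACCOUNT TO ROLE CATALOG_ROLE;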

    1. Navigate to the App Settings page.

    2. Scroll to 2 External Catalogs, and click Add Catalog.

    3. Enter a Display Name and select Snowflake from the dropdown menu.

    4. Enter the Account.

    5. Enter the Authentication information: Username, Password, Port, Default Warehouse, and Role.

    6. Opt to enter the Proxy Host, Proxy Port, and Encrypted Key File Passphrase.

    7. Opt to Upload Certificates.

    8. Click the Test Connection button.

    9. Click the Test Data Source Link.

    10. Once both tests are successful, click Save.

    Register data

    Register Snowflake data in Immuta.


  • Starburst (Trino) version 438 or newer

  • Write policies for Starburst (Trino) enabled. Contact your Immuta representative to get this feature enabled on your account.

  • Configuration options

    In its default setting, the Starburst (Trino) integration's write access value controls the authorization of SQL operations that perform data modification (such as INSERT, UPDATE, DELETE, MERGE, and TRUNCATE). However, administrators can allow table modification operations (such as ALTER and DROP tables) to be authorized as write operations. Two locations allow administrators to specify how read and write access policies are applied to data in Starburst (Trino). Select one or both of the options below to customize these settings. If the access-control.properties file is used, it may override the policies configured in the Immuta web service.

    • Immuta web service: Configure write policies in the Immuta web service to allow all Starburst (Trino) clusters targeting that Immuta tenant to receive the same write policy configuration for data sources. This configuration will only affect tables or views registered as Immuta data sources.

    • Starburst (Trino) cluster: Configure write policies using the access-control.properties file in Starburst or Trino to broadly customize access for Immuta users on a specific cluster. This configuration file takes precedence over write policies passed from the Immuta web service. Use this option if all Immuta users should have the same level of access to tables regardless of the write policy setting in the Immuta web service.

    Immuta web service configuration

    Contact your Immuta representative to configure read and write access in the Immuta web service if all Starburst (Trino) data source operations should be affected identically across Starburst (Trino) clusters connected to your Immuta tenant. A configuration example is provided below.

    Configuration example

    The following example maps WRITE to the READ, WRITE, and OWN permissions, and READ to just READ. Both the READ and WRITE mappings should always include READ:
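
    accessGrantMapping:
      WRITE: ['READ', 'WRITE', 'OWN']
      READ: ['READ']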

    Given the above configuration, when a user gets write access to a Starburst (Trino) data source, they will have both data and table modification permissions on that data source. See the Starburst (Trino) privileges section of the Subscription policy access types guide for details about these operations.

    Starburst cluster configuration

    Configure the integration to allow read and write policies to apply to any data source (registered or unregistered in Immuta) on a Starburst cluster.

    1. Create the Immuta access control configuration file in the Starburst configuration directory (/etc/starburst/immuta-access-control.properties for Docker installations or <starburst_install_directory>/etc/immuta-access-control.properties for standalone installations).

    2. Modify one or both properties below to customize the behavior of read or write access policies for all users:

      • immuta.allowed.immuta.datasource.operations: This property governs objects (catalogs, schemas, tables, etc.) that are registered as data sources in Immuta. These permissions apply to all querying users except for administrators defined in immuta.user.admin (who get all permissions).

        • READ: Grants SELECT on tables or views; grants SHOW on tables, views, or columns

        • WRITE: Grants INSERT, UPDATE, DELETE, MERGE, or TRUNCATE on tables; grants REFRESH on materialized views.

        • OWN: Grants ALTER and DROP on tables; grants SET on comments and properties

      • immuta.allowed.non.immuta.datasource.operations: This property governs objects (catalogs, schemas, tables, etc.) that are not registered as data sources in Immuta. Use all or a combination of the following access values:

        • READ: Grants SELECT on tables or views; grants SHOW on tables, views, or columns

        • WRITE: Grants INSERT, UPDATE, DELETE, MERGE, or TRUNCATE on tables; grants REFRESH on materialized views.

        • OWN: Grants ALTER and DROP on tables; grants SET on comments and properties

        • CREATE: Grants CREATE on catalogs, schema, tables, and views. This is the only property that can allow CREATE permissions, since CREATE is enforced on new objects that do not exist in Starburst or Immuta yet (such as a new table being created with CREATE TABLE).

      For example, the following configuration allows READ, WRITE, and OWN operations to be authorized on data sources registered in Immuta and all operations are permitted on data that is not registered in Immuta:
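
      immuta.allowed.immuta.datasource.operations=READ,WRITE,OWN
      immuta.allowed.non.immuta.datasource.operations=READ,WRITE,CREATE,OWN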

    3. Enable the Immuta access control plugin in the Starburst cluster's configuration file (/etc/starburst/config.properties for Docker installations or <starburst_install_directory>/etc/config.properties for standalone installations). For example,
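
      access-control.config-files=/etc/starburst/immuta-access-control.properties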

    Trino cluster configuration

    1. Create the Immuta access control configuration file in the Trino configuration directory (/etc/trino/immuta-access-control.properties for Docker installations or <trino_install_directory>/etc/immuta-access-control.properties for standalone installations).

    2. Modify one or both properties below to customize the behavior of read or write access policies for all users:

      • immuta.allowed.immuta.datasource.operations: This property governs objects (catalogs, schemas, tables, etc.) that are registered as data sources in Immuta. These permissions apply to all querying users except for administrators defined in immuta.user.admin (who get all permissions).

        • READ: Grants SELECT on tables or views; grants SHOW on tables, views, or columns

        • WRITE: Grants INSERT, UPDATE, DELETE, MERGE, or TRUNCATE on tables; grants REFRESH on materialized views.

        • OWN: Grants ALTER and DROP on tables; grants SET on comments and properties

      • immuta.allowed.non.immuta.datasource.operations: This property governs objects (catalogs, schemas, tables, etc.) that are not registered as data sources in Immuta. Use all or a combination of the following access values:

        • READ: Grants SELECT on tables or views; grants SHOW on tables, views, or columns

        • WRITE: Grants INSERT, UPDATE, DELETE, MERGE, or TRUNCATE on tables; grants REFRESH on materialized views.

        • OWN: Grants ALTER and DROP on tables; grants SET on comments and properties

        • CREATE: Grants CREATE on catalogs, schema, tables, and views. This is the only property that can allow CREATE permissions, since CREATE is enforced on new objects that do not exist in Starburst or Immuta yet (such as a new table being created with CREATE TABLE).

      For example, the following configuration allows READ, WRITE, and OWN operations to be authorized on data sources registered in Immuta and all operations are permitted on data that is not registered in Immuta:
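
      immuta.allowed.immuta.datasource.operations=READ,WRITE,OWN
      immuta.allowed.non.immuta.datasource.operations=READ,WRITE,CREATE,OWN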

    3. Enable the Immuta access control plugin in Trino's configuration file (/etc/trino/config.properties for Docker installations or <trino_install_directory>/etc/config.properties for standalone installations). For example,
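
      access-control.config-files=/etc/trino/immuta-access-control.properties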

    Google BigQuery

    Private preview: This integration is available to select accounts. Contact your Immuta representative for details.

    The Google BigQuery integration allows users to query policy protected data directly in BigQuery as secure views within an Immuta-created dataset. Immuta controls who can see what within the views, allowing data governors to create complex ABAC policies and data users to query the right data within the BigQuery console.

    Configuration

    Google BigQuery is configured through the Immuta console and a script provided by Immuta. While you can complete some steps within the BigQuery console, it is easiest to install using gcloud and the Immuta script.
    1. Create a custom role and assign that role to a custom user to use as the Immuta system account.

    2. Enable the integration in the Immuta console.

    Protect your data

    Once Google BigQuery has been configured, BigQuery admins can start creating subscription and data policies to meet compliance requirements and users can start querying policy protected data directly in BigQuery.

    1. Create a global subscription or supported data policy.

    2. Register your BigQuery tables and views in Immuta as data sources.

    3. Recommended: Organize your data sources into domains and assign domain permissions to accountable teams.

    4. Revoke user access to the original datasets and grant users access to the Immuta created datasets in BigQuery.

    5. Users query data from the Immuta created datasets directly in BigQuery.

    FAQs

    1. What permissions will Immuta have in my BigQuery environment?

      • You can find a list of the permissions the custom Immuta role has here.

    2. What integration features will Immuta support for BigQuery?

      • For private preview, Immuta supports a basic version of the BigQuery integration where Immuta can enforce specific policies on data in a single BigQuery project. At this time, workspaces, tag ingestion, user impersonation, query audit, and multiple integrations are not supported.

    Google BigQuery integration conceptual overview

    In this policy push integration, Immuta creates views that contain all policy logic. Each view has a 1-to-1 relationship with the original table. Access controls are applied in the view, allowing organizations to leverage Immuta’s powerful set of attribute-based policies and query data directly in BigQuery.

    BigQuery is organized by projects (which can be thought of as databases), datasets (which can be compared to schemas), tables, and views. When you enable the integration, an Immuta dataset is created in BigQuery that contains the Immuta-required user entitlements information. These objects within the Immuta dataset are intended to only be used and altered by the Immuta application.

    After data sources are registered, Immuta uses the custom user and role, created before the integration is enabled, to push the Immuta data sources as views into a mirrored dataset of the original table. Immuta manages grants on the created view to ensure only users subscribed to the Immuta data source will see the data.

    Secure views

    The Immuta integration uses a mirrored dataset approach. That is, if the source dataset is named mydataset, Immuta will create a dataset named mydataset_secure, assuming that _secure is the specified Immuta dataset suffix. This mirrored dataset is an authorized dataset, allowing it to access the data of the original dataset. It will contain the Immuta-managed views, which have identical names to the original tables they’re based on.
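
    For example, assuming a table mydataset.orders (in a project named my-project) is registered in Immuta, subscribed users would query the Immuta-managed view rather than the backing table:

    SELECT * FROM `my-project.mydataset_secure.orders`;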

    Managing access

    Following the principle of least privilege, Immuta does not have permission to manage Google Cloud Platform users, specifically in granting or denying access to a project and its datasets. This means that data governors should limit user access to original datasets to ensure data users are accessing the data through the Immuta created views and not the backing tables. The only users who need to have access to the backing tables are the credentials used to register the tables in Immuta.

    Additionally, a data governor must grant users access to the mirrored datasets that Immuta will create and populate with views. Immuta and BigQuery’s best practice recommendation is to grant access via groups in Google Cloud Platform. Because users still must be registered in Immuta and subscribed to an Immuta data source to be able to query Immuta views, all Immuta users can be granted access to the mirrored datasets that Immuta creates.

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations is available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

    Limitations

    • This integration can only be enabled through a manual bootstrap using the Immuta API.

    • This integration can only be enabled to work in a single region.

    • BigQuery does not allow views partitioned by pseudo-columns. If you would like to partition a table by a pseudo-column and have Immuta govern it, take the following steps:

      • Create a view in BigQuery of the partitioned table, with the pseudo-column aliased (see the example statement after this list).

      • Register this view as a BigQuery data source in Immuta.

      • Immuta will then be able to create Immuta-managed views off of this view with the pseudo-column aliased.
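
      For example, the aliased view from the first step might be created with a statement like the following (using a sales.emea.sales table for illustration):

      create view `sales`.`emea`.`sales_view` as SELECT *, _PARTITIONTIME as __partitiontime from `sales`.`emea`.`sales`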

    Supported policies

    This integration supports the following policy types:

    • Column masking

      • Mask using hashing (SHA256())

      • Mask by making NULL

      • Mask using constant

      • Mask using a regular expression

      • Mask by date rounding

      • Mask by numeric rounding

      • Mask using custom functions

    • Row-level masking

    • Row visibility based on user attributes and/or object attributes

    • Only show rows that fall within a given time window

    • Minimize rows

    • Filter rows using custom WHERE clause

    • Always hide rows

    Additional resources

    See the resources below to start implementing and using the BigQuery integration:

    • Configuring the Google BigQuery integration

    • Creating BigQuery data sources

    • Building global subscription and data policies to govern data

    • Creating projects to collaborate

    Configure the Google BigQuery integration

    Follow this guide to connect your Google BigQuery data warehouse to Immuta.

    Prerequisites

    • Google BigQuery integration (PrPr) enabled.

    • Immuta role with SYSTEM_ADMIN permissions and an API key.

    • Install the gcloud CLI.

    Google Cloud service account and role used by Immuta to connect to Google BigQuery

    The Google BigQuery integration requires you to create a Google Cloud service account and role that will be used by Immuta to

    • create a Google BigQuery dataset that will be used to store a table of user entitlements, UDFs for policy enforcement, etc.

    • manage the table of user entitlements via updates when entitlements change in Immuta.

    • create datasets and secure views with access control policies enforced, which mirror tables inside of datasets you ingest as Immuta data sources.

    You have two options to create the required Google Cloud service account and role:

    • Run the script provided by Immuta

    • Use the Google Cloud Console

    The Immuta script

    The bootstrap.sh script is a shell script provided by Immuta that creates prerequisite Google Cloud IAM objects for the integration to connect. When you run this script from your command line, it will create the following items, scoped at the project-level:

    • A new Google Cloud IAM role

    • A new Google Cloud service account, which will be granted the newly-created role

    • A JSON keyfile for the newly-created service account

    You will need to use the objects created in these steps to enable the Google BigQuery integration.

    Google Cloud IAM roles required to run the script

    To execute bootstrap.sh from your command line, you must be authenticated to the gcloud CLI utility as a user with all of the following roles:

    • roles/iam.roleAdmin

    • roles/iam.serviceAccountAdmin

    • roles/serviceusage.serviceUsageAdmin

    Having these three roles is the least-privilege set of Google Cloud IAM roles required to successfully run the bootstrap.sh script from your command line. However, having either of the following Google Cloud IAM roles will also allow you to run the script successfully:

    • roles/editor

    • roles/owner

    Create a service account and role by running the script provided by Immuta

    1. Install gcloud.

    2. Set the account property in the core section for Google Cloud CLI to the account gcloud should use for authentication. (You can run gcloud auth list to see your currently available accounts):
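
      gcloud config set account ACCOUNT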

    3. In Immuta, navigate to the App Settings page and click the Integrations tab.

    4. Click Add Integration and select Google BigQuery from the dropdown menu.

    5. Click Select Authentication Method and select Key File.

    6. Click Download Script(s).

    7. Before you run the script, update your permissions to execute it:
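
      chmod 755 <path to downloaded script>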

    8. Run the script, where

      • PROJECT_ID is the Google Cloud Platform project to operate on.

      • ROLE_ID is the name of the custom role to create.

      • NAME will create a service account with the provided name.

      • OUTPUT_FILE is the path where the resulting private key should be written. File system write permission will be checked on the specified path prior to the key creation.

      • undelete-role (optional) will undelete the custom role from the project. Roles that have been deleted for a long time can't be undeleted. This option can fail for the following reasons:

        • The role specified does not exist.

        • The active user does not have permission to access the given role.

      • enable-api (optional) will enable the Google BigQuery API service, provided you've been granted access to enable it.
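
      $ bootstrap.sh \
          --project PROJECT_ID \
          --role ROLE_ID \
          --service_account NAME \
          --keyfile OUTPUT_FILE \
          [--undelete-role] \
          [--enable-api]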

    Create a service account and role by using Google Cloud console

    Alternatively, you may use the Google Cloud Console to create the prerequisite role, service account, and private key file for the integration to connect to Google BigQuery.

    1. Create a custom role using the console with the following privileges:

      • bigquery.datasets.create

      • bigquery.datasets.delete

      • bigquery.datasets.get

      • bigquery.datasets.update

      • bigquery.jobs.create

      • bigquery.jobs.get

      • bigquery.jobs.list

      • bigquery.jobs.listAll

      • bigquery.routines.create

      • bigquery.routines.delete

      • bigquery.routines.get

      • bigquery.routines.list

      • bigquery.routines.update

      • bigquery.tables.create

      • bigquery.tables.delete

      • bigquery.tables.export

      • bigquery.tables.get

      • bigquery.tables.getData

      • bigquery.tables.list

      • bigquery.tables.setCategory

      • bigquery.tables.update

      • bigquery.tables.updateData

      • bigquery.tables.updateTag

    2. Create a service account and grant it the custom role you just created.

    3. Enable the Google BigQuery API.

    Enable the Google BigQuery integration

    Once the Google Cloud IAM custom role and service account are created, you can enable the Google BigQuery integration. This section illustrates how to enable the integration on the Immuta app settings page. To configure this integration via the Immuta API, see the Configure a Google BigQuery integration API guide.

    1. In Immuta, navigate to the App Settings page and click the Integrations tab.

    2. Click Add Integration and select Google BigQuery from the dropdown menu.

    3. Click Select Authentication Method and select Key File.

    4. Upload your GCP Service Account Key File. This is the private key file generated when you created the Google Cloud service account and role for Immuta to use to connect to Google BigQuery. Uploading this file will auto-populate the following fields:

      • Project Id: The Google Cloud Platform project to operate on, where your Google BigQuery data warehouse is located. A new dataset will be provisioned in this Google BigQuery project to store the integration configuration.

      • Service Account: The service account you created for Immuta to use to connect to Google BigQuery.

    5. Complete the following fields:

      • Immuta Dataset: The name of the Google BigQuery dataset to provision inside of the project. Important: if you are using multiple environments in the same Google BigQuery project, this dataset to provision must be unique across environments.

      • Immuta Role: The custom role you created for Immuta to use to connect to Google BigQuery.

      • Dataset Suffix: The suffix that will be postfixed to the name of each dataset created to store secure views, one per dataset that you ingest a table for as a data source in Immuta. Important: if you are using multiple environments in the same Google BigQuery project, this suffix must be unique across environments.

      • GCP Location: The dataset's location. After a dataset is created, the location can't be changed. Note that if you choose EU for the dataset location, your Core BigQuery Customer Data resides in the EU.

    6. Click Test Google BigQuery Integration.

    7. Click Save.

    GCP location must match dataset region

    The region set for the GCP location must match the region of your datasets. Set GCP location to a general region (for example, US) to include child regions.

    Disable the Google BigQuery integration

    You can disable the Google BigQuery integration automatically or manually.

    Automatically disable integration

    1. Click the App Settings icon, and then click the Integrations tab.

    2. Select the Google BigQuery integration you would like to disable, and select the Disable Integration checkbox.

    3. Click Save.

    Manually disable integration

    The privileges required to run the cleanup script are the same as the Google Cloud IAM roles required to run the bootstrap.sh script.

    1. Click the App Settings icon, and then click the Integrations tab.

    2. Select the Google BigQuery integration you would like to disable, and click Download Scripts.

    3. Click Save. Wait until Immuta has finished saving your configuration changes before proceeding.

    4. Before you run the script, update your permissions to execute it:
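
      chmod 755 <path to downloaded script>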

    5. Run the cleanup script.

    Next steps

    • Create Google BigQuery data sources

    • Build global subscription policies and data policies

    • Create projects to securely collaborate on analytical workloads

    Configure a Databricks Unity Catalog Integration

    Deprecation notice

    Support for configuring the Databricks Unity Catalog integration using this legacy workflow has been deprecated. Instead, configure your integration and register your data using connections.

    Databricks Unity Catalog allows you to manage and access data in your Databricks account across all of your workspaces. With Immuta’s Databricks Unity Catalog integration, you can write your policies in Immuta and have them enforced automatically by Databricks across data in your Unity Catalog metastore.

    Permissions

    The permissions outlined in this section are the Databricks privileges required for a basic configuration. See the reference guide for a list of privileges necessary for additional features and settings.

    • APPLICATION_ADMIN Immuta permission

    • The Databricks user running the installation script must have the following privileges:

      • Account admin

      • CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables

      • Metastore admin (only required if enabling query audit)

    See the Databricks documentation for more details about Unity Catalog privileges and securable objects.

    Requirements

    Before you configure the Databricks Unity Catalog integration, ensure that you have fulfilled the following requirements:

    • A Unity Catalog metastore created and attached to a Databricks workspace. Immuta supports configuring a single metastore for each configured integration, and that metastore may be attached to multiple Databricks workspaces.

    • Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore. Immuta recommends linking a SQL warehouse to your Immuta tenant rather than a cluster for both performance and availability reasons.

    • If you select single user access mode for your cluster, you must

      • use Databricks Runtime 15.4 LTS and above. Unity Catalog row- and column-level security controls are unsupported for single user access mode on Databricks Runtime 15.3 and below. See the Databricks documentation for details.

      • enable serverless compute for your workspace.

    Unity Catalog best practices

    Ensure your integration with Unity Catalog goes smoothly by following these guidelines:

    • Use a Databricks SQL warehouse to configure the integration. Databricks SQL warehouses are faster to start than traditional clusters, require less management, and can run all the SQL that Immuta requires for policy administration. A serverless warehouse provides nearly instant startup time and is the preferred option for connecting to Immuta.

    Migrate data to Unity Catalog

    1. Ensure that all Databricks clusters that have Immuta installed are stopped and the Immuta configuration is removed from the cluster. Immuta-specific cluster configuration is no longer needed with the Databricks Unity Catalog integration.

    2. Move all data into Unity Catalog before configuring Immuta with Unity Catalog. The default catalog used once Unity Catalog support is enabled in Immuta is the hive_metastore, which is not supported by the Unity Catalog integration; data sources in the Hive Metastore must be managed by the Databricks Spark integration. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured. If you don't move all data before configuring the integration, your existing data sources will remain protected by the Databricks Spark integration throughout the migration process.

    Create the Databricks service principal

    In Databricks, create a service principal with the privileges listed below. Immuta uses this service principal continuously to orchestrate Unity Catalog policies and maintain state between Immuta and Databricks.

    • USE CATALOG and MANAGE on all catalogs containing securables registered as Immuta data sources.

    • USE SCHEMA on all schemas containing securables registered as Immuta data sources.

    • MODIFY and SELECT on all securables you want registered as Immuta data sources. The MODIFY privilege is not required for materialized views registered as Immuta data sources, since MODIFY is not a supported privilege on that object type.

    MANAGE and MODIFY are required so that the service principal can apply row filters and column masks on the securable; to do so, the service principal must also have SELECT on the securable as well as USE CATALOG on its parent catalog and USE SCHEMA on its parent schema. Since privileges are inherited, you can grant the service principal the MODIFY and SELECT privilege on all catalogs or schemas containing Immuta data sources, which automatically grants the service principal the MODIFY and SELECT privilege on all current and future securables in the catalog or schema. The service principal also inherits MANAGE from the parent catalog for the purpose of applying row filters and column masks, but that privilege must be set directly on the parent catalog in order for grants to be fully applied.

    See the Databricks documentation for more details about Unity Catalog privileges and securable objects.
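
    As an illustration only, the grants for a single catalog might look like the following, assuming a catalog named analytics and a service principal referenced by its application ID (both placeholders; the Immuta setup script and your environment are authoritative):

    -- Catalog-level grants are inherited by current and future schemas and tables in the catalog
    GRANT USE CATALOG, USE SCHEMA, SELECT, MODIFY, MANAGE ON CATALOG analytics TO `<sp-application-id>`;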

    Opt to enable query audit for Unity Catalog

    1. .

    2. .

    3. If you will configure the integration using the manual setup option, the Immuta script you will use includes the SQL statements for granting required privileges to the service principal, so you can skip this step and continue to the next section. Otherwise, grant the privileges manually. For Databricks Unity Catalog audit to work, the service principal must have the following access at minimum:
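
      • USE CATALOG on the system catalog

      • USE SCHEMA on the system.access schema

      • SELECT on the following system tables:

        • system.access.audit

        • system.access.table_lineage

        • system.access.column_lineage

      Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT permissions on the system schemas to the service principal. See Manage privileges in Unity Catalog for more details.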

    Configure the Databricks Unity Catalog integration

    Existing data source migration: If you have existing Databricks data sources, complete these steps before proceeding.

    You have two options for configuring your Databricks Unity Catalog integration:

    • Automatic setup: Immuta creates the catalogs, schemas, tables, and functions using the integration's configured service principal.

    • Manual setup: Run the Immuta script in Databricks yourself to create the catalog. You can also modify the script to customize your storage location for tables, schemas, or catalogs. The user running the script must have the permissions listed above.

    Automatic setup

    Required permissions: When performing an automatic setup, the credentials provided must have the permissions listed above.

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click + Add Integration and select Databricks Unity Catalog from the dropdown menu.

    4. Complete the following fields:

      • Server Hostname is the hostname of your Databricks workspace.

      • HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.

    Create a separate Immuta catalog for each Immuta tenant

    If multiple Immuta tenants are connected to your Databricks environment, create a separate Immuta catalog for each of those tenants. Having multiple Immuta tenants use the same Immuta catalog causes failures in policy enforcement.

    1. If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.

    2. Opt to fill out the Exemption Group field with the name of an account-level group in Databricks that must be exempt from having data policies applied. This group is created and managed in Databricks and should only include privileged users and service accounts that require an unmasked view of data. Create this group in Databricks before configuring the integration in Immuta.

    Exemption group cannot be changed after configuration is saved

    The exemption group field cannot be edited after you save the integration configuration. If you need to change this group name, you can choose one of the following options:

    • Update the group name in Databricks to match what you have configured here.

    • Delete the integration in Immuta and create a new Databricks Unity Catalog integration with the new exemption group name.

    1. Query audit for Unity Catalog is enabled by default.

      1. Opt to scope the query audit ingestion by entering in Unity Catalog Workspace IDs. Enter a comma-separated list of the workspace IDs that you want Immuta to ingest audit records for. If left empty, Immuta will audit all tables and users in Unity Catalog.

      2. Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.

    Manual setup

    Required permissions: When performing a manual setup, the Databricks user running the script must have the permissions listed above.

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click + Add Integration and select Databricks Unity Catalog from the dropdown menu.

    4. Complete the following fields:

    Create a separate Immuta catalog for each Immuta tenant

    If multiple Immuta tenants are connected to your Databricks environment, create a separate Immuta catalog for each of those tenants. Having multiple Immuta tenants use the same Immuta catalog causes failures in policy enforcement.

    1. If using a proxy server with Databricks Unity Catalog, click the Enable Proxy Support checkbox and complete the Proxy Host and Proxy Port fields. The username and password fields are optional.

    2. Opt to fill out the Exemption Group field with the name of an account-level group in Databricks that must be exempt from having data policies applied. This group is created and managed in Databricks and should only include privileged users and service accounts that require an unmasked view of data. Create this group in Databricks before configuring the integration in Immuta.

    Exemption group cannot be changed after configuration is saved

    The exemption group field cannot be edited after you save the integration configuration. If you need to change this group name, you can choose one of the following options:

    • Update the group name in Databricks to match what you have configured here.

    • Delete the integration in Immuta and create a new Databricks Unity Catalog integration with the new exemption group name.

    1. Query audit for Unity Catalog is enabled by default.

      1. Opt to scope the query audit ingestion by entering in Unity Catalog Workspace IDs. Enter a comma-separated list of the workspace IDs that you want Immuta to ingest audit records for. If left empty, Immuta will audit all tables and users in Unity Catalog.

      2. Configure the audit frequency by scrolling to Integrations Settings and finding the Unity Catalog Audit Sync Schedule section.

    Map Databricks users to Immuta

    If the usernames in Immuta do not match usernames in Databricks, map each Databricks username to each Immuta user account to ensure Immuta properly enforces policies using one of the methods linked below:

    If the Databricks user doesn't exist in Databricks when you configure the integration, map their Databricks username to their Immuta user account after they are created in Databricks. Otherwise, policies will not be enforced correctly for them in Databricks. Databricks user identities for Immuta users are automatically marked as invalid when the user is not found during policy application, preventing them from being affected by Databricks policy until their Immuta user identity is manually mapped to their Databricks identity.

    Opt to enable Databricks Unity Catalog tag ingestion

    Design partner preview

    This feature is only available to select accounts. Reach out to your Immuta representative to enable this feature.

    Requirements:

    • A configured Databricks Unity Catalog integration

    • Fewer than 2,500 Databricks Unity Catalog data sources registered in Immuta

    To allow Immuta to automatically import table and column tags from Databricks Unity Catalog, enable Databricks Unity Catalog tag ingestion in the external catalog section of the Immuta app settings page.

    1. Navigate to the App Settings page.

    2. Scroll to 2 External Catalogs, and click Add Catalog.

    3. Enter a Display Name and select Databricks Unity Catalog from the dropdown menu.

    4. Click Save and confirm your changes.

    Register data

    Register your Databricks Unity Catalog data in Immuta.

    Customizing the Integration

    You can customize the Databricks Spark integration settings using these components Immuta provides:

    gcloud config set account ACCOUNT
    OAuth 2.0 scopes documentation

    OUTPUT_FILE is the path where the resulting private key should be written. File system write permission will be checked on the specified path prior to the key creation.

  • undelete-role (optional) will undelete the custom role from the project. Roles that have been deleted for a long time can't be undeleted. This option can fail for the following reasons:

    • The role specified does not exist.

    • The active user does not have permission to access the given role.

  • enable-api (optional): provided you’ve been granted access to enable the Google BigQuery API, this option will enable the service.

  • : The suffix that will be appended to the name of each dataset created to store secure views, one per dataset that you ingest a table for as a data source in Immuta. Important: if you are using multiple environments in the same Google BigQuery project, this suffix must be unique across environments.
  • GCP Location: The dataset’s location. After a dataset is created, the location can't be changed. Note that:

    • If you choose EU for the dataset location, your Core BigQuery Customer Data resides in the EU.

    create view `sales`.`emea`.`sales_view` as SELECT *, _PARTITIONTIME as __partitiontime from `sales`.`emea`.`sales`
    chmod 755 <path to downloaded script>
    $ bootstrap.sh \
        --project PROJECT_ID \
        --role ROLE_ID \
        --service_account NAME \
        --keyfile OUTPUT_FILE \
        [--undelete-role] \
        [--enable-api]
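
    For instance, a hypothetical invocation with placeholder values (the project, role, service account, and key file names below are illustrative only) might look like the following:

    $ bootstrap.sh \
        --project my-gcp-project \
        --role immuta_bigquery_role \
        --service_account immuta-bigquery \
        --keyfile ./immuta-bigquery-key.json \
        --enable-api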

  • CREATE CATALOG privilege on the Unity Catalog metastore to create an Immuta-owned catalog and tables

  • Metastore admin (only required if enabling query audit)

  • use Databricks Runtime 15.4 LTS and above. Unity Catalog row- and column-level security controls are unsupported for single user access mode on Databricks Runtime 15.3 and below. See the Databricks documentation for details.
  • enable serverless compute for your workspace.

  • Move all data into Unity Catalog before configuring Immuta with Unity Catalog. The default catalog used once Unity Catalog support is enabled in Immuta is the hive_metastore, which is not supported by the Unity Catalog integration. Data sources in the Hive Metastore must be managed by the Databricks Spark integration. Existing data sources will need to be re-created after they are moved to Unity Catalog and the Unity Catalog integration is configured.
    The MODIFY privilege is not required for materialized views registered as Immuta data sources, since MODIFY is not a supported privilege on that object type in Databricks.

    privilege on all current and future securables in the catalog or schema. The service principal also inherits MANAGE from the parent catalog for the purpose of applying row filters and column masks, but that privilege must be set directly on the parent catalog in order for grants to be fully applied.

  • USE CATALOG on the system catalog
  • USE SCHEMA on the system.access schema

  • SELECT on the following system tables:

    • system.access.audit

    • system.access.table_lineage

    • system.access.column_lineage

  • Access to system tables is governed by Unity Catalog. No user has access to these system schemas by default. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT permissions on the system schemas to the service principal. See Manage privileges in Unity Catalog for more details.

  • Server Hostname is the hostname of your Databricks workspace.

  • HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.

  • Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.

  • For details about policy exemption groups, see the Databricks Unity Catalog reference guide.

    Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.

  • Continue with your integration configuration.

  • Select your authentication method from the dropdown:

    • Access Token: Enter a Databricks Personal Access Token. This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.

    • OAuth machine-to-machine (M2M):

      • AWS Databricks:

        • Follow the Databricks documentation to create a client secret for the Immuta service principal and assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.cloud.databricks.com/oidc/v1/token.

        • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier and is the client ID displayed in Databricks when creating the client secret for the service principal.

      • Azure Databricks:

        • Follow the Databricks documentation to create a service principal within Azure and then add that service principal to your Databricks account and workspace.

        • Assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Within Databricks, create an OAuth client secret for the service principal. This completes your Databricks-based service principal setup.

  • Click Save.

  • Server Hostname is the hostname of your Databricks workspace.

  • HTTP Path is the HTTP path of your Databricks cluster or SQL warehouse.

  • Immuta Catalog is the name of the catalog Immuta will create to store internal entitlements and other user data specific to Immuta. This catalog will only be readable for the Immuta service principal and should not be granted to other users. The catalog name may only contain letters, numbers, and underscores and cannot start with a number.

  • For details about policy exemption groups, see the Databricks Unity Catalog reference guide.

    Enter how often, in hours, you want Immuta to ingest audit events from Unity Catalog as an integer between 1 and 24.

  • Continue with your integration configuration.

  • Select your authentication method from the dropdown:

    • Access Token: Enter a Databricks Personal Access Token. This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed above for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.

    • OAuth machine-to-machine (M2M):

      • AWS Databricks:

        • Follow the Databricks documentation to create a client secret for the Immuta service principal and assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.cloud.databricks.com/oidc/v1/token.

        • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier and is the client ID displayed in Databricks when creating the client secret for the service principal.

      • Azure Databricks:

        • Follow the Databricks documentation to create a service principal within Azure and then add that service principal to your Databricks account and workspace.

        • Assign this service principal the privileges listed above for the metastore associated with the Databricks workspace.

        • Within Databricks, create an OAuth client secret for the service principal. This completes your Databricks-based service principal setup.

  • Select the Manual toggle and copy or download the script. You can modify the script to customize your storage location for tables, schemas, or catalogs.

  • Run the script in Databricks.

  • Click Save.

    Cluster policies

    Immuta provides cluster policies that set the Spark environment variables and configuration on your Databricks cluster once you apply that policy to your cluster. These policies generated by Immuta must be applied to your cluster manually. The Configure a Databricks Spark integration guide includes instructions for generating and applying these cluster policies. Each cluster policy is described below.

    Python and SQL

    This is the most performant policy configuration.

    In this configuration, Immuta is able to rely on Databricks-native security controls, reducing overhead. The key security control here is the enablement of process isolation. This prevents users from obtaining unintentional access to the queries of other users. In other words, masked and filtered data is consistently made accessible to users in accordance with their assigned attributes. This Immuta cluster configuration relies on Py4J security being enabled. Consequently, the following Databricks features are unsupported:

    • Many Python ML classes (such as LogisticRegression, StringIndexer, and DecisionTreeClassifier)

    • dbutils.fs

    • Databricks Connect client library

    For full details on Databricks’ best practices in configuring clusters, read their governance documentation.

    Python, SQL, and R

    Additional overhead: Compared to the Python and SQL cluster policy, this configuration trades some additional overhead for added support of the R language.

    In this configuration, you are able to rely on the Databricks-native security controls. The key security control here is the enablement of process isolation. This prevents users from obtaining unintentional access to the queries of other users. In other words, masked and filtered data is consistently made accessible to users in accordance with their assigned attributes.

    Like the Python & SQL configuration, Py4J security is enabled for the Python & SQL & R configuration. However, because R has been added, Immuta enables the Security Manager, in addition to Py4J security, to provide more security guarantees. For example, by default all actions in R execute as the root user; among other things, this permits access to the entire filesystem (including sensitive configuration data), and, without iptable restrictions, a user may freely access the cluster’s cloud storage credentials. To address these security issues, Immuta’s initialization script wraps the R and Rscript binaries to launch each command as a temporary, non-privileged user with limited filesystem and network access and installs the Immuta Security Manager, which prevents users from bypassing policies and protects against the above vulnerabilities from within the JVM.

    Consequently, the cost of introducing R is that the Security Manager incurs a small increase in performance overhead; however, average latency will vary depending on whether the cluster is homogeneous or heterogeneous. (In homogeneous clusters, all users are at the same level of groups/authorizations; this is enforced externally, rather than directly by Immuta.)

    When users install third-party Java/Scala libraries, they will be denied access to sensitive resources by default. However, cluster administrators can specify which of the installed Databricks libraries should be trusted by Immuta.

    The following Databricks features are unsupported when this cluster policy is applied:

    • Many Python ML classes (such as LogisticRegression, StringIndexer, and DecisionTreeClassifier)

    • dbutils.fs

    • Databricks Connect client library

    For full details on Databricks’ best practices in configuring clusters, read their governance documentation.

    Python, SQL, and R with library support

    Py4J security disabled: In addition to support for Python, SQL, and R, this configuration adds support for additional Python libraries and utilities by disabling Databricks-native Py4J security.

    This configuration does not rely on Databricks-native Py4J security to secure the cluster, while process isolation is still enabled to secure filesystem and network access from within Python processes. On an Immuta-enabled cluster, once Py4J security is disabled the Immuta Security Manager is installed to prevent nefarious actions from Python in the JVM. Disabling Py4J security also allows for expanded Python library support, including many Python ML classes (such as LogisticRegression, StringIndexer, and DecisionTreeClassifier) and dbutils.fs.

    By default, all actions in R will execute as the root user. Among other things, this permits access to the entire filesystem (including sensitive configuration data). And without iptable restrictions, a user may freely access the cluster’s cloud storage credentials. To properly support the use of the R language, Immuta’s initialization script wraps the R and Rscript binaries to launch each command as a temporary, non-privileged user. This user has limited filesystem and network access. The Immuta Security Manager is also installed to prevent users from bypassing policies and protects against the above vulnerabilities from within the JVM.

    The Security Manager will incur a small increase in performance overhead; average latency will vary depending on whether the cluster is homogeneous or heterogeneous. (In homogeneous clusters, all users are at the same level of groups/authorizations; this is enforced externally, rather than directly by Immuta.)

    When users install third-party Java/Scala libraries, they will be denied access to sensitive resources by default. However, cluster administrators can specify which of the installed Databricks libraries should be trusted by Immuta.

    A homogeneous cluster is recommended for configurations where Py4J security is disabled. If all users have the same level of authorization, there would not be any data leakage, even if a nefarious action was taken.

    For full details on Databricks’ best practices in configuring clusters, read their governance documentation.

    Scala

    Scala clusters: This configuration is for Scala-only clusters.

    Where Scala language support is needed, this configuration can be used in the Custom access mode.

    According to Databricks’ cluster type support documentation, Scala clusters are intended for single users only. However, nothing inherently prevents a Scala cluster from being configured for multiple users. Even with the Immuta Security Manager enabled, there are limitations to user isolation within a Scala job.

    For a secure configuration, it is recommended that clusters intended for Scala workloads are limited to Scala jobs only and are made homogeneous through the use of project equalization or externally via convention/cluster ACLs. (In homogeneous clusters, all users are at the same level of groups/authorizations; this is enforced externally, rather than directly by Immuta.)

    For full details on Databricks’ best practices in configuring clusters, read their governance documentation.

    Sparklyr

    Single-user clusters recommended: Like Databricks, Immuta recommends single-user clusters for sparklyr when user isolation is required. A single-user cluster can either be a job cluster or a cluster with credential passthrough enabled. Note: spark-submit jobs are not currently supported.

    Two cluster types can be configured with sparklyr: Single-User Clusters (recommended) and Multi-User Clusters (discouraged).

    • Single-User Clusters: Credential Passthrough (required on Databricks) allows a single-user cluster to be created. This setting automatically configures the cluster to assume the role of the attached user when reading from storage. Because Immuta requires that raw data is readable by the cluster, the instance profile associated with the cluster should be used rather than a role assigned to the attached user.

  • Multi-User Clusters: Because Immuta cannot guarantee user isolation in a multi-user sparklyr cluster, it is not recommended to deploy a multi-user cluster. To force all users to act under the same set of attributes, groups, and purposes with respect to their data access and eliminate the risk of a data leak, all sparklyr multi-user clusters must be equalized either by convention (all users able to attach to the cluster have the same level of data access in Immuta) or by configuration (detailed below).

    Single-user cluster configuration

    1 - Enable sparklyr

    In addition to the configuration for an Immuta cluster with R, add this environment variable to the Environment Variables section of the cluster:
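
    IMMUTA_DATABRICKS_SPARKLYR_SUPPORT_ENABLED=true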

    This configuration makes changes to the iptables rules on the cluster to allow the sparklyr client to connect to the required ports on the JVM used by the sparklyr backend service.

    2 - Set up a sparklyr connection in Databricks

    1. Install and load libraries into a notebook. Databricks includes the stable version of sparklyr, so library(sparklyr) in an R notebook is sufficient, but you may opt to install the latest version of sparklyr from CRAN. Additionally, loading library(DBI) will allow you to execute SQL queries.

    2. Set up a sparklyr connection:

    3. Pass the connection object to execute queries:
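
    For example:

    sc <- spark_connect(method = "databricks")
    dbGetQuery(sc, "show tables in immuta")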

    3 - Configure a single-user cluster

    Add the following items to the Spark Config section of the cluster:
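
    spark.databricks.passthrough.enabled true
    spark.databricks.pyspark.trustedFilesystems com.databricks.s3a.S3AFileSystem,shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem,shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem,com.databricks.adl.AdlFileSystem,shaded.databricks.V2_1_4.com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem,shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem,shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem,org.apache.hadoop.fs.ImmutaSecureFileSystemWrapper
    spark.hadoop.fs.s3a.aws.credentials.provider com.amazonaws.auth.InstanceProfileCredentialsProvider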

    The trustedFileSystems setting is required to allow Immuta’s wrapper FileSystem (used in conjunction with the Security Manager for data security purposes) to be used with credential passthrough. Additionally, the InstanceProfileCredentialsProvider must be configured to continue using the cluster’s instance profile for data access, rather than a role associated with the attached user.

    Multi-user cluster configuration

    Avoid deploying multi-user clusters with sparklyr configuration

    It is possible, but not recommended, to deploy a multi-user cluster sparklyr configuration. Immuta cannot guarantee user isolation in a multi-user sparklyr configuration.

    The configurations in this section enable sparklyr, require project equalization, map sparklyr sessions to the correct Immuta user, and prevent users from accessing Immuta native workspaces.

    1. Add the following environment variables to the Environment Variables section of your cluster configuration:

    2. Add the following items to the Spark Config section:
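
    Environment Variables section:

    IMMUTA_DATABRICKS_SPARKLYR_SUPPORT_ENABLED=true
    IMMUTA_SPARK_REQUIRE_EQUALIZATION=true
    IMMUTA_SPARK_CURRENT_USER_SCIM_FALLBACK=false

    Spark Config section:

    immuta.spark.acl.assume.not.privileged true
    immuta.api.key=<user’s API key>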

    Limitations

    Immuta’s integration with sparklyr does not currently support

    • spark-submit jobs

    • UDFs

    Spark environment variables

    The Spark environment variables reference guide lists the various possible settings controlled by these variables that you can set in your cluster policy before attaching it to your cluster.

    Additional Hadoop configuration file (optional)

    In some cases it is necessary to add sensitive configuration to SparkSession.sparkContext.hadoopConfiguration to allow Spark to read data.

    For example, when accessing external tables stored in Azure Data Lake Gen2, Spark must have credentials to access the target containers or filesystems in Azure Data Lake Gen2, but users must not have access to those credentials. In this case, an additional configuration file may be provided with a storage account key that the cluster may use to access Azure Data Lake Gen2.

    To use an additional Hadoop configuration file, set the IMMUTA_INIT_ADDITIONAL_CONF_URI Spark environment variable to be the full URI to this file.
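
    For example, a hypothetical cluster environment variable entry pointing at a configuration file in cloud storage might look like the following (the URI is illustrative only):

    IMMUTA_INIT_ADDITIONAL_CONF_URI=abfss://immuta-config@examplestorage.dfs.core.windows.net/extra-hadoop-conf.xml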

    Configurable settings

    Data source settings

    Protected and unprotected tables

    Databricks non-privileged users will only see sources to which they are subscribed in Immuta, and this can present problems if organizations have a data lake full of non-sensitive data and Immuta removes access to all of it. Immuta addresses this challenge by allowing Immuta users to access any tables that are not protected by Immuta (i.e., not registered as a data source or a table in a native workspace). Although this is similar to how privileged users in Databricks operate, non-privileged users cannot bypass Immuta controls.

    • Protected until made available by policy: This setting means that users can only see tables that Immuta has explicitly subscribed them to.

    Behavior change in Immuta v2025.1 and newer

    If a table is registered in Immuta and does not have a subscription policy applied to it, that data will be visible to users, even if the Protected until made available by policy setting is enabled.

    If you have enabled this setting, author an "Allow individually selected users" global subscription policy that applies to all data sources.

    • Available until protected by policy: This setting means all tables are open until explicitly registered and protected by Immuta. This setting allows both non-Immuta reads and non-Immuta writes:

      • IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_READS: Immuta users with regular (non-privileged) Databricks roles may SELECT from tables that are not registered in Immuta. This setting does not allow reading data directly with commands like spark.read.format("x"). Users are still required to read data and query tables using Spark SQL. When non-Immuta reads are enabled through the cluster policy, Immuta users will see all databases and tables when they run show databases or show tables. However, this does not mean they will be able to query all of them.

      • IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_WRITES: Immuta users with regular (non-privileged) Databricks roles can run DDL commands and data-modifying commands against tables or spaces that are not registered in Immuta. With non-Immuta writes enabled through the cluster policy, users on the cluster can mix any policy-enforced data they may have access to via any registered data sources in Immuta with non-Immuta data and write the ensuing result to a non-Immuta write space where it would be visible to others. If this is not a desired possibility, the cluster should instead be configured to only use Immuta’s project workspaces.

    The Configure a Databricks Spark integration guide includes instructions for applying these settings to your cluster.

    Ephemeral overrides

    In Immuta, a Databricks data source is considered ephemeral, meaning that the compute resources associated with that data source will not always be available.

    Ephemeral data sources allow the use of ephemeral overrides, user-specific connection parameter overrides that are applied to Immuta metadata operations.

    When a user runs a Spark job in Databricks, the Immuta plugin automatically submits ephemeral overrides for that user to Immuta for all applicable data sources to use the current cluster as compute for all subsequent metadata operations for that user against the applicable data sources.

    For more details about ephemeral overrides and how to configure or disable them, see the Ephemeral overrides page.

    Restricting users' access with Immuta projects

    Immuta projects combine users and data sources under a common purpose. Sometimes this purpose is for a single user to organize their data sources or to control an entire schema of data sources through a single projects screen; most often, however, the project is tied to an Immuta purpose for which the data has been approved to be used, which restricts access to data and streamlines team collaboration. Consequently, data owners can restrict access to data for a specified purpose through projects.

    When a user is working within the context of a project, data users will only see the data in that project. This helps to prevent data leaks when users collaborate. Users can switch project contexts to access various data sources while acting under the appropriate purpose. Consider adjusting the following project settings to suit your organization's needs:

    • Project UDFs (web service and on-cluster caches): Immuta caches a mapping of user accounts and users' current projects in the Immuta Web Service and on-cluster. When users change their project with UDFs instead of the Immuta UI, Immuta invalidates all the caches on-cluster (so that everything changes immediately) and the cluster submits a request to change the project context to a web worker. Immediately after that request, another call is made to a web worker to refresh the current project. To allow use of project UDFs in Spark jobs, raise the caching on-cluster and lower the cache timeouts for the Immuta Web Service. Otherwise, caching could cause dissonance among the requests and calls to multiple web workers when users try to change their project contexts.

    • Preventing users from changing projects within a session: If your compliance requirements restrict users from changing projects within a session, you can block the use of Immuta's project UDFs on a Databricks Spark cluster. To do so, configure the IMMUTA_SPARK_DATABRICKS_DISABLED_UDFS Spark environment variable.
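
      For example, this environment variable takes a comma-separated list of UDF names to block; the value below is a placeholder rather than an actual list of Immuta project UDFs:

      IMMUTA_SPARK_DATABRICKS_DISABLED_UDFS=<comma-separated list of project UDF names>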

    Databricks features

    This section describes how Immuta interacts with common Databricks features.

    Change data feed

    Databricks users can see the Databricks change data feed (CDF) on queried tables if they are allowed to read raw data and meet specific qualifications. Immuta does not support applying policies to the changed data, and the CDF cannot be read for data source tables if the user does not have access to the raw data in Databricks or for streaming queries.

    The CDF can be read if the querying user is allowed to read the raw data and ONE of the following statements is true:

    • the table is in the current workspace

    • the table is in a scratch path

    • non-Immuta reads are enabled AND the table does not intersect with a workspace under which the current user is not acting

    • non-Immuta reads are enabled AND the table is not part of an Immuta data source

    Databricks trusted libraries

    Security vulnerability

    Using this feature could create a security vulnerability, depending on the third-party library. For example, if a library exposes a public method named readProtectedFile that displays the contents of a sensitive file, then trusting that library would allow end users access to that file. Work with your Immuta support professional to determine whether this risk applies to your environment or use case.

    The trusted libraries feature allows Databricks cluster administrators to avoid Immuta Security Manager errors when using third-party libraries. An administrator can specify an installed library as trusted, which will enable that library's code to bypass the Immuta Security Manager. This feature does not impact Immuta's ability to apply policies; trusting a library only allows code through that would otherwise be blocked by the Security Manager.

    The following types of libraries are supported when installing a third-party library using the Databricks UI or the Databricks Libraries API:

    • Library source is Upload, DBFS, or DBFS/S3 and the Library Type is Jar.

    • Library source is Maven.

    When users install third-party libraries, those libraries will be denied access to sensitive resources by default. However, cluster administrators can specify which of the installed Databricks libraries should be trusted by Immuta. See the Install a trusted library guide to add a trusted library to your configuration.
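
    As an illustration, a cluster administrator might trust a jar uploaded to DBFS by adding its URI to the trusted library environment variable; the jar path below is hypothetical:

    IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS=dbfs:/FileStore/jars/my-third-party-library.jar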

    Limitations

    • Installing trusted libraries outside of the Databricks Libraries API (e.g., ADD JAR ...) is not supported.

    • Databricks installs libraries right after a cluster has started, but there is no guarantee that library installation will complete before a user's code is executed. If a user executes code before a trusted library installation has completed, Immuta will not be able to identify the library as trusted. This can be solved by either

      • waiting for library installation to complete before running any third-party library commands or

      • executing a Spark query. This will force Immuta to wait for any trusted Immuta libraries to complete installation before proceeding.

  • When installing a library using Maven as a library source, Databricks will also install any transitive dependencies for the library. However, those transitive dependencies are installed behind the scenes and will not appear as installed libraries in either the Databricks UI or using the Databricks Libraries API. Only libraries specifically listed in the IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS environment variable will be trusted by Immuta, which does not include installed transitive dependencies. This effectively means that any code paths that include a class from a transitive dependency but do not include a class from a trusted third-party library can still be blocked by the Immuta Security Manager. For example, if a user installs a trusted third-party library that has a transitive dependency of a file-util library, the user will not be able to directly use the file-util library to read a sensitive file that is normally protected by the Immuta Security Manager.

      In many cases, it is not a problem if dependent libraries aren't trusted because code paths where the trusted library calls down into dependent libraries will still be trusted. However, if the dependent library needs to be trusted, there is a workaround:
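
      Add the transitive dependency jar paths to the IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS Spark environment variable. In the driver log4j logs, Databricks outputs the source jar locations when it installs transitive dependencies. In the cluster driver logs, look for a log message similar to the following:

      INFO LibraryDownloadManager: Downloaded library dbfs:/FileStore/jars/maven/org/slf4j/slf4j-api-1.7.25.jar as
      local file /local_disk0/tmp/addedFile8569165920223626894slf4j_api_1_7_25-784af.jar

      In the above example, where slf4j is the transitive dependency, you would add the path dbfs:/FileStore/jars/maven/org/slf4j/slf4j-api-1.7.25.jar to the IMMUTA_SPARK_DATABRICKS_TRUSTED_LIB_URIS environment variable and restart your cluster.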

    External catalogs

    Connect any supported external catalog to work with your Databricks Spark integration so data owners can tag their data.

    External metastores

    Immuta supports the use of external metastores in local or remote mode:

    • Local mode: The metastore client running inside a cluster connects to the underlying metastore database directly via JDBC.

    • Remote mode: Instead of connecting to the underlying database directly, the metastore client connects to a separate metastore service via the Thrift protocol. The metastore service connects to the underlying database. When running a metastore in remote mode, DBFS is not supported.

    For more details about these deployment modes, see how to set up Databricks clusters to connect to an existing external Apache Hive metastore.

    Configure external Hive metastore

    Download the metastore jars and point to them as specified in Databricks documentation. Metastore jars must end up on the cluster's local disk at this explicit path: /databricks/hive_metastore_jars.

    If using DBR 7.x with Hive 2.3.x, either

    • Set spark.sql.hive.metastore.version to 2.3.7 and spark.sql.hive.metastore.jars to builtin or

    • Download the metastore jars and set spark.sql.hive.metastore.jars to /databricks/hive_metastore_jars/* as before.
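
    For example, with DBR 7.x and Hive 2.3.x, the first option corresponds to the following entries in the cluster's Spark Config section:

    spark.sql.hive.metastore.version 2.3.7
    spark.sql.hive.metastore.jars builtin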

    Configure AWS Glue Data Catalog

    To use AWS Glue Data Catalog as the metastore for Databricks, see the Databricks documentation.

    Notebook-scoped libraries on machine learning clusters

    Users on Databricks Runtimes 8+ can manage notebook-scoped libraries with %pip commands.

    However, this functionality differs from the support for Databricks trusted libraries, and Python libraries are not supported as trusted libraries. The Immuta Security Manager will deny the code of libraries installed with %pip access to sensitive resources.

    Scratch paths

    Scratch paths are cluster-specific remote file paths that Databricks users are allowed to directly read from and write to without restriction. The creator of a Databricks cluster specifies the set of remote file paths that are designated as scratch paths on that cluster when they configure a Databricks cluster. Scratch paths are useful for scenarios where non-sensitive data needs to be written out to a specific location using a Databricks cluster protected by Immuta.

    To configure a scratch path, use the IMMUTA_SPARK_DATABRICKS_SCRATCH_PATHS Spark environment variable.
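
    For example, a hypothetical configuration that designates a single S3 bucket as a scratch path might set the variable as follows (the bucket name is illustrative only):

    IMMUTA_SPARK_DATABRICKS_SCRATCH_PATHS=s3://my-scratch-bucket/scratch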

  • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

  • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.

  • Within Immuta, fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.azuredatabricks.net/oidc/v1/token.

  • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier and is the client ID displayed in Databricks when creating the client secret for the service principal (note that Azure Databricks uses the Azure SP Client ID; it will be identical).

  • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

  • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.

  • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

  • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.

  • Within Immuta, fill out the Token Endpoint with the full URL of the identity provider. This is where the generated token is sent. The default value is https://<your workspace name>.azuredatabricks.net/oidc/v1/token.

  • Fill out the Client ID. This is a combination of letters, numbers, or symbols, used as a public identifier and is the client ID displayed in Databricks when creating the client secret for the service principal (note that Azure Databricks uses the Azure SP Client ID; it will be identical).

  • Enter the Scope (string). The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.

  • Enter the Client Secret you created above. Immuta uses this secret to authenticate with the authorization server when it requests a token.




    Amazon S3

    Private preview: This integration is available to select accounts. Contact your Immuta representative for details.

    Getting started

    Immuta's Amazon S3 integration allows users to apply subscription policies to data in S3 to restrict what prefixes, buckets, or objects users can access. To enforce access controls on this data, Immuta creates S3 grants that are administered by S3 Access Grants, an AWS feature that defines access permissions to data in S3.

    Requirements

    • No location is registered in your S3 Access Grants instance before configuring the integration in Immuta

    • The Amazon S3 integration enabled on your Immuta tenant; contact your Immuta representative to get this feature enabled

    • Enable AWS IAM Identity Center (IDC) (recommended): IDC is the best approach for user provisioning because it treats users as users, not users as roles. Consequently, access controls are enforced for the querying user, nothing more. This approach eliminates over-provisioning and permits granular access control. Furthermore, IDC uses trusted identity propagation, meaning AWS propagates a user's identity wherever that user may operate within the AWS ecosystem. As a result, a user's identity always remains known and consistent as they navigate across AWS services, which is a key requirement for organizations to properly govern that user. Enabling IDC does not impact any existing access controls; it is additive. Immuta will manage the GRANTs for you using IDC if it is enabled and configured in Immuta. See the user provisioning section below for instructions on mapping users from AWS IDC to user accounts in Immuta.

    Permissions

    • APPLICATION_ADMIN Immuta permission to configure the integration

    • CREATE_S3_DATASOURCE Immuta permission to register S3 prefixes

    • The AWS account credentials or optional AWS IAM role you provide Immuta to configure the integration must

    Set up S3 Access Grants instance

    1. Create an S3 Access Grants instance in your AWS account. AWS supports one Access Grants instance per region per AWS account.

    2. Create an IAM role for the Access Grants location. You will add this role to your integration configuration in Immuta so that Immuta can register this role with your Access Grants location. The policy should include at least the following permissions, but might need additional permissions depending on other local setup factors. An example trust policy is provided below.

      • sts:AssumeRole

    IAM role trust policy example
    1. Create an IAM policy with the following permissions, and attach the policy to the IAM role you created to grant the permissions to the role. An example policy is provided below.

    • s3:GetObject

    • s3:GetObjectVersion

    • s3:GetObjectAcl

    IAM policy example

    Replace <bucket_arn> in the example below with the ARN of the bucket scope that contains data you want to grant access to.

    If you use server-side encryption with AWS Key Management Service (AWS KMS) keys to encrypt your data, the following permissions are required for the IAM role in the policy. If you do not use this feature, do not include these permissions in your IAM policy:

    • kms:Decrypt

    • kms:GenerateDataKey

    1. Create an IAM role or user that Immuta can use to create Access Grants locations and issue grants. This role must have the S3 permissions listed in the permissions section above. An example policy is provided below.

    IAM policy example

    Replace <role_arn> and <access_grants_instance_arn> in the example below with the ARNs of the role you created and your Access Grants instance, respectively. The Access Grants instance resource ARN should be scoped to apply to any future locations that will be created under this Access Grants instance. For example, "Resource": "arn:aws:s3:us-east-2:6********499:access-grants/default*" ensures that the role would have permissions for both of these locations:

    • arn:aws:s3:us-east-2:6********499:access-grants/default/newlocation1

    1. If you use AWS IAM Identity Center, associate your IAM Identity Center instance with your S3 Access Grants instance. Then add the permissions listed in the sample policy below to your IAM policy, and attach the policy to the IAM role you created to grant the permissions to the role.

    IAM policy example

    Copy the JSON below and replace the following bracketed placeholder values with your own. For details about the actions and resource values, see the AWS documentation.

    • <iam_identity_center_instance_arn>: The ARN of the IAM Identity Center instance that is configured with the application.

    • <iam_identity_center_application_arn_for_s3_access_grants>: The ARN of the S3 Access Grants application configured with IAM Identity Center.

    Configure the integration in Immuta

    1. In Immuta, click App Settings in the navigation menu and click the Integrations tab.

    2. Click + Add Integration.

    3. Select Amazon S3 from the dropdown menu and click Continue Configuration.

    4. Complete the connection details fields, where

    Register S3 data

    1. Follow the data registration steps to register prefixes in Immuta.

    To create an S3 data source using the API, see the Immuta API documentation.

    Editing an integration

    You can edit the following settings for an existing Amazon S3 integration on the app settings page:

    • friendly name

    • authentication type and values (access key, secret, and role)

    To edit settings for an existing integration via the API, see the Immuta API documentation.

    Protect data

    Requirements: USER_ADMIN Immuta permission and either the GOVERNANCE or CREATE_S3_DATASOURCE Immuta permission

    1. Build subscription policies in Immuta to enforce access controls.

    2. Map AWS IAM principals to each Immuta user to ensure Immuta properly enforces policies:

      1. Click Identities and select Users in the navigation menu.

      2. Navigate to the user's page and click the more actions icon next to their username.

    Access data

    Requirement: User must be subscribed to the data source in Immuta

    1. Request temporary credentials from S3 Access Grants. If you're accessing S3 data through one of the supported AWS services (such as Amazon EMR on EC2), that application will make this request on your behalf, so you can skip this step.

    2. Use the temporary credentials to access the data in S3.
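
    If you are requesting credentials directly rather than through a supported AWS service, a minimal sketch using the AWS CLI's S3 Control get-data-access operation might look like the following (the account ID and target prefix are placeholders, and the flags you need may vary with your CLI version and setup):

    aws s3control get-data-access \
        --account-id 111122223333 \
        --target "s3://research-data/*" \
        --permission READ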

    S3 integration overview

    Immuta's Amazon S3 integration allows users to apply subscription policies to data in S3 to restrict what prefixes, buckets, or objects users can access. To enforce access controls on this data, Immuta creates S3 grants that are administered by S3 Access Grants, an AWS feature that defines access permissions to data in S3.

    With this integration, users can avoid

    • hand-writing AWS IAM policies

    • managing AWS IAM role limits

    • manually tracking what user or role has access to what files in AWS S3 and verifying those are consistent with intent

    S3 Access Grants components

    To enforce controls on S3 data, Immuta interacts with several S3 Access Grants components:

    • Access Grants instance: An Access Grants instance is a logical container for individual grants that specify who can access what level of data in S3 in your AWS account and region. AWS supports one Access Grants instance per region per AWS account.

    • Location: A location specifies what data the Access Grants instance can grant access to. For example, registering a location with a scope of s3:// allows Access Grants to manage access to all S3 buckets in that AWS account and region, whereas setting the bucket s3://research-data as the scope limits Access Grants to managing access to that single bucket for that location. When you configure the S3 integration in Immuta, you specify a location's scope and IAM assumed role, and Immuta registers the location in your Access Grants instance and associates it with the provided IAM role for you. Each S3 integration you configure in Immuta is associated with one location, and Immuta manages all grants in that location. Therefore, grants cannot be manually created by users in an Access Grants instance location that Immuta has registered and manages. During data source registration, this location scope is prepended to the data source prefixes to build the final path used to grant or revoke access to that data in S3. For example, a location scope of s3://research-data would be prepended to the data source prefix /demographics to generate a final path of s3://research-data/demographics.

    The diagram below illustrates how these S3 Access Grants components interact.

    For more details about these Access Grants concepts, see the AWS documentation.

    How does the integration work?

    After an administrator creates an Access Grants instance and an assumed IAM role in their AWS account, an application administrator configures the Amazon S3 integration in Immuta. During configuration, the administrator provides the following connection information so that Immuta can create and register a location in that Access Grants instance:

    • AWS account ID and region

    • ARN for the existing Access Grants instance

    • ARN for the assumed IAM role

    When Immuta registers this location, it associates the assumed IAM role with the location. This allows the IAM role to create temporary credentials with access scoped to a particular S3 prefix, bucket, or object in the location. The IAM role you create for this location must have all the object- and bucket-level permissions listed above on all buckets and objects in the location; if it is missing permissions, the IAM role will not be able to grant those missing permissions to users or applications requesting temporary credentials.

    In the example below, an application administrator registers the following location prefix and IAM role for their Access Grants instance in AWS account 123456:

    • Location path: s3://. This path allows a single Amazon S3 integration to manage all objects in S3 in that AWS account and region. Data owners can scope down access further when registering specific S3 prefixes and applying policies.

    • Location IAM role: The arn:aws:iam::123456:role/access-grants-role IAM role will be used to vend temporary credentials to users and applications.

    Immuta registers this location and associated IAM role in the user's Access Grants instance:

    After the S3 integration is configured, a data owner can register S3 prefixes and buckets that are in the configured Access Grants location path to enforce access controls on resources. Immuta stores the connection information for the prefix so that the metadata can be used to create and enforce subscription policies on S3 data.

    A data owner or governor can apply a subscription policy to a registered prefix, bucket, or object to control who can access objects beginning with that prefix or in that bucket after it is registered in Immuta. Once a subscription policy is created and Immuta users are subscribed to the prefix, bucket, or object, Immuta calls the Access Grants API to create a grant for each subscribed user, specifying the following parameters in the payload so that Access Grants can create and store a grant for each user:

    • Access Grants location

    • READ access

  • User or role principal

    • Registered prefix, bucket, or object

    In the example below, a data owner registers the s3://research-data/* bucket, and Immuta stores the connection information in the Immuta metadata database. Once the user, Taylor, is subscribed to s3://research-data/*, Immuta calls the Access Grants API to create a grant for that user to allow them to read and write S3 data in that bucket:

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations are available in the Immuta API documentation. However, the UI consolidates these error statuses and provides detail in the error messages.

    Accessing S3 data

    To access S3 data registered in Immuta, users must be subscribed to the prefix, bucket, or object in Immuta, and their principals must be mapped to their Immuta user accounts. Once users are subscribed, they request temporary credentials from S3 Access Grants. Access Grants looks up the grant ID associated with the requester. If no matching grant exists, they receive an access denied error. If one exists, Access Grants assumes the IAM role associated with the location and requests temporary credentials that are scoped to the prefix, bucket, or object and permissions specified by the individual grant. Access Grants vends the credentials to the requester, who uses those temporary credentials to access the data in S3.

    In the example below, Taylor requests temporary credentials from S3 Access Grants. Access Grants looks up the grant ID (1) for that user, assumes the arn:aws:iam::123456:role/access-grants-role IAM role for the location, and vends temporary credentials to Taylor, who then uses the credentials to access the research-data bucket in S3:

    Note that when accessing data through S3 Access Grants, the user or application interacts directly with the Access Grants API to request temporary credentials; Immuta does not act in this process at all. See the diagram below for an illustration of the process for accessing data through S3 Access Grants.

    AWS services that support S3 Access Grants will request temporary credentials for users automatically. If users are not using a service that supports S3 Access Grants, they must have permission to call the S3 Access Grants GetDataAccess operation to request temporary credentials to access data through the access grant.

    For a list of AWS services that support S3 Access Grants, see the AWS documentation.

    Policy enforcement

    Immuta's S3 integration allows data owners and governors to apply object-level access controls on data in S3 through subscription policies. When a user is subscribed to a registered prefix, bucket, or object, Immuta calls the Access Grants API to create an individual grant that narrows the scope of access within the location to that registered prefix, bucket, or object. See the diagram below for a visualization of this process.

    When a user's entitlements change or a subscription policy is added to, updated, or deleted from a prefix, Immuta performs one of the following processes for each user subscribed to the registered prefix:

    • User added to the prefix: Immuta specifies a permission (READ or READWRITE) for each user and uses the Access Grants API to create an individual grant for each user.

    • User updated: Immuta deletes the current grant ID and creates a new one using the Access Grants API.

    • User deleted: Immuta deletes the grant ID using the Access Grants API.

    Immuta offers two types of subscription policies to manage read and write access to data in S3:

    • Read access policies manage who can get objects from S3.

    • Write access policies manage who can modify data in S3.

    Data policies, which provide more granular controls by redacting or masking values in a table, are not supported for S3.

    Prefix registration

    Data owners can register an S3 prefix at any level in the S3 path by registering it as an Immuta data source. During this process, Immuta stores the connection information for use in subscription policy enforcement.

    Each prefix added in the data registration workflow is created as a single Immuta data source, and a subscription policy added to a data source applies to any objects in that bucket or beginning with that prefix:

    Therefore, data owners should register prefixes or buckets at the lowest level of access control they need for that data. Using the example above, if the data owner needed to allow different users to access s3://yellow-bucket/research-data/* than those who should access s3://yellow-bucket/analyst-data/*, the data owner must register the research-data/* and analyst-data/* prefixes separately and then apply a subscription policy to those prefixes:

    Deleting registered prefixes

    When an S3 data source is deleted, Immuta deletes all the grants associated with that prefix, bucket, or object in that location.

    User provisioning

    Access can be managed in AWS using IAM users, roles, or Identity Center (IDC). Immuta supports each of these principal types for user provisioning in the S3 integration.

    However, if you manage access in AWS through IAM roles instead of users, user provisioning in Immuta must be done using IAM role principals. This means that if users share IAM roles, you could end up in a situation where you over-provision access to everyone in the IAM role.

    See the guidelines below for the best practices to avoid this behavior if you currently use IAM roles to manage access.

    1. Enable AWS IAM Identity Center (recommended): IDC is the best approach for user provisioning because it treats users as users, not users as roles. Consequently, access controls are enforced for the querying user, nothing more. This approach eliminates over-provisioning and permits granular access control. Furthermore, IDC uses trusted identity propagation, meaning AWS propagates a user's identity wherever that user may operate within the AWS ecosystem. As a result, a user's identity always remains known and consistent as they navigate across AWS services, which is a key requirement for organizations to properly govern that user. Enabling IDC does not impact any existing access controls; it is additive. Immuta will manage the GRANTs for you using IDC if it is enabled and configured in Immuta. See the Mapping IAM principals in Immuta section below for instructions on mapping users from AWS IDC to user accounts in Immuta.

    2. Create an IAM role per user: If you do not have IDC enabled, create an IAM role per user that is unique to that user and assign that IAM role to each corresponding user in Immuta. Ensure that the IAM role cannot be shared with other users. This approach can be a challenge because there is an AWS quota on the number of IAM roles per account.

    Mapping IAM principals in Immuta

    Names are case-sensitive

    The IAM role name and IAM user name are case-sensitive. See the AWS documentation for details.

    Immuta supports mapping an Immuta user to AWS in one of the following ways:

    • AWS IAM role principals: Only a single Immuta user can be mapped to an IAM role. This restriction prohibits enforcing policies on AWS users who could assume that role. Therefore, if using role principals, create a new user in Immuta that represents the role so that the role then has the permissions applied specifically to it.

    See the Protect data section above for instructions on mapping principals to user accounts in Immuta.

    Existing S3 integrations

    The Amazon S3 integration will not interfere with existing legacy S3 integrations, and multiple S3 integrations can exist in a single Immuta tenant.

    Supported AWS services

    AWS services that support S3 Access Grants will request temporary credentials for users automatically. If users are not using a service that supports S3 Access Grants, they must have permission to call the S3 Access Grants GetDataAccess operation to request temporary credentials to access data through the access grant.

    For a list of AWS services that support S3 Access Grants, see the AWS documentation.

    Limitations

  • During private preview, Immuta supports up to 500 prefixes (data sources) and up to 20 Immuta users that are mapped to S3 principals. This is a preview limitation that will be removed in a future phase of the integration.

  • S3 Access Grants allows 100,000 grants per region per account. Thus, if you have 5 Immuta users with access to 20,000 registered prefixes, you would reach this limit. See the AWS documentation for details.

    • The following Immuta features are not currently supported by the integration in private preview:

  • have ownership of the buckets Immuta will enforce policies on
  • have the permissions to perform the following actions to create locations and issue grants:

    • accessgrantslocation resource:

      • s3:CreateAccessGrant

      • s3:DeleteAccessGrantsLocation

      • s3:GetAccessGrantsLocation

      • s3:UpdateAccessGrantsLocation

    • accessgrantsinstance resource:

      • s3:CreateAccessGrantsInstance

      • s3:CreateAccessGrantsLocation

      • s3:DeleteAccessGrantsInstance

    • accessgrant resource:

      • s3:DeleteAccessGrant

      • s3:GetAccessGrant

    • bucket resource: s3:ListBucket

    • role resource:

      • iam:GetRole

      • iam:PassRole

    • all resources: s3:ListAccessGrantsInstances

  • sts:SetSourceIdentity

  • s3:GetObjectVersionAcl
  • s3:ListMultipartUploadParts

  • s3:PutObject

  • s3:PutObjectAcl

  • s3:PutObjectVersionAcl

  • s3:DeleteObject

  • s3:DeleteObjectVersion

  • s3:AbortMultipartUpload

  • s3:ListBucket

  • s3:ListAllMyBuckets

  • arn:aws:s3:us-east-2:6********499:access-grants/default/newlocation2

  • <aws_account>: Your AWS account ID.

  • <identity_store_id>: The globally unique identifier for the identity store (IdentityStoreId) that is connected to the Identity Center instance. This value is generated when a new identity store is created.

  • Friendly Name is a name for the integration that is unique across all Amazon S3 integrations configured in Immuta.

  • AWS Account ID is the ID of your AWS account.

  • AWS Region is the AWS region to use.

  • S3 Access Grants Location IAM Role ARN is the role the S3 Access Grants service assumes to vend credentials to the grantee. When a grantee accesses S3 data, the Access Grants service attaches session policies and assumes this role in order to vend credentials scoped to a prefix or bucket to the grantee. This role needs full access to all paths under the S3 location prefix.

  • S3 Access Grants S3 Location Scope is the base S3 location that Immuta will use for this connection when registering S3 prefixes. This path must be unique across all S3 integrations configured in Immuta. During data source registration, this prefix is prepended to the data source prefixes to build the final path used to grant or revoke access to that data in S3. For example, a location prefix of s3://research-data would be prepended to the data source prefix /demographics to generate a final path of s3://research-data/demographics.

  • Select your authentication method:

    • Automatically discover AWS credentials: Searches and obtains credentials using the AWS SDK's default credential provider chain. This method requires a configured IAM role for a service account (IRSA). Contact your Immuta representative to customize your deployment and set up an IAM role for a service account that can give Immuta the credentials to set up the integration. Then, complete the steps below.

    • Access using access key and secret access key: Provide your AWS Access Key ID and AWS Secret Access Key.

  • Click Verify Credentials.

  • Click Next to review and confirm your connection information, and then click Complete Setup.

  • Click the icon next to their username.
  • Select Change S3 User or AWS IAM Role from the dropdown menu.

  • Use the dropdown menu to select the User Type. Then complete the S3 field. User and role names are case-sensitive. See the AWS documentation for details.

    • AWS IAM role principals: Only a single Immuta user can be mapped to an IAM role. This restriction prohibits enforcing policies on AWS users who could assume that role. Therefore, if using role principals, create a new user in Immuta that represents the role so that the role then has the permissions applied specifically to it.

    • AWS IAM user principals

    • AWS Identity Center user IDs: You must use the numeric User ID value found in AWS IAM Identity Center, not the user's email address. Ensure that you have added the content to your IAM policy JSON as outlined in the permissions section above to allow Immuta to use AWS Identity Center.

    • Unset (fallback to Immuta username): When selecting this option, the S3 username is assumed to be the same as the Immuta username.

  • Click Save.

  • See the Mapping IAM principals in Immuta section for details about supported principals.

  • Individual grants: Individual permission grants in S3 Access Grants specify the identity that can access the data, the access level, and the location of the S3 data. Immuta creates a grant for each user subscribed to a prefix, bucket, or object by interacting with the Access Grants API. Each grant has its own ID and gives the user or role principal access to the data (see the sketch after this list).

  • IAM assumed role: This is an IAM role you create in AWS that has full access to all prefixes, buckets, and objects in the Access Grants location registered by Immuta. This IAM role is used to vend temporary credentials to users or applications. When a grantee requests temporary credentials, the S3 Access Grants service assumes this role to vend credentials scoped to the prefix, bucket, or object specified in the grant to the grantee. The grantee then uses these credentials to access S3 data. When configuring the integration in Immuta, you specify this role, and then Immuta associates this role with the registered location in the Access Grants instance.

  • Temporary credentials: These just-in-time access credentials provide access to a prefix, bucket, or object with a permission level of READ or READWRITE in S3. When a user or application requests temporary credentials to access S3 data, the S3 Access Grants instance evaluates the request against the grants Immuta has created for that user. If a matching grant exists, S3 Access Grants assumes the IAM role associated with the location of the matching grant and scopes the permissions of the IAM session to the S3 prefix, bucket, or object specified by the grant and vends these temporary credentials to the requester. These credentials have a default timeout of 1 hour, but this duration can be changed by the requester.

  • Request on behalf of IAM roles (not recommended): Create users in Immuta that map to each of your existing IAM roles. Then, when users request access to data, they request on behalf of the IAM role user rather than themselves. This approach is not recommended because everyone in that role will gain access to data when granted access through a policy, and adding future users to that role will also grant access. Furthermore, it requires policy authors and approvers to understand what role should have access to what data.
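
    For illustration, the individual grants Immuta creates through the Access Grants API are similar in shape to the boto3 call below. This is a hedged sketch only; Immuta manages these grants for you, and the account ID, location ID, and grantee ARN are placeholders.

    # Illustrative only: the kind of individual grant Immuta creates through the
    # S3 Access Grants API for a subscribed user. All identifiers are placeholders.
    import boto3

    s3control = boto3.client("s3control", region_name="us-east-2")

    grant = s3control.create_access_grant(
        AccountId="123456789012",
        AccessGrantsLocationId="a1b2c3d4-example",    # registered Access Grants location
        AccessGrantsLocationConfiguration={
            "S3SubPrefix": "demographics/*"           # data source prefix under the location scope
        },
        Grantee={
            "GranteeType": "IAM",                     # or DIRECTORY_USER when IDC is enabled
            "GranteeIdentifier": "arn:aws:iam::123456789012:role/analyst-role",
        },
        Permission="READ",
    )
    print(grant["AccessGrantId"])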


    Databricks Unity Catalog Integration Reference Guide

    Immuta’s integration with Unity Catalog allows you to enforce fine-grained access controls on Unity Catalog securable objects with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and orchestrate Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouses:

    • Subscription policies: Immuta subscription policies automatically grant and revoke access to specific Databricks securable objects.

    • Data policies: Immuta data policies enforce row- and column-level security.

    Unity Catalog object model

    Unity Catalog uses the following hierarchy of data objects:

    • Metastore: Created at the account level and is attached to one or more Databricks workspaces. The metastore contains metadata of all the catalogs, schemas, and tables available to query. All clusters on that workspace use the configured metastore and all workspaces that are configured to use a single metastore share those objects.

    • Catalog: Sits on top of schemas (also called databases) and tables to manage permissions across a set of schemas

    • Schema: Organizes tables and views

    • Table, view, volume, model, and function: Tables can be managed or external tables

    For details about the Unity Catalog object model, see the Databricks Unity Catalog documentation.

    Feature support

    The Databricks Unity Catalog integration supports

    • enforcing Unity Catalog row-, column-, and table-level access controls on Databricks clusters and SQL warehouses:

      • applying column masks and row filters on specific securable objects

      • applying subscription policies on tables and views

    • enforcing Unity Catalog access controls, even if Immuta becomes disconnected

    • auditing activity of both Immuta users and non-Immuta users

    • allowing non-Immuta reads and writes

    • using Photon

    • using a proxy server

    What does Immuta do in my Databricks environment?

    Unity Catalog supports managing permissions account-wide in Databricks through controls applied directly to objects in the metastore. To establish a connection with Databricks and apply controls to securable objects within the metastore, Immuta requires a service principal with privileges to manage all data protected by Immuta. Databricks OAuth for service principals (OAuth M2M) or a personal access token (PAT) can be provided for Immuta to authenticate as the service principal. See the Databricks Unity Catalog privileges section for a list of specific Databricks privileges.

    Immuta uses this service principal to run queries that set up user-defined functions (UDFs) and other data necessary for policy enforcement. Upon enabling the integration, Immuta will create a catalog that contains these schemas:

    • immuta_system: Contains internal Immuta data.

    • immuta_policies_n: Contains policy UDFs.

    When policies require changes to be pushed to Unity Catalog, Immuta updates the internal tables in the immuta_system schema with the updated policy information. If necessary, new UDFs are pushed to replace any out-of-date policies in the immuta_policies_n schemas and any row filters or column masks are updated to point at the new policies. Many of these operations require compute on the configured Databricks cluster or SQL warehouse, so compute must be available for these policies to succeed.

    Workspace-catalog binding

    Workspace-catalog binding allows users to leverage Databricks’ catalog isolation mode to limit catalog access to specific Databricks workspaces. The default isolation mode is OPEN, meaning all workspaces can access the catalog (with the exception of the automatically-created workspace catalog), provided they are in the metastore attached to the catalog. Setting this mode to ISOLATED allows the catalog owner to specify a workspace-catalog binding, which means the owner can dictate which workspaces are authorized to access the catalog. This prevents other workspaces from accessing the specified catalogs. To bind a catalog to a specific workspace in Databricks Unity Catalog, see the Databricks documentation.
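
    As a sketch of the Databricks side of this setup (performed outside of Immuta), the snippet below uses the databricks-sql-connector package to switch a catalog to ISOLATED mode; the hostname, HTTP path, access token, and catalog name are placeholders, and the catalog owner would then bind the catalog to the authorized workspaces in the Databricks UI or API.

    # Hedged sketch: set a catalog's isolation mode to ISOLATED from a SQL warehouse.
    # Connection details and the catalog name are placeholders.
    from databricks import sql

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder workspace host
        http_path="/sql/1.0/warehouses/abc123def456",                  # placeholder SQL warehouse
        access_token="dapiEXAMPLETOKEN",                               # placeholder token
    ) as conn:
        with conn.cursor() as cursor:
            # Restrict prod_catalog so only explicitly bound workspaces can access it.
            cursor.execute("ALTER CATALOG prod_catalog SET ISOLATION MODE ISOLATED")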

    Use cases

    Typical use cases for binding a catalog to specific workspaces include

    1. Ensuring users can only access production data from a production workspace environment.

      For example, you may have production data in a prod_catalog, as well as a production workspace you are introducing to your organization. Binding the prod_catalog to the prod_workspace ensures that workspace admins and users can only access prod_catalog from the prod_workspace environment.

    2. Ensuring users can only process sensitive data from a specific workspace. Limiting the environments from which users can access sensitive data helps better secure your organization’s data. Limiting access to one workspace also simplifies any monitoring, auditing, and understanding of which users are accessing specific data. This would entail a similar setup as the example above.

    3. Giving users read-only access to production data from a developer workspace.

      This enables your organization to effectively conduct development and testing, while minimizing risk to production data. All user access to this catalog from this workspace can be specified as read-only, ensuring developers can access the data they need for testing without risk of any unwanted updates.

    Additional workspace connections

    Immuta’s Databricks Unity Catalog integration allows users to configure additional workspace connections to support using Databricks' workspace-catalog binding feature. Users can configure additional workspace connections in their Immuta integrations to be consistent with the workspace-catalog bindings that are set up in Databricks. Immuta will use each additional workspace connection to govern the catalog(s) that workspace is bound to in Databricks. If desired, each set of bound catalogs can also be configured to run on its own compute.

    To use this feature, you should first set up a workspace-catalog binding in your Databricks account. Once that is configured, you can use Immuta's Integrations API to configure an additional workspace connection. This can be added when you initially set up the integration or by updating your existing integration configuration.

    Limitations

    • Additional workspace connections in Databricks Unity Catalog are not currently supported in Immuta's connections feature.

    • Each additional workspace connection must be in the same metastore as the primary workspace used to set up the integration.

    • No two additional workspace connections can be responsible for the same catalog.

    Databricks Unity Catalog privileges

    The privileges the Databricks Unity Catalog integration requires align to the least privilege security principle. The table below describes each privilege required in Databricks Unity Catalog for the setup user and the Immuta service principal.

    Databricks Unity Catalog privilege
    User requiring the privilege
    Explanation

    Policy enforcement

    Immuta’s Unity Catalog integration applies Databricks table-, row-, and column-level security controls that are enforced natively within Databricks. Immuta's management of these Databricks security controls is automated and ensures that they synchronize with Immuta policy or user entitlement changes.

    • Table-level security: Immuta manages GRANT and REVOKE privileges on Databricks securable objects that have been registered as Immuta data sources. When you register a data source in Immuta, Immuta uses the Unity Catalog API to issue GRANTS or REVOKES against the catalog, schema, or table in Databricks for every user registered in Immuta.

    • Row-level security: Immuta applies SQL UDFs to restrict access to rows for querying users.

    • Column-level security: Immuta applies column-mask SQL UDFs to tables for querying users. These column-mask UDFs run for any column that requires masking.

    Policy behavior

    If you enable a Databricks Unity Catalog object in Immuta and it has no subscription policy set on it, Immuta will REVOKE access to that object in Databricks for all Immuta users, even if they had been directly granted access to that table outside of Immuta.

    If you disable a Unity Catalog data source in Immuta, all existing grants and policies on that object will be removed in Databricks for all Immuta users, regardless of whether they were set in Immuta or in Unity Catalog directly.

    If a user is not registered in Immuta, Immuta will have no effect on that user's access to data in Unity Catalog.

    Supported policies

    The Unity Catalog integration supports the following policy types:

    • Subscription policies

    • Select masking policies:

      • Conditional masking

      • Constant

      • Custom masking

      • Hashing

      • Null (including on ARRAY, MAP, and STRUCT type columns)

      • Regex: You must use the global regex flag (g) when creating a regex masking policy in this integration. You cannot use the case insensitive regex flag (i) when creating a regex masking policy in this integration. See the limitations section for examples.

      • Rounding (date and numeric rounding)

    • Row-level policies:

      • Matching (only show rows where):

        • Custom WHERE

        • Never

        • Where user

        • Where value in column

      • Minimization

      • Time-based restrictions

    Project-scoped purpose exceptions for Databricks Unity Catalog

    Project-scoped purpose exceptions for Databricks Unity Catalog integrations allow you to apply purpose-based policies to Databricks data sources in a project. As a result, users can only access that data when they are working within that specific project.

    Databricks Unity Catalog views

    If you are using views in Databricks Unity Catalog, one of the following must be true for project-scoped purpose exceptions to apply to the views in Databricks:

    • The view and underlying table are registered as Immuta data sources and added to a project: If a view and its underlying table are both added as Immuta data sources, both of these assets must be added to the project for the project-scoped purpose exception to apply. If a view and underlying table are both added as data sources but the table is not added to an Immuta project, the purpose exception will not apply to the view because Databricks does not support fine-grained access controls on views.

    • Only the underlying table is registered as an Immuta data source and added to a project: If only the underlying table is registered as an Immuta data source but the view is not registered, the purpose exception will apply to both the table and corresponding view in Databricks. Views are the only Databricks object that will have Immuta policies applied to them even if they're not registered as Immuta data sources (as long as their underlying tables are registered).

    Masked joins for Databricks Unity Catalog

    This feature allows masked columns to be joined across data sources that belong to the same project. When data sources do not belong to a project, Immuta uses a unique salt per data source for hashing to prevent masked values from being joined. (See the Why use masked joins? guide for an explanation of that behavior.) However, once you add Databricks Unity Catalog data sources to a project and enable masked joins, Immuta uses a consistent salt across all the data sources in that project to allow the join.

    For more information about masked joins and enabling them for your project, see the Masked joins section of the documentation.

    Policy exemption group

    The Databricks group configured as the policy exemption group in Immuta will be exempt from Immuta data policy enforcement. This account-level group is created and managed in Databricks, not in Immuta.

    If you have service or system accounts that need to be exempt from masking and row-level policy enforcement, add them to an account-level group in Databricks and include this group name in the Databricks Unity Catalog configuration in Immuta. Then, group members will be excluded from having data policies applied to them when they query Immuta-protected tables in Databricks.

    Typically, service or system accounts that perform the following actions are added to an exemption group in Databricks:

    • Automated queries

    • ETL

    • Report generation

    If you have multiple groups that must be exempt from data policies, add each group to a single group in Databricks that you then set as the policy exemption group in Immuta.

    The service principal used to register data sources in Immuta will be automatically added to the exemption group for the Databricks securables it registers. Consequently, accounts added to the exemption group and used to register data sources in Immuta should be limited to service accounts.

    For guidance on configuring a policy exemption group on the Immuta app settings page, see the Configure a Databricks Unity Catalog integration guide. Alternatively, this group can be configured via the integrations API or the connections API using the groupPattern object.

    Policy support with hive_metastore

    When enabling Unity Catalog support in Immuta, the catalog for all Databricks data sources will be updated to point at the default hive_metastore catalog. Internally, Databricks exposes this catalog as a proxy to the workspace-level Hive metastore that schemas and tables were kept in before Unity Catalog. Since this catalog is not a real Unity Catalog catalog, it does not support any Unity Catalog policies. Therefore, Immuta will ignore any data sources in the hive_metastore in any Databricks Unity Catalog integration, and policies will not be applied to tables there.

    However, with Databricks metastore magic you can use hive_metastore and enforce subscription and data policies with the Databricks Spark integration.

    Authentication methods

    The Databricks Unity Catalog integration supports the following authentication methods to configure the integration and create data sources:

    • Personal access token (PAT): This is the access token for the Immuta service principal. This service principal must have the metastore privileges listed in the Databricks Unity Catalog privileges section for the metastore associated with the Databricks workspace. If this token is configured to expire, update this field regularly for the integration to continue to function.

    • OAuth machine-to-machine (M2M): Immuta uses the Client Credentials Flow to integrate with Databricks OAuth machine-to-machine authentication, which allows Immuta to authenticate with Databricks using a client secret. Once Databricks verifies the Immuta service principal’s identity using the client secret, Immuta is granted a temporary OAuth token to perform token-based authentication in subsequent requests. When that token expires (after one hour), Immuta requests a new temporary token. See the Databricks OAuth machine-to-machine (M2M) authentication page for more details.
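
    For reference, the client credentials exchange Immuta performs resembles the request below. This is an illustrative sketch only, with placeholder values; Immuta handles this flow for you once the service principal's client ID and secret are configured.

    # Illustrative sketch of a Databricks OAuth M2M (client credentials) token request.
    # The workspace host, client ID, and client secret are placeholders.
    import requests

    workspace_host = "https://adb-1234567890123456.7.azuredatabricks.net"

    response = requests.post(
        f"{workspace_host}/oidc/v1/token",
        auth=("service-principal-client-id", "service-principal-client-secret"),
        data={"grant_type": "client_credentials", "scope": "all-apis"},
        timeout=30,
    )
    response.raise_for_status()
    access_token = response.json()["access_token"]  # short-lived token, valid for about one hour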

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations are available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

    Immuta data sources in Unity Catalog

    The Unity Catalog data object model introduces a 3-tiered namespace, as outlined above. Consequently, your Databricks tables registered as data sources in Immuta will reference the catalog, schema (also called a database), and table.

    Supported object types

    When applying read and write access subscription policies to data sources, the privileges granted by Immuta vary depending on the object type. See an outline of privileges granted by Immuta on the Subscription policy access types page.

    • Table: subscription policies ✅; data policies ✅

    • View: subscription policies ✅; data policies ❌

    • Materialized view: subscription policies ✅; data policies ❌

    • Streaming table: subscription policies ✅; data policies ❌

    • External table: subscription policies ✅; data policies ✅

    • Foreign table: subscription policies ✅; data policies ✅

    • Volumes (external and managed) (Public preview): subscription policies ✅; data policies ❌

    • Models (Public preview): subscription policies ✅; data policies ❌

    • Functions (Public preview): subscription policies ✅; data policies ❌

    External data connectors and query-federated tables

    External data connectors and query-federated tables are preview features in Databricks. See the Databricks documentation for details about the support and limitations of these features before registering them as data sources in the Unity Catalog integration.

    Query audit

    Access requirements

    For Databricks Unity Catalog audit to work, Immuta must have, at minimum, the following access.

    • USE CATALOG on the system catalog

    • USE SCHEMA on the system.access schema

    • SELECT on the following system tables:

      • system.access.audit

      • system.access.table_lineage

      • system.access.column_lineage

    The Databricks Unity Catalog integration audits all user queries run in the integration's clusters or SQL warehouses. See the Databricks Unity Catalog audit page for details about the contents of the logs.

    The audit ingest is set when configuring the integration and can be scoped to only ingest specific workspaces if needed. The default ingest frequency is every hour, but this can be configured to a different frequency on the Immuta app settings page. Additionally, audit ingestion can be manually requested at any time from the Immuta audit page. When manually requested, it will only search for new queries that were created since the last query that had been audited. The job is run in the background, so the new queries will not be immediately available.

    Databricks on AWS GovCloud limitation

    Users running Databricks Unity Catalog on AWS GovCloud do not have all the required system tables available to them and therefore cannot leverage Immuta's query audit capability.

    As a result, when creating a Databricks Unity Catalog connection, you must create the connection using the connections API and set audit to false in the payload. This allows the connection to skip the access permission checks for query audit. If you do not update the audit setting, these checks will not pass and you will not be able to create a connection.

    Tag ingestion

    Design partner preview: This feature is available to select accounts. Reach out to your Immuta representative to enable this feature.

    You can enable tag ingestion to allow Immuta to ingest Databricks Unity Catalog table and column tags so that you can use them in Immuta policies to enforce access controls. When you enable this feature, Immuta uses the credentials and connection information from the Databricks Unity Catalog integration to pull tags from Databricks and apply them to data sources as they are registered in Immuta. If Databricks data sources exist before Databricks Unity Catalog tag ingestion is enabled, those data sources will automatically sync to the catalog and tags will apply. Immuta checks for changes to tags in Databricks and syncs Immuta data sources to those changes every 24 hours.

    Once external tags are applied to Databricks data sources, those tags can be used to create subscription and data policies.

    To enable Databricks Unity Catalog tag ingestion, see the Configure a Databricks Unity Catalog integration page.

    Syncing tag changes

    After making changes to tags in Databricks, you can manually sync the catalog so that the changes immediately apply to the data sources in Immuta. Otherwise, tag changes will automatically sync within 24 hours.

    When syncing data sources to Databricks Unity Catalog tags, Immuta pulls the following information:

    • Table tags: These tags apply to the table and appear on the data source overview tab. Databricks tags' key and value pairs are reflected in Immuta as a hierarchy with each level separated by a . delimiter. For example, the Databricks Unity Catalog tag Location: US would be represented as Location.US in Immuta.

    • Column tags: These tags are applied to data source columns and appear on the columns listed in the data dictionary tab. Databricks tags' key and value pairs are reflected in Immuta as a hierarchy with each level separated by a . delimiter. For example, the Databricks Unity Catalog tag Location: US would be represented as Location.US in Immuta.

    • Table comments field: This content appears as the data source description on the data source details tab.

    • Column comments field: This content appears as dictionary column descriptions on the data dictionary tab.

    Limitations

    • Only tags that apply to Databricks data sources in Immuta are available to build policies in Immuta. Immuta will not pull tags in from Databricks Unity Catalog unless those tags apply to registered data sources.

    • Cost implications: Tag ingestion in Databricks Unity Catalog requires compute resources. Therefore, having many Databricks data sources or frequently manually syncing data sources to Databricks Unity Catalog may incur additional costs.

    • Databricks Unity Catalog tag ingestion only supports tenants with fewer than 2,500 data sources registered.

    Configuration requirements

    See the Enable Unity Catalog guide for a list of requirements.

    Unity Catalog caveats

    • Row access policies with more than 1023 columns are unsupported. This is an underlying limitation of UDFs in Databricks. Immuta will only create row access policies with the minimum number of referenced columns. This limit will therefore apply to the number of columns referenced in the policy and not the total number in the table.

    • If you disable table grants, Immuta revokes the grants. Therefore, if users had access to a table before enabling Immuta, they’ll lose access.

    • If multiple Immuta tenants are connected to your Databricks environment, you must create a separate Immuta catalog for each of those tenants during configuration. Having multiple Immuta tenants use the same Immuta catalog causes failures in policy enforcement.

    Azure Databricks Unity Catalog limitation

    If a registered data source is owned by a Databricks group at the table level, then the Unity Catalog integration cannot apply data masking policies to that table in Unity Catalog.

    Therefore, set all table-level ownership on your Unity Catalog data sources to an individual user or service principal instead of a Databricks group. Catalogs and schemas can still be owned by a Databricks group, as ownership at that level doesn't interfere with the integration.

    Feature limitations

    The following features are currently unsupported:

    • Immuta project workspaces

    • Multiple IAMs on a single cluster

    • Row filters and column masking policies on the following object types:

      • Functions

      • Materialized views

      • Models

      • Streaming tables

      • Views

      • Volumes

    • Mixing masking policies on the same column

    • R and Scala cluster support

    • Scratch paths

    • User impersonation

    • Policy enforcement on raw Spark reads

    • Python UDFs for advanced masking functions

    • Direct file-to-SQL reads

    • Data policies (except for masking with NULL) on ARRAY, MAP, or STRUCT type columns

    • Shallow clones

    Known issue

    Snippets for Databricks data sources may be empty in the Immuta UI.

    Next

    Configure the Databricks Unity Catalog integration.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1234567891011",
          "Effect": "Allow",
          "Principal": {
            "Service": "access-grants.s3.amazonaws.com"
          },
          "Action": [
            "sts:AssumeRole",
            "sts:SetSourceIdentity"
          ]
        }
      ]
    }
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ObjectLevelReadPermissions",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:GetObjectAcl",
                    "s3:GetObjectVersionAcl",
                    "s3:ListMultipartUploadParts"
                ],
                "Resource": [
                    <bucket arn>
                ]
            },
            {
                "Sid": "ObjectLevelWritePermissions",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:PutObjectVersionAcl",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                    "s3:AbortMultipartUpload"
                ],
                "Resource": [
                    <bucket arn>
                ]
            },
            {
                "Sid": "BucketLevelReadPermissions",
                "Effect": "Allow",
                "Action": [
                    "s3:ListAllMyBuckets",
                    "s3:ListBucket"
                ],
                "Resource": [
                    <bucket arn>
                ]
            }
        ]
    }
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RolePermissions",
                "Effect": "Allow",
                "Action": [
                    "iam:GetRole",
                    "iam:PassRole"
                ],
                "Resource": "<role_arn>"
            },
            {
                "Sid": "AccessGrants",
                "Effect": "Allow",
                "Action": [
                    "s3:CreateAccessGrant",
                    "s3:DeleteAccessGrantsLocation",
                    "s3:GetAccessGrantsLocation",
                    "s3:CreateAccessGrantsLocation",
                    "s3:GetAccessGrantsInstance",
                    "s3:GetAccessGrantsInstanceForPrefix",
                    "s3:GetAccessGrantsInstanceResourcePolicy",
                    "s3:ListAccessGrants",
                    "s3:ListAccessGrantsLocations",
                    "s3:ListAccessGrantsInstances",
                    "s3:DeleteAccessGrant",
                    "s3:GetAccessGrant"
                ],
                "Resource": [
                    "<access_grants_instance_arn>"
                ]
            }
        ]
    }
    {
      "Sid": "sso",
      "Effect": "Allow",
      "Action": [
        "sso:DescribeInstance",
        "sso:DescribeApplication",
        "sso-directory:DescribeUsers"
      ],
      "Resource": [
        "<iam_identity_center_instance_arn>",
        "<iam_identity_center_application_arn_for_s3_access_grants>",
        "arn:aws:identitystore:::user/*",
        "arn:aws:identitystore::<aws_account>:identitystore/<identity_store_id>"
      ]
    }, {
      "Sid": "idc",
      "Effect": "Allow",
      "Action": [
        "identitystore:DescribeUser",
        "identitystore:DescribeGroup"
      ],
      "Resource": [
        "<iam_identity_center_instance_arn>",
        "<iam_identity_center_application_arn_for_s3_access_grants>",
        "arn:aws:identitystore:::user/*",
        "arn:aws:identitystore::<aws_account>:identitystore/<identity_store_id>"
      ]
    }
  • s3:GetAccessGrantsInstance

  • s3:GetAccessGrantsInstanceForPrefix

  • s3:GetAccessGrantsInstanceResourcePolicy

  • s3:ListAccessGrants

  • s3:ListAccessGrantsLocations


  • MODIFY and SELECT on all securables registered as Immuta data sources

    Immuta service principal

    These privileges allow the service principal to apply row filters and column masks on the securable. Additionally, they are required for to run on the securable.

    OWNER on the Immuta catalog

    Immuta service principal

    The Immuta service principal must own the catalog Immuta creates during setup that stores the Immuta policy information. The Immuta setup script grants ownership of this catalog to the Immuta service principal when you configure the integration.

    • USE CATALOG on the system catalog

    • USE SCHEMA on the system.access schema

    • SELECT on the following system tables:

    Immuta service principal

    These privileges allow Immuta to audit user queries in Databricks Unity Catalog.



  • You must use the global regex flag (g) when creating a regex masking policy in this integration, and you cannot use the case insensitive regex flag (i) when creating a regex masking policy in this integration. See the examples below for guidance:
    • regex with a global flag (supported): /^ssn|social ?security$/g

    • regex without a global flag (unsupported): /^ssn|social ?security$/

    • regex with a case insensitive flag (unsupported): /^ssn|social ?security$/gi

    • regex without a case insensitive flag (supported): /^ssn|social ?security$/g


  • Account admin

    Setup user

    This privilege allows the setup user to grant the Immuta service principal the necessary permissions to orchestrate Unity Catalog access controls and maintain state between Immuta and Databricks Unity Catalog.

    CREATE CATALOG on the Unity Catalog metastore

    Setup user

    This privilege allows the setup user to create an Immuta-owned catalog and tables.

    Metastore admin

    Setup user

    This privilege is required only if enabling query audit, which requires granting access to system tables to the Immuta service principal. To grant access, a user that is both a metastore admin and an account admin must grant USE and SELECT permissions on the system schemas to the service principal. See Manage privileges in Unity Catalog for more details.

    • USE CATALOG and MANAGE on all catalogs containing securables registered as Immuta data sources

    • USE SCHEMA on all schemas containing securables registered as Immuta data sources

    Immuta service principal


    These privileges allow the service principal to apply row filters and column masks on the securable.


    Snowflake Integration

    Snowflake Enterprise Edition required

    In this integration, Immuta manages access to Snowflake tables by administering Snowflake row access policies and column masking policies on those tables, allowing users to query tables directly in Snowflake while dynamic policies are enforced.

    Like with all Immuta integrations, Immuta can inject its ABAC model into policy building and administration to remove policy management burden and significantly reduce role explosion.

    How the integration works

    When an administrator configures the Snowflake integration with Immuta, Immuta creates an IMMUTA database and schemas (immuta_procedures, immuta_policies, and immuta_functions) within Snowflake to contain policy definitions and user entitlements. Immuta then creates a system role and gives that system account the privileges required to orchestrate policies in Snowflake and maintain state between Snowflake and Immuta. See the Snowflake privileges section below for a list of privileges, the user they must be granted to, and an explanation of why they must be granted.

    Data flow

    1. An Immuta application administrator configures the Snowflake integration and registers Snowflake warehouse and databases with Immuta.

    2. Immuta creates a database inside the configured Snowflake warehouse that contains Immuta policy definitions and user entitlements.

    3. A data owner registers Snowflake tables in Immuta as data sources.

    4. If Snowflake tag ingestion was enabled during the configuration, Immuta uses the host provided in the configuration and ingests internal tags on Snowflake tables registered as Immuta data sources.

    5. A data owner, data governor, or administrator creates or changes a policy or a user's attributes change in Immuta.

    6. The Immuta web service calls a stored procedure that modifies the user entitlements or policies.

    7. Immuta manages and applies Snowflake governance column and row access policies to Snowflake tables that are registered as Immuta data sources.

    8. If Snowflake table grants is not enabled, the Snowflake object owner or a user with the global MANAGE GRANTS privilege grants the SELECT privilege on relevant Snowflake tables to users. Note: Although they are granted access, if they are not subscribed to the table via Immuta-authored policies, they will not see data.

    9. A Snowflake user who is subscribed to the data source in Immuta queries the corresponding table directly in Snowflake and sees policy-enforced data.

    Policy enforcement

    When Immuta users create policies, they are then pushed into the Immuta database within Snowflake; there, the Immuta system account orchestrates Snowflake row access policies and column masking policies directly onto Snowflake tables. Changes in Immuta policies, user attributes, or data sources trigger webhooks that keep the Snowflake policies up-to-date.

    For a user to query Immuta-protected data, they must meet two qualifications:

    1. They must be subscribed to the Immuta data source.

    2. They must be granted SELECT access on the table by the Snowflake object owner or automatically via the Snowflake table grants feature.

    After a user has met these qualifications they can query Snowflake tables directly.

    See the integration support matrix for a list of supported data policy types in Snowflake.

    Comply with column length and precision requirements in a Snowflake masking policy

    When a user applies a masking policy to a Snowflake data source, Immuta truncates masked values to align with Snowflake column length (VARCHAR types) and precision (NUMBER types) requirements.

    Consider these columns in a data source that have the following masking policies applied:

    • Column A (VARCHAR(6)): Mask using hashing for everyone

    • Column B (VARCHAR(5)): Mask using a constant REDACTED for everyone

    • Column C (VARCHAR(6)): Mask by making null for everyone

    • Column D (NUMBER(3, 0)): Mask by rounding to the nearest 10 for everyone

    Querying this data source in Snowflake would return the hashed value truncated to 6 characters for column A, the constant truncated to REDAC (5 characters) for column B, NULL for column C, and the value rounded to the nearest 10 for column D.

    Hashing collisions

    Hashing collisions are more likely to occur across or within Snowflake columns restricted to short lengths, since Immuta truncates the hashed value to the limit of the column. (Hashed values truncated to 5 characters have a higher risk of collision than hashed values truncated to 20 characters.) Therefore, avoid applying hashing policies to Snowflake columns with such restrictions.

    For more details about Snowflake column length and precision requirements, see the Snowflake documentation.
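
    To see why truncation increases collision risk, the short experiment below hashes a set of values and truncates the digests to 5 and 20 characters. SHA-256 is used purely for illustration and is not necessarily the hash Immuta applies.

    # Illustration only: truncating hashes to fit a short VARCHAR column raises the
    # chance that two different inputs end up with the same masked value.
    import hashlib

    def truncated_hash(value: str, length: int) -> str:
        return hashlib.sha256(value.encode()).hexdigest()[:length]

    values = [f"customer-{i}" for i in range(200_000)]

    for length in (5, 20):
        distinct = {truncated_hash(v, length) for v in values}
        collisions = len(values) - len(distinct)
        print(f"length={length}: {collisions} collisions among {len(values)} values")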

    Query performance

    When a policy is applied to a column, Immuta uses Snowflake memoizable functions to cache the result of the called function. Then, when a user queries a column that has that policy applied to it, Immuta uses that cached result to dramatically improve query performance.

    Snowflake privileges

    The privileges the Snowflake integration requires align with the principle of least privilege. The table below describes each privilege required in Snowflake for the setup user, the IMMUTA_SYSTEM_ACCOUNT user, or the metadata registration user. The references to IMMUTA_DB, IMMUTA_WH, and IMMUTA_IMPERSONATOR_ROLE in the table can be replaced with what you chose for the name of your Immuta database, warehouse, and impersonation role when setting up the integration, respectively.

    Snowflake privilege
    User requiring privilege
    Features
    Explanation

    Integration health status

    The status of the integration is visible on the integrations tab of the Immuta application settings page. If errors occur in the integration, a banner will appear in the Immuta UI with guidance for remediating the error.

    The definitions for each status and the state of configured data platform integrations are available in the response schema of the integrations API. However, the UI consolidates these error statuses and provides detail in the error messages.

    Registering data sources

    Register Snowflake data sources using a dedicated Snowflake role. Avoid using individual user accounts for data source onboarding. Instead, create a service account (Snowflake user account TYPE=SERVICE) with SELECT access for onboarding data sources. No policies will apply to that role, ensuring that your integration works with the following use cases:

    • Immuta project workspaces: Snowflake workspaces generate static views with the credentials used to register the table as an Immuta data source. Those tables must be registered in Immuta by an excepted role so that policies applied to the backing tables are not applied to the project workspace views.

    • Using views and tables within Immuta: Because this integration uses Snowflake governance policies, users can register tables and views as Immuta data sources. However, if you want to register views and apply different policies to them than their backing tables, the owner of the view must be an excepted role; otherwise, the backing table’s policies will be applied to that view.

    Snowflake bulk data source creation

    Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

    Bulk data source creation is the more efficient process when loading more than 5000 data sources from Snowflake and allows for data sources to be registered in Immuta before running identification or applying policies.

    To use this feature, see the bulk data source creation guide.

    Resource allocations

    Based on performance tests that create 100,000 data sources, minimum resource allocations need to be applied to the web and database pods in your Kubernetes environment for successful bulk data source creation.


    Limitations

    • Performance gains are limited when enabling identification at the time of data source creation.

    • External catalog integrations are not recognized during bulk data source creation. Users must manually trigger a catalog sync for tags to appear on the data source through the data source's health check.

    Excepted roles/users

    Excepted roles and users are assigned when the integration is installed, and no policies will apply to these users' queries, despite any Immuta policies enforced on the tables they are querying. Credentials used to register a data source in Immuta will be automatically added to this excepted list for that Snowflake table. Consequently, roles and users added to this list and used to register data sources in Immuta should be limited to service accounts.

    Immuta excludes the listed roles and users from policies by wrapping all policies in a CASE statement that checks whether a user is acting under one of the listed usernames or roles. If the user is, the policy is not applied to the queried table; if the user is not, the policy is executed as normal. Immuta does not distinguish between role and username, so if you have a role and user with the exact same name, both the user and any user acting under that role will have full access to the data sources and no policies will be enforced for them.
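
    Conceptually, the wrapped policy resembles the predicate built below. This is an illustrative sketch only; the exact SQL Immuta generates may differ, and the role, user, and policy names are placeholders.

    # Hedged sketch of how an excepted roles/users list can be folded into a row
    # access predicate; the same list is checked for both users and roles.
    EXCEPTED_PRINCIPALS = ["IMMUTA_SERVICE_ACCOUNT", "ETL_ROLE"]  # placeholder names

    def wrap_with_exemptions(policy_predicate: str) -> str:
        quoted = ", ".join(f"'{p}'" for p in EXCEPTED_PRINCIPALS)
        return (
            "CASE "
            f"WHEN CURRENT_USER() IN ({quoted}) OR CURRENT_ROLE() IN ({quoted}) THEN TRUE "
            f"ELSE ({policy_predicate}) "
            "END"
        )

    print(wrap_with_exemptions("department = 'ANALYTICS'"))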

    Authentication methods

    The Snowflake integration supports the following authentication methods to configure the integration and create data sources:

    • Username and password: Users can authenticate with their Snowflake username and password.

    • Key pair: Users can authenticate with a Snowflake key pair.

    • Snowflake External OAuth: Users can authenticate with Snowflake External OAuth.

    Snowflake External OAuth

    Immuta's OAuth authentication method uses the Client Credentials Flow to integrate with Snowflake External OAuth. When a user configures the Snowflake integration or connects a Snowflake data source, Immuta uses the token credentials (obtained using a certificate or passing a client secret) to craft an authenticated access token to connect with Snowflake. This allows organizations that already use Snowflake External OAuth to use that secure authentication with Immuta.

    Workflow

    1. An Immuta application administrator configures the Snowflake integration or creates a data source.

    2. Immuta creates a custom token and sends it to the authorization server.

    3. The authorization server confirms the information sent from Immuta and issues an access token to Immuta.

    4. Immuta sends the access token it received from the authorization server to Snowflake.

    Supported Snowflake feature

    The Immuta Snowflake integration supports Snowflake external tables. However, you cannot add a masking policy to an external table column while creating the external table in Snowflake because masking policies cannot be attached to virtual columns.

    Supported object types

    When applying read and write access subscription policies to data sources, the privileges granted by Immuta vary depending on the object type. See an outline of privileges granted by Immuta on Snowflake object types on the Subscription policy access types page.

    Object type
    Subscription policy support
    Data policy support

    Supported Immuta features

    The Snowflake integration supports the Immuta features outlined below. Click the links provided for more details.

    • Immuta project workspaces: Users can have additional write access in their integration using project workspaces.

    • Tag ingestion: Immuta automatically ingests Snowflake object tags from your Snowflake instance and adds them to the appropriate data sources.

    • User impersonation: Impersonation allows users to query data as another Immuta user. Impersonation is not supported in Snowflake if or is enabled. To enable user impersonation, see the page.

    Immuta project workspaces

    Immuta system account required Snowflake privileges

    • CREATE [OR REPLACE] PROCEDURE

    • DROP ROLE

    Users can have additional write access in their integration using project workspaces. For more details, see the Snowflake project workspaces page.

    Caveat

    To use project workspaces with the Snowflake integration, the default role of the account used to create data sources in the project must be added to the "Excepted Roles/Users List." If the role is not added, you will not be able to query the equalized view using the project role in Snowflake.

    Tag ingestion

    You can enable Snowflake tag ingestion so that Immuta will ingest Snowflake object tags from your Snowflake instance into Immuta and add them to the appropriate data sources.

    The Snowflake tags' key and value pairs will be reflected in Immuta as two levels: the key will be the top level and the value the second. As Snowflake tags are hierarchical, Snowflake tags applied to a database will also be applied to all of the schemas in that database, all of the tables within those schemas, and all of the columns within those tables. For example: If a database is tagged PII, all of the tables and columns in that database will also be tagged PII.

    To enable Snowflake tag ingestion, see the app settings page.

    Credentials

    If you want all Snowflake data sources to have Snowflake data tags ingested into Immuta, ensure the credentials provided on the app settings page for the external catalog feature can access all the data sources registered in Immuta. Any data sources the credentials do not have access to will not be tagged in Immuta. In practice, it is recommended to just use the same credentials for the integration and tag ingestion.

    Caveats

    Snowflake has some latency in making tag metadata available. If you manually refresh the governance page to see all tags created globally, users can experience a delay of up to two hours. However, if you run schema detection or a health check to find where those tags are applied, the delay will not occur because Immuta will only refresh tags for those specific tables.

    Query audit

    The Snowflake integration audits Immuta user queries run in the integration's warehouses by running a query in Snowflake to retrieve user query histories. Those histories are then populated into audit logs. See the Snowflake query audit page for details about the contents of the logs.

    The audit ingest is set when configuring the integration. The default ingest frequency is every hour, but this can be configured to a different frequency on the Immuta app settings page. Additionally, audit ingestion can be manually requested at any time from the Immuta audit page. When manually requested, it will only search for new queries that were created since the last query that had been audited. The job is run in the background, so the new queries will not be immediately available.

    Multiple Snowflake instances

    A user can connect multiple Snowflake instances to a single Immuta tenant and use them dynamically or with workspaces.

    Caveats

    • There can only be one integration connection with Immuta per host.

    • The host of the data source must match the host of the integration for the view to be created.

    • Projects can only be configured to use one Snowflake host.

    Limitations

    • If there are errors in generating or applying policies natively in Snowflake, the data source will be locked and only users on the excepted roles/users list and the credentials used to create the data source will be able to access the data.

    • Once a Snowflake integration is disabled in Immuta, the user must remove the access that was granted in Snowflake. If that access is not revoked, users will be able to access the raw table in Snowflake.

    • Migration must be done using the credentials and credential method (automatic or bootstrap) used to configure the integration.

    Custom WHERE clause limitations

    The Immuta Snowflake integration uses Snowflake governance features to let users query data natively in Snowflake. This means that Immuta also inherits some Snowflake limitations on correlated subqueries used with row access policies and column masking policies. These limitations appear when writing custom WHERE policies, but do not remove the utility of row-level policies.

    Requirements for a custom WHERE policy

    1. All column names must be fully qualified: Any column names that are unqualified (i.e., just the column name) will default to a column of the data source the policy is being applied to (if one matches the name).

    2. The Immuta system account must have SELECT privileges on all tables/views referenced in a subquery: The Immuta system role name is specified by the user, and the role is created when the Snowflake instance is integrated.
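
    As a hedged example of both requirements, the snippet below holds a custom WHERE condition with fully qualified column names and a correlated subquery. Every database, schema, table, and column name is a placeholder, and the table referenced in the subquery is one the Immuta system account would need SELECT on.

    # Placeholder example of a custom WHERE policy condition; every identifier is hypothetical.
    CUSTOM_WHERE = """
    ANALYTICS.SALES.ORDERS.REGION IN (
        SELECT R.REGION
        FROM ANALYTICS.REFERENCE.ALLOWED_REGIONS AS R
        WHERE R.TEAM = 'FIELD_SALES'
    )
    """.strip()

    # The Immuta system account must be able to read the table referenced in the
    # subquery, for example (role name is a placeholder):
    #   GRANT SELECT ON TABLE ANALYTICS.REFERENCE.ALLOWED_REGIONS TO ROLE IMMUTA_SYSTEM;
    print(CUSTOM_WHERE)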

    Subquery limitations

    Any subqueries that error in Snowflake will also error in Immuta.

    1. Including one or more subqueries in the Immuta policy condition may cause errors in Snowflake. If an error occurs, it may happen during policy creation or at query-time. To avoid these errors, limit the number of subqueries, limit the number of JOIN operations, and simplify WHERE clause conditions.

    2. For more information on the Snowflake subquery limitations, see the Snowflake documentation.


  • The setup script this user runs creates the IMMUTA_SYSTEM_ACCOUNT user that Immuta will use to manage the integration.

| Privilege | Granted to | Feature | Description |
|---|---|---|---|
| MANAGE GRANTS ON ACCOUNT | Setup user | All | The user configuring the integration must be able to GRANT global privileges and access to objects within the Snowflake account. All privileges that are documented here are granted to the IMMUTA_SYSTEM_ACCOUNT user by this setup user. |
| OWNERSHIP ON ROLE IMMUTA_IMPERSONATOR_ROLE | IMMUTA_SYSTEM_ACCOUNT user | Impersonation | If impersonation is enabled, Immuta must be able to manage the Snowflake role used for impersonation, which is created when the setup script runs. |
| ALL PRIVILEGES ON DATABASE IMMUTA_DB; ALL PRIVILEGES ON ALL SCHEMAS IN DATABASE IMMUTA_DB; USAGE ON FUTURE PROCEDURES IN SCHEMA IMMUTA_DB.IMMUTA_PROCEDURES | IMMUTA_SYSTEM_ACCOUNT user | All | The setup script grants the Immuta system account user these privileges because Immuta must have full ownership of the Immuta database where Immuta objects are managed. |
| USAGE ON WAREHOUSE IMMUTA_WH | IMMUTA_SYSTEM_ACCOUNT user | All | To make changes to state in the Immuta database, Immuta requires access to compute (a Snowflake warehouse). Some state changes are DDL operations, and others are DML and require compute. |
| IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE | IMMUTA_SYSTEM_ACCOUNT user | Audit | To ingest audit information from Snowflake, Immuta must have access to the SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY view. See the Snowflake documentation for details. |
| APPLY MASKING POLICY ON ACCOUNT; APPLY ROW ACCESS POLICY ON ACCOUNT | IMMUTA_SYSTEM_ACCOUNT user | Snowflake integration with governance features enabled | Immuta must be able to apply policies to objects throughout your organization's Snowflake account and query for existing policies on objects using the POLICY_REFERENCES table function. |
| MANAGE GRANTS ON ACCOUNT | IMMUTA_SYSTEM_ACCOUNT user | Table grants | Immuta must be able to MANAGE GRANTS on objects throughout your organization's Snowflake account. |
| CREATE ROLE ON ACCOUNT | IMMUTA_SYSTEM_ACCOUNT user | Table grants | When using the table grants feature, Immuta must be able to create roles as targets for Immuta subscription policy permissions in your organization's Snowflake account. |
| USAGE on all databases and schemas with registered data sources; REFERENCES on all tables and views registered in Immuta | Metadata registration user | Data source registration | Immuta must be able to see metadata on securables to register them as data sources and populate the data dictionary. |
| SELECT on all tables and views registered in Immuta | Metadata registration user | Identification and specialized masking policies that require fingerprinting | Immuta must have this privilege to run the necessary queries for identification on your data sources. |
| APPLY TAG ON ACCOUNT | Metadata registration user | Tag ingestion | To ingest table, view, and column tag information from Snowflake, Immuta must have this permission. Immuta reads from the TAG_REFERENCES table function. |
| IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE | Metadata registration user | Tag ingestion | To ingest table, view, and column tag information from Snowflake, Immuta must have access to the SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY view. See the Snowflake documentation for details. |
| USAGE ON DATABASE IMMUTA_DB; USAGE ON SCHEMA IMMUTA_DB.IMMUTA_PROCEDURES; USAGE ON SCHEMA IMMUTA_DB.IMMUTA_FUNCTIONS | PUBLIC role | All | Immuta has stored procedures and functions that are used for policy enforcement and do not expose or contain any sensitive information. These objects must be accessible by all users to facilitate the use and creation of policies or views that enforce Immuta policies in Snowflake. |
| SELECT ON IMMUTA_DB.IMMUTA_SYSTEM.ALLOW_LIST | PUBLIC role | All | Immuta retains a list of excepted roles and users when using the Snowflake integration. The roles and users in this list are exempt from policies applied to tables in Snowflake, giving organizations flexibility in case there are entities that should not be bound to Immuta policies in Snowflake (for example, a system or application role or user). |
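For illustration only, the statements below sketch how a few of these grants might look in Snowflake SQL; IMMUTA_SYSTEM_ROLE is an assumed role name held by the IMMUTA_SYSTEM_ACCOUNT user, and the actual setup script generates and runs the exact statements for you.

-- Assumed role name for illustration; the setup script creates the real objects and grants.
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE IMMUTA_SYSTEM_ROLE;
GRANT APPLY ROW ACCESS POLICY ON ACCOUNT TO ROLE IMMUTA_SYSTEM_ROLE;
GRANT MANAGE GRANTS ON ACCOUNT TO ROLE IMMUTA_SYSTEM_ROLE;          -- table grants
GRANT CREATE ROLE ON ACCOUNT TO ROLE IMMUTA_SYSTEM_ROLE;            -- table grants
GRANT ALL PRIVILEGES ON DATABASE IMMUTA_DB TO ROLE IMMUTA_SYSTEM_ROLE;
GRANT USAGE ON WAREHOUSE IMMUTA_WH TO ROLE IMMUTA_SYSTEM_ROLE;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE IMMUTA_SYSTEM_ROLE;  -- audit ingestion
GRANT ROLE IMMUTA_SYSTEM_ROLE TO USER IMMUTA_SYSTEM_ACCOUNT;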

    Snowflake authenticates the token and grants access to the requested resources from Immuta.

  • The integration is connected and users can query data.

  • Event table

    ✅

    ✅

    Iceberg table

    ✅

    ✅

    Dynamic table

    ✅

    ✅

    Query audit: Immuta audits queries run in Snowflake against Snowflake data registered as Immuta data sources.
  • Multiple Snowflake instances

  • Snowflake low row access policy mode: The Snowflake low row access policy mode improves query performance in Immuta's Snowflake integration by decreasing the number of Snowflake row access policies Immuta creates.

  • Snowflake table grants: This feature allows Immuta to manage privileges on your Snowflake tables and views according to the subscription policies on the corresponding Immuta data sources.

  • REVOKE ROLE

    When configuring one Snowflake instance with multiple Immuta tenants, the user or system account that enables the integration on the app settings page must be unique for each Immuta tenant.
  • You cannot add a masking policy to an external table column while creating the external table because a masking policy cannot be attached to a virtual column.

  • If you create an Immuta data source from a Snowflake view created using a select * from query, Immuta column detection will not work as expected because Snowflake views are not automatically updated based on backing table changes. To remedy this, you can create views that have the specific columns you want or you can CREATE AND REPLACE the view in Snowflake whenever the backing table is updated and manually run the column detection job on the data source page.

  • If a user is created in Snowflake after that user is already registered in Immuta, Immuta does not grant usage on the per-user role automatically - meaning Immuta does not govern this user's access without manual intervention. If a Snowflake user is created after that user is registered in Immuta, the user account must be disabled and re-enabled to trigger a sync of Immuta policies to govern that user. Whenever possible, Snowflake users should be created before registering those users in Immuta.

  • Snowflake tables from imported databases are not supported. Instead, create a view of the table and register that view as a data source.

  • Impersonation is not supported in Snowflake if table grants or low row access policy mode is enabled.

  • 5w4502

    REDAC

    null

    990

    6e3611

    REDAC

    null

    750

    9s7934

    REDAC

    null

| Privilege | Granted to | Feature | Description |
|---|---|---|---|
| CREATE DATABASE ON ACCOUNT WITH GRANT OPTION | Setup user | All | The setup script this user runs creates an Immuta database in your organization's Snowflake account where all Immuta-managed objects (UDFs, masking policies, row access policies, and user entitlements) will be written and stored. |
| CREATE ROLE ON ACCOUNT WITH GRANT OPTION | Setup user | All | The setup script this user runs creates a ROLE for Immuta that will be used to manage the integration once it has been initialized. |
| CREATE USER ON ACCOUNT WITH GRANT OPTION | Setup user | All | The setup script this user runs creates the IMMUTA_SYSTEM_ACCOUNT user that Immuta will use to manage the integration. |
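For illustration only, the statements below sketch how these account-level privileges might be granted to a role held by the setup user; SETUP_ROLE is an assumed name, and in practice an ACCOUNTADMIN typically performs these grants.

-- Assumed role name for illustration; grant the setup user's role the required account privileges.
GRANT CREATE DATABASE ON ACCOUNT TO ROLE SETUP_ROLE WITH GRANT OPTION;
GRANT CREATE ROLE ON ACCOUNT TO ROLE SETUP_ROLE WITH GRANT OPTION;
GRANT CREATE USER ON ACCOUNT TO ROLE SETUP_ROLE WITH GRANT OPTION;
GRANT MANAGE GRANTS ON ACCOUNT TO ROLE SETUP_ROLE;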

    Memory

    4Gi

    16Gi

    CPU

    2

    4

    Storage

    8Gi

    24Gi

    Table

    ✅

    ✅

    View

    ✅

    ✅

    Materialized view

    ✅

    ✅

    External table

    ✅



    Integrations Overview

    Immuta does not require users to learn a new API or language to access protected data. Instead, Immuta integrates with existing tools and data platforms while remaining invisible to downstream consumers.

    The table below outlines features supported by each of Immuta's data platform integrations.

    Subscription policies
    Data policies
    Identification
    Impersonation
    Query audit
    Tag ingestion

    Amazon Redshift

    ✅

    Subscription policy support matrix

The table below illustrates the subscription policy access types supported by each integration. If a data platform isn't included in the table, that integration does not support any subscription policies. For more details about read and write access policy support for these data platforms, see the Subscription policy access types reference guide.

    Integration
    Read access policies
    Write access policies

    Data policy support matrix

    The table below outlines the types of data policies supported for various data platforms. If a data platform isn't included in the table, that integration does not support any data policies.

For details about each of these policies, see the Data policy types page.

    Amazon Redshift
    Azure Synapse Analytics
    Databricks Spark
    Databricks Unity Catalog
    Google BigQuery
    Snowflake
    Starburst (Trino)

    Identification support matrix

Identification has varied support for data sources from different technologies based on the identifier type. For details about how identification works in Immuta, see the Data identification page.

    Technology
    Regex
    Dictionary
    Column name regex

    Query audit support for platform queries

    The table below outlines what information is included in the query audit logs for each integration where query audit is supported.

    Databricks Spark
    Databricks Unity Catalog
    Snowflake
    Starburst (Trino)

    Legend:

    • ✅ This is available and the information is included in audit logs.

    • ❌ This is not available and the information is not included in audit logs.

    USAGE ON FUTURE FUNCTIONS IN SCHEMA IMMUTA_DB.IMMUTA_FUNCTIONS
  • USAGE ON SCHEMA IMMUTA_DB.IMMUTA_SYSTEM

  • SELECT ON IMMUTA_DB.IMMUTA_SYSTEM.USER_PROFILE


    ✅

    ✅

    ✅

    ❌ View-based integrations are read-only

    ✅

    ✅

    ✅

    ✅

| Policy | Amazon Redshift | Azure Synapse Analytics | Databricks Spark | Databricks Unity Catalog | Google BigQuery | Snowflake | Starburst (Trino) |
|---|---|---|---|---|---|---|---|
| Conditional masking | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Hashing | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Replace with NULL or constant | ✅ | ✅ | Supported with caveats | ✅ | ✅ | ✅ | ✅ |
| Rounding | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Regex | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Format preserving masking | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ |
| Randomized response | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ |
| Masking fields within STRUCT columns | ❌ | ❌ | ✅ | Supported with caveats | ❌ | ❌ | ❌ |
| Custom function | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Only show data by time | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Minimize | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| WHERE clause | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Matching | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| Technology | Regex | Dictionary | Column name regex |
|---|---|---|---|
| Databricks | ✅ | ✅ | ✅ |
| Google BigQuery | ❌ | ❌ | ✅ |
| Snowflake | ✅ | ✅ | ✅ |
| Starburst (Trino) | ✅ | ✅ | ✅ |

| | Databricks Spark | Databricks Unity Catalog | Snowflake | Starburst (Trino) |
|---|---|---|---|---|
| Columns returned | ❌ | ❌ | ✅ | ✅ |
| Query text | ✅ | Limited support | ✅ | ✅ |
| Unauthorized information | ✅ | Limited support | Limited support | ❌ |
| Policy details | ✅ | ❌ | ❌ | ❌ |
| User's entitlements | ✅ | ❌ | ❌ | ❌ |
| Column tags | ❌ | ❌ | ✅ | ✅ |
| Table tags | ❌ | ❌ | ✅ | ❌ |

    ✅

    ✅

    ✅

    ❌

    ❌

| Integration | Subscription policies | Data policies | Identification | Impersonation | Query audit | Tag ingestion |
|---|---|---|---|---|---|---|
| Amazon S3 | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Azure Synapse Analytics | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Databricks Spark | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Databricks Unity Catalog | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Google BigQuery | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Snowflake | ✅ | ✅ | ✅ | Supported with caveats | ✅ | ✅ |
| Starburst (Trino) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |

| Integration | Read access policies | Write access policies |
|---|---|---|
| Amazon Redshift | ✅ | ❌ View-based integrations are read-only |
| Amazon S3 | ✅ | ✅ |
| Azure Synapse Analytics | ✅ | ❌ View-based integrations are read-only |

    Databricks Spark

    ✅

    Reversible masking

    ✅

    ❌

    ✅

    ❌

    ❌

    ✅

    Amazon Redshift

    ✅

    ✅

    ✅

    Amazon S3

    ❌

    ❌

    ✅

    Azure Synapse Analytics

    ❌

    ❌

| | Databricks Spark | Databricks Unity Catalog | Snowflake | Starburst (Trino) |
|---|---|---|---|---|
| Table and user coverage | Registered data sources and users | All tables and users | Registered data sources and users | Registered data sources and users |
| Object queried | ✅ | Limited support | ✅ | ✅ |


❌ Write access is controlled through workspaces and scratch paths

    ✅

    ✅

    Databricks Unity Catalog
    Google BigQuery
    Snowflake
    Starburst (Trino)

    Configure Starburst (Trino) Integration

    The plugin comes pre-installed with Starburst Enterprise, so this page provides separate sets of guidelines for configuration:

    • Starburst Cluster Configuration: These instructions are specific to Starburst Enterprise clusters.

    • Trino Cluster Configuration: These instructions are specific to open-source Trino clusters.

    Starburst Cluster Configuration

    Requirement

A valid Starburst Enterprise license.

Starburst does not support using Starburst built-in access control (BIAC) concurrently with any other access control providers, such as Immuta. If Starburst BIAC is in use, it must be disabled to allow Immuta to enforce policies on the cluster.

    1 - Enable the Integration

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click Add Integration and select Trino from the Integration Type dropdown menu.

    4. Click Save.

    OAuth Authentication

If you are using OAuth or asynchronous authentication to create Starburst (Trino) data sources and you encounter one of the scenarios described on the Starburst (Trino) reference guide, configure the globalAdminUsername property in the advanced configuration section of the Immuta app settings page.

    1. Click the App Settings page icon.

    2. Click Advanced Settings and scroll to Advanced Configuration.

    3. Paste the following YAML configuration snippet in the text box, replacing the email address below with your admin username:
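For reference, the snippet takes the shape below (the email address is only a placeholder for your own admin username); the same snippet appears in the configuration examples later on this page:

trino:
  globalAdminUsername: "admin@example.com"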

    2 - Configure the Immuta System Access Control Plugin in Starburst

    Default configuration property values

    If you use the default property values in the configuration file described in this section,

    • you will give users read and write access to tables that are not registered in Immuta and

• results for SHOW queries will not be filtered on table metadata.

    TLS Certificate Generation

    If you provided your own TLS certificates during Immuta installation, you must ensure that the hostname in your certificate matches the hostname specified in the Starburst (Trino) configuration.

    If you did not provide your own TLS certificates, Immuta generated these certificates for you during installation. See notes about your specific deployment method below for details.

    1. Create the Immuta access control configuration file in the Starburst configuration directory (/etc/starburst/immuta-access-control.properties for Docker installations or <starburst_install_directory>/etc/immuta-access-control.properties for standalone installations).

      The table below describes the properties that can be set during configuration.

      Property
      Starburst version
      Required or optional
      Description

    Example Immuta System Access Control Configuration

The example configuration snippet below uses the default configuration settings for immuta.allowed.immuta.datasource.operations and immuta.allowed.non.immuta.datasource.operations, which allow read access for data registered as Immuta data sources and read and write access on data that is not registered in Immuta. See the Customize read and write access policies for Starburst (Trino) guide for details about customizing and enforcing read and write access controls in Starburst.

    3 - Add Starburst Users to Immuta

1. Configure your external IAM to add users to Immuta.

    2. Map their Starburst usernames when configuring your IAM (or map usernames manually) to Immuta.

      • All Starburst users must map to Immuta users or match the immuta.user.admin regex configured on the cluster, and their Starburst username must be mapped to Immuta so they can query policy-enforced data.

    4 - Register data

Register Starburst (Trino) data in Immuta.

    Trino Cluster Configuration

    1 - Enable the Integration

    1. Click the App Settings icon in the left sidebar.

    2. Click the Integrations tab.

    3. Click Add Integration and select Trino from the dropdown menu.

    4. Click Save.

    OAuth Authentication

If you are using OAuth or asynchronous authentication to create Starburst (Trino) data sources and you encounter one of the scenarios described on the Starburst (Trino) reference guide, configure the globalAdminUsername property in the advanced configuration section of the Immuta app settings page.

    1. Click the App Settings page icon.

    2. Click Advanced Settings and scroll to Advanced Configuration.

    3. Paste the following YAML configuration snippet in the text box, replacing the email address below with your admin username:

    2 - Configure the Immuta System Access Control Plugin in Trino

    Default configuration property values

    If you use the default property values in the configuration file described in this section,

    • you will give users read and write access to tables that are not registered in Immuta and

• results for SHOW queries will not be filtered on table metadata.

    TLS Certificate Generation

    If you provided your own TLS certificates during Immuta installation, you must ensure that the hostname in your certificate matches the hostname specified in the Starburst (Trino) configuration.

    If you did not provide your own TLS certificates, Immuta generated these certificates for you during installation. See notes about your specific deployment method below for details.

1. The Immuta Trino plugin version matches the version of the corresponding Trino release. For example, the Immuta plugin version supporting Trino version 403 is simply version 403. Navigate to the Immuta GitHub repository for a list of supported Trino versions. Immuta follows Starburst's release cycle, but you can contact your Immuta representative about a specific Trino OSS release.

    2. Download the assets for the release that corresponds to your Trino version.

    3. Enable Immuta on your cluster. Select the tab below that corresponds to your installation method for instructions:

    Docker installations

1. Follow Trino's documentation to install the plugin archive on all nodes in your cluster.

    2. Create the Immuta access control configuration file in the Trino configuration directory: /etc/trino/immuta-access-control.properties.

    immuta-trino Docker image
    1. Configure the properties described in the table below.

    Property
    Trino version
    Required or optional
    Description
    1. Enable the Immuta access control plugin in Trino's configuration file (/etc/trino/config.properties for Docker installations or <trino_install_directory>/etc/config.properties for standalone installations). For example,
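A minimal sketch of that line, assuming the default Docker configuration path used throughout this page:

access-control.config-files=/etc/trino/immuta-access-control.properties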

    Example Immuta System Access Control Configuration

The example configuration snippet below uses the default configuration settings for immuta.allowed.immuta.datasource.operations and immuta.allowed.non.immuta.datasource.operations, which allow read access for data registered as Immuta data sources and read and write access on data that is not registered in Immuta. See the Customize read and write access policies for Starburst (Trino) guide for details about customizing and enforcing read and write access controls in Trino.

    3 - Add Trino Users to Immuta

1. Configure your external IAM to add users to Immuta.

    2. Map their Trino usernames when configuring your IAM (or map usernames manually) to Immuta.

      • All Trino users must map to Immuta users or match the immuta.user.admin regex configured on the cluster, and their Trino username must be mapped to Immuta so they can query policy-enforced data.

    4 - Register data

Register Starburst (Trino) data in Immuta.


    These default settings help ensure that a new Starburst integration installation is minimally disruptive for existing Starburst deployments, allowing you to then add Immuta data sources and update configuration to enforce more controls as you see fit.

    However, the access-control.config-files property can be configured to allow Immuta to work with existing Starburst installations that have already configured an access control provider. For example, if the Starburst integration is configured to allow users write access to tables that are not protected by Immuta, you can still lock down write access for specific non-Immuta tables using an additional access control provider.
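For illustration, a sketch of combining Immuta with a second provider; the second file path is hypothetical and depends on the provider you use:

access-control.config-files=/etc/starburst/immuta-access-control.properties,/etc/starburst/other-access-control.properties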

    Kubernetes Deployment: Immuta generates a local certificate authority (CA) that signs certificates for each service by default. Ensure that the externalHostname you specified in the Immuta Enterprise Helm chart matches the Immuta hostname name specified in the Starburst (Trino) configuration.

    If the hostnames in your certificate don't match the hostname specified in your Starburst (Trino) integration, you can set immuta.disable-hostname-verification to true in the Immuta access control config file to get the integration working in the interim.

    The Starburst (Trino) integration uses the immuta.ca-file property to communicate with Immuta. When configuring the plugin in Starburst (outlined below), specify a path to your CA file using the immuta.ca-file property in the Immuta access control configuration file.
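A sketch of the related properties in the Immuta access control configuration file; the certificate path is an assumption for your environment, and hostname verification should only be disabled as an interim workaround:

# Assumed certificate path, for illustration.
immuta.ca-file=/etc/starburst/immuta-ca.crt
# Interim workaround only, if certificate hostnames do not match the configured Immuta hostname.
immuta.disable-hostname-verification=true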

access-control.name

    392 and newer

    Required

    This property enables the integration.

    access-control.config-files

    392 and newer

    Optional

Starburst allows you to enable multiple system access control providers at the same time. To do so, add providers to this property as comma-separated values. Immuta has tested the Immuta system access control provider alongside the Starburst built-in access control system. This approach allows Immuta to work with existing Starburst installations that have already configured an access control provider. Immuta does not manage all permissions in Starburst and will default to allowing access to anything Immuta does not manage so that the Starburst integration complements existing controls. For example, if the Starburst integration is configured to allow users write access to tables that are not protected by Immuta, you can still lock down write access for specific non-Immuta tables using an additional access control provider.

    immuta.allowed.immuta.datasource.operations

    413 and newer

    Optional

This property defines a comma-separated list of allowed operations for Starburst (Trino) users on tables registered as Immuta data sources: READ, WRITE, and OWN. (See the Customize read and write access policies for Starburst (Trino) guide for details about the OWN operation.) When set to WRITE, all querying users are allowed read and write operations on data source schemas and tables. By default, this property is set to READ, which blocks write operations on data source tables and schemas. If write policies are enabled for your Immuta tenant, this property is set to READ,WRITE by default, so users are allowed read and write operations on data source schemas and tables.

    immuta.allowed.non.immuta.datasource.operations

    392 and newer

    Optional

This property defines a comma-separated list of allowed operations users will have on tables not registered as Immuta data sources: READ, WRITE, CREATE, and OWN. (See the Customize read and write access policies for Starburst (Trino) guide for details about the CREATE and OWN operations.) When set to READ, users are allowed read operations on tables not registered as Immuta data sources. When set to WRITE, users are allowed read and write operations on tables not registered as Immuta data sources. If this property is left empty, users will not get access to any tables outside Immuta. By default, this property is set to READ,WRITE. If write policies are enabled for your Immuta tenant, this property is set to READ,WRITE,OWN,CREATE by default.

    immuta.apikey

    392 and newer

    Required

This should be set to the Immuta API key displayed when enabling the integration on the app settings page. To rotate this API key, use the Integrations API to generate a new API key, and then replace the existing immuta.apikey value with the new one.

    immuta.audit.legacy.enabled

    435 and newer

    Optional

    This property allows you to turn off Starburst (Trino) audit. Must set both immuta.audit.legacy.enabled and immuta.audit.uam.enabled to false to fully disable query audit.

    immuta.audit.uam.enabled

    435 and newer

    Optional

    This property allows you to turn off Starburst (Trino) audit. Must set both immuta.audit.legacy.enabled and immuta.audit.uam.enabled to false to fully disable query audit.

    immuta.ca-file

    392 and newer

    Optional

    This property allows you to specify a path to your CA file.

    immuta.cache.views.seconds

    392 and newer

    Optional

    Amount of time in seconds for which a user's specific representation of an Immuta data source will be cached for. Changing this will impact how quickly policy changes are reflected for users actively querying Starburst. By default, cache expires after 30 seconds.

    immuta.cache.datasource.seconds

    392 and newer

    Optional

    Amount of time in seconds for which a user's available Immuta data sources will be cached for. Changing this will impact how quickly data sources will be available due to changing projects or subscriptions. By default, cache expires after 30 seconds.

    immuta.endpoint

    392 and newer

    Required

    The protocol and fully qualified domain name (FQDN) for the Immuta tenant used by Starburst (for example, https://my.immuta.tenant.io). This should be set to the endpoint displayed when enabling the integration on the app settings page.

    immuta.filter.unallowed.table.metadata

    392 and newer

    Optional

    When set to false, Immuta won't filter unallowed table metadata, which helps ensure Immuta remains noninvasive and performant. If this property is set to true, running show catalogs, for example, will reflect what that user has access to instead of returning all catalogs. By default, this property is set to false.

    immuta.group.admin

    420 and newer

    Required if immuta.user.admin is not set

    This property identifies the Starburst group that is the Immuta administrator. The users in this group will not have Immuta policies applied to them. Therefore, data sources should be created by users in this group so that they have access to everything. This property can be used in conjunction with the immuta.user.admin property, and regex filtering can be used (with a | delimiter at the end of each expression) to assign multiple groups as the Immuta administrator. Note that you must escape regex special characters (for example, john\\.doe+svcacct@immuta\\.com).

    immuta.http.timeout.milliseconds

    464 and newer

    Optional

    The timeout for all HTTP calls made to Immuta in milliseconds. Defaults to 30000 (30 seconds).

    immuta.user.admin

    392 and newer

    Required if immuta.group.admin is not set

    This property identifies the Starburst user who is an Immuta administrator (for example, immuta.user.admin=immuta_system_account). This user will not have Immuta policies applied to them because this account will run the subqueries. Therefore, data sources should be created by this user so that they have access to everything. This property can be used in conjunction with the immuta.group.admin property, and regex filtering can be used (with a | delimiter at the end of each expression) to assign multiple users as the Immuta administrator. Note that you must escape regex special characters (for example, john\\.doe+svcacct@immuta\\.com).
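For illustration, the snippet below combines a few of these optional settings; the second username in the admin regex is a hypothetical service account, and the audit properties are shown with the values required to fully disable query audit:

# Hypothetical second admin account in the regex; escape regex special characters as shown.
immuta.user.admin=immuta_system_account|john\\.doe+svcacct@immuta\\.com
# Set both audit properties to false to fully disable Starburst (Trino) query audit.
immuta.audit.legacy.enabled=false
immuta.audit.uam.enabled=false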

  • Enable the Immuta access control plugin in Starburst's configuration file (/etc/starburst/config.properties for Docker installations or <starburst_install_directory>/etc/config.properties for standalone installations). For example,
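For example (matching the default Docker path used in the full configuration example later on this page):

access-control.config-files=/etc/starburst/immuta-access-control.properties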

  • A user impersonating a different user in Starburst requires the IMPERSONATE_USER permission in Immuta. Both users must be mapped to an Immuta user, or the querying user must match the configured immuta.user.admin regex.

    These default settings help ensure that a new Starburst integration installation is minimally disruptive for existing Trino deployments, allowing you to then add Immuta data sources and update configuration to enforce more controls as you see fit.

    However, the access-control.config-files property can be configured to allow Immuta to work with existing Trino installations that have already configured an access control provider. For example, if the Starburst (Trino) integration is configured to allow users write access to tables that are not protected by Immuta, you can still lock down write access for specific non-Immuta tables using an additional access control provider.

    Kubernetes Deployment: Immuta generates a local certificate authority (CA) that signs certificates for each service by default. Ensure that the externalHostname you specified in the Immuta Helm Chart matches the Immuta hostname name specified in the Starburst (Trino) configuration.

    If the hostnames in your certificate don't match the hostname specified in your Starburst (Trino) integration, you can set immuta.disable-hostname-verification to true in the Immuta access control config file to get the integration working in the interim.

    The Starburst (Trino) integration uses the immuta.ca-file property to communicate with Immuta. When configuring the plugin in Starburst (outlined below), specify a path to your CA file using the immuta.ca-file property in the Immuta access control configuration file.

    For Trino versions 414 and newer, an immuta-trino Docker image that includes the Trino plugin jars is available from ocir.immuta.com. Before using this image, consider the following factors:

    • This image was designed to provide a method for organizations to quickly set up and validate the integration, so it should be used in a development environment. Use the Docker installation method above for production environments.

    • Immuta only supports the Immuta Trino plugin on the Docker image, not any other software packaged on the image.

    • If you experience an issue with the image outside of the scope of the Immuta plugin, you must rebuild your own version of the image using the Docker installation method above.

    To use this image,

    1. Pull the image and start the container. The example below specifies the Immuta Trino plugin version 414 with the 414 tag, but any supported Trino version newer than 414 can be used:
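A minimal sketch of that command; the detached mode, container name, and published port 8080 (Trino's default HTTP port) are assumptions you can adjust for your environment:

docker run -d --name immuta-trino -p 8080:8080 ocir.immuta.com/immuta/immuta-trino:414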

    2. Create the Immuta access control configuration file in the Trino configuration directory: /etc/trino/immuta-access-control.properties.

    Standalone installations

    1. Follow Trino's documentation to install the plugin archive on all nodes in your cluster.

    2. Create the Immuta access control configuration file in the Trino configuration directory: <trino_install_directory>/etc/immuta-access-control.properties.

    immuta.allowed.non.immuta.datasource.operations

    392 and newer

    Optional

This property defines a comma-separated list of allowed operations users will have on tables not registered as Immuta data sources: READ, WRITE, CREATE, and OWN. (See the Customize read and write access policies for Starburst (Trino) guide for details about the CREATE and OWN operations.) When set to READ, users are allowed read operations on tables not registered as Immuta data sources. When set to WRITE, users are allowed read and write operations on tables not registered as Immuta data sources. If this property is left empty, users will not get access to any tables outside Immuta. By default, this property is set to READ,WRITE. If write policies are enabled for your Immuta tenant, this property is set to READ,WRITE,OWN,CREATE by default.

    immuta.apikey

    392 and newer

    Required

This should be set to the Immuta API key displayed when enabling the integration on the app settings page. To rotate this API key, use the Integrations API to generate a new API key, and then replace the existing immuta.apikey value with the new one.

    immuta.audit.legacy.enabled

    435 and newer

    Optional

    This property allows you to turn off Starburst (Trino) audit. Must set both immuta.audit.legacy.enabled and immuta.audit.uam.enabled to false to fully disable query audit.

    immuta.audit.uam.enabled

    435 and newer

    Optional

    This property allows you to turn off Starburst (Trino) audit. Must set both immuta.audit.legacy.enabled and immuta.audit.uam.enabled to false to fully disable query audit.

    immuta.ca-file

    392 and newer

    Optional

    This property allows you to specify a path to your CA file.

    immuta.cache.views.seconds

    392 and newer

    Optional

    Amount of time in seconds for which a user's specific representation of an Immuta data source will be cached for. Changing this will impact how quickly policy changes are reflected for users actively querying Trino. By default, cache expires after 30 seconds.

    immuta.cache.datasource.seconds

    392 and newer

    Optional

    Amount of time in seconds for which a user's available Immuta data sources will be cached for. Changing this will impact how quickly data sources will be available due to changing projects or subscriptions. By default, cache expires after 30 seconds.

    immuta.endpoint

    392 and newer

    Required

    The protocol and fully qualified domain name (FQDN) for the Immuta instance used by Trino (for example, https://my.immuta.instance.io). This should be set to the endpoint displayed when enabling the integration on the app settings page.

    immuta.filter.unallowed.table.metadata

    392 and newer

    Optional

    When set to false, Immuta won't filter unallowed table metadata, which helps ensure Immuta remains noninvasive and performant. If this property is set to true, running show catalogs, for example, will reflect what that user has access to instead of returning all catalogs. By default, this property is set to false.

    immuta.group.admin

    420 and newer

    Required if immuta.user.admin is not set

    This property identifies the Trino group that is the Immuta administrator. The users in this group will not have Immuta policies applied to them. Therefore, data sources should be created by users in this group so that they have access to everything. This property can be used in conjunction with the immuta.user.admin property, and regex filtering can be used (with a | delimiter at the end of each expression) to assign multiple groups as the Immuta administrator. Note that you must escape regex special characters (for example, john\\.doe+svcacct@immuta\\.com).

    immuta.http.timeout.milliseconds

    464 and newer

    Optional

    The timeout for all HTTP calls made to Immuta in milliseconds. Defaults to 30000 (30 seconds).

    immuta.user.admin

    392 and newer

    Required if immuta.group.admin is not set

    This property identifies the Trino user who is an Immuta administrator (for example, immuta.user.admin=immuta_system_account). This user will not have Immuta policies applied to them because this account will run the subqueries. Therefore, data sources should be created by this user so that they have access to everything. This property can be used in conjunction with the immuta.group.admin property, and regex filtering can be used (with a | delimiter at the end of each expression) to assign multiple users as the Immuta administrator. Note that you must escape regex special characters (for example, john\\.doe+svcacct@immuta\\.com).

    A user impersonating a different user in Trino requires the IMPERSONATE_USER permission in Immuta. Both users must be mapped to an Immuta user, or the querying user must match the configured immuta.user.admin regex.

    access-control.name

    392 and newer

    Required

    This property enables the integration.

    access-control.config-files

    392 and newer

    Optional

    Trino allows you to enable multiple system access control providers at the same time. To do so, add providers to this property as comma-separated values. This approach allows Immuta to work with existing Trino installations that have already configured an access control provider. Immuta does not manage all permissions in Trino and will default to allowing access to anything Immuta does not manage so that the Starburst (Trino) integration complements existing controls. For example, if the Starburst (Trino) integration is configured to allow users write access to tables that are not protected by Immuta, you can still lock down write access for specific non-Immuta tables using an additional access control provider.

    immuta.allowed.immuta.datasource.operations

    413 and newer

Optional

    This property defines a comma-separated list of allowed operations for Starburst (Trino) users on tables registered as Immuta data sources: READ, WRITE, and OWN. (See the Customize read and write access policies for Starburst (Trino) guide for details about the OWN operation.) When set to WRITE, all querying users are allowed read and write operations on data source schemas and tables. By default, this property is set to READ, which blocks write operations on data source tables and schemas. If write policies are enabled for your Immuta tenant, this property is set to READ,WRITE by default, so users are allowed read and write operations on data source schemas and tables.



    trino:
      globalAdminUsername: "[email protected]"
    access-control.config-files=/etc/starburst/immuta-access-control.properties
    # Enable the Immuta System Access Control (v2) implementation.
    access-control.name=immuta
    
    # The Immuta endpoint that was displayed when enabling the Starburst integration in Immuta.
    immuta.endpoint=http://service.immuta.com:3000
    
    # The Immuta API key that was displayed when enabling the Starburst integration in Immuta.
    immuta.apikey=45jdljfkoe82b13eccfb9c
    
    # The administrator user regex. Starburst usernames matching this regex will not be subject to
    # Immuta policies. This regex should match the user name provided at Immuta data source
    # registration.
    immuta.user.admin=immuta_system_account
    
    # Optional argument (default is shown).
    # A CSV list of operations allowed on schemas/tables registered as Immuta data sources.
    immuta.allowed.immuta.datasource.operations=READ
    
    # Optional argument (default is shown).
    # A CSV list of operations allowed on schemas/tables not registered as Immuta data sources.
    # Set to empty to allow no operations on non-Immuta data sources.
    immuta.allowed.non.immuta.datasource.operations=READ,WRITE
    
    # Optional argument (default is shown).
    # Controls table metadata filtering for inaccessible tables.
    #   - When this property is enabled and non-Immuta reads are also enabled, a user performing
    #     'show catalogs/schemas/tables' will not see metadata for a table that is registered as
    #     an Immuta data source but the user does not have access to through Immuta.
    #   - When this property is enabled and non-Immuta reads and writes are disabled, a user
    #     performing 'show catalogs/schemas/tables' will only see metadata for tables that the
    #     user has access to through Immuta.
    #   - When this property is disabled, a user performing 'show catalogs/schemas/tables' can see
    #     all metadata.
    immuta.filter.unallowed.table.metadata=false
    trino:
      globalAdminUsername: "[email protected]"
    access-control.config-files=/etc/trino/immuta-access-control.properties
    # Enable the Immuta System Access Control (v2) implementation.
    access-control.name=immuta
    
    # The Immuta endpoint that was displayed when enabling the Starburst integration in Immuta.
    immuta.endpoint=http://service.immuta.com:3000
    
    # The Immuta API key that was displayed when enabling the Starburst integration in Immuta.
    immuta.apikey=45jdljfkoe82b13eccfb9c
    
    # The administrator user regex. Starburst usernames matching this regex will not be subject to
    # Immuta policies. This regex should match the user name provided at Immuta data source
    # registration.
    immuta.user.admin=immuta_system_account
    
    # Optional argument (default is shown).
    # A CSV list of operations allowed on schemas/tables registered as Immuta data sources.
    immuta.allowed.immuta.datasource.operations=READ
    
    # Optional argument (default is shown).
    # A CSV list of operations allowed on schemas/tables not registered as Immuta data sources.
    # Set to empty to allow no operations on non-Immuta data sources.
    immuta.allowed.non.immuta.datasource.operations=READ,WRITE
    
    # Optional argument (default is shown).
    # Controls table metadata filtering for inaccessible tables.
    #   - When this property is enabled and non-Immuta reads are also enabled, a user performing
    #     'show catalogs/schemas/tables' will not see metadata for a table that is registered as
    #     an Immuta data source but the user does not have access to through Immuta.
    #   - When this property is enabled and non-Immuta reads and writes are disabled, a user
    #     performing 'show catalogs/schemas/tables' will only see metadata for tables that the
    #     user has access to through Immuta.
    #   - When this property is disabled, a user performing 'show catalogs/schemas/tables' can see
    #     all metadata.
    immuta.filter.unallowed.table.metadata=false
    docker run ocir.immuta.com/immuta/immuta-trino:414