This guide outlines best practices for managing user identities in Immuta with your identity manager, such as Active Directory and Okta.
Reusing information you have already captured is good practice: a lot of information about your users already lives in your identity manager and can be used in Immuta for user onboarding and policies.
All users protected by Immuta must be registered in Immuta, even if they never log in to Immuta.
SAML is commonly used as a single sign-on mechanism for users to log in to Immuta. This means you can use your organization's SSO, which complies with your security standards.
Every user that will be protected by Immuta needs an Immuta user account so that policy can be enforced, regardless of whether they ever log in to Immuta. SCIM should be used to provision users from your identity manager to Immuta automatically. The advantage is that end users do not need to log in to Immuta to create their accounts, and updates in your identity manager are automatically reflected in Immuta, which in turn updates access in your data platforms.
Details on how to configure your individual identity manager's protocols can be found here:
There are several different combinations of supported protocol configurations, so consider those as you plan your user synchronization.
In Immuta, permissions control what actions a user is allowed to take through the API and UI. The different permissions can be found in the Immuta permissions guide.
We recommend using identity manager groups to manage permissions. When you configure the identity manager integration, you can enable group permissions. This allows you to control permissions through identity manager groups and reuse the group assignment and approval process you already have in place.
Immuta is not just a place to define your policy logic; Immuta also enforces that logic in your data platform. How that occurs varies by data platform, but the overall architecture remains consistent and follows the architecture recommended by NIST, described in the diagram below:
Immuta lives in the middle control plane. To fill this role, Immuta knows details about the subjects and enterprise resources, acts as the policy decision point through policies administered by policy administrators, and makes real-time policy decisions using the internal Immuta policy engine.
Lastly, and importantly for how Immuta Secure functions, Immuta also enables the policy enforcement point by administering policies natively in your data platform in a way that reacts to policy changes and live queries.
To take advantage of all the capabilities of Immuta, you must make Immuta aware of your data's metadata. This is done by registering your data with Immuta as data sources. It’s important to remember that Immuta does not read your actual data at all; it simply discovers your information schemas and pulls that information back as the foundation for everything else.
This section offers the best practices when onboarding data sources into Immuta.
If you have an external data catalog, like Collibra or Alation, configure that catalog in Immuta first; then register your data in Immuta. This process will automatically tag your data with the external catalog tags as you register it.
Use Immuta's no default subscription policy setting to onboard metadata without affecting your users' access. This means you can onboard all metadata into Immuta without any impact on current access, which gives you time to fully convert your operations to Immuta without causing unnecessary data downtime. Immuta only takes control when the first policies are applied. Because of this, register all tables.
While it can be tempting to start small and register only the data you intend to protect, remember that Immuta is not just about access control. Registering your data's metadata also lets Immuta track activity and understand where sensitive data lies (with Immuta Detect). In other words, Immuta can’t tell you where you have problems unless you first tell it to look at your metadata.
Without the no default subscription policy setting, Immuta will set each data source's subscription policy to the most restrictive option, which automatically locks data down during onboarding. To unlock the data and give your users access again, new subscription policies must be set.
If you are delegating the registration and control of data, then please read our use case for more information.
Register data sources at the schema level; then use schema monitoring to find new data sources and automatically register them.
One of the greatest benefits of a modern data platform is that you can manage all your data transformations at the data tier. This means that data is constantly changing in the data platform, which may require access control changes as well. This is why it is critical that you enable schema monitoring when registering metadata with Immuta. This allows Immuta to constantly monitor for these changes and update accordingly.
It’s also important to understand that many data engineering tools make changes by destructively recreating tables and views, which drops all policies in the data platform. This is actually a good thing, because it gives Immuta the opportunity to update access as the changes are found (policy uptime), while the only user who can see the data being recreated is the creator of that change (data downtime for all other users). This is why schema monitoring and column detection are so critical.
To use Immuta, you must configure the Immuta native integration, which requires some level of privileged access to administer policies in your data platform, depending on your data platform and how the Immuta integration works. If using Databricks, refer to the Databricks roles best practices below before configuring the native integration.
This guide is for anyone ready to begin their journey with Immuta who isn't using Snowflake. Specifically, it provides details for configuring Immuta that must be accomplished before moving on to the Secure your data use cases.
If you are using Snowflake with Immuta, see the Monitor and secure sensitive data platform query activity use case.
This use case provides the basics for setting up Immuta and helps you understand the logic and decisions behind the tasks.
Follow these steps to configure Immuta:
Configure your users in Immuta, using the user identity best practices.
Read the native integration architecture overview and connect Immuta to your databases. Consider the Databricks roles best practices if using Databricks.
Register data sources to start using Immuta features on them.
Intermingling your pre-existing Databricks roles with Immuta can be confusing at first. The sections below outline some best practices on how to think about roles in each platform.
Access to data, platform permissions, and the ability to use clusters and data warehouses are controlled in Databricks Unity Catalog by granting permissions to individual users or groups. Immuta can manage those permissions, granting users read access to data based on subscription policies.
This section discusses best practices for Databricks Unity Catalog permissions for end-users.
Users who consume data (directly in your Databricks workspace or through other applications) need permission to access objects. But permissions are also used to control write access, the use of Databricks clusters and warehouses, and other object types that can be registered in Databricks Unity Catalog.
To manage this at scale, Immuta recommends taking a three-layer approach, separating the different permissions into distinct sets of privileges:
Privileges for read access (Immuta managed)
Privileges for write access (optional, soon supported by Immuta)
Privileges for warehouse and clusters, internal billing
Read access is managed by Immuta. By using subscription policies, data access can be controlled down to the table level. Attribute-based table GRANTs help you scale compared to RBAC, where access control is typically done at the schema or catalog level.
Since Immuta leverages native Databricks Unity Catalog GRANTs, you can combine Immuta’s grants with grants done manually in Databricks Unity Catalog. This means you can gradually migrate to an Immuta-protected Databricks workspace.
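For illustration, table-level read access ultimately comes down to native Unity Catalog GRANT statements like the sketch below, whether issued by Immuta from a subscription policy or run manually during a gradual migration. The catalog, schema, table, and group names here are hypothetical placeholders.

```sql
-- Hypothetical names; Unity Catalog also requires USE CATALOG and USE SCHEMA
-- on the parent objects for a table-level SELECT to be usable.
GRANT USE CATALOG ON CATALOG analytics TO `marketing_analysts`;
GRANT USE SCHEMA ON SCHEMA analytics.sales TO `marketing_analysts`;
GRANT SELECT ON TABLE analytics.sales.orders TO `marketing_analysts`;
```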
Write access is typically granted on a schema, catalog, or volume level. This makes it easy to manage in Databricks Unity Catalog through manual grants. We recommend creating groups that give `INSERT`, `UPDATE`, or `DELETE` permissions to a specific schema or catalog and attaching this group to a user. This attachment can be done manually or using your identity manager groups. (See the Databricks documentation for details.) Note that Immuta is working toward supporting write policies, so this will not need to be separately managed for long.
Warehouses and clusters are granted to users to give them access to computing resources. Since this is directly tied to Databricks’ consumption model, warehouses and clusters are typically linked to cost centers for (internal) billing purposes. Immuta recommends creating a group per team/domain/cost center, applying that group to cluster and warehouse privileges, and assigning users to it through your identity manager groups.
Immuta has two types of service accounts to connect to Databricks:
Policy role: Immuta needs a service principal to push policies to Databricks Unity Catalog and to pull audit records back to Immuta (optional). This principal needs `USE CATALOG` and `USE SCHEMA` on all catalogs and schemas, and `SELECT` and `MODIFY` on all tables in the metastore managed by Immuta. (A grant sketch for both service accounts appears at the end of this section.)
Data ownership role: You will also need a user/principal for the data source registration. A service account/principal is recommended so that when the user moves or leaves the organization, Immuta still has the proper credentials to connect to Databricks Unity Catalog. You can follow one of the two best practices:
A central role for registration (recommended): Create a service role/user with `SELECT` permissions for all objects in your metastore. Immuta can then register all the tables and views from Databricks, populate the Immuta catalog, and scan the objects for sensitive data using Immuta Discover. Immuta will not apply policy directly by default, so no existing access will be impacted.
A service principal per domain (alternative): If you cannot create a service principal with `SELECT` permissions for all objects, you can allow the different domains or teams in the organization to use a service user/principal scoped to their data. This delegates metadata registration, aligns well with data mesh use cases, and means every team is responsible for registering their own data sets in Immuta.
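As a rough sketch, the grants for these service accounts might look like the following. All principal, catalog, and schema names are hypothetical, the grants should be scoped to the catalogs you actually intend Immuta to manage, and service principals should be referenced by application ID if your workspace requires it.

```sql
-- Policy role: read metadata and administer policies across the metastore Immuta manages.
GRANT USE CATALOG, USE SCHEMA, SELECT, MODIFY ON CATALOG analytics TO `immuta_policy_principal`;

-- Central registration role (recommended): read-only visibility across the metastore.
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG analytics TO `immuta_registration_principal`;

-- Per-domain registration principal (alternative): scoped to a single team's catalog.
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG sales_domain TO `sales_registration_principal`;
```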