Data Policies Reference Guide

Data policies manage what users see when they query data in a table they are subscribed to.

There are three different ways to restrict data access with data policies:

Row-level: Filter rows from certain users at query time.
Column masking: Mask values in a column at query time.
Cell masking: Mask specific cells in a column based on separate values in the same row at query time.

For all data policies, you establish the conditions under which they are enforced:

If the user is a member of a group (or several groups)
If the user possesses an attribute (or several attributes)
If the user is acting under a purpose (or several purposes) for which the data is allowed to be used

Immuta allows you to combine multiple conditions, and each condition can be written as exclusionary or inclusionary, depending on the policy being enforced:

Exclusionary condition example: Mask using hashing values in columns tagged PII on all data sources for everyone except users in the group AUDIT.
Inclusionary condition example: Only show rows where user is a member of a group that matches the value in the column tagged Department.

For all policies except purpose-based restriction policies, inclusionary logic allows governors to vary policy actions with an otherwise clause. For example, governors could mask values using hashing for users acting under a specified purpose and mask those same values by making them null for everyone else who accesses the data.
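The difference between the two condition styles can be sketched in a few lines of Python. This is only an illustration of the logic, not how Immuta enforces policies: the function names, the SHA-256 hashing choice, and the purpose name "Fraud Analysis" are assumptions made for the example.

```python
import hashlib


def mask_hash(value):
    """Illustrative hashing: replace a value with its SHA-256 hex digest."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def exclusionary_mask(value, user_groups, exempt_group="AUDIT"):
    """Mask using hashing for everyone except members of the exempt group."""
    if exempt_group in user_groups:
        return value
    return mask_hash(value)


def inclusionary_mask(value, user_purposes, purpose="Fraud Analysis"):
    """Hash for users acting under the (hypothetical) purpose;
    otherwise make the value null."""
    if purpose in user_purposes:
        return mask_hash(value)
    return None  # the "otherwise" branch
```

In the exclusionary form, the named group is the exception to the masking action. In the inclusionary form, the named purpose selects one action and the otherwise clause selects another for everyone else.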

Data policy support matrix

The table below outlines the types of data policies supported for various data platforms. If a data platform isn't included in the table, that integration does not support any data policies.

Details about each of the data policy types are included in their linked reference guides.

Data platforms: Amazon Redshift | Amazon Redshift Spectrum | Azure Synapse Analytics | Databricks Spark | Databricks Unity Catalog | Google BigQuery | Snowflake | Starburst (Trino)

Conditional masking: ❌ | ✅ | ❌ | ✅
Custom functions for masking or row-level policies: ❌ | ✅
Format preserving masking: ❌ | ✅ | ❌
Hashing: ❌ | ✅
Masking fields within STRUCT columns: ❌ | ✅ | Supported with caveats | ❌
Minimize: ❌ | ✅
Only show data by time: ❌ | ✅
Only show rows (matching): ✅
Randomized response: ❌ | ✅ | ❌
Regex: ❌ | ✅ | ❌ | ✅
Replace with NULL or constant: ✅ | Supported with caveats | ✅
Reversible masking: ❌ | ✅ | ❌ | ✅ | ❌ | ✅
Rounding: ❌ | ✅
WHERE clause: ❌ | ✅

Policy behavior: conflicts, fallbacks, and lockout

Masking policy conflicts

In some cases, two conflicting global masking policies apply to a single data source. When this happens, the policy containing a tag deeper in the hierarchy will apply to the data source to resolve the conflict.

Consider the following global data policies created by a data governor:

Data policy 1:

Mask columns tagged PII by making null for everyone on data sources with columns tagged PII

Data policy 2:

Mask columns tagged PII.SSN using hashing for everyone on data sources with columns tagged PII.SSN

If a data owner creates a data source and applies the PII.SSN tag to a column, both of these global masking policies will apply to the column with that tag. Instead of having a conflict, the policy containing a deeper tag in the hierarchy will apply.

In this example, data policy 2 will be applied to the data source because PII.SSN is deeper and thus considered more specific than PII. If data owners wanted to use data policy 1 on the data source instead, they would need to disable data policy 2.

If two or more masking policies target the same column and their tags have the same hierarchy depth, the policy that was authored first applies. This conservative approach prevents the original policy's behavior from changing unexpectedly.
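The resolution rule described in this section (deepest tag first, then earliest authored) can be sketched as follows. This is a minimal model, not Immuta's implementation: the MaskingPolicy class, the created_order field, and the assumption that a policy on a parent tag also matches its child tags are all introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    name: str
    tag: str            # hierarchical tag, e.g. "PII.SSN"
    created_order: int  # hypothetical field: lower means authored earlier


def tag_depth(tag):
    """Depth of a dotted tag: "PII" is 1, "PII.SSN" is 2."""
    return len(tag.split("."))


def applies_to(policy_tag, column_tag):
    """Assumption: a policy on PII also targets columns tagged PII.<child>."""
    return column_tag == policy_tag or column_tag.startswith(policy_tag + ".")


def resolve(policies, column_tag):
    """Pick the policy with the deepest tag; break ties by earliest authored."""
    candidates = [p for p in policies if applies_to(p.tag, column_tag)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (-tag_depth(p.tag), p.created_order))


policy_1 = MaskingPolicy("Data policy 1 (make null)", "PII", created_order=1)
policy_2 = MaskingPolicy("Data policy 2 (hashing)", "PII.SSN", created_order=2)
winner = resolve([policy_1, policy_2], "PII.SSN")
```

Here winner is policy_2, matching the example above, because PII.SSN is deeper than PII. If a third policy on PII.SSN were authored later, policy_2 would still be chosen.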

Row-level policy conflicts

Similar to masking policies, it is possible for two or more row-level policies to target the same table. When this occurs, all of the row-level policies are applied and combined with AND, so a user sees a row only if every one of those policies allows it.

To combine row-level conditions with OR instead, build them into a single Immuta policy joined by OR.
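The difference between separate policies (AND) and a single policy with OR can be modeled with plain predicates. This is a sketch under assumed names: the predicates, the row fields, and the user structure are hypothetical and only illustrate the combination logic.

```python
def visible_rows(rows, user, policies):
    """Separate row-level policies are AND'ed: a row must pass all of them."""
    return [row for row in rows if all(policy(row, user) for policy in policies)]


def department_matches(row, user):
    return row["department"] in user["groups"]


def country_matches(row, user):
    return row["country"] in user["countries"]


def either(*predicates):
    """Model of one policy whose conditions are joined with OR."""
    return lambda row, user: any(p(row, user) for p in predicates)


rows = [
    {"id": 1, "department": "HR", "country": "US"},
    {"id": 2, "department": "HR", "country": "DE"},
    {"id": 3, "department": "IT", "country": "US"},
    {"id": 4, "department": "IT", "country": "DE"},
]
user = {"groups": {"HR"}, "countries": {"US"}}

# Two separate policies: only rows passing both are returned.
and_result = visible_rows(rows, user, [department_matches, country_matches])

# One policy combining both conditions with OR.
or_result = visible_rows(rows, user, [either(department_matches, country_matches)])
```

With two separate policies only row 1 is visible. With the single OR policy, rows 1, 2, and 3 are visible.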

Masking policy intelligent fallbacks

When masking columns, the type of the column matters. For example, it is not possible to hash a numeric column, because the hash would render the number as a string.

Many data platforms make the user account for this by building separate data policies for every column type that could exist now or in the future, which is quite onerous.

Instead, Immuta has intelligent fallbacks. An intelligent fallback occurs when a masking type targets a column type that is incompatible with the masking type. In this case, Immuta falls back to the most appropriate masking type that provides at least the level of privacy of the original type.

For example, if a hashing masking type hits a numeric type, it would intelligently fall back to nulling the column instead, since nulls are allowed in numeric types.
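The hashing-to-null fallback can be illustrated with a small function. This sketch covers only the single fallback named above; the type names, the masking type strings, and the use of SHA-256 are assumptions, and the actual fallback rules for other masking types are not modeled.

```python
import hashlib

# Hypothetical set of column type names treated as numeric in this sketch.
NUMERIC_TYPES = {"int", "float", "decimal"}


def mask_value(value, column_type, masking_type):
    """Apply a masking type, falling back to null when hashing is incompatible."""
    if masking_type == "hash":
        if column_type in NUMERIC_TYPES:
            # A hash is a string, so it cannot be stored in a numeric column.
            # Null is allowed there and is at least as private as a hash.
            return None
        return hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    if masking_type == "null":
        return None
    raise ValueError(f"unsupported masking type in this sketch: {masking_type}")
```

One policy can then cover columns of any type: string columns are hashed, and numeric columns are nulled without a separate policy per type.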

Lockout policies

Sometimes a global data policy will target a table and the policy cannot be applied as written. This can happen for several reasons, but the most common is that the row-level policy logic is not relevant to the table in question.

For example, consider the following row-level policy condition:

@attributeValuesContains('Attribute Name', 'SOME_COLUMN')

If SOME_COLUMN does not exist in the table, the row-level policy will not work. This is why it is always recommended to use the @columnTagged('tag name') function instead of hard-coding column names.

In the case where an error such as this occurs with a global data policy, the lockout policy will kick in. The lockout policy is a row-level policy that blocks any rows from returning for any users. Since Immuta does not know how to apply the policy, the lockout policy avoids data leaks until the policy is edited to work correctly.
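The lockout behavior can be modeled as a filter builder that refuses to return any rows when the policy cannot be applied. This is a simplified sketch: the function name, its parameters, and the interpretation of the condition (the row's column value must be among the user's attribute values) are assumptions for illustration.

```python
def build_row_filter(table_columns, referenced_column, user_attribute_values):
    """Return a row predicate for a policy that references one column.

    If the column is missing, the policy cannot be applied as written,
    so a lockout predicate that blocks every row is returned instead.
    """
    if referenced_column not in table_columns:
        return lambda row: False  # lockout: no rows for any user
    return lambda row: row[referenced_column] in user_attribute_values


rows = [{"REGION": "EMEA", "AMOUNT": 10}, {"REGION": "APAC", "AMOUNT": 20}]
columns = {"REGION", "AMOUNT"}

# The policy references a column that exists: normal filtering.
allowed = build_row_filter(columns, "REGION", {"EMEA"})
normal_result = [row for row in rows if allowed(row)]

# The policy references a column that does not exist: lockout.
locked = build_row_filter(columns, "SOME_COLUMN", {"EMEA"})
lockout_result = [row for row in rows if locked(row)]
```

Failing closed in this way means a misconfigured policy returns nothing rather than exposing unfiltered rows.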

New column added data policy

This templated policy pairs with schema monitoring to mask newly added columns to data sources until data owners review and approve these changes from the requests tab of their profile page.

When this policy is activated by a governor, it will automatically be enforced on data sources that have the New tag applied to them.

To learn how to activate this policy, see the Clone, activate, or stage a global policy how-to guide.

Custom data policy certifications

When building a global data policy, governors can create custom certifications, which must then be acknowledged by data owners when the policy is applied to data sources.

For example, data governors could add a custom certification that states that data owners must verify that tags have been added correctly to their data sources before certifying the policy.

When a global data policy with a custom certification is cloned, the certification is also cloned. If the user who clones the policy and custom certification is not a governor, the policy will only be applied to data sources that user owns.

Audit

The following events related to policy certification are audited and can be found on the audit page in the UI:

DatasourcePolicyCertificationExpired: The global policy certification on a data source has expired.
DatasourcePolicyCertified: A global policy is certified for a data source.
DatasourcePolicyDecertified: A global policy is decertified for a data source.
