Data Policies Reference Guide

Data policies manage what users see when they query data in a table they are subscribed to. The data policies that governors, data source owners, or domain policy managers can author fall into two categories: protect policies and reveal policies.

Protect policies

Protect policies restrict access to data once a user is subscribed to a data source. There are three different ways to restrict data access with protect policies:

  • Row-level: Filter rows from certain users at query time.

  • Column masking: Mask values in a column at query time.

  • Cell masking: Mask specific cells in a column based on separate values in the same row at query time.

For all protect policies, you can set the conditions under which they are enforced:

  • If the user is a member of a group (or several groups)

  • If the user possesses an attribute (or several attributes)

  • If the user is acting under a purpose (or several purposes) for which the data is allowed to be used

You can add multiple conditions to a protect policy, and each condition can be exclusionary or inclusionary, depending on the policy that's being enforced:

  • Exclusionary condition example: Mask values in columns tagged PII using hashing on all data sources for everyone except users in the group AUDIT.

  • Inclusionary condition example: Only show rows where user is a member of a group that matches the value in the column tagged Department.

    For all policies except purpose-based restriction policies, inclusionary logic allows policy authors to vary policy actions with an otherwise clause. For example, you could mask values using hashing for users acting under a specified purpose while masking those same values by making null for everyone else who accesses the data.
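The two condition styles above can be sketched in a few lines of Python. This is only an illustration, not Immuta's implementation: the use of SHA-256, the function names, and the purpose name Fraud Analysis are assumptions made for the example.

```python
import hashlib
from typing import Optional, Set


def sha256_hex(value: str) -> str:
    """Hash a value. SHA-256 is an assumption; the guide does not name the algorithm."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def exclusionary_mask(value: str, user_groups: Set[str]) -> str:
    """Mask using hashing for everyone except users in the group AUDIT."""
    if "AUDIT" in user_groups:
        return value  # the exception applies, so the value is shown in the clear
    return sha256_hex(value)


def inclusionary_mask(value: str, user_purposes: Set[str], purpose: str) -> Optional[str]:
    """Hash for users acting under `purpose`; otherwise make the value null."""
    if purpose in user_purposes:
        return sha256_hex(value)
    return None  # the "otherwise" clause: null for everyone else


# Hypothetical usage
print(exclusionary_mask("123-45-6789", {"AUDIT"}))             # value in the clear
print(inclusionary_mask("123-45-6789", set(), "Fraud Analysis"))  # None
```

In the inclusionary version, nobody sees the value in the clear: the otherwise clause only changes which masking type is applied.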

Data policy support matrix

The table below outlines the types of protect data policies supported for various data platforms. If a data platform isn't included in the table, that integration does not support any data policies.

Details about each of the data policy types are included in their linked reference guides.

Data platforms included in the matrix:

  • Amazon Redshift

  • Amazon Redshift Spectrum

  • Azure Synapse Analytics

  • Databricks Spark

  • Databricks Unity Catalog

  • Google BigQuery

  • Snowflake

  • Starburst (Trino)

  • Teradata

Entries in the matrix include Custom functions for masking or row-level policies, Replace with NULL or constant, and Supported with caveats.

Reveal policies

Reveal policies are only supported for global masking policies.

Reveal policies are exceptions to column-level restrictions. However, unlike the exceptions included directly in protect policies, these exceptions stand alone as their own policy and are essentially on standby until a masking policy targets the same column. For example, the following reveal policy would be applied to all columns tagged HR:

Reveal columns tagged HR for everyone who is a member of group HR.

However, the reveal policy wouldn't do anything until the following protect policy is applied to the same columns (because the data would already be in the clear):

Mask all columns tagged HR.

Then, the reveal policy would combine with the protect policy on data sources with columns tagged HR:

Mask all columns tagged HR except for users in group HR.

Because reveal policies merge with other policies, policy authors can build exceptions to policies without having to know or predict every protect policy. This design also allows you to selectively delegate policy authoring to users across your organization, so data governors could author overarching protect policies (like masking all PII) while domain delegates and data owners could author reveal policies specific to the data within their area of expertise.

See the policy authoring strategy section for guidance on when to author protect or reveal policies.

Using protect and reveal policies

Protect and reveal policies allow you to define exceptions without having to consider existing masking policies in place. Instead, you can define what should be revealed at any granularity needed without creating a protect policy for every possible permutation of access someone might request.

For example, many organizations want to author protect policies that mask data like this:

Mask columns tagged Employee for everyone except users in group HR.

Then, if other users need to access the table, administrators have to do one of the following:

  • Add users to the group HR when they need access to employee data (which gives them access to columns tagged Employee across all data sources) and then remove individual users from the group HR when their access should be removed.

  • Create separate policies for each permutation of data access, such as:

    • Mask columns tagged Employee for everyone except users in group HR.

    • Mask columns tagged Employee.Strictly Confidential for everyone except users with attribute Exception:Employee.Strictly Confidential.

    • Mask columns tagged Employee.Confidential for everyone except users with attribute Exception:Employee.Confidential.

Both of these strategies are untenable because of inefficiency, role bloat, and users gaining too much access.

Reveal policies solve this problem by separating the data access restriction from the exception. Consequently, governors can be responsible for authoring restrictions that protect data while data owners and domain delegates can be responsible for authoring reveal policies that provide exceptions to those restrictions. The examples above transform into the following policies:

Protect policy

Mask columns tagged Employee except for users in group HR.

Reveal policy

Reveal columns tagged Employee for everyone who has an attribute key Exception with a value that matches any column tag.

Once this protect policy and reveal policy target the same column in a data source, Immuta automatically combines them:

  • Columns tagged Employee.Strictly Confidential would have the following policy on them: Mask column for everyone except users in group HR or users who possess an attribute with key Exception and value Employee.Strictly Confidential.

  • Columns tagged Employee.Confidential would have the following policy on them: Mask column for everyone except users in group HR or users who possess an attribute with key Exception and value Employee.Confidential.
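As a rough sketch of the combined behavior described above, the check below treats the protect policy exception (group HR) and the reveal policy exception (attribute key Exception whose value matches the column tag) as alternatives. The function name and the data shapes are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def can_see_in_clear(column_tag: str,
                     user_groups: Set[str],
                     user_attributes: Dict[str, Set[str]]) -> bool:
    """Return True if the user sees an Employee-tagged column unmasked."""
    # Columns outside the Employee tag hierarchy are not targeted by this protect policy.
    if not (column_tag == "Employee" or column_tag.startswith("Employee.")):
        return True
    # Exception built into the protect policy.
    if "HR" in user_groups:
        return True
    # Exception contributed by the reveal policy: the Exception attribute
    # must hold a value that matches the column tag.
    return column_tag in user_attributes.get("Exception", set())


# Hypothetical user holding a single exception attribute
attrs = {"Exception": {"Employee.Confidential"}}
print(can_see_in_clear("Employee.Confidential", set(), attrs))           # True
print(can_see_in_clear("Employee.Strictly Confidential", set(), attrs))  # False
```

Granting or removing access then becomes a matter of changing a user's Exception attribute values rather than editing the masking policy or group membership.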

This design simplifies policy authoring processes and enables uninterrupted data access for your data users.

See the Multiple policies on a single data source section for details and examples of how protect and reveal policies merge when they apply to the same data source.

Policy authoring strategy

Although protect policies combine with the exceptions in reveal policies when they target the same column, you can also build exceptions directly into protect policies. To determine whether you should include exceptions within your protect policies rather than authoring separate reveal policies, consider the following:

  • Knowledge

    • Who is the expert of your organization's governance policies?

    • Who is the expert of the data and knows the users who need to access it?

  • Exceptions

    • Are the policy exceptions static, or do they often change?

    • Will policy exceptions need to be combined?

Use the comparison below to navigate these decision points.

Masking policy with exceptions

Mask columns tagged highly restricted for everyone except admins.

Admins get access to all highly restricted data.

Recommended when

  • Users are exempt based on the masking rule itself, not the data it applies to: users should get exempted from all masking this policy applies.

  • You use a centralized governance approach: one team owns all masking rules and exceptions.

  • Exceptions are only valid for a given rule: if a masking rule on a column changes, previous exceptions should be invalidated.

Reveal policy

Reveal columns tagged highly restricted and Finance for everyone who possesses attribute Finance.Allowed.

Users with attribute Finance.Allowed get access to highly restricted finance data only.

Recommended when

  • Users are exempt based on certain data the masking rule applies to, not the entire rule itself: users should get exempted from the masking this policy applies for a subset of data.

  • You use a decentralized governance approach: one team owns masking rules, but exceptions are managed by individual domain teams.

  • Exceptions are valid no matter what masking rule applies: if the masking rule changes on a column, previous exceptions should carry over.

Multiple policies on a single data source

More than one data policy may apply to a data source. When this happens, the policies will either merge or conflict, depending on the type of policy they are.

  • Reveal policies will merge with other global policies with OR. See the Reveal policy merges section below for details about how they will merge.

  • Protect policies

    • Row-level policies will merge with AND. See the Row-level policy merges section below for details about how they will merge.

    • Masking policies will conflict. See the Masking policy conflicts section below for details about how the masking policy is selected.

    • Purpose-based policies will merge with AND. See the Purpose-based policy merges section below for details about how they will merge.

Reveal policy merges

Reveal policies merge with existing global protect policies and other reveal policies with OR, which means that if multiple exceptions apply within the merged policy, users only need to meet one exception to see the data. For example, users in group HR or in group Executive would see data tagged Employee.Internal in the clear if the following policy were applied to a data source:

Mask columns tagged Employee.Internal for everyone except users in group HR or in group Executive.

The table below illustrates how multiple protect and reveal policies apply to data sources:

Policies applied:

  • Protect policy: Mask columns tagged Classified for everyone except users with attribute Access.Classified

  • Reveal policy: Reveal columns tagged Classified.Internal for users with attribute Access.Internal

  • Reveal policy: Reveal columns tagged Classified.Internal.Employee for users acting under purpose Quarterly review

Merged policy for each column:

  • Column A (tagged Classified): Mask columns tagged Classified except for users with attribute Access.Classified

  • Column B (tagged Classified.Internal): Mask columns tagged Classified.Internal except for users with attribute Access.Classified OR Access.Internal

  • Column C (tagged Classified.Internal.Employee): Mask columns tagged Classified.Internal.Employee except for users with attribute Access.Classified OR Access.Internal OR acting under purpose Quarterly review
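The merge shown above can be approximated with a short sketch. It assumes that a policy written against a tag also targets columns carrying a deeper tag in the same hierarchy, which is how the example columns behave. The Policy class and function names are hypothetical and are not part of any Immuta API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Policy:
    kind: str       # "protect" or "reveal"
    tag: str        # tag the policy targets
    exception: str  # entitlement that exempts a user from masking


def targets(policy_tag: str, column_tag: str) -> bool:
    """A policy targets a column tagged with its tag or any deeper tag."""
    return column_tag == policy_tag or column_tag.startswith(policy_tag + ".")


def merged_exceptions(column_tag: str, policies: List[Policy]) -> List[str]:
    """Collect every exception that applies to the column; they combine with OR."""
    return [p.exception for p in policies if targets(p.tag, column_tag)]


def sees_in_clear(column_tag: str, entitlements: Set[str], policies: List[Policy]) -> bool:
    masked = any(p.kind == "protect" and targets(p.tag, column_tag) for p in policies)
    if not masked:
        return True  # no masking policy targets the column, so reveal policies stay on standby
    return any(e in entitlements for e in merged_exceptions(column_tag, policies))


POLICIES = [
    Policy("protect", "Classified", "attribute Access.Classified"),
    Policy("reveal", "Classified.Internal", "attribute Access.Internal"),
    Policy("reveal", "Classified.Internal.Employee", "purpose Quarterly review"),
]

for tag in ("Classified", "Classified.Internal", "Classified.Internal.Employee"):
    print(tag, "->", " OR ".join(merged_exceptions(tag, POLICIES)))
```

Because the exceptions are joined with OR, meeting any single one is enough to see the column in the clear.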

To combine exceptions with AND, the exceptions must be included in a single protect policy or a single action of a reveal policy:

Protect policy

Mask columns tagged Classified.Internal for everyone except when user possesses attribute Access.Classified AND Access.Internal.

Reveal policy

Reveal columns tagged Classified.Internal for users who possess attribute Access.Classified AND Access.Internal.

If a local policy is applied to a data source, the reveal policy will not merge with that local policy.

Row-level policy merges

When two or more row-level policies target the same table, all row-level policies will be applied with AND, meaning the user will need to meet the conditions of all the policies to see any rows in the table at all. For example, these two row-level policies apply to the same data source:

  • Only show rows where user possesses an attribute Classification that matches the value in column first_name for everyone except when user possesses attribute with key Classification and value Strictly Confidential.

  • Only show rows where user possesses an attribute Classification that matches the value in column first_name for everyone except when user is a member of group with name Managers.

To see any rows in the table, the querying user must be a member of group Managers and have the attribute key-value pair Classification: Strictly Confidential.

To combine separate row-level policies with OR, build them into a single Immuta policy using OR to combine the conditions of the policy.

Purpose-based policy merges

When two or more purpose-based policies target the same table, all purpose-based policies will be combined with AND, meaning the user will need to meet the conditions of all the policies to see any rows in the table. For example, these two purpose-based policies apply to the same data source:

  • Limit usage to purpose Marketing Campaign for everyone except when user is a member of the group Marketing Execs on data sources tagged Customer Data.

  • Limit usage to purpose Distribution for everyone except when user possesses attribute Classification: Strictly Confidential on data sources tagged Customer Data.Address.

To see any rows in the table, one of the following statements must be true for the user:

  • The user must be working under the purposes Marketing Campaign AND Distribution.

  • The user must be a member of the Marketing Execs group AND have the attribute Classification: Strictly Confidential.

Masking policy conflicts

In some cases, two conflicting global masking policies apply to a single data source. When this happens, the policy containing a tag deeper in the hierarchy will apply to the data source to resolve the conflict.

Consider the following global data policies created by a data governor:

Data policy 1

Mask columns tagged PII by making null for everyone on data sources with columns tagged PII.

Data policy 2

Mask columns tagged PII.SSN using hashing for everyone on data sources with columns tagged PII.SSN.

If a data owner creates a data source and applies the PII.SSN tag to a column, both of these global masking policies will apply to the column with that tag. Instead of having a conflict, the policy containing a deeper tag in the hierarchy will apply.

In this example, data policy 2 will be applied to the data source because PII.SSN is deeper and thus considered more specific than PII. If data owners wanted to use data policy 1 on the data source instead, they would need to disable data policy 2.

If two or more masking policies target the same column and have the same hierarchy depth, the policy that was authored first will be applied. This is a conservative approach that avoids the original policy being changed unexpectedly.
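The selection rules above (deepest tag wins, then the earliest-authored policy) can be sketched as follows. The MaskingPolicy class and its authored_order field are hypothetical stand-ins for whatever metadata the platform actually stores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MaskingPolicy:
    name: str
    tag: str             # tag the policy targets, for example "PII.SSN"
    masking_type: str    # for example "null" or "hash"
    authored_order: int  # lower means authored earlier (hypothetical field)


def tag_depth(tag: str) -> int:
    """Depth in the tag hierarchy: PII is 1, PII.SSN is 2."""
    return len(tag.split("."))


def select_masking_policy(column_tag: str,
                          policies: List[MaskingPolicy]) -> Optional[MaskingPolicy]:
    """Pick the policy with the deepest matching tag; break ties by authoring order."""
    candidates = [
        p for p in policies
        if column_tag == p.tag or column_tag.startswith(p.tag + ".")
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (-tag_depth(p.tag), p.authored_order))


policies = [
    MaskingPolicy("Data policy 1", "PII", "null", authored_order=1),
    MaskingPolicy("Data policy 2", "PII.SSN", "hash", authored_order=2),
]
print(select_masking_policy("PII.SSN", policies).name)  # Data policy 2
```

Sorting on the pair (negative depth, authoring order) encodes both rules in a single comparison.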

Masking policy intelligent fallbacks

When masking columns, the type of the column matters. For example, it is not possible to hash a numeric column, because the hash would render the number as a string.

Many data platforms make the user account for this by building separate data policies for every column type that could exist now or in the future, which is onerous.

Instead, Immuta has intelligent fallbacks. An intelligent fallback occurs when a masking type targets a column type that is incompatible with the masking type. In this case, Immuta will fall back to the most appropriate masking type that retains the level of privacy or better required by the previous type.

For example, if a hashing masking type hits a numeric type, it would intelligently fall back to nulling the column instead, since nulls are allowed in numeric types.
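A minimal sketch of that single fallback is shown below. It covers only the hashing-on-numeric case described here; the type names and the function are assumptions for illustration, and the full fallback rules are not reproduced.

```python
# Hypothetical set of column types treated as numeric for this example.
NUMERIC_TYPES = {"integer", "bigint", "decimal", "float", "double"}


def effective_masking_type(requested: str, column_type: str) -> str:
    """Return the masking type that would actually be applied to the column."""
    if requested == "hash" and column_type.lower() in NUMERIC_TYPES:
        # A hash would turn the number into a string, so fall back to null,
        # which numeric columns accept and which hides at least as much.
        return "null"
    return requested


print(effective_masking_type("hash", "varchar"))  # hash
print(effective_masking_type("hash", "integer"))  # null
```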

Lockout policies

Sometimes a global data policy will target a table and the policy cannot be applied as written. This can happen for several reasons, but the most common is that the row-level policy logic is not relevant to the table in question.

For example, consider the following policy:

@attributeValuesContains('Attribute Name', 'SOME_COLUMN')

If SOME_COLUMN does not exist in the table, the row-level policy will not work. For this reason, it is always recommended to use the @columnTagged('tag name') function instead of hard-coding column names.

When an error like this occurs with a global data policy, the lockout policy takes effect. The lockout policy is a row-level policy that blocks all rows from being returned to any user. Since Immuta does not know how to apply the policy, the lockout policy avoids data leaks until the policy is edited to work correctly.
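The lockout behavior can be pictured with the sketch below. It assumes the row-level policy keeps rows whose value in the policy column is among the user's attribute values; that reading of the policy, along with the function name and data shapes, is an assumption made for the example.

```python
from typing import Dict, List, Set


def visible_rows(rows: List[Dict[str, str]],
                 table_columns: Set[str],
                 policy_column: str,
                 user_attribute_values: Set[str]) -> List[Dict[str, str]]:
    """Apply a row-level policy, locking the table out if it cannot be applied."""
    if policy_column not in table_columns:
        # The policy references a column this table does not have,
        # so no rows are returned for any user (lockout).
        return []
    return [r for r in rows if r[policy_column] in user_attribute_values]


rows = [{"dept": "HR", "name": "Alice"}, {"dept": "Sales", "name": "Bob"}]

# Policy column exists: rows are filtered by the user's attribute values.
print(visible_rows(rows, {"dept", "name"}, "dept", {"HR"}))
# Policy column is missing from the table: lockout, no rows at all.
print(visible_rows(rows, {"dept", "name"}, "SOME_COLUMN", {"HR"}))
```

Failing closed in this way favors blocked queries over accidental exposure while the policy is being corrected.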

New column added data policy

This templated policy pairs with schema monitoring to mask columns that are newly added to data sources until data owners review and approve these changes from the requests tab of their profile page.

When this policy is activated by a governor, it will automatically be enforced on data sources that have the New tag applied to them.

To learn how to activate this policy, see the Clone, activate, or stage a global policy how-to guide.

Custom data policy certifications

When building a global data policy, governors can create custom certifications, which must then be acknowledged by data owners when the policy is applied to data sources.

For example, data governors could add a custom certification that states that data owners must verify that tags have been added correctly to their data sources before certifying the policy.

When a global data policy with a custom certification is cloned, the certification is also cloned. If the user who clones the policy and custom certification is not a governor, the policy will only be applied to data sources that user owns.

Policy explainer

AI-powered feature

This feature is currently only supported for global data policies.

The Policy explainer generates a summary of how a global data policy will affect users' access to a data source, which allows policy authors to verify the behavior of the policy before activating it.

To generate the policy summary, the Policy explainer sends the policy definition in JSON to AWS Bedrock. Then, it crafts an example table and scenarios based on mock data to simulate complex, real-world access decisions.

For example, if a user authored the policy

Mask columns tagged email using hashing for everyone except when user is a member of group Marketing.

and clicked Explain policy, the following policy definition would be sent to AWS Bedrock:

Policy definition JSON example

Then, the Policy explainer would generate a policy summary for the user.

Policy summary

The policy summary illustrates how a policy will affect various potential data consumers so that policy authors can see the policy's effect from different perspectives. The policy summary comprises distinct sections:

  1. A brief explanation of the policy's intended behavior.

  2. A sample table with mock data and columns.

  3. A description of what happens when users with contrasting entitlements query that table.

Below is an example of a policy summary created by the Policy explainer for the following policy:

Mask columns tagged email using hashing for everyone except when user is a member of group Marketing.

Policy summary example

This policy hashes the values in any column tagged email, so email addresses are replaced with a consistent cryptographic hash for most users. Members of the Marketing group are exempt, and they see the original email addresses. Here is an example of how this policy would affect users querying the same data:

  1. The Data Source

Imagine a table named Customer_Details that has the following data, where the email column is tagged with email:

customer_id    name     email

  2. User Scenarios

The output of a query (SELECT * FROM Customer_Details;) will vary based on the querying user's group membership:

Scenario A: User is NOT in the Marketing group

User: User A (Member of the Analysts group)

customer_id    name     email
101            Alice    a3f2b8c9d4e5f1a2b3c4d5e6
102            Bob      7b9e4f1a2c8d9e0f1a2b3c4d
103            Carol    8e5f2d1c9b7a6e3f4a5b6c7d

User A is not in the Marketing group, so the policy is enforced. The email column is replaced with a cryptographic hash value, hiding the actual addresses while still allowing joins or counts on the hashed values.

Scenario B: User IS in the Marketing group

User: User B (Member of the Marketing group)

customer_id    name     email

User B belongs to the Marketing group, so the exception is triggered. The policy is not enforced for them, and they see the original email addresses in clear text.

Policy changes and activation

Policy authors dictate all changes to and activation of policies; the Policy explainer does not activate, deactivate, or change the content of a policy. The diagram below illustrates this delineation between the Policy explainer and the policy author:

The Policy explainer sends the policy definition in JSON to AWS Bedrock and returns a summary to the user, while the user is responsible for managing the actual policies that enforce access controls.

However, if the Policy explainer were to misrepresent the behavior of a policy, the user could activate policy changes that behave counter to what they intended. Therefore, policy authors should verify the accuracy of the explanation before activating the policy.

Data protection

The Policy explainer does not query or send any of your actual data. The only data sent to AWS Bedrock is the policy definition (configured by the user) in JSON.

For details about data protection with the Policy explainer, see Immuta's AI features page.

Audit

The following events related to policy certification are audited and can be found on the audit page in the UI:
