How-to Guides

Data Policies

Data policies determine what users see when they query data in a table they have access to.

How-to guides

  • Author a masking data policy

  • Policy certifications and diffs: Certify policies and view policy diffs on a data source.

Reference guides

  • Data policies reference guide: This guide describes how data policies work in Immuta.

  • Masking policies: This guide describes the types of masking policies available and when to use each.

  • Row-level policies: Row-level policies compare data values with user metadata at query time to determine whether or not the querying user should have access to the individual rows of data.

Author a Restricted Data Policy

Data owners who are not governors can write restricted subscription and data policies, which allow them to enforce policies on multiple data sources simultaneously, eliminating the need to write redundant local policies.

Unlike global policies, the application of these policies is restricted to the data sources owned by the users or groups specified in the policy and will change as users' ownerships change.

  1. Click the Policies icon in the navigation menu and select Data Policies.

  2. Click New data policy and complete the Policy name field.

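  3. Select how the policy should protect the data. For instructions on building a specific data policy, see the how-to guide for that policy type: Mask, Only show rows, Only show data by time, Limit usage to purpose(s), or Minimize data source.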

  4. AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and explanation of the policy behavior on various users using mock data.

  5. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

  6. From the Where should this policy be applied dropdown menu, select When selected by data owners, On all data sources, or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    • tagged: Select this option and then search for tags in the subsequent dropdown menu.

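    • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.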

  7. Beneath Whose Data Sources should this policy be restricted to, add users or groups to the policy restriction by typing in the text fields and selecting from the dropdown menus that appear.

  8. Click Create Policy, and then click Activate Policy or Stage Policy.

Author a Masking Data Policy


Best practice: use global policies

Build global policies with tags instead of writing local policies to manage data access. This practice will prevent you from having to author or update individual policies for every data source added to Immuta.

  1. Determine your policy scope:

    • Global policy: Click the Policies icon in the navigation menu and select the Data Policies tab. Click New data policy and complete the Policy name field.

    • Local policy: Navigate to a specific data source and click the Policies tab. Scroll to the Data Policies section and click New Policy.

  2. Select Mask from the first dropdown menu.

  3. Select columns tagged, columns with any tag, columns with no tags, all columns, or columns with names spelled like.

  4. Select a masking type (some of these types will only be available for Snowflake integrations):

  5. Select everyone except, everyone, or everyone who to continue the condition.

    • everyone except: In the subsequent dropdown menus, choose is a member of group, possesses attribute, or is acting under purpose. Complete the condition with the subsequent dropdown menus. For a list of exceptions and an explanation of their behavior, see the Masking policies reference guide.

  6. AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and explanation of the policy behavior on various users using mock data.

  7. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

  8. For global policies: Click the dropdown menu beneath Where should this policy be applied and select When selected by data owners, On all data sources, or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    • tagged: Select this option and then search for tags in the subsequent dropdown menu.

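    • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.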

  9. Click Create Policy. If creating a global policy, you then need to click Activate Policy or Stage Policy.

Author a Minimization Policy

  1. Determine your policy scope:

    • Global policy: Click the Policies icon in the navigation menu and select the Data Policies tab. Click New data policy and complete the Policy name field.

Limit to Purpose Policies

Limit to purpose policies use purposes and Immuta projects to govern access to data.

Purposes define the scope and use of data within a project, while projects allow users to meet purpose restrictions on policies. In general, here is how purposes and projects interact to enforce purpose-based access controls:

  • Governors create purposes and include them in global data policies.

  • Project owners then add purposes to their project(s).

  • Limit to purpose policies: This guide describes how purposes are used in data policies.

  • Custom WHERE clause functions: This guide describes the custom functions you can use to extend the PostgreSQL WHERE syntax.

  • Lookup tables: If user metadata is stored in a table in the same data platform where a policy is enforced, it is not necessary to move that user metadata into Immuta. This guide describes how to reference the user metadata in those tables directly using custom WHERE functions in data policies.


  • using hashing

  • by making null

  • using a constant: Enter a constant in the field that appears next to the masking type dropdown.

  • using a regex:

    1. Enter a regular expression and replacement value in the fields that appear next to the masking type dropdown.

    2. From the next dropdown, choose to make the regex Case Insensitive and/or Global. For this policy to be enforced on Redshift data sources, Global must be selected.

  • by rounding: Select the Bucket Type and then enter the bucket size.

  • with reversibility

  • with format preserving masking

  • using randomized response

  • using the custom function: Enter the custom function native to the underlying database.

    Note: The function must be valid for the data type of the column. If it is not, the default masking type will be applied to the column.
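To make a few of the masking types above more concrete, here is a minimal Python sketch of what they could do to a single value. It is an illustration only, not Immuta's implementation: the function names, the SHA-256 hash, the salt, and the floor-to-bucket rounding are all assumptions chosen for the example.

```python
import hashlib
import re


def mask_null(value):
    # "by making null": every value is replaced with None.
    return None


def mask_constant(value, constant="REDACTED"):
    # "using a constant": every value is replaced with the same constant.
    return constant


def mask_regex(value, pattern, replacement, case_insensitive=False, global_=True):
    # "using a regex": replace matches of the pattern. The two keyword
    # arguments mirror the Case Insensitive and Global modifiers.
    flags = re.IGNORECASE if case_insensitive else 0
    count = 0 if global_ else 1  # count=0 replaces every match
    return re.sub(pattern, replacement, value, count=count, flags=flags)


def mask_round(value, bucket_size):
    # "by rounding": assumed here to mean flooring to the start of the bucket.
    return (value // bucket_size) * bucket_size


def mask_hash(value, salt="hypothetical-salt"):
    # "using hashing": a salted SHA-256 digest stands in for the real hash.
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
```

With Global selected, `mask_regex("555-123-4567", r"\d", "X")` replaces every digit; without it, only the first match is replaced.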

  • for everyone who: Complete the Otherwise clause. You can add more than one condition by selecting + Add Another Condition. The dropdown menu in the policy builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

    • Local policy: Navigate to a specific data source and click the Policies tab. Scroll to the Data Policies section and click New Policy.

  • Select Minimize data source from the first dropdown.

  • Complete the enter percentage field to limit the amount of data returned at query-time.

  • Select for everyone except from the next dropdown menu to continue the condition. Additional options include for everyone and for everyone who.

  • Use the next field to choose the attribute, group, or purpose that you will match values against.

    Notes:

    • If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

    • You can add more than one condition by selecting + Add Another Condition. The dropdown menu then contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  • AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and explanation of the policy behavior on various users using mock data.

  • Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

  • For global policies: Click the dropdown menu beneath Where should this policy be applied, and select On all data sources, On data sources, or When selected by data owners. If you select On data sources, finish the condition in one of the following ways:

    • tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  • Click Create Policy. If creating a global policy, you then need to click Activate Policy or Stage Policy.
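The "On data sources" conditions in the steps above can be read as predicates over data source metadata. The sketch below expresses each option as a small Python function. The `DataSource` class and all of its field names are hypothetical stand-ins for illustration and are not part of any Immuta API.

```python
import re
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataSource:
    # Hypothetical stand-in for a data source; not an Immuta object.
    name: str
    tags: set = field(default_factory=set)
    column_tags: set = field(default_factory=set)
    column_names: list = field(default_factory=list)
    server: str = ""
    created: date = date(2024, 1, 1)


def tagged(ds, tag):
    # "tagged": the data source itself carries the tag.
    return tag in ds.tags


def with_columns_tagged(ds, tag):
    # "with columns tagged": at least one column carries the tag.
    return tag in ds.column_tags


def with_column_names_like(ds, pattern, case_insensitive=False):
    # "with column names spelled like": a regex plus an optional modifier.
    flags = re.IGNORECASE if case_insensitive else 0
    return any(re.search(pattern, name, flags) for name in ds.column_names)


def in_server(ds, server):
    # "in server": data sources that share the same connection.
    return ds.server == server


def created_between(ds, start, end):
    # "created between": creation date inside an inclusive range.
    return start <= ds.created <= end


claims = DataSource(
    "claims",
    tags={"PHI"},
    column_tags={"SSN"},
    column_names=["patient_ssn", "state"],
    server="warehouse-1",
    created=date(2024, 3, 15),
)
```

In this sketch, a global policy applied "On data sources tagged PHI" would select `claims` because `tagged(claims, "PHI")` is true.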


    Data users work within the context of a project to access those data sources.

    For example, if a governor created the purpose Research, they could author the following global policy:

    Limit usage to purpose(s) Research for everyone on data sources tagged PHI.

    Once a project owner adds the Research purpose to a project, any user acting under that project context would meet the criteria of the policy and gain access to data sources tagged PHI.
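The Research example can be sketched as a small check in Python. This is a simplified model under stated assumptions, not how Immuta evaluates policies: the `Project` class, the `policy_allows` function, and the idea of passing the user's current project as an argument are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical model of a project and the purposes added to it.
    name: str
    purposes: set = field(default_factory=set)


def policy_allows(data_source_tags, current_project,
                  required_purpose="Research", policy_tag="PHI"):
    # The policy only targets data sources that carry the policy tag.
    if policy_tag not in data_source_tags:
        return True
    # A user who is not acting under any project has no purpose.
    if current_project is None:
        return False
    # Otherwise the project the user is acting under must hold the purpose.
    return required_purpose in current_project.purposes


study = Project("Readmission study", purposes={"Research"})
campaign = Project("Spring campaign", purposes={"Marketing"})
```

Here a user acting under `study` meets the policy for a data source tagged PHI, while a user acting under `campaign`, or under no project at all, does not.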

    Refer to the data governor policy guide for a tutorial on purpose-based restrictions on data or to the Projects and purposes reference guide for more details about these features.


    Reference Guides

    Author a Purpose-Based Restriction Policy

    Requirement and prerequisite:

    • CREATE_DATA_SOURCE or GOVERNANCE Immuta permission

    • A purpose has been created

    Build the policy

    1. Determine your policy scope:

      • Global policy: Click the Policies icon in the navigation menu and select the Data Policies tab. Click New data policy and complete the Policy name field.

      • Local policy: Navigate to a specific data source and click the Policies tab. Scroll to the Data Policies section and click New Policy.

    Related guides

    How-to guides

    • Create a project: To restrict access to data and associate your data source with a purpose, create a project and add the purpose and relevant data sources to the project.

    • Manage project purposes

    Reference guides

    • Projects and purposes

    • Purpose-based policy restrictions

    Conceptual guide

    • Why use projects?

    Policy Certifications and Diffs

    Required permissions

    To manage and apply existing policies to data sources, a user must either have the CREATE_DATA_SOURCE Immuta permission or be manually assigned the owner role on a data source.


    Create a custom certification for a global policy

    Data governors can add certifications that outline acknowledgements or require approvals from data owners. For example, data governors could add a custom certification that states that data owners must verify that tags have been added correctly to their data sources before certifying the policy.

    1. Click Add Certification in the data policy builder.

    2. Enter a Certification Label and Certification Text in the corresponding fields of the dialog that appears.

    3. Click Save.

    Certify global policies

    After a policy with a certification requirement is applied to a data source, data owners will receive a notification indicating that they need to certify the policy.

    1. Navigate to the Policies tab of the affected data source, and review the policy in the Data Policies section.

    2. Click Certify Policy.

    3. In the Policy Certification modal, click Sign and Certify.

    View policy diffs

    Once you have a data policy in effect, you can view the changes in your policies by clicking the Policy Diff button in the data policies section on a data source's policies tab.

    The Policy Diff button displays previous policies and the current policy applied to the data source.

  • Select Limit usage to purpose(s) in the first dropdown menu.

  • In the next field, select a specific purpose that you would like to restrict usage of this data source to or ANY PURPOSE. You can add more than one condition by selecting + Add Another Condition. The dropdown menu in the policy builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  • Select for everyone or for everyone except. If you select for everyone except, you must select conditions that will drive the policy such as group, purpose, or attribute.

  • AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and explanation of the policy behavior on various users using mock data.

  • Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

  • For global policies: Click the dropdown menu beneath Where should this policy be applied, and select On all data sources, On data sources, or When selected by data owners. If you select On data sources, finish the condition in one of the following ways:

    • tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

    • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  • Click Create Policy. If creating a global policy, you then need to click Activate Policy or Stage Policy.
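The or/and conjunctions described in the steps above behave like `any` and `all` over a list of conditions. The following Python sketch shows that behavior. The helper names and the dictionary shape used for a user are assumptions made for the example.

```python
def is_member_of(group):
    # Condition: the user belongs to the given group.
    return lambda user: group in user["groups"]


def is_acting_under(purpose):
    # Condition: the user is acting under the given purpose.
    return lambda user: purpose in user["purposes"]


def meets_conditions(user, conditions, conjunction):
    results = [condition(user) for condition in conditions]
    if conjunction == "or":
        # Only one of the conditions must apply.
        return any(results)
    if conjunction == "and":
        # All of the conditions must apply.
        return all(results)
    raise ValueError("conjunction must be 'or' or 'and'")


analyst = {"groups": {"Analysts"}, "purposes": set()}
conditions = [is_member_of("Analysts"), is_acting_under("Research")]
```

For `analyst`, the combined condition holds with or (the group matches) but not with and (no Research purpose).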


    Row-Level Policies

    Immuta row-level policies compare data values with user metadata at query-time to determine whether or not the querying user should have access to the individual rows of data.

    These policies hide entire rows or objects of data based on the policy being enforced; some of these policies require the data to be tagged as well.

    How do row-level policies work?

    The values contained in one or many columns in the table in question (or a separate joined table) need to be referenced by the policy for its logic to take effect.

    For example, consider the policy below:

    Only show rows where user is a member of a group that matches the value in the column tagged Department.

    The data values (the values in the column tagged Department) are matched against the user attribute (their groups) to determine whether or not rows will be visible to the user accessing the data.

    The policy targets columns tagged Department; this means that this policy can be applied globally across all tables and data platforms that have that tag with this single policy rather than having to build a separate policy for individual tables and columns.
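As a rough illustration of why targeting a tag scales across tables, the Python sketch below applies one group-matching rule to two tables whose Department-tagged columns have different physical names. The `TAGGED_COLUMNS` mapping, the table names, and the sample rows are all hypothetical.

```python
# Hypothetical mapping from table name to {tag: physical column name}.
TAGGED_COLUMNS = {
    "employees": {"Department": "dept"},
    "budgets": {"Department": "owning_department"},
}


def only_show_rows(table_name, rows, user_groups, tag="Department"):
    # Resolve the tag to the physical column for this table, then keep
    # only rows whose value matches one of the user's groups.
    column = TAGGED_COLUMNS[table_name][tag]
    return [row for row in rows if row[column] in user_groups]


employees = [
    {"name": "Ana", "dept": "Finance"},
    {"name": "Bo", "dept": "Legal"},
]
budgets = [
    {"owning_department": "Finance", "amount": 100},
    {"owning_department": "HR", "amount": 50},
]
```

A user in the Finance group sees only the Finance row in each table, even though the column names differ.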

    Masked columns as input for row-level policies


    Public preview: This feature is available to all accounts.

    If a global masking policy applies to a column, you can still use that masked column in a global row-level policy.

    Consider the following policy examples:

    • Masking policy: Mask values in columns tagged Country for everyone except users in group Admin.

    • Row-level policy: Only show rows where user possesses an attribute in OfficeLocation that matches the value in column tagged Country for everyone.

    Both of these policies use the Country tag to restrict access. Therefore, the masking policy and the row-level policy would apply to data source columns with the tag Country for users who are not in the Admin group.

    Limitations

    • This feature is only available for Snowflake and Databricks Unity Catalog integrations.

    • This feature is only supported for global data policies, not local data policies.

    Matching

    These policies match a user attribute with a row/object/file attribute to determine if that row/object/file should be visible. This process uses a direct string match, so the user attribute must exactly match the data attribute for the user to see that row of data.

    For example, to restrict access to insurance claims data to the state in which the user's home office is located, you could build a policy such as this:

    Only show rows where user possesses an attribute in Office Location that matches the value in the column State for everyone except when user is a member of group Legal.

    In this case, the Office Location is retrieved by the identity management system as a user attribute or group. If the user's attribute (Office Location) is Missouri, rows containing the value Missouri in the State column in the data source would be the only rows visible to that user.
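The insurance claims policy can be sketched in Python as follows. This is a simplified model, not Immuta's enforcement logic: the function name and the shapes of `user_attributes` and `user_groups` are assumptions. It does reflect the exact string match described above, so a value of `missouri` would not match `Missouri`.

```python
def filter_claims(rows, user_attributes, user_groups):
    # Exception: members of the Legal group see every row.
    if "Legal" in user_groups:
        return list(rows)
    # Everyone else sees rows whose State exactly matches one of their
    # Office Location attribute values (direct string comparison).
    office_locations = set(user_attributes.get("Office Location", []))
    return [row for row in rows if row["State"] in office_locations]


claims = [
    {"claim_id": 1, "State": "Missouri"},
    {"claim_id": 2, "State": "Kansas"},
]
```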

    Minimization

    These policies return a limited percentage of the data. The sample is selected at query time, but it is the same sample for all users. For example, you could limit certain users to only 10% of the data. Immuta uses a hashing policy to return approximately 10% of the data, and the data returned will always be the same; however, the exact number of rows exposed depends on the distribution of high cardinality columns in the database and the hashing type available. Additionally, Immuta will adjust the data exposed when rows are added or removed.


    Best practice: row count

    Immuta recommends you use a table with over 1,000 rows for the best results when using a data minimization policy.
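One way to picture hash-based minimization is the Python sketch below. It hashes a high cardinality key and keeps rows whose hash falls into the first N of 100 buckets. The choice of SHA-256, the 100-bucket scheme, and the function names are assumptions for illustration and are not taken from Immuta.

```python
import hashlib


def in_sample(key, percent):
    # Hash the key and map it to one of 100 buckets. The same key always
    # lands in the same bucket, so the sample is stable across queries.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def minimize(rows, key_column, percent):
    # Keep only the rows whose key falls inside the sampled buckets.
    return [row for row in rows if in_sample(row[key_column], percent)]


rows = [{"id": i} for i in range(10000)]
sample = minimize(rows, "id", 10)
```

Because the sample depends only on the hashed values, every user gets the same rows, and the share returned is approximately, not exactly, 10%. With very few rows the share can deviate noticeably, which is consistent with the row count recommendation above.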

    Time-based restrictions

    These policies restrict access to rows/objects/files that fall within the time restrictions set in the policy. If data sources have time-based restriction policies applied to them, queries run against the data sources will only return rows/blobs with dates in their event-time column/attribute from within a certain range.

    The time window is based on the event time you select when creating the data source. This value will come from a date/time column in relational sources.
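A time-based restriction can be sketched as a filter on the event-time column. The example below assumes the range is "the last N days relative to now", which is only one possible form of time window. The function name, parameters, and sample rows are hypothetical.

```python
from datetime import datetime, timedelta


def only_show_recent(rows, event_time_column, window, now):
    # Keep rows whose event time falls inside [now - window, now].
    oldest_allowed = now - window
    return [
        row for row in rows
        if oldest_allowed <= row[event_time_column] <= now
    ]


now = datetime(2024, 6, 30, 12, 0)
events = [
    {"id": 1, "event_time": datetime(2024, 6, 29, 9, 0)},
    {"id": 2, "event_time": datetime(2024, 5, 1, 9, 0)},
]
recent = only_show_recent(events, "event_time", timedelta(days=7), now)
```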

    WHERE clause policy

    This policy can be thought of as a table "view" created automatically for the user based on the condition of the policy. For example, in the policy below, users who are not members of the Admins group will only see taxi rides where passenger_count < 2.

    Only show rows where public.us.taxis.passenger_count < 2 for everyone except when user is a member of group Admins.

    You can put any valid SQL WHERE clause in the policy. See the Custom WHERE clause functions guide for a list of custom functions.


    WHERE clause policy requirement

    All columns referenced in the policy must have fully qualified names. Any column names that are unqualified (just the column name) will default to a column of the data source the policy is being applied to (if one matches the name).
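The "view" analogy can be demonstrated with an in-memory SQLite database in Python. This is a sketch of the idea only: the table layout, the sample rides, and the `query_taxis` helper are hypothetical, and Immuta does not work by concatenating SQL strings like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxis (ride_id INTEGER, passenger_count INTEGER)")
conn.executemany(
    "INSERT INTO taxis VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 4), (4, 1)],
)


def query_taxis(conn, user_groups):
    sql = "SELECT ride_id, passenger_count FROM taxis"
    # Members of Admins are the exception; everyone else gets the
    # policy's WHERE clause appended, with a qualified column name.
    if "Admins" not in user_groups:
        sql += " WHERE taxis.passenger_count < 2"
    return conn.execute(sql + " ORDER BY ride_id").fetchall()
```

A user outside Admins gets only the single-passenger rides, while an Admins member gets every row.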

    Custom functions

    It is also possible to use custom functions in custom WHERE row-level policies for more complex use cases.

    These functions wrap Immuta context into free-form SQL logic for the row-level policy. That context, which is injected into the SQL at runtime, can include the attributes (@attributeValuesContains()) or groups (@groupsContains()) possessed by the user, or the username (@username).

    Avoid referencing explicit column names in custom functions and instead use the @columnTagged('tag name') function in SQL. In doing so, you can avoid having to reference the physical database world with the custom SQL policies and instead continue to target the metadata/tag world.


    Avoid using columns masked using randomized response

    When building row-level policies with custom SQL statements, avoid using a column that is masked using randomized response in the SQL statement, as this can lead to different behavior depending on whether you’re using the Spark or Snowflake integration and may produce results that are unexpected.


    Author a Row-Level Policy

    1. Determine your policy scope:

      • Global policy: Click the Policies icon in the navigation menu and select the Data Policies tab. Click New data policy and complete the Policy name field.

      • Local policy: Navigate to a specific data source and click the Policies tab. Scroll to the Data Policies section and click New Policy.

    2. Select the Only show rows action from the first dropdown.

    3. Choose one of the following policy conditions:

      • Where user

        1. Choose the condition that will drive the policy from the next dropdown: is a member of a group or possesses an attribute.

    4. Choose for everyone, for everyone except, or for everyone who to drive the policy. If you choose for everyone except, use the subsequent dropdown to choose the group, purpose, or attribute for your condition. If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

    5. AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and explanation of the policy behavior on various users using mock data.

    6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    7. For global policies: Click the dropdown menu beneath Where should this policy be applied, and select On all data sources, On data sources, or When selected by data owners. If you select On data sources, finish the condition in one of the following ways:

      • tagged: Select this option and then search for tags in the subsequent dropdown menu.

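      • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

      • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

      • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

      • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.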

    8. Click Create Policy. If creating a global policy, you then need to click Activate Policy or Stage Policy.

    Orchestrated Masking Policies


    Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

    Orchestrated masking policies (OMP) reduce conflicts between masking policies that apply to a single column, allowing policies to scale more effectively across your organization. Furthermore, OMP fosters distributed data stewardship, empowering policy authors who share responsibility of a data set to protect it while allowing data consumers acting under various roles or purposes to access the data.

    When multiple masking policies apply to a column, Immuta combines the exception conditions of those masking policies so that data subscribers can access the data when they satisfy one of those exception conditions. Multiple masking policies will be enforced on a column if the following conditions are true:

    • Policies use the same masking type.

    • Policies use the for everyone except condition.

    Requirements

    Databricks Spark or Starburst (Trino) integration

    Supported masking policy types

    OMP supports the following masking types:

    • Constant

    • Hashing

    • Format preserving masking

    • Null

    • Regex

    • Rounding

    Global policy logic

    Previous policy logic

    Governors can apply policies to all columns in a data source or target specific columns with tags or a regular expression. Without orchestrated masking policies enabled, when multiple global policies apply to the same columns, Immuta could only apply one of those policies.

    Consider the following example to examine how policies behaved when one tag is used in two different policies:

    • Mask PII Global Policy 1: Mask using hashing the value in columns tagged email except when user is acting under the purpose Email Campaign.

    • Mask PII Global Policy 2: Mask using hashing the value in columns tagged email except when user is acting under the purpose Marketing.

    For columns tagged email, only one of these policies is enforced. The Mask PII Global Policy 2 is not applied to the data source, so Immuta is not enforcing the masking policy properly for users who should be able to see emails because they are acting under the Marketing purpose.

    Consider the following example where multiple masking policies apply to columns that have multiple tags, resulting in one policy applying:

    • Global Policy 3: Mask using hashing the value in columns tagged Employee Data unless users are acting under the purpose Retention Analysis.

    • Global Policy 4: Mask using hashing the value in columns tagged HR Data unless users are acting under the purpose Employee Satisfaction Survey.

    If a column is tagged Employee Data and HR Data, Immuta will only apply one of the policies.

    Orchestrated masking policy logic

    With orchestrated masking policies, Immuta applies multiple global masking policies that apply to a single column by combining the policy exceptions with OR. For these policies to combine, the masking type must be identical and the policy must use the for everyone except condition.

    Consider the following example, in which both policies will apply to the data source:

    • Mask PII Global Policy 1: Mask using hashing the value in columns tagged email except when user is acting under the purpose Email Campaign.

    • Mask PII Global Policy 2: Mask using hashing the value in columns tagged email except when user is acting under the purpose Marketing.

    Users acting under the purpose Marketing or Email Campaign will be able to see emails in the clear. However, in the following example, only one of these policies will apply to the data source because one masks using a constant and the other masks using hashing:

    • Global Policy 5: Mask using the constant REDACTED the value in columns tagged Employee Data unless users are acting under the purpose Retention Analysis.

    • Global Policy 6: Mask using hashing the value in columns tagged HR Data unless users are acting under the purpose Employee Satisfaction Survey.
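The combination rule can be sketched in Python as follows. This is a simplified model under stated assumptions: the `MaskingPolicy` class and function names are hypothetical, every policy is assumed to use the for everyone except condition with a single purpose, and when masking types differ the sketch simply keeps the first applicable policy, because the text does not say which one Immuta picks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    # Hypothetical model of a "mask ... for everyone except" policy.
    name: str
    tag: str
    masking_type: str
    exception_purpose: str


def effective_policy(policies, column_tags):
    # Collect the policies whose tag is present on the column.
    applicable = [p for p in policies if p.tag in column_tags]
    if not applicable:
        return None
    first = applicable[0]
    if all(p.masking_type == first.masking_type for p in applicable):
        # Identical masking types: combine the exceptions with OR.
        return first.masking_type, {p.exception_purpose for p in applicable}
    # Different masking types: only one policy applies (assumed: the first).
    return first.masking_type, {first.exception_purpose}


def sees_clear_value(effective, acting_purpose):
    # A user sees the unmasked value if no policy applies or if their
    # purpose matches any of the combined exceptions.
    return effective is None or acting_purpose in effective[1]


policy_1 = MaskingPolicy("Mask PII Global Policy 1", "email", "hashing", "Email Campaign")
policy_2 = MaskingPolicy("Mask PII Global Policy 2", "email", "hashing", "Marketing")
policy_5 = MaskingPolicy("Global Policy 5", "Employee Data", "constant", "Retention Analysis")
policy_6 = MaskingPolicy("Global Policy 6", "HR Data", "hashing", "Employee Satisfaction Survey")

email_result = effective_policy([policy_1, policy_2], {"email"})
mixed_result = effective_policy([policy_5, policy_6], {"Employee Data", "HR Data"})
```

In this model, `email_result` carries both exceptions, so a user acting under either Marketing or Email Campaign sees emails in the clear, while `mixed_result` keeps only one policy's exception.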

    Limitations

    • No UI enhancements were made in this release. Multiple masking policies applied to the same column are visible on a data source, but there is no indication that the exceptions are combined with OR.

    • Masking types must match exactly for the policies to be combined. For example, both policies must mask using rounding.

    • Existing policies will not automatically migrate to the new policy logic when you enable the feature. To re-compute existing policies with the new logic, you must manually trigger global policy changes by staging and re-enabling each policy.

    Lookup Tables

    If user metadata is stored in a table in the same data platform where a policy is enforced, it is not necessary to move that user metadata into Immuta. Instead, it can be referenced directly using custom WHERE functions in data policies.

    Below is an example row-level policy that leverages a lookup table to dynamically drive access to rows in a table:

    CREDIT_CARD_NUMBER | TRANSACTION_LOCATION | TRANSACTION_TIME | ACCESS_LEVEL
    0123456789         | Lewes, DE            | 00:07:34         | 4
    9876543210         | College Park, MD     | 09:16:08         | 8


  • Use the next field to choose the attribute, group, or purpose that you will match values against.

  • Use the next dropdown menu to choose the tag that will drive this policy. You can add more than one condition by selecting + Add Another Condition. The dropdown menu then contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  • Where the value in the column tagged

    1. Select the tag from the next dropdown menu.

    2. From the subsequent dropdown, choose is or is not in the list, and then enter a list of comma-separated values.

  • Where: Enter a valid SQL WHERE clause in the subsequent field. When you place your cursor in this field, a tooltip details valid input and the column names of your data source. See Custom WHERE Clause Functions for more information about specific functions.

  • Never

    The never condition blocks all access to the data source.

    1. Choose the condition that will drive the policy from the next dropdown: for everyone, for everyone except, or for everyone who.

    2. Select the condition that will further define the policy: is a member of group, is acting under a purpose, or possesses attribute.

    3. Use the next field to choose the group, purpose, or attribute that you will match values against.



    The final column in the table, ACCESS_LEVEL, defines who can see that row of data.

    Now consider the following hierarchy:

    In this diagram, there are 11 different access levels (AL) to data and the tree defines access. For example, if a user has Vegetables, they get access levels 2, 3, 4, 9, 10, and 11. If a user has Pear, they only get access level 8. In other words, a user with Vegetables would see the first row of the above table, a user with Pear would see the second row of the above table, and a user with Food would see both rows of the table.

    Taking the example further, that hierarchy tree is represented as a table in the data platform that we wish to use to drive the row-level policy:

    ACCESS_LEVEL | ROOT
    1 | Food
    2 | Food
    3 | Food
    4 | Food
    5 | Food
    6 | Food

    That hierarchy lookup table can be referenced in the row-level policy as user metadata like this:

    @columnTagged('access_level') IN (SELECT ACCESS_LEVEL from [lookup table] where @attributeValuesContains('user_level', 'ROOT'))

    Walking through the policy step-by-step:

    • @columnTagged('access_level'): This allows us to target multiple tables with an ACCESS_LEVEL column that needs protecting with a single policy. Simply tag all the ACCESS_LEVEL columns with the access_level tag and this policy would apply to all of them.

    • IN (SELECT ACCESS_LEVEL from [lookup table]: This is selecting the matching ACCESS_LEVEL from the lookup table to use as the IN clause for filtering the actual business table.

    • where @attributeValuesContains('user_level', 'ROOT'): This is comparing the user's attribute user_level to the value in the ROOT column, and if there's a match, that ACCESS_LEVEL is used for filtering in the previous step. See the documentation for more details on these functions.

    So, you can then add metadata to your users in Immuta, such as Vegetables or Pear and that will result in them seeing the appropriate rows in the business table in question.
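The lookup described above can be sketched in plain Python. This is only an illustration of the filtering logic, not how Immuta evaluates the policy: the dictionary mirrors the ACCESS_LEVEL/ROOT lookup table from this page, while the function names, the `ROWS` sample, and the `id` values are hypothetical.

```python
# Hypothetical in-memory copy of the hierarchy lookup table: ROOT -> ACCESS_LEVEL values.
LOOKUP = {
    "Food": set(range(1, 12)),
    "Vegetables": {2, 3, 4, 9, 10, 11},
    "Fruits": {5, 6, 7, 8},
    "Carrots": {4},
    "Leafy": {9, 10, 11},
    "Orange": {5, 6, 7},
    "Pear": {8},
    "Lettuce": {10, 11},
}


def allowed_levels(user_levels):
    """Mimic the subquery: collect every ACCESS_LEVEL whose ROOT matches a user_level value."""
    levels = set()
    for root in user_levels:
        levels |= LOOKUP.get(root, set())
    return levels


def visible_rows(rows, user_levels):
    """Mimic the IN clause: keep rows whose ACCESS_LEVEL is allowed for the user."""
    allowed = allowed_levels(user_levels)
    return [row for row in rows if row["ACCESS_LEVEL"] in allowed]


# Two business-table rows with access levels 4 and 8 (ids are placeholders).
ROWS = [{"id": "row-1", "ACCESS_LEVEL": 4}, {"id": "row-2", "ACCESS_LEVEL": 8}]

if __name__ == "__main__":
    print(sorted(allowed_levels(["Vegetables"])))
    print(visible_rows(ROWS, ["Pear"]))
```

A user whose user_level attribute is Food resolves to all eleven access levels and therefore sees both rows.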

    The above example used a row-level policy, but it could instead do cell masking using the same technique:

    Mask columns tagged Credit Card Number using hashing where @columnTagged('access_level') NOT IN (SELECT ACCESS_LEVEL from [lookup table] where @attributeValuesContains('user_level', 'ROOT'))

    In this case, the credit card number will be masked if the access_level is not found for the user for that row.

    Even if not using a lookup table, the power of the @columnTagged('tag name') function is apparent for applying your masking or row-level policies at scale.

    0123456789

    Lewes, DE

    00:07:34


    4

    Custom WHERE Clause Functions

    The policy builder allows you to use custom functions that reference important Immuta metadata from within your WHERE clause. These custom functions are utilities that help you create policies more easily. Using the policy builder, you can include these functions in your masking or row-level policies by choosing where in the sub-action menu or using the custom function in the masking type dropdown menu.

    The @attributeValuesContains() function

    This function returns true for a given row if the provided column evaluates to an attribute value for which the querying user has a corresponding attribute value. This function requires two arguments and accepts no more than three arguments.

    Parameters

    Parameter
    Description
    Required or optional

    The @columnReference() function

    This function must be used in custom WHERE policies that reference a column name in the data source being protected. For example, to only show rows that have the value US in the Location column, you would create the following policy:

    Only show rows where @columnReference('Location')='US' for everyone.

    When using this function, match the casing of the column name you specify with the column name in the remote platform and escape the following characters with a backslash: \, ', ", ` .

    This function should not be used to reference column names in external lookup tables. Instead, use the fully-qualified column name for external lookup tables.

    Parameter

    Parameter
    Description
    Required or optional

    The @columnTagged() function

    This function returns the column name with the specified tag.

    If this function is used in a global policy and the tag doesn't exist on a data source, the policy will not be applied.

    Parameters

    Parameter
    Description
    Required or optional

    The @groupsContains() function

    This function returns true for a given row if the provided column evaluates to a group to which the querying user belongs. This function requires at least one argument.

    Parameters

    Parameter
    Description
    Required or optional

    The @hasAttribute() function

    This function returns a boolean indicating if the current user has the specified attribute name and value combination. If the specified attribute name or attribute value has a single quote, you will need to escape it using a \'\' expression within a custom WHERE policy.

    Parameters

    Parameter
    Description
    Required or optional

    The @isInGroups() function

    This function returns a boolean indicating if the current user is a member of all of the specified groups. If any of the specified groups has a single quote, you will need to escape it using a \'\' expression within a custom WHERE policy.

    Parameter

    Parameter
    Description
    Required or optional

    The @isUsingPurpose() function

    This function returns a boolean indicating if the current user is using the specified purpose. If the specified purpose has a single quote, you will need to escape it using a \'\' expression within a custom WHERE policy.

    Parameter

    Parameter
    Description
    Required or optional

    The @purposesContains() function

    This function returns true for a given row if the provided column evaluates to a purpose under which the querying user is currently acting. This function requires at least one argument and accepts no more than two arguments.

    Parameters

    Parameter
    Description
    Required or optional

    The @username function

    This function returns the current user's username.

    Parameters

    None.
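As a rough mental model of the custom functions described above, the sketch below expresses their documented return values as Python predicates. It is not Immuta's implementation: the `User` class, its fields, and the snake_case function names are hypothetical, and the optional placeholder arguments are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical stand-in for the querying user's Immuta metadata."""
    username: str
    groups: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)  # attribute name -> set of values
    purposes: set = field(default_factory=set)      # purposes the user is acting under


def has_attribute(user, name, value):
    """@hasAttribute: the user has this attribute name and value combination."""
    return value in user.attributes.get(name, set())


def is_in_groups(user, *groups):
    """@isInGroups: the user is a member of all of the specified groups."""
    return set(groups) <= user.groups


def is_using_purpose(user, purpose):
    """@isUsingPurpose: the user is using the specified purpose."""
    return purpose in user.purposes


def attribute_values_contains(user, name, column_value):
    """@attributeValuesContains: the row's column value is one of the user's attribute values."""
    return column_value in user.attributes.get(name, set())


def groups_contains(user, column_value):
    """@groupsContains: the row's column value is a group the user belongs to."""
    return column_value in user.groups


def purposes_contains(user, column_value):
    """@purposesContains: the row's column value is a purpose the user is acting under."""
    return column_value in user.purposes
```

The first three functions depend only on the user, while the `*_contains` functions are evaluated per row because they compare a column value with the user's metadata.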

    ACCESS_LEVEL | ROOT
    7 | Food
    8 | Food
    9 | Food
    10 | Food
    11 | Food
    2 | Vegetables
    3 | Vegetables
    4 | Vegetables
    9 | Vegetables
    10 | Vegetables
    11 | Vegetables
    5 | Fruits
    6 | Fruits
    7 | Fruits
    8 | Fruits
    4 | Carrots
    9 | Leafy
    10 | Leafy
    11 | Leafy
    5 | Orange
    6 | Orange
    7 | Orange
    8 | Pear
    10 | Lettuce
    11 | Lettuce


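    Parameters for @attributeValuesContains():

    Parameter | Description | Required or optional
    Attribute name string | The name of the attribute to retrieve values for. | Required
    @columnReference('Column_Name') string | The column that contains the value to match the attribute key against. | Required
    Placeholder string | A placeholder in case the list of values is empty. | Optional

    Parameter for @columnReference():

    Parameter | Description | Required or optional
    Column name string | The name of the column to use in the policy. | Required

    Parameter for @columnTagged():

    Parameter | Description | Required or optional
    Tag name string | The name of the tag. | Required

    Parameters for @groupsContains():

    Parameter | Description | Required or optional
    @columnReference('Column_Name') string | The column that contains the value to match the group against. | Required
    Placeholder string | A placeholder in case the list of values is empty. | Optional

    Parameters for @hasAttribute():

    Parameter | Description | Required or optional
    Attribute name string | The name of the attribute. | Required
    Attribute value string | The value to correspond with the attribute name. | Required

    Parameter for @isInGroups():

    Parameter | Description | Required or optional
    Group names array[string] | A list of group names. For example, groups('group_a', 'group_b', 'group_c'). | Required

    Parameter for @isUsingPurpose():

    Parameter | Description | Required or optional
    Purpose string | The name of the purpose to check the user against. | Required

    Parameters for @purposesContains():

    Parameter | Description | Required or optional
    @columnReference('Column_Name') string | The column that contains the value to match the purpose against. | Required
    Placeholder string | A placeholder in case the list of values is empty. | Optional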

    Only show rows where @columnReference('Location')='US' for everyone.

    Data Policies Reference Guide

    Data policies manage what users see when they query data in a table they are subscribed to.

    There are three different ways to restrict data access with data policies:

    • Row-level: Filter rows from certain users at query time.

    • Column masking: Mask values in a column at query time.

    • Cell masking: Mask specific cells in a column based on separate values in the same row at query time.

    For all data policies, you establish the conditions for which they will be enforced:

    • If the user is a member of a group (or several groups)

    • If the user possesses an attribute (or several attributes)

    • If the user is acting under a purpose (or several purposes) for which the data is allowed to be used

    Immuta allows you to append multiple conditions to the data, and these conditions can be directed as exclusionary or inclusionary, depending on the policy that's being enforced:

    • Exclusionary condition example: Mask using hashing values in columns tagged PII on all data sources for everyone except users in the group AUDIT.

    • Inclusionary condition example: Only show rows where user is a member of a group that matches the value in the column tagged Department.

    For all policies except purpose-based restriction policies, inclusionary logic allows governors to vary policy actions with an otherwise clause. For example, governors could mask values using hashing for users acting under a specified purpose while masking those same values by making null for everyone else who accesses the data.
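The otherwise clause in the example above can be pictured as a simple branch. The sketch below is illustrative only: the function name and the purpose name "Fraud Analysis" are hypothetical, and plain SHA-256 stands in for Immuta's hashing mask.

```python
import hashlib


def mask_value(value, acting_purposes, hashed_purpose="Fraud Analysis"):
    """Hash for users acting under the purpose; otherwise make the value null."""
    if hashed_purpose in acting_purposes:
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    return None  # the "otherwise" action: mask by making null


if __name__ == "__main__":
    print(mask_value("555-12-3456", {"Fraud Analysis"}))
    print(mask_value("555-12-3456", set()))
```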

    Data policy support matrix

    The table below outlines the types of data policies supported for various data platforms. If a data platform isn't included in the table, that integration does not support any data policies.

    Details about each of the data policy types are included in their linked reference guides.

    Amazon Redshift
    Amazon Redshift Spectrum
    Azure Synapse Analytics
    Databricks Spark
    Databricks Unity Catalog
    Google BigQuery
    Snowflake
    Starburst (Trino)
    Teradata

    Policy behavior: conflicts, fallbacks, and lockout

    Masking policy conflicts

    In some cases, two conflicting global masking policies apply to a single data source. When this happens, the policy containing a tag deeper in the hierarchy will apply to the data source to resolve the conflict.

    Consider the following global data policies created by a data governor:

    Data policy 1:

    Mask columns tagged PII by making null for everyone on data sources with columns tagged PII

    Data policy 2:

    Mask columns tagged PII.SSN using hashing for everyone on data sources with columns tagged PII.SSN

    If a data owner creates a data source and applies the PII.SSN tag to a column, both of these global masking policies will apply to the column with that tag. Instead of having a conflict, the policy containing a deeper tag in the hierarchy will apply.

    In this example, data policy 2 will be applied to the data source because PII.SSN is deeper and thus considered more specific than PII. If data owners wanted to use data policy 1 on the data source instead, they would need to disable data policy 2.

    Should two or more masking policies target the same column and have the same hierarchy depth, the policy that was authored first will win out. This is a conservative approach that avoids the original policy being changed unexpectedly.
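The two conflict rules above (deeper tag wins, then the earliest-authored policy wins) can be sketched as a sort key. This is an illustration under stated assumptions: the `MaskingPolicy` class and its `created_order` field are hypothetical, and tag depth is taken to be the number of dot-separated segments, as in PII.SSN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    name: str
    tag: str            # tag targeted by the policy, e.g. "PII.SSN"
    created_order: int  # hypothetical: lower means authored earlier


def tag_depth(tag):
    """Depth in the tag hierarchy: "PII" is 1, "PII.SSN" is 2."""
    return len(tag.split("."))


def winning_policy(policies):
    """Pick the policy with the deepest tag; break ties by earliest authorship."""
    return min(policies, key=lambda p: (-tag_depth(p.tag), p.created_order))


if __name__ == "__main__":
    policy_1 = MaskingPolicy("Null PII", "PII", created_order=1)
    policy_2 = MaskingPolicy("Hash SSN", "PII.SSN", created_order=2)
    print(winning_policy([policy_1, policy_2]).name)
```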

    Row-level policy conflicts

    Similar to masking policies, it is possible for two or more row-level policies to target the same table. When this occurs, all of the row-level policies are applied and combined with AND, so a user must satisfy every policy to see any rows in the table.

    To combine separate row-level conditions with OR instead, build them into a single Immuta policy joined by an OR.
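The difference between the two combinations can be sketched with ordinary predicates. The policies, column names, and user fields below are hypothetical and exist only to show the AND versus OR behavior.

```python
def department_policy(row, user):
    """Hypothetical row-level policy: the row's department must be one of the user's groups."""
    return row["department"] in user["groups"]


def region_policy(row, user):
    """Hypothetical row-level policy: the row's region must be one of the user's regions."""
    return row["region"] in user["regions"]


def row_visible(row, user, policies):
    """Separate row-level policies on one table are combined with AND."""
    return all(policy(row, user) for policy in policies)


def either_policy(row, user):
    """A single policy that joins both conditions with OR."""
    return department_policy(row, user) or region_policy(row, user)


if __name__ == "__main__":
    row = {"department": "HR", "region": "EU"}
    user = {"groups": {"HR"}, "regions": {"US"}}
    print(row_visible(row, user, [department_policy, region_policy]))
    print(row_visible(row, user, [either_policy]))
```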

    Masking policy intelligent fallbacks

    When masking columns, the type of the column matters. For example, it is not possible to hash a numeric column, because the hash would render the number as a string.

    Many data platforms make the user account for this by building separate data policies for every column type that could exist now or in the future, which is quite onerous.

    Instead, Immuta has intelligent fallbacks. An intelligent fallback occurs when a masking type targets a column type that is incompatible with the masking type. In this case, Immuta will fall back to the most appropriate masking type that provides at least the level of privacy required by the original masking type.

    For example, if a hashing masking type hits a numeric type, it would intelligently fall back to nulling the column instead, since nulls are allowed in numeric types.
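A minimal sketch of the fallback decision, covering only the one case this page documents (hashing on a numeric column falls back to null). The type names and the function are hypothetical, and other fallback rules are not modeled.

```python
# Hypothetical set of numeric column type names.
NUMERIC_TYPES = {"integer", "decimal", "float"}


def effective_masking(masking_type, column_type):
    """Return the masking type that would actually be applied to the column."""
    if masking_type == "hashing" and column_type in NUMERIC_TYPES:
        # A hash is a string, so it cannot be stored in a numeric column; fall back to null.
        return "null"
    return masking_type


if __name__ == "__main__":
    print(effective_masking("hashing", "integer"))
    print(effective_masking("hashing", "string"))
```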

    Lockout policies

    Sometimes a global data policy will target a table and the policy cannot be applied as written. This can happen for several reasons, but the most common is that the row-level policy logic is not relevant to the table in question.

    For example, with the following policy

    @attributeValuesContains('Attribute Name', 'SOME_COLUMN')

    If SOME_COLUMN does not exist in the table, the row-level policy will not work (this is why it is always recommended to use the @columnTagged('tag name') function instead of hard coding column names).

    In the case where an error such as this occurs with a global data policy, the lockout policy will kick in. The lockout policy is a row-level policy that blocks any rows from returning for any users. Since Immuta does not know how to apply the policy, the lockout policy avoids data leaks until the policy is edited to work correctly.
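The lockout behavior can be sketched as a guard around the row filter. This is a simplified illustration, not Immuta's enforcement code: the function and argument names are hypothetical.

```python
def build_row_filter(policy_column, table_columns):
    """Return a row predicate, or a lockout predicate if the policy cannot be applied."""
    if policy_column not in table_columns:
        # Lockout: the policy references a column the table lacks, so no rows are returned.
        return lambda row, user_values: False
    return lambda row, user_values: row[policy_column] in user_values


if __name__ == "__main__":
    rows = [{"DEPARTMENT": "HR"}, {"DEPARTMENT": "IT"}]
    locked = build_row_filter("SOME_COLUMN", {"DEPARTMENT"})
    print([r for r in rows if locked(r, {"HR"})])
```

Returning no rows is the safe default here: a policy that cannot be evaluated never results in more access than intended.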

    New column added data policy

    This templated policy pairs with schema monitoring to mask newly added columns to data sources until data owners review and approve these changes from the requests tab of their profile page.

    When this policy is activated by a governor, it will automatically be enforced on data sources that have the New tag applied to them.

    To learn how to activate this policy, see the Clone, activate, or stage a global policy how-to guide.

    Custom data policy certifications

    When building a global data policy, governors can create custom certifications, which must then be acknowledged by data owners when the policy is applied to data sources.

    For example, data governors could add a custom certification that states that data owners must verify that tags have been added correctly to their data sources before certifying the policy.

    When a global data policy with a custom certification is cloned, the certification is also cloned. If the user who clones the policy and custom certification is not a governor, the policy will only be applied to data sources that user owns.

    Policy explainer


    This feature is currently only supported for global data policies.

    The Policy explainer generates a summary of how a global data policy will affect users' access to a data source, which allows policy authors to verify the behavior of the policy before activating it.

    To generate the policy summary, the Policy explainer sends the policy definition in JSON to AWS Bedrock. Then, it crafts an example table and scenarios based on mock data to simulate complex, real-world access decisions.

    For example, if a user authored the policy

    Mask columns tagged email using hashing for everyone except when user is a member of group Marketing.

    and clicked Explain policy, the following policy definition would be sent to AWS Bedrock:

    Policy definition JSON example

    Then, the Policy explainer would generate a policy summary for the user.

    Policy summary

    The policy summary illustrates how a policy will affect various potential data consumers so that policy authors can see the policy's effect from different perspectives. The policy summary comprises distinct sections:

    1. A brief explanation of the policy's intended behavior.

    2. A sample table with mock data and columns.

    3. A description of what happens when users with contrasting entitlements query that table.

    Below is an example of a policy summary created by the Policy explainer for the following policy:

    Mask columns tagged email unless user is a member of group Marketing.

    Policy summary example

    This policy hashes the values in any column tagged email, so email addresses are replaced with a consistent cryptographic hash for most users. Members of the Marketing group are exempt, and they see the original email addresses. Here is an example of how this policy would affect users querying the same data:

    1. The Data Source

    Imagine a table named Customer_Details that has the following data, where the email

    Policy changes and activation

    Policy authors dictate all changes to and activation of policies; the Policy explainer does not activate, deactivate, or change the content of a policy. The diagram below illustrates this delineation between the Policy explainer and the policy author:

    However, if the Policy explainer were to misrepresent the behavior of a policy, the user could activate policy changes that behave counter to what they intended. Therefore, policy authors should verify the accuracy of the explanation before activating the policy.

    Data protection

    The Policy explainer does not query or send any of your actual data. The only data sent to AWS Bedrock is the policy definition (configured by the user) in JSON.

    For details about data protection with the Policy explainer, see Immuta's AI features page.

    Audit

    The following events related to policy certification are audited and can be found on the audit page in the UI:

    • DatasourcePolicyCertificationExpired: The global policy certification on a data source is expired.

    • DatasourcePolicyCertified: A global policy is certified for a data source.

    • DatasourcePolicyDecertified: A global policy is decertified for a data source.

    [Table: data policy support matrix, with one row per policy type (including "Custom functions for masking or row-level policies" and "Replace with NULL or constant") and one column per data platform listed above. Each cell is ✅ (supported), ❌ (not supported), or "Supported with caveats".]

    column is tagged with email:

    customer_id | name | email
    101 | Alice | [email protected]
    102 | Bob | [email protected]
    103 | Carol | [email protected]

    2. User Scenarios

    The output of a query (SELECT * FROM Customer_Details;) will vary based on the querying user's group membership:

    Scenario A: User is NOT in the Marketing group

    User: User A (Member of the Analysts group)

    customer_id | name | email
    101 | Alice | a3f2b8c9d4e5f1a2b3c4d5e6
    102 | Bob | 7b9e4f1a2c8d9e0f1a2b3c4d
    103 | Carol | 8e5f2d1c9b7a6e3f4a5b6c7d

    User A is not in the Marketing group, so the policy is enforced. The email column is replaced with a cryptographic hash value, hiding the actual addresses while still allowing joins or counts on the hashed values.

    Scenario B: User IS in the Marketing group

    User: User B (Member of the Marketing group)

    customer_id | name | email
    101 | Alice | [email protected]
    102 | Bob | [email protected]
    103 | Carol | [email protected]

    User B belongs to the Marketing group, so the exception is triggered. The policy is not enforced for them, and they see the original email addresses in clear text.

    Cell-level masking

    ❌

    ✅

    ✅

    The Policy explainer sends the policy definition in JSON to AWS Bedrock and returns a summary to the user, while the user is responsible for managing the actual policies that enforce access controls.

    ✅

    {
      "type": "masking",
      "extra": {
        "filteredDictionary": [],
        "currentTags": [
          {
            "name": "email",
            "displayName": "email",
            "hasLeafNodes": false,
            "source": "curated"
          }
        ]
      },
      "exceptions": {
        "operator": "and",
        "conditions": [
          {
            "type": "groups",
            "group": {
              "name": "Managers"
            }
          }
        ]
      },
      "config": {
        "fields": [
          {
            "name": "email",
            "displayName": "email",
            "hasLeafNodes": false,
            "source": "curated"
          }
        ],
        "maskingConfig": {
          "type": "Consistent Value",
          "metadata": {}
        }
      }
    }
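    Policy types covered by the data policy support matrix: Custom functions for masking or row-level policies, Format preserving masking, Hashing, Limit to purpose, Masking fields within STRUCT columns, Matching, Minimize, Only show data by time, Randomized response, Regex, Replace with NULL or constant, Reversible masking, Rounding, WHERE clause.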

    Masking Policies

    Masking policies hide values in data, providing various levels of utility while still preserving privacy. Immuta offers column masking and cell-level masking.

    As with all Immuta policy types, use global policies when authoring masking policies to manage policies at scale. When using global policies, tagging your data with metadata becomes critical and is described in detail in the Compliantly open more sensitive data for ML and analytics use case.

    The masking options described on this page can be implemented in a variety of use cases, and there are several different approaches for masking data that allow you to make tradeoffs between privacy (how far you go with masking) and utility (how much you want the masked data to be useful to the data consumer). Use the table below to determine the circumstance under which a function should be used.

    Nulling
    Constant
    Regex
    Hashing
    Reversible masking
    Format preserving masking
    Randomized response
    Rounding
    Custom function

    Masking policy support by integration

    Since global policies can apply masking policies across multiple different databases at once, if an unsupported masking policy is applied to a column, Immuta will revert to NULLing that column. See the data policy support matrix for an outline of masking policies supported by each integration.

    Masking types

    Constant

    Masking with a constant replaces any value in a column with a specified value. For example, you can replace the values in a column with the constant Redacted. The underlying data will appear to be a constant, removing any utility of that data.

    Apply this policy to strings that require a specific repeated value.

    Custom function

    This option uses SQL functions native to the underlying database to transform the values in a column. This can be used in numerous use cases, but notional examples include top-coding to some upper limit, a custom hash function, and string manipulation.

    Single quotes enclosing the regex and escaping special characters are required. The following example masks telephone numbers variably depending on the presence of a dash (implying a prefix), space, or only digits:

    The image below illustrates authoring a global policy using this custom function:

    Limitations

    • The masking functions are executed against the remote database directly. A poorly written function could lead to poor quality results, data leaks, and performance hits.

    • Using custom functions can result in changes to the original data type. In order to prevent query errors you must ensure that you cast this result back to the original type.

    • The function must be valid for the data type of the selected column. If it is not

    Format preserving masking


    Support limitation: This policy is only supported in the Snowflake integration.

    Format preserving masking uses a reversible function to mask the data in a way that preserves the underlying structure of a value, so the length and type of a value are maintained. This is appropriate when the masked value should appear in the same format as the underlying value. Examples include social security numbers and credit card numbers, where format preserving masking would return masked values in a format consistent with social security numbers or credit card numbers, respectively.

    There is larger overhead with this masking type, and it should really only be used when format is critically valuable, such as situations when an engineer is building an application where downstream systems validate content. In almost all analytical use cases, format should not matter.

    Hashing

    Hashing masks the values with an irreversible SHA-256 hash, which is consistent for the same value throughout the data source, so you can count or track the specific values, but not know the true raw value.

    This policy type is appropriate for cases where the underlying value is sensitive, but there is a need to segment the population. Such attributes could be addresses, time segments, or countries. It is important to note that hashing is susceptible to inference attacks based on prior knowledge of the population distribution. For example, if state is hashed, and the dataset is a sample across the United States, then an adversary could assume that the most frequently occurring hash value is California. As such, it's most secure to use the hashing mask on attributes that are evenly distributed across a population.

    Hashed values are different across data sources, so you cannot join on hashed values unless you . Immuta prevents joins on hashed values to protect against link attacks where two data owners may have exposed data with the same masked column (a quasi-identifier), but their data combined by that masked value could result in a sensitive data leak.
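The two properties described above (consistent within a data source, different across data sources) can be illustrated with Python's standard `hashlib`. How Immuta makes hashes differ between data sources is not described on this page; the per-source salt below is purely an assumption for illustration, and the function name is hypothetical.

```python
import hashlib


def hash_mask(value, source_salt):
    """Return the SHA-256 hex digest of a value combined with a per-data-source salt (assumed)."""
    return hashlib.sha256((source_salt + value).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same value, same data source: identical hashes, so counting and grouping still work.
    print(hash_mask("California", "source-a") == hash_mask("California", "source-a"))
    # Same value, different data sources: different hashes, so the columns cannot be joined.
    print(hash_mask("California", "source-a") == hash_mask("California", "source-b"))
```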

    NULL

    This masking type replaces the values in the column with NULL, removing any identifiability from the column and all utility of the data.

    Apply this policy to numeric or text attributes that have a high re-identification risk, but little analytic value (names and personal identifiers).

    Randomized response


    Support limitation: This policy is only supported in the Snowflake integration.

    Randomized response masks data by slightly randomizing the values in a column, preserving the utility of the data while preventing outsiders from inferring content of specific records.

    This function randomizes the displayed value to make the true value uncertain, but maintains some analytic utility.

    For example, if an analyst wanted to publish data from a health survey she conducted, she could remove direct identifiers to make it difficult to single out individuals. However, consider these survey participants, a cohort of male welders who share the same zip code:

    participant_id | zip_code | gender | occupation | substance_abuse

    All members of this cohort have indicated substance abuse, sensitive personal information that could have damaging consequences, and, even though direct identifiers have been removed, outsiders could infer substance abuse for an individual if they knew a male welder in this zip code.

    In this scenario, using randomized response would change some of the Y's in substance_abuse to N's and vice versa; consequently, outsiders couldn't be sure of the displayed value of substance_abuse given in any individual row, as they wouldn't know which rows had changed.

    The randomization is applied differently to both categorical and quantitative values. In both cases, the noise can be increased to enhance privacy or reduced to preserve more analytic value. Immuta requires that you .

    • Categorical randomized response: Categorical values are randomized by replacing a value with some non-zero probability. Not all values are randomized, and the consumer of the data is not told which values are randomized and which ones remain unchanged. Values are replaced by selecting a different value uniformly at random from among all other values. If a randomized response policy were applied to a “state” column, a person’s residency could flip from Maryland to Virginia, which would provide ambiguity to the actual state of residency. This policy is appropriate when obscuring sensitive values such as medical diagnosis or survey responses.

    • Datetime and numeric randomized response: Numeric and datetime randomized response apply a tunable, unbiased noise to the nominal value. This noise can obscure the underlying value, but the impact of the noise is reduced in aggregate. This masking type can be applied to sensitive numerical attributes, such as salary, age, or treatment dates.

    How the randomization works


    Sample data is processed during computation of randomized response policies

    When a randomized response policy is applied to a data source, the columns targeted by the policy are queried under a fingerprinting process. To enforce the policy, Immuta generates and stores predicates and a list of allowed replacement values that may contain data that is subject to regulatory constraints (such as GDPR or HIPAA) in Immuta's metadata database.

    The location of the metadata database depends on your deployment:

    Immuta applies a random number generator (RNG) that is seeded with some fixed attributes of the data source, column, backing technology, and the value of the high cardinality column, an approach that simulates cached randomness without having to actually cache anything.

    For string data, the random number generator essentially flips a biased coin. If the coin comes up as tails, which it does with the frequency of the replacement rate, then the value is changed to any other possible value in the column, selected uniformly at random from among those values. If the coin comes up as heads, the true value is released.

    For numeric data, Immuta uses the RNG to add a random shift from a 0-centered Laplace distribution with the standard deviation specified in the policy configuration. For most purposes, knowing the distribution is not important, but the net effect is that on average the reported values should be the true value plus or minus the specified deviation value.
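The mechanism described above can be sketched with Python's standard library. This is a simplified illustration, not Immuta's implementation: the function names and seed parts are hypothetical, the seed is derived with SHA-256 only to mimic "cached randomness without caching", and the Laplace scale is derived from the configured standard deviation using the identity that a Laplace distribution with scale b has standard deviation b·√2.

```python
import hashlib
import math
import random


def _seeded_rng(*seed_parts):
    """Build an RNG from fixed attributes so the same cell always gets the same result."""
    material = "|".join(str(part) for part in seed_parts).encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    return random.Random(seed)


def randomize_category(value, domain, replacement_rate, *seed_parts):
    """Biased coin flip: with probability replacement_rate, swap in a different value."""
    rng = _seeded_rng(*seed_parts)
    others = [v for v in domain if v != value]
    if others and rng.random() < replacement_rate:
        return rng.choice(others)  # uniform over all other values
    return value


def randomize_number(value, std_dev, *seed_parts):
    """Add zero-centered Laplace noise with the given standard deviation."""
    if std_dev == 0:
        return value
    rng = _seeded_rng(*seed_parts)
    scale = std_dev / math.sqrt(2)
    # The difference of two independent exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


if __name__ == "__main__":
    states = ["Maryland", "Virginia", "Delaware"]
    print(randomize_category("Maryland", states, 0.25, "source", "state", "row-42"))
    print(randomize_number(52000.0, 1000.0, "source", "salary", "row-42"))
```

Because the generator is re-seeded from the same fixed inputs on every call, repeated queries return the same randomized value for a given cell, so an observer cannot average out the noise by querying again.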

    Preserving data utility

    Using randomized response doesn't destroy the data because data is only randomized slightly; aggregate utility can be preserved because analysts know how and what proportion of the values will change. Through this technique, values can be interpreted as hints, signals, or suggestions of the truth, but it is much harder to reason about individual rows.

    Additionally, randomized response gives deniability of record content, not dataset participation, so individual rows can be displayed.

    Regular expression (regex)


    Deprecation notice

    Support for masking with a non-global regex on Redshift data sources has been deprecated. Policy authors must use the global flag (by selecting Global in the regex policy builder) when masking using a regex on Redshift data sources.

    See the Deprecations page for EOL dates.

    This masking option uses a regular expression to replace all or a portion of a column value.

    This policy is similar to replacing with a constant, but it provides more utility because you can retain portions of the true value. Regex replacement also allows some groupings to be maintained while making the disclosed value more ambiguous. This masking technique is useful when the underlying data has some consistent structure, the unmasked underlying data represents some re-identification risk, and a regular expression can be used to mask the underlying data to be less identifiable.

    When authoring the policy in Immuta, the regex and the replacement value do not need to be in single or double quotes.

    The following regex rule would mask the final digits of an IP address:

    Mask using a regex \d+$ the value in the columns ip_address for everyone.

    In this case, the regular expression \d+$ breaks down as follows:

    \d matches a digit (equal to [0-9])

    +

    The image below illustrates authoring a regex global policy that will apply to Databricks Unity Catalog data sources:

    Databricks Unity Catalog integration regexp_replace function

    The Databricks Unity Catalog integration uses Spark's built-in regexp_replace function. That Databricks function currently only supports global pattern flags set as global (g) and case-sensitive. Regex will not work on this platform unless these settings are appropriately configured.

    Reversibility

    Deprecation notice

    Support for reversible masking on Redshift data sources has been deprecated.

    See the Deprecations page for EOL dates.

    This masking option masks the values using a token that is consistent for the same value throughout the data source, so you can count or track the specific values, but not know the true raw value.

    This policy type is appropriate for cases where the underlying value is sensitive, but there is a need to segment the population. Such attributes could be addresses, time segments, or countries. Reversible hashing is susceptible to inference attacks based on prior knowledge of the population distribution. For example, if state is hashed, and the dataset is a sample across the United States, then an adversary could assume that the most frequently occurring hash value is California. As such, it's most secure to use the reversible hashing mask on attributes that are evenly distributed across a population.

    Hashed values are different across data sources, so you cannot join on hashed values unless you enable masked joins on data sources within a project. Immuta prevents joins on hashed values to protect against link attacks, where two data owners may have exposed data with the same masked column (a quasi-identifier), but their data combined by that masked value could result in a sensitive data leak.

    Reversibly masked fields can leak the length of their contents, so it is important to consider whether or not this may be an attack vector for applications involving its use.

    Rounding

    Rounding masking policies reduce, round, or truncate numeric or datetime values to a fixed precision.

    This technique hides precision from numeric values while providing more utility than simply hashing. For example, you could remove precision from a geospatial coordinate. You can also use this type of policy to remove precision from dates and times by rounding to the nearest hour, day, month, or year.

    • Datetime rounding: This policy truncates a datetime value to a user-defined precision. The supported precisions are minute, hour, day, month, and year.

    • Numeric rounding: This policy maps the nominal value to the ceiling of some specified bandwidth. Immuta has a recommended bandwidth based on the Freedman-Diaconis rule.
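The two rounding behaviors can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from Immuta: the function names are hypothetical, "ceiling of the bandwidth" is interpreted as the upper edge of the bucket a value falls into, and the Freedman-Diaconis bandwidth is computed with the textbook formula 2 * IQR / n^(1/3).

```python
import math
import statistics
from datetime import datetime

# Components to reset for each supported precision.
_TRUNCATIONS = {
    "minute": dict(second=0, microsecond=0),
    "hour": dict(minute=0, second=0, microsecond=0),
    "day": dict(hour=0, minute=0, second=0, microsecond=0),
    "month": dict(day=1, hour=0, minute=0, second=0, microsecond=0),
    "year": dict(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
}


def truncate_datetime(value: datetime, precision: str) -> datetime:
    """Drop every datetime component finer than the requested precision."""
    return value.replace(**_TRUNCATIONS[precision])


def round_to_bandwidth(value: float, bandwidth: float) -> float:
    """Map a value to the upper edge (ceiling) of its bandwidth-sized bucket."""
    return math.ceil(value / bandwidth) * bandwidth


def freedman_diaconis_bandwidth(values) -> float:
    """Textbook Freedman-Diaconis rule: 2 * IQR / cube root of sample size."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


print(truncate_datetime(datetime(2024, 3, 15, 13, 47, 29), "hour"))
print(round_to_bandwidth(43, 10))
```

With a bandwidth of 10, every value from 41 through 50 is reported as 50, so the exact figure is hidden while the approximate magnitude is preserved.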

    Masking exceptions

    Exceptions to masking policies allow users specified in the exception to see masked data in the clear. Expand the collapsible blocks below to see how the different masking exceptions work.

    Exempt user from data masking when user is acting under a purpose

    If a user is acting under a purpose, they will see masked data in the clear.

    The table below illustrates how the following policy enforces access controls for 3 different users:

    Mask columns tagged restricted by making NULL for everyone except when user is acting under purpose Employee Retention

    Exempt user from data masking when user possesses an attribute with a specific key and value

    If a user possesses an attribute with the specified key and value, they will see masked data in the clear for data sources the policy applies to.

    The table below illustrates how the following policy enforces access controls for 3 different users:

    Mask columns tagged restricted by making NULL for everyone except when user possesses an attribute with key Employee and value Marketing

    Exempt user from data masking when user possesses an attribute with a specific key and any value that matches any column tag

    This masking exception type is only supported for global data policies.

    If a user possesses an attribute with a key and value that matches any tag on the column, they will see the masked data in the tagged column in the clear.

    Exempt user from data masking when user possesses an attribute with a specific key and any value that matches any data source tag

    This masking exception type is only supported for global data policies.

    If a user possesses an attribute with a key and value that matches any tag on the data source, they will see the masked data in the tagged data source in the clear.

    Exempt user from data masking when user is a member of a specific group

    If a user is a member of the specified group, they will see masked data in the clear for data sources the policy applies to.

    The table below illustrates how the following policy enforces access controls for 3 different users:

    Mask columns tagged restricted by making NULL for everyone except when user is a member of group with name HR

    Exempt user from data masking when user is a member of any group with a name that matches any column tag

    This masking exception type is only supported for global data policies.

    If a user is a member of a group that matches any tag on the column, they will see the masked data in the tagged column in the clear.

    Exempt user from data masking when user is a member of any group with a name that matches any data source tag

    This masking exception type is only supported for global data policies.

    If a user is a member of a group with a name that matches any tag on the data source, they will see the masked data in the tagged data source in the clear.

    Advanced masking exceptions

    Advanced masking exceptions are only supported for global data policies.

    The masking exceptions in the list above that match any attribute value or any group name to any data source or column tag allow you to scope the masking exceptions to a subset of the data that is being masked. This decoupling between the generic masking rule and its fine-grained masking exceptions means that even with complex requirements, you only need to author one policy in Immuta.

    For example, if a compliance requirement states that users should only be able to see restricted data in the clear if it is specific to their department, you could approach this by authoring the following masking policies in Immuta:

    • Mask columns tagged restricted using hashing for everyone except when user is a member of a group with name Marketing on columns tagged Marketing in domain Marketing

    • Mask columns tagged restricted using hashing for everyone except when user is a member of a group with name Finance on columns tagged Finance in domain Finance

    However, the drawback of this approach is that each time a new department gets added, this also requires a new masking policy to be authored, tested, and deployed. This will lead to policy bloat, potentially resulting in dozens or thousands of masking policies.

    With advanced masking exceptions you can achieve the same result by just writing a single policy:

    Mask columns tagged restricted using hashing for everyone except when user is a member of a group with name that matches any column tag

    Users will only see restricted columns if their group name matches a tag on the column (just like in the former example), but you don't have to specify every possible group name or column tag in separate policies. Furthermore, using this policy type helps ensure that access controls will continue to be enforced appropriately even if your user or data metadata changes. If additional groups are created or tags are added to columns, access will be automatically updated to reflect those user and tag changes, so you don't have to update or add new policies after each change.
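The single-policy behavior can be sketched as a small Python check. This is an illustrative model only, not how Immuta evaluates policies: the function names and the in-memory representation of groups and tags as sets are assumptions, and the column tags are borrowed from the column tag example in this guide.

```python
def visible_in_clear(user_groups: set, column_tags: set,
                     masked_tag: str = "restricted") -> bool:
    """Return True if the user sees the column unmasked."""
    if masked_tag not in column_tags:
        return True  # the masking rule does not target this column at all
    # Exception: any of the user's group names equals any tag on the column.
    return bool(user_groups & column_tags)


# Hypothetical data source: column name -> tags on that column.
COLUMNS = {
    "name": {"restricted", "HR"},
    "email": {"restricted", "Marketing"},
    "wage": {"restricted", "Finance", "HR"},
    "office location": set(),
}


def visible_columns(user_groups: set) -> list:
    """List the columns a user with these groups would see in the clear."""
    return [c for c, tags in COLUMNS.items() if visible_in_clear(user_groups, tags)]


print(visible_columns({"Marketing"}))
print(visible_columns({"HR"}))
```

Adding a new department only requires a new group and a matching column tag; the function, like the policy, stays unchanged.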

    Behavior of data source versus column tags

    Advanced masking exceptions allow you to match any attribute value or any group name to any data source or column tag. But how can you decide whether you should use the data source tags or column tags option? To determine this, consider the granularity with which you want to manage masking exceptions:

    • Do you need to exempt users on a column-by-column basis?

    • Or do you need to exempt users on a data source-by-data-source basis?

    The example below illustrates this decision point:

    • The table Salaries contains the columns name, email and wage that are all tagged PII.

    • User A is a member of a group named HR.

    When preparing the masking exception policy, the policy author compares the results of the "using data source tags" and "using column tags" options:

    Evaluation logic for tag hierarchies

    Masking exceptions respect tag hierarchies when evaluating user attribute key-value pairs against column and data source tags. The evaluation always goes from left to right: the attribute value gets evaluated against the tag starting at the tag's root level.

    The table below illustrates this behavior by showing how the following masking policy enforces access controls for 5 different users:

    Mask columns tagged Marketing.Analyst by making NULL for everyone except when user possesses an attribute with key Role and value that matches any column tag

    In this example,

    • User A can see data in the email column in the clear because their Marketing.Analyst attribute value exactly matches the Marketing.Analyst tag.

    • User B can see data in the email column in the clear because their Marketing attribute value matches the root level in the Marketing.Analyst column tag. This example illustrates the benefits of using tag hierarchies, as you can match against tag parent values (instead of having to match against each individual child tag).
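The left-to-right evaluation can be expressed as a prefix comparison on the dot-separated levels of the tag. The sketch below is inferred from the examples in this section rather than taken from Immuta's implementation, and the function name is hypothetical.

```python
def attribute_matches_tag(attribute_value: str, tag: str) -> bool:
    """Match when the attribute value's levels equal the tag's leading levels,
    compared left to right starting at the tag's root."""
    value_parts = attribute_value.split(".")
    tag_parts = tag.split(".")
    return tag_parts[: len(value_parts)] == value_parts


tag = "Marketing.Analyst"
for value in ["Marketing.Analyst", "Marketing", "Analyst",
              "Finance.Analyst", "Marketing.Analyst.Junior"]:
    print(value, attribute_matches_tag(value, tag))
```

An exact match and a root-level match both succeed, while a value that starts at a child level, starts at a different root, or extends past the tag's deepest level does not.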

    Cell-level masking

    Use cell-level masking (sometimes also called conditional masking) to achieve granular, context-aware data protection that standard column-level security cannot provide alone. While regular masking policies are an all-or-nothing approach (a user can either see all values in a column or none at all), cell-level masking increases data utility by masking columns on a row-by-row basis. Cell-level masking is achieved by conditionally masking the content in one column based on the value in another column of the same row.

    Building a cell-level masking policy is done in the same manner as building a column masking policy. The primary difference is that, when you select who the policy should apply to, a where clause is injected.

    For example, a regular masking policy looks like this:

    Mask columns tagged SSN using hashing for everyone except members of group admins

    With this approach, users will either see all social security numbers or none at all. If only social security numbers for US-based subjects need special protection, and all others can remain in the clear, you could insert a where clause into the policy condition:

    Mask columns tagged SSN using hashing where @columnReference('country_of_residence') = 'US' for everyone except members of group admins

    That policy will check the country_of_residence column in the table, and if the value is US the cell tagged SSN will be masked. For all rows where the column country_of_residence contains a different value than US, the data of the column tagged SSN will remain in the clear.
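The row-by-row effect of that policy can be modeled in Python as follows. This is a sketch under assumptions: the column name ssn, the use of SHA-256 as the hashing function, and the dictionary-based rows are all hypothetical and chosen only to make the behavior concrete.

```python
import hashlib


def apply_cell_level_mask(row: dict, user_groups: set) -> dict:
    """Hash the ssn cell only when the row's country_of_residence is US,
    unless the user belongs to the exempted admins group."""
    if "admins" in user_groups:
        return dict(row)  # exception: admins see every value in the clear
    masked = dict(row)
    if row["country_of_residence"] == "US":
        masked["ssn"] = hashlib.sha256(row["ssn"].encode("utf-8")).hexdigest()
    return masked


rows = [
    {"ssn": "123-45-6789", "country_of_residence": "US"},
    {"ssn": "987-65-4321", "country_of_residence": "CA"},
]
for row in rows:
    print(apply_cell_level_mask(row, {"analysts"}))
```

Only the first row's ssn is replaced by a hash for the non-admin user; the second row passes through unchanged because its condition column does not equal US.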

    Furthermore, instead of using the physical column name as shown in the example above, you could use the @columnTagged('tag name') function. Using this function allows you to target the policy on any table with a column containing location information, no matter the name of that column in the physical table:

    Mask columns tagged SSN using hashing where @columnTagged('country') = 'US' for everyone except members of group admins

    This example policy targets the column with the tag country in the policy logic dynamically instead of looking for the hard-coded column name of country_of_residence.

    Note: When building conditional masking policies with custom SQL statements, avoid referencing a column that is masked using randomized response in the SQL statement, as this can lead to different behavior depending on your data platform and may produce unexpected results.

    Mixing masking policies on the same column

    In some cases, you may want several different masking policies applied to the same column through Otherwise policies. To build these policies, select everyone who instead of everyone or everyone except. After you specify who the masking policy applies to, select how it applies to everyone else in the Otherwise condition.

    You can add and remove tags in Otherwise conditions for global policies (unlike local policy Otherwise conditions); however, all tags or regular expressions included in the initial everyone who rule must be included in an everyone or everyone except rule in the additional clauses.

    Complex data types: masking fields within struct columns

    Public preview: This feature is available to all accounts.

    Spark supports a class of data types called complex types, which can represent multiple data values in a single column. Immuta supports masking fields within array and struct columns:

    • Array: an ordered collection of elements

    • Struct: a collection of elements that are primitive or complex types

    Without this feature enabled, the struct and array columns of a data source default to jsonb in the Data Dictionary, and the masking policies that users can apply to jsonb columns are limited. For example, if a user wanted to mask PII inside the column patient in the image below, they would have to apply null masking to the entire column or use a custom function instead of just masking name or address.

    After Complex Data Types is enabled on the App settings page, the column type for struct columns for new data sources will display as struct in the data dictionary. (For data sources that are already in Immuta, users can edit the data source and change the column types for the appropriate columns from jsonb to struct.) Once struct fields are available, they can be searched, tagged, and used in masking policies. For example, a user could tag name, ssn, and street as PII instead of the entire patient column.

    After a global or local policy masks the columns containing PII, users who do not meet the exception specified in the policy will see these values masked:

    Note: Immuta uses the > delimiter to indicate that a field is nested instead of the . delimiter, since field and column names could include a period (.).

    Feature limitations

    • This feature is only available for Databricks data sources.

    • The Databricks Unity Catalog integration only supports masking with NULL on STRUCT, ARRAY, and MAP type columns.

    Struct columns with many fields

    To get column information about a data source, Immuta executes a DESCRIBE call for the table. In this call, Spark returns a simple string representation of the schema for each column in the table. For the patient column above, the simple string would look like this:

    struct<name:string,ssn:string,age:int,address:struct<city:string,state:string,zipCode:string,street:text>>

    Immuta then parses this string into the following format for the data source's dictionary:

    However, if the struct contains more than 25 fields, Spark truncates the string, causing the parser to fail and fall back to jsonb. Immuta will attempt to avoid this failure by increasing the number of fields allowed in the server-side property setting, maxToStringFields.
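To make the parsing step concrete, here is a minimal recursive parser for the simple string format shown above. It is a sketch, not Immuta's parser: the function names are hypothetical, and the type-name mapping (string to text, int to integer) is an assumption inferred only from the example in this section.

```python
# Assumed mapping from Spark type names to dictionary type names, inferred
# only from the example in this section (string -> text, int -> integer).
TYPE_NAMES = {"string": "text", "int": "integer"}


def split_top_level(body: str) -> list:
    """Split 'a:int,b:struct<c:int>' on commas that are not inside <...>."""
    if not body:
        return []
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i])
            start = i + 1
    parts.append(body[start:])
    return parts


def parse_type(type_string: str) -> dict:
    """Recursively turn a simple type string into a nested dictionary."""
    if type_string.startswith("struct<") and type_string.endswith(">"):
        inner = type_string[len("struct<"):-1]
        children = []
        for field in split_top_level(inner):
            name, _, field_type = field.partition(":")
            children.append({"name": name, **parse_type(field_type)})
        return {"dataType": "struct", "children": children}
    return {"dataType": TYPE_NAMES.get(type_string, type_string)}


schema = ("struct<name:string,ssn:string,age:int,"
          "address:struct<city:string,state:string,zipCode:string,street:text>>")
parsed = parse_type(schema)
print([child["name"] for child in parsed["children"]])
```

A parser of this shape depends on receiving the complete string with balanced angle brackets, which is why a truncated schema string cannot be turned into a struct definition.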

    Supported with caveats

    Preserves value locality | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | Supported with caveats | ✅ | Supported with caveats
    Preserves averages | n/a | n/a | n/a | n/a | n/a | ❌ | ✅ | Supported with caveats | Supported with caveats
    Preserves message length | ❌ | ❌ | Supported with caveats | ❌ | ❌ | Supported with caveats | ❌ | n/a | Supported with caveats
    Reversible | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | Supported with caveats
    Preserves appearance | ❌ | ❌ | Supported with caveats | ❌ | ❌ | ✅ | ✅ | ✅ | Supported with caveats
    Applicable to numeric data | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | Supported with caveats
    Provides deniability of record content | ✅ | ✅ | Supported with caveats | Supported with caveats | ❌ | ❌ | ✅ | ❌ | Supported with caveats
    Suitable for de-identification | ✅ | ✅ | Supported with caveats | Supported with caveats | Supported with caveats | Supported with caveats | ❌ | ❌ | Supported with caveats
    Column value determinism | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | Supported with caveats
    Introduces NULLs | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | Supported with caveats
    Performance | 10/10 | 10/10 | Variable | 6/10 | 4/10 | 2/10 | 5/10 | 8/10 | Variable

    Local policies will error and show a message that the function is not valid.

  • Global policies will error and change to the default masking type (hashing for text and NULL for all others).

  • 75002

    Male

    Welder

    Y

    260930ce

    75002

    Male

    Welder

    Y

    046dc7fb

    75002

    Male

    Welder

    Y

    • Self-managed Immuta deployment: The metadata database is located on the server where you have deployed your external metadata database.

    • SaaS Immuta deployment: The metadata database is located in the AWS global segment in which you have chosen to deploy Immuta.

  • To ensure this process does not violate your organization's data localization regulations, you need to first activate this masking policy type before you can use it in your Immuta tenant.

    + is a quantifier that matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

    $ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)

    This ensures we capture the last digit(s) after the last . in the IP address. We can then enter the replacement for what we captured, which in this case is XXX, so the outcome of the policy would look like this: 164.16.13.XXX
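The effect of this rule can be reproduced with Python's standard re module, whose \d, + and $ tokens behave as described here for this pattern. The function name and the sample address 164.16.13.54 are hypothetical; the policy itself is evaluated by the data platform, not by Python.

```python
import re


def mask_trailing_digits(ip_address: str, replacement: str = "XXX") -> str:
    """Replace the run of digits at the very end of the string."""
    # \d+ matches one or more digits; $ anchors the match to the end.
    return re.sub(r"\d+$", replacement, ip_address)


print(mask_trailing_digits("164.16.13.54"))  # 164.16.13.XXX
```

Because the pattern is anchored with $, only the final group of digits is replaced and the first three octets remain visible.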

    This regex rule applies masking to telephone numbers variably depending on the presence of a dash (implying a prefix), space, or only digits:

    Mask using a regex (\+?\d{0,3}[-\s]?)?\d{4} the value in the column tagged Discovered...Telephone Number for everyone.

    User | name column (restricted) | email column (restricted) | wage column (restricted) | office location column (no tags)
    User A (purpose: Marketing Research) | ❌ | ❌ | ❌ | ✅
    User B (purpose: Employee Retention) | ✅ | ✅ | ✅ | ✅
    User C (purpose: Internship Capstone) | ❌ | ❌ | ❌ | ✅

    In this example,

    • User A can only see the office location column since it is not tagged as restricted. The other columns are masked because this user is not acting under the Employee Retention purpose specified in the policy.

    • User B can see all columns because they are acting under the Employee Retention purpose the policy specifies.

    • User C can only see the office location column since it is not tagged as restricted. The other columns are masked because this user is not acting under the Employee Retention purpose specified in the policy.

    Employee
    and value
    Marketing

    User | name column (restricted) | email column (restricted) | wage column (restricted) | office location column (no tags)
    User A (attribute Employee:Marketing) | ✅ | ✅ | ✅ | ✅
    User B (attribute Employee:Finance) | ❌ | ❌ | ❌ | ✅
    User C (attribute Intern:Marketing) | ❌ | ❌ | ❌ | ✅

    In this example,

    • User A can see all columns because they possess the Employee.Marketing attribute key-value pair the masking policy exception criteria specifies.

    • User B can only see the office location column since it is not tagged as restricted. The other columns are masked because this user does not have the Marketing attribute value specified in the masking policy exception criteria.

    • User C can only see the office location column since it is not tagged as restricted. Although this user has an attribute value of Marketing, the user's attribute key Intern does not match the attribute key Employee specified in the masking policy exception criteria, so the restricted columns are masked for this user.

    The table below illustrates how the following policy enforces access controls for 4 different users:

    Mask columns tagged restricted by making NULL for everyone except when user possesses an attribute with key Employee and value that matches any column tag

    User | name column (restricted, HR) | email column (restricted, Marketing) | wage column (restricted, HR, Finance) | office location column (no tags)
    User A (attribute Employee:Marketing) | ❌ | ✅ | ❌ | ✅
    User B (attribute Employee:Finance) | ❌ | ❌ | ✅ | ✅
    User C (attribute Employee:HR) | ✅ | ❌ | ✅ | ✅
    User D (attribute Intern:Marketing) | ❌ | ❌ | ❌ | ✅

    In this example,

    • User A can see the email column because the masking policy exception criteria specifies the Employee attribute key, and the Marketing tag on the column matches the user's Marketing attribute value. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User B can see the wage column because the masking policy exception criteria specifies the Employee attribute key, and the Finance tag on the column matches the user's Finance attribute value. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User C can see the name column and wage column because the masking policy exception criteria specifies the Employee attribute key, and the HR tag on the columns matches the user's HR attribute value. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User D can only see the office location column since it is not tagged as restricted. Although this user has an attribute value of Marketing, which matches the Marketing tag on the email column, the user's attribute key Intern does not match the attribute key Employee specified in the masking policy exception criteria, so the email column is masked for this user.

    The table below illustrates how the following policy enforces access controls for 4 different users:

    Mask columns tagged restricted by making NULL for everyone except when user possesses an attribute with key Employee and value that matches any data source tag

    User | Employee survey results data source (HR) | Purchase orders data source (Finance) | Customer contacts data source (Marketing, HR)
    User A (attribute Employee:Marketing) | ❌ | ❌ | ✅
    User B (attribute Employee:Finance) | ❌ | ✅ | ❌
    User C (attribute Employee:HR) | ✅ | ❌ | ✅
    User D (attribute Intern:Marketing) | ❌ | ❌ | ❌

    In this example,

    • User A can see any column tagged restricted in the Customer contacts data source because the masking policy exception criteria specifies the Employee attribute key, and the Marketing tag on the data source matches the user's Marketing attribute value.

    • User B can see any column tagged restricted in the Purchase orders data source because the masking policy exception criteria specifies the Employee attribute key, and the Finance tag on the data source matches the user's Finance attribute value.

    • User C can see any column tagged restricted in the Employee survey results data source and the Customer contacts data source because the masking policy exception criteria specifies the Employee attribute key, and the HR tag on the data sources matches the user's HR attribute value.

    • User D cannot see any masked columns in these data sources. Although this user has an attribute value of Marketing, which matches the Marketing tag on the Customer contacts data source, the user's attribute key Intern does not match the attribute key Employee specified in the masking policy exception criteria, so the columns are masked for this user.

    HR

    User | name column (restricted) | email column (restricted) | wage column (restricted) | office location column (no tags)
    User A (group Marketing) | ❌ | ❌ | ❌ | ✅
    User B (group Finance) | ❌ | ❌ | ❌ | ✅
    User C (group HR) | ✅ | ✅ | ✅ | ✅

    In this example,

    • User A can only see the office location column since it is not tagged as restricted. The other columns are masked because this user is not a member of the HR group specified in the masking policy exception criteria.

    • User B can only see the office location column since it is not tagged as restricted. The other columns are masked because this user is not a member of the HR group specified in the masking policy exception criteria.

    • User C can see all columns because they are a member of the HR group the masking policy exception criteria specifies.

    The table below illustrates how the following policy enforces access controls for 4 different users:

    Mask columns tagged restricted by making NULL for everyone except when user is a member of a group with name that matches any column tag.

    User | name column (restricted, HR) | email column (restricted, Marketing) | wage column (restricted, Finance, HR) | office location column (no tags)
    User A (group Marketing) | ❌ | ✅ | ❌ | ✅
    User B (group Finance) | ❌ | ❌ | ✅ | ✅
    User C (group HR) | ✅ | ❌ | ✅ | ✅
    User D (group Research) | ❌ | ❌ | ❌ | ✅

    In this example,

    • User A can see the email column because the Marketing tag on the column matches the user's group name. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User B can see the wage column because the Finance tag on the column matches the user's group name. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User C can see the name column and wage column because the HR tag on the columns matches the user's group name. Because the office location column is not tagged as restricted, the user can also see that column in the clear.

    • User D can only see the office location column since it is not tagged as restricted. None of the other columns in the data source have a tag that matches the group name Research, so all of the other columns are masked for this user.

    The table below illustrates how the following policy enforces access controls for 4 different users:

    Mask columns tagged restricted by making NULL for everyone except when user is a member of a group with name that matches any data source tag.

    User | Employee survey results data source (HR) | Purchase orders data source (Finance, HR) | Customer contacts data source (Marketing)
    User A (group Marketing) | ❌ | ❌ | ✅
    User B (group Finance) | ❌ | ✅ | ❌
    User C (group HR) | ✅ | ✅ | ❌
    User D (group Research) | ❌ | ❌ | ❌

    In this example,

    • User A can see any column tagged restricted in the Customer contacts data source because the Marketing tag on the data source matches the user's group name.

    • User B can see any column tagged restricted in the Purchase orders data source because the Finance tag on the data source matches the user's group name.

    • User C can see any column tagged restricted in the Employee survey results data source and the Purchase orders data source because the HR tag on the data sources matches the user's group name.

    • User D cannot see any masked columns in these data sources, since none of the data source tags match the group name Research.

    on columns tagged
    Finance
    in domain Finance
  • Mask columns tagged restricted using hashing for everyone except when user is member of a group with name HR on columns tagged HR in domain HR

  • Mask columns tagged restricted using hashing for everyone except when user is member of a group with name ... on columns tagged ... in domain ...

  • User C cannot see data in the email column in the clear. Although this user's attribute value Analyst matches the tag's child value on the column, the user's attribute value does not match the tag root level, as Analyst is not equal to Marketing.

  • User D cannot see data in the email column in the clear. Their Finance.Analyst attribute value does not match the Marketing.Analyst column tag, as Finance is not equal to Marketing.

  • User E cannot see data in the email column in the clear. Their Marketing.Analyst.Junior attribute value does not match the Marketing.Analyst column tag, as Junior is not part of the tag hierarchy on the email column.

  • ,
    ARRAY
    , and
    MAP
    type columns.
  • This feature only supports Parquet and Delta table types.

    Preserves equality and grouping | ❌ | ❌ | Supported with caveats | ✅ | ✅ | ✅ | ❌ | ❌ | Supported with caveats
    Preserves range statistics | ❌ | ❌ | Supported with caveats | ✅ | ✅ | ✅ | Supported with caveats

    880d0096

    75002

    Male

    Welder

    Y

    f267334b

    75002

    Male

    Welder

    Y

    User | email column (Marketing.Analyst)
    User A (attribute Role:Marketing.Analyst) | ✅
    User B (attribute Role:Marketing) | ✅
    User C (attribute Role:Analyst) | ❌
    User D (attribute Role:Finance.Analyst) | ❌
    User E (attribute Role:Marketing.Analyst.Junior) | ❌

    Using data source tags

    • Policy: Mask columns tagged PII using NULL except when user is a member of a group with name that matches any data source tag

    • Salaries table tag: HR

    • Column tags: name: PII; email: PII; wage: PII

    Result: User A will see all three columns in the clear. If any other columns get added to the Salaries table and tagged PII, User A will see those automatically in the clear.

    Using column tags

    • Policy: Mask columns tagged PII using NULL except when user is a member of a group with name that matches any column tag

    • Salaries table tag: No tags

    • Column tags: name: PII; email: PII; wage: PII, HR

    Result: User A will only see the wage column in the clear. If any other columns get added to the Salaries table and tagged PII, User A will not see those automatically in the clear unless they are also tagged HR.

    Note the use of @column to specify the column to which this should apply
    A regex applied to Databricks which requires Global pattern to be enabled and Case insensitivity disabled.

    Supported with caveats

    bfdb43db

    REGEXP_REPLACE(@column, '(\\+?\\d{0,3}[-\\s]?)?\\d{4}', '****')
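To see what this expression does to a value, the same pattern can be tried locally. The sketch below uses Python's `re` module as a stand-in, so it assumes the doubled backslashes in the SQL string literal reduce to single ones and that the engine replaces every match. Regex dialects differ in places, so treat it as an approximation rather than the platform's exact behavior; `mask_value` is a hypothetical helper.

```python
import re

# Same pattern as in the expression above, with SQL string escaping removed:
# an optional prefix (optional "+", up to three digits, optional "-" or
# whitespace) followed by exactly four digits.
PATTERN = re.compile(r"(\+?\d{0,3}[-\s]?)?\d{4}")


def mask_value(value):
    """Replace every match of PATTERN with '****'; pass NULLs through unchanged."""
    if value is None:
        return None
    return PATTERN.sub("****", value)


masked_short = mask_value("1234")
masked_phone = mask_value("555-123-4567")
```

Because the optional prefix is part of the match, digits immediately before the final four can be swallowed by the replacement as well. It is worth checking the pattern against representative values before activating a policy.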
    {
      dataType: 'struct',
      children: [
        {
          name: 'name',
          dataType: 'text'
        },
        {
          name: 'ssn',
          dataType: 'text'
        },
        {
          name: 'age',
          dataType: 'integer'
        },
        {
          name: 'address',
          dataType: 'struct',
          children: [
            {
              name: 'city',
              dataType: 'text'
            },
            {
              name: 'state',
              dataType: 'text'
            },
            {
              name: 'zipCode',
              dataType: 'text'
            },
            {
              name: 'street',
              dataType: 'text'
            },
          ]
        }
      ]
    }
    Column tags
    • name: PII

    • email: PII

    • wage: PII

  • name: PII

  • email: PII

  • wage: PII, HR

  • ❌

    ❌

    ❌

    ✅

    ❌

    ❌

    ❌

    ✅

    ✅

    ❌

    ✅

    ✅

    User D attribute

    Intern:Marketing

    ❌

    ❌

    ❌

    ✅

    User D attribute

    Intern:Marketing

    ❌

    ❌

    ❌

    ✅

    ✅

    ✅

    ✅

    ✅

    ❌

    ✅

    ✅

    User D group

    Research

    ❌

    ❌

    ❌

    ✅

    User D group

    Research

    ❌

    ❌

    ❌

    Author a Time-Based Restriction Policy

    1. Determine your policy scope:

      • Global policy: Click the Policies icon in the navigation menu and select the Data Policies tab. Click New data policy and complete the Policy name field.

      • Local policy: Navigate to a specific data source and click the Policies tab. Scroll to the Data Policies section and click New Policy.

    2. Select Only show data by time from the first dropdown.

    3. Select where data is more recent than or older than from the next dropdown, and then enter the number of minutes, hours, days, or years that you would like to restrict the data source to. Note that unlike many other policies, there is no field to select a column to drive the policy. This type of policy will be driven by the data source's event-time column, which is selected at data source creation.

    4. Choose for everyone, everyone except, or for everyone who to drive the policy. If you choose for everyone except, use the subsequent dropdown to choose the group, purpose, or attribute for your condition. If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

    5. AI-powered feature: Click Explain this policy to open the AI assistant side sheet. The AI assistant will generate a textual summary and an explanation of how the policy behaves for various users, using mock data.

    6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    7. For global policies: Click the dropdown menu beneath Where should this policy be applied, and select On all data sources, On data sources, or When selected by data owners. If you select On data sources, finish the condition in one of the following ways:

      • tagged: Select this option and then search for tags in the subsequent dropdown menu.

      • with columns tagged: Select this option and then search for tags in the subsequent dropdown menu.

      • with column names spelled like: Select this option, and then enter a regex and choose a modifier in the subsequent fields.

      • in server: Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

      • created between: Select this option and then choose a start date and an end date in the subsequent dropdown menus.

    8. Click Create Policy. If creating a global policy, you then need to click Activate Policy or Stage Policy.
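The effect of a time-based restriction can be pictured as a filter on the data source's event-time column. The sketch below is illustrative only and is not how Immuta enforces the policy. The names `visible_rows` and `event_time`, the strict comparison at the cutoff, and the 365-day approximation of a year are all assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumed unit lengths; a year is approximated as 365 days for illustration.
UNITS = {
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
    "years": timedelta(days=365),
}


def visible_rows(rows, amount, unit, mode, now, exempt=False):
    """Filter rows by their event_time relative to a cutoff of `amount` units before `now`.

    mode is "more recent than" or "older than". Users covered by an
    exception (for example, "everyone except" a group) see every row.
    """
    if exempt:
        return list(rows)
    cutoff = now - amount * UNITS[unit]
    if mode == "more recent than":
        return [row for row in rows if row["event_time"] > cutoff]
    if mode == "older than":
        return [row for row in rows if row["event_time"] < cutoff]
    raise ValueError(f"unsupported mode: {mode}")


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
rows = [
    {"id": 1, "event_time": now - timedelta(days=1)},
    {"id": 2, "event_time": now - timedelta(days=10)},
]
recent = visible_rows(rows, 7, "days", "more recent than", now)
old = visible_rows(rows, 7, "days", "older than", now)
```

Passing `now` explicitly keeps the sketch deterministic. A real policy is evaluated at query time, so the visible window moves forward as time passes.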
