Write a Global Data Policy

Audience: Data Governors

Content Summary: This page outlines step-by-step instructions for creating and staging Global Data Policies, which are built by Data Governors and apply to all data sources across an organization.

For instructions on writing Local Data Policies, see the tutorial in Chapter 4 - Connecting Data. For information about creating custom policy handlers, see the Advanced Guide.

Use Case

Compliance Requirement: Redact all personal information for everyone except when running queries in Test and Prod.

To meet this requirement, the organization should write a Global Policy that masks all personal information for everyone except system accounts running queries in Test and Prod. To do so, they will use the tags and attributes created in Chapter 2 to build their Global Data Policy. The steps below use this scenario to illustrate the policy, and other policy builder options are noted throughout the tutorial.

1 - Create a Global Data Policy

Best Practice: Write Global Policies

Build Global Policies with tags instead of writing Local Policies to manage data access. This practice spares you from writing or rewriting individual policies for every data source added to Immuta.

  1. Click the Policies page icon in the left sidebar and select the Data Policies tab at the top of the page.
  2. Click Add Policy, enter a name for your policy, and then select Mask from the first dropdown menu.

    Select Mask

  3. Select columns tagged and then select PII from the subsequent dropdown menu. Additional options include columns with any tag, columns with no tags, all columns, or columns with names spelled like.

    Select Columns Tagged

  4. Select using hashing from the next dropdown menu. Additional custom masking types include with reversibility, by making null, using a constant, using a regex, by rounding, with format preserving masking, with K-Anonymization, or using randomized response. Click on the tabs below to view specific instructions for these masking policies:

    using a constant

    Enter a constant in the field that appears next to the masking type dropdown:

    Masking Policy Enter Constant

    by rounding

    1. Select using fingerprint or specifying the bucket from the subsequent dropdown menu.
    2. If specifying the bucket, select the Bucket Type and then enter the bucket size.

      Specify Bucket

    Note: If you choose using fingerprint, the statistics of the data fingerprint will autogenerate the bucket size when the policy is applied to a data source.

    using a regex

    1. Enter a regular expression and replacement value in the fields that appear next to the masking type dropdown.
    2. From the next dropdown, choose to make the regex Case Insensitive and/or Global.

      Masking Policy Regex

    with K-Anonymization

    Select either using fingerprint or requiring group size of at least and enter a group size in the subsequent dropdown menu.

    Select Hashing

  5. Select everyone except from the next dropdown menu to continue the condition. Additional options include everyone and everyone who.

    Everyone Except

  6. In the subsequent dropdown menus, choose possesses attribute and select Environment test, select or as the conjunction, and then select Environment prod. You could also use group or purpose to complete a condition.

    Notes:

    • If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

    • You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

    Possesses Attribute

  7. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Global Data Policy

  8. The dropdown menu beneath Where should this policy be applied should already be complete. However, you have the option to select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  9. Click Create Policy, and then click Activate Policy or Stage Policy.

    Activate Policy
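
To make the masking types in step 4 more concrete, here is a minimal Python sketch of what hashing, a constant, nulling, rounding, and a regex replacement do to a single value. It is an illustration only and not Immuta's implementation: the function names, the salt, the default constant, and the choice of SHA-256 are all assumptions introduced for this example.

```python
import hashlib
import re


def mask_hash(value: str, salt: str = "example-salt") -> str:
    """Replace a value with a salted SHA-256 digest (hash choice is an assumption)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_constant(value, constant="REDACTED"):
    """Replace every value with the same constant."""
    return constant


def mask_null(value):
    """Replace every value with null."""
    return None


def mask_round(value, bucket_size):
    """Round a number down to the start of its bucket, e.g. 37 with bucket size 10 gives 30."""
    return (value // bucket_size) * bucket_size


def mask_regex(value: str, pattern: str, replacement: str,
               case_insensitive: bool = False, global_match: bool = False) -> str:
    """Replace the first match, or every match when global_match is set."""
    flags = re.IGNORECASE if case_insensitive else 0
    count = 0 if global_match else 1  # count=0 means "replace all" in re.sub
    return re.sub(pattern, replacement, value, count=count, flags=flags)


if __name__ == "__main__":
    print(mask_hash("123-45-6789"))
    print(mask_round(37, 10))
    print(mask_regex("555-123-4567", r"\d", "X", global_match=True))
```

The two regex flags mirror the Case Insensitive and Global modifiers from the policy builder: without Global only the first match is replaced.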

2 - Create a Custom Certification

This step is optional, but Data Governors can add certifications that outline acknowledgements or require approvals from Data Owners. For example, Data Governors could add a custom certification that states that Data Owners must verify that tags have been added correctly to their data sources before certifying the policy.

  1. Click Add Certification in the top right corner of the Data Policy Builder.

    Add Certification

  2. Enter a Certification Label and Certification Text in the corresponding fields of the dialog that appears.

    Custom Certification Dialog

  3. Click Save.

Results

Now that this Global Policy is active, users with the attribute Environment.dev will see redacted data and users with the attributes Environment.test or Environment.prod will see all the data:

Dev User

Dev Results

Test User

Test Results

Prod User

Prod Results
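
The outcome described above can be sketched in a few lines of Python. This is a hypothetical model of the policy logic (mask columns tagged PII for everyone except users who possess Environment test or Environment prod), not Immuta's engine; the `User` class, `is_exempt`, and `apply_policy` are names invented for this example, and unsalted SHA-256 stands in for whatever hashing Immuta applies.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Attribute key -> set of values, e.g. {"Environment": {"dev"}}
    attributes: dict = field(default_factory=dict)


# Conditions from the tutorial scenario: Environment test OR Environment prod.
EXEMPT_CONDITIONS = [("Environment", "test"), ("Environment", "prod")]


def is_exempt(user: User, conditions=EXEMPT_CONDITIONS, conjunction: str = "or") -> bool:
    """With "or", one matching condition is enough; with "and", all must match."""
    checks = [value in user.attributes.get(key, set()) for key, value in conditions]
    return any(checks) if conjunction == "or" else all(checks)


def apply_policy(row: dict, pii_columns: set, user: User) -> dict:
    """Return the row as the user would see it: PII hashed unless the user is exempt."""
    if is_exempt(user):
        return dict(row)
    return {
        col: hashlib.sha256(str(val).encode("utf-8")).hexdigest() if col in pii_columns else val
        for col, val in row.items()
    }


if __name__ == "__main__":
    row = {"name": "Ada", "city": "Paris"}
    print(apply_policy(row, {"name"}, User("dev-user", {"Environment": {"dev"}})))
    print(apply_policy(row, {"name"}, User("prod-user", {"Environment": {"prod"}})))
```

Switching `conjunction` to "and" would require a user to hold both attributes, which is the behavior the builder's and option describes.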

Additional Data Policies

You may need to use additional Data Policy types to meet your needs. The table below defines each of the additional policy types. For more information, see Data Policies in the Appendix.

| Policy Type | Description |
| --- | --- |
| Row Redaction | For query-backed data sources, Governors can restrict which rows in the data source tables are visible to which users. This redaction is done by matching values in a specific column against a user's groups, attributes, or purposes. |
| Minimization | These policies hide a specified percentage of query results from a user, based on a column with high cardinality (e.g., an employee ID number or other unique identifier). |
| Time-based Restrictions | If a data source has time-based restriction policies, queries run against the data source by a user will only return rows/blobs with a date in its event-time column/attribute from within a certain range. |
| Purpose-based Restrictions | Governors in Immuta can restrict usage of any data source to one or more purposes. If a user wishes to run SQL queries against a purpose-restricted data source, they must use the SQL credentials provided by a project containing that purpose. |
| Differential Privacy | Data sources with Differential Privacy policies will only return results for a certain type of SQL query: aggregates, such as the COUNT and SUM functions. Users must avoid aggregate queries that are too specific; Immuta will only return differentially private results for broad aggregate queries. |

Additional Tutorials

Click on the tabs below for tutorials outlining how to implement the following policy types:

Row Redaction

  1. Navigate to the Data Policies tab on the Policies page.
  2. Click Add Policy, enter a name for your policy, and then select the Only show rows action from the first dropdown.
  3. Choose where user, where the value in column tagged, or where from the next dropdown. Click on the tabs below to view specific instructions for these clauses:

    where user

      1. Choose the condition that will drive the policy from the next dropdown: is a member of a group or possesses an attribute.
      2. Use the next field to choose the attribute, group, or purpose that you will match values against.
      3. Use the next dropdown menu to choose the tag that will drive this policy.

      Note: You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

    where the value in the column tagged

    1. Select the tag from the next dropdown menu.
    2. From the subsequent dropdown, choose is or is not in the list, and then enter a list of comma-separated values.

    where

    1. Enter a valid SQL WHERE clause in the subsequent field. When you place your cursor in this field, a tool-tip should appear that details valid input and the column names of your data source. See Custom WHERE Clause Functions for more information about specific functions.

      WHERE Clause Policy 1

  4. Choose the condition that will drive the policy: for everyone, for everyone except, or for everyone who.

  5. Use the subsequent dropdown to choose the group, purpose, or attribute key / value pair for your condition.

    Note: If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

  6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Global Row Redaction Policy

  7. Click the dropdown menu beneath Where should this policy be applied, and select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  8. Click Create Policy, and then click Activate Policy or Stage Policy.
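
As a rough illustration of two of the Only show rows clauses above, the Python sketch below filters rows by matching a column against values the user holds, and by an is / is not in the list check. The function names and the row format (a list of dicts) are assumptions made for this example; Immuta applies these rules at query time rather than in application code.

```python
def rows_where_user_matches(rows, column, user_values):
    """Keep rows whose value in `column` matches one of the user's attribute values or groups."""
    return [row for row in rows if row[column] in user_values]


def rows_where_value_in_list(rows, column, values, negate=False):
    """Keep rows whose value in `column` is (or, with negate=True, is not) in the list."""
    allowed = set(values)
    return [row for row in rows if (row[column] in allowed) != negate]


if __name__ == "__main__":
    rows = [
        {"region": "east", "amount": 10},
        {"region": "west", "amount": 20},
        {"region": "north", "amount": 30},
    ]
    # A user whose attribute values include "east" sees only the east row.
    print(rows_where_user_matches(rows, "region", {"east"}))
    # "is not in the list" variant.
    print(rows_where_value_in_list(rows, "region", ["east", "west"], negate=True))
```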

Minimization

  1. Navigate to the Data Policies tab on the Policies page.
  2. Click Add Policy, enter a name for the policy, and then select Minimize data source from the first dropdown.
  3. Complete the enter percentage field to limit the data source.
  4. Choose the condition that will drive the policy: for everyone, for everyone except, or for everyone who.
  5. Use the next field to choose the attribute, group, or purpose that you will match values against.

    Notes:

    • If you choose for everyone who as a condition, complete the Otherwise clause before continuing to the next step.

    • You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Minimization Policy 1

  7. Click the dropdown menu beneath Where should this policy be applied, and select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  8. Click Create Policy, and then click Activate Policy or Stage Policy.
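
One way to picture minimization is as deterministic sampling on a high-cardinality column. The sketch below hashes each value of that column into one of 100 buckets and keeps only rows whose bucket falls under the chosen percentage, so a given row is consistently shown or hidden. This is an assumed technique for illustration; the source does not describe how Immuta selects rows, and `bucket`, `minimize`, and `percent_visible` are hypothetical names.

```python
import hashlib


def bucket(value) -> int:
    """Map a value to a stable bucket in the range 0..99 using SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def minimize(rows, column, percent_visible):
    """Keep roughly `percent_visible` percent of rows, chosen by hashing `column`."""
    return [row for row in rows if bucket(row[column]) < percent_visible]


if __name__ == "__main__":
    rows = [{"employee_id": i} for i in range(1000)]
    visible = minimize(rows, "employee_id", 25)
    print(f"{len(visible)} of {len(rows)} rows visible")
```

Because the bucket depends only on the column value, repeated queries return the same subset, and a column with few distinct values would produce a very uneven split, which is why a high-cardinality column is the natural choice.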

Time-based Restrictions

  1. Navigate to the Data Policies tab on the Policies page.
  2. Click Add Policy, enter a name for the policy, and select Only show data by time from the first dropdown.
  3. Select where data is more recent than or older than from the next dropdown, and then enter the number of minutes, hours, days, or years that you would like to restrict the data source to. Note that unlike many other policies, there is no field to select a column to drive the policy. This type of policy is driven by the data source's event-time column, which is selected at data source creation.
  4. Choose the condition that will drive the policy: for everyone, for everyone except, or for everyone who.
  5. Use the next field to choose the attribute, group, or purpose that you will match values against.

    Notes:

    • If you choose for everyone who as a condition, you will need to complete the Otherwise clause before continuing to the next step.

    • You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Time Policy 1

  7. Click the dropdown menu beneath Where should this policy be applied, and select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  8. Click Create Policy, and then click Activate Policy or Stage Policy.
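
The effect of a time-based restriction can be sketched as a filter on the event-time column. The example below is a simplified model, not Immuta's implementation: `only_show_by_time` and its parameters are hypothetical, rows are plain dicts, and the window is expressed as a `timedelta`.

```python
from datetime import datetime, timedelta, timezone


def only_show_by_time(rows, event_time_column, window, now=None, more_recent=True):
    """Keep rows more recent than (or older than) `window` relative to `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    if more_recent:
        return [row for row in rows if row[event_time_column] >= cutoff]
    return [row for row in rows if row[event_time_column] < cutoff]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    rows = [
        {"id": 1, "event_time": datetime(2024, 1, 9, tzinfo=timezone.utc)},
        {"id": 2, "event_time": datetime(2023, 12, 1, tzinfo=timezone.utc)},
    ]
    # Only data from the last 7 days remains visible.
    print(only_show_by_time(rows, "event_time", timedelta(days=7), now=now))
```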

Purpose-based Restrictions

  1. Navigate to the Data Policies tab on the Policies page.
  2. Click Add Policy, enter a name for the policy, and then select Limit usage to purpose(s) in the first dropdown menu.
  3. In the next field, select ANY PURPOSE or the specific purpose that you would like to restrict usage of this data source to.

    Note: You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  4. From the next dropdown, select for everyone or for everyone except. If you select for everyone except, you must select conditions that will drive the policy.

  5. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Purpose Policy 1

  6. Click the dropdown menu beneath Where should this policy be applied, and select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  7. Click Create Policy, and then click Activate Policy or Stage Policy.
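
The purpose check behind this policy type can be modeled as a comparison between the purposes a data source is limited to and the purposes of the project whose credentials the user is querying with. The sketch below is an assumption-laden illustration: `ANY_PURPOSE`, `purpose_allows`, and the purpose names are invented for this example.

```python
# Sentinel standing in for the ANY PURPOSE option in the policy builder.
ANY_PURPOSE = object()


def purpose_allows(project_purposes, required, conjunction="or"):
    """Return True if the project's purposes satisfy the policy.

    `required` is either ANY_PURPOSE or a list of purpose names joined by
    `conjunction` ("or": one must match, "and": all must match).
    """
    if required is ANY_PURPOSE:
        return len(project_purposes) > 0
    checks = [purpose in project_purposes for purpose in required]
    return any(checks) if conjunction == "or" else all(checks)


if __name__ == "__main__":
    project = {"Fraud Analysis"}
    print(purpose_allows(project, ["Fraud Analysis", "Marketing"]))         # or
    print(purpose_allows(project, ["Fraud Analysis", "Marketing"], "and"))  # and
    print(purpose_allows(set(), ANY_PURPOSE))
```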

Differential Privacy

  1. Navigate to the Data Policies tab on the Policies page.
  2. Click Add Policy, enter a name for the policy, and then select Make differentially private in the first dropdown menu.
  3. Select the noise level you would like to apply to your data: small, medium, or large; these values correspond to epsilon-differential privacy, where epsilon (privacy loss) has values of 3, 2.1, and 1.4, respectively.
  4. Choose the condition that will drive the policy: for everyone, for everyone except, or for everyone who.
  5. Use the next field to choose the attribute, group, or purpose that you will match values against.

    Notes:

    • If you choose for everyone who as a condition, you will need to complete the Otherwise clause before continuing to the next step.

    • You can add more than one condition by selecting + ADD. The dropdown menu in the far right of the Policy Builder contains conjunctions for your policy. If you select or, only one of your conditions must apply to a user for them to see the data. If you select and, all of the conditions must apply.

  6. Opt to complete the Enter Rationale for Policy (Optional) field, and then click Add.

    Differential Privacy Policy 1

  7. Click the dropdown menu beneath Where should this policy be applied, and select On all data sources or On data sources. If you selected On data sources, finish the condition in one of the following ways:

    tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with columns tagged

    Select this option and then search for tags in the subsequent dropdown menu.

    with column names spelled like

    Select this option, and then enter a regex and choose a modifier in the subsequent fields.

    in server

    Select this option and then choose a server from the subsequent dropdown menu to apply the policy to data sources that share this connection string.

    created between

    Select this option and then choose a start date and an end date in the subsequent dropdown menus.

  8. Click Create Policy, and then click Activate Policy or Stage Policy.
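
To show how the epsilon values in step 3 relate to noise, the sketch below applies the textbook Laplace mechanism to a COUNT query: noise is drawn with scale sensitivity / epsilon, so a smaller epsilon yields larger noise. The source does not say which mechanism Immuta uses, so treat this purely as an illustration; only the epsilon values (3, 2.1, and 1.4) come from the tutorial, and the function names are hypothetical.

```python
import random

# Epsilon per noise level, as listed in the tutorial.
EPSILON = {"small": 3.0, "medium": 2.1, "large": 1.4}


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noise_scale(noise_level: str, sensitivity: float = 1.0) -> float:
    """Laplace scale b = sensitivity / epsilon (sensitivity is 1 for COUNT)."""
    return sensitivity / EPSILON[noise_level]


def private_count(true_count: int, noise_level: str, rng=None) -> float:
    """Return the count with Laplace noise added for the chosen noise level."""
    rng = rng or random.Random()
    return true_count + laplace_noise(noise_scale(noise_level), rng)


if __name__ == "__main__":
    for level in ("small", "medium", "large"):
        print(level, round(private_count(1000, level), 2))
```

The noise scale is independent of the true count, which is one way to see why broad aggregates stay useful while very specific ones (small counts) are dominated by noise.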

What's Next

Now that you've written a Data Policy, you can continue to the next page or to one of these tutorials:

• Import and Export Policies

• Clone Policies