
Adjust Identification and Classification Framework Tags

Requirements:

Immuta Discover provides out-of-the-box identification frameworks to recognize and tag data, and out-of-the-box classification frameworks to categorize and classify it. These frameworks reflect generic industry practices and should be customized to each organization's specific needs.

Tune the sensitive data discovery (SDD) frameworks and identifiers first to adjust where Discovered tags are applied. Classification frameworks derive their classification tags from the Discovered tags, so any change to SDD trickles down to classification. Customizing SDD requires some initial work, but it automates tagging for all data sources going forward.

Follow the steps below to tune SDD from the Default Framework:

  1. Create a new identification framework.
  2. Configure the resulting tags in the identifiers.
  3. Create a new identifier specific to your organization.
  4. Add a few data sources to your new framework. This removes the tags applied by any previous identification framework and re-runs identification with your new framework. From here, either continue editing identifiers to reconfigure the applied tags or, if you are happy with the results, proceed to the next step.
  5. Configure SDD to run your new framework on all data sources.

After SDD has applied entity tags, classification frameworks will automatically reapply their tags to account for any changes to Discovered tags. It may be necessary to adjust the classification tags based on your organization's data, security, and compliance needs.
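The trickle-down effect described above can be pictured with a small sketch. This is not Immuta's implementation or API: the rule table, the tag names in it, and the `classify` function are all hypothetical, and the only assumption carried over from the text is that classification tags are derived from the Discovered tags present on a column.

```python
# Hypothetical rule table: a classification tag is applied when ALL of the
# listed Discovered tags are present on a column. Names are illustrative only.
RULES = {
    "Example.Classification.Personal": {"Discovered.Example.PersonName"},
    "Example.Classification.Contact": {"Discovered.Example.Email"},
}


def classify(discovered_tags):
    """Return the classification tags implied by a column's Discovered tags."""
    present = set(discovered_tags)
    return {cls for cls, required in RULES.items() if required <= present}


# Tags on one column before tuning SDD.
before = classify(["Discovered.Example.PersonName", "Discovered.Example.Email"])

# After tuning, the (hypothetical) email identifier no longer matches the column.
after = classify(["Discovered.Example.PersonName"])

# Classification tags that disappeared as a side effect of the SDD change.
removed = before - after
```

Removing one Discovered tag here silently drops a classification tag as well, which is why the classification results are worth reviewing after every SDD change.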

Assess your queries with Detect

Requirements:

Use the Detect dashboards to review queries at different sensitivity levels and to see which tags Immuta applied to your data source columns:

  1. Have an Immuta user who is subscribed to a data source run multiple queries against that data source in Snowflake. The user should query both non-sensitive and sensitive data.
  2. Navigate to the Audit page and click ↻Native Query Audit to pull in queries made in Snowflake.
  3. Navigate to the Events (Beta) page. Note that Snowflake has a 15-minute data latency for all audit events.
  4. Select the Event Id of one of the queries. Click the Columns tab.
  5. The Columns tab lists the columns in the query, ordered from highest to lowest sensitivity, along with the tags applied to each column. Check that the columns you know to be sensitive appear here.

    For example, if the query has a column with last names, you should see a minimum of the following tags: DSF. Personal, DSF.Record.Subject.Type.Individual, DSF.Record.Identifiability.Identifiable, and DSF.Control.Personal.

  6. Note any sensitive columns not labeled as sensitive.

  7. Repeat steps 2-6 for as many queries as you want to review.
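When reviewing many query events, the comparison in steps 5 and 6 can be tracked outside Immuta. The sketch below is a hypothetical helper, not an Immuta feature: it assumes you copy the tags from the Columns tab by hand into a dictionary, and it uses three of the tags from the last-name example as the expected minimum. The column names and the observed data are made up.

```python
# Expected minimum tags per column, taken from the last-name example above.
EXPECTED_MINIMUM = {
    "last_name": {
        "DSF.Record.Subject.Type.Individual",
        "DSF.Record.Identifiability.Identifiable",
        "DSF.Control.Personal",
    },
}


def missing_tags(observed_columns, expected=EXPECTED_MINIMUM):
    """Map each expected column to the expected tags that were not observed."""
    gaps = {}
    for column, required in expected.items():
        missing = required - set(observed_columns.get(column, []))
        if missing:
            gaps[column] = sorted(missing)
    return gaps


# Tags copied by hand from the Columns tab for one query event (made-up data).
observed = {
    "last_name": ["DSF.Control.Personal"],
    "order_id": [],
}

# Columns whose observed tags fall short of the expected minimum.
gaps = missing_tags(observed)
```

Any column that appears in `gaps` is a candidate for the "tags are missing" steps later on this page.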

Assess your data source tags

Requirement: Immuta permission GOVERNANCE or data owner

Target some data sources to manually review tags:

  1. Navigate to the data dictionary for the data source by opening the Data Sources page and selecting a data source. Click the Data Dictionary tab to open the data dictionary.
  2. The data dictionary lists the data source columns, with details about the name, data type, and a list of the tags on each column. Assess whether the tags are accurate to your data.

If you find that too many tags are applied

Tags may be unexpected but still accurate to your data. They may also have been applied because they were the best match among the identifiers in the framework.

If you want to improve SDD and personalize it to your data:

  1. Assess why the tag was applied to your data.

  2. Is the identifier incorrectly matching your data and irrelevant to your organization? Delete the identifier that applied the tag from the identification framework.

  3. Is the identifier incorrectly matching this specific column, but correct in other places? In that case, it was the closest match that identification found. Create a better match by completing the following steps:

    1. Create an identifier specific to the column with a new Discovered tag.
    2. Add the identifier to the identification framework so this column is correctly matched by identification.
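Before creating a column-specific identifier, it can help to test a candidate pattern offline against sample values. The sketch below assumes the identifier is a regular expression. The employee-ID format, the pattern, the sample values, and the `match_rate` helper are all hypothetical, and nothing here calls Immuta.

```python
import re

# Hypothetical organization-specific format: "EMP-" followed by six digits.
EMPLOYEE_ID = re.compile(r"EMP-\d{6}")


def match_rate(pattern, values):
    """Return the fraction of non-empty values that the pattern matches in full."""
    values = [v for v in values if v]
    if not values:
        return 0.0
    hits = sum(1 for v in values if pattern.fullmatch(v))
    return hits / len(values)


# Sample values from the column the identifier should match (made-up data).
target_column = ["EMP-000123", "EMP-004567", "EMP-999999"]

# Sample values from a column it should NOT match (made-up data).
other_column = ["555-010-0199", "EMP-12", ""]

target_rate = match_rate(EMPLOYEE_ID, target_column)
other_rate = match_rate(EMPLOYEE_ID, other_column)
```

A pattern that scores high on the target column and low elsewhere is less likely to introduce new unexpected tags once it is added to the framework.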

If you want to remove the unexpected tags, use one of the following how-to guides:

  1. Deactivate frameworks irrelevant to your organization.
  2. Ensure the Discovered tags are applied properly by adjusting SDD.
  3. Remove any excess tags. Note that classification tags build off of other tags, so removing a single classification or Discovered tag can have trickle-down effects on the data source.
  4. Adjust the classification framework rules using the frameworks API.

If you find that tags are missing

If you were expecting some sensitive data to be tagged and it is not, enable additional tags using one of the following how-to guides:

  1. Activate additional frameworks relevant to your organization.
  2. Ensure the Discovered tags are applied properly by adjusting SDD.
  3. Add additional tags. Note that classification tags build off of other tags, so adding a single classification or Discovered tag can have trickle-down effects on the data source.
  4. Adjust the classification framework rules using the frameworks API.

Tune your data dictionaries

Requirement: Immuta permissions GOVERNANCE and AUDIT

Tags can be edited on an individual basis for each data source. If broad changes to the classification framework are necessary to re-tag your data, use the frameworks API.

  1. Navigate to the Data Sources page and select a data source that you assessed and noted issues with.
  2. Click the Data Dictionary tab.
  3. To remove an unnecessary tag, click the tag on the column and select Disable from the tag side sheet.
  4. To add tags:
    1. Click Add Tags in the Actions column.
    2. Begin typing the name of the tag you want to add in the Search by Name field and select the tag from the dropdown list.
    3. Click Add.