Requirement: Immuta permission GOVERNANCE
This how-to guide is for enabling sensitive data discovery (SDD). For additional information on sensitive data discovery and classification, see the Discover architecture page.
Navigate to the App Settings page and scroll to the Sensitive Data Discovery section.
Select the Enable Sensitive Data Discovery (SDD) checkbox to enable SDD.
Click Save and then click Confirm to apply your changes. Note that applying this change triggers a system restart of the Immuta tenant.
Run SDD for a select group of data sources; use one of the following options to run SDD on specific data sources:
Using the Immuta API, make the following request, specifying the data sources in the payload.
A successful request returns the status code 200 and a body with the number of jobs created from the request:
Navigate to the data source overview page of the data source you listed in the payload.
Click the Data Dictionary tab.
Assess whether the Discovered and classification tags applied are accurate.
If they are, then repeat the steps above for more of your data sources. Once a majority of your data sources appear to have accurate tags, run SDD on all your data sources. If the tags are not accurate, you will need to tune SDD and classification frameworks. See the Adjust frameworks and tags guide for instructions.
Click the Discover icon and the Identification tab in the navigation menu.
Select the more actions icon.
Select Run SDD and then select it again in the modal.
Requirement: Immuta permission GOVERNANCE
Make the following request using the Immuta API to run SDD for all data sources, specifying all as true:
A successful request returns the status code 200 and a body with the number of jobs created from the request:
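The two API requests described in this guide differ only in their payload: one lists specific data sources and the other sets all to true. The sketch below builds either payload and checks a response. It is illustrative only: the `sources` field name, the `jobsCreated` response field, and both function names are assumptions that do not come from this guide, and no endpoint is shown because the guide does not state one here. Confirm the real names against the Immuta API reference before use.

```python
from typing import Optional, Sequence


def build_sdd_run_payload(sources: Optional[Sequence[str]] = None,
                          run_all: bool = False) -> dict:
    """Build a request body for running SDD.

    "all" is the field named in this guide; "sources" is a hypothetical
    field name used here only for illustration.
    """
    if run_all and sources:
        raise ValueError("Pass either run_all=True or a list of sources, not both.")
    if run_all:
        return {"all": True}
    if not sources:
        raise ValueError("Pass at least one data source or set run_all=True.")
    return {"sources": list(sources)}


def jobs_created(status_code: int, body: dict, field: str = "jobsCreated") -> int:
    """Return the job count from a response, treating only 200 as success.

    The response field name is an assumption; check the actual body.
    """
    if status_code != 200:
        raise RuntimeError(f"SDD request failed with status code {status_code}")
    return int(body[field])
```

Keeping payload construction separate from the HTTP call makes it easy to start with a small list of data sources and switch to `run_all=True` only after the tags look accurate.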
Requirements:
Native SDD enabled and turned on
Immuta permission GOVERNANCE
Click the Discover icon in the navigation menu and select the Patterns tab.
Click Create New.
In the modal, enter a name for the new pattern.
Write a Description for the type of data the pattern will find.
Select the Type of pattern.
For regex and column name regex, enter the regex.
For dictionary, enter the values you want the pattern to match and toggle the switch on if you want them to be case-sensitive.
Click Create Pattern.
See the Manage rules page to add your new pattern to a framework.
Note that all user-created patterns must be a 90% match or greater for the contents of the column to be tagged.
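To make the 90% requirement concrete, the following sketch checks a sample of column values against a regex pattern and a dictionary pattern. It is not Immuta's implementation: the function names, the use of a full-string regex match, and the choice to ignore null values are all assumptions made for illustration. Only the 90% threshold and the case-sensitivity toggle come from this guide.

```python
import re
from typing import Callable, Iterable, Optional

MATCH_THRESHOLD = 0.90  # from the guide: user-created patterns need a 90% match


def match_ratio(values: Iterable[Optional[str]],
                predicate: Callable[[str], bool]) -> float:
    """Fraction of non-null values that satisfy the predicate (assumption: nulls are skipped)."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return sum(1 for v in present if predicate(v)) / len(present)


def regex_pattern_matches(values: Iterable[Optional[str]], regex: str) -> bool:
    """True when at least 90% of the values fully match the regex."""
    compiled = re.compile(regex)
    return match_ratio(values, lambda v: compiled.fullmatch(v) is not None) >= MATCH_THRESHOLD


def dictionary_pattern_matches(values: Iterable[Optional[str]],
                               dictionary: Iterable[str],
                               case_sensitive: bool = False) -> bool:
    """True when at least 90% of the values appear in the dictionary."""
    if case_sensitive:
        allowed = set(dictionary)
        return match_ratio(values, lambda v: v in allowed) >= MATCH_THRESHOLD
    allowed = {entry.lower() for entry in dictionary}
    return match_ratio(values, lambda v: v.lower() in allowed) >= MATCH_THRESHOLD
```

Under these assumptions, a column where 9 of 10 sampled values match would be tagged, while a column where only 8 of 10 match would not. A pattern that is too loose can clear the threshold on unrelated columns, so a specific regex is usually safer.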
Editing a pattern will affect any rule built off the pattern throughout Immuta. To edit a pattern:
Click the Discover icon in the navigation menu and select the Patterns tab.
Click the name of the pattern you want to edit.
Click Edit.
Edit the field you want to change. Note that shadowed fields are not editable; to change them, you must delete and re-create the pattern.
Click Save.
Built-in patterns cannot be edited.
Deleting a pattern will remove it from Immuta and remove all the rules that relied on it in frameworks throughout Immuta. To delete a pattern:
Click the Discover icon in the navigation menu and select the Patterns tab.
Click the three dot menu in the Action column for the pattern you want to delete.
Select Remove.
Click Confirm.
Built-in patterns cannot be deleted.
Requirement: Immuta permission APPLICATION_ADMIN
Click the App Settings icon in the left sidebar.
Click Sensitive Data Discovery in the left panel to navigate to that section.
Enter the request-friendly name of your global template in the Global SDD Template Name field. This name can be found in the tooltip on the framework's detail page.
Click Save, and then Confirm your changes.
Requirements:
Native SDD enabled and turned on
Immuta permission GOVERNANCE
You can only have one rule per pattern in the framework. If you do not see the pattern for the rule you want to create, then it already has a rule built off of it.
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select the framework you want to edit and navigate to the Discovery Rules tab.
Click Create New.
Select the Tags to apply from the dropdown. The tags you select are the tags applied when the pattern is matched. Note that resulting tags must be under the Discovered parent tag and cannot be parent tags themselves unless they have already been manually applied to a data source.
Select the Criteria type from the dropdown:
Competitive pattern analysis is for regex and dictionary patterns.
Column name is for column name patterns.
Select the Pattern from the dropdown.
Click Create Rule.
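The constraints on rule creation described above can be sketched as a small data model: one rule per pattern in a framework, a criteria type that must suit the pattern type, and resulting tags under the Discovered parent tag. The class names, the lowercase type strings, and the dotted `Discovered.` tag path are hypothetical conventions chosen for this example, not Immuta's internal representation. The exception for parent tags that were already manually applied is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

# Criteria type -> pattern types it accepts, as described in the steps above.
PATTERN_TYPES_BY_CRITERIA = {
    "competitive pattern analysis": {"regex", "dictionary"},
    "column name": {"column name regex"},
}


@dataclass(frozen=True)
class Pattern:
    name: str
    type: str  # "regex", "column name regex", or "dictionary"


@dataclass(frozen=True)
class Rule:
    pattern: Pattern
    criteria_type: str
    tags: Tuple[str, ...]


@dataclass
class Framework:
    name: str
    rules: Dict[str, Rule] = field(default_factory=dict)  # keyed by pattern name

    def available_patterns(self, patterns: Sequence[Pattern]) -> List[Pattern]:
        """Patterns that do not yet have a rule in this framework."""
        return [p for p in patterns if p.name not in self.rules]

    def create_rule(self, pattern: Pattern, criteria_type: str,
                    tags: Sequence[str]) -> Rule:
        if pattern.name in self.rules:
            raise ValueError(f"Pattern {pattern.name!r} already has a rule in this framework")
        allowed = PATTERN_TYPES_BY_CRITERIA.get(criteria_type)
        if allowed is None or pattern.type not in allowed:
            raise ValueError(f"Criteria type {criteria_type!r} does not fit pattern type {pattern.type!r}")
        # Assumption: tag hierarchy is written as a dotted path under "Discovered".
        if not tags or any(not t.startswith("Discovered.") for t in tags):
            raise ValueError("Resulting tags must be under the Discovered parent tag")
        rule = Rule(pattern, criteria_type, tuple(tags))
        self.rules[pattern.name] = rule
        return rule
```

The `available_patterns` method mirrors the behavior of the Pattern dropdown: a pattern that already backs a rule no longer appears as an option.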
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select the framework of the rule you want to edit and navigate to the Discovery Rules tab.
Select the rule you want to edit.
Click Edit.
Edit the field you want to change. Note that shadowed fields are not editable; to change them, you must delete and re-create the rule.
Click Save.
Deleting a rule removes the tags previously applied by that rule the next time SDD runs on a data source. To delete a rule:
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select the framework you want to edit and navigate to the Discovery Rules tab.
Click the three dot menu in the Action column for the rule you want to delete.
Select Remove.
Click Confirm.
This guide provides information and best practices for migrating from the deprecated legacy sensitive data discovery (SDD) option to the improved native SDD. This guide is for users who have already enabled SDD on their tenant and have Discovered tags on their data sources.
Legacy SDD is deprecated. It will be removed and replaced by native SDD. Native SDD discovers and tags your data significantly better than legacy SDD, in part because of upgrades to the built-in patterns. Its greatest benefit is respect for data residency: native SDD doesn't move any of your data when running. Discovery happens in your data platform, and the platform returns only the matching patterns and column names to Immuta.
See the native SDD documentation for more information.
Native SDD requires Snowflake, Databricks, Redshift, or Starburst (Trino) data sources
Legacy SDD enabled on your tenant
Legacy SDD tags applied to your data sources: To find out if you have legacy SDD tags applied, create a governance report for sensitive data discovery.
Contact your Immuta representative to enable native SDD on your Immuta tenant. Note that unless specifically disabled, all Immuta installations after the 2024.2 LTS have native SDD automatically enabled. If you want to check for yourself whether native SDD is already running and tagging your data before you reach out to your representative, proceed to the tag assessment steps below.
This action will not change anything immediately on your tenant; however, anytime SDD runs in the future, it will be native SDD instead of the legacy version.
To assess native SDD for your data, proceed with the steps below. If you do not review native SDD, the legacy SDD tags will all remain on your data source columns. However, when SDD runs on new data sources and columns, it will apply native SDD tags, and because of the improvements to SDD, it may tag different data than legacy SDD did.
Requirement: Immuta permission GOVERNANCE
To check the tags on an individual data source, navigate to the data source data dictionary and select a Discovered tag. On the tag side sheet, you can determine the context of the tag. When patterns match data, native SDD will apply tags, and their tag context will be Sensitive Data Discovery. Any tags with the context Legacy Sensitive Data Discovery were not matched by native SDD but will remain on the data source.
To check your tags globally, navigate to the governance reports page and build a report for sensitive data discovery. This report will present the legacy tags on your data sources' columns and native SDD tags that are also on those columns. Use this report to assess the context of the Discovered tags and understand if native SDD is matching the data you want it to.
These actions will allow you to understand the differences between how native SDD and legacy SDD tag your data and whether your data is recognized as expected by native SDD or if legacy SDD was over-tagging your data. This way you can better tune SDD to your data.
If there are any legacy SDD tags that you want native SDD to catch, you need to tune native SDD so that this type of data is discovered in future tables and columns; see guidance on that in the next section.
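The comparison described above can be sketched as a grouping over report rows. The two context strings come from this guide; the row layout (dictionaries with `data_source`, `column`, `tag`, and `context` keys) and the function name are hypothetical, so adapt them to whatever format your exported governance report uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

NATIVE_CONTEXT = "Sensitive Data Discovery"
LEGACY_CONTEXT = "Legacy Sensitive Data Discovery"


def columns_to_review(rows: Iterable[dict]) -> Dict[Tuple[str, str], Dict[str, List[str]]]:
    """Group Discovered tags by column and keep columns that still carry legacy tags.

    Each row is assumed to look like:
    {"data_source": ..., "column": ..., "tag": ..., "context": ...}
    """
    by_column: Dict[Tuple[str, str], Dict[str, List[str]]] = defaultdict(
        lambda: {"native": [], "legacy": []}
    )
    for row in rows:
        key = (row["data_source"], row["column"])
        if row["context"] == NATIVE_CONTEXT:
            by_column[key]["native"].append(row["tag"])
        elif row["context"] == LEGACY_CONTEXT:
            by_column[key]["legacy"].append(row["tag"])
    # Only columns with at least one legacy-context tag need a decision.
    return {key: tags for key, tags in by_column.items() if tags["legacy"]}
```

Columns returned with an empty `native` list are the ones where legacy SDD tagged data that native SDD did not match; these are the candidates for either tuning native SDD or concluding that legacy SDD was over-tagging.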
Requirement: Immuta permission GOVERNANCE
Using the report you built above, complete these actions to tune SDD:
Focus on a legacy SDD tag that was properly applied to your data. Assess whether the native SDD tag applied to that column instead is more accurate than the legacy tag. If the native tag is inaccurate, proceed to the next step.
Complete the steps above for all legacy SDD tags.
Completing the actions above will create parity between what legacy SDD was tagging your data and what native SDD will tag in the future.
Requirements:
Native SDD enabled and turned on
Registered
Immuta permission GOVERNANCE
Click the Discover icon in the navigation menu and select the Frameworks tab.
Click Create New.
Enter a Name for the framework.
Enter a Description for the framework.
Select the option to Create empty framework.
Click Create.
After you create the framework, you can add discovery rules to it.
Click the Discover icon in the navigation menu and select the Frameworks tab.
Click Create New.
Enter a Name for the framework.
Enter a Description for the framework.
Select the option to Create rules from an existing framework.
Select the checkbox for the framework you want to copy. You can only copy a single framework. For more information about a framework, click the framework name to open a new tab with details about the framework.
Click Create.
To assign a framework to run on specific data sources:
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select the framework you want to assign and navigate to the Data Sources tab.
Click Add Data Sources.
Select the checkbox for the data source you want this framework to run on. You may select more than one.
Click Add Data Source(s).
After a data source is removed from a framework, it will use the global framework for any SDD scans, and the tags applied by the removed framework will be replaced. To remove data sources from a framework:
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select the framework you want to remove data sources from and navigate to the Data Sources tab.
Select the checkbox for the data source you want to remove from the framework. You may select more than one.
Select the Bulk Actions more options menu.
Select Remove Data Sources.
Click Confirm.
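The fallback behavior described above (a data source uses its assigned framework, otherwise the global framework) can be sketched with a plain mapping. The function names and the dictionary representation are hypothetical and are meant only to illustrate which framework an SDD scan would use after data sources are removed.

```python
from typing import Dict, Iterable


def effective_framework(data_source: str,
                        assignments: Dict[str, str],
                        global_framework: str) -> str:
    """Framework used for SDD scans: the assigned one, else the global framework."""
    return assignments.get(data_source, global_framework)


def remove_data_sources(assignments: Dict[str, str],
                        framework: str,
                        data_sources: Iterable[str]) -> Dict[str, str]:
    """Return new assignments with the given data sources removed from a framework."""
    to_remove = set(data_sources)
    return {
        ds: fw
        for ds, fw in assignments.items()
        if not (fw == framework and ds in to_remove)
    }
```

For example, if `orders` is assigned to a framework named `finance` and is then removed from it, `effective_framework("orders", ...)` returns the global framework, which matches the note that the tags applied by the removed framework will be replaced on the next scan.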
Deleting a framework will remove it from any data sources. Those data sources will then use the global framework for any SDD scans, and the tags applied by the deleted framework will be replaced. Governors can delete any framework, and users with the CREATE_DATA_SOURCE or CREATE_DATA_SOURCE_IN_PROJECT permissions can only delete frameworks they created. To delete a framework:
Click the Discover icon in the navigation menu and select the Frameworks tab.
Select Remove.
Click Confirm.
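The deletion rules in this section can be summarized as a small permission check. The permission names are the ones used in this guide, and the restriction on the global framework also comes from this guide. The function signature and the idea of tracking a framework's creator as a plain string are assumptions made for illustration.

```python
from typing import Set

CREATOR_PERMISSIONS = {"CREATE_DATA_SOURCE", "CREATE_DATA_SOURCE_IN_PROJECT"}


def can_delete_framework(user: str,
                         permissions: Set[str],
                         framework_creator: str,
                         is_global: bool) -> bool:
    """Decide whether a user may delete a framework under the rules in this guide."""
    if is_global:
        # The global framework cannot be deleted by anyone.
        return False
    if "GOVERNANCE" in permissions:
        # Governors can delete any non-global framework.
        return True
    # Data source creators can delete only the frameworks they created.
    return bool(permissions & CREATOR_PERMISSIONS) and user == framework_creator
```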
Requirements:
Native SDD enabled and turned on
Registered
Immuta permission GOVERNANCE
SDD runs automatically, but if you want to re-run it after a new global framework is set or new rules have been added, you can manually run it for all data sources through the UI:
Click the Discover icon and the Identification tab in the navigation menu.
Select the more actions icon.
Select Run SDD and then select it again in the modal.
SDD runs automatically, but if you want to re-run it after a new global framework is set or new rules have been added, you can manually run it for a specific data source through the UI:
Navigate to the data source overview page.
Click the health status.
Select Re-run next to Sensitive Data Discovery (SDD).
Verify discovered tags
If sensitive data discovery has been enabled, then manually adding tags to columns in the data dictionary will be unnecessary in most cases. The data owner will just need to verify that the Discovered tags are correct.
If a governor, data owner, or data source expert disables a Discovered tag from the data dictionary, the column will not be re-tagged when that data source's fingerprint is recalculated or SDD is re-run. When a Discovered tag is disabled, the tag will not completely disappear, so it can be manually enabled through the tag side sheet.
To disable a Discovered tag:
Navigate to a data source and click the Data Dictionary tab.
Scroll to the column you want to remove the tag from and click the tag you want to remove.
Click Disable in the side sheet and then click Confirm.
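The behavior of disabled Discovered tags can be sketched as follows: a re-run of SDD adds newly matched tags but leaves a disabled tag disabled until someone manually enables it again. The mapping of tag name to an enabled flag and both function names are hypothetical; they illustrate the described behavior rather than Immuta's implementation.

```python
from typing import Dict, Iterable


def rerun_sdd(column_tags: Dict[str, bool],
              matched_tags: Iterable[str]) -> Dict[str, bool]:
    """Apply matched Discovered tags to a column without re-enabling disabled ones.

    column_tags maps a tag name to True (enabled) or False (disabled).
    """
    updated = dict(column_tags)
    for tag in matched_tags:
        if tag in updated and not updated[tag]:
            continue  # a disabled tag stays on the column but remains disabled
        updated[tag] = True
    return updated


def enable_tag(column_tags: Dict[str, bool], tag: str) -> Dict[str, bool]:
    """Manually re-enable a previously disabled tag, as done from the tag side sheet."""
    if tag not in column_tags:
        raise KeyError(f"Tag {tag!r} is not on this column")
    updated = dict(column_tags)
    updated[tag] = True
    return updated
```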
Run native SDD on your data sources.
Create a new pattern to discover this data. Ensure it is specific and will match your data with 90% confidence.
Create a rule in your framework using the new pattern and the Discovered tag you want applied to the data.
Retest your updated rules and patterns by re-running SDD, and continue refining until you reach the level of accuracy you want.
Click the three dot menu in the Action column for the framework you want to delete. Note that the global framework cannot be deleted. If you want to delete it, first set a different framework as the global template.