Requirements:
Immuta permission GOVERNANCE
Registered Snowflake, Databricks, Redshift, or Starburst (Trino) data sources
This how-to guide is for enabling sensitive data discovery (SDD) for the first time. For additional information on sensitive data discovery, see the Data discovery page.
Navigate to the App Settings page and scroll to the Sensitive Data Discovery section.
Select the Enable Sensitive Data Discovery (SDD) checkbox to enable SDD.
Click Save and then click Confirm to apply your changes. Note that the Immuta tenant will have a system restart.
Once SDD is enabled on your tenant, SDD will automatically run when new data sources are added, but it must be manually run for all existing data sources. This allows you to test out SDD with a select few data sources without worrying that it will add tags throughout all your data sources.
For this step, you will pick the identifiers to match the data that matters to your organization. For example, for international data, you may want to enable many different identifiers for many countries, like the "Australia Passport" identifier and the "Finland National ID Number" identifier. However, if you are dealing with United States domestic financial data, those identifiers would be irrelevant. In that case, it would be better to identify the data likely to appear, like Bitcoin or US Bank Routing MICR.
First, create an empty framework,
Navigate to Discover and Identification.
Select Create New.
Enter a Name and Description for your new identification framework.
Select Create empty framework.
Then, add a new identifier to that framework,
Navigate to Discover and Identifiers.
Use the checkboxes to select all the identifiers relevant to your data. Tip: From the overview page you can see the name and the tags that will be applied by the identifier. To better understand the data it will match, click the name to read the description.
Once you have checked the identifiers you want in your framework, click Add to Framework.
Type the framework name in the text box.
Click Add to Framework.
Once you have created a framework relevant to your data, it is time to test it on your data and customize it. Run identification on a select number of data sources where you understand the data to assess and adjust the tags to reflect what you expect to see.
Add those select data sources to your new framework,
Navigate to Discover and Identification.
Click your new framework name.
Navigate to the Data Sources tab.
Click Add Data Sources.
Check the checkboxes for the select data sources you want to try SDD on.
Click Add Data Source(s).
Then, run identification on those data sources,
Navigate to Discover and Identification.
Click the action menu for your new framework.
Click Run Identification.
After identification runs, you will receive a notification that the job is complete. Then, you can view the results from the data source dictionary.
Navigate to the data source overview page of the data source you added to the framework.
Click the Data Dictionary tab.
Assess whether the Discovered tags are applied as expected.
If you are happy with the Discovered tags, follow the Assign data sources to frameworks guide to add the rest of your data sources to the framework and follow the Run identification guide to run identification on all your data sources.
If you want additional tags, follow the Create an identifier guide to create identifiers that matter to your data.