Requirements:
Immuta permission GOVERNANCE
To activate a classification framework,
Navigate to Discover and select the Classification tab.
Click the more actions icon in the Actions column for the framework you want to activate.
Select Activate.
To deactivate a classification framework,
Navigate to Discover and select the Classification tab.
Click the more actions icon in the Actions column for the framework you want to deactivate.
Select Deactivate.
To activate a framework using the Immuta API, see the Frameworks API reference page.
Requirements:
Immuta permission GOVERNANCE
Immuta Discover provides identifiers out-of-the-box to recognize and tag data. Users can then use and build classification frameworks that apply tags based on those identifier tags and their own catalog tags.
Tune identification frameworks and identifiers first to adjust where Discovered tags are applied. Because classification frameworks can apply classification tags from the Discovered tags, tuning sensitive data discovery (SDD) should come first and will have trickle-down effects on classification. Customizing SDD requires some initial work but will automate data tagging for all data sources in the future.
Follow the steps below to tune SDD for your data:
Add a few data sources to your new framework: This will remove the tags from any previous identification frameworks and re-run identification with your new framework. From here, either continue to edit identifiers to reconfigure the applied tags, or if you are happy with the results, proceed to the next step.
After SDD has applied entity tags, any active classification frameworks will automatically reapply their tags to account for any changes to Discovered tags. It may be necessary to adjust the classification tags based on your organization's data, security, and compliance needs.
Requirement: Immuta permission GOVERNANCE or data owner
Target some data sources to manually review tags:
Navigate to the data dictionary for the data source by opening the Data Sources page and selecting a data source. Click the Data Dictionary tab to open the data dictionary.
The data dictionary lists the data source columns, with details about the name, data type, and a list of the tags on each column. Assess whether the tags are accurate to your data.
Tags may be unexpected but still accurate to your data. Additionally, they may have been applied because they were found to be the best match from the identifiers in the framework.
If you want to improve SDD and personalize it to your data, assess why the tag was applied to your data:
Is the identifier incorrectly matching your data and irrelevant to your organization? Delete the identifier that applied the tag from the identification framework.
Is the identifier incorrectly matching this specific column, but correct in other places? It was likely the best match that identification found among the identifiers in the framework. Create a better match by completing the following steps:
Add the identifier to the identification framework so this column is correctly matched by identification.
If you want to remove the unexpected tags, use one of the following how-to guides:
Ensure the Discovered tags are applied properly by adjusting SDD.
Remove any excess tags. Note that classification tags build off of other tags, so removing a single classification or Discovered tag can have trickle-down effects on the data source.
If you were expecting some sensitive data to be tagged and it is not, enable additional tags using one of the following how-to guides:
Ensure the Discovered tags are applied properly by adjusting SDD.
Add additional tags. Note that classification tags build off of other tags, so adding a single classification or Discovered tag can have trickle-down effects on the data source.
Requirement: Immuta permissions GOVERNANCE and AUDIT
Tags can be edited on an individual basis for each data source. If broad changes to the classification framework are necessary to re-tag your data, use the frameworks API.
Navigate to the Data Sources page and select a data source that you assessed and in which you noted issues.
Click the Data Dictionary tab.
Delete unnecessary tags by clicking the tag you want to remove from the column and selecting Disable from the tag side sheet.
To add tags,
Click Add Tags in the Actions column.
Begin typing the name of the tag you want to add in the Search by Name field and select the tag from the dropdown list.
Click Add.
After you have configured a data catalog integration and registered data sources in Immuta, you can start automating data classification of a column based on its context, which is the combination of
associated tags already applied to the column
tags applied to the neighboring columns and
table tags on the data source.
The starter framework in this how-to is built to map a classification scale of restricted, confidential, internal, and public to Immuta's three-level scale, which can be visualized in the data source and query event dashboards.
Follow this guide to map your external catalog tags to the example framework, or consult the for more information about the framework schema.
Requirement: An external catalog configured in Immuta
Using the example framework below, customize the framework for your organization's classification tags:
tags: These tags are automatically created in Immuta with the sensitivity you assign. They must not already exist in Immuta. All tags used in the classificationTag parameter should be defined here.
tags.sensitivities: This is metadata for the sensitivity of the new tag. Use confidentiality for dimension. Options for sensitivity are 1 (shown as sensitive in Detect dashboards) and 2 (shown as highly sensitive in Detect dashboards). For nonsensitive, leave this parameter empty.
rules: These are the rules for applying the tags defined above. Each rule contains the classification tag to apply if the requirements are met and the requirements: the column tags, neighboring column tags, and table tags that must be present. All requirements within each defined rule must be met for the classification tag to be applied.
rules.classificationTag: The name and source of the tag you want applied if the rule requirements are met. This classification tag must be defined in tags. The source is curated.
rules.columnTags: These are the required tags for a column. If the tags defined here are found on a column, and the other tag rules are met, then the rule's classificationTag will be applied to the same column.
rules.neighborColumnTags: These are the required tags on other columns in the data source (or in the query if dynamic query classification is enabled). If the tags defined here are found on any column in the data source, and the other tag rules are met, then the rule's classificationTag will be applied to all the neighboring columns.
rules.tableTags: These are the required tags on the data source. If the tags defined here are found on the data source, and the other tag rules are met, then the rule's classificationTag will be applied to all the columns in that data source.
active: When true, the framework is active and will apply tags when the rules are met.
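As a rough illustration of how these parameters fit together, the sketch below builds a framework payload as a Python dictionary and checks that every rule's classificationTag is defined in tags. The exact JSON shape is an assumption inferred from the parameter names above (for example, that sensitivities is a list and that each tag has a name key). The Medium and Low tag names are hypothetical, and a real payload likely needs additional identifying fields that are not described here.

```python
import json

# Hypothetical payload; only the parameters described above are included.
# The nesting (lists of objects with "name"/"source" keys) is an assumption.
framework = {
    "tags": [
        {
            "name": "RAF.Confidentiality.High",
            # 2 is shown as highly sensitive in Detect dashboards
            "sensitivities": [{"dimension": "confidentiality", "sensitivity": 2}],
        },
        {
            "name": "RAF.Confidentiality.Medium",  # hypothetical tag name
            # 1 is shown as sensitive in Detect dashboards
            "sensitivities": [{"dimension": "confidentiality", "sensitivity": 1}],
        },
        {
            "name": "RAF.Confidentiality.Low",  # hypothetical tag name
            "sensitivities": [],  # left empty for nonsensitive
        },
    ],
    "rules": [
        {
            "classificationTag": {"name": "RAF.Confidentiality.High", "source": "curated"},
            "columnTags": [
                # the source of this Discovered tag is an assumption
                {"name": "DSF.Interpretation.Credentials.Secret", "source": "curated"}
            ],
        }
    ],
    "active": True,
}


def undefined_classification_tags(framework):
    """Return classification tag names used in rules but not defined in tags."""
    defined = {tag["name"] for tag in framework.get("tags", [])}
    used = {rule["classificationTag"]["name"] for rule in framework.get("rules", [])}
    return sorted(used - defined)


if __name__ == "__main__":
    print(json.dumps(framework, indent=2))
    print("Undefined classification tags:", undefined_classification_tags(framework))
```

Running a check like this before submitting a customized framework can catch a rule that references a classification tag missing from tags.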
Follow the example below to map your external tags to the rules in the example framework.
This example framework has a rule where columns tagged DSF.Interpretation.Credentials.Secret by sensitive data discovery will be tagged RAF.Confidentiality.High:
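A rule of this kind might look like the following sketch, written as a Python dictionary and printed as JSON. The key names follow the parameters described above, but the exact nesting and the source value of the Discovered tag are assumptions rather than a copy of the original example.

```python
import json

# Sketch of the rule described above; the JSON shape is assumed.
secret_rule = {
    # Tag to apply when the requirements are met; its source is curated.
    "classificationTag": {"name": "RAF.Confidentiality.High", "source": "curated"},
    # Tags that must be present on the column (source value is an assumption).
    "columnTags": [
        {"name": "DSF.Interpretation.Credentials.Secret", "source": "curated"}
    ],
}

if __name__ == "__main__":
    print(json.dumps(secret_rule, indent=2))
```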
To translate this to your tags, replace the name and source values of the columnTags, neighborColumnTags, or tableTags with your own. This new example is for a Collibra tag that an organization uses for confidential data. This rule now states: Apply the classification tag RAF.Confidentiality.High to a column if it has the collibra tag Confidential. Repeat this for your organization's remaining classification levels.
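The substitution can be sketched in Python as follows. The helper with_column_tags is hypothetical and exists only for illustration, and the rule shape (objects with name and source keys) is an assumption based on the parameter descriptions above.

```python
import copy


def with_column_tags(rule, column_tags):
    """Return a copy of the rule whose columnTags are replaced by column_tags."""
    updated = copy.deepcopy(rule)
    updated["columnTags"] = [dict(tag) for tag in column_tags]
    return updated


# Rule that keys off a Discovered tag (shape and source value are assumed).
original_rule = {
    "classificationTag": {"name": "RAF.Confidentiality.High", "source": "curated"},
    "columnTags": [
        {"name": "DSF.Interpretation.Credentials.Secret", "source": "curated"}
    ],
}

# Same rule, now keyed off the organization's Collibra tag for confidential data.
collibra_rule = with_column_tags(
    original_rule, [{"name": "Confidential", "source": "collibra"}]
)

if __name__ == "__main__":
    print(collibra_rule)
```

The classificationTag stays the same. Only the requirement changes from the Discovered tag to the external catalog tag.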
name and source for your tags
If you do not know the name or source for your tags, you can list your tags using the Immuta API:
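A minimal sketch of building such a request with Python's standard library is shown here. The /tag path, the bearer-token Authorization header, and the example host are assumptions for illustration only. Confirm the actual endpoint and authentication scheme in the Immuta API reference before use.

```python
import urllib.request


def build_list_tags_request(base_url, api_key):
    """Build (but do not send) a GET request that lists tags.

    ASSUMPTIONS: the tag listing endpoint is <base_url>/tag and the API key
    is passed as a bearer token; verify both against the API reference.
    """
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tag",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    # Hypothetical host and key, for illustration only.
    request = build_list_tags_request("https://your-immuta.example.com", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```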
This request will list all the tags in your Immuta environment, similar to this example response:
Requirement: Immuta permission GOVERNANCE
Once you have made all the customizations to the example framework, make the following request using the Immuta API, with your full customized framework as the payload.
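The request could be assembled along these lines with Python's standard library. This is a sketch under stated assumptions: the /frameworks path, the bearer-token header, and the example host are not taken from the text above and should be checked against the Frameworks API reference. The framework argument is your full customized framework as a dictionary.

```python
import json
import urllib.request


def build_create_framework_request(base_url, api_key, framework):
    """Build (but do not send) a POST request carrying the framework as JSON.

    ASSUMPTIONS: the endpoint is <base_url>/frameworks and the API key is
    passed as a bearer token; verify both against the API reference.
    """
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/frameworks",
        data=json.dumps(framework).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder payload; substitute your customized framework.
    payload = {"tags": [], "rules": [], "active": True}
    request = build_create_framework_request(
        "https://your-immuta.example.com", "YOUR_API_KEY", payload
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```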
Your new framework will now be visible in the Immuta UI by navigating to the Classification section under Discover.
For more information about these parameters see the .
Classification is the process in which data is categorized by the content and the associated risk level based on context. Classification complements sensitive data discovery (SDD), and the tags classification applies can give additional information in the Detect dashboards for data sources.
Activate classification frameworks: Use the API to activate a classification framework.
How to use a classification framework with your own tags: Create a classification framework using a provided template.
Classification frameworks: This reference guide describes classification frameworks and how classification works in Immuta.
Classification is the process in which data is categorized by the content and the associated risk level based on context. To classify your data, Discover evaluates your data in two phases:
Sensitive data discovery (SDD) runs to identify your data by content type. The data is discovered and evaluated by the identifier it matches and is tagged.
Classification runs to classify your data by its context. The data is classified by the rules within a framework and the tags currently applied to the column and table. Once the data is classified, it is tagged with special tags that carry additional metadata, which the Detect dashboards use to show sensitivity and to visualize when that sensitive data is accessed.
Both phases of classification in Immuta can be customized to find and tag the data your organization cares about. After data is classified, classification tags can be used to build Secure policies or visualize sensitive data access in Detect dashboards.
Using Discover classification to assign risk and sensitivity levels to your data and Detect dashboards to visualize the risk levels offers these benefits:
Increasing the semantic understanding of your data to better meet compliance requirements
Reducing the time to make decisions about what data access is allowed for what purposes
Reducing the effort and time to respond to auditors about data access in your company
Reducing the labor of classifying data to enumerate what data is within the scope of security or regulatory compliance frameworks
Both entity and classification tags describe the content of data on a per-column basis, and you can use them to monitor data access and build access policies. However, there are key differences between the two:
Entity tags are applied through identification and describe what the data is. SDD applies entity tags to columns based on the patterns of the data.
Classification tags are applied through categorization and risk assessment and describe the context of the data and the risk it poses. Using classification frameworks, classification tags are applied to columns based on the entity tags previously applied by SDD. Additional classification tags can then be applied, providing even more context or expressing the property of the record rather than just the column.
Entity tags describe the contents of individual columns, in isolation. But you don't access individual columns in isolation, so why would you determine their sensitivity that way? Entity tags do not attempt to and cannot contextualize column contents with neighboring columns' contents. This means that connections between data are lost if they cannot be identified through a pattern within the column itself. Classification tags describe the contents of a table with the context of all its columns, providing a holistic view of the risk of the data for what it is, rather than the pattern it fits. Context is necessary to understand whether your data is public or private data, risky or safe to have ungoverned access, or sensitive and creating toxic joins when accessed with other tables.
For example, under HIPAA, a list of procedures a doctor performed is only considered protected health information (PHI) if it can be associated with the identity of patients. Since entity tagging operates on a single column-by-column basis, it cannot reason whether or not a column containing procedure codes merits classification as PHI. Therefore, entity tagging will not tag procedure codes as PHI. But classification tagging will tag it PHI if it detects patient identity information in the other columns of the table.
Additionally, entity tagging does not indicate how sensitive the data is, but classification tags can carry a sensitivity level. For example, an entity tag may identify a column that contains telephone numbers, but the entity tag alone cannot say that the column is sensitive. A phone number associated with a person may be classified as sensitive, while the publicly listed phone number of a company might not be considered sensitive.
After you understand what entities your data contains using SDD, you need to adopt frameworks that determine what combinations of data constitute sensitive data and their level of sensitivity.
A framework is a set of data categories and a set of classification rules that place data into those categories. In Immuta, the data categories are represented by tags, and when data fits a classification rule the tag is applied:
Classification tags are applied based on the Discovered tags from SDD or other tags on the data source. Classification tags contain additional metadata about each column, such as the source of the tag, the dimension, and the sensitivity level. This metadata is used in the framework rules and complex formulas that assign the sensitivity of queries visible in Detect dashboards.
Classification rules determine how each classification tag is applied. These rules can apply tags based on tags already on the column, tags applied to neighboring columns, and tags applied to the data source. This means that the complete data source is considered when classifying your data sources, and even tags applied to individual columns can affect the risk level of the entire data source.
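To make the rule logic concrete, here is a simplified sketch of how a rule could be evaluated against a data source. It is not Immuta's implementation. Tags are reduced to plain strings, every tag name is hypothetical, and it assumes that each required neighbor tag must appear on some column other than the one being tagged. It mirrors the HIPAA example: a procedure code column is tagged PHI only when patient identity appears in a neighboring column.

```python
def classify(columns, table_tags, rules):
    """Return {column name: set of classification tags} for one data source.

    columns maps column names to the set of tags already on each column.
    A rule applies to a column only if all of its requirements are met.
    """
    applied = {name: set() for name in columns}
    for rule in rules:
        # Table requirement: all required tags must be on the data source.
        if not set(rule.get("tableTags", [])) <= set(table_tags):
            continue
        for name, tags in columns.items():
            # Column requirement: all required tags must be on this column.
            if not set(rule.get("columnTags", [])) <= set(tags):
                continue
            # Neighbor requirement: required tags must be on other columns.
            neighbor_tags = set()
            for other, other_tags in columns.items():
                if other != name:
                    neighbor_tags |= set(other_tags)
            if set(rule.get("neighborColumnTags", [])) <= neighbor_tags:
                applied[name].add(rule["classificationTag"])
    return applied


# Hypothetical tag names, for illustration only.
phi_rule = {
    "classificationTag": "PHI",
    "columnTags": ["Procedure Code"],
    "neighborColumnTags": ["Person Name"],
}

with_patients = {
    "patient_name": {"Person Name"},
    "procedure_code": {"Procedure Code"},
    "room": set(),
}
without_patients = {"procedure_code": {"Procedure Code"}, "room": set()}

if __name__ == "__main__":
    print(classify(with_patients, [], [phi_rule]))
    print(classify(without_patients, [], [phi_rule]))
```

In the first data source only procedure_code receives the PHI tag. In the second, no column does, because no neighboring column identifies a patient.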
Frameworks are often built off of an interpretation of regulatory frameworks or standards, such as the US Health Insurance Portability and Accountability Act (HIPAA) and the PCI standard. However, organizations can also build frameworks that represent their internal business processes. When used in Immuta, they automate data tagging and provide information about what data you have immediately after it is registered in Immuta.
Data classification is a process, and with Immuta, much of it is automated. This means that you can reap the benefits of classified and tagged data more quickly and easily than by classifying and tagging it manually:
Quick data access control: Use Discover to identify and classify your data immediately after registration in Immuta. Then, build Secure governance policies off of those tags. This repeatable process will protect your data in its current state and whenever any new data sources are created. Automate the process further with schema monitoring; schema monitoring allows you to register data just once. Then, Immuta will monitor your data environment for changes and, when found, update the data source in Immuta, update the tags on that data source, and then update user access based on your governance policies when changes happen.
Scale your data monitoring: Use Discover to identify and classify your data immediately after registration in Immuta. Then, view your data users' access to your sensitive and risky data through the Detect dashboards.
Build data platform compliance: Create classification frameworks to identify and classify your data based on the industry practices and regulations your organization needs to abide by. Once the frameworks are built, they will automatically tag data as it's registered, ensuring your data sources are properly tagged to abide by the regulations you care about.