# Classification Frameworks Reference Guide

Classification is the process of categorizing data by its content and by the risk level that its context implies. To classify your data, Immuta evaluates it in two phases:

1. Identification determines the content type of your data. Immuta evaluates the data against identifiers and tags it according to the identifier it matches.
2. Classification determines the context of your data. Immuta classifies the data using the rules within a framework and the tags currently applied to the column and table. Once the data is classified, it is tagged with classification tags that carry additional metadata. The [audit dashboards](/latest/governance/detect-your-activity/detection/detect-concept.md) use this metadata to report sensitivity and to visualize when sensitive data is accessed.
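
The two phases can be pictured with a minimal Python sketch. This is not Immuta's implementation or API: the tag names (`Entity.Email`, `Class.PersonalData`), the regular expressions, and the `identify` and `classify` functions are all hypothetical and exist only to show the order of operations.

```python
import re

# Hypothetical identifiers: each maps an illustrative entity tag to a pattern.
IDENTIFIERS = {
    "Entity.Email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "Entity.PhoneNumber": re.compile(r"\+?\d[\d\-\s]{7,}\d"),
}


def identify(columns):
    """Phase 1: tag each column with the identifiers all of its samples match."""
    entity_tags = {}
    for name, samples in columns.items():
        entity_tags[name] = {
            tag
            for tag, pattern in IDENTIFIERS.items()
            if samples and all(pattern.fullmatch(value) for value in samples)
        }
    return entity_tags


def classify(entity_tags):
    """Phase 2: add classification tags based on the tags already applied."""
    classified = {}
    for name, tags in entity_tags.items():
        result = set(tags)
        # Illustrative rule: contact details count as personal data.
        if tags & {"Entity.Email", "Entity.PhoneNumber"}:
            result.add("Class.PersonalData")
        classified[name] = result
    return classified


columns = {"contact": ["a@example.com", "b@example.org"], "notes": ["hello"]}
tags = classify(identify(columns))
```

Classification here never looks at the raw values. It works only from the tags that identification produced, which mirrors the split between the two phases described above.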

Both phases of classification in Immuta can be customized to find and tag the data your organization cares about. After data is classified, classification tags can be used to [build policies](/latest/governance/author-policies-for-data-access-control.md) or [visualize sensitive data access in audit dashboards](/latest/governance/detect-your-activity/detection/detect-concept.md).

Using classification to assign risk and sensitivity levels to your data, and audit dashboards to visualize those levels, offers these benefits:

* Increasing the semantic understanding of your data to better meet compliance requirements
* Reducing the time to make decisions about what data access is allowed for what purposes
* Reducing the effort and time to respond to auditors about data access in your company
* Reducing the labor of classifying data to enumerate what data is within the scope of security or regulatory compliance frameworks

## What is the difference between entity tags and classification tags?

Both entity and classification tags describe the content of data on a per-column basis, and you can use them to [monitor data access](/latest/governance/detect-your-activity/detection/detect-concept.md) and [build access policies](/latest/governance/author-policies-for-data-access-control.md). However, there are key differences between the two:

* Entity tags are applied through identification and describe what the data is. Identification applies entity tags to columns based on the patterns of the data.
* Classification tags are applied through categorization and risk assessment and describe the context of the data and the risk it poses. Classification frameworks apply classification tags to columns based on the entity tags that identification previously applied. Additional classification tags can then be applied to provide more context or to express a property of the record rather than just the column.

### Why isn’t entity tagging sufficient for classification?

Entity tags describe the contents of individual columns, in isolation. But you don't access individual columns in isolation, so why would you determine their sensitivity that way? Entity tags do not attempt to and cannot contextualize column contents with neighboring columns' contents. This means that connections between data are lost if they cannot be identified through a pattern within the column itself. Classification tags describe the contents of a table with the context of all its columns, providing a holistic view of the risk of the data for what it is, rather than the pattern it fits. Context is necessary to understand whether your data is public or private data, risky or safe to have ungoverned access, or sensitive and creating toxic joins when accessed with other tables.

For example, under HIPAA, a list of procedures a doctor performed is only considered protected health information (PHI) if it can be associated with the identity of patients. Since entity tagging operates column by column, it cannot determine whether a column containing procedure codes merits classification as PHI. Therefore, entity tagging will not tag procedure codes as PHI. Classification tagging, however, will tag the column as PHI if it detects patient identity information in the other columns of the table.
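
The HIPAA example can be expressed as a context-dependent rule. The sketch below is illustrative only: the tag names and the `classify_phi` function are assumptions made for this example, not tags or functions that Immuta ships.

```python
# Hypothetical entity tags that reveal who a record is about.
IDENTITY_TAGS = {"Entity.PersonName", "Entity.PatientId"}


def classify_phi(table):
    """Add an illustrative Class.PHI tag to procedure-code columns.

    `table` maps each column name to the set of entity tags on that column.
    The tag is added only when some column in the same table carries
    patient identity information.
    """
    has_identity = any(tags & IDENTITY_TAGS for tags in table.values())
    classified = {}
    for column, tags in table.items():
        result = set(tags)
        if has_identity and "Entity.ProcedureCode" in tags:
            result.add("Class.PHI")
        classified[column] = result
    return classified


with_patients = {
    "patient_name": {"Entity.PersonName"},
    "procedure": {"Entity.ProcedureCode"},
}
procedures_only = {"procedure": {"Entity.ProcedureCode"}, "cost": set()}
```

Calling `classify_phi(with_patients)` adds the PHI tag to `procedure`, while `classify_phi(procedures_only)` leaves the same column untagged, because no neighboring column identifies a patient.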

Additionally, entity tagging does not indicate how sensitive the data is, but classification tags can carry a sensitivity level. For example, an entity tag may identify a column that contains telephone numbers, but the entity tag alone cannot say that the column is sensitive. A phone number associated with a person may be classified as sensitive, while the publicly listed phone number of a company might not be considered sensitive.
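
As a rough illustration of the phone number case, the sketch below assigns a sensitivity level from the tags on neighboring columns. The two-level `Sensitivity` scale, the tag names, and the `phone_sensitivity` function are hypothetical; a real framework may use different levels and rules.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Illustrative two-level scale; not Immuta's actual levels."""
    NOT_SENSITIVE = 0
    SENSITIVE = 1


def phone_sensitivity(column_tags, neighbor_tags):
    """Return a sensitivity level for a phone number column, or None.

    The entity tag alone says only that the column holds phone numbers.
    The level depends on whether a neighboring column identifies a person.
    """
    if "Entity.PhoneNumber" not in column_tags:
        return None
    if "Entity.PersonName" in neighbor_tags:
        return Sensitivity.SENSITIVE
    return Sensitivity.NOT_SENSITIVE
```

The same `Entity.PhoneNumber` tag yields `SENSITIVE` next to a person's name and `NOT_SENSITIVE` next to, say, a hypothetical `Entity.CompanyName` tag.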

After you understand what entities your data contains using identification, you need to adopt frameworks that determine what combinations of data constitute sensitive data and their level of sensitivity.

## What is a framework?

A framework is a set of data categories and a set of classification rules that place data into those categories. In Immuta, the data categories are represented by tags, and when data fits a classification rule, the corresponding tag is applied:

* **Classification tags** are applied based on the tags from identification or other tags on the data source. Classification tags contain additional metadata about each column, such as the source of the tag, the dimension, and the sensitivity level. This metadata is used in the framework rules and complex formulas that assign the sensitivity of queries visible in [audit dashboards](/latest/governance/detect-your-activity/detection/detect-concept.md).
* **Classification rules** determine how each classification tag is applied. These rules can apply tags based on tags already on the column, tags applied to neighboring columns, and tags applied to the data source. This means that the complete data source is considered when classifying your data sources, and even tags applied to individual columns can affect the risk level of the entire data source.
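
One way to picture a framework as a data structure is sketched below. The class names, field names, and matching logic are assumptions made for illustration and do not reflect Immuta's framework schema. The metadata fields mirror the ones named above (source, dimension, sensitivity level), and each rule can condition on the column's own tags, neighboring columns' tags, and data source tags.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClassificationTag:
    name: str
    source: str       # which framework applied the tag (hypothetical field)
    dimension: str    # e.g. "confidentiality" (illustrative value)
    sensitivity: int  # higher means more sensitive (illustrative scale)


@dataclass
class Rule:
    applies: ClassificationTag
    column_has: set = field(default_factory=set)    # tags on the column itself
    neighbor_has: set = field(default_factory=set)  # tags on neighboring columns
    source_has: set = field(default_factory=set)    # tags on the data source

    def matches(self, column_tags, neighbor_tags, source_tags):
        # Every required tag must be present at its level.
        return (
            self.column_has <= column_tags
            and self.neighbor_has <= neighbor_tags
            and self.source_has <= source_tags
        )


def apply_framework(rules, columns, source_tags):
    """Return the classification tags each column receives."""
    result = {}
    for name, tags in columns.items():
        # Collect the tags found on all other columns of the data source.
        neighbors = set()
        for other, other_tags in columns.items():
            if other != name:
                neighbors |= other_tags
        result[name] = [
            rule.applies
            for rule in rules
            if rule.matches(tags, neighbors, source_tags)
        ]
    return result
```

Because every rule sees the neighboring and data source tags, a tag on one column can change how other columns in the same data source are classified.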

Frameworks are often based on an interpretation of regulations or standards, such as the US Health Insurance Portability and Accountability Act (HIPAA) and the Payment Card Industry Data Security Standard (PCI DSS). However, organizations can also build frameworks that represent their internal business processes. When used in Immuta, frameworks automate data tagging and provide information about what data you have immediately after it is registered.

## What are the benefits of classification?

Data classification is a process, and with Immuta, much of it is automated. This means that you can reap the benefits of classified and tagged data more quickly and easily than by classifying and tagging it manually:

* **Quick data access control**: Use classification to identify and classify your data immediately after registration in Immuta. Then, [build governance policies](/latest/governance/author-policies-for-data-access-control.md) based on those tags. This repeatable process protects your data in its current state and whenever new data sources are created. Automate the process further with [schema monitoring](/latest/configuration/integrations/registering-metadata/data-sources/schema-monitoring.md), which allows you to register data just once. Immuta then monitors your data environment for changes and, when it finds them, updates the data source in Immuta, updates the tags on that data source, and updates user access based on your governance policies.
* **Scale your data monitoring**: Use classification to identify and classify your data immediately after registration in Immuta. Then, view your data users' access to your sensitive and risky data through the [audit dashboards](/latest/governance/detect-your-activity/detection/detect-concept.md).
* **Build data platform compliance**: Create classification frameworks to identify and classify your data based on the industry practices and regulations your organization needs to abide by. Once the frameworks are built, they will automatically tag data as it's registered, ensuring your data sources are properly tagged to abide by the regulations you care about.

