# Architecture

Discover automates the discovery and tagging of data across your data platform. It covers the [identification](/2024.2/discover-your-data/data-discovery.md) and [classification](/2024.2/discover-your-data/data-classification.md) of data using frameworks.

## Requirements

* Sensitive data discovery (SDD) enabled
* [Frameworks enabled](/2024.2/application-settings/how-to-guides/config-builder-guide.md#enable-discover-features)
* Registered [Snowflake, Databricks, Redshift, or Starburst (Trino) data sources](/2024.2/data-and-integrations/registering-metadata/register-data-sources/query-backed-tutorial.md)

## Components

The Immuta UI has separate sections for identification frameworks and classification frameworks. Both framework types consist of rules, criteria, and resulting tags, but the criteria types differ for each framework type. Identification frameworks use competitive pattern analysis and column name matching to discover data types and tag them. Classification frameworks use the tags on the column, neighboring columns, and data source for context and then tag the columns based on that context. Find more information about each framework type below.

### Identification frameworks

Identification frameworks run with sensitive data discovery (SDD). They use patterns to discover data and tag it according to the type of data it contains.

#### Supported criteria and pattern types

* **Competitive pattern analysis**: This criteria reviews all the regex and dictionary patterns within the framework's rules and searches for the pattern with the best fit. The competitive pattern analysis criteria in the framework compete against each other to find the best and most specific pattern that fits the data. The resulting tags for the rule with the best pattern are then applied to the column.
  * **Regex pattern**: This pattern contains a case-insensitive regular expression that searches for matches against column values. [Create a regex pattern in the UI](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-patterns.md) or with the [`sdd/classifier` endpoint](/2024.2/developer-guides/api-intro/immuta-v1-api/configure-your-instance-of-immuta/sdd-api.md#create-a-pattern).
  * **Dictionary pattern**: This pattern contains a list of words and phrases to match against column values. [Create a dictionary pattern in the UI](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-patterns.md) or with the [`sdd/classifier` endpoint](/2024.2/developer-guides/api-intro/immuta-v1-api/configure-your-instance-of-immuta/sdd-api.md#create-a-pattern).
* **Column name**: This criteria matches a column name pattern against the column names in your data sources. The rule's resulting tags are applied to each column whose name matches.
  * **Column name pattern**: This pattern includes a case-insensitive regular expression matched against column names, not against the values in the column. [Create a column name pattern in the UI](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-patterns.md) or with the [`sdd/classifier` endpoint](/2024.2/developer-guides/api-intro/immuta-v1-api/configure-your-instance-of-immuta/sdd-api.md#create-a-pattern).
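
As a rough illustration of how the two criteria types could behave, the Python sketch below scores each regex and dictionary pattern by the share of sampled column values it matches, lets the highest-scoring rule win, and then applies column name rules separately. The `Rule` class, the `identify` function, the `min_fit` threshold, the fit score, and the `Example.*` tags are all hypothetical. Immuta's actual scoring and specificity logic is not described on this page and may differ.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Rule:
    """Hypothetical stand-in for an identification rule (not Immuta's data model)."""
    name: str
    tags: List[str]
    regex: Optional[str] = None                 # regex pattern, matched against column values
    dictionary: Set[str] = field(default_factory=set)  # dictionary pattern entries
    column_name_regex: Optional[str] = None     # column name pattern

    def value_fit(self, values: List[str]) -> float:
        """Fraction of sampled values matched by the regex or dictionary pattern."""
        if not values:
            return 0.0
        if self.regex is not None:
            pattern = re.compile(self.regex, re.IGNORECASE)
            hits = sum(1 for v in values if pattern.fullmatch(v))
        else:
            words = {w.lower() for w in self.dictionary}
            hits = sum(1 for v in values if v.lower() in words)
        return hits / len(values)


def identify(column_name: str, values: List[str], rules: List[Rule],
             min_fit: float = 0.8) -> List[str]:
    """Return the tags to apply to one column (assumed scoring, for illustration)."""
    tags: List[str] = []
    # Competitive pattern analysis: value patterns compete, only the best fit wins.
    competitors = [r for r in rules if r.regex is not None or r.dictionary]
    scored = [(r.value_fit(values), r) for r in competitors]
    best_fit, best_rule = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    if best_rule is not None and best_fit >= min_fit:
        tags.extend(best_rule.tags)
    # Column name criteria: matched against the name, never against the values.
    for r in rules:
        if r.column_name_regex and re.search(r.column_name_regex, column_name, re.IGNORECASE):
            tags.extend(r.tags)
    return tags


rules = [
    Rule("email", ["Example.Email"], regex=r"[^@\s]+@[^@\s]+\.[a-z]+"),
    Rule("state", ["Example.State"], dictionary={"Ohio", "Texas"}),
    Rule("ssn_name", ["Example.SSN"], column_name_regex=r"ssn"),
]
print(identify("contact", ["a@b.com", "C@D.ORG"], rules))
```

In this sketch a rule that matches too few values applies no tags at all, which is one simple way to avoid tagging a column on weak evidence.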

#### Related guides

* To start using identification frameworks in the UI, see the [Getting started guide](/2024.2/discover-your-data/getting-started.md).
* To manage identification frameworks with the API, see the [`/sdd/template` endpoint reference guide](/2024.2/developer-guides/api-intro/immuta-v1-api/configure-your-instance-of-immuta/sdd-api.md).

### Classification frameworks

Classification frameworks run with the classify service. They match rules and criteria against the tags on a column, its neighboring columns, and its data source, and then tag the column based on that context.

#### Supported criteria

* **Match column tag**: This criteria applies resulting tags based on specific tags already on the column.
* **Match neighboring column tag**: This criteria applies resulting tags based on specific tags on neighboring columns.
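
The two criteria above can be pictured with a small Python sketch. It assumes that "neighboring columns" means the other columns of the same data source and that a rule applies only when all of its criteria are met. The `ClassificationRule` class, the `classify` function, and the `Example.*` tags are hypothetical names for illustration, not the classify service's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ClassificationRule:
    """Hypothetical rule: criteria on existing tags plus the tags to apply."""
    resulting_tags: Set[str]
    column_tags: Set[str] = field(default_factory=set)    # "match column tag" criteria
    neighbor_tags: Set[str] = field(default_factory=set)  # "match neighboring column tag" criteria


def classify(columns: Dict[str, Set[str]],
             rules: List[ClassificationRule]) -> Dict[str, Set[str]]:
    """Return the new tags for each column of one data source."""
    new_tags: Dict[str, Set[str]] = {name: set() for name in columns}
    for name, tags in columns.items():
        # Tags found on any other column of the same data source.
        neighbor_union = set().union(
            *(t for other, t in columns.items() if other != name)
        )
        for rule in rules:
            if rule.column_tags <= tags and rule.neighbor_tags <= neighbor_union:
                new_tags[name] |= rule.resulting_tags
    return new_tags


columns = {
    "diagnosis": {"Example.Health"},
    "patient_name": {"Example.PersonName"},
    "notes": set(),
}
rules = [
    ClassificationRule(
        resulting_tags={"Example.PHI"},
        column_tags={"Example.Health"},
        neighbor_tags={"Example.PersonName"},
    )
]
print(classify(columns, rules))
```

Here only `diagnosis` receives the resulting tag: it carries the required column tag and sits next to a column tagged as a person name.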

#### Related guides

* To manage classification frameworks in the UI, see the [Activate frameworks guide](/2024.2/discover-your-data/data-classification/how-to-guides/activate-framework.md).
* To create a classification framework with the API, see the [`/frameworks` endpoint reference guide](/2024.2/developer-guides/api-intro/immuta-v1-api/configure-your-instance-of-immuta/frameworks.md).

### Data inventory dashboard

{% hint style="info" %}
**Private preview** This feature is only available to select accounts.
{% endhint %}

The data inventory dashboard visualizes information about your organization's data. It presents your entire data corpus in the context of the frameworks that are actively tagging your data, with details such as when your data was last scanned and how much of the scanned data is relevant to your active frameworks.

In the data inventory dashboard, tiles show scanned coverage and the percentage of data scanned within a specific time frame. These tiles reference data scanned by an identification framework with SDD. To increase the number of your data sources that have been scanned, [run SDD](/2024.2/discover-your-data/data-discovery/how-to-guides/enable-sdd.md#run-sdd-on-all-data-sources).

The next section of the dashboard shows tiles for the compliance frameworks. Each graph separates the columns found to contain the data important to the compliance framework from the columns that do not. These graphs update every time classification runs, which is triggered by the events listed under [Frequency](#frequency).

For information on the frameworks visualized in the dashboard, see the [Immuta frameworks reference guide](/2024.2/discover-your-data/data-classification/immuta-frameworks.md).

## Workflow

The Discover workflow involves both identification with SDD and classification:

1. A user with the `GOVERNANCE` permission [enables SDD](/2024.2/discover-your-data/data-discovery/how-to-guides/enable-sdd.md) and [activates classification frameworks](/2024.2/discover-your-data/data-classification/how-to-guides/activate-framework.md).
2. Users register data in Immuta.
3. SDD runs:
   1. Immuta generates a SQL query using the identification framework's rules.
   2. That query is executed in the remote database.
   3. Immuta receives the query results containing the column name and the matching rules but no raw data values.
   4. SDD applies the resulting tags to the relevant columns.
4. Classification runs:
   1. The data source's current tags are checked against the framework's rules.
   2. When a matching rule is found, the resulting tags are applied to the relevant columns.
5. Users with the `GOVERNANCE` permission or data owners can view the data inventory dashboard with visualizations of their scanned data.
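
The SDD steps above (generate a query, run it in the remote database, and return only column names and matching rules) can be sketched in Python. This is a minimal illustration under stated assumptions: an in-memory SQLite database stands in for the remote platform, and the query shape, the `build_query` and `run_identification` helpers, the `EMAIL` rule, and the `min_fit` threshold are all hypothetical rather than what Immuta actually generates.

```python
import re
import sqlite3


def regexp(pattern, value):
    """SQLite calls this for `value REGEXP pattern`; 1 on a full, case-insensitive match."""
    if value is None:
        return 0
    return int(re.fullmatch(pattern, str(value), re.IGNORECASE) is not None)


def build_query(table, columns, rules):
    """Build one aggregate query that returns counts per (column, rule), never row values."""
    selects, params = [], []
    for column in columns:
        for rule_name, pattern in rules.items():
            selects.append(
                'SELECT ? AS column_name, ? AS rule, '
                f'SUM("{column}" REGEXP ?) AS hits, COUNT(*) AS total FROM "{table}"'
            )
            params += [column, rule_name, pattern]
    return " UNION ALL ".join(selects), params


def run_identification(conn, table, columns, rules, min_fit=0.8):
    """Execute the query in the database and keep only (column, rule) pairs that fit."""
    sql, params = build_query(table, columns, rules)
    matches = []
    for column_name, rule, hits, total in conn.execute(sql, params):
        if total and (hits or 0) / total >= min_fit:
            matches.append((column_name, rule))
    return matches


# Stand-in for the remote database (assumption: SQLite with a REGEXP function).
conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE customers (email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("a@example.com", "Oslo"), ("B@EXAMPLE.ORG", "Lima")],
)

RULES = {"EMAIL": r"[^@\s]+@[^@\s]+\.[a-z]+"}  # hypothetical rule name and pattern
matches = run_identification(conn, "customers", ["email", "city"], RULES)
print(matches)
```

Because the pattern matching is pushed into the aggregate query, the caller only ever sees column names, rule names, and counts, which mirrors the point that no raw data values are returned.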

### Frequency

This workflow runs when a new data source is manually registered in Immuta or detected by schema monitoring. Additionally, SDD alone runs on the following events:

* A new data source is created.
* Schema monitoring is enabled, and a new data source is detected.
* Column detection is enabled, and new columns are detected. Here, SDD will only run on new columns, and no existing tags will be removed or changed.
* A user manually triggers it from the data source health check menu.
* A user manually triggers it from the identification frameworks page.
* A user manually triggers it through the API.

Classification runs on the following events:

* A framework gets created, updated, or deleted.
* A tag gets added to or removed from a column manually or by SDD.
* A tag gets added to a data source.
* A user manually triggers it from the data source health check menu.
* A user manually triggers it through the API.
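
The trigger lists above can be summarized as a small lookup, sketched here in Python. The event names and the `jobs_for` function are hypothetical labels for the events described in this section, not identifiers from Immuta. The sketch also assumes that when SDD adds or removes a tag, that tag change is what causes classification to follow.

```python
# Hypothetical event names that mirror the bullet lists in this section.
SDD_EVENTS = {
    "data_source_created",
    "schema_monitoring_detected_data_source",
    "column_detection_new_columns",      # SDD runs on the new columns only
    "manual_sdd_health_check",
    "manual_sdd_identification_frameworks_page",
    "manual_sdd_api",
}

CLASSIFICATION_EVENTS = {
    "framework_created_updated_or_deleted",
    "column_tag_added_or_removed",
    "data_source_tag_added",
    "manual_classification_health_check",
    "manual_classification_api",
}


def jobs_for(event, sdd_changed_tags=True):
    """Return the jobs an event leads to, in order (illustrative model only)."""
    jobs = []
    if event in SDD_EVENTS:
        jobs.append("sdd")
        # A tag added or removed by SDD is itself a classification trigger.
        if sdd_changed_tags:
            jobs.append("classification")
    if event in CLASSIFICATION_EVENTS and "classification" not in jobs:
        jobs.append("classification")
    return jobs


print(jobs_for("data_source_created"))
print(jobs_for("framework_created_updated_or_deleted"))
```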

## Caveat

* Customizing classification frameworks currently requires users to use the Immuta API.

## Discover section contents

**Conceptual guides**:

* [Data classification](/2024.2/discover-your-data/data-classification.md)

**Getting started guide**:

* [Getting started with Discover](/2024.2/discover-your-data/getting-started.md)

**How-to guides**:

* Identification guides:
  * [Enable SDD](/2024.2/discover-your-data/data-discovery/how-to-guides/enable-sdd.md)
  * [Create an identification framework](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-frameworks.md#create-a-framework)
  * [Create a pattern](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-patterns.md#create-a-pattern)
  * [Manage identification rules](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-rules.md)
  * [Manage SDD and discovered tags](/2024.2/discover-your-data/data-discovery/how-to-guides/manage-sdd-tags.md)
  * [Manage global SDD settings](/2024.2/discover-your-data/data-discovery/how-to-guides/global-sdd.md)
* Classification guides:
  * [Activate a classification framework](/2024.2/discover-your-data/data-classification/how-to-guides/activate-framework.md)
  * [Adjust and accept entity and classification tags](/2024.2/discover-your-data/data-classification/how-to-guides/adjust-classification-tags.md)

**Reference guides**:

* [Built-in pattern reference](/2024.2/discover-your-data/data-discovery/reference-guides/classifier-reference.md)
* [Discovered tag reference](/2024.2/discover-your-data/data-discovery/reference-guides/discovered-tags.md)
* [Built-in classification frameworks reference](/2024.2/discover-your-data/data-classification/immuta-frameworks.md)

