External Catalog Introduction

Users who want to use tags from outside of Immuta can connect an external catalog to automatically pull and apply them to Immuta data sources. These tags can then be used to drive policies or classification frameworks.

Supported external catalogs

Immuta supports automated tag ingestion from the following external catalogs:

  • Alation

  • Atlan

  • Collibra

  • Microsoft Purview

  • Custom REST catalog

You can also ingest tags from the following data platforms:

  • AWS Lake Formation

  • Databricks Unity Catalog

  • Snowflake

To configure an external catalog, see the Configure an external catalog guide.

Best practice: Use a single external catalog. Connecting more than one can create conflicting sources of truth for tags and lead to data leaks.

Architecture

Once an external catalog has been configured on the Immuta app settings page, there are two recurring process steps:

  1. Linking to data sources and columns: Whenever a new data source is created or an external catalog is set up, Immuta attempts to automatically link data sources to their corresponding assets in the external catalog. It does this by comparing each data source's name with its corresponding asset name in the external catalog, so a data source must have the same database, schema, and object name in Immuta and the external catalog. Alternatively, a user can manually link a data source to an asset in an external catalog. Once a data source has been linked to an external catalog, the link is shown on the data source's detail page.

  2. Pull and apply tags in Immuta: Using the link established in the first step, Immuta polls the external catalog to ingest and apply tags to each data source and its columns. Immuta checks every 24 hours for any relevant metadata changes in the connected external catalog. Tags originating from an external catalog can be found on the tags list page and on the data dictionary page for each data source.
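
The name-based linking in step 1 can be pictured with a minimal Python sketch. It is an illustration only, not Immuta's implementation: the DataSource class, the auto_link function, the dot-joined name format, and the exact-match comparison are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSource:
    # Hypothetical shape for illustration; not Immuta's real data model.
    database: str
    schema: str
    object_name: str

    def qualified_name(self) -> str:
        # Assumed format: database.schema.object
        return f"{self.database}.{self.schema}.{self.object_name}"


def auto_link(source: DataSource, catalog_assets: dict[str, str]) -> Optional[str]:
    """Return the ID of the catalog asset whose name matches exactly, else None."""
    return catalog_assets.get(source.qualified_name())


# Asset names as an external catalog might report them, mapped to asset IDs.
assets = {"analytics.public.customers": "asset-101"}

linked = auto_link(DataSource("analytics", "public", "customers"), assets)
unlinked = auto_link(DataSource("analytics", "staging", "customers"), assets)
```

In this sketch the second data source stays unlinked because its schema differs from the catalog asset's, which is the situation where manual linking would be needed.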

See below for more information about the way Immuta integrates with each supported external catalog provider.

Alation

Alation tags and custom fields (except people sets, since those are represented as email addresses) containing values with a dot "." delimiter will be transformed into hierarchical tags in Immuta. To learn more about the benefits of hierarchical tags for policy authoring, see tag hierarchy.
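
As a rough illustration of the dot-delimiter transformation, the Python sketch below expands one delimited value into a chain of parent and child tag names. The function name tag_hierarchy and the sample value are hypothetical, and how Immuta stores the hierarchy internally is not described here.

```python
def tag_hierarchy(value: str, delimiter: str = ".") -> list[str]:
    """Expand a delimited value into its chain of hierarchical tag names."""
    # Drop empty segments so values like "A..B" do not create blank levels.
    parts = [p for p in value.split(delimiter) if p]
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


# Hypothetical Alation value containing the "." delimiter.
levels = tag_hierarchy("Sensitivity.PII.Email")
```

Here levels holds three names, from the top-level parent "Sensitivity" down to the full "Sensitivity.PII.Email", so a policy written against a parent would also cover its children.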

Immuta's Alation integration supports importing both tags and custom fields, Alation's two primary ways of allowing data stewards to apply metadata to data assets.

  • Tags: Tags are a single word or phrase that can be attached to most Alation objects by nearly anyone. For instance, users can add a PCI tag for financial data.

  • Custom fields: Custom fields are key-value pairs that can only be attached and removed by authorized users. Unlike tags, custom fields can have multiple values associated with a single key. For example, the custom field DK_STEWARD could have MARKETING, FINANCE, and CUSTOMER values associated with it. Using Alation custom fields allows you to explicitly control who can modify information associated with that field inside of Alation, whereas Alation standard tags are modifiable by any user inside of Alation. The following custom field data types are supported and will be applied to Immuta data sources as tags: pickers, multi-select pickers, object sets, people sets, references, and dates.

When pulled into Immuta, Alation tags and custom fields will be applied to data sources as either column or data source tags in Immuta. Importing both Alation tags and custom fields into Immuta provides full flexibility for organizations leveraging the Alation enterprise data catalog, no matter what operating model they choose to document their metadata in Alation.

How Immuta gets metadata from Alation

  • Linking to data sources and columns in Alation: Immuta links data sources to assets in Alation by looking up the fully qualified name of an object via the Alation API.

  • Pull and apply tags in Immuta from Alation: Immuta polls Alation every 24 hours for all tags.

Atlan

Private preview

The Atlan catalog integration is only available to select accounts. Contact your Immuta representative to enable this feature.

The Atlan catalog integration with Immuta supports ingestion of tags and descriptions from Atlan assets.

How Immuta gets metadata from Atlan

  • Linking to data sources and columns in Atlan: Immuta links data sources to assets in Atlan by looking up the fully qualified name of an entity using Atlan APIs.

  • Pull and apply tags in Immuta from Atlan: Immuta checks Atlan every 24 hours for any relevant metadata changes. Based on these changes, Immuta then only polls and ingests tags from Atlan for the relevant data sources. However, if Immuta observes more than 25,000 metadata changes in Atlan within 24 hours, it will poll all data sources for tags during that run of external catalog tag synchronization.
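
The change-based polling described above can be sketched in Python as follows. Only the 25,000 threshold comes from the text. The function sources_to_poll, the links mapping, and counting one list entry per metadata change are assumptions for illustration.

```python
FULL_SYNC_THRESHOLD = 25_000  # more than this many changes triggers a full poll


def sources_to_poll(changed_assets: list[str], links: dict[str, str]) -> set[str]:
    """Pick the data sources to poll in one synchronization run.

    changed_assets: asset names with metadata changes in the last 24 hours
                    (one entry per change, so duplicates are possible).
    links:          hypothetical map of asset name -> linked data source name.
    """
    if len(changed_assets) > FULL_SYNC_THRESHOLD:
        return set(links.values())  # too many changes: poll every data source
    # Otherwise poll only the data sources whose linked asset changed.
    return {links[a] for a in changed_assets if a in links}


links = {"db.s.t1": "ds1", "db.s.t2": "ds2", "db.s.t3": "ds3"}
incremental = sources_to_poll(["db.s.t1", "db.s.other"], links)
full = sources_to_poll(["db.s.t1"] * 25_001, links)
```

With two changes, only ds1 is polled, since the other changed asset is not linked to any data source. With 25,001 changes, all three linked data sources are polled.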

Limitations

  • Custom metadata fields from Atlan are not ingested as tags into Immuta.

  • The current implementation only supports Databricks Unity Catalog data sources and their associated columns.

Collibra

Collibra objects using the dot "." delimiter will be transformed into hierarchical tags in Immuta. To learn more about the benefits of hierarchical tags for policy authoring, see tag hierarchy.

Immuta's Collibra integration supports importing tags, data classifications, and attributes. Additionally, data source and column descriptions from the connected Collibra catalog will be pulled into Immuta.

  • Tags: Tags are a single word or phrase that can be attached to objects in Collibra. For instance, users can add a PHI tag on health-related data assets.

  • Data classifications: A data classification is a label in Collibra on the column asset type that describes the content of the data and is separate from tags in Collibra. Immuta ingests data classifications from Collibra and applies them as tags on the appropriate columns. All data classifications from Collibra are placed under the Data classification parent tag.

  • Attributes: An attribute in Collibra is a characteristic that describes an asset with an individual field. Unlike tags, attributes can have multiple values associated with a single key. For example, the attribute region could have emea, apac, and nala values associated with it. Using Collibra attributes allows you to explicitly control who can modify information associated with that field inside of Collibra, whereas Collibra standard tags are modifiable by any user inside of Collibra.

When pulled into Immuta, Collibra tags, data classifications, and attributes will be applied to data sources as either column or data source tags in Immuta. Importing Collibra tags, data classifications, and attributes into Immuta provides full flexibility for organizations leveraging the Collibra data catalog, no matter what operating model they choose to document their metadata in Collibra.

How Immuta gets metadata from Collibra

  • Linking to data sources and columns in Collibra: Immuta links data sources to assets in Collibra by looking up the full name. For a table to link successfully, its name must match the Collibra table asset name.

  • Pull and apply tags in Immuta from Collibra: Immuta checks Collibra every 24 hours by observing the linked assets history for any relevant metadata changes. Based on these changes, Immuta then only polls and ingests objects from Collibra for the relevant data sources. However, if Immuta observes more than 25,000 metadata changes in Collibra within 24 hours, it will poll all data sources for tags, data classifications, and attributes during that run of external catalog tag synchronization.

Limitations

  • The catalog auto-linking method will only auto-link Collibra assets where the asset's full name follows the Collibra Edge naming convention. Any assets following a different naming convention must be linked manually instead.

  • Columns must have a direct relation to their parent asset in Collibra. Indirect/inherited relations are not supported and will result in column tags and attributes not being ingested into Immuta.

Microsoft Purview catalog

Private preview: This feature is available to select accounts. Contact your Immuta representative to enable this feature.

The Microsoft Purview catalog integration with Immuta currently supports ingestion of Classifications and Managed attributes of type single or multiple choice as tags. Additionally, data source and column descriptions from the connected Microsoft Purview catalog will be pulled into Immuta.

How Immuta gets metadata from Microsoft Purview

  • Linking to data sources and columns in Microsoft Purview: Immuta links data sources to assets in Microsoft Purview by looking up the fully qualified name of an entity. The composition of the fully qualified name in Microsoft Purview differs depending on the technology type backing the data source.

  • Pull and apply tags in Immuta from Microsoft Purview: Immuta polls Microsoft Purview every 24 hours for all tags.

Limitations

  • Standard tags from Purview are not ingested into Immuta.

  • The current implementation only supports Databricks Unity Catalog, Snowflake, and Azure Synapse Analytics data sources and their associated columns.

  • If a managed attribute is applied to an Immuta data source but later expires, it will still appear as a tag on the data source. Expired attributes must be removed from the object in Purview for the tag to be removed from the Immuta data source.

Custom REST catalog

If users have an unsupported catalog, or have customized their catalog integration, they can connect through the REST Catalog using the Immuta API.

For more details about using a custom REST catalog with Immuta, see the Custom REST Catalog Interface Introduction.

AWS Lake Formation

Private preview: This feature is only available to select accounts. Contact your Immuta representative to enable this feature.

The AWS Lake Formation connection can ingest Lake Formation Tags and apply them to Immuta data sources.

Databricks Unity Catalog

Private preview: This feature is only available to select accounts. Contact your Immuta representative to enable this feature.

Users can connect their Databricks Unity Catalog account to allow Immuta to ingest Databricks tags and apply them to Databricks data sources.

Snowflake

Users can connect a Snowflake account to allow Immuta to ingest Snowflake tags and apply them to Snowflake data sources.

Authentication support matrix

This table lists the supported external catalogs and their supported authentication methods. Data platform tag ingestion uses the data platform credentials. For more details about a catalog, see the linked section:

Catalog | Username and password | OAuth 2.0 | API key

External catalog behaviors

  • Tags ingested from external catalogs cannot be edited within Immuta. To edit, delete, or add a tag from an external catalog to a data source or column, make the change in the external catalog.

  • You can configure multiple external catalogs within a single tenant of Immuta, but only one external catalog can be linked to a data source.

  • Once per day, Immuta searches all external catalog providers and links each data source that has no external catalog attached to the first catalog that matches.

  • S3 data sources can currently only be linked manually to external catalogs.
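
The linking behaviors above can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not Immuta's implementation: the function daily_link_pass, its parameters, and the idea that catalogs are searched in list order are all hypothetical. The text only says that the first matching catalog is used.

```python
from typing import Optional


def daily_link_pass(
    sources: dict[str, Optional[str]],
    catalogs: list[tuple[str, set[str]]],
    manual_only: frozenset[str] = frozenset(),
) -> dict[str, Optional[str]]:
    """Link each unlinked data source to the first catalog with a matching asset.

    sources:     data source name -> name of its linked catalog, or None.
    catalogs:    (catalog name, asset names) pairs in the assumed search order.
    manual_only: data sources that are never auto-linked (e.g. S3 data sources).
    """
    result = dict(sources)
    for name, linked in sources.items():
        if linked is not None or name in manual_only:
            continue  # one catalog per data source; existing links are kept
        for catalog_name, asset_names in catalogs:
            if name in asset_names:
                result[name] = catalog_name
                break  # first match wins
    return result


catalogs = [
    ("alation", {"db.s.orders"}),
    ("collibra", {"db.s.orders", "db.s.users"}),
]
sources = {
    "db.s.orders": None,
    "db.s.users": None,
    "db.s.events": "collibra",  # already linked, so left untouched
    "s3.bucket.logs": None,     # manual linking only
}
links = daily_link_pass(sources, catalogs, manual_only=frozenset({"s3.bucket.logs"}))
```

In this example db.s.orders exists in both catalogs but is linked only to the first one searched, which is one reason the best practice is to use a single catalog.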

Audit

The following catalog-related events are audited and can be found on the audit page in the UI:

  • ConfigurationUpdated: The configuration on the Immuta app settings page is updated, including when an external catalog configuration is added or deleted.

  • DatasourceCatalogSynced: An external catalog is linked and synced for the data source.
