
Understanding Data Metadata Management in Immuta

Static policies and manual tagging can’t keep up. To scale, you need metadata that updates with your data, and policies that adjust automatically.

With Immuta, you can discover sensitive data automatically, create and manage tags directly in the platform, or integrate with external catalogs to pull in data tags. No matter where the metadata comes from, Immuta uses it to drive policy creation and enable dynamic access control across your data platforms.

In this guide, you’ll learn:

  • How Immuta integrates with external catalogs

  • How to automate metadata tagging and classification directly in Immuta

  • How to create dynamic, attribute-based policies driven by metadata

  • How to support self-service access through governed data marketplaces

  • Best practices for consistent, scalable metadata management across platforms

Catalog integration: enhancing metadata management

Immuta integrates with external catalogs like Alation, Collibra, Snowflake, and Unity Catalog to import metadata directly into the platform. This metadata includes column-level and table-level tags that define the characteristics of data assets, such as whether they contain sensitive information like credit card numbers or personally identifiable information (PII).

By connecting metadata catalogs to Immuta, organizations can automatically apply policies based on existing metadata and eliminate the need for manual tagging. Custom REST endpoints also offer flexibility, enabling teams to integrate with other metadata sources as needed.

Key benefits of catalog integration

  • Immuta automatically synchronizes metadata from external catalogs, ensuring consistent tagging and classification across all data sources.

  • Governance teams can create policies more efficiently in Immuta using synchronized metadata, which helps maintain consistency and avoid misalignment across environments.

  • Organizations can scale policy enforcement across platforms like Snowflake, Databricks, and Starburst by using metadata-driven policies, enabling a seamless governance experience.

Leveraging identification and classification in Immuta

Immuta’s data discovery services help organizations proactively identify and tag sensitive data across their environment. By scanning data sources for elements like PII and PCI, Immuta enables teams to automatically classify and protect sensitive information with appropriate policies.

Key features of identification and classification in Immuta

  • Automatically identifies sensitive data such as customer names and Social Security numbers, and tags it for policy enforcement.

  • Allows organizations to customize discovery workflows to capture unique attributes and ensure accurate tagging.

  • Links classification directly to policy creation, enabling teams to enforce access controls based on discovered data.

Creating policies driven by metadata

With metadata from catalogs and sensitive data discovery tools, Immuta empowers organizations to build dynamic, attribute-based policies. Teams can write policies in natural language, using data classifications and user attributes to define who can access what data. As metadata evolves, Immuta automatically updates access controls—eliminating the need for manual policy changes.

For example

  • You can create a policy that masks all columns tagged as PII for users outside the HR department.

  • You might restrict access to sensitive financial data to only users in the Finance department.

Immuta enforces these policies consistently across platforms like Snowflake and Databricks, enabling scalable and flexible governance that adapts as your data and teams grow.
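The first example policy above can be pictured with a small sketch. This is not Immuta's API or policy syntax; the `User` class, the `apply_masking` function, the tag name, and the mask value are all hypothetical, chosen only to show how a tag on a column and an attribute on a user combine into a masking decision.

```python
from dataclasses import dataclass, field

MASK = "***MASKED***"  # hypothetical placeholder value


@dataclass
class User:
    name: str
    attributes: dict = field(default_factory=dict)


def apply_masking(row, column_tags, user, tag="PII", exempt_department="HR"):
    """Return a copy of `row` with every column carrying `tag` masked,
    unless the user's department equals `exempt_department`."""
    if user.attributes.get("department") == exempt_department:
        return dict(row)
    return {
        column: MASK if tag in column_tags.get(column, set()) else value
        for column, value in row.items()
    }


# Tags per column, as they might arrive from a catalog or from discovery.
column_tags = {"email": {"PII"}, "ssn": {"PII"}, "region": set()}
row = {"email": "ana@example.com", "ssn": "123-45-6789", "region": "EMEA"}

analyst = User("sam", {"department": "Marketing"})
hr_partner = User("lee", {"department": "HR"})

masked = apply_masking(row, column_tags, analyst)
unmasked = apply_masking(row, column_tags, hr_partner)
```

The policy never names a column. It refers only to the tag and the user attribute, so a column that gains the PII tag later is covered without editing the policy.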

Catalogs and the Request app: supporting self-service access

As more organizations adopt a data-as-a-product strategy, data catalogs and marketplaces play a central role in delivering data products to users. Immuta integrates with these platforms to simplify access requests and approvals, ensuring that access remains dynamic, secure, and compliant.

When a user requests access to a data product, Immuta evaluates their entitlements and the metadata associated with the product. It then applies the appropriate access controls automatically, removing manual steps and reducing risk while maintaining a seamless user experience.

The Request app workflow includes the following capabilities

  • Automated access requests let users self-serve by requesting access to specific data products, with Immuta handling the approvals and applying the appropriate controls.

  • Sensitive data remains protected by default because Immuta unmasks sensitive fields, like credit card numbers, only for users who meet the policy criteria.

Conclusion

Effective data metadata management plays a vital role in securing, scaling, and governing data access across the enterprise. With Immuta’s integrations and tools like identification and classification, organizations can automate the discovery, tagging, and classification of data, then dynamically enforce policies using that metadata. This approach simplifies governance and ensures consistent application of access rules across every connected data platform.

By using metadata to drive policy creation and integrating with external catalogs, organizations gain greater efficiency, flexibility, and control in their governance strategy with Immuta.


Implementing Identification: A Strategic Guide

Organizations today manage more data than ever, yet many still struggle to understand what they have, where it lives, and how to protect it. With users ranging from analysts and AI agents to executives and external partners, access must be fast, secure, and compliant.

The key to enabling secure access at scale? Metadata. When data is accurately identified and tagged, you can apply dynamic protections, deliver tailored access, and confidently scale data sharing. Without trustworthy metadata, you risk slowing down access, or worse, exposing sensitive data.

That’s where Immuta’s identification service comes in. Identification is the automated process of detecting and tagging sensitive data (such as personal identifiers, business-critical fields, or regulatory attributes) based on configurable patterns like column names, data values, or dictionaries. This provides the foundation for:

  • Faster provisioning of governed data

  • Automated policy enforcement

  • Consistent masking and filtering as data evolves

  • Audit-ready visibility into data protections

Without identification, every new table or column requires manual review. With it, protections keep pace with your data so access remains both fast and secure.

Key concepts

Identification

Immuta’s Identification service automatically scans data sources to identify and tag data based on configurable criteria. These tags drive access controls, masking, and audit logging, helping you enforce policies at scale.

Identifiers

An identifier is a rule that defines how Immuta detects a specific type of data: it specifies the matching criteria and the tags applied to data that meets those criteria. Immuta includes built-in identifiers for common data types, which you can use as-is, edit, or build upon with custom identifiers for your unique needs. Detection criteria can be based on:

  • Regex patterns (e.g., for SSNs or phone numbers)

  • Dictionary matching (e.g., known hospitals or countries)

  • Column name patterns (e.g., a column named “patient notes” or “comments” that may contain unstructured text, but still often includes sensitive details like names, emails, or IDs)

Tags applied through identifiers can come from Immuta’s built-in discovered hierarchy, external catalogs, or manual tagging, all of which contribute to a consistent metadata layer across your environment.
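To make the three kinds of detection criteria concrete, here is a minimal Python sketch. It is illustrative only and does not reflect Immuta's implementation or configuration format: the `Identifier` class, the tag names, the sample dictionary, and the rule that every sampled value must match are all assumptions. A real identifier would more likely use a match threshold than an all-values rule.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

SSN_RE = re.compile(r"\d{3}-\d{2}-\d{4}")           # regex criterion
COUNTRIES = {"france", "japan", "brazil"}            # dictionary criterion (toy list)
FREE_TEXT_NAME_RE = re.compile(r"notes|comments", re.IGNORECASE)  # column name criterion


def every_value(values, predicate):
    """True when at least one non-null value exists and all of them satisfy predicate."""
    present = [str(v) for v in values if v is not None]
    return bool(present) and all(predicate(v) for v in present)


@dataclass
class Identifier:
    name: str
    tag: str
    matches: Callable[[str, List], bool]  # (column_name, sample_values) -> bool


IDENTIFIERS = [
    Identifier("ssn_regex", "SSN",
               lambda column, values: every_value(
                   values, lambda v: SSN_RE.fullmatch(v) is not None)),
    Identifier("country_dictionary", "Country",
               lambda column, values: every_value(
                   values, lambda v: v.lower() in COUNTRIES)),
    Identifier("free_text_column_name", "Free Text",
               lambda column, values: FREE_TEXT_NAME_RE.search(column) is not None),
]


def tag_columns(table):
    """Map each column name to the tags of every identifier that matches it."""
    return {
        column: [i.tag for i in IDENTIFIERS if i.matches(column, values)]
        for column, values in table.items()
    }


sample = {
    "ssn": ["123-45-6789", "987-65-4321"],
    "country": ["France", "Japan"],
    "patient_notes": ["Called Ana at ana@example.com"],
    "amount": [10, 20],
}
tags = tag_columns(sample)
```

Note that the `patient_notes` column is tagged from its name alone, which matches the point above: unstructured text rarely matches a value pattern, so the column name is the practical signal.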

There are two types of identifiers:

  • Reference identifiers - A library of reusable identifiers that can be added to domains. Once added, they become domain-specific copies.

  • Domain-specific identifiers - These identifiers exist only within a specific domain and apply only to that domain’s data. Edits to reference identifiers won’t affect domain-specific copies.

Getting started

1. Define what’s sensitive

Start by determining which data fields are sensitive and how they should be handled. Align this with compliance requirements (e.g., HIPAA, GDPR), business priorities (e.g., protecting IP), and access needs. Also identify business-critical fields that commonly drive access controls, such as region, department, or hospital name.

  • Use built-in identifiers for common types like SSNs, names, and emails.

  • Create custom identifiers for organization-specific fields like customer_id, project_code, or hospital_name.

2. Create and organize identifiers

Build a reusable library of reference identifiers for data types that require consistent tagging, such as SSNs, email addresses, or account IDs. Reference identifiers can be applied across multiple domains and adapted to fit different data contexts.

3. Assign identifiers to domains

Use domains to group related data sources and apply relevant identifiers. Assigning a reference identifier to a domain creates a domain-specific copy, allowing each team to tailor tagging logic without affecting others.

For example, a healthcare organization might create Clinical, Billing, and Research domains, each using domain-specific versions of relevant identifiers.
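The copy semantics described here can be sketched in a few lines of Python. The classes, the `add_reference` method, and the patterns below are hypothetical and only model the behavior stated in this guide: adding a reference identifier to a domain produces an independent copy, so later edits on either side do not propagate.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Identifier:
    name: str
    pattern: str
    tags: list = field(default_factory=list)


@dataclass
class Domain:
    name: str
    identifiers: dict = field(default_factory=dict)

    def add_reference(self, reference):
        """Store a deep copy so the domain's version is independent of the library."""
        self.identifiers[reference.name] = copy.deepcopy(reference)
        return self.identifiers[reference.name]


# Hypothetical reference library with one reusable identifier.
reference_library = {
    "account_id": Identifier("account_id", r"ACC-\d{6}", ["Account ID"]),
}

clinical = Domain("Clinical")
billing = Domain("Billing")
clinical.add_reference(reference_library["account_id"])
billing_copy = billing.add_reference(reference_library["account_id"])

# The Billing steward tailors the copy for that domain only.
billing_copy.pattern = r"BILL-\d{8}"
billing_copy.tags.append("Financial")

# A later edit to the reference identifier does not reach existing copies.
reference_library["account_id"].pattern = r"ACC-\d{7}"
```

A deep copy (rather than a shared reference) is what keeps the Billing change from leaking into Clinical or back into the library.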

You can also configure a Global SDD (sensitive data discovery) domain to centrally manage tagging across enterprise-wide data sources. Because Immuta allows data sources to belong to multiple domains, a single source can be scanned by both global and business-specific identifiers. Each domain’s tagging logic is evaluated independently, so discovery runs in parallel without conflict.

4. Empower domain-level data discovery

Give domain stewards permission to manage identifiers. They can customize reference identifiers—adjusting regex, renaming, or modifying tags—to fit their domain’s context. This balance of local control and global consistency supports scalable, accurate tagging across the organization.

5. Iterate and improve

Identification is an ongoing process. Regularly review tagging results, refine detection logic, and adjust as data evolves or new data sources are added. With object sync, Immuta automatically detects new tables and columns and re-scans to keep metadata current with minimal manual effort.

Best practices

Start small, scale strategically: Begin with a focused set of identifiers and one or two domains. Once tagging is accurate and policies are working as intended, expand incrementally to more domains and data sources. Scaling gradually helps ensure quality and reduces false positives.

Use reference identifiers to drive consistency: Establish a set of reference identifiers as the foundation for tagging common data types, like SSNs, names, or email addresses. Apply these consistently across domains to avoid duplication and ensure alignment on enterprise-wide policies.

Customize only where necessary: Allow domain stewards to adapt reference identifiers only when the data context truly differs; for example, when a regex pattern needs to match local naming conventions or different tags are required. This balance preserves standardization while enabling flexibility.

Review and refine regularly: Establish a cadence for reviewing identifier performance. Audit tag accuracy, monitor false positives or missed fields, and update patterns as your data evolves. Use object sync and automatic scanning to keep tags up to date without needing to reconfigure each time a table changes.

Involve the right stakeholders: Bring in data owners, compliance, legal, and business partners to help define sensitive data criteria and review tagging outcomes. Their input ensures tagging logic reflects both regulatory requirements and operational realities.
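One way to put numbers on the "Review and refine regularly" practice is to compare the tags an identifier applied against a manually reviewed sample. The sketch below is a generic precision and recall calculation, not an Immuta feature; the function name, the `(column, tag)` pair format, and the sample data are assumptions.

```python
def tagging_quality(predicted, reviewed):
    """Compare applied tags with a reviewed sample.

    Both arguments are sets of (column, tag) pairs.
    """
    true_positives = predicted & reviewed
    false_positives = predicted - reviewed   # tagged, but reviewers disagree
    missed = reviewed - predicted            # reviewers expected a tag that is absent
    precision = len(true_positives) / len(predicted) if predicted else 1.0
    recall = len(true_positives) / len(reviewed) if reviewed else 1.0
    return {
        "precision": precision,
        "recall": recall,
        "false_positives": sorted(false_positives),
        "missed": sorted(missed),
    }


predicted = {
    ("patients.ssn", "SSN"),
    ("patients.zip", "SSN"),        # a ZIP column wrongly matched by a loose pattern
    ("patients.email", "Email"),
}
reviewed = {
    ("patients.ssn", "SSN"),
    ("patients.email", "Email"),
    ("patients.contact", "Email"),  # a field the identifier missed
}
report = tagging_quality(predicted, reviewed)
```

Tracking the false positives and missed fields per identifier gives stewards a concrete list of patterns to tighten or broaden at each review.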

What's next?

Accurate and automated identification is the foundation for scalable data governance. Once sensitive fields and business-critical attributes are tagged, you can dynamically apply policies, safely provision data, and drive governed self-service with confidence.

Add classification

Immuta’s classification service builds on identification by categorizing data based on content and risk level, considering existing tags on the column, tags from neighboring columns, and table-level tags on the data source.

It then assigns a sensitivity level to each column. These classifications enhance governance by unlocking smarter automation:

  • Audit dashboards show access activity by sensitivity level.

  • Approval flows can use AI-driven risk scoring to determine whether to auto-approve access or require review.

For example, masked access to low-risk data might be auto-approved, while requests for unmasked or highly sensitive data could trigger manual approval. Classification helps you scale governed access while adapting to data risk in real time.

Automate policy enforcement

Accurate and automated identification is the foundation for dynamic policy enforcement. Once sensitive fields and business-critical attributes are tagged, Immuta can automatically apply policies that mask, filter, or restrict access without manual intervention.

For example:

  • An SSN column tagged by an identifier can automatically trigger a masking policy.

  • A department or hospital_name tag can drive row-level filtering based on the user’s attributes.

Because Immuta policies are driven by tags, newly added tables and columns that match identifier logic are automatically protected. With policies in place, your data is ready to be safely shared through the Request app, giving analysts, researchers, and partners timely access to the data they need while maintaining strong security and compliance controls.
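The following sketch illustrates why tag-driven policies cover new tables automatically. It is a simplified model, not Immuta's enforcement mechanism: the `filter_rows` function, the table and column names, and the deny-by-default behavior for users without the attribute are all assumptions made for illustration.

```python
def filter_rows(rows, column_tags, user_attributes, tag, attribute):
    """Keep only rows whose tagged column equals the user's attribute value.

    If no column carries `tag`, the policy does not apply and all rows pass.
    A user without the attribute sees no rows (deny by default).
    """
    filter_columns = [c for c, tags in column_tags.items() if tag in tags]
    if not filter_columns:
        return list(rows)
    allowed = user_attributes.get(attribute)
    return [r for r in rows if all(r[c] == allowed for c in filter_columns)]


user = {"hospital_name": "General"}

# An existing table whose column was tagged by an identifier.
visits = [{"hospital": "St. Mary", "visits": 12}, {"hospital": "General", "visits": 7}]
visits_tags = {"hospital": {"hospital_name"}, "visits": set()}

# A newly added table: different column name, same tag from the same identifier.
billing = [{"facility_name": "General", "total": 300},
           {"facility_name": "St. Mary", "total": 120}]
billing_tags = {"facility_name": {"hospital_name"}, "total": set()}

visible_visits = filter_rows(visits, visits_tags, user, "hospital_name", "hospital_name")
visible_billing = filter_rows(billing, billing_tags, user, "hospital_name", "hospital_name")
```

The same policy call protects both tables. The new table needed only a tag, not a policy change.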

Accelerate provisioning through the Request app

Once data is tagged through identification, governed through policies, and optionally classified by sensitivity, it becomes ready for safe and scalable delivery.

The Request app makes this provisioning process seamless. Tagged and protected datasets can be published as data products in the Request app, where users can discover and request access based on governed policies.

This process allows you to:

  • Automatically approve access to low-risk, fully masked datasets

  • Route higher-risk requests for manual approval using classification-driven logic

  • Track activity through audit logs enriched with sensitivity context

For example, a data product containing masked employee emails may be auto-approved for analytics teams, while a request for a table with unmasked patient records would trigger a manual approval workflow due to its classification as highly sensitive.
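The routing logic in this example can be sketched as a small decision function. This is a hypothetical model rather than the Request app's actual approval logic: the `Sensitivity` levels, the `route_request` function, and the rule that risk equals the highest sensitivity among unmasked columns are assumptions for illustration.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def route_request(columns, auto_approve_max=Sensitivity.LOW):
    """Decide how to handle an access request for a data product.

    `columns` is a list of dicts with a 'sensitivity' level and a 'masked' flag.
    Masked columns do not count toward risk; the request is auto-approved only
    if every unmasked column is at or below `auto_approve_max`.
    """
    exposed = [c["sensitivity"] for c in columns if not c["masked"]]
    risk = max(exposed, default=Sensitivity.LOW)
    return "auto_approve" if risk <= auto_approve_max else "manual_review"


employee_emails = [
    {"name": "email", "sensitivity": Sensitivity.MEDIUM, "masked": True},
    {"name": "team", "sensitivity": Sensitivity.LOW, "masked": False},
]
patient_records = [
    {"name": "diagnosis", "sensitivity": Sensitivity.HIGH, "masked": False},
]

email_decision = route_request(employee_emails)
patient_decision = route_request(patient_records)
```

Raising `auto_approve_max` loosens the rule for teams that accept more risk, while the default keeps anything above low sensitivity in front of a reviewer.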

By combining identification, classification, and policy enforcement, Immuta enables intelligent access decisions that balance data utility with risk. The Request app is the final step: allowing teams to access the data they need, when they need it, while giving governance teams full confidence that sensitive data remains protected.

  • Define detection logic using regex, dictionaries, or column name patterns.

  • Establish a tagging hierarchy based on your data model. Tags applied by identifiers can come from external catalogs, be manually created in Immuta, or originate from Immuta’s built-in discovered hierarchy.
