Manage Data Metadata How-to Guide


Before you author global data policies that mask columns or redact rows, data metadata must exist in Immuta so that policies can use it to identify the data to mask or redact.

This how-to guide demonstrates how to manually manage tags, use data identification, or use existing tags in external catalogs to identify data that should be targeted by a data policy.

For detailed explanations and examples of how to manage data metadata, see the Managing data metadata guide.

Requirement

Immuta permission: APPLICATION_ADMIN (if using an external catalog or identification) or GOVERNANCE (if manually adding tags in Immuta)

Prerequisites

Select your strategy

  • Fact-based (ABAC): Use this strategy to tag data sources at the column and table level.

  • Logic-based (orchestrated RBAC): Use this strategy to tag data sources at the table level.

Organize your data metadata

Logic-based (orchestrated RBAC)

Logic-based column tags require subjective decisions (not recommended):

  • Column ssn has column tag PII

  • Column f_name has column tag sensitive

  • Column dob has column tag indirect identifier

Orchestrated RBAC offers one way to accomplish this use case: lineage. Immuta's lineage feature (Snowflake only) can propagate tags based on transform lineage.

  1. Ensure tags are in a hierarchy that will support hierarchical matching. If you have the tags Strictly Confidential, Confidential, Internal, and Public, ensure that user attributes follow the same hierarchy. For example,

    1. A user with access to all data: Classification: Strictly Confidential

    2. A user with access to only Internal and Public: Classification: Strictly Confidential.Confidential.Internal
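The hierarchy described above can be illustrated with a minimal sketch of prefix matching. It assumes each tag is stored as a dot-delimited path in which more sensitive levels are parents of less sensitive ones; the `TAGS` mapping and both function names are hypothetical and are not part of Immuta's API.

```python
from typing import Dict, List, Tuple


def parse_path(value: str) -> Tuple[str, ...]:
    """Split a dot-delimited hierarchy such as 'A.B.C' into its segments."""
    return tuple(segment.strip() for segment in value.split("."))


def attribute_matches_tag(user_attribute: str, tag: str) -> bool:
    """Return True when the tag equals the attribute or sits beneath it."""
    attr_path = parse_path(user_attribute)
    tag_path = parse_path(tag)
    return tag_path[: len(attr_path)] == attr_path


# Assumed tag layout (hypothetical): each level is nested under the more
# sensitive level above it.
TAGS: Dict[str, str] = {
    "Strictly Confidential": "Strictly Confidential",
    "Confidential": "Strictly Confidential.Confidential",
    "Internal": "Strictly Confidential.Confidential.Internal",
    "Public": "Strictly Confidential.Confidential.Internal.Public",
}


def visible_levels(user_attribute: str) -> List[str]:
    """List the classification levels a user attribute would match."""
    return [
        level
        for level, tag in TAGS.items()
        if attribute_matches_tag(user_attribute, tag)
    ]
```

Under these assumptions, a user attribute of `Strictly Confidential` matches all four levels, while `Strictly Confidential.Confidential.Internal` matches only Internal and Public, which mirrors the two example users in the list.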

Enable schema monitoring

Enable schema monitoring to allow Immuta to actively monitor your data platform and detect when tables or columns are created or deleted. Immuta will then automatically register or disable those tables and update the tags.

If you registered your data through connections, object sync will ensure the objects in your database stay in sync with the registered objects in Immuta.
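Conceptually, this kind of monitoring amounts to comparing what is registered against what the data platform currently reports. The sketch below is only an illustration of that comparison, not Immuta's implementation; the `Schema` type, the `diff_schema` function, and the result keys are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Schema = Dict[str, Set[str]]  # table name -> set of column names


def diff_schema(registered: Schema, current: Schema) -> Dict[str, list]:
    """Compare the registered snapshot with what the platform reports now."""
    new_tables = sorted(current.keys() - registered.keys())
    dropped_tables = sorted(registered.keys() - current.keys())

    new_columns: List[Tuple[str, str]] = []
    dropped_columns: List[Tuple[str, str]] = []
    # Only tables present in both snapshots can have column-level changes.
    for table in sorted(current.keys() & registered.keys()):
        new_columns += [(table, c) for c in sorted(current[table] - registered[table])]
        dropped_columns += [(table, c) for c in sorted(registered[table] - current[table])]

    return {
        "tables_to_register": new_tables,
        "tables_to_disable": dropped_tables,
        "columns_added": new_columns,
        "columns_removed": dropped_columns,
    }
```

For example, if `customers` gains a `dob` column, `orders` disappears, and `payments` appears, the result lists `payments` to register, `orders` to disable, and `("customers", "dob")` as an added column whose tags would need updating.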

Apply tags to data in Immuta

There are several options for applying data tags:

  1. Use identification: This is the most powerful option. Immuta can discover your sensitive data, and you can extend the types of entities it discovers to include those specific to your business. Identification can run completely within your data platform, so no data leaves the platform for Immuta to analyze. Identification is more relevant for the ABAC approach because the tags are facts about the data.

  2. Sync tags from an external source: You may have already done all the work tagging your data in some external catalog or your own homegrown tool. If so, Immuta can pull those tags in and use them.

  3. Manually tag: Manually tag tables and columns in Immuta from within the UI, using the Immuta API, or when registering the data, either during initial registration or for tables discovered later through schema monitoring.
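To make the identification option more concrete, here is a minimal sketch of pattern-based tagging over sampled column values. It is an assumption-laden illustration rather than a description of how Immuta's identification works: the tag names, the regular expressions, the `suggest_tags` function, and the 0.8 threshold are all hypothetical.

```python
import re
from typing import Dict, Iterable, List, Pattern

# Hypothetical patterns; a business-specific entity would be added the same way.
PATTERNS: Dict[str, Pattern[str]] = {
    "SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "Email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def suggest_tags(
    samples: Iterable[str],
    patterns: Dict[str, Pattern[str]] = PATTERNS,
    threshold: float = 0.8,
) -> List[str]:
    """Suggest a tag when enough non-empty sample values fully match its pattern."""
    values = [value for value in samples if value]
    if not values:
        return []  # nothing to inspect, so nothing to suggest
    tags = []
    for tag, pattern in patterns.items():
        hits = sum(1 for value in values if pattern.fullmatch(value))
        if hits / len(values) >= threshold:
            tags.append(tag)
    return tags
```

Because a tag produced this way states a fact about the column's contents, it fits the fact-based (ABAC) strategy; what a policy does with that fact is decided separately.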

Next steps

Learn

Read these guides to learn more about using Immuta to mask sensitive data.

  1. Choose your path: orchestrated RBAC and ABAC: This section describes the two different approaches (or mix) you can take to managing policy and their tradeoffs.

  2. Managing user metadata: This guide explains why meaningful user metadata is critical to building scalable policy and covers the considerations around how and what to capture.

  3. Author policy: This guide describes how to define your global data policy logic.

Implement

Follow these guides to start using Immuta to mask sensitive data.

  1. Manage user metadata. Tag your users with attributes and groups that are meaningful for Immuta global policies.

  2. Author policy. Define your global data policy logic.
