Manage Data Metadata How-to Guide


Before authoring global subscription policies to automate access controls, data metadata must exist in Immuta so that it can be used in the policy to identify the data that should be governed.

This how-to guide demonstrates how to manually manage tags, use data identification, or use existing tags in external catalogs to identify data that should be governed by a subscription policy.

For detailed explanations and examples of how to manage data metadata, see the Managing data metadata guide.

Requirement

Immuta permission: APPLICATION_ADMIN (if using an external catalog or identification) or GOVERNANCE (if manually adding tags in Immuta)

Prerequisites

Select your strategy

  • Fact-based (ABAC): Use this strategy to tag data sources at the column and table level.

  • Logic-based (orchestrated RBAC): Use this strategy to tag data sources at the table level.

Organize your data metadata

Fact-based (ABAC)

Fact-based column tags are descriptive (recommended):

  • Column ssn has column tag social security number

  • Column f_name has column tag name

  • Column dob has column tags date and date of birth

Create tags that describe the data source columns.
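As a rough illustration of the fact-based tags listed above, the column-to-tag relationship can be pictured as a simple mapping. This is only a sketch in plain Python: the `column_tags` dictionary and the `columns_with_tag` helper are hypothetical names used for illustration and are not part of any Immuta API.

```python
# Hypothetical in-memory view of fact-based column tags (not an Immuta API).
# Each tag states a fact about what the column contains.
column_tags = {
    "ssn": ["social security number"],
    "f_name": ["name"],
    "dob": ["date", "date of birth"],
}


def columns_with_tag(tags_by_column, tag):
    """Return the names of the columns that carry the given tag, sorted."""
    return sorted(col for col, tags in tags_by_column.items() if tag in tags)


if __name__ == "__main__":
    print(columns_with_tag(column_tags, "date"))
```

Because each tag only describes the data, a policy can later decide how to treat columns tagged date of birth without the tags themselves having to change.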

Logic-based (orchestrated RBAC)

Logic-based column tags require subjective decisions (not recommended):

  • Column ssn has column tag PII

  • Column f_name has column tag sensitive

  • Column dob has column tag indirect identifier

  1. Use your tags as-is from your external catalog.

  2. Ensure tags are in a hierarchy that will support hierarchical matching.

    For example, if you have the tags Strictly Confidential, Confidential, Internal, and Public, ensure that user attributes follow the same hierarchy:

    • A user with access to all data: Classification: Strictly Confidential

    • A user with access to only Internal and Public: Classification: Strictly Confidential.Confidential.Internal
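The hierarchy example above can be sketched in a few lines of Python. This is a minimal illustration, not Immuta's implementation. It assumes that data tags are stored as full dot-separated paths (for example, `Strictly Confidential.Confidential.Internal.Public`) and that a user attribute matches a tag when the tag is equal to the attribute or nested below it. The `matches` and `visible_tags` helpers are hypothetical names.

```python
# Assumed full tag paths, most restrictive level at the top of the hierarchy.
STRICT = "Strictly Confidential"
CONFIDENTIAL = STRICT + ".Confidential"
INTERNAL = CONFIDENTIAL + ".Internal"
PUBLIC = INTERNAL + ".Public"
ALL_TAGS = [STRICT, CONFIDENTIAL, INTERNAL, PUBLIC]


def matches(user_attribute, data_tag):
    """True if the tag equals the attribute or sits below it in the hierarchy."""
    return data_tag == user_attribute or data_tag.startswith(user_attribute + ".")


def visible_tags(user_attribute, tags):
    """Return the tags the attribute matches under the assumed rule."""
    return [tag for tag in tags if matches(user_attribute, tag)]


if __name__ == "__main__":
    # Attribute at the top of the hierarchy: matches every level.
    print(visible_tags(STRICT, ALL_TAGS))
    # Attribute three levels down: matches only Internal and Public.
    print(visible_tags(INTERNAL, ALL_TAGS))
```

Appending the `.` separator before the prefix check keeps a partial segment from matching by accident.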

Just as hierarchy matters for user metadata, it also matters for data tags. The Managing user metadata guide discusses how user metadata is matched to data metadata. However, there are even simpler approaches that use data tag hierarchy without matching. These are covered in more detail in the Author policy guide, but they are important to understand as you plan your data tagging.

As an example, you can tag your data with Cars and also tag that same data with a more specific tag in the hierarchy, such as Cars.Nissan.Xterra. Then, when you build policies, you could allow administrators to access tables tagged Cars, but allow suv_inspectors to access only tables tagged Cars.Nissan.Xterra. This results in two separate policies landing on the same table, and Immuta handles the conflict between them. This approach scales well because you have far fewer policies to manage.
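The Cars example can be pictured with a small sketch. The guide does not spell out how the two policies are reconciled, so the code below assumes, purely for illustration, that policies landing on the same table are combined so that satisfying any one of them is enough. The `POLICIES` list and the `can_subscribe` helper are hypothetical and do not reflect Immuta's policy engine or API.

```python
# Hypothetical policy definitions: each grants one group access to tables
# that carry a given tag.
POLICIES = [
    {"tag": "Cars", "group": "administrators"},
    {"tag": "Cars.Nissan.Xterra", "group": "suv_inspectors"},
]


def can_subscribe(user_groups, table_tags, policies=POLICIES):
    """Return True if any policy that applies to the table admits the user.

    Assumption for this sketch: applicable policies are merged with OR.
    """
    applicable = [p for p in policies if p["tag"] in table_tags]
    return any(p["group"] in user_groups for p in applicable)


if __name__ == "__main__":
    xterra_table = {"Cars", "Cars.Nissan.Xterra"}  # both policies apply
    generic_table = {"Cars"}                       # only the Cars policy applies
    print(can_subscribe({"suv_inspectors"}, xterra_table))
    print(can_subscribe({"suv_inspectors"}, generic_table))
    print(can_subscribe({"administrators"}, generic_table))
```

Under this assumption, two tag-driven policies cover every table in the hierarchy, and adding a new table only requires tagging it.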

Enable schema monitoring

Enable schema monitoring so that Immuta actively monitors your data platform and detects when tables or columns are created or deleted. Immuta then automatically registers or disables those tables and updates the tags.

If you registered your data through connections, object sync keeps the registered objects in Immuta synchronized with the objects in your database.
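Conceptually, the detection step behind schema monitoring is a comparison between what is already registered and what currently exists on the data platform. The sketch below shows that idea with plain Python sets. It is not how Immuta implements monitoring, and the `diff_tables` helper and the table names are hypothetical.

```python
def diff_tables(registered, current):
    """Compare registered tables with those currently on the platform.

    Returns (to_register, to_disable): tables that are new on the platform,
    and tables that are registered but no longer exist there.
    """
    to_register = sorted(set(current) - set(registered))
    to_disable = sorted(set(registered) - set(current))
    return to_register, to_disable


if __name__ == "__main__":
    registered = {"sales.orders", "sales.customers"}
    current = {"sales.orders", "sales.returns"}
    new_tables, removed_tables = diff_tables(registered, current)
    print("register:", new_tables)
    print("disable:", removed_tables)
```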

Apply tags to data in Immuta

There are several options for applying data tags:

  1. Use identification: This is the most powerful option. Immuta can discover your sensitive data, and you can extend the types of entities it discovers to include those specific to your business. Identification can run completely within your data platform, so no data has to leave the platform for Immuta to analyze it. Identification is more relevant for the ABAC approach because the tags are facts about the data.

  2. Sync tags from an external source: You may have already done all the work tagging your data in some external catalog or your own homegrown tool. If so, Immuta can pull those tags in and use them.

  3. Manually tag: Manually tag tables and columns in Immuta through the UI, through the Immuta API, or when registering the data. Tagging at registration applies both to the initial registration and to tables discovered later through schema monitoring.

Next steps

Learn

Read these guides to learn more about using Immuta to automate data access control decisions.

  1. Choose your path: orchestrated RBAC and ABAC: This section describes the two approaches (or a mix of both) you can take to managing policy, along with their tradeoffs.

  2. Managing user metadata: This guide explains why meaningful user metadata is critical to building scalable policy and covers the considerations around how and what to capture.

  3. Author policy: This guide describes how to define your global subscription policy logic.

Implement

Follow these guides to start using Immuta to automate data access control decisions.

  1. Manage user metadata. Tag your users with attributes and groups that are meaningful for Immuta global policies.

  2. Author policy. Define your global subscription policy logic.
