LogoLogo
2025.1Book a demo
  • Immuta Documentation - 2025.1
  • Configuration
    • Deploy Immuta
      • Requirements
      • Install
        • Managed Public Cloud
        • Red Hat OpenShift
      • Upgrade
        • Migrating to the New Helm Chart
        • Upgrading IEHC
      • Guides
        • Ingress Configuration
        • TLS Configuration
        • Cosign Verification
        • Production Best Practices
        • Rotating Credentials
        • External Cache Configuration
        • Enabling Legacy Query Engine
        • Private Container Registries
        • Air-Gapped Environments
      • Disaster Recovery
      • Troubleshooting
      • Conventions
    • Connect Data Platforms
      • Data Platforms Overview
      • Amazon S3
      • AWS Lake Formation
        • Register an AWS Lake Formation Connection
        • AWS Lake Formation Reference Guide
      • Azure Synapse Analytics
        • Getting Started with Azure Synapse Analytics
        • Configure Azure Synapse Analytics Integration
        • Reference Guides
          • Azure Synapse Analytics Integration
          • Azure Synapse Analytics Pre-Configuration Details
      • Databricks
        • Databricks Spark
          • Getting Started with Databricks Spark
          • How-to Guides
            • Configure a Databricks Spark Integration
            • Manually Update Your Databricks Cluster
            • Install a Trusted Library
            • Project UDFs Cache Settings
            • Run R and Scala spark-submit Jobs on Databricks
            • DBFS Access
            • Troubleshooting
          • Reference Guides
            • Databricks Spark Integration Configuration
              • Installation and Compliance
              • Customizing the Integration
              • Setting Up Users
              • Spark Environment Variables
              • Ephemeral Overrides
            • Security and Compliance
            • Registering and Protecting Data
            • Accessing Data
              • Delta Lake API
        • Databricks Unity Catalog
          • Getting Started with Databricks Unity Catalog
          • How-to Guides
            • Register a Databricks Unity Catalog Connection
            • Configure a Databricks Unity Catalog Integration
            • Migrate to Unity Catalog
          • Databricks Unity Catalog Integration Reference Guide
      • Google BigQuery
      • Redshift
        • Getting Started with Redshift
        • How-to Guides
          • Configure Redshift Integration
          • Configure Redshift Spectrum
        • Reference Guides
          • Redshift Integration
          • Redshift Pre-Configuration Details
      • Snowflake
        • Getting Started with Snowflake
        • How-to Guides
          • Register a Snowflake Connection
          • Configure a Snowflake Integration
          • Snowflake Table Grants Migration
          • Edit or Remove Your Snowflake Integration
          • Integration Settings
            • Enable Snowflake Table Grants
            • Use Snowflake Data Sharing with Immuta
            • Configure Snowflake Lineage Tag Propagation
            • Enable Snowflake Low Row Access Policy Mode
              • Upgrade Snowflake Low Row Access Policy Mode
        • Reference Guides
          • Snowflake Integration
          • Snowflake Data Sharing
          • Snowflake Lineage Tag Propagation
          • Snowflake Low Row Access Policy Mode
          • Snowflake Table Grants
          • Warehouse Sizing Recommendations
        • Explanatory Guides
          • Phased Snowflake Onboarding
      • Starburst (Trino)
        • Getting Started with Starburst (Trino)
        • How-to Guides
          • Configure Starburst (Trino) Integration
          • Customize Read and Write Access Policies for Starburst (Trino)
        • Starburst (Trino) Integration Reference Guide
      • Queries Immuta Runs in Remote Platforms
      • Legacy Integrations
        • Securing Hive and Impala Without Sentry
        • Enabling ImmutaGroupsMapping
      • Connect Your Data
        • Connections
          • How-to Guides
            • Run Object Sync
            • Manage Connection Settings
            • Use the Connection Upgrade Manager
              • Troubleshooting
          • Reference Guides
            • Connections Reference Guide
            • Upgrading to Connections
              • Before You Begin
              • API Changes
              • FAQ
        • Data Sources
          • Data Sources in Immuta
          • Register Data Sources
            • Amazon S3 Data Source
            • Azure Synapse Analytics Data Source
            • Databricks Data Source
            • Google BigQuery Data Source
            • Redshift Data Source
            • Snowflake Data Source
              • Bulk Create Snowflake Data Sources
            • Starburst (Trino) Data Source
          • Data Source Settings
            • How-to Guides
              • Manage Data Sources and Data Source Settings
              • Manage Data Source Members
              • Manage Access Requests and Tasks
              • Manage Data Dictionary Descriptions
              • Disable Immuta from Sampling Raw Data
            • Data Source Health Checks Reference Guide
          • Schema Monitoring
            • How-to Guides
              • Run Schema Monitoring and Column Detection Jobs
              • Manage Schema Monitoring
            • Reference Guides
              • Schema Monitoring
              • Schema Projects
            • Why Use Schema Monitoring?
    • Manage Data Metadata
      • Connect External Catalogs
        • Getting Started with External Catalogs
        • Configure an External Catalog
        • Reference Guides
          • External Catalogs
          • Custom REST Catalogs
            • Custom REST Catalog Interface Endpoints
      • Data Identification
        • Introduction
        • Getting Started with Data Identification
        • How-to Guides
          • Use Identification
          • Manage Identifiers
          • Run and Manage Identification
          • Manage Identification Frameworks
          • Use Sensitive Data Discovery (SDD)
        • Reference Guides
          • How Competitive Criteria Analysis Works
          • Built-in Identifier Reference
            • Built-In Identifier Changelog
          • Built-in Discovered Tags Reference
      • Data Classification
        • How-to Guides
          • Activate Classification Frameworks
          • Adjust Identification and Classification Framework Tags
          • How to Use a Built-In Classification Framework with Your Own Tags
        • Classification Frameworks Reference Guide
      • Manage Tags
        • How-to Guides
          • Create and Manage Tags
          • Add Tags to Data Sources and Projects
        • Tags Reference Guide
    • Manage Users
      • Getting Started with Users
      • Identity Managers (IAMs)
        • How-to Guides
          • Okta LDAP Interface
          • OpenID Connect
            • OpenID Connect Protocol
            • Okta and OpenID Connect
            • OneLogin with OpenID Connect
          • SAML
            • SAML Protocol
            • Microsoft Entra ID
            • Okta SAML SCIM
        • Reference Guides
          • Identity Managers
          • SAML Single Logout
          • SAML Protocol Configuration Options
      • Immuta Users
        • How-to Guides
          • Managing Personas and Permissions
          • Manage Attributes and Groups
          • User Impersonation
          • External User ID Mapping
          • External User Info Endpoint
        • Reference Guides
          • Attributes and Groups in Immuta
          • Permissions and Personas
    • Organize Data into Domains
      • Getting Started with Domains
      • Domains Reference Guide
    • Application Settings
      • How-to Guides
        • App Settings
        • BI Tools
          • BI Tool Configuration Recommendations
          • Power BI Configuration Example
          • Tableau Configuration Example
        • Add a License Key
        • Add ODBC Drivers
        • Manage Encryption Keys
        • System Status Bundle
      • Reference Guides
        • Data Processing, Encryption, and Masking Practices
        • Metadata Ingestion
  • Governance
    • Introduction
      • Automate Data Access Control Decisions
        • The Two Paths: Orchestrated RBAC and ABAC
        • Managing User Metadata
        • Managing Data Metadata
        • Author Policy
        • Test and Deploy Policy
      • Compliantly Open More Sensitive Data for ML and Analytics
        • Managing User Metadata
        • Managing Data Metadata
        • Author Policy
    • Author Policies for Data Access Control
      • Introduction
        • Scalability and Evolvability
        • Understandability
        • Distributed Stewardship
        • Consistency
        • Availability of Data
      • Policies
        • Authoring Policies at Scale
        • Data Engineering with Limited Policy Downtime
        • Subscription Policies
          • How-to Guides
            • Author a Subscription Policy
            • Author an ABAC Subscription Policy
            • Subscription Policies Advanced DSL Guide
            • Author a Restricted Subscription Policy
            • Clone, Activate, or Stage a Global Policy
          • Reference Guides
            • Subscription Policies
            • Subscription Policy Access Types
            • Advanced Use of Special Functions
        • Data Policies
          • Overview
          • How-to Guides
            • Author a Masking Data Policy
            • Author a Minimization Policy
            • Author a Purpose-Based Restriction Policy
            • Author a Restricted Data Policy
            • Author a Row-Level Policy
            • Author a Time-Based Restriction Policy
            • Policy Certifications and Diffs
          • Reference Guides
            • Data Policy Types
            • Masking Policies
            • Row-Level Policies
            • Custom WHERE Clause Functions
            • Data Policy Conflicts and Fallback
            • Custom Data Policy Certifications
            • Orchestrated Masking Policies
      • Projects and Purpose-Based Access Control
        • Projects and Purpose Controls
          • Getting Started
          • How-to Guides
            • Create a Project
            • Create and Manage Purposes
            • Project Management
              • Manage Projects and Project Settings
              • Manage Project Data Sources
              • Manage Project Members
          • Reference Guides
            • Projects and Purposes
          • Why Use Purposes?
        • Equalized Access
          • Manage Project Equalization
          • Project Equalization Reference Guide
          • Why Use Project Equalization?
        • Masked Joins
          • Enable Masked Joins
          • Why Use Masked Joins?
        • Writing to Projects
          • How-to Guides
            • Create and Manage Snowflake Project Workspaces
            • Create and Manage Databricks Spark Project Workspaces
            • Write Data to the Workspace
          • Reference Guides
            • Project Workspaces
            • Project UDFs (Databricks)
    • Observe Access and Activity
      • Introduction
      • Audit
        • How-to Guides
          • Export Audit Logs to S3
          • Export Audit Logs to ADLS
          • Run Governance Reports
        • Reference Guides
          • Universal Audit Model (UAM)
            • UAM Schema
          • Query Audit Logs
            • Snowflake Query Audit Logs
            • Databricks Unity Catalog Query Audit Logs
            • Databricks Spark Query Audit Logs
            • Starburst (Trino) Query Audit Logs
          • Audit Export GraphQL Reference Guide
          • Governance Report Types
          • Unknown Users in Audit Logs
      • Dashboards
        • Use the Audit Dashboards How-To Guide
        • Audit Dashboards Reference Guide
      • Monitors
        • Manage Monitors and Observations
        • Monitors Reference Guide
    • Access Data
      • Subscribe to a Data Source
      • Query Data
        • Querying Snowflake Data
        • Querying Databricks Data
        • Querying Databricks SQL Data
        • Querying Starburst (Trino) Data
        • Querying Redshift Data
        • Querying Azure Synapse Analytics Data
        • Connect to a Database Tool to Run Ad Hoc Queries
      • Subscribe to Projects
  • Releases
    • Release Notes
      • Immuta v2025.1 Release Notes
        • User Interface Changes in v2025.1 LTS
      • Immuta LTS Changelog
      • Immuta Image Digests
      • Immuta CLI Release Notes
    • Immuta Release Lifecycle
    • Immuta Support Matrix Overview
    • Preview Features
      • Features in Preview
    • Deprecations and EOL
  • Developer Guides
    • The Immuta CLI
      • Install and Configure the Immuta CLI
      • Manage Your Immuta Tenant
      • Manage Data Sources
      • Manage Sensitive Data Discovery
        • Manage Sensitive Data Discovery Rules
        • Manage Identification Frameworks
        • Run Sensitive Data Discovery on Data Sources
      • Manage Policies
      • Manage Projects
      • Manage Purposes
      • Manage Audit
    • The Immuta API
      • Integrations API
        • Getting Started
        • How-to Guides
          • Configure an Amazon S3 Integration
          • Configure an Azure Synapse Analytics Integration
          • Configure a Databricks Unity Catalog Integration
          • Configure a Google BigQuery Integration
          • Configure a Redshift Integration
          • Configure a Snowflake Integration
          • Configure a Starburst (Trino) Integration
        • Reference Guides
          • Integrations API Endpoints
          • Integration Configuration Payload
          • Response Schema
          • HTTP Status Codes and Error Messages
      • Connections API
        • How-to Guides
          • Register a Connection
            • Register a Snowflake Connection
            • Register a Databricks Unity Catalog Connection
            • Register an AWS Lake Formation Connection
          • Manage a Connection
          • Deregister a Connection
        • Connection Registration Payloads Reference Guide
      • Immuta V2 API
        • Data Source Payload Attribute Details
        • Data Source Request Payload Examples
        • Create Policies API Examples
        • Create Projects API Examples
        • Create Purposes API Examples
      • Immuta V1 API
        • Authenticate with the API
        • Configure Your Instance of Immuta
          • Get Job Status
          • Manage Frameworks
          • Manage IAMs
          • Manage Licenses
          • Manage Notifications
          • Manage Tags
          • Manage Webhooks
          • Search Filters
          • Manage Identification
            • Identification Frameworks to Identifiers in Domains
            • Manage Sensitive Data Discovery (SDD)
        • Connect Your Data
          • Create and Manage an Amazon S3 Data Source
          • Create an Azure Synapse Analytics Data Source
          • Create an Azure Blob Storage Data Source
          • Create a Databricks Data Source
          • Create a Presto Data Source
          • Create a Redshift Data Source
          • Create a Snowflake Data Source
          • Create a Starburst (Trino) Data Source
          • Manage the Data Dictionary
        • Use Domains
        • Manage Data Access
          • Manage Access Requests
          • Manage Data and Subscription Policies
          • Manage Write Policies
            • Write Policies Payloads and Response Schema Reference Guide
          • Policy Handler Objects
          • Search Connection Strings
          • Search for Organizations
          • Search Schemas
        • Subscribe to and Manage Data Sources
        • Manage Projects and Purposes
          • Manage Projects
          • Manage Purposes
        • Generate Governance Reports
Powered by GitBook

Other versions

  • SaaS
  • 2025.1
  • 2024.3
  • 2024.2

Copyright © 2014-2024 Immuta Inc. All rights reserved.

On this page
  • Considerations
  • Orchestrated RBAC: user attributes and groups
  • ABAC: user attributes and groups
  • Applying user metadata

Was this helpful?

Export as PDF
  1. Governance
  2. Introduction
  3. Automate Data Access Control Decisions

Managing User Metadata

Last updated 4 months ago

Was this helpful?

No matter if you choose orchestrated RBAC or ABAC as described in the , you must have metadata on your users. This can be done , via your , or other .

Doing so allows you to build scalable policies that do not reference individual users and instead use metadata about them to target them with access decisions. In Immuta, user metadata is termed user attributes and groups.

Considerations

Referring back to our triangle of access decision, user metadata is the first point of that triangle. User metadata is made up of the following elements:

  • Attributes: key:value pairs tied to your users

  • Groups: a single value tied to your users - you can think of groups as containing users. Groups can also have their own attributes, a shortcut for assigning attributes to all members of a group. In either case, multiple users can be attached to the same attribute or group.

Fact-based user metadata (ABAC) allows you to decouple policy logic from user metadata, allowing highly scalable ABAC policy authoring:

  • Steve has attribute: country:USA

  • Sara has attribute: role:administrator

  • Stephanie has attribute: sales_region: Ohio, Michigan, Indiana

  • Sam is in group: developers

  • Group developers has attribute: organization:engineering

Logic-based user metadata (orchestrated-RBAC) couples user metadata with access logic:

  • Steve has attribute: access_to:USA

  • Sara has attribute: role:admin_functions

  • Stephanie has groups: Ohio_sales, Michigan_sales, Indiana_sales

  • Sam is in group: all_developer_data

  • Group developer has attribute: access_to:AWS_all

The one exception where you have the above scenario but may want to use orchestrated RBAC is if these multiple paths for access to the same objects are a true hierarchy. For example, data is tagged with Strictly Confidential, Confidential, Internal, and Public:

  • users assigned group Strictly Confidential can see data tagged Strictly Confidential, Confidential, Internal, and Public

  • users assigned group Confidential can see data tagged Confidential, Internal, and Public

Orchestrated RBAC: user attributes and groups

Without getting into the details of how the policy is authored (yet), it is possible to leverage a user attribute hierarchy to contribute to access decisions through hierarchical matching. For example, users with attributes that are either a hierarchical parent or an exact match to a given data tag can be subscribed to the data source via a hierarchical match. To better illustrate this behavior, see the matrix below:

User attributes

Data source Tags

Subscribed?

Notes

'Exercise': ['Gym.Treadmill', 'Sports.Football.Quarterback']

['Athletes.Performance', 'Sports.Football.Quarterback']

Yes

Exact match on 'Sports.Football.Quarterback'

'Exercise': ['Gym.Weightlifting', 'Sports.Football']

['Athletes.Performance', 'Sports.Football.Quarterback']

Yes

User attribute 'Sports.Football' is a hierarchical parent of data source tag 'Sports.Football.Quarterback'

'News Articles': ['Gym.Weightlifting', 'Sports.Football']

['Athletes.Performance', 'Sports.Football.Quarterback']

No

The policy is written to only match values under the 'Exercise' attribute key. Not 'News Articles'.

'Exercise': ['Sports']

['Sports.Baseball.Pitcher']

Yes

User attribute 'Sports' is a hierarchical parent of data source tag 'Sports.Baseball.Pitcher'

'Exercise': ['Gym.Gymnastics.Trapeze', 'Sports.Football.Quarterback']

['Gym.Gymnastics', 'Sports.Football']

No

Hierarchical matching only happens in one direction (user attribute contains data source tag). In this case, the user attributes are considered hierarchical children of the data source tags.

So you should take care organizing your user attributes with a hierarchy that will support the hierarchical matching.

Going back to our hierarchical example of Strictly Confidential, Confidential, Internal, and Public above, you would want to ensure that user attributes follow the same hierarchy. For example,

  • A user with access to all data: Classification: Strictly Confidential

  • A user with access to only Internal and Public: Classification: Strictly Confidential.Confidential.Internal

Remembering the last row in Figure 3, hierarchical matching only happens in a single direction → user attribute contains the data source tag. Since the user attribute is more specific (Strictly Confidential.Confidential.Internal) than data tagged Strictly Confidential and Strictly Confidential.Confidential, the user would only gain access to data tagged Strictly Confidential.Confidential.Internal and Strictly Confidential.Confidential.Internal.Public.

ABAC: user attributes and groups

To use ABAC, you need to divorce your mind from thinking that every permutation of access requires a new role. (We use the word role interchangeably with groups in this document, because many identity and access systems call groups roles, especially when they contain policy logic.)

Instead you need to focus on the facts about your users they must possess in order to grant them access to data. Ask yourself, why should someone have access to this data? It’s easiest to come at this from the compliance perspective and literally ask yourself that question for all the compliance rules that exist in your organization.

Why should someone have access to my PHI data? Example answers might be

  • If they are an analyst

  • If they have taken privacy training

So, at a minimum, you are going to need to attach attributes or groups to users representing if they are an analyst and if they have taken privacy training. Note you do not have to (nor should you) negate these attributes, meaning nobody should have a group not_analyst and no_privacy_training.

Repeat this process until you have a good starting point for the user attributes you need. This is a lot of upfront work and may lead to the perception that ABAC also suffers from attribute explosion. However, when attributes are well enumerated, there’s a natural limit to how many different relevant attributes will be present in an organization. This differs substantially from the world where every permutation of access is represented with a role, where role growth starts off slow, but increases linearly over time.

As an illustration, well-defined subject attributes tend to follow the red curve over time, while roles and poorly defined subject attributes tend to follow the blue and quickly become unwieldy.

Applying user metadata

Where do those attributes and groups come from? Really, wherever you want - Immuta has several interfaces to make this work, and you can combine them:

Referring back to will help drive how you manage user metadata.

Typically these groups (most common) and attributes come from your identity management system. If they do, they also potentially contain access logic as part of them (examples in the logic-based user metadata list above). It’s not the end of the world if they do, though this means you are potentially set up for . If each group from your identity management system represents a single static fact that provides access for a set of objects, with no overlap with other facts that provide access - then you can use orchestrated RBAC. A simple example to demonstrate groups ready for orchestrated RBAC is “you must have signed license agreement x to see data tagged x” - it’s a simple fact: you’ve signed license agreement x, that provides access to data tagged x. There are no other paths to data tagged x.

But the key to orchestrated RBAC is that it’s a 1:1 relationship with the group from your identity manager and the set of objects that group gives them access to. If this is not true, and there are multiple permutations of groups that have implied policy logic and that logic provides access to the same sets of objects, then you are in a situation with role explosion. This is a problem that Immuta can help you solve, and you should solve that problem with .

In this case, you could use even though there are multiple paths of access to the same objects because it is a one-to-one relationship represented in a hierarchy that contains all paths. How you actually write this policy will be described later.

Since orchestrated RBAC is a one-to-one match (or a true hierarchy), you can use your groups as-is from your or other .

Option 1: Store them in your . If you have the power to do this, you should. It becomes the primary source of facts about your users instead of the source of policy decisions, which should not live in your identity manager. We recommend synchronizing this user metadata from your identity manager to Immuta over SCIM where possible.

Option 2: Store them . It is possible to manually add attributes and groups to users directly in the Immuta UI or through the API.

Option 3: Refer to them from some other . For example, perhaps training courses are already stored in another system. If so, you can able to build an external user into endpoint to pull those groups (or attributes) into Immuta. However, there are some considerations with this approach tied to which IAM protocol you use. If your identity manager restricts you from using the approach in the , you can instead periodically push the attributes from the source of truth into Immuta using our APIs.

All of these options assume you have already . This has to be done in order to create the user identities that these attributes and groups will be attached to.

the two paths: orchestrated RBAC and ABAC
identity manager
custom sources
identity manager
directly in Immuta
orchestrated RBAC
ABAC
orchestrated RBAC
prior guide
in Immuta
identity manager
custom sources
external source of truth
“Consume attributes/groups from arbitrary sources” row in the IAM protocol table
configured your identity manager in Immuta
Figure 2: User Metadata Hierarchy Example