Managing User Metadata

Overview

No matter if you choose orchestrated RBAC or ABAC as described in the prior guide, you must have metadata on your users. This can be done in Immuta, via your identity manager, or other custom sources.

Doing so allows you to build scalable policies that do not reference individual users and instead use metadata about them to target them with access decisions. In Immuta, user metadata is termed user attributes and groups.

Considerations

Figure 1: The Triangle of Data Access Decisions

Referring back to our triangle of access decision, user metadata is the first point of that triangle. User metadata is made up of the following elements:

Attributes: key:value pairs tied to your users
Groups: a single value tied to your users - you can think of groups as containing users. Groups can also have their own attributes, a shortcut for assigning attributes to all members of a group. In either case, multiple users can be attached to the same attribute or group.

Referring back to the two paths: orchestrated RBAC and ABAC will help drive how you manage user metadata.

Fact-based user metadata (ABAC) allows you to decouple policy logic from user metadata, allowing highly scalable ABAC policy authoring:

Steve has attribute: country:USA
Sara has attribute: role:administrator
Stephanie has attribute: sales_region: Ohio, Michigan, Indiana
Sam is in group: developers
Group developers has attribute: organization:engineering

Logic-based user metadata (orchestrated-RBAC) couples user metadata with access logic:

Steve has attribute: access_to:USA
Sara has attribute: role:admin_functions
Stephanie has groups: Ohio_sales, Michigan_sales, Indiana_sales
Sam is in group: all_developer_data
Group developer has attribute: access_to:AWS_all

Typically these groups (most common) and attributes come from your identity management system. If they do, they also potentially contain access logic as part of them (examples in the logic-based user metadata list above). It’s not the end of the world if they do, though this means you are potentially set up for orchestrated RBAC. If each group from your identity management system represents a single static fact that provides access for a set of objects, with no overlap with other facts that provide access - then you can use orchestrated RBAC. A simple example to demonstrate groups ready for orchestrated RBAC is “you must have signed license agreement x to see data tagged x” - it’s a simple fact: you’ve signed license agreement x, that provides access to data tagged x. There are no other paths to data tagged x.

But the key to orchestrated RBAC is that it’s a 1:1 relationship with the group from your identity manager and the set of objects that group gives them access to. If this is not true, and there are multiple permutations of groups that have implied policy logic and that logic provides access to the same sets of objects, then you are in a situation with role explosion. This is a problem that Immuta can help you solve, and you should solve that problem with ABAC.

The one exception where you have the above scenario but may want to use orchestrated RBAC is if these multiple paths for access to the same objects are a true hierarchy. For example, data is tagged with Strictly Confidential, Confidential, Internal, and Public:

users assigned group Strictly Confidential can see data tagged Strictly Confidential, Confidential, Internal, and Public
users assigned group Confidential can see data tagged Confidential, Internal, and Public;

Figure 2: User Metadata Hierarchy Example

In this case, you could use orchestrated RBAC even though there are multiple paths of access to the same objects because it is a one-to-one relationship represented in a hierarchy that contains all paths. How you actually write this policy will be described later.

Orchestrated RBAC: user attributes and groups

Since orchestrated RBAC is a one-to-one match (or a true hierarchy), you can use your groups as-is from your identity manager or other custom sources.

Without getting into the details of how the policy is authored (yet), it is possible to leverage a user attribute hierarchy to contribute to access decisions through hierarchical matching. For example, users with attributes that are either a hierarchical parent or an exact match to a given data tag can be subscribed to the data source via a hierarchical match. To better illustrate this behavior, see the matrix below:

User attributes	Data source Tags	Subscribed?	Notes
'Exercise': ['Gym.Treadmill', 'Sports.Football.Quarterback']	['Athletes.Performance', 'Sports.Football.Quarterback']	Yes	Exact match on 'Sports.Football.Quarterback'
'Exercise': ['Gym.Weightlifting', 'Sports.Football']	['Athletes.Performance', 'Sports.Football.Quarterback']	Yes	User attribute 'Sports.Football' is a hierarchical parent of data source tag 'Sports.Football.Quarterback'
'News Articles': ['Gym.Weightlifting', 'Sports.Football']	['Athletes.Performance', 'Sports.Football.Quarterback']	No	The policy is written to only match values under the 'Exercise' attribute key. Not 'News Articles'.
'Exercise': ['Sports']	['Sports.Baseball.Pitcher']	Yes	User attribute 'Sports' is a hierarchical parent of data source tag 'Sports.Baseball.Pitcher'
'Exercise': ['Gym.Gymnastics.Trapeze', 'Sports.Football.Quarterback']	['Gym.Gymnastics', 'Sports.Football']	No	Hierarchical matching only happens in one direction (user attribute contains data source tag). In this case, the user attributes are considered hierarchical children of the data source tags.

So you should take care organizing your user attributes with a hierarchy that will support the hierarchical matching.

Going back to our hierarchical example of Strictly Confidential, Confidential, Internal, and Public above, you would want to ensure that user attributes follow the same hierarchy. For example,

A user with access to all data: Classification: Strictly Confidential
A user with access to only Internal and Public: Classification: Strictly Confidential.Confidential.Internal

Remembering the last row in Figure 3, hierarchical matching only happens in a single direction → user attribute contains the data source tag. Since the user attribute is more specific (Strictly Confidential.Confidential.Internal) than data tagged Strictly Confidential and Strictly Confidential.Confidential, the user would only gain access to data tagged Strictly Confidential.Confidential.Internal and Strictly Confidential.Confidential.Internal.Public.

ABAC: user attributes and groups

To use ABAC, you need to divorce your mind from thinking that every permutation of access requires a new role. (We use the word role interchangeably with groups in this document, because many identity and access systems call groups roles, especially when they contain policy logic.)

Instead you need to focus on the facts about your users they must possess in order to grant them access to data. Ask yourself, why should someone have access to this data? It’s easiest to come at this from the compliance perspective and literally ask yourself that question for all the compliance rules that exist in your organization.

Why should someone have access to my PHI data? Example answers might be

If they are an analyst
If they have taken privacy training

So, at a minimum, you are going to need to attach attributes or groups to users representing if they are an analyst and if they have taken privacy training. Note you do not have to (nor should you) negate these attributes, meaning nobody should have a group not_analyst and no_privacy_training.

Repeat this process until you have a good starting point for the user attributes you need. This is a lot of upfront work and may lead to the perception that ABAC also suffers from attribute explosion. However, when attributes are well enumerated, there’s a natural limit to how many different relevant attributes will be present in an organization. This differs substantially from the world where every permutation of access is represented with a role, where role growth starts off slow, but increases linearly over time.

As an illustration, well-defined subject attributes tend to follow the red curve over time, while roles and poorly defined subject attributes tend to follow the blue and quickly become unwieldy.

Figure 4: Attributes vs Roles

Applying user metadata

Where do those attributes and groups come from? Really, wherever you want - Immuta has several interfaces to make this work, and you can combine them:

Option 1: Store them in your identity manager. If you have the power to do this, you should. It becomes the primary source of facts about your users instead of the source of policy decisions, which should not live in your identity manager. We recommend synchronizing this user metadata from your identity manager to Immuta over SCIM where possible. (This concept was covered in the Detect getting started.)

Option 2: Store them directly in Immuta. It is possible to manually add attributes and groups to users directly in the Immuta UI or through the API.

Option 3: Refer to them from some other external source of truth. For example, perhaps training courses are already stored in another system. If so, you can able to build an external user into endpoint to pull those groups (or attributes) into Immuta. However, there are some considerations with this approach tied to which IAM protocol you use. If your identity manager restricts you from using the approach in the “Consume attributes/groups from arbitrary sources” row in the IAM protocol table, you can instead periodically push the attributes from the source of truth into Immuta using our APIs.

All of these options assume you have already configured your identity manager in Immuta. This has to be done in order to create the user identities that these attributes and groups will be attached to.