
Projects and Purposes

Audience: Data Owners, Data Users, and Data Governors

Content Summary: This overview describes the major features of projects and the concepts related to them.

Project Roles

The features and capabilities available to each user differ based on the user's role within the project and within Immuta. Roles and their capabilities are outlined below.

Capabilities by project role:

  • Project Owner: Users with the CREATE_PROJECT permission can manage project members, project documentation, and project subscription policies; enable Project Equalization and Masked Joins; manage project data sources, project tags, and discussion threads; disable, delete, and restore projects; create derived data sources; and manage native workspaces.

  • Project Governor: Users with the GOVERNANCE permission can configure project purposes and acknowledgement statements, manage project members and project subscription policies, add and remove project data sources, manage project tags, and disable and restore projects.

  • Project Manager: Users with the PROJECT_MANAGEMENT permission can approve or deny purposes in a project, create purposes, and add and remove project data sources.

  • Project Member: Once subscribed to a project, all users can add data sources to the project (unless Project Equalization or Masked Joins is enabled), remove data sources they’ve added to the project, post and reply to discussion threads, delete their own discussion threads and replies, and create derived data sources.

Project Purposes and Acknowledgement Statements

Data Governors and users with the PROJECT_MANAGEMENT permission are responsible for configuring and approving project purposes and acknowledgement statements.

Purposes

Best Practice: Purposes

Consider purposes as attributes. Attributes identify a user, and purposes identify why that user should have access.

Purposes help define the scope and use of data within a project and allow users to meet purpose restrictions on policies. Governors create and manage purposes and their sub-purposes, which project owners then add to their project(s) and use to drive Data Policies. If a project owner creates a purpose, it remains in a staged state until a Governor or Project Manager approves the purpose request.

[Image: Purposes Tab]

Sub-Purposes

Purposes can be constructed as a hierarchy, meaning that purposes can contain nested sub-purposes, much like tags in Immuta. This design allows more flexibility in managing purpose-based restriction policies and transparency in the relationships among purposes.

For example, consider this organization of the sub-purposes of Research:

[Image: Sub-Purpose Builder]

Instead of creating separate purposes, which must then each be added to policies as they evolve, a Governor could write the following Global Policy:

Limit usage to purpose(s) Research for everyone on data sources tagged PHI.

Now, any user acting under the purpose or sub-purpose of Research - whether Research.Marketing, Research.Onboarding.Customer, or Research.MedicalClaims - will meet the criteria of this policy. Consequently, purpose hierarchies eliminate the need for a Governor to re-write these Global Policies when sub-purposes are added or removed. Furthermore, if new projects with new Research purposes are added, for example, the relevant Global Policy will automatically be enforced.
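The hierarchy check described above can be sketched in a few lines. This is an illustrative model only, not Immuta's implementation: it assumes purposes are written as dot-separated paths, as in the Research.Marketing example, and the function name satisfies_purpose is hypothetical.

```python
def satisfies_purpose(acting_purpose: str, required_purpose: str) -> bool:
    """Return True if acting_purpose is required_purpose or one of its sub-purposes.

    Assumption (for illustration): a purpose hierarchy is a dot-separated path,
    e.g. "Research.Onboarding.Customer".
    """
    acting = acting_purpose.split(".")
    required = required_purpose.split(".")
    # The required purpose must be a leading segment of the acting purpose's path.
    return acting[: len(required)] == required


# A policy limited to Research is met by Research and every sub-purpose beneath it.
for purpose in [
    "Research.Marketing",
    "Research.Onboarding.Customer",
    "Research.MedicalClaims",
    "Marketing",
]:
    print(purpose, satisfies_purpose(purpose, "Research"))
```

Comparing whole path segments rather than raw string prefixes keeps an unrelated purpose with a similar name from matching by accident.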

Acknowledgement Statements

Projects with purposes require owners and subscribers to acknowledge that they will only use the data for those purposes by affirming or rejecting acknowledgement statements. This practice ensures that project members are aware of and agree to all purpose-based restrictions before accessing the project's content. Each purpose is associated with its own acknowledgement statement, so a project with multiple purposes requires users to accept more than one acknowledgement statement. Immuta keeps a record of whether each project member has agreed to the acknowledgement statement(s), along with the purpose associated with the acknowledgement, the time of the acknowledgement, and the text of the acknowledgement. All purposes are associated with the default acknowledgement statement unless their statement has been customized by a Data Governor.

If users accept the statement, they become project members. If they reject the acknowledgement statement, they are denied access to the project.

[Image: Project Member Acknowledgement]

Switching Project Contexts

When a user is working within the context of a project, they will only see the data in that project. This helps to prevent data leakage when users collaborate. Users can switch project contexts to access various data sources while acting under the appropriate purpose. By default, there will be no project selected, even if the user belongs to one or more projects in Immuta.

[Image: Project Context]

When users change project contexts, all SQL queries or blob fetches that run through Immuta will reflect users as acting under the purposes of that project, which may allow additional access to data if there are purpose restrictions on the data source(s). This process also allows organizations to track not just whether a specific data source is being used, but why.

Project UDFs

Use Project UDFs in Databricks

Currently, caches are not all invalidated outside of Databricks because Immuta caches information pertaining to a user's current project in the NameNode plugin and in Vulcan. Consequently, this feature should only be used in Databricks.

You can also switch project contexts, and view a list of your current project or available projects, through UDFs in Spark.

Virtual Tables:

To view a list of your current project or available projects in a Spark job, you can query these virtual tables.

  • immuta.get_current_project
    Query: select * from immuta.get_current_project
    Return: This virtual table returns a single row with "name" and "id" columns that show your currently selected project.

  • immuta.list_projects
    Query: select * from immuta.list_projects
    Return: This virtual table returns rows with "name," "id," and "current_project" columns. Each row is a different project to which you are subscribed (and can use as your current project). The "current_project" column will be true for the row defining the project that you have set as your current project.

UDF Documentation:

To see how to use the functions below, navigate to the Switching Project Contexts with UDFs tutorial.

  • immuta.set_current_project(id): Sets the user's current project to the project ID denoted by the id parameter.

  • immuta.set_current_project() (no parameters): Sets the user's current project to None.

  • immuta.clear_caches(): Clears all client caches for the current user's ImmutaClient instance. Use this to invalidate cached items, such as data source subscription information, or to refresh the cache when the state of Immuta has changed and the cache is outdated. For backward compatibility, this UDF is also available at default.immuta_clear_caches().

  • default.immuta_clear_metastore_cache(): Clears the cluster-wide Metastore cache. This UDF can only be run by a privileged user.
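As a rough sketch, the virtual tables and UDFs could be wrapped in small PySpark helpers like the ones below. Several details are assumptions rather than documented behavior: the helper names are hypothetical, the UDFs are assumed to be invoked through a plain select statement, and the project ID is assumed to be an integer. The linked tutorial remains the authoritative reference for usage.

```python
def current_project(spark):
    """Return the row describing the currently selected project, or None.

    `spark` is expected to be the SparkSession of a Databricks notebook or job.
    """
    rows = spark.sql("select * from immuta.get_current_project").collect()
    return rows[0] if rows else None


def set_project(spark, project_id=None):
    """Switch project context; with no project_id, reset the context to None.

    Assumption: the UDF is called via `select ...` and the ID is an integer.
    Returns the SQL statement that was issued.
    """
    if project_id is None:
        statement = "select immuta.set_current_project()"
    else:
        statement = f"select immuta.set_current_project({int(project_id)})"
    # collect() forces Spark to evaluate the statement so the UDF actually runs.
    spark.sql(statement).collect()
    return statement


# Example usage inside a Databricks notebook, where `spark` already exists
# (the project ID 7 is a placeholder):
#   set_project(spark, 7)
#   print(current_project(spark))
#   set_project(spark)  # back to no project selected
```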

Project Equalization

The same security restrictions regarding data sources are applied to projects; project members still need to be subscribed to data sources to access data, and only users with appropriate attributes and credentials can see the data if it contains any row-level or masking security.

However, Project Equalization improves collaboration by ensuring that the data in the project looks identical to all members, regardless of their level of access to data. When enabled, this feature automatically equalizes all permissions so that no project member has more access to data than the member with the least access. For a Project Equalization tutorial, navigate to Create a Project.

[Image: Project Equalization]

Note: Only project owners can add data sources to the project if this feature is enabled.

Once Project Equalization is enabled, the Subscription Policy for the project is locked and can only be adjusted by the project owner by changing the Equalized Entitlements. Click the tabs below for more information about the settings related to Project Equalization.

Equalized Entitlements

This setting adjusts the minimum entitlements (i.e., users' groups and attributes) required to join the project and to access data within the project. When Project Equalization is enabled, Equalized Entitlements default to Immuta's recommended settings, but project owners can edit these settings by adding or removing parts of the entitlements. However, making these changes entails two potential disadvantages:

  • If you add entitlements, members might see more data as a whole, but at least some members of the project will be out of compliance. The status of users' compliance is visible from the Members tab within the project.

    [Image: Compliance Status]

  • If you remove entitlements, the project will be open to users with fewer privileges, but this change might make less data visible to all project members. Removing entitlements is only recommended if you foresee new users joining with less access to data than the current members.
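The trade-off in the two points above can be illustrated with a simplified model. The sketch below treats entitlements as group names only (attributes are omitted), and the member names and the compliance_status function are hypothetical; it is not how Immuta computes compliance internally.

```python
def compliance_status(required_groups, members):
    """Map each member to True (compliant) or False (out of compliance).

    A member is treated as compliant when they hold every required group.
    """
    required = set(required_groups)
    return {name: required <= set(groups) for name, groups in members.items()}


# Hypothetical project members and the groups they belong to.
members = {
    "alice": {"Accounting", "Legal"},
    "bob": {"Accounting"},
}

# With the original entitlement, every member is compliant.
print(compliance_status({"Accounting"}, members))

# Adding an entitlement puts any member who lacks it out of compliance.
print(compliance_status({"Accounting", "Legal"}, members))
```

Removing an entitlement works in the opposite direction in this model: more users qualify to join, and no existing member falls out of compliance.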

Validation Frequency

This setting determines how often user credentials are validated. This is critical when users share data with project members outside of Immuta, because the users sharing the data need a way to verify that those members' permissions are still valid. Validation Frequency provides that means of verification.

Project Equalization and Subscription Policies

Once Project Equalization is enabled, the project Subscription Policy builder locks and can only be adjusted by manually editing the Equalized Entitlements. Then, the Subscription Policy will combine with the entitlement settings, depending on the policy type.

Combinations by Policy Type

The way entitlements and approvals combine differs depending on the policy type; for clarity, the table below illustrates various scenarios for each type. Every row demonstrates how a specific project Subscription Policy changes after Project Equalization is enabled (when an equalized entitlement is set and when no entitlement is set) and how the policy reverts if Project Equalization is subsequently disabled.

Original Policy: Anyone

  • Equalized Policy (Example Entitlement: member of group Accounting): Allow user to subscribe when user is a member of group Accounting

  • Equalized Policy (No Entitlement): Individual Users You Select

  • Policy After Disabling Equalization: Individual Users You Select

Original Policy: Allow users to subscribe when approved by anyone with permission Owner (of this project)

  • Equalized Policy (Example Entitlement: member of group Accounting): Allow users to subscribe when they satisfy all of the following: is a member of group Accounting and is approved by anyone with permission Owner (of this Project)

  • Equalized Policy (No Entitlement): Allow users to subscribe when approved by anyone with permission Owner (of this project)

  • Policy After Disabling Equalization: Allow users to subscribe when approved by anyone with permission Owner (of this project)

Original Policy: Allow users to subscribe to the project when user is a member of group Legal

  • Equalized Policy (Example Entitlement: member of group Accounting): Allow users to subscribe to the project when user is a member of group Accounting

  • Equalized Policy (No Entitlement): Individual Users You Select

  • Policy After Disabling Equalization: Individual Users You Select

Original Policy: Individual Users You Select

  • Equalized Policy (Example Entitlement: member of group Accounting): Allow users to subscribe to the project when user is a member of group Accounting

  • Equalized Policy (No Entitlement): Individual Users You Select

  • Policy After Disabling Equalization: Individual Users You Select
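The pattern in these scenarios can be summarized in a short sketch. The policy type labels ("anyone", "approval", "entitlement", "individual") and both function names are hypothetical shorthand for the four original policies; the code only restates the combinations listed above and is not Immuta's implementation.

```python
INDIVIDUAL = "Individual Users You Select"
APPROVAL = "approved by anyone with permission Owner (of this project)"


def equalized_policy(original_type, entitlement=None):
    """Describe the Subscription Policy while Project Equalization is enabled."""
    if original_type == "approval":
        # Approval policies keep the approval step and add the entitlement, if set.
        if entitlement:
            return f"{entitlement} and {APPROVAL}"
        return APPROVAL
    # Every other policy type is replaced by the entitlement, or falls back to
    # individually selected users when no entitlement is set.
    return entitlement if entitlement else INDIVIDUAL


def policy_after_disabling(original_type):
    """Describe the policy once Project Equalization is disabled again."""
    return APPROVAL if original_type == "approval" else INDIVIDUAL


for kind in ["anyone", "approval", "entitlement", "individual"]:
    print(kind)
    print("  with entitlement:", equalized_policy(kind, "member of group Accounting"))
    print("  no entitlement:  ", equalized_policy(kind))
    print("  after disabling: ", policy_after_disabling(kind))
```

Note that in these scenarios only the approval-based policy survives a round trip through Project Equalization unchanged; the other three types revert to Individual Users You Select.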

Example

For example, consider the Subscription Policy of the following sample project, Fraud Prevention, before Project Equalization is enabled:

Fraud Prevention

Subscription Policy: Allow users to subscribe when approved by anyone with permission Owner (of this project).

After enabling Project Equalization, the following Equalized Entitlement is recommended by Immuta: User is a member of group Accounting.

[Image: Equalized Entitlements]

In this particular example, the Equalized Subscription Policy contains the Equalized Entitlement and the approval of the original policy, so users must satisfy both conditions to subscribe:

  • the user must be a member of the group Accounting and
  • the user must be approved by anyone with permission Owner (of this project).

[Image: Combined Subscription and Entitlements]

Masked Joins

This feature allows masked columns to be joined within the context of a project. However, joins are not supported and will be blocked on columns masked by rounding, by making null, with a constant, or with a regex, and on columns that have conditional masking policies applied to them.

[Image: Masked Joins]

Note: Masked columns cannot be joined across data sources that are not linked by a project.

For instructions on enabling Masked Joins, navigate to Create a Project.