You are viewing documentation for Immuta version 2022.1.

Immuta Introduction

Audience: All Immuta users

Content Summary: This chapter introduces all users to Immuta, broadly describing Immuta's benefits, concepts, user personas, and integrations.

Immuta Overview

The Immuta platform solves two of the largest issues facing data-driven organizations: access and control. In large organizations, it can be difficult, if not impossible, for data scientists to access all the data they need. Once they do get access, it’s often difficult to make sure they use the data in ways that are compliant with regulations.

The Immuta platform solves both problems by providing a consistent point of access for all data analysis and by dynamically protecting your data with complex policies, which are enforced based on the user accessing the data and the logic of the policy. The result is efficient digital data exchanges that comply with your organization's regulations, with complete visibility of policy enforcement. Benefits include:

  • Scalability and Evolvability: A scalable and evolvable data management system allows you to make changes that impact thousands of tables at once, accurately. It also allows you to evolve your policies over time with minor changes (or no changes at all) through policy logic.
  • Understandability: Immuta can present policies in a natural language form that is easily understood and provide an audit history of changes to create a trust-and-verify environment. This allows you to prove to business leaders concerned with compliance and risk that policy is being implemented correctly, and it helps your business meet audit obligations to external parties or customers.
  • Stability and Repeatability: Immuta was built with the “as-code” movement in mind, allowing you to treat Immuta as ephemeral and represent state in source control. You can merge data policy management into your existing engineering paradigms and toolchains, allowing full automation of every component of Immuta. Additionally, time-to-data is reduced across the organization because policy management is stable and time can be spent on other complex initiatives.
  • Distributed Stewardship: Immuta enables fine-grained data ownership and controls over organizational domains, allowing a data mesh environment for sharing data that embraces the ubiquity of your organization. You can enable different parts of your organization to manage their data policies in a self-serve manner without involving you in every step, and you can make data available across the organization without the need to centralize both the data and authority over the data. This frees your organization to share more data more quickly.
  • Consistency: With inconsistency comes complexity, both for your team and the downstream analysts trying to read data. That complexity from inconsistency removes all value of separating policy from compute. Immuta provides complete consistency so that you can build a policy once, in a single location, and have it enforced scalably and consistently across all your data warehouses.
  • Availability: Because Immuta reduces friction between compliance and data access, making these highly granular access control decisions available can increase data access by over 50% in some cases.
  • Performance: Performance is tied to how Immuta implements policy enforcement. Rather than requiring a copy of data to be created, Immuta enforces policies live.
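The idea of enforcing policies live, based on the user accessing the data, can be illustrated with a minimal Python sketch. This is not Immuta's implementation or API: the `User` class, the `clearance:pii` attribute, the `ssn` column, and the `read_rows` function are all hypothetical names chosen for illustration. The sketch only shows the general pattern of applying a masking rule at read time instead of creating a masked copy of the data.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """Hypothetical user record: a name plus a set of attributes."""
    name: str
    attributes: frozenset = field(default_factory=frozenset)


# Hypothetical policy: mask the "ssn" column unless the user holds this attribute.
MASKED_COLUMN = "ssn"
REQUIRED_ATTRIBUTE = "clearance:pii"


def mask(value: str) -> str:
    """Replace a value with a truncated SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def read_rows(rows, user):
    """Yield rows with the policy applied at read time.

    The stored rows are never modified and no masked copy is persisted;
    each caller gets a view that depends on their own attributes.
    """
    for row in rows:
        visible = dict(row)
        if REQUIRED_ATTRIBUTE not in user.attributes:
            visible[MASKED_COLUMN] = mask(row[MASKED_COLUMN])
        yield visible


rows = [{"name": "Ada", "ssn": "123-45-6789"}]
analyst = User("analyst")
officer = User("officer", frozenset({"clearance:pii"}))

analyst_view = list(read_rows(rows, analyst))
officer_view = list(read_rows(rows, officer))
```

In this sketch, `officer_view` contains the original value while `analyst_view` contains a masked one, and `rows` itself is left unchanged in both cases.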

Immuta Concepts

| Immuta Element | Description |
| --- | --- |
| Data Sources | A data source is how users virtually expose data (that lives in a remote data storage technology) across their enterprise to other users. When you expose a data source you are not copying the data; you are using metadata to tell Immuta how to expose it. Once exposed and subscribed to, the data will be accessed in a consistent manner across analytics and visualization tools, allowing reproducibility and sharing. For more information and tutorials about data sources, see Chapter 4. |
| Policies | Policies are fine-grained security controls applied to data sources by Data Owners or Data Governors, who determine the logic behind what is hidden from whom. Immuta offers two policy types: Subscription Policies, which determine who can access a data source, and Data Policies, which determine what data the user sees once they get access to a data source. Through these policies, data is hidden, masked, redacted, and anonymized in the control plane based on the attributes of the users accessing the data and the purpose under which they are acting. For more information and tutorials about policies, see Global Policies in Immuta or the Local Policy Overview. |
| Projects | Projects allow users to logically group work by linking data sources and can be created to efficiently organize work or to provide special access to data to specific users. The same security restrictions regarding data sources are applied to projects; project members still need to be subscribed to data sources in order to access data, and only users with appropriate attributes and credentials will be able to see the data if it contains any row-level or masking security. However, Project Owners can enable Project Equalization, which improves collaboration by ensuring that the data in the project looks identical to all members, regardless of their level of access to data. When enabled, this feature automatically equalizes all permissions so that no project member has more access to data than the member with the least access. For more detailed discussion and tutorials about projects, see Chapter 5. |
| Audit Logs and Immuta Reports | All activity in Immuta is audited, and Data Owners and users with the AUDIT permission can access audit logs that detail who subscribes to each data source, why they subscribe, when they access data, and which files they access. These logs can be used for a number of purposes, including insider threat surveillance and data access monitoring for billing purposes. Audit logs can also be shipped to your enterprise auditing capability, if desired. Similarly, Governors can build Immuta Reports to analyze how data is being used and accessed across Immuta using the Immuta Report Builder. Reports can be based on users, groups, projects, data sources, tags, purposes, policies, and connections within Immuta. For more information and tutorials about audit logs and Immuta Reports, see the Viewing Audit Logs tutorial and the Immuta Reports guide, respectively. |
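Project Equalization, described above as ensuring that no project member has more access than the member with the least access, can be pictured as follows. This is a rough sketch under a stated assumption: each member's access is modeled as a set of entitlements (attributes or groups), and "least access" is modeled as the intersection of those sets. The source does not specify how Immuta computes this, and the `equalize` function and the entitlement strings are hypothetical.

```python
def equalize(member_entitlements):
    """Return the entitlements shared by every project member.

    member_entitlements maps a member name to the set of attributes or
    groups that member holds. Assumption: the equalized access level is
    the intersection of all members' entitlements, so no member keeps
    an entitlement that another member lacks.
    """
    if not member_entitlements:
        return frozenset()
    entitlement_sets = [set(e) for e in member_entitlements.values()]
    return frozenset(set.intersection(*entitlement_sets))


members = {
    "alice": {"dept:finance", "region:us", "clearance:pii"},
    "bob": {"dept:finance", "region:us"},
    "carol": {"dept:finance"},
}

equalized = equalize(members)  # frozenset({"dept:finance"})
```

Inside the equalized project, every member would be treated as holding only `dept:finance`, so the data looks identical to all three of them.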

User Personas

  • Application Admins: Application Admins manage the configuration of Immuta for their organization. These users can configure Immuta to use external identity managers and catalogs, enable or disable data handlers, adjust email and cache settings, generate system API keys, and manage various other advanced settings.

  • Data Owners: In order for data to be available in the Immuta platform, a Data Owner (the individual or team responsible for the data) needs to connect their data to Immuta. Once data is connected to Immuta, that data is called a data source. In the process of creating a data source, Data Owners are able to set policies on their data source that restrict which users can access it, which rows within the data a user can access, and which columns within the data source are visible or masked. Data Owners can also decide whether to make their data source public, which makes it available for discovery to all users in the Immuta Web UI, or private, which means only the Data Owner and their assigned subscribers know it exists.

  • Data Users: Data Users consume the data that’s been made available through Immuta. Data Users can browse the Immuta Web UI seeking access to data and easily connect their third-party data science tools to Immuta.

  • Project Owners: These users can create their own project to restrict how their data will be utilized using purpose-based restrictions or to efficiently organize their data sources.

  • Governors: Governors set Global Policies within Immuta, meaning they can restrict the ways that data is used within Immuta across multiple projects and data sources. Governors can also set purpose-based usage restrictions on projects, which can help limit the ways that data is used within Immuta. By default, Governors can subscribe to data sources; however, this setting can be disabled on the App Settings page. Additionally, users can be a Governor and Admin simultaneously by default, but this setting can also be changed on the App Settings page to render the Governor and Admin roles mutually exclusive.

  • Project Managers: These users inspect, manage, approve, and deny various project changes, including purpose requests and project data sources.

  • User Admins: Another type of System Administrator is the User Admin, who is able to manage the permissions, attributes, and groups that attach to each user. Permissions are only managed locally within Immuta, but groups and attributes can be managed locally or derived from user management frameworks such as LDAP or Active Directory that are external to Immuta. By default, Admins can subscribe to data sources; however, this setting can be disabled on the App Settings page to remove the Admin's ability to create or subscribe to data sources. Additionally, users can be an Admin and Governor simultaneously by default, but this setting can also be changed on the App Settings page to render the Admin and Governor roles mutually exclusive.

| User Persona | Immuta Permission |
| --- | --- |
| Application Admin | APPLICATION_ADMIN |
| Data Owner | CREATE_DATA_SOURCE, CREATE_DATA_SOURCE_IN_PROJECT, CREATE_PROJECT |
| Data User | |
| Data Governor | GOVERNANCE |
| Project Manager | PROJECT_MANAGEMENT |
| User Admin | USER_ADMIN |

Permissions

Permissions are a system-level mechanism that controls what actions a user is allowed to take. They apply to both API and UI actions. Permissions can be added to any user by a System Administrator (any user with the USER_ADMIN permission), but the permissions themselves are managed by Immuta and cannot be added or removed in the Immuta UI; however, custom permissions can be created on the App Settings page.

  • APPLICATION_ADMIN: Gives the user access to administrative actions for the configuration of Immuta. These actions include:
    • Adding external IAMs.
    • Adding ODBC drivers.
    • Adding external catalogs.
    • Configuring email settings.
  • AUDIT: Gives the user access to the audit logs.
  • CREATE_DATA_SOURCE: Gives the user the ability to create data sources.
  • CREATE_DATA_SOURCE_IN_PROJECT: Gives the user the ability to create data sources within a project.
  • CREATE_S3_DATASOURCE_WITH_INSTANCE_ROLE: When creating an S3 data source, this allows the user to have the handler assume an AWS role when ingesting data.
  • CREATE_FILTER: Gives the user the ability to create and save a search filter.
  • CREATE_PROJECT: Gives the user the ability to create projects.
  • FETCH_POLICY_INFO: Gives the user access to an endpoint that returns visibilities, masking information, and filters for a given data source.
  • GOVERNANCE: Gives the user the ability to set Global Policies, create purpose-based usage restrictions on projects, and manage tags.
  • IMPERSONATE_USER: Allows the user to impersonate other Immuta users by entering their own SQL credentials to authenticate with the Immuta Query Engine and then specifying which user they would like to impersonate.
  • IMPERSONATE_HDFS_USER: When creating an HDFS data source, this allows the user to enter any HDFS username to use when accessing data.
  • PROJECT_MANAGEMENT: Allows users to create purposes, approve and deny purpose requests, and manage project data sources.
  • USER_ADMIN: Gives the user access to administrative actions for managing users in Immuta. These actions include:
    • Creating and managing users and groups.
    • Adding and removing user permissions.
    • Creating and managing user attributes.
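As a rough illustration of how permissions gate actions, the sketch below models the persona-to-permission mapping from this chapter and a simple check. The permission names come from the list above; everything else (the action names, `authorize`, `grant`, and the `exclusive_governor_admin` flag) is hypothetical and is not Immuta's API. The flag mirrors the App Settings option that makes the Governor and Admin roles mutually exclusive, assuming "Admin" there refers to USER_ADMIN.

```python
# Persona-to-permission mapping as listed in this chapter.
PERSONA_PERMISSIONS = {
    "Application Admin": {"APPLICATION_ADMIN"},
    "Data Owner": {
        "CREATE_DATA_SOURCE",
        "CREATE_DATA_SOURCE_IN_PROJECT",
        "CREATE_PROJECT",
    },
    "Data User": set(),  # no permission is listed for this persona
    "Data Governor": {"GOVERNANCE"},
    "Project Manager": {"PROJECT_MANAGEMENT"},
    "User Admin": {"USER_ADMIN"},
}

# Hypothetical action names mapped to the permission each one needs.
REQUIRED_PERMISSION = {
    "create_data_source": "CREATE_DATA_SOURCE",
    "create_project": "CREATE_PROJECT",
    "set_global_policy": "GOVERNANCE",
    "manage_users": "USER_ADMIN",
}


class PermissionDenied(Exception):
    """Raised when a user lacks the permission an action requires."""


def authorize(user_permissions, action):
    """Raise PermissionDenied unless the user holds the required permission."""
    required = REQUIRED_PERMISSION[action]
    if required not in user_permissions:
        raise PermissionDenied(f"{action} requires {required}")


def grant(user_permissions, new_permission, exclusive_governor_admin=False):
    """Return a new permission set with new_permission added.

    When exclusive_governor_admin is True, refuse any result that would
    hold GOVERNANCE and USER_ADMIN at the same time.
    """
    updated = set(user_permissions) | {new_permission}
    if exclusive_governor_admin and {"GOVERNANCE", "USER_ADMIN"} <= updated:
        raise ValueError("GOVERNANCE and USER_ADMIN are configured as mutually exclusive")
    return updated


governor = PERSONA_PERMISSIONS["Data Governor"]
authorize(governor, "set_global_policy")   # allowed, returns None
both_roles = grant(governor, "USER_ADMIN")  # allowed with the default setting
```

With the default setting a user can hold both roles, matching the behavior described for Governors and User Admins; passing `exclusive_governor_admin=True` makes the same `grant` call raise `ValueError`.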

Deployment Options

  • SaaS: This deployment option provides data access control through Immuta's native integrations, with automatic software updates and no infrastructure or maintenance costs.

  • Self-Managed: Immuta supports self-managed deployments for users who store their data on-premises or in private clouds, such as a VPC. Users can connect to on-premises data sources and cloud data platforms that run on Amazon Web Services, Microsoft Azure, and Google Cloud Platform.