The data sources page allows Immuta users to view, subscribe to, and create data sources in Immuta. On the main data source page is a list of data sources. Users can navigate between the All Data Sources tab and the My Data Sources tab to filter this list. Additionally, the Search bar can be used to filter search results by data source name, tag, project, connection strings, or columns.
To navigate to a specific data source, click on it from this list, and you will be taken to the data source overview page.
In addition to the data source's health, this page provides detailed information about the data source and is organized by tabs across the top of the page: Overview, Members, Policies, Data Dictionary, Discussions, Contacts, and Relationships. The visibility and appearance of the tabs will vary slightly depending on the type of user accessing the data source.
This section includes detailed information regarding Data Source Health and Data Source Health Checks. The health status of a data source is visible in the top right corner of the data source details page.
If you click the health status text, a dropdown menu displays the status of specific data source checks.
Health Check: When an Immuta data source is created, a background job is submitted to compute the row count and high cardinality column for the data source. This job uses the connection information provided at data source creation time. A data source initially has a health status of “Healthy” because the initial health check is a simple SQL query against the source to confirm that the source can be queried at all. After the background job for the row count and high cardinality column computation completes, the health status is updated. If one or both of those computations fail, the health status changes to “Unhealthy.”
Fingerprint: Captures summary statistics of a data source when a data source is created, when a policy is applied or changed, or when a user manually updates the data source fingerprint.
View: Depending on the integration, this records if a view has been created to represent the data source in an integration, when it was created, and gives a button to re-create the view if policies have been changed.
Row Count: Calculates the number of rows in the data source.
High Cardinality: Calculates the high cardinality columns, which contain unique values such as identification numbers, email addresses, or usernames. A high cardinality column is required to generate consistent random subsets of data for use in certain minimization techniques.
Global Policies Applied: Verifies that relevant Global Policies are successfully applied.
Schema Detection: Detects when a new table has been added in a remote database and automatically creates a new data source. Correspondingly, if a remote table is removed, that data source will be disabled in the console. Schema detection is set to run every night.
Column Detection: Detects when a column has been added or removed in a remote database and automatically updates the data source in Immuta. This detection is set to run every night, but users can manually trigger the job here.
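The status transition described under Health Check above can be sketched in a few lines of Python. This is only an illustration: the `JobResult` and `HealthChecks` names are hypothetical and do not reflect Immuta's internal schema, and treating a failed initial query as "Unhealthy" is an assumption, since the text only describes the case where the initial query succeeds.

```python
from dataclasses import dataclass
from enum import Enum


class JobResult(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class HealthChecks:
    # Hypothetical container; field names are illustrative only.
    initial_query_ok: bool
    row_count: JobResult = JobResult.PENDING
    high_cardinality: JobResult = JobResult.PENDING


def health_status(checks: HealthChecks) -> str:
    """Derive the displayed status from the individual checks."""
    if not checks.initial_query_ok:
        # Assumption: a source that cannot be queried is reported as unhealthy.
        return "Unhealthy"
    if JobResult.FAILED in (checks.row_count, checks.high_cardinality):
        # One or both background computations failed.
        return "Unhealthy"
    # Pending or succeeded computations leave the initial status in place.
    return "Healthy"
```

Under these assumptions, a newly created data source whose initial query succeeded reports "Healthy" while the background computations are still pending, and switches to "Unhealthy" only once one of them fails.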
This tab includes detailed information about the data source, including its Description, Technology, Table Name, Remote Database, Remote Table, the Parent Server, and the Data Source ID.
From here, data owners can also manage data source tags and edit or create a data source description.
This tab contains information about the users associated with the data source, their username, when their access expires, what their role is, how they are subscribed to the data source, and an Actions button that details the users' subscription history, including the reason users need access to the data and how they plan to use it.
This tab is visible to everyone, but Data Owners and Governors can manage users from this page.
Members can be filtered by Role or Subscription using the Filters button.
This tab lists the policies associated with the data source and includes three components:
Subscribers: Lists who may access the data source. If a Subscription Policy has already been set by a Global Policy, a notification and a Disable button appear at the bottom of this section. Data Owners can click the Disable button to make changes to the Subscription Policy.
Data Policies: Lists policies that enforce privacy controls on the data source. Data Owners can use this section to manage policies.
Activity Panel: Records all changes made to policies by Data Owners or Governors, including when the data source was created, the name and type of the policy, when the policy was applied or changed, and if the policy is in conflict on the data source. Global policy changes are identified by the Governance icon; all other updates are labeled by the Data Sources icon.
This tab is visible to everyone, but Data Owners and Governors can manage policies from this page.
The Data Dictionary is a table that details information about each column in a data source. The information within the Data Dictionary is generated automatically when the data source is created if the remote data platform supports SQL. Otherwise, Data Owners or Experts can manually create Data Dictionaries. The Data Dictionary tab includes three sections:
Name: The name of the column in the table.
Type: The type of value, which may be text, integer, decimal, or timestamp.
Actions: Users may use the buttons in this column to edit, comment, or tag items in the Data Dictionary.
Deprecation notice
Support for this feature has been deprecated.
Users are able to comment on or ask questions about the Data Dictionary columns and definitions, public queries, and the data source in general. Resolved comments and questions are available for review to keep a complete history of all the knowledge sharing that has occurred on a data source.
Contact information for Data Owners is provided for each data source, which allows users to ask questions about accessibility and attributes required for viewing the data.
This tab lists all projects, derived data sources, or parent data sources associated with the data source and includes the reason the data source was added to a project, who added the data source to the project or created it, and when the data source was added to the project or created.
When users submit an Unmask request in the UI, a Tasks tab appears beside the Relationships tab for the requesting user and the user receiving the request. This tab contains information about the request and allows users to view and manage the tasks listed.
The Immuta people page is visible only to user administrators; the following actions can be completed on the Immuta people page:
Create, manage, and delete users.
Add or delete permissions from users and groups.
Add or delete attributes from users and groups.
Create, manage, and delete groups.
On this tab, administrators can add users, filter the list of users, or navigate to users' profiles by clicking on their name.
After clicking on an individual user from this list, the user's email, position, and last login and update appear. From here, admins can manage the user's permissions, attributes, and groups.
Similar to the Users tab, the Groups tab includes a list of groups. After clicking on a specific group, administrators can view the group details, add and remove group members, and manage attributes for the group.
Immuta does not require users to learn a new API or language to access the data it protects. Instead, Immuta integrates with existing tools and ongoing work while remaining invisible to downstream consumers. This page outlines those integrations.
The Snowflake integration differs based on your Snowflake Edition:
Snowflake Integration Using Snowflake Governance Features: With this integration, policies administered in Immuta are pushed down into Snowflake as Snowflake governance features (row access policies and masking policies). This integration requires Snowflake Enterprise Edition or higher.
Snowflake Integration Without Snowflake Governance Features: With this integration, policies administered by Immuta are pushed down into Snowflake as views with a 1-to-1 relationship to the original table and all policy logic is contained in that view.
Click a link below for details about each question:
This integration allows you to manage multiple Databricks workspaces through Unity Catalog while protecting your data with Immuta policies. Instead of manually creating UDFs or granting access to each table in Databricks, you can author your policies in Immuta and have Immuta manage and enforce Unity Catalog access-control policies on your data in Databricks clusters or SQL warehouse.
Immuta’s Databricks Spark integration with Unity Catalog support uses a custom Databricks plugin to enforce Immuta policies on a Databricks cluster with Unity Catalog enabled. This integration allows you to add your tables to the Unity Catalog metastore so that you can use the metastore from any workspace while protecting your data with Immuta policies.
This integration enforces policies on Databricks tables registered as data sources in Immuta, allowing users to query policy-enforced data on Databricks clusters (including job clusters). Immuta policies are applied to the plan that Spark builds for users' queries, all executed directly against Databricks tables.
Deprecation notice
Support for this integration has been deprecated. Use the Starburst (Trino) v2.0 integration instead.
The Starburst (Trino) integration enables Immuta to apply policies directly in Starburst and Trino clusters without going through a proxy. This means users can use their existing Starburst and Trino tooling (querying, reporting, etc.) and have per-user policies dynamically applied at query time.
The Starburst (Trino) integration v2.0 allows you to access policy-protected data directly in your Starburst (Trino) catalogs without rewriting queries or changing your workflows. Instead of generating policy-enforced views and adding them to an Immuta catalog that users have to query (like in the legacy Starburst (Trino) integration), Immuta policies are translated into Starburst (Trino) rules and permissions and applied directly to tables within users’ existing catalogs.
With the Redshift integration, Immuta applies policies directly in Redshift. This allows data analysts to query their data directly in Redshift instead of going through a proxy.
The Azure Synapse Analytics integration allows Immuta to apply policies directly in Azure Synapse Analytics dedicated SQL pools without needing users to go through a proxy. Instead, users can work within their existing Synapse Studio and have per-user policies dynamically applied at query time.
Private preview
This integration is available to select accounts. Reach out to your Immuta representative for details.
The Amazon S3 integration allows users to apply subscription policies to data in S3 to restrict what prefixes, buckets, or objects users can access. To enforce access controls on this data, Immuta creates S3 grants that are administered by S3 Access Grants, an AWS feature that defines access permissions to data in S3.
Private preview
This integration is available to select accounts. Reach out to your Immuta representative for details.
In this integration, Immuta generates policy-enforced views in your configured Google BigQuery dataset for tables registered as Immuta data sources.
Users who want to use tagging capabilities outside of Immuta and pull tags from external table schemas can connect Collibra or Alation as an external catalog. Once they have been connected, Immuta will ingest a data dictionary from the catalog that will apply data source and column tags directly onto data sources. These tags can then be used to write and drive policies.
If users have another catalog, or have customized their Collibra or Alation integrations, they can connect through the REST Catalog using the Immuta API.
Users can also connect a Snowflake account to allow Immuta to ingest Snowflake tags onto Snowflake data sources.
External identity managers configured in Immuta allow users to authenticate using an existing identity management system and can optionally be used to synchronize user groups and attributes into Immuta.
The table below outlines the features supported by each of Immuta's integrations.
The table below outlines the audit support by each of Immuta's integrations and what information is included in the audit logs.
Legend:
Limited support: There is limited support for audit for this integration.
Certain policies are unsupported or supported with caveats, depending on the integration:
*Supported with Caveats:
On Databricks data sources, joins will not be allowed on data protected with replace with NULL/constant policies.
On Trino data sources, the Immuta functions @iam and @interpolatedComparison for WHERE clause policies can block the creation of views.
For details about each of these policies, see the Policies in Immuta page.
The Immuta UI allows users to share, access, and analyze data from one secure location efficiently and easily. This section of documentation introduces all Immuta users to pages and basic features found in the Immuta console.
Data:
: Create, manage, and subscribe to data sources.
: Combine data sources, work under specified purposes, and collaborate with other users.
: Manage user roles, groups, and attributes.
: Manage global policies and view all policies and the data sources they apply to.
: Configure purposes, run governance reports, and view notifications.
: Analyze how data is being used across your organization.
: Write, modify, and execute queries against data sources you're subscribed to in the Immuta UI.
: Configure Immuta to meet your organization's needs.
: View access requests and receive activity updates.
: Manage username and password, access SQL credentials, and generate API keys.
Policy decision data is transmitted to ensure end users querying data are limited to the appropriate access as defined by the policies in Immuta.
Spark Plugin
In the Databricks integration, the user, data source information, and query are sent to Immuta through the Spark plugin to determine what policies need to be applied while the query is being processed. Data that travels from Immuta to the Databricks cluster could include
user attributes.
what columns to mask.
the entire predicate itself (for row-level policies).
A user runs a query against data in their environment.
The query is sent to the Immuta Web Service.
The Web Service queries the Metadata Database to obtain the policy definition, which includes data source metadata (tags, column names, etc.) and user entitlements (groups and attributes).
The policy information is transmitted to the remote data system for native policy enforcement.
Query results are displayed based on what policy definition was applied.
Sample data is processed and aggregated or reduced during Immuta's and specific . Note: Data Owners can see sample data when editing a data source. However, this action requires the database password, and the small sample of data visible is only displayed in the UI and is not stored in Immuta.
When enabled, statistical queries made during data source health checks are distilled into summary statistics, called fingerprints. The sample data processed for fingerprinting allows Immuta to track data source changes.
During this process, statistical query results and data samples (which may contain PII) are temporarily held in memory by the Fingerprint Service.
The fingerprinting process checks for new tables through schema monitoring (when enabled) and captures summary statistics of changes to data sources, including when policies were applied, external views were created, or sensitive data elements were added.
Immuta does not sample data for row redaction policies; Immuta only pulls samples of data to determine if a column is a candidate for randomized response and aggregates of user-defined cohorts for k-anonymization. Both datasets only exist in memory during the computation.
Sample data is processed when k-anonymization or randomized response policies are applied to data sources.
Sample data exists temporarily in memory in the Fingerprint Service during the computation.
k-Anonymization Policies: At the time of its application, the columns of a k-anonymization policy are queried under a separate fingerprinting process that generates rules enforcing k-anonymity. The results of this query, which may contain PII, are temporarily held in memory by the Fingerprint Service. The final rules are stored in the Metadata Database as the policy definition for enforcement.
Randomized Response Policies: If the list of substitution values for a categorical column is not part of the policy specification (e.g., when specified via the API), a list is obtained via query and merged into the policy definition in the Metadata Database.
Raw data is processed for masking, producing either a distinct set of values or aggregated groups of values.
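To make the "aggregated groups of values" idea concrete, the following Python sketch checks the standard k-anonymity property: every combination of values in the selected columns must occur at least k times. It is not Immuta's rule-generation algorithm, which the text above does not describe; the function names and the simple suppression strategy are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple

Row = Tuple[Optional[Hashable], ...]


def is_k_anonymous(rows: Sequence[Sequence[Hashable]], k: int) -> bool:
    """True if every distinct combination of values appears at least k times."""
    counts = Counter(tuple(row) for row in rows)
    return all(n >= k for n in counts.values())


def suppress_small_groups(rows: Sequence[Sequence[Hashable]], k: int) -> List[Row]:
    """Replace rows whose combination appears fewer than k times with NULLs.

    Illustrative strategy only: the suppressed rows themselves may still
    form a group smaller than k.
    """
    as_tuples = [tuple(row) for row in rows]
    counts = Counter(as_tuples)
    return [
        row if counts[row] >= k else tuple(None for _ in row)
        for row in as_tuples
    ]
```

For example, with rows `("30-39", "021")`, `("30-39", "021")`, and `("40-49", "100")` and k = 2, the first two rows are kept and the third is replaced with `(None, None)` because its combination appears only once.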
Immuta captures metadata and stores it in an internal PostgreSQL database. Customers can encrypt the volumes backing the database using an external Key Management Service to ensure that data is encrypted at rest.
To encrypt data in transit, Immuta uses the TLS protocol, which is configured by the customer.
Immuta encrypts values with data encryption keys, either those that are system-generated or managed using an external key management service (KMS). Immuta recommends a KMS to encrypt or decrypt data keys and supports the AWS Key Management Service; however, if no KMS is configured, Immuta will generate a data encryption key on a user-defined rollover schedule, using the most recent data key to encrypt new values while preserving old data keys to decrypt old values.
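The rollover behavior described above (newest key for new values, older keys kept for decrypting older values) can be sketched as a small key ring. This is a hypothetical model, not Immuta's implementation: the `DataKeyRing` class, its method names, and the integer key IDs are assumptions, and the sketch covers only key selection, not the encryption itself.

```python
import secrets
from typing import Dict, Tuple


class DataKeyRing:
    """Toy key ring: newest key for new values, older keys kept for reads."""

    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}
        self._current_id = 0
        self.rollover()  # start with one system-generated key

    def rollover(self) -> int:
        """Generate a new 256-bit data key and make it the current one."""
        self._current_id += 1
        self._keys[self._current_id] = secrets.token_bytes(32)
        return self._current_id

    def key_for_new_value(self) -> Tuple[int, bytes]:
        """Return the most recent key and its ID; the ID is stored with the value."""
        return self._current_id, self._keys[self._current_id]

    def key_for_stored_value(self, key_id: int) -> bytes:
        """Look up the key that was current when the value was encrypted."""
        return self._keys[key_id]
```

Storing the key ID alongside each encrypted value is what lets values written before a rollover remain readable afterward.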
Immuta employs three families of functions in its masking policies:
One-way Hashing: One-way (irreversible) hashing is performed via a salted SHA256 hash. A consistent salt is used for values throughout the data source, so users can count or track the specific values without revealing the true value. Since hashed values are different across data sources, users are unable to join on hashed values. Note: joining on masked values can be enabled in Immuta Projects.
Reversible Masking: For reversible masking, values are encrypted using AES-256 CBC encryption. Encryption is performed using a cell-specific initialization vector. The resulting values can be unmasked by an authorized user. Note that this is dynamic encryption of individual fields as results are streamed to the querying system; Immuta is not modifying records in the data store.
Reversible Format Preserving Masking: Format preserving masking maintains the format of the data while masking the value, and is achieved by initializing and applying the NIST standard method FF1 at the column level. The resulting values can be unmasked by an authorized user.
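The one-way hashing behavior can be illustrated with Python's standard `hashlib` module. This is a minimal sketch under stated assumptions: the text above specifies only a salted SHA256 hash with a consistent salt per data source, so the salt length, the order in which the salt and value are concatenated, and the hexadecimal output encoding are illustrative choices, and `new_salt` and `mask_value` are hypothetical names.

```python
import hashlib
import secrets


def new_salt() -> bytes:
    """Generate a random salt; assumed to be created once per data source."""
    return secrets.token_bytes(16)


def mask_value(value: str, salt: bytes) -> str:
    """Return the salted SHA-256 digest of a value as a hex string."""
    # Assumption: the salt is prepended to the UTF-8 encoded value.
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


# Usage: one salt per data source.
# salt_a, salt_b = new_salt(), new_salt()
# mask_value("alice@example.com", salt_a) == mask_value("alice@example.com", salt_a)
# mask_value("alice@example.com", salt_a) != mask_value("alice@example.com", salt_b)
```

Because the salt is the same within a data source, equal inputs produce equal digests, which allows counting and tracking. Because the salt differs between data sources, the same input produces different digests in each, which is why joins on hashed values do not line up.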
Immuta communicates with remote databases over a TCP connection.
Audience: Data Owners, Data Users, and Data Governors
Content Summary: Projects allow users to collaborate in their data analysis by combining data sources and providing special access to data for project members. Projects are created, managed, and joined from the Projects page.
This page highlights the major features of the Projects page. For conceptual details or specific tutorials, click the links below or navigate to .
This page lists all the public projects available to be joined by others in the All Projects tab, and all projects users own or belong to are listed in the My Projects tab. Additionally, users with the CREATE_PROJECT permission can create a new project from this page.
To view details about a specific project, users click the project name.
After navigating to a specific project from the Projects page, the following information about the project is visible to users on the Overview tab:
Project Details: Information about the project appears in the sidebar on the left of the Overview tab. Details include when the project was created, the purposes associated with the project, a description of the project, the project ID, and credentials.
Documentation: If Project Owners choose, they may add documentation about their project, which will appear in this section to viewers. If no additional documentation about the project is added, only the project name will appear here.
Data Sources: The data sources associated with the project are listed here. Users can click an individual data source to view the reason it was added to the project and to navigate to the data source itself. Project Owners can also manage their project data sources in this section.
Tags: Tags associated with the data source are listed here. Project Owners can manage tags from this section.
Activity Panel: All activity associated with the project is listed in the sidebar on the right of the screen. Information recorded here includes who added data sources and tags to the project, members who have been added and removed from the project, and policy updates to the project.
This page includes a list of project members, their contact information and role, how they are subscribed, and when their membership expires. From this page, Project Owners can add and remove members from the project.
Members can be filtered by Role or Subscription using the Filters button.
This tab allows Project Owners to choose who may request access to their project or whether or not their project is visible at all to users who are not project members.
The Project Equalization section enables Project Owners to level all members' access to data so that data appears the same to all project members, regardless of their individual attributes or groups.
The Subscribers section allows Project Owners to make their project open to anyone, to users who request and are granted access, to users with specified groups and attributes, or only to users the Project Owners manually add.
Deprecation notice
Support for this feature has been deprecated.
Project members can view, create, reply to, delete, and resolve discussion threads in this tab.
A list of data sources within the project appears in this tab. Project members can view, comment on, and add data sources to the project here as well. Any project member can add data sources to the project, unless the Allow Masked Joins or Project Equalization features are enabled; in those instances, only Project Owners can add data sources to the project.
Governors manage purposes for data use across Immuta. After creating a purpose, governors can customize acknowledgement statements that users must agree to before accessing a project or data source. Project owners also have the ability to create purposes that will populate on the purposes tab of the governance page.
Governors can build reports to analyze how data is being used and accessed across Immuta using this report builder. Reports can be based on users, groups, projects, data sources, tags, purposes, policies, and connections within Immuta.
For detailed information on how to run reports, see .
This tab contains a list of all activity associated with the governor, data sources, and global and local policies.
This tab contains a list of all tags within the Immuta environment. This includes built-in Immuta tags, tags created by governors, and tags imported from an external catalog. These tags can then be applied to projects, data sources, and the data dictionary by governors, data owners, or data source experts.
Governors can click on the tags listed here to open up a tag details page. This details page has an overview tab with information about the tag's description, origin, and creation. It also includes a data sources tab that lists the data sources the tag has been applied to and information about its application. The tag details page also includes a columns tab with the columns the tag has been applied to and information about its application, like the other tags applied to that column.
For more information on tags, see the .
The Immuta platform solves two of the largest issues facing data-driven organizations: access and control. In large organizations, it can be difficult, if not impossible, for data scientists to access all the data they need. Once they do get access, it’s often difficult to make sure they use the data in ways that are compliant with regulations.
The Immuta platform solves both problems by providing a consistent point of access for all data analysis and by dynamically protecting your data with complex policies, which are enforced based on the user accessing the data and the logic of the policy. The result is efficient digital data exchanges that comply with organizations' regulations, with complete visibility of policy enforcement. Benefits include:
Scalability and Evolvability: A scalable and evolvable data management system allows you to make changes that impact thousands of tables at once, accurately. It also allows you to evolve your policies over time with minor changes (or no changes at all) through policy logic.
Understandability: Immuta can present policies in a natural language form that is easily understood and provide an audit history of change to create a trust and verify environment. This allows you to prove policy is being implemented correctly to business leaders concerned with compliance and risk, and your business can meet audit obligations to external parties or customers.
Stability and Repeatability: Immuta was built with the “as-code” movement in mind, allowing you to treat Immuta as ephemeral and represent state in source control. You can merge data policy management into your existing engineering paradigms and toolchains, allowing full automation of every component of Immuta. Additionally, time-to-data is reduced across the organization because policy management is stable and time can be spent on other complex initiatives.
Distributed Stewardship: Immuta enables fine-grained data ownership and controls over organizational domains, allowing a data mesh environment for sharing data - embracing the ubiquity of your organization. You can enable different parts of your organization to manage their data policies in a self-serve manner without involving you in every step, and you can make data available across the organization without the need to centralize both the data and authority over the data. This frees your organization to share more data more quickly.
Consistency: With inconsistency comes complexity, both for your team and the downstream analysts trying to read data. That complexity from inconsistency removes all value of separating policy from compute. Immuta provides complete consistency so that you can build a policy once, in a single location, and have it enforced scalably and consistently across all your data warehouses.
Availability: Availability of these highly granular decisions at the access control level can increase data access by over 50% in some cases when using Immuta because friction between compliance and data access is reduced.
Performance: Performance is tied to how Immuta implements policy enforcement. Rather than requiring a copy of data to be created, Immuta enforces policies live.
Immuta Element | Description |
---|
Application Admins: Application Admins manage the configuration of Immuta for their organization. These users can configure Immuta to use external identity managers and catalogs, enable or disable data handlers, adjust email and cache settings, generate system API keys, and manage various other advanced settings.
Data Owners: In order for data to be available in the Immuta platform, a Data Owner (the individual or team responsible for the data) needs to connect their data to Immuta. Once data is connected to Immuta, that data is called a data source. In the process of creating a data source, Data Owners are able to set policies on their data source that restrict which users can access it, which rows within the data a user can access, and which columns within the data source are visible or masked. Data Owners can also decide whether to make their data source public, which makes it available for discovery to all users in the Immuta Web UI, or private, which means only the Data Owner and its assigned subscribers know it exists.
Data Users: Data Users consume the data that’s been made available through Immuta. Data Users can browse the Immuta Web UI seeking access to data and easily connect their third-party data science tools to Immuta.
Project Owners: These users can create their own project to restrict how their data will be utilized using purpose-based restrictions or to efficiently organize their data sources.
Governors: Governors set Global Policies within Immuta, meaning they can restrict the ways that data is used within Immuta across multiple projects and data sources. Governors can also set purpose-based usage restrictions on projects, which can help limit the ways that data is used within Immuta. By default, Governors can subscribe to data sources; however, this setting can be disabled on the App Settings page. Additionally, users can be a Governor and Admin simultaneously by default, but this setting can also be changed on the App Settings page to render the Governor and Admin roles mutually exclusive.
Project Managers: These users inspect, manage, approve, and deny various project changes, including purpose requests and project data sources.
User Admins: Another type of System Administrator is the User Admin, who is able to manage the permissions, attributes, and groups that attach to each user. Permissions are only managed locally within Immuta, but groups and attributes can be managed locally or derived from user management frameworks such as LDAP or Active Directory that are external to Immuta. By default, Admins can subscribe to data sources; however, this setting can be disabled on the App Settings page to remove the Admin's ability to create or subscribe to data sources. Additionally, users can be an Admin and Governor simultaneously by default, but this setting can also be changed on the App Settings page to render the Admin and Governor roles mutually exclusive.
APPLICATION_ADMIN: Gives the user access to administrative actions for the configuration of Immuta. These actions include
Adding external IAMs.
Adding ODBC drivers.
Adding external catalogs.
Configuring email settings.
AUDIT: Gives the user access to the audit logs.
CREATE_DATA_SOURCE_IN_PROJECT: Gives the user the ability to create data sources within a project.
CREATE_S3_DATASOURCE_WITH_INSTANCE_ROLE: When creating an S3 data source, this allows the user to have the handler assume an AWS role when ingesting data.
CREATE_FILTER: Gives the user the ability to create and save a search filter.
FETCH_POLICY_INFO: Gives the user access to an endpoint that returns visibilities, masking information, and filters for a given data source.
IMPERSONATE_USER: Allows the user to impersonate other Immuta users by entering their own SQL credentials to authenticate with the Immuta Query Engine and then specifying which user they would like to impersonate.
IMPERSONATE_HDFS_USER: When creating an HDFS data source, this allows the user to enter any HDFS username to use when accessing data.
PROJECT_MANAGEMENT: Allows users to create purposes, approve and deny purpose requests, and manage project data sources.
USER_ADMIN: Gives the user access to administrative actions for managing users in Immuta. These include
Creating and managing users and groups.
Adding and removing user permissions.
Creating and managing user attributes.
SaaS: This deployment option provides data access control through Immuta's native integrations, with automatic software updates and no infrastructure or maintenance costs.
Self-Managed: Immuta supports self-managed deployments for users who store their data on-premises or in private clouds, such as VPC. Users can connect to on-premises data sources and cloud data platforms that run on Amazon Web Services, Microsoft Azure, and Google Cloud Platform.
If you want to disable the metadata collection that requires sampling data, you must
These steps will ensure that Immuta queries no data, under any circumstances. Without this sample data, some Immuta features will be unavailable. Sensitive Data Discovery (SDD) cannot be used to automatically detect sensitive data in your data sources, and the following masking policies will not work:
Masking with format preserving masking
Masking with k-anonymization
Masking using randomized response
To stop Immuta from running fingerprints on all data sources,
Navigate to the App Settings page, and scroll to the Advanced Configuration section.
Enter the following YAML:
Click Save.
To stop Immuta from running data source health checks on all data sources,
Navigate to the App Settings page, and scroll to the Advanced Configuration section.
Enter the following YAML:
Click Save.
For Immuta to enforce policies, it needs to catalog the resources that policies are applied to by performing metadata ingestion. Metadata ingestion is the process in which Immuta gathers details about your tables. However, Immuta does not need access to the data within the tables in order to protect it, with the exception of a few specific and advanced masking policies detailed below.
Immuta collects and stores the following kinds of information in Immuta's Metadata Database for policy enforcement. Further, policy information may be transmitted to data source host systems for enforcement purposes as part of a query or to enable the host system to perform native enforcement.
Identity Management Information: Usernames, group information, and other kinds of personal identifiers may be stored and referenced for the purposes of performing authentication and access control and may be retained in audit logs. When such information is relevant for access determination under policy, it may be retained as part of the policy definition.
Schema Information: Data source metadata such as schema, column data types, and information about the host.
Immuta's Metadata Database can also contain the following forms of metadata for policy enforcement. These forms contain sample data from your tables and can be disabled if you do not want Immuta to have access to the data being protected.
Fingerprints: When enabled, additional statistical queries made during the health check are distilled into summary statistics, called fingerprints. During this process, statistical query results and data samples (which may contain PII) are temporarily held in memory by the Fingerprint Service.
k-Anonymization Policies: When a k-anonymization policy is applied, the columns under the k-anonymization policy are queried within a separate fingerprinting process which generates rules enforcing k-anonymity. The results of this query, which may contain PII, are temporarily held in memory by the Fingerprint Service. The final rules are stored for enforcement.
Randomized Response Policies: If the list of substitution values for a categorical column is not part of the policy specification (e.g., when specified via the API), a list is obtained via query and merged into the policy definition.
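To make the k-anonymity property mentioned above concrete, here is a minimal sketch of what it means for a set of rows to be k-anonymous over a group of columns. This is not Immuta's fingerprinting process or its generated rules; the function name `is_k_anonymous`, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, columns, k):
    """Return True if every combination of values in `columns`
    appears in at least `k` rows (the k-anonymity property)."""
    # Count how many rows share each combination of column values.
    groups = Counter(tuple(row[col] for col in columns) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical sample rows; "zip" and "age_band" play the role of the
# columns placed under a k-anonymization policy.
rows = [
    {"zip": "20001", "age_band": "30-39", "diagnosis": "A"},
    {"zip": "20001", "age_band": "30-39", "diagnosis": "B"},
    {"zip": "20002", "age_band": "40-49", "diagnosis": "C"},
]

print(is_k_anonymous(rows, ["zip", "age_band"], k=2))
```

In this sample the third row is the only one with its combination of `zip` and `age_band`, so the rows are not 2-anonymous over those columns. A policy enforcing k-anonymity would need to suppress or generalize such values before returning them.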
If no metadata collection types have been disabled, data is processed in the following workflow to support data source creation, health checks, policy enforcement, and dictionary features.
A System Administrator configures the integration in Immuta.
A Data Owner registers data sources from their remote data platforms with Immuta. Note: Data Owners can see sample data when editing a data source. However, this action requires the database password, and the small sample of data visible is only displayed in the UI and is not stored in Immuta.
When a data source is created or updated, the Metadata Database pulls in and stores statistics about the data source, including row count and high cardinality calculations.
The data source health check runs daily to ensure existing tables are still valid.
If an external catalog is enabled, the daily health check will pull in data source attributes (e.g., tags and definitions) and store them in the Metadata Database.
Immuta requires certain privileges to perform metadata ingestion. The user connecting a table to Immuta as a data source must have privileges specific to their data platform to perform metadata ingestion.
For example, a user registering a Snowflake table as an Immuta data source must have the REFERENCES privilege to view the structure of the table and allow Immuta access to that information as well. This does not require the user (or Immuta) to have access to view the data itself.
Project Workspaces | Tag Ingestion | User Impersonation | Native Query Audit | Multiple Integrations | |
---|---|---|---|---|---|
Snowflake | Databricks Spark | Databricks Unity Catalog | Starburst (Trino) | Redshift | Azure Synapse Analytics | |
---|---|---|---|---|---|---|
This is available and the information is included in audit logs.
This is not available and the information is not included in audit logs.
User Persona | Immuta Permission |
---|
Permissions are a system-level mechanism that controls what actions a user is allowed to take. These are applied to both the and actions. Permissions can be added to any user by a System Administrator (any user with the USER_ADMIN permission), but the permissions themselves are managed by Immuta and cannot be added or removed in the Immuta UI; however, custom permissions can be created on the .
CREATE_DATA_SOURCE: Gives the user the ability to .
CREATE_PROJECT: Gives the user the ability to .
GOVERNANCE: Gives the user the ability to , create purpose-based usage restrictions on , and .
Tag each data source with the seeded Skip Stats Job tag to stop Immuta from collecting a sample and running table stats on the sample. You can tag data sources as you or .
Audience: All Immuta users
Content Summary: Notifications in the Immuta UI fall into two categories: Access Requests and Activity. This page illustrates these basic Notification features in the Immuta UI.
Request notifications alert Data Owners that users wish to subscribe to their data sources.
Users can view their request notifications by clicking on the cell phone icon in the top right corner of the Immuta Console.
After clicking on the icon, Data Owners can grant or reject requests directly in the notifications drop-down.
Users will see their pending access requests in the same dropdown.
Activity notifications alert users to actions that other users have performed within Immuta. The activity notifications that each user receives depend on their permissions and responsibilities.
Data Users: Data Users receive activity notifications when Data Owners accept or deny their pending access requests.
Data Owners: Data Owners receive notifications about activity in their data sources and projects and when users query their data sources that have policies enforced. These notifications are shown when the user selects the bell icon in the upper right-hand corner.
Governors: Governors receive notifications for all data source activity, including policy updates within Immuta. These notifications are shown when the user selects the bell icon in the upper right-hand corner.
Administrators: Administrators receive notifications for user, group, and attribute activity, such as when a new user is created or when an attribute is added to a group. These notifications are shown when the user selects the bell icon in the upper right-hand corner.
For an extensive list of notifications, see the Webhooks API page.
If SMTP is configured for an organization's Immuta instance, users may also receive notifications at the email address they configure in their profile.
Users can subscribe to email notifications by completing the following steps:
Navigate to the User Profile page, and select Edit from the dropdown menu in the top right corner of the user profile information panel.
Select the Receive System Notifications as Emails checkbox at the bottom of the window that appears.
Click Save.
Once this setting is enabled, Immuta will compile notifications and distribute these compilations via email at 8-hour intervals.
Deprecation notice
Support for this feature has been deprecated.
This page outlines the basic features of the Query Editor, which contains three main components: Table List and Schema View, the Query Editor, and the Query Results View. For a tutorial that details how to use the Query Editor, navigate to the Data Source User Guide.
The Query Editor allows users who are subscribed to a data source to preview data and write and execute queries directly in the Immuta UI for any data sources they are subscribed to. Additionally, Data Owners can examine how their policies impact the underlying data.
This panel contains a list of tables (grouped by schema) the user is subscribed to, and this list will automatically update when users switch their current project. Clicking a table in the list displays the schema view, which shows all columns with their respective data types.
Users can enter, modify, and execute their own queries in this panel. After users click Run Query, results will appear in the Query Results panel.
In the top right corner of the Query Editor is a dropdown to select a schema. Any tables in SELECT statements that are not schema-qualified will use the schema chosen from the dropdown.
This panel displays the data returned by the query. Table columns can be resized or rearranged by clicking and dragging, and results can be filtered. Currently displayed results can also be exported to .csv (limited to 1000 rows).
Application Administrators can turn off the Query Engine to ensure data does not leave a data localization zone when authorized users access the Immuta Application outside data jurisdiction.
When the Query Engine is disabled, the SQL Credentials tab on a user profile page is removed. The associated SQL accounts are also deleted, so if an Administrator re-enables the Query Engine those SQL accounts must be recreated.
For a tutorial that details how to disable the Query Engine, navigate to the App Settings Tutorial.
Deprecation notice
Support for the audit page has been deprecated. Instead, pull audit logs from Kubernetes and push them to your SIEM.
All activity in Immuta is audited. This process provides rich audit logs that detail who subscribes to each data source, why they subscribe, when they access data, what SQL queries and blob fetches they run, and which files they access. Audit logs can be used for a number of purposes, including insider threat surveillance and data access monitoring for billing purposes. Audit logs can also be shipped to your enterprise auditing capability.
For more details about using audit logs, see the Audit Logs User Guide.
Immuta's logging system is designed to easily connect with enterprise log collection and aggregation systems. Please see the Immuta System Audit Logs page for full details.
Immuta provides access to all of the audit logs via the Audit page.
Only users with the AUDIT permission can access this page. See the Administration section for more information.
Users can sort these logs by ascending (oldest entries first) or descending (latest entries first) order. By default, 50 log entries are displayed to a page, but that can be changed to 100 or 200. Additionally, users can filter the entries in a variety of ways, including by project purpose, blobId, remote query id, the entry timestamp, data source, project, record type, user, and SQL query. These query audit records detail the query run, the columns that were masked, and how the masking was enforced.
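The sorting, paging, and filtering behavior described above can be sketched against audit records that have been exported elsewhere. This is only an illustration, not Immuta's implementation or API: the function `audit_page` and the record keys `timestamp`, `user`, and `dataSource` are hypothetical names, and real audit records may be shaped differently.

```python
from datetime import datetime

# Page sizes mentioned above: 50 by default, or 100 or 200.
ALLOWED_PAGE_SIZES = (50, 100, 200)


def audit_page(records, page=0, page_size=50, descending=True, **filters):
    """Return one page of records that match every filter,
    sorted by timestamp (latest first when `descending` is True)."""
    if page_size not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"page_size must be one of {ALLOWED_PAGE_SIZES}")
    # Keep only records whose fields equal every requested filter value.
    matching = [
        r for r in records
        if all(r.get(key) == value for key, value in filters.items())
    ]
    matching.sort(key=lambda r: r["timestamp"], reverse=descending)
    start = page * page_size
    return matching[start:start + page_size]


# Hypothetical exported records.
records = [
    {"timestamp": datetime(2023, 1, 1), "user": "alice", "dataSource": "patients"},
    {"timestamp": datetime(2023, 1, 3), "user": "bob", "dataSource": "patients"},
    {"timestamp": datetime(2023, 1, 2), "user": "alice", "dataSource": "claims"},
]

latest_first = audit_page(records, user="alice")
oldest_first = audit_page(records, descending=False, user="alice")
print([r["dataSource"] for r in latest_first])
```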
Deprecation notice
Support for this feature has been deprecated.
The Query Editor allows users to write, modify, and execute queries against data sources they are subscribed to.
Click the Query Editor icon in the left sidebar.
Select a data source in the Tables list.
Click the dropdown menu icon next to the data source and select Preview Sample Data, or click Preview Sample Data in the Table Schema panel.
View data in the Results panel.
Filter results by clicking the overflow menu next to the column name.
Rearrange and resize columns by clicking and dragging.
Run and export full results or export current results to .csv by clicking one of the corresponding download buttons in the top right corner of the table.
Click the Query Editor icon in the left sidebar.
Write your query in the Query Editor panel.
Execute your query by clicking the Run Query button. Note: Clicking this button will only run the currently highlighted query. Queries (or portions of queries) can be executed by manually highlighting the query (or portion of the query) and clicking Run Query.
View data in the Results panel.
Filter results by clicking the overflow menu next to the column name.
Rearrange and resize columns by clicking and dragging.
Export results to .csv by clicking the download button in the top right corner of the table.
The user profile page contains personal information about your user account, including contact information, API keys, and pending requests. To navigate to the user profile page or quick actions, click the profile icon in the header of the Immuta UI and select Profile.
The following information about the user is displayed on their profile page. With the exception of the Databricks, Redshift, Snowflake, or Synapse username, this information may be edited by the user at any time.
Name: The user's full name.
Email: The user's email address.
Position: The user's current position.
Last Updated: The time of the user's last profile update.
About: A short description about the user.
Location: The user's work location.
Organization: The organization that a user is associated with.
Phone Number: The user's phone number.
Databricks Username: The user's Databricks username. Only an admin may set this field.
Redshift Username: The user's Redshift username. Only an admin may set this field.
Snowflake Username: The user's Snowflake username. Only an admin may set this field.
Synapse Username: The user's Synapse username. Only an admin may set this field.
Receive System Notifications as Emails: The user can opt to receive email notifications.
In order to connect to the query engine, each user must create SQL credentials. SQL credentials can be accessed by clicking the SQL Credentials tab.
For more information on SQL credentials, see the Managing SQL accounts guide.
API keys allow for a secure way to communicate with the Immuta REST API without requiring the username and password. Each key can be revoked at any time and new ones generated. Once a key is revoked it can no longer be used to access the REST API, and users will need to authenticate any tool that they were using with the revoked API key with a new key.
Once in the API keys tab, a user can generate API keys or revoke API keys.
An API key can be linked to a project. By linking an API key to a project, you will be limiting that API key's visibility to only data sources associated with that project.
The requests tab allows users to view and manage all pending access requests directly from their profile page.
Audience: Application Administrators
Content Summary: The App Settings Page is visible only to Application Administrators and allows them to configure the Immuta settings, to manage license keys, and to generate a status bundle.
This tab is where the Administrator can add IAMs, external catalogs, and data providers. They can also adjust various Immuta settings to configure it better to their organization's needs.
For a tutorial on changing settings on this tab see App Settings Tutorial.
This tab includes a list of licenses and details the universally unique identifier (UUID), the features associated with specific licenses, the expiration dates, the total number of seats, and the date the keys were added. Administrators can also add and delete license keys from this page.
This tab allows Administrators to export a zip file called the Immuta status bundle. This bundle will include information helpful to assess and solve issues within an Immuta instance by providing a snapshot of Immuta, associated services, and information about the remote source backing any of the selected Data Sources. When generating the status bundle the Administrator may select the particular information that will help solve the issue at hand.
Audience: All users
Content Summary: The Policies page allows all users to view and search all policies and the data sources they apply to. Additionally, Governors and Data Owners can manage Global Policies and Restricted Global Policies on this page.
This document illustrates the basic features of the Policies page. For a tutorial, navigate to the Global Data Policy tutorial, the Global Subscription Policy tutorial, or the Restricted Global Policy Builder Tutorial.
These tabs list all policies and detail the tags, purposes, and policy type; the scope and state of the policy; and when and by whom the policy was created.
The Advanced Search allows users to search for policies based on specific facets, such as policy type, rule type, purposes, conflicts, and creator.
Snowflake
Databricks Unity Catalog
Databricks Spark
Databricks SQL
Starburst (Trino)
Redshift
Azure Synapse Analytics
Native query audit type
Legacy audit and UAM
Legacy audit and UAM
Legacy audit and UAM
Legacy audit
Table and user coverage
Registered data sources and users
Registered data sources and users
All tables and users
Registered data sources and users
Object queried
Limited support
Columns returned
Limited support
Query text
Limited support
Unauthorized information
Limited support
Limited support
Policy details
Limited support
User's entitlements
Limited support
Data Sources |
Policies |
Projects |
Audit Logs and Immuta Reports |
Application Admin | APPLICATION_ADMIN |
Data Owner |
|
Data User | - |
Data Governor | GOVERNANCE |
Project Manager | PROJECT_MANAGEMENT |
User Admin | USER_ADMIN |
A data source is how users virtually expose data (that lives in remote data platforms) across their enterprise to other users. When you expose a data source, you are not copying the data; you are using metadata to tell Immuta how to expose it. Once exposed and subscribed to, the data will be accessed in a consistent manner across analytics and visualization tools, allowing reproducibility and sharing. For more information and tutorials about data sources, see .
Policies are fine-grained security controls applied to data sources by Data Owners or Data Governors, who determine the logic behind what is hidden from whom. Immuta offers two policy types: , which determine who can access a data source, and , which determine what data the user sees once they get access to a data source. Through these policies, data is hidden, masked, redacted, and anonymized in the control plane based on the attributes of the users accessing the data and the purpose under which they are acting. For more information and tutorials about policies, see .
Projects allow users to logically group work by linking data sources and can be created to efficiently organize work or to provide special access to data to specific users. The same security restrictions regarding data sources are applied to projects; project members still need to be subscribed to data sources in order to access data, and only users with appropriate attributes and credentials will be able to see the data if it contains any row-level or masking security. However, Project Owners can enable , which improves collaboration by ensuring that the data in the project looks identical to all members, regardless of their level of access to data. When enabled, this feature automatically equalizes all permissions so that no project member has more access to data than the member with the least access. For more detailed discussion and tutorials about projects, see .
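One way to picture the equalization described above is as the intersection of what every project member is entitled to. The sketch below is a simplification under that assumption and does not describe how Immuta actually computes equalized access; the function `equalized_entitlements` and the member and entitlement names are hypothetical.

```python
def equalized_entitlements(member_entitlements):
    """Given a mapping of member -> set of entitlements, return the
    entitlements that every member holds (the common baseline)."""
    entitlement_sets = [set(e) for e in member_entitlements.values()]
    if not entitlement_sets:
        return set()
    # No member ends up with more than what all members share.
    return set.intersection(*entitlement_sets)


# Hypothetical project members and their entitlements.
members = {
    "alice": {"group:analysts", "attr:country=US", "attr:clearance=high"},
    "bob": {"group:analysts", "attr:country=US"},
    "carol": {"group:analysts", "attr:country=US", "attr:dept=finance"},
}

print(sorted(equalized_entitlements(members)))
```

Under this model, all three members would see data as if they held only the two entitlements they have in common, so the data in the project looks identical to each of them.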
All activity in Immuta is audited, and Data Owners and users with the AUDIT permission can access audit logs that detail who subscribes to each data source, why they subscribe, when they access data, and which files they access. These logs can be used for a number of purposes, including insider threat surveillance and data access monitoring for billing purposes. Audit logs can also be shipped to your enterprise auditing capability, if desired. Similarly, Governors can build Immuta Reports to analyze how data is being used and accessed across Immuta using the Immuta Report Builder. Reports can be based on users, groups, projects, data sources, tags, purposes, policies, and connections within Immuta. For more information and tutorials about audit logs and Immuta Reports, see the and the , respectively.
without any permissions