Data Sources in Immuta

Data owners expose their data across their organization to other users by registering that data in Immuta as a data source. Data sources are collections of metadata about your tables or data objects and allow for Immuta actions like the following:

  • Apply tags to data sources to enforce access controls

  • Apply data policies to a data source's columns

  • Restrict the users who can query a data source with subscription policies

  • Gather your data sources into various domains for delegation

  • Publish data products containing your data sources in the Marketplace app

When data is registered, Immuta does not affect existing policies on those tables in the remote system for non-Immuta users, so users who had access to a table before it was registered can still access that data without interruption. For Immuta users, however, this behavior varies by integration; see the integration reference guides for details.

Integrations

For policies to apply properly to data sources, an integration must be configured in Immuta, and that integration's connection details must match the data source's connection details. This allows Immuta to natively enforce policies on that table in your data platform.

  • Connections combine integration configuration and data source registration, ensuring details match.

  • For all other technologies, integration configuration and data source registration happen separately.
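
As a rough illustration of the matching requirement above, the sketch below compares two sets of connection details. The `ConnectionDetails` class and `details_match` function are hypothetical names chosen for this example, not part of any Immuta API, and the case-insensitive host comparison is an assumption rather than documented Immuta behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionDetails:
    """Hypothetical subset of connection details (not an Immuta schema)."""
    host: str
    port: int


def details_match(integration: ConnectionDetails, data_source: ConnectionDetails) -> bool:
    """Return True when the hosts (compared case-insensitively) and ports agree."""
    same_host = integration.host.strip().lower() == data_source.host.strip().lower()
    return same_host and integration.port == data_source.port


integration = ConnectionDetails(host="warehouse.example.com", port=443)
matching_source = ConnectionDetails(host="Warehouse.Example.com", port=443)
mismatched_source = ConnectionDetails(host="warehouse.example.com", port=5432)

print(details_match(integration, matching_source))    # True
print(details_match(integration, mismatched_source))  # False
```

A check like this could be run before registering data sources on a non-connection platform to catch mismatches early.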

Connections data platforms

Use connections to create the integration and data objects with the same credentials. Then, enable the data objects for your tables, views, etc. to create the data sources.

Non-connection data platforms

For all other technologies, configure your integration and then create data sources. Ensure that the host, port, and other integration details match the data source details so that policies apply properly.

Data sources with nested columns

You can create Databricks data sources with nested columns when you enable complex data types. When complex types are enabled, Databricks data sources can have columns that are arrays, maps, or structs that can be nested. These columns get parsed into a nested data dictionary.
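
To make the idea of a nested data dictionary concrete, here is a minimal sketch that turns a nested column description into nested dictionary entries. The input encoding (a dict for a struct, a one-element list for an array, a two-element tuple for a map), the `to_entry` function, and the `name`/`dataType`/`children` keys are all assumptions made for illustration; Immuta's actual internal representation is not described here.

```python
from typing import Any


def to_entry(name: str, col_type: Any) -> dict:
    """Convert one column (possibly nested) into a dictionary entry."""
    if isinstance(col_type, dict):  # struct: field name -> field type
        children = [to_entry(field, t) for field, t in col_type.items()]
        return {"name": name, "dataType": "struct", "children": children}
    if isinstance(col_type, list):  # array: [element type]
        return {"name": name, "dataType": "array",
                "children": [to_entry("element", col_type[0])]}
    if isinstance(col_type, tuple):  # map: (key type, value type)
        key_type, value_type = col_type
        return {"name": name, "dataType": "map",
                "children": [to_entry("key", key_type),
                             to_entry("value", value_type)]}
    return {"name": name, "dataType": col_type}  # primitive type name


# Hypothetical table with a struct that contains an array, plus a map column.
schema = {
    "id": "bigint",
    "address": {"city": "string", "phones": ["string"]},
    "attributes": ("string", "int"),
}

dictionary = [to_entry(name, col_type) for name, col_type in schema.items()]
print(dictionary[1]["children"][1])  # the nested "phones" array entry
```

Each struct, array, or map keeps its inner columns under `children`, so nesting of any depth is preserved.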

Data source user roles

Users and groups can hold different roles on each data source. These roles are managed through the members tab of the data source. Roles include the following types:

  • Owners: Those who create and manage new data sources and their users, documentation, and data dictionaries.

  • Subscribers: Those who have access to the data source data. With the appropriate data access and attributes, these users and groups can view files, run queries, and generate analytics against the data source data. All users and groups granted access to a data source have subscriber status.

  • Experts: Those who are knowledgeable about the data source data and can elaborate on it. They are responsible for managing the data source's documentation and data dictionary tags and descriptions.

See Manage data source members for a tutorial on modifying user roles.

Data dictionary

The data dictionary provides information about the columns within the data source, including column names and value types.

Dictionary columns are automatically generated when the data source is created. However, data owners and experts can tag columns in the data dictionary and add descriptions to these entries.

Data dictionary column icons

The data dictionary displays icons on columns that have a masking policy applied to them. The appearance of these icons varies depending on the user's permissions.

Governors and data owners

If you have the GOVERNANCE permission or are the data source owner, the data dictionary column icons will appear in these ways:

  • No icon: No masking policy applies to the column.

  • Yellow eye: A masking policy applies to the column, but the column is unmasked for the current user because they meet the exception criteria for the policy.

  • Red eye: A policy on the column masks it for the current user.

All other users

The data dictionary column icons will appear in these ways for all other users:

  • No icon: Either no masking policy applies to the column or a masking policy applies to the column, but the column is unmasked for the current user because they meet the exception criteria for the policy.

  • Red eye: A policy on the column masks it for the current user.
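
The icon rules in the two lists above can be summarized as a small decision function. This is only a sketch of the documented behavior: the `Icon` enum, the `column_icon` function, and its parameter names are hypothetical and do not correspond to an Immuta API.

```python
from enum import Enum
from typing import Optional


class Icon(Enum):
    YELLOW_EYE = "yellow eye"
    RED_EYE = "red eye"


def column_icon(
    masking_policy_applies: bool,
    user_meets_exception: bool,
    is_governor_or_owner: bool,
) -> Optional[Icon]:
    """Return the icon shown on a data dictionary column, or None for no icon."""
    if not masking_policy_applies:
        return None  # no masking policy on the column
    if not user_meets_exception:
        return Icon.RED_EYE  # the column is masked for the current user
    # A policy applies but the user is excepted: only governors and
    # data owners see an indicator that a policy exists.
    return Icon.YELLOW_EYE if is_governor_or_owner else None


print(column_icon(True, True, True))    # Icon.YELLOW_EYE
print(column_icon(True, True, False))   # None
print(column_icon(True, False, False))  # Icon.RED_EYE
```

The only difference between the two audiences is the excepted case: governors and data owners see a yellow eye, while all other users see no icon.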

Audit

Events related to data sources are audited and can be found on the audit page in the UI.
