Data Sources in Immuta

Data owners expose their data across their organization to other users by registering that data in Immuta as a data source.

By default, data owners can register data in Immuta without affecting existing policies on those tables in the remote system, so users who could access a table before it was registered keep that access without interruption. If this default behavior is disabled on the app settings page, new data sources automatically receive a subscription policy that requires data owners to add subscribers manually (unless a global policy you create applies), which blocks access to those tables for users who have not been added.

For information about the default subscription policy and how to manage it, see the Subscription policies guide.
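
The decision described above can be sketched as a small function. This is an illustrative model only, not Immuta's implementation or API: the names `SubscriptionOutcome` and `subscription_outcome` and both parameters are hypothetical, and the sketch does not model how global policies interact with the default behavior, because the text above does not describe that case.

```python
from enum import Enum


class SubscriptionOutcome(Enum):
    """Hypothetical labels for what happens to a newly registered data source."""
    EXISTING_ACCESS_UNCHANGED = "existing access unchanged"
    GLOBAL_POLICY_APPLIES = "global policy applies"
    MANUAL_SUBSCRIBERS_ONLY = "data owner must add subscribers manually"


def subscription_outcome(default_enabled: bool, global_policy_applies: bool) -> SubscriptionOutcome:
    """Return the outcome for a new data source (illustrative only)."""
    if default_enabled:
        # Default behavior: registration does not affect existing policies,
        # so users keep the access they already had in the remote system.
        return SubscriptionOutcome.EXISTING_ACCESS_UNCHANGED
    if global_policy_applies:
        # A global policy that applies takes the place of the manual-add policy.
        return SubscriptionOutcome.GLOBAL_POLICY_APPLIES
    # Otherwise access is blocked until the data owner adds subscribers.
    return SubscriptionOutcome.MANUAL_SUBSCRIBERS_ONLY


if __name__ == "__main__":
    for default_enabled in (True, False):
        for global_policy in (True, False):
            outcome = subscription_outcome(default_enabled, global_policy)
            print(default_enabled, global_policy, outcome.value)
```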

Data sources with nested columns

When a data source supports nested columns, those columns are parsed into a nested data dictionary. The following data sources support nested columns:

  • S3
  • Azure Blob
  • Databricks sources with complex data types enabled

    • When complex types are enabled, Databricks data sources can have array, map, or struct columns, and these types can be nested inside one another.
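
To make the idea of a nested data dictionary concrete, the sketch below turns a nested schema into a tree of dictionary columns. It is a minimal illustration under stated assumptions, not Immuta's parser: the input format (a Python `dict` for a struct, a one-element `list` for an array), the `DictionaryColumn` class, and the `element` child name are all hypothetical, and maps are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Hypothetical schema format: a str is a leaf type, a dict is a struct,
# and a one-element list is an array of that element type.
Spec = Union[str, dict, list]


@dataclass
class DictionaryColumn:
    name: str
    data_type: str
    children: List["DictionaryColumn"] = field(default_factory=list)


def parse_column(name: str, spec: Spec) -> DictionaryColumn:
    """Recursively build a nested dictionary column from a schema spec."""
    if isinstance(spec, dict):
        # Struct: each field becomes a child column.
        children = [parse_column(k, v) for k, v in spec.items()]
        return DictionaryColumn(name, "struct", children)
    if isinstance(spec, list):
        # Array: a single child describes the element type.
        return DictionaryColumn(name, "array", [parse_column("element", spec[0])])
    return DictionaryColumn(name, spec)


def column_paths(column: DictionaryColumn, prefix: str = "") -> List[str]:
    """List dotted paths for a column and all of its nested children."""
    path = f"{prefix}.{column.name}" if prefix else column.name
    paths = [path]
    for child in column.children:
        paths.extend(column_paths(child, path))
    return paths


if __name__ == "__main__":
    address = parse_column(
        "address", {"city": "string", "geo": {"lat": "double", "lon": "double"}}
    )
    print(column_paths(address))
    # ['address', 'address.city', 'address.geo', 'address.geo.lat', 'address.geo.lon']
```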

Data source user roles

Users and groups can hold several roles on each data source. These roles are managed through the members tab of the data source and include the following types:

  • Owners: Those who create and manage new data sources and their users, documentation, and data dictionaries.
  • Subscribers: Those who have access to the data source's data. With the appropriate data access and attributes, these users and groups can view files, run queries, and generate analytics against that data. All users and groups granted access to a data source have subscriber status.
  • Experts: Those who are knowledgeable about the data source's data and can elaborate on it. They are responsible for managing the data source's documentation and its data dictionary tags and descriptions.
  • Ingest: Those who are responsible for ingesting data for the data source. This role only applies to object-backed data sources (such as S3 data sources). Ingest users cannot access any data once it's inside Immuta, but they can verify whether their data was successfully ingested.

See Manage data source members for a tutorial on modifying user roles.
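
The roles above can be summarized as a mapping from role to capabilities. The sketch below is only a reading aid: `Role`, `Capability`, and `can` are hypothetical names, not part of any Immuta API, and the mapping includes only capabilities that the descriptions above state explicitly (for example, it does not assume that owners can read data, since the list does not say so).

```python
from enum import Enum, auto


class Role(Enum):
    OWNER = auto()
    SUBSCRIBER = auto()
    EXPERT = auto()
    INGEST = auto()


class Capability(Enum):
    MANAGE_MEMBERS = auto()
    EDIT_DOCUMENTATION = auto()
    EDIT_DICTIONARY = auto()
    ACCESS_DATA = auto()
    VERIFY_INGEST = auto()


# Only capabilities named in the role descriptions are listed here.
ROLE_CAPABILITIES = {
    Role.OWNER: frozenset(
        {Capability.MANAGE_MEMBERS, Capability.EDIT_DOCUMENTATION, Capability.EDIT_DICTIONARY}
    ),
    Role.SUBSCRIBER: frozenset({Capability.ACCESS_DATA}),
    Role.EXPERT: frozenset({Capability.EDIT_DOCUMENTATION, Capability.EDIT_DICTIONARY}),
    # Ingest users can check ingestion status but cannot access the data itself.
    Role.INGEST: frozenset({Capability.VERIFY_INGEST}),
}


def can(role: Role, capability: Capability) -> bool:
    """Return True if the role description grants the capability."""
    return capability in ROLE_CAPABILITIES[role]


if __name__ == "__main__":
    print(can(Role.EXPERT, Capability.EDIT_DICTIONARY))  # True
    print(can(Role.INGEST, Capability.ACCESS_DATA))      # False
```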

Data dictionary

The data dictionary provides information about the columns within the data source, including column names and value types.

Dictionary columns are automatically generated when the data source is created. However, data owners and experts can tag columns in the data dictionary and add descriptions to these entries.
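
As a rough sketch of this workflow, the example below generates dictionary entries from a column listing and then lets only owners and experts annotate them. Everything here is hypothetical and for illustration: `DictionaryEntry`, `generate_dictionary`, `annotate`, and the role strings are assumed names, not Immuta's data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Roles that the text above says may tag and describe dictionary columns.
EDITOR_ROLES = {"owner", "expert"}


@dataclass
class DictionaryEntry:
    name: str
    value_type: str
    tags: Set[str] = field(default_factory=set)
    description: str = ""


def generate_dictionary(columns: Dict[str, str]) -> List[DictionaryEntry]:
    """Create one entry per column, as happens when a data source is created."""
    return [DictionaryEntry(name, value_type) for name, value_type in columns.items()]


def annotate(
    entry: DictionaryEntry,
    role: str,
    tag: Optional[str] = None,
    description: Optional[str] = None,
) -> DictionaryEntry:
    """Add a tag and/or description; only owners and experts may do this."""
    if role not in EDITOR_ROLES:
        raise PermissionError(f"role {role!r} cannot edit the data dictionary")
    if tag is not None:
        entry.tags.add(tag)
    if description is not None:
        entry.description = description
    return entry


if __name__ == "__main__":
    entries = generate_dictionary({"email": "string", "signup_date": "date"})
    annotate(entries[0], "expert", tag="PII", description="Customer email address")
    print(entries[0])
```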