Data Sources in Immuta

Data owners make their data available to other users in their organization by registering it in Immuta as a data source.

By default, data owners can register data in Immuta without affecting existing policies on those tables in their remote system, so users who had access to a table before it was registered can continue to access that data without interruption. If this default behavior is disabled on the app settings page, new data sources automatically receive a subscription policy that requires data owners to add subscribers manually (unless a global policy you create applies), blocking access to those tables.

For information about the default subscription policy and how to manage it, see the Subscription policies guide.

Data sources with nested columns

You can create Databricks data sources with nested columns when you enable complex data types. When complex types are enabled, Databricks data sources can have columns that are arrays, maps, or structs, and these types can be nested within one another. These columns are parsed into a nested data dictionary.
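
To illustrate what "parsed into a nested data dictionary" can look like, here is a minimal Python sketch. It is not Immuta's parser or API: the schema layout (`type`, `fields`, `element`, `key`, `value`) and the output keys (`dataType`, `children`) are assumptions made up for this example.

```python
from typing import Any

# Hypothetical illustration only; the input and output shapes are assumed,
# not taken from Immuta or Databricks.
def build_dictionary(columns: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Recursively turn column definitions into nested dictionary entries."""
    entries = []
    for col in columns:
        entry: dict[str, Any] = {"name": col["name"], "dataType": col["type"]}
        if col["type"] == "struct":
            # Each field of a struct becomes a child entry.
            entry["children"] = build_dictionary(col["fields"])
        elif col["type"] == "array":
            # An array has one element type, which may itself be nested.
            entry["children"] = build_dictionary(
                [{"name": "element", **col["element"]}]
            )
        elif col["type"] == "map":
            # A map has a key type and a value type.
            entry["children"] = build_dictionary(
                [{"name": "key", **col["key"]}, {"name": "value", **col["value"]}]
            )
        entries.append(entry)
    return entries


schema = [
    {"name": "id", "type": "bigint"},
    {
        "name": "address",
        "type": "struct",
        "fields": [
            {"name": "city", "type": "string"},
            {"name": "phones", "type": "array", "element": {"type": "string"}},
        ],
    },
]
nested = build_dictionary(schema)
```

In this sketch, `address` receives two children (`city` and `phones`), and `phones` in turn receives a single `element` child, so the nesting of the dictionary mirrors the nesting of the column types.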

Data source user roles

Users and groups can hold one of several roles on each data source. These roles are managed on the members tab of the data source and include the following types:

  • Owners: Those who create and manage new data sources and their users, documentation, and data dictionaries.
  • Subscribers: Those who have access to the data in the data source. With the appropriate access and attributes, these users and groups can view files, run queries, and generate analytics against that data. All users and groups granted access to a data source have subscriber status.
  • Experts: Those who are knowledgeable about the data in the data source and can elaborate on it. They are responsible for managing the data source's documentation and the tags and descriptions in its data dictionary.
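
The role list above can be summarized as a small capability table. The following Python sketch is only a way to picture the roles; the role values and capability names are hypothetical labels chosen for this example, not Immuta identifiers.

```python
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    SUBSCRIBER = "subscriber"
    EXPERT = "expert"


# Hypothetical capability names paraphrasing the role descriptions.
# "query_data" appears for every role because all users and groups granted
# access to a data source have subscriber status; actual access still
# depends on the policies and attributes that apply to each user.
CAPABILITIES = {
    Role.OWNER: {"query_data", "manage_members", "edit_documentation", "edit_dictionary"},
    Role.SUBSCRIBER: {"query_data"},
    Role.EXPERT: {"query_data", "edit_documentation", "edit_dictionary"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the sketch grants `action` to `role`."""
    return action in CAPABILITIES[role]
```

For example, `can(Role.EXPERT, "edit_dictionary")` is true in this sketch, while `can(Role.SUBSCRIBER, "edit_dictionary")` is false.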

See Manage data source members for a tutorial on modifying user roles.

Data dictionary

The data dictionary provides information about the columns within the data source, including column names and value types. Users subscribed to the data source can post and reply to discussion threads by commenting on the data dictionary.

Dictionary columns are generated automatically when the data source is created. Data owners and experts can then tag columns in the data dictionary and add descriptions to these entries.
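
As a rough picture of this workflow, the Python sketch below generates dictionary entries from column names and types, then lets only owners and experts annotate them. The class, function, and field names are assumptions for illustration and do not reflect Immuta's data model or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DictionaryEntry:
    # Name and type are filled in automatically at creation time.
    name: str
    data_type: str
    # Tags and descriptions are added later by owners or experts.
    tags: list[str] = field(default_factory=list)
    description: str = ""


def generate_dictionary(columns: dict[str, str]) -> list[DictionaryEntry]:
    """Create one entry per column (hypothetical stand-in for auto-generation)."""
    return [DictionaryEntry(name, data_type) for name, data_type in columns.items()]


def annotate(
    entry: DictionaryEntry,
    role: str,
    tag: Optional[str] = None,
    description: Optional[str] = None,
) -> None:
    """Add a tag and/or description; only owners and experts may do so."""
    if role not in {"owner", "expert"}:
        raise PermissionError(f"role {role!r} cannot edit the data dictionary")
    if tag is not None:
        entry.tags.append(tag)
    if description is not None:
        entry.description = description


entries = generate_dictionary({"ssn": "string", "age": "int"})
annotate(entries[0], "expert", tag="PII", description="Social security number")
```

Calling `annotate` with the role `"subscriber"` raises `PermissionError` in this sketch, reflecting that tagging and describing columns is reserved for owners and experts.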