Data Sources in Immuta
Data owners make their data available to other users across their organization by registering it in Immuta as a data source. A data source is a collection of metadata about a table or data object, and it enables Immuta actions such as the following:
Apply tags to data sources to enforce access controls
Apply data policies to a data source's columns
Restrict the users who can query a data source with subscription policies
Gather your data sources into various domains for delegation
Publish data products containing your data sources in the Marketplace app
When data is registered, Immuta does not affect existing policies on those tables in the remote system for non-Immuta users, so users who had access to a table before it was registered can still access that data without interruption. For Immuta users, however, this behavior varies by integration, so see the integration reference guides for more details.
Integrations
For policies to apply properly to data sources, an integration must be configured in Immuta, and that integration's connection details must match the data source's connection details. This allows Immuta to enforce policies natively on that table in your data platform.
On data platforms that support connections, a connection combines integration configuration and data source registration, which ensures that the details match.
For all other technologies, integration configuration and data source registration happen separately.
Connections data platforms
Use connections to create the integration and data objects with the same credentials. Then, enable the data objects for your tables, views, etc. to create the data sources.
Non-connection data platforms
For all other technologies, configure your integration and then create data sources. Ensure that the host, port, and other integration details match the data source details so that policies will apply properly.
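The matching requirement above can be checked before registration with a small script. The following is a minimal sketch, not Immuta's own logic: the `ConnectionDetails` class, its fields, and the `mismatched_fields` helper are hypothetical names chosen for illustration, and the real set of details to compare depends on your data platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionDetails:
    # Hypothetical fields; the details that matter vary by data platform.
    host: str
    port: int
    database: str


def normalize(details: ConnectionDetails) -> ConnectionDetails:
    """Trim whitespace and lower-case the host (hostnames are case-insensitive)."""
    return ConnectionDetails(
        host=details.host.strip().lower(),
        port=details.port,
        database=details.database.strip(),
    )


def mismatched_fields(integration: ConnectionDetails,
                      data_source: ConnectionDetails) -> list:
    """Return the names of the fields that differ between the two configurations."""
    a, b = normalize(integration), normalize(data_source)
    return [name for name in ("host", "port", "database")
            if getattr(a, name) != getattr(b, name)]


if __name__ == "__main__":
    integration = ConnectionDetails("Warehouse.Example.com", 443, "analytics")
    data_source = ConnectionDetails("warehouse.example.com", 8443, "analytics")
    # Lists any fields that would need to be reconciled before registering.
    print(mismatched_fields(integration, data_source))
```

An empty list means the two sets of details agree under this sketch's comparison rules. Any field it reports is one to reconcile before creating the data source.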
Data sources with nested columns
You can create Databricks data sources with nested columns when you enable complex data types. When complex types are enabled, Databricks data sources can have columns that are arrays, maps, or structs that can be nested. These columns get parsed into a nested data dictionary.
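To illustrate what a nested data dictionary might look like, the sketch below walks a column description containing structs, arrays, and maps and produces nested dictionary entries. The input shape, the `dataType` and `children` keys, and the `build_dictionary` function are all assumptions made for this example; they do not represent Immuta's internal format.

```python
from typing import Any, Dict, List


def build_dictionary(columns: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Recursively convert column descriptions into nested dictionary entries."""
    entries = []
    for col in columns:
        entry: Dict[str, Any] = {"name": col["name"], "dataType": col["type"]}
        if col["type"] == "struct":
            # A struct nests its named fields.
            entry["children"] = build_dictionary(col["fields"])
        elif col["type"] == "array":
            # An array nests a single element type.
            entry["children"] = build_dictionary(
                [{"name": "element", **col["element"]}]
            )
        elif col["type"] == "map":
            # A map nests a key type and a value type.
            entry["children"] = build_dictionary(
                [{"name": "key", **col["key"]}, {"name": "value", **col["value"]}]
            )
        entries.append(entry)
    return entries


if __name__ == "__main__":
    columns = [
        {"name": "id", "type": "bigint"},
        {"name": "address", "type": "struct", "fields": [
            {"name": "city", "type": "string"},
            {"name": "phones", "type": "array", "element": {"type": "string"}},
        ]},
    ]
    for entry in build_dictionary(columns):
        print(entry)
```

Simple columns become flat entries, while each complex column carries a `children` list that can itself contain further complex columns.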
Data source user roles
Users and groups can hold several roles on each data source. These roles are managed through the members tab of the data source. Roles include the following types:
Owners: Those who create and manage new data sources and their users, documentation, and data dictionaries.
Subscribers: Those who have access to the data source data. With the appropriate data access and attributes, these users and groups can view files, run queries, and generate analytics against the data source data. All users and groups granted access to a data source have subscriber status.
Experts: Those who are knowledgeable about the data source data and can elaborate on it. They are responsible for managing the data source's documentation and data dictionary tags and descriptions.
See Manage data source members for a tutorial on modifying user roles.
Data dictionary
The data dictionary provides information about the columns within the data source, including column names and value types.
Dictionary columns are automatically generated when the data source is created. However, data owners and experts can tag columns in the data dictionary and add descriptions to these entries.
Data dictionary column icons
The data dictionary displays icons on columns that have a masking policy applied to them. The appearance of these icons varies depending on the user's permissions.
Governors and data owners
If you have the GOVERNANCE permission or are the data source owner, the data dictionary column icons will appear in these ways:
No icon: No masking policy applies to the column.
Yellow eye: A masking policy applies to the column, but the column is unmasked for the current user because they meet the exception criteria for the policy.
Red eye: A policy on the column masks it for the current user.
All other users
The data dictionary column icons will appear in these ways for all other users:
No icon: Either no masking policy applies to the column, or a masking policy applies but the column is unmasked for the current user because they meet the exception criteria for the policy.
Red eye: A policy on the column masks it for the current user.
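The icon rules for both groups of users can be summarized as a single decision function. This is a sketch of the rules as described in this section, not Immuta's implementation; the `Icon` enum, the `column_icon` function, and its parameter names are hypothetical.

```python
from enum import Enum


class Icon(Enum):
    NONE = "no icon"
    YELLOW_EYE = "yellow eye"
    RED_EYE = "red eye"


def column_icon(has_masking_policy: bool,
                user_meets_exception: bool,
                is_governor_or_owner: bool) -> Icon:
    """Pick the data dictionary icon shown to the current user for one column."""
    if not has_masking_policy:
        # No masking policy applies: nobody sees an icon.
        return Icon.NONE
    if not user_meets_exception:
        # The policy masks the column for this user.
        return Icon.RED_EYE
    # The column is unmasked for this user via the policy's exception criteria.
    # Only governors and data owners are shown that a policy exists.
    return Icon.YELLOW_EYE if is_governor_or_owner else Icon.NONE


if __name__ == "__main__":
    print(column_icon(True, True, True).value)
    print(column_icon(True, True, False).value)
```

The only difference between the two groups is the excepted case: governors and data owners see a yellow eye, while all other users see no icon.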
Audit
The following events related to data sources are audited and can be found on the audit page in the UI:
DatasourceCreated: A data source is created.
DatasourceDeleted: A data source is deleted.
DatasourceDisabled: A data source is disabled.
DatasourceUpdated: A data source is updated.
DatasourceAppliedToProject: A data source is added to a project.
DatasourceRemovedFromProject: A data source is removed from a project.
DatasourceCatalogSynced: An external catalog is linked and synced for the data source.
DatasourceGlobalPolicyApplied: A global policy is applied to a data source.
DatasourceGlobalPolicyConflictResolved: A policy conflict between two global policies on a data source is resolved.
DatasourceGlobalPolicyDisabled: A global policy is disabled on a data source.
DatasourceGlobalPolicyRemoved: A global policy is removed from a data source.
LocalPolicyCreated: A local policy is created on a data source.
LocalPolicyUpdated: A local policy is updated on a data source.
SubscriptionCreated: A user is subscribed to a data source or project.
SubscriptionDeleted: A user's subscription to a data source or project is removed.
SubscriptionRequestApproved: A user's request to subscribe to a data source or project is approved.
SubscriptionRequestDenied: A user's request to subscribe to a data source or project is denied.
SubscriptionRequested: A user requests to subscribe to a data source or project.
SubscriptionUpdated: A user's subscription to a data source or project is updated.
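If you work with audit records outside the UI, the event names above can be used to filter for data source activity. The sketch below assumes each record is a dictionary whose event name is stored under an `action` key. That record shape and the `count_data_source_events` function are assumptions for illustration only, so adjust them to match however your audit records are actually structured.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Event names taken from the list above.
DATA_SOURCE_EVENTS = frozenset({
    "DatasourceCreated", "DatasourceDeleted", "DatasourceDisabled",
    "DatasourceUpdated", "DatasourceAppliedToProject",
    "DatasourceRemovedFromProject", "DatasourceCatalogSynced",
    "DatasourceGlobalPolicyApplied", "DatasourceGlobalPolicyConflictResolved",
    "DatasourceGlobalPolicyDisabled", "DatasourceGlobalPolicyRemoved",
    "LocalPolicyCreated", "LocalPolicyUpdated",
    "SubscriptionCreated", "SubscriptionDeleted",
    "SubscriptionRequestApproved", "SubscriptionRequestDenied",
    "SubscriptionRequested", "SubscriptionUpdated",
})


def count_data_source_events(records: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Count records per event name, ignoring events not in the list above."""
    # "action" is a hypothetical key for the event name in each record.
    counts = Counter(
        record["action"]
        for record in records
        if record.get("action") in DATA_SOURCE_EVENTS
    )
    return dict(counts)


if __name__ == "__main__":
    sample = [
        {"action": "DatasourceCreated"},
        {"action": "SubscriptionRequested"},
        {"action": "DatasourceCreated"},
    ]
    print(count_data_source_events(sample))
```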