In the distributed domains of a data mesh architecture, data governance and access control must be applied both vertically (locally), within specific domains or data products, and horizontally (globally). Global policies should be authored and applied in line with the ecosystem’s most generic and all-encompassing principles, regardless of the data’s domain (e.g., mask all PII data). Localized domain- or product-level policies should be fine-grained and apply only to context-specific purposes or use cases (e.g., only show rows in the Sales table where the country value matches the user's office location). Immuta distinguishes between subscription policies and data policies.
To access a data source, Immuta users must first be subscribed to that data source. A subscription policy determines who can request access to a data source or group of data sources. Alternatively, a subscription policy can also be configured to automatically provide access based on a user’s profile, such as being part of a certain group or having a particular attribute.
It is possible to build Immuta subscription policies across all data products (horizontal), as shown in the diagram above, and then have those policies merged with additional subscription policies authored at the domain level (vertical). When this occurs, the policies are merged according to the merge setting chosen on each of the policies involved.
Whether the requirements for access are merged with an AND or an OR is prescribed by this setting in the policy builder for each of the individual policies:
Always Required = AND
Share Responsibility = OR
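The merge behavior can be sketched in a few lines of Python. This is an illustrative model only, not the Immuta API: the `SubscriptionPolicy` class, the `meets_merged_policies` function, and the example policies and user profiles are all hypothetical. It also assumes one plausible reading of mixed settings, namely that every Always Required policy must be met and, if any Share Responsibility policies apply, at least one of them must be met.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical user profile: maps a key such as "groups" to a list of values.
User = Dict[str, List[str]]


@dataclass
class SubscriptionPolicy:
    name: str
    condition: Callable[[User], bool]  # returns True when the user meets this policy
    always_required: bool  # True = "Always Required" (AND); False = "Share Responsibility" (OR)


def meets_merged_policies(user: User, policies: List[SubscriptionPolicy]) -> bool:
    """Every Always Required policy must pass; if any Share Responsibility
    policies exist, at least one of them must pass (assumed semantics)."""
    if not policies:
        return False  # assumption: no policy means no automatic access
    required = [p for p in policies if p.always_required]
    shared = [p for p in policies if not p.always_required]
    required_ok = all(p.condition(user) for p in required)
    shared_ok = any(p.condition(user) for p in shared) if shared else True
    return required_ok and shared_ok


# Hypothetical global (horizontal) policy, authored at the governance level.
global_policy = SubscriptionPolicy(
    name="privacy-training",
    condition=lambda u: "privacy" in u.get("training", []),
    always_required=True,
)
# Hypothetical domain (vertical) policy, authored inside the Sales domain.
domain_policy = SubscriptionPolicy(
    name="sales-analysts",
    condition=lambda u: "sales-analysts" in u.get("groups", []),
    always_required=True,
)

alice = {"groups": ["sales-analysts"], "training": ["privacy"]}
bob = {"groups": ["sales-analysts"], "training": []}

alice_ok = meets_merged_policies(alice, [global_policy, domain_policy])
bob_ok = meets_merged_policies(bob, [global_policy, domain_policy])
```

With both policies set to Always Required, `alice_ok` is `True` and `bob_ok` is `False`, because Bob meets only the domain policy. If both policies were set to Share Responsibility instead, Bob would qualify through the domain policy alone.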
You can find a specific example of subscription policy merging in the Automate data access control decisions use case. In a data mesh, however, the policies are authored by completely separate users: one at the global (horizontal) level with the GOVERNANCE permission and the second at the domain (vertical) level with the Manage policies permission (in domain).
The subscription policies you build can affect what a user can discover and, if desired, "put in their shopping cart" to use.
We'll discuss the "shopping" experience in the next guide, but the subscription policy options that control it are the following:
Allow Data Source Discovery: Normally, if a user does not meet the subscription policy, that data source is hidden from them in the Immuta UI. If you check this option when building your subscription policy, the inverse is true: anyone can see this data source. This is important if you want users to know that the data product exists, even if they don't have access.
Require Manual Subscription: Even if the user meets the policy, they are not subscribed automatically; they have to discover the data source and subscribe themselves. Because they meet the policy, that subscription is granted with no intervention. This is important if you want users to maintain the list of data products they see in the data platform rather than seeing all data products they have access to.
Request Approval to Access: This allows the user to request access, even if they don't meet the policy. Rules can determine which user can manually override the policy to let them in.
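The three options above can be read as a small decision table. The following Python sketch models that table; it is not Immuta code, and the names `SubscriptionOptions`, `Outcome`, `evaluate`, and the action labels are hypothetical. Visibility and the available action are computed independently, since the text does not say how the options interact with each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubscriptionOptions:
    allow_discovery: bool = False              # "Allow Data Source Discovery"
    require_manual_subscription: bool = False  # "Require Manual Subscription"
    request_approval: bool = False             # "Request Approval to Access"


@dataclass(frozen=True)
class Outcome:
    visible: bool  # whether the user can see the data source
    action: str    # what the user can do, or what happens automatically


def evaluate(meets_policy: bool, opts: SubscriptionOptions) -> Outcome:
    # By default a data source is hidden from users who do not meet the policy.
    visible = meets_policy or opts.allow_discovery
    if meets_policy:
        # Users who meet the policy are subscribed automatically unless they
        # are required to subscribe themselves.
        action = "self_subscribe" if opts.require_manual_subscription else "auto_subscribed"
    elif opts.request_approval:
        # Users who do not meet the policy may ask an approver to let them in.
        action = "request_approval"
    else:
        action = "none"
    return Outcome(visible, action)


default_outcome = evaluate(False, SubscriptionOptions())
shopping_outcome = evaluate(True, SubscriptionOptions(require_manual_subscription=True))
```

Here `default_outcome` is hidden with no action available, while `shopping_outcome` is visible and waits for the user to subscribe, which corresponds to the "shopping cart" workflow.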
Once a user is subscribed to a data source via Immuta or has pre-existing direct access on the underlying data platform, the data policies applied to that data source determine what data the user sees. Data policy types include masking, row-level, and other privacy-enhancing techniques.
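To make the two most common policy types concrete, the sketch below applies a global masking rule (mask every column tagged PII) and a domain-level row filter (only show rows where the country matches the user's office location) to an in-memory table. It is a toy model under stated assumptions, not how Immuta enforces policies: the function `apply_data_policies`, the `office_country` attribute, the `REDACTED` placeholder, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set

Row = Dict[str, object]

MASK = "REDACTED"  # hypothetical placeholder for masked values


def apply_data_policies(
    rows: List[Row],
    column_tags: Dict[str, Set[str]],
    user: Dict[str, str],
) -> List[Row]:
    # Row-level policy (domain): keep only rows whose country matches
    # the user's office location.
    visible_rows = [r for r in rows if r.get("country") == user.get("office_country")]
    # Masking policy (global): mask every column that carries the PII tag.
    pii_columns = {col for col, tags in column_tags.items() if "PII" in tags}
    return [
        {col: (MASK if col in pii_columns else value) for col, value in row.items()}
        for row in visible_rows
    ]


# Hypothetical Sales table and the tags assigned to its columns.
sales = [
    {"customer_email": "a@example.com", "country": "DE", "amount": 120},
    {"customer_email": "b@example.com", "country": "US", "amount": 80},
]
tags = {"customer_email": {"PII"}, "country": set(), "amount": set()}

result = apply_data_policies(sales, tags, {"office_country": "DE"})
```

For a user whose office is in DE, `result` contains only the first row, with `customer_email` replaced by `REDACTED`. Because the masking rule keys on the tag rather than on a column name, it applies to any data product that tags its columns the same way.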
An exemplary three-step approach to managing data policies would be:
Create global data policies on the tags resulting from the out-of-the-box sensitive data discovery.
Develop business-specific frameworks and protection rules and control them via data policies.
Update the global data policies as new sensitive data is potentially released by data products and discovered using Immuta Detect.
Data mesh is a higher-level use case that pulls from concepts learned across the other use cases:
Monitor and secure sensitive data platform query activity: How can you proactively monitor for data products that are leaking sensitive data?
Automate data access control decisions: How do you manage table access across your data products?
Compliantly open more sensitive data for ML and analytics: How do you manage granular access and mitigate concerns about sensitive data leaks in data products?
Our recommended strategy is to decide whether you want to automatically subscribe users to data products or offer a workflow in which they discover and subscribe to data products. In either case, you can delegate some policy ownership to the data product creators, both by allowing them to tag their data with facts that drive global data policies and by allowing domain-specific subscription policy authoring that merges with global subscription policies.