Compliantly Open More Sensitive Data for ML and Analytics

Who is this for?

This guide is for users who want to make more data accessible by creating more granular and powerful policies at the data layer.

Prerequisites

Goals

First, a subscription policy (described in the Automate data access control decisions use case) that grants a user access to data does not mean the user should have access to all of that data. Organizations often stop at granting access without considering which specific columns or rows should be visible to different users. See the process through by masking sensitive values that a user's role does not require, so that users have the data access they need while sensitive information stays protected.

Second, in the context of global data policies, a subscription policy can be viewed as the coarsest form of a global masking policy: withholding a subscription effectively masks or redacts the entire table. Seen this way, global data policies can cover the same ground as subscription policies while also offering finer-grained protection.

The primary advantage of this approach is an easy, maintainable way to manage data leak risk without impeding data access, which means more data for ML and analytics. By focusing on global data policies, organizations can protect sensitive data down to the row and column level, regardless of who has access to the table. Data remains broadly accessible for business operations and decision-making while the risk of data leaks is reduced, because you can

  • be more specific with your policies as described above and

  • mask using advanced privacy-enhancing technologies (PETs) that let you get utility from the data in a column while still preserving privacy in that same column.
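
To make the second point concrete, here is a minimal sketch of one utility-preserving masking technique, keyed hashing. It is plain Python for illustration only, not Immuta's API or implementation; the key, the function name, and the sample rows are all assumptions. Equal inputs produce equal tokens, so counts and joins on the masked column still work even though the original values are unreadable.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret for illustration; a real key would live in a secrets manager.
SECRET_KEY = b"example-only-key"


def mask_with_keyed_hash(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a value with an HMAC-SHA256 token: unreadable, but equal inputs stay equal."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up rows for illustration.
rows = [
    {"email": "ana@example.com", "purchase": 20},
    {"email": "bo@example.com", "purchase": 35},
    {"email": "ana@example.com", "purchase": 15},
]

# Mask only the sensitive column; other columns pass through unchanged.
masked_rows = [{**row, "email": mask_with_keyed_hash(row["email"])} for row in rows]

# Analysts can still count purchases per customer without seeing any email address.
purchases_per_customer = Counter(row["email"] for row in masked_rows)
```

The trade-off in this sketch is that anyone holding the key can recompute tokens for guessed values, so the key itself has to be protected.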

However, this approach does not mean that you should never create subscription policies; they still have their place in data governance. The key point is that the primary focus shifts away from subscription policies and towards global data policies, which allow more nuanced control over what data users can see and so improve both data security and compliance.

When is this appropriate?

This use case is particularly suitable in scenarios where you already have a process for granting access to tables. If your organization has established procedures for table access that are working effectively, introducing global data policies can enhance your data governance without disrupting existing workflows.

It also fits when you grant everyone access to tables. In such cases, the focus is less on who has access and more on what they can access. Global data policies help ensure that while data is broadly accessible, sensitive information is appropriately masked or redacted, maintaining compliance and security.
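
As an illustration of "what they can access" on a table that everyone can query, the following sketch applies a row-level filter and a column-level mask based on user metadata. It is plain Python, not Immuta's API; the table, the `countries` attribute, and the `fraud-analysts` group are hypothetical names chosen for the example.

```python
from typing import Any

# Made-up table that every user is allowed to query.
ORDERS = [
    {"order_id": 1, "country": "US", "card_number": "4111111111111111"},
    {"order_id": 2, "country": "DE", "card_number": "5500000000000004"},
    {"order_id": 3, "country": "US", "card_number": "340000000000009"},
]


def query_orders(user: dict[str, Any]) -> list[dict[str, Any]]:
    """Return only the rows and column values this user is entitled to see."""
    visible = []
    for row in ORDERS:
        # Row-level control: keep rows whose country matches a user attribute.
        if row["country"] not in user["countries"]:
            continue
        result = dict(row)  # copy so the underlying table is never modified
        # Column-level control: null the card number unless the user is in the group.
        if "fraud-analysts" not in user["groups"]:
            result["card_number"] = None
        visible.append(result)
    return visible


analyst = {"countries": {"US"}, "groups": {"marketing"}}
investigator = {"countries": {"US", "DE"}, "groups": {"fraud-analysts"}}
```

Here `query_orders(analyst)` returns the two US orders with the card number nulled, while `query_orders(investigator)` returns all three orders unmasked. Both users hold the same table-level access.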

Lastly, this approach is appropriate when you have very generic subscription policies (your native tool may refer to these as table GRANTs). If your subscription policies are not tailored to specific user attributes, roles, or data sensitivity levels, they may not provide adequate data protection. Shifting the focus to global data policies, such as data masking, gives you more nuanced and effective control over data access.

In essence, this use case is appropriate when you want to maintain or improve data accessibility while ensuring robust data protection, regardless of your current table grants.

When is this not appropriate?

Data sensitivity

If the existence of certain tables, schemas, or columns is considered sensitive information within your organization, this solution pattern may not be appropriate. Revealing the existence of certain data, even without granting access to the actual data, can pose a security risk in some contexts. In such cases, a more restrictive strategy may be required.

Data navigation

With this use case, users might have to navigate through a large number of tables to find the data they need. This can hinder the user experience, especially in large organizations with extensive data environments.

Configuration steps

Follow these steps to learn more about and start using Immuta to compliantly open more sensitive data for ML and analytics:

  1. Complete the Monitor and secure sensitive data platform query activity use case to configure Immuta.

  2. Manage user metadata. This step is critical to building scalable policy and understanding the considerations around how and what to capture. Tag your users with attributes and groups that are meaningful for Immuta global data policies.

  3. Manage data metadata. This is the final setup step you must complete before authoring policy. Tag your columns with tags that are meaningful for Immuta global data policies.

  4. Author policy. In this step, you will define your global data policy logic for granularly masking and redacting rows and columns. Optionally, test and deploy the policy.
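
Steps 2 through 4 can be sketched conceptually as follows. This is a plain-Python illustration of why tagging users and columns comes before authoring policy, not Immuta's policy syntax or API; the `PII` tag, the `pii-approved` attribute, and all user, table, and column names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalMaskingPolicy:
    """Mask every column carrying `column_tag` unless the user holds `exempt_attribute`."""
    column_tag: str
    exempt_attribute: str


# Step 2: user metadata (hypothetical attributes).
USER_ATTRIBUTES = {
    "dana": {"pii-approved"},
    "eli": set(),
}

# Step 3: data metadata, tags per (table, column) (hypothetical tags).
COLUMN_TAGS = {
    ("customers", "email"): {"PII"},
    ("customers", "signup_date"): set(),
    ("patients", "ssn"): {"PII"},
}

# Step 4: one policy, written once against tags rather than table names.
POLICIES = [GlobalMaskingPolicy(column_tag="PII", exempt_attribute="pii-approved")]


def is_masked(user: str, table: str, column: str) -> bool:
    """Decide whether `column` is masked for `user` under all global policies."""
    tags = COLUMN_TAGS.get((table, column), set())
    attributes = USER_ATTRIBUTES.get(user, set())
    return any(
        policy.column_tag in tags and policy.exempt_attribute not in attributes
        for policy in POLICIES
    )
```

Because the policy refers only to tags and attributes, a newly registered table whose columns are tagged `PII` would be covered without writing another policy, which is the reason for completing the metadata steps first.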
