Snowflake Access Pattern
Audience: System Administrators, Data Owners, and Data Users
Content Summary: This page describes the Snowflake access pattern, through which Immuta applies policies directly in Snowflake. Users can use the Snowflake Web UI and their existing BI tools to query protected data natively in Snowflake.
There are two integration options based on your Snowflake Edition:
- Snowflake Enterprise Edition or higher: Edition required to use the new Snowflake integration, which is currently in Public Preview.
- Snowflake Standard Edition: Edition supported by the original Snowflake integration.
See the Snowflake integration page for a tutorial on enabling Snowflake through the App Settings page.
Snowflake Enterprise Edition or Higher
In this integration, Immuta manages and applies Snowflake Row Access Policies and Column Masking Policies without requiring views, allowing users to query tables directly in Snowflake while dynamic policies are enforced.
While Immuta controls subscription and data policies restricting who can see what and when, Immuta does not manage Snowflake grants on tables. This means that users are responsible for both
- registering the data source in Immuta and
- granting SELECT privileges on the table in Snowflake to the relevant users.
Additionally, when a data source is disabled or removed in Immuta or when the Snowflake integration is disabled in Immuta, these SELECT privileges must be revoked. Because Immuta removes policies on disabled and deleted data sources, any user who still has SELECT privileges will be able to query the raw table.
Snowflake can be used on its own or with Snowflake Workspaces. If you are using native workspaces with this integration, you must add the default role of the credentials used to register the data sources in the project to the list of excepted users and roles, described below. Otherwise, project equalization will not work.
Note that Snowflake Workspaces use Snowflake views.
Excepted Roles and Users are assigned when the integration is installed, and no policies will apply to these users' queries, despite any Immuta policies enforced on the tables they are querying. Consequently, roles and users added to this list should be limited to service accounts.
Immuta excludes the listed roles and users from policies by wrapping all policies in a CASE statement that checks whether a user is acting under one of the listed user names or roles. If so, the policy is not applied to the queried table. If not, the policy is applied as normal. Immuta does not distinguish between role and user name, so if you have a role and user with the exact same name, both the user and any user acting under that role will have full access to the data sources and no policies will be enforced for them.
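The CASE wrapping described above can be sketched as follows. This is an illustration only, not the SQL Immuta actually generates: the function name `wrap_policy_with_exceptions` and its parameters are hypothetical, and the only Snowflake features assumed are the `CURRENT_USER()` and `CURRENT_ROLE()` context functions.

```python
def wrap_policy_with_exceptions(policy_sql, unprotected_sql, excepted_names):
    """Wrap a policy expression so that excepted users and roles bypass it.

    Hypothetical helper: policy_sql is the expression a policy would return,
    unprotected_sql is the raw column expression.
    """
    if not excepted_names:
        return policy_sql
    # Render each name as a SQL string literal, doubling embedded quotes.
    literals = ", ".join("'" + n.replace("'", "''") + "'" for n in excepted_names)
    # A single list is compared against both the user name and the role,
    # which is why a user and a role sharing a name are both excepted.
    return (
        "CASE "
        f"WHEN CURRENT_USER() IN ({literals}) OR CURRENT_ROLE() IN ({literals}) "
        f"THEN {unprotected_sql} "
        f"ELSE {policy_sql} END"
    )


wrapped = wrap_policy_with_exceptions("NULL", "ssn", ["SVC_ETL"])
print(wrapped)
```

Because the same list is checked against user names and roles, a service account named SVC_ETL and any user acting under a role named SVC_ETL would both take the unprotected branch in this sketch.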
Migration from a Snowflake Standard Integration to a Snowflake Enterprise Integration (Public Preview)
- If multiple Snowflake integrations are enabled, they will all migrate together. If one fails, they will all revert to the Snowflake Standard integration.
- If an error occurs during migration and the integration cannot be reverted, the integration must be disabled and re-enabled.
You can migrate from a Snowflake Standard integration to a Snowflake Enterprise integration on the App Settings page. Once prompted, Immuta will migrate the integration, allowing users to seamlessly transition workloads from the legacy Immuta views to the direct Snowflake tables.
After the migration is complete, Immuta views will still exist for pre-existing Snowflake data sources to support existing workflows. However, disabling the Immuta data source will drop the Immuta view, and, if the data source is re-enabled, the view will not be recreated.
You can migrate back to a Snowflake Standard integration from the Snowflake Enterprise integration if any issues occur. However, this process is only intended to resolve any issues that occur during migration and regain utility of Immuta. Please consult your Immuta professional for assistance.
Access must be revoked.
Access to the Snowflake tables must be revoked when migrating from the Snowflake Enterprise to the Snowflake Standard integration to prevent users from having access to the raw tables.
Immuta policies that rely on a masked column as input cannot be natively queried in Snowflake. These policies will present a message upon creation and in the health status of any affected data sources. To avoid any data leaks, stricter masking will be enforced until the policies are changed.
- Additionally, if there is any other error in generating or applying policies natively in Snowflake, the data source will be locked and only users on the Excepted Roles/Users List and the credentials used to create the data source will be able to access the data.
Users are unable to roll back from the Snowflake Enterprise integration to the Snowflake Standard integration if Snowflake SQL-backed data sources exist. Before trying to roll back, edit the data sources to be Snowflake tables or views.
Once a Snowflake integration is disabled in Immuta, the user must remove the access that was granted in Snowflake. If that access is not revoked, users will be able to access the raw table in Snowflake.
- Migration must be done using the credentials and credential method (automatic or bootstrap) used to install the integration.
Snowflake Standard Edition
Snowflake completely eliminates role bloat, as all views are accessible through the PUBLIC role and access controls are applied in the view, allowing customers to leverage Immuta's powerful set of attribute-based policies. Additionally, users can continue using roles to enforce compute-based policies through "warehouse" roles, without needing to grant each of those roles access to the underlying data or create multiple views of the data for each specific business unit.
Snowflake can be used on its own or together with Snowflake Workspaces.
Sync Views and Data Sources
This access pattern leverages webhooks to keep Snowflake views up-to-date with the corresponding Immuta data sources. Whenever a data source or policy is created, updated, or disabled, a webhook will be called that will create, modify, or delete the native Snowflake view.
The SQL that makes up all views includes a join to the secure view immuta_system.user_profile. This view is a select from the immuta_system.profile table (which contains all Immuta users and their current groups, attributes, projects, and a list of valid tables they have access to) with the constraint immuta__userid = current_user() to ensure it only contains the profile row for the current user. This secure view is readable by all users and will only display the data that corresponds to the user executing the query.
The immuta_system.profile table is updated through webhooks whenever a user's groups or attributes change, they switch projects, they acknowledge a purpose, or when their data source access is approved or revoked. The profile table can only be read and updated by the Immuta system account.
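A small in-memory model may help show why the user_profile view can safely be readable by everyone. It is a sketch under assumptions: only the immuta__userid column comes from the description above, and the other field names and the `user_profile_view` function are hypothetical stand-ins for the real table and view.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileRow:
    immuta__userid: str                           # column named in the text
    groups: list = field(default_factory=list)    # hypothetical field name
    projects: list = field(default_factory=list)  # hypothetical field name
    tables: list = field(default_factory=list)    # hypothetical field name


def user_profile_view(profile_table, current_user):
    """Model of the constraint immuta__userid = current_user()."""
    return [row for row in profile_table if row.immuta__userid == current_user]


profile_table = [
    ProfileRow("ALICE", groups=["analysts"], tables=["ORDERS"]),
    ProfileRow("BOB", groups=["finance"], tables=["ORDERS", "PAYROLL"]),
]

alice_rows = user_profile_view(profile_table, "ALICE")
print(alice_rows)
```

Every user can query the view, yet each query only ever yields the caller's own profile row, so joining it into a data source view exposes nothing about other users.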
Secure and Non-Secure Views
When creating a native Snowflake data source, users have the option to use a regular view (traditional database view) or a secure view; however, according to Snowflake's documentation, "the Snowflake query optimizer, when evaluating secure views, bypasses certain optimizations used for regular views. This may result in some impact on query performance for secure views." To use the data source with both Snowflake and Snowflake Workspaces, secure views are necessary. Note: If HIPAA compliance is required, secure views must be used.
Non-Secure View Policy Implications
When using a non-secure view, certain policies may leak sensitive information. In addition to the concerns outlined here, there is also a risk of someone exploiting the query optimizer to discover that a row exists in a table that has been excluded by row-level policies. This attack is mentioned here in the Snowflake documentation.
Policies that will not leak sensitive information
- masking by making NULL, using a constant, or by rounding (date/numeric)
- minimization row-level policies
- date-based row-level policies
- k-anonymization masking policies
Policies that could leak sensitive information
- masking using a regex will show the regex being applied. In general this should be safe, but if you have a regex policy that removes a specific selector to redact (e.g., a regex of /123-45-6789/g to specifically remove a single SSN from a column), then someone would be able to identify columns with that value.
- in conditional masking and custom WHERE clauses including “Right To Be Forgotten,” the custom SQL will be visible, so for a policy like "only show rows where COUNTRY NOT IN(‘UK’, ‘AUS’)," users will know that it’s possible there is data in that table containing those values.
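The regex case can be illustrated with a short sketch. It assumes a reader who can see the pattern in a non-secure view definition; the replacement string "REDACTED" and the sample values are made up for the example.

```python
import re

# A selector that targets one specific value, as in the example above.
SPECIFIC_SSN = re.compile(r"123-45-6789")


def mask(value):
    # Hypothetical masking step: replace only the targeted SSN.
    return SPECIFIC_SSN.sub("REDACTED", value)


rows = ["123-45-6789", "987-65-4321", "note: 123-45-6789 on file"]
masked = [mask(v) for v in rows]

# Anyone who can read the pattern knows exactly what each REDACTED cell held.
revealed = [i for i, v in enumerate(masked) if "REDACTED" in v]
print(masked, revealed)
```

The masking itself works, but because the pattern is visible, every redacted position tells the reader which original value was there. A general pattern such as one matching any SSN would not carry that risk.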
Policies that will leak potentially sensitive information
These policies leak information sensitive to Immuta, but in most cases would require an attacker to reverse the algorithm. In general these policies should be used with secure views:
- masking using hashing will include the salt used
- numeric and categorical local differential privacy will include the salt used
- reversible masking will include both a key and an IV
- format preserving masking will include a tweak, key, an alphabet range, prefix, pad to length, and checksum id if used
The data sources themselves have all the Data policies included in the SQL through a series of CASE statements that determine which view of the data a user will see. Row-level policies are applied as top-level WHERE clauses, and usage policies (purpose-based or subscription-level) are applied as WHERE clauses against the joined user profile. The access_check function allows Immuta to throw custom errors similar to the Query Engine when a user lacks access to a data source because they are not subscribed to the data source, they are operating under the wrong project, or they cannot view any data because of policies enforced on the data source.
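To make the shape of such a view concrete, here is a simplified sketch of assembling column-level CASE expressions and top-level WHERE clauses. It is not Immuta's generated SQL: `build_view_sql`, its parameters, and the `is_privileged` condition are hypothetical, and the user profile join and the access_check call are left out for brevity.

```python
def build_view_sql(table, columns, masks, row_filters):
    """Build a SELECT with per-column CASE masking and top-level row filters.

    masks maps a column name to (condition_sql, masked_expression_sql).
    """
    select_items = []
    for col in columns:
        if col in masks:
            condition, masked_expr = masks[col]
            # Data policy: the CASE decides which version of the column is shown.
            select_items.append(
                f"CASE WHEN {condition} THEN {col} ELSE {masked_expr} END AS {col}"
            )
        else:
            select_items.append(col)
    sql = f"SELECT {', '.join(select_items)} FROM {table}"
    if row_filters:
        # Row-level policies: combined as top-level WHERE predicates.
        sql += " WHERE " + " AND ".join(f"({f})" for f in row_filters)
    return sql


view_sql = build_view_sql(
    "sales.orders",
    ["id", "ssn"],
    {"ssn": ("is_privileged", "NULL")},
    ["country = 'US'"],
)
print(view_sql)
```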
Snowflake Database Structure
By default, all native views are created within the immuta database, which is accessible by the PUBLIC role, so users acting under any Snowflake role can connect. All views within the database have SELECT permission granted to the PUBLIC role as well, and access is enforced by the access checks built into the individual views. Consequently, there is no need for users to manage any role-based access to any of the database objects managed by Immuta.
Immuta is unable to create a corresponding view in Snowflake for data sources
- with an external policy handler, or
- that are using the Advanced Rules DSL.
Certain interpolation functions can also block the creation of a native view, specifically
Snowflake workspaces allow users to access protected data directly in Snowflake without having to go through the Immuta Query Engine.
Typically, Immuta applies policies by forcing users to query through the Query Engine, which acts like a proxy in front of the database Immuta is protecting. However, Snowflake secure views make this process unnecessary. Instead, Immuta enforces policy logic on data and represents it as secure views in Snowflake. However, since secure views are static, creating a secure view for every unique user in your organization for every table in your organization would result in secure view bloat. Immuta projects address this problem by virtually grouping users and tables and equalizing users to the same level of access, ensuring that all members of the project see the same view of the data. Consequently, all members share one secure view.
While interacting directly with Snowflake secure views in these workspaces, users can create derived data sources and collaborate with other project members at a common access level. Because these derived data sources will inherit all appropriate policies, that data can then be shared outside the project. Additionally, derived data sources use the credentials of the Immuta system Snowflake account, which will allow them to persist after a workspace is disconnected.
Snowflake Workspaces can be used on their own or with the Snowflake integration.
Immuta enforces policy logic on data and represents it as secure views in Snowflake. Because projects group users and tables and equalize members to the same level of access, all members will see the same view of the data and, consequently, will only need one secure view. Changes to policies immediately propagate to relevant secure views.
Mapping Projects to Secure Views
Immuta projects are represented as Session Contexts within Snowflake. As they are linked to Snowflake, projects automatically create corresponding
- roles in Snowflake: IMMUTA_[project name]
- schemas in the Snowflake IMMUTA database: [project name]
- secure views in the project schema for any table in the project
If users switch projects, they simply change their Snowflake Session Context to the appropriate Immuta project. If users are not entitled to a data source contained by the project, they will not be able to access the Context in Snowflake until they have access to all tables in the project. If changes are made to a user's attributes, the changes will immediately propagate to the Snowflake context.
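The project-to-Snowflake mapping listed above can be sketched as a naming helper. Only the IMMUTA_[project name] role pattern and the IMMUTA database come from the text. The function name is hypothetical, as is the assumption that each secure view shares its table's name, and any normalization Immuta applies to project names is not modeled.

```python
def snowflake_objects_for_project(project_name, tables):
    """Return the Snowflake object names a project would map to (illustrative)."""
    schema = f"IMMUTA.{project_name}"
    return {
        "role": f"IMMUTA_{project_name}",
        "schema": schema,
        # Assumption: one secure view per project table, named after the table.
        "secure_views": [f"{schema}.{table}" for table in tables],
    }


objects = snowflake_objects_for_project("FRAUD_ANALYSIS", ["ORDERS", "CUSTOMERS"])
print(objects)
```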
Using Immuta with an Existing Snowflake Account
The following steps allow Immuta to be used with existing Snowflake accounts.
Immuta is configured to integrate with the organization’s Snowflake account and (optionally) share a single sign on (such as Okta), allowing users in Immuta to map to the same users in Snowflake. (Alternatively, that mapping can be inferred by using the same usernames in both Snowflake and Immuta.)
CREATE_DATA_SOURCE permissions are granted to specific users to allow them to expose Snowflake table metadata and enforce policies.
If tags are used to drive policies, users can manually add tags when tables are imported, Immuta can automatically tag sensitive data (if Sensitive Data Discovery is enabled), or users can pull tags from external catalogs that are mapped to the tables being exposed.
Policies are created and enforced on tables.
The CREATE_PROJECT permission is granted to specific users so they can create their own Immuta projects and create the appropriate Snowflake contexts. These users can drive what projects and hence what Snowflake contexts exist. Note: When users leave a project or a project is deleted, that Snowflake context will be removed from their Snowflake accounts.
The CREATE_DATA_SOURCE_IN_PROJECT permission is given to specific users so they can expose their derived tables in the project; the derived tables will inherit the policies, and then the data can be shared outside the project.
Users access data only through secure views in Snowflake (via Immuta projects), which significantly decreases the amount of role management for administrators in Snowflake. Organizations should also consider having a user in Snowflake who is able to create databases and make GRANTs on those databases and having separate users who are able to read and write from those tables.
- Few roles to manage in Snowflake; that complexity is pushed to Immuta, which is designed to simplify it.
- A small set of users has direct access to raw tables; most users go through secure views only, but raw database access can be segmented across departments.
- Policies are built by the individual database administrators within Immuta and are managed in a single location, and changes to policies are automatically propagated across thousands of tables’ secure views.
- Self-service access to data based on data policies.
- Users work in various contexts in Snowflake natively, based on their collaborators and their purpose, without fear of leaking data.
- All policies are enforced natively in Snowflake without performance impact.
- Security is maintained through Snowflake primitives (roles and secure views).
- Performance and scalability are maintained (no proxy).
- Policies can be driven by metadata, allowing massive scale policy enforcement with only a small set of actual policies.
- Derived tables can be shared back out through Immuta, improving collaboration.
- User access and removal are immediately reflected in secure views.
- Snowflake workspaces do not support differential privacy policies. Any Snowflake sources with differential privacy policies applied will not be created within the native Snowflake workspace.
- Native derived data sources can't be query-backed.
Multiple Snowflake Instances
A user can configure multiple integrations of Snowflake to a single Immuta instance and use them dynamically or with workspaces.
- There can only be one native connection per host.
- The host of the data source must match the host of the native connection for the native view to be created.
- Projects can only be configured to use one Snowflake host.
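The first two host constraints above can be expressed as simple checks. This is a sketch with hypothetical function names and example hosts, and it assumes hosts are compared as exact strings, which may not match how Immuta compares them.

```python
def register_native_connection(registered_hosts, host):
    """Allow only one native connection per host."""
    if host in registered_hosts:
        raise ValueError(f"a native connection already exists for {host}")
    registered_hosts.add(host)


def can_create_native_view(data_source_host, registered_hosts):
    """A native view requires a native connection on the data source's host."""
    return data_source_host in registered_hosts


hosts = set()
register_native_connection(hosts, "acme.snowflakecomputing.com")

try:
    register_native_connection(hosts, "acme.snowflakecomputing.com")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected, can_create_native_view("other.snowflakecomputing.com", hosts))
```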
Native Query Audit
Once this feature has been enabled on the App Settings page with the Snowflake native integration, Immuta will run a query against Snowflake to retrieve the query histories. These histories provide audit records for queries against Snowflake native data sources that are queried natively in Snowflake.
This process will happen automatically every 24 hours, or can be manually prompted at any time from the Immuta Audit page. When manually prompted, it will only search for new queries that were created since the latest native query that has been audited. The job is run in the background, so the new queries will not be immediately available.
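The incremental behavior of a manual run can be sketched as a watermark filter. The record fields (`query_id`, `start_time`) and the function name are hypothetical and do not reflect the actual query that Immuta issues against Snowflake.

```python
from datetime import datetime


def new_queries_since(query_history, last_audited_at):
    """Keep only queries that started after the latest audited query."""
    return [q for q in query_history if q["start_time"] > last_audited_at]


history = [
    {"query_id": "q1", "start_time": datetime(2024, 1, 1, 9, 0)},
    {"query_id": "q2", "start_time": datetime(2024, 1, 2, 9, 0)},
    {"query_id": "q3", "start_time": datetime(2024, 1, 3, 9, 0)},
]

# Suppose q2 was the most recent query already audited.
last_audited_at = datetime(2024, 1, 2, 9, 0)
fresh = new_queries_since(history, last_audited_at)
print([q["query_id"] for q in fresh])
```

Filtering on a watermark keeps repeated manual runs cheap, since each run only considers queries newer than the last audited one.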
For details about the contents of these audits, see the Native Query Audit Logs page.
Prompt Native Query Audit
To manually prompt the native query audit, click Native Query Audit on the Audit page.
Alternatively, the schedule for the automatic job can be changed to fit your needs. See instructions for changing the frequency of the automatic job on the App Settings Tutorial page.
- The scheduled and manual jobs that query Snowflake are run in the background. The audit records will not update immediately.
- If you are relying on the scheduled job, any audit records for queries run in the day will not appear until the next day, at the earliest. In some cases, they could appear another day later.
- This feature is only available with Snowflake Enterprise or higher.
Snowflake External Catalog
When configuring a native Snowflake integration, you can add Snowflake as an external catalog as well. With this feature enabled, Immuta will automatically ingest Snowflake Object Tags from your Snowflake instance into Immuta and add them to the appropriate data sources.
The Snowflake tags' Key and Value pairs will be reflected in Immuta as two levels: the Key will be the top level and the Value the second. As Snowflake tags are hierarchical, Snowflake tags applied to a database will also be applied to all of the schemas in that database, all of the tables within those schemas, and all of the columns within those tables. For example, if a database is tagged PII, all of the tables and columns in that database will also be tagged PII.
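The inheritance described above can be modeled by walking an object's path from database down to column and collecting tags at each level. This is a sketch of the concept only: representing tags as (key, value) tuples and objects as path tuples is an assumption made for the example, and the object names and tag values are made up.

```python
def inherited_tags(object_path, tags_by_object):
    """Collect tags applied to an object and to every ancestor above it.

    object_path is a tuple such as (database, schema, table, column).
    """
    collected = []
    for depth in range(1, len(object_path) + 1):
        # Look up tags set directly on the ancestor at this depth.
        for tag in tags_by_object.get(object_path[:depth], []):
            if tag not in collected:
                collected.append(tag)
    return collected


tags_by_object = {
    ("HR",): [("PII", "true")],                        # tag set on the database
    ("HR", "PAYROLL", "SALARY"): [("Owner", "finance")],  # tag set on a table
}

column_tags = inherited_tags(("HR", "PAYROLL", "SALARY", "AMOUNT"), tags_by_object)
print(column_tags)
```

In this model each tuple's first element plays the role of the top-level Key and the second the Value beneath it.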
To add Snowflake as an external catalog, follow one of the tutorials below:
- Manually Link a Snowflake Catalog: This tutorial is intended for users who want Snowflake tags to be ingested into Immuta but do not want users to query data sources natively in Snowflake.
- Automatically Link a Snowflake Catalog: Native Integration: This tutorial illustrates how to add a Snowflake catalog when configuring a native Snowflake integration.
Snowflake has some natural data latency. Even when manually refreshing external tags from the Governance page, users can experience a delay of up to two hours in updated tags. This delay can be avoided by manually refreshing tags through a data source's Health Check.
Snowflake is enabled through the App Settings page.
Once Snowflake has been enabled on an instance, all future Snowflake data sources will also be created natively within the immuta database of the linked Snowflake instance. In addition to creating views, Immuta will also periodically sync user metadata to a system table within the Snowflake instance.