This guide is for users who wish to understand their data estate and where there may be security gaps or non-compliant user query activity that needs to be addressed. It also contains details for configuring Immuta.
This use case is tailored to quickly get you monitoring queries in your data platform and understanding where you may have security gaps using Immuta Discover and Immuta Detect. If you are not using Snowflake, move instead to the General Immuta configuration use case, because filtering by tags and sensitivity in Immuta Detect is currently only available on Snowflake.
As part of this use case, you will learn special considerations and configurations for setting up Immuta for Immuta Detect. Upon completion, you will understand existing security gaps.
Follow these steps to configure Immuta and start using Detect:
Configure your users in Immuta, using the user identity best practices in order to review and summarize user activity and plan your first policy.
Read the native integration architecture overview and connect Immuta to your database. Consider the Snowflake roles best practices.
Register data sources in order to review and summarize data activity and plan your first policy.
Start Using Immuta Detect. To get the most out of it, consider populating sensitivity using Automate entity and sensitivity discovery (SDD) and then configure Detect with SDD.
Immuta Detect continually monitors your data environment to help answer questions about your most active data users, the most accessed data, and the events happening within your data environment. This understanding can help drive prioritization of where to place access control policies in Immuta Secure.
This section provides use cases to guide you through implementing Immuta Detect.
This reference guide discusses the benefits of Immuta Detect and how it works.
Immuta provides robust audit logging on actions within the application and on queries in native technologies like Snowflake, Databricks, and Unity Catalog. This section includes how-to guides for exporting Immuta audit logs for long-term backup and reference guides that describe Immuta's audit model.
Immuta Detect monitors your data environment and provides analytic dashboards in the Immuta UI based on your data use. This section includes how-to and reference guides about dashboards that offer visualizations of audit events and the sensitivity of queries, data sources, and columns.
Monitors allow you to gain awareness of when your users' behavior changes, maintain data availability through data platform policy changes, ensure access patterns remain consistent for your data controls to remain effective, and be notified about anomalies.
Select your use case
Immuta Detect continually monitors your data environment to help answer questions about your most active data users, the most accessed data, and the events happening within your data environment. This understanding can help drive prioritization of where to place access control policies.
Immuta Detect only supports tag filtering and sensitivity with Snowflake, so there are two paths to choose from:
If you use Snowflake: Monitor and secure sensitive data platform query activity. This will guide you through the basics of Immuta configuration and culminate with findings in Detect.
If you don't use Snowflake: General Immuta configuration. This will guide you through the required Immuta configurations in order to move on to Immuta Secure.
Within the Immuta product ecosystem, Immuta Detect is responsible for surfacing and indexing a wide range of security-related events, making it a rich source of data security posture insights.
In a typical deployment, Immuta Detect efficiently surfaces and processes a vast number of data security events. While these events all have security relevance, it may be challenging to understand their potential impacts without manual investigation. At the same time, the sheer volume of events typically greatly exceeds what a team can manually explore.
Enter Immuta Discover: Immuta’s data discovery and security analysis engine can identify, categorize, and classify data. Immuta Discover analyzes data available within the operational context of an event in conjunction with applicable legal, regulatory, compliance, and security frameworks to make deep inferences about the status of the data. For example, in a medical context, Immuta Discover can understand the difference between anonymized and identified medical data.
With additional classification metadata powered by Immuta Discover, Immuta Detect analyzes data security events for sensitivity, ensuring that highly significant events remain highly visible. In the context of the previous example, Immuta Detect can detect and flag the accidental identification of anonymized medical data.
With Discover, Immuta Detect can provide insightful oversight of who accesses sensitive data, where it is stored, and how it is used, enabling
Rapid and exact compliance monitoring and assessment
Insights into data usage patterns for setting data access policy
Simplified and expedient audit responses
Context-aware analysis of data flows as seen through the lens of security or regulatory compliance frameworks
Immuta Discover works in three phases: identification, categorization, and classification.
In the first phase, data is identified by its kind – for example, a name or an age. This identification can be manually performed, externally provided, or automatically determined by Immuta Discover through column-level analysis. This is commonly termed entity identification.
In the second phase, data is categorized in the context where it appears, subject to any active data compliance or security frameworks. For example, a record occurring in a clinical context containing both a name and individual health entities is Protected Health Information under HIPAA.
Though entirely customizable, for this purpose Immuta provides a default framework known as the Immuta Data Security Framework (Immuta DSF). Immuta DSF gives a fine-grained categorization into a consistent set of security and compliance concepts, such as whether a record pertains to an individual, the composition and kinds of any identifiers present, the subject matter of the data, and whether it belongs to any commonly controlled data categories. The rules of the framework use the entities found in the table in phase 1 to drive how the data is categorized.
The categorization provided by the Immuta DSF may be used directly. Still, it is best leveraged as a starting point for purpose-built compliance frameworks implementing organization-specific compliance categories or other relevant high-level regulatory and compliance frameworks, such as those for categorizing data into categories defined under CCPA, GDPR, GLBA, HIPAA, etc.
Bottom line, think of categorization as a way to apply higher level categories to the fine-grained entities discovered in phase 1 through rules you can customize. These categories are presented as tags in Immuta, just like the entities in phase 1, and thus, can be used for Immuta Secure policies.
In the third and final phase, data is classified according to its sensitivity level (e.g., Customer Financial Data is Highly Sensitive). Again, Immuta supplies sensitivity classification defaults based on standard industry practice. Just as categories are built from phase 1 entities, classification builds on the phase 2 categories. Customers are free to customize this classification to reflect their own standards. These classifications are key to surfacing sensitive queries in Detect based on your definition of sensitive.
There are good reasons to automate data discovery and analysis with Immuta Discover:
It formalizes the entire process, producing a coherent set of classification rules.
It makes it possible to automatically and uniformly scale compliance to new data sources.
It enables Immuta Detect to automatically detect additional threats, such as unauthorized or attempted access to sensitive data, and for soft enforcement of organizational data access policies. (For example, that access to personal information, direct identifiers, or login credentials be masked.)
Speed. Automating data discovery and analysis with Immuta Discover enables faster access to data by removing the manual effort of tagging and classifying new tables and columns.
Be aware that
Some customization may be necessary. Although Immuta's sensitive data discovery discovers over 60 types of sensitive data, only some data elements may be relevant. Further, unique sensitive data elements may not be covered out of the box. In these cases, it is possible to create new sensitive data discovery identifiers to ensure data is properly discovered and tagged.
New global templates should be created to find only entities that are relevant to the organization. This will ensure extraneous tags are not added to data elements.
Some customers may already have an existing data catalog tagging data; Immuta’s sensitive data discovery can work in combination with the data catalog.
Because data environments are not static, it is imperative that data tagging is automatically performed with new or changed data so that policies can be enabled in real-time, lowering the risk of data leaks.
Many organizations have invested in an enterprise data catalog as part of their data governance programs. Entity tags from the data catalogs will be pulled into Immuta in a one-way sync because the catalog is the system of record for entity tags. The tags pulled in from the data catalog can later be mapped to categories in the same way that entities automatically discovered in phase 1 are mapped to categories. This in turn will associate the appropriate sensitivity via classification to the external tags.
For a concrete example, consider a scenario where the Collibra catalog has tags for Longitude and Latitude. The following rule assigns Immuta DSF.Longitude to any column tagged Collibra.Location.Longitude. These rules appear in the rules array in the framework definition.
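The rule itself was not reproduced in this copy. As a sketch only — the field names below are illustrative assumptions, not Immuta's documented framework schema — such an entry in the rules array might look like:

```json
{
  "rules": [
    {
      "name": "Map Collibra longitude tag to Immuta DSF",
      "tag": "Immuta DSF.Longitude",
      "condition": {
        "columnTag": {
          "name": "Collibra.Location.Longitude",
          "source": "collibra"
        }
      }
    }
  ]
}
```

The source field is what distinguishes an external catalog tag from a built-in one; its correct value can be confirmed with the tags API, as described below.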
Incorporating tags from external catalogs into rules is fairly straightforward: external tags are referenced in rules just like built-in tags, except that the source field identifies the external catalog. The correct value for the source field varies by external catalog system and can be identified by examining tag objects listed with the tags API, which includes the source field.
Immuta is not just a location to define your policy logic; Immuta also enforces that logic in your data platform. How that occurs varies based on each data platform, but the overall architecture remains consistent and follows the NIST Zero Trust framework. The below diagram describes the recommended architecture from NIST:
Immuta lives in the middle control plane. To do this, Immuta knows details about the subjects and enterprise resources, acts as the policy decision point through policies administered by policy administrators, and makes real-time policy decisions using the internal Immuta policy engine.
Lastly, and of importance to how Immuta Secure functions, Immuta also enables the policy enforcement point by administering the policies natively in your data platform in a way that can react to policy changes and live queries.
To use Immuta, you must configure the Immuta native integration, which will require some level of privileged access to administer policies in your data platform, depending on your data platform and how the Immuta integration works. Refer to Snowflake roles best practices for Snowflake before configuring the native integration.
Intermingling your pre-existing roles in Snowflake with Immuta can be confusing at first. This guide outlines some best practices on how to think about roles in each platform.
Roles are central in Snowflake: they organize and control access to data, platform permissions, and data warehouses. Immuta also leverages Snowflake roles to grant users permission to read data based on subscription policies.
Users who consume data (directly in Snowflake or through other applications) need roles to access objects. But roles are also used to control write access, warehouse usage, and Snowflake permissions through system-defined roles.
To manage this at scale, Immuta recommends taking a 4-layer approach, where you separate the different permissions into different roles:
Roles for read access (Immuta managed)
Roles for write access (optional, soon supported by Immuta)
Roles for warehouse, internal billing
Roles for Snowflake permissions (optional)
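As an illustrative sketch of the four layers (role, schema, warehouse, and user names are hypothetical, and the read-access layer is created and managed by Immuta itself, not by hand):

```sql
-- Layer 1: read access — roles created and granted by Immuta via
-- subscription policies and table grants; nothing to manage manually.

-- Layer 2: write access (optional), scoped to a schema
CREATE ROLE SALES_WRITE;
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA ANALYTICS.SALES TO ROLE SALES_WRITE;

-- Layer 3: warehouse access, one role per team/domain/cost center
CREATE ROLE SALES_WH_ROLE;
GRANT USAGE ON WAREHOUSE SALES_WH TO ROLE SALES_WH_ROLE;

-- Layer 4: Snowflake permissions via system-defined roles (admins only)
GRANT ROLE SYSADMIN TO USER platform_admin;
```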
Since Immuta leverages Snowflake roles, you can still use existing roles in Snowflake. This means you can gradually migrate to an Immuta-protected Snowflake.
Warehouses are granted to users to give them access to computing resources. Since this is directly tied to Snowflake’s consumption model, warehouses are typically linked to cost centers for (internal) billing purposes. Immuta recommends creating a role/warehouse per team/domain/cost center and granting this warehouse role to users using identity manager groups.
Snowflake permissions are granted through system-defined roles like ACCOUNTADMIN or SECURITYADMIN. These are high-privilege roles that are only granted to administrators. This can be done manually or using AD groups.
Snowflake allows users to activate a specific role per session, but users can also use all of their granted roles simultaneously. Immuta recommends using all roles, since that preserves the separation between the different role layers.
Alternatively, you could create personal roles and grant the warehouse role and Immuta-managed read role, and possibly the Snowflake-permission role and write role, to each personal role.
Immuta has two types of service accounts to connect to Snowflake:
Data ownership role: This role is used to register data sources. A service account/role is recommended so that when a user moves or leaves the organization, Immuta still has the proper credentials to connect to Snowflake. You can follow one of two best practices:
A central role for registration (recommended): It is recommended that you create a service role/user with USAGE permissions for all objects in Snowflake. This allows Immuta to register all the objects from Snowflake, populate the Immuta catalog, and scan the objects for sensitive data using Immuta Discover. Immuta will not apply policy directly by default, so no existing access will be impacted.
Read access is managed by Immuta. By using table grants, data access can be controlled down to the table level. Table grants help you scale compared to RBAC, where access control is typically done at the schema or database level.
Write access is typically granted on a schema or database, which makes it easy to manage in Snowflake through manual grants. We recommend creating roles that grant insert, update, and delete permissions on a specific schema or database and attaching each role to a user, either manually or through your identity manager groups. Note that Immuta is working toward supporting write policies, so this will not need to be managed separately for long.
The ability to use all roles simultaneously is called 'secondary roles' and can be enabled using the following command in Snowflake: USE SECONDARY ROLES ALL
A role per team/domain (alternative): Alternatively, if you cannot create a role with USAGE permissions for all objects, you can allow the different domains or teams in the organization to use a service user/role scoped to their data to register data sources. This delegates metadata registration, aligns well with data mesh-type use cases, and means every team is responsible for registering their data sets in Immuta.
Policy role: This role gives Immuta the power to create and apply policy. Immuta can create this role automatically, or you can create it manually.
Public preview: This feature is public preview and available to all accounts.
Requirements:
Immuta permission AUDIT
If you will use the Immuta CLI instead of the GraphQL API, install and configure the Immuta CLI (v1.4.0 or newer).
Before Immuta can export audit events to your Azure Data Lake Storage (ADLS) Gen2 storage account, you need to create a shared access signature (SAS) token that allows the Immuta audit service to add audit logs to your specified ADLS storage account and file system.
Follow the Azure documentation to create the following in Azure:
An ADLS Gen2 storage account with the following settings required for audit export:
Enable hierarchical namespace
Standard performance is adequate, but premium may be used
A shared access signature (SAS) for your dedicated container with at least the following permissions at the storage account or container level:
Create
Write
Save the SAS token to use in the next steps. Do not navigate away from the SAS page unless you have saved the token.
Configure the audit export to ADLS using the Immuta CLI or GraphQL API with the following fields:
interval: The interval at which audit logs will be exported to your ADLS storage. They can be sent at 2-, 4-, 6-, 12-, or 24-hour intervals.
storage account: The name of the storage account you created that your audit logs will be sent to.
file system: The name of the file system (or container) you created that your audit logs will be written to.
path: The name of the path in the file system. This will be a new folder or directory in the container where Immuta will send your audit logs for storage.
SAS token: The previously generated SAS token.
Run the following command with the above fields in a JSON file:
Example ./your-exportConfig.json file:
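The example configuration file was not preserved in this copy. A hypothetical sketch of what ./your-exportConfig.json might contain is below — the exact field names and interval values are assumptions and should be checked against the audit CLI reference guide:

```json
{
  "interval": "every6Hours",
  "storageAccount": "yourstorageaccount",
  "fileSystem": "your-container",
  "path": "immuta-audit-logs",
  "sasToken": "sv=2022-11-02&ss=b&srt=co&sp=cw&..."
}
```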
For additional CLI commands, see the audit CLI reference guide.
Run the following mutation against https://your-immuta.com/api/audit/graphql with the above fields passed directly:
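The mutation body was not preserved in this copy. A hypothetical sketch is below — the mutation and argument names are illustrative assumptions, not Immuta's documented schema, and should be checked against the GraphQL API reference guide:

```graphql
# Illustrative only: names are assumed, not Immuta's documented schema.
mutation {
  createAuditExportConfiguration(
    interval: EVERY_6_HOURS
    storageAccount: "yourstorageaccount"
    fileSystem: "your-container"
    path: "immuta-audit-logs"
    sasToken: "sv=2022-11-02&ss=b&..."
  ) {
    id
  }
}
```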
Example response
For additional GraphQL API commands, see the GraphQL API reference guide.
Requirement:
Snowflake Enterprise Edition or higher
Native SDD and classification frameworks enabled in Immuta. If you do not know whether they are enabled, collaborate with your Immuta representative to turn them on in your Immuta tenant.
Users and Data Sources have been registered in Immuta:
Snowflake tables registered as Immuta data sources
Snowflake users registered in Immuta
Currently, Detect only supports filtering by tag and showing sensitivity of audit records for Snowflake.
This onboarding process is recommended for organizations that have not tagged any sensitive data yet. Immuta will identify, classify, and tag your data. After you are fully onboarded, you will see Detect dashboards with information on your organization's data use and data sensitivity, and the Discover data inventory dashboard will show details about the data that was scanned.
After you are happy with the Detect dashboards on the select data sources you enabled, you can integrate Detect with more of your data environment.
This guide is for anyone ready to begin their journey with Immuta who isn't using Snowflake.
If you are using Snowflake with Immuta, see the Monitor and secure sensitive data platform query activity use case.
This use case provides the basics for setting up Immuta and helps you understand the logic and decisions behind the tasks.
Follow these steps to configure Immuta:
Configure your users in Immuta, using the user identity best practices.
Read the native integration architecture overview and connect Immuta to your databases. Consider the Databricks best practices if using Databricks.
Register data sources in order to start using Immuta features on them.
This guide outlines best practices for managing user identities in Immuta with your identity manager, such as Active Directory and Okta.
Reusing information you have already captured today is a good practice. A lot of information about users is in your identity manager platform and can be used in Immuta for user onboarding and policies.
All users protected by Immuta must be registered in Immuta, even if they never log in to Immuta.
SAML is commonly used as a single sign-on mechanism for users to log in to Immuta. This means you can use your organization's SSO, which complies with your security standards.
Every user protected by Immuta needs an account on the platform for policy enforcement, regardless of whether they log in to Immuta. SCIM should be used to automatically provision users from your identity manager to Immuta. The advantage is that end-users do not all need to log in to Immuta to create their accounts, and updates in your identity manager are automatically reflected in Immuta, in turn updating access in your platforms.
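For reference, SCIM 2.0 (RFC 7643/7644) is a standard protocol, so the payload an identity manager sends when provisioning a user looks roughly like this minimal example (the endpoint path and attribute mapping on Immuta's side may differ):

```json
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "jane.doe@example.com",
  "name": { "givenName": "Jane", "familyName": "Doe" },
  "active": true
}
```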
Details on how to configure your individual identity manager's protocols can be found here:
Immuta Detect provides value from the moment the dashboards are visible, which can be enabled for organizations with Snowflake, Databricks Spark, and Databricks Unity Catalog integrations. Currently, organizations with Snowflake integrations can get even more value with data sensitivity and tagging. To determine and surface the sensitivity of your data access, enable and tune classification.
Completing all the steps below will fully onboard you with Detect and Discover:
Prerequisites:
The onboarding process assumes that these prerequisites have already been set up, but here are the Immuta features and configuration required to enable Detect. Each integration can be used alone or a Snowflake integration can be used with either Databricks Spark or Databricks Unity Catalog. Databricks Spark and Databricks Unity Catalog are not supported together with Detect:
For Snowflake integrations:
This feature can be enabled when first configuring the integration or when editing the integration.
Table grants: While not required, it is recommended to enable this feature to properly audit unauthorized query events. Without it, unauthorized events will still show as successful. Project workspaces cannot be used with table grants, so if your organization relies on them, leave this feature disabled.
Benefits and limitations of enabling table grants
With table grants enabled:
Unauthorized query events will be audited and present in the Detect dashboards.
Table grants will manage the privileges in Snowflake for Immuta tables, making privilege management more efficient.
Without table grants:
Unauthorized events are unavailable because users will have successful queries of zero rows, even if they do not have access to the table.
You can use project workspaces. Table grants is not compatible with project workspaces. If your organization depends on that capability, table grants is not recommended.
Snowflake tables and users registered in Immuta: Detect only audits events by users registered in Immuta on tables registered in Immuta. If you do not register the tables and users, their actions will not appear in the audit records or on the Detect dashboards.
For Databricks Spark integrations:
For Databricks Unity Catalog integrations:
Recommended:
This setting is not required for Detect, but can be used for better functionality:
Requirement:
Immuta permission USER_ADMIN
Actions:
To see sensitivity information using a Snowflake integration, proceed with the steps below.
Only available with Snowflake integrations: Discover classification is supported with Databricks and Snowflake integrations; however, the sensitivity can only be visualized in Detect dashboards with Snowflake integrations.
There are two options to tag data and activate classification frameworks to determine the sensitivity of your data:
After completing either of the tutorials above, data sources are tagged with entity tags and classification tags. Once users start querying data, and after the data latency with Snowflake, the Detect dashboards will show audit information with sensitivity information and the Discover data inventory dashboard will show details about the data that was scanned.
If you notice some sensitivity types are not appearing as you expect, proceed with the step below.
Only available with Snowflake integrations: Discover classification is supported with Databricks and Snowflake integrations; however, the sensitivity can only be visualized in Detect dashboards with Snowflake integrations.
Requirement:
Immuta permissions AUDIT and GOVERNANCE
Actions:
After Discover has run SDD and the classification frameworks, it may be necessary to adjust the resulting tags based on your organization's data, security, and compliance needs:
After completing the tutorials above, all data appears as the appropriate sensitivity type on the Detect dashboards with Snowflake data sources.
Detect supports the following integration for activity pages with dynamic query sensitivity that will determine and visualize the sensitivity of user queries:
Detect supports the following integrations for activity pages, but will not visualize any sensitivity:
Databricks Spark
Check your data source tags
Navigate to the data source dictionary page.
Confirm one of the following tags is applied to one of the queried data columns:
RAF.Confidentiality.Very High
RAF.Confidentiality.High
RAF.Confidentiality.Medium
Detect uses the sensitivity scores associated with these tags to classify a query's sensitivity. When the queried columns have these tags and the associated classification rules in RAF or Data Security Framework (DSF) are enabled at the time of audit query processing, the query event will indicate the proper classification.
If there are no RAF tags applied, check if there are any DSF or Discovered tags applied. These tags are necessary for RAF tags to be applied.
Activate the frameworks
If you do not see any RAF tags, ensure the Data Security Framework and Risk Assessment Framework are active:
Navigate to the classification frameworks page.
Check the status of the Data Security Framework and the Risk Assessment Framework.
Run sensitive data discovery