This guide is for users who wish to understand their data estate and where there may be security gaps or non-compliant user query activity that needs to be addressed. It also contains details for configuring Immuta.
This use case is tailored to quickly get you monitoring queries in your data platform and understanding where you may have security gaps using Immuta Discover and Immuta Detect. If you are not using Snowflake, move to the General Immuta configuration use case instead, because filtering by tags and sensitivity in Immuta Detect is currently only available on Snowflake.
As part of this use case, you will learn special considerations and configurations for setting up Immuta for Immuta Detect. Upon completion, you will understand existing security gaps.
Follow these steps to configure Immuta and start using Detect:
Ensure you have the Immuta software available to you. For the best experience, follow the steps below on Immuta SaaS because of the many SaaS benefits.
Configure your users in Immuta, using the user identity best practices in order to review and summarize user activity and plan your first policy.
Read the native integration architecture overview and connect Immuta to your database. Consider the Snowflake roles best practices.
Register data sources in order to review and summarize data activity and plan your first policy.
Start Using Immuta Detect. To get the most out of it, consider populating sensitivity using Automate entity and sensitivity discovery (SDD) and then configure Detect with SDD.
This guide outlines best practices for managing user identities in Immuta with your identity manager, such as Active Directory and Okta.
Reusing information you have already captured today is a good practice. A lot of information about users is in your identity manager platform and can be used in Immuta for user onboarding and policies.
All users protected by Immuta must be registered in Immuta, even if they never log in to Immuta.
SAML is commonly used as a single sign-on mechanism for users to log in to Immuta. This means you can use your organization's SSO, which complies with your security standards.
Every user who will be protected by Immuta needs a user account on the platform so that policy can be enforced, regardless of whether they log in to Immuta. SCIM should be used to provision users from your identity manager platforms to Immuta automatically. The advantage is that end users do not need to log in to Immuta to create their accounts, and updates in your identity manager are automatically reflected in Immuta, which in turn updates access in your platforms.
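As a rough illustration of what SCIM provisioning carries, the sketch below builds a minimal SCIM 2.0 core User resource (as defined in RFC 7643) in Python. The `build_scim_user` helper and the example values are hypothetical; the actual attribute mapping and endpoint depend on your identity manager and your Immuta configuration and are not shown here.

```python
import json

# Schema URN defined by the SCIM 2.0 core schema (RFC 7643).
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def build_scim_user(user_name, given_name, family_name, email, active=True):
    """Build a minimal SCIM 2.0 User resource as a dict (hypothetical helper)."""
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        "active": active,
    }


# Example values are made up for illustration only.
payload = build_scim_user("jdoe@example.com", "Jane", "Doe", "jdoe@example.com")
print(json.dumps(payload, indent=2))
```

An identity manager typically sends a resource like this when a user is created, and a later update (for example, setting `active` to false) is what lets access change without the user ever logging in.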
Details on how to configure your individual identity manager's protocols can be found here:
There are several different combinations of supported protocol configurations, so consider those as you plan your user synchronization.
In Immuta, permissions control what actions a user is allowed to take through the API and UI. The different permissions can be found in the Immuta permissions guide.
We recommend using identity manager groups to manage permissions. When you configure the identity manager integration, you can enable group permissions. This allows you to control the permissions via identity manager groups and use the group assignment and approval process currently in place.
Intermingling your pre-existing roles in Snowflake with Immuta can be confusing at first. This guide outlines some best practices on how to think about roles in each platform.
Roles play a crucial role in Snowflake by organizing and controlling access to data, platform permissions, and data warehouses. Immuta also leverages Snowflake roles to grant users permission to read data based on subscription policies.
Users who consume data (directly in Snowflake or through other applications) need roles to access objects. But roles are also used to control write access, access to Snowflake warehouses, and Snowflake permissions through system-defined roles.
To manage this at scale, Immuta recommends taking a 4-layer approach, where you separate the different permissions into different roles:
Roles for read access (Immuta managed)
Roles for write access (optional, soon supported by Immuta)
Roles for warehouse, internal billing
Roles for Snowflake permissions (optional)
Read access is managed by Immuta. By using subscription policies, data access can be controlled down to the table level. Attribute-based table GRANTS help you scale compared to RBAC, where access control is typically done on a schema or database level.
Since Immuta leverages Snowflake roles, you can still use existing roles in Snowflake. This means you can gradually migrate to an Immuta-protected Snowflake.
Write access is typically granted on a schema or database. This makes it easy to manage in Snowflake through manual grants. We recommend creating roles that give insert, update, and delete permissions to a specific schema or database and attach this role to a user. This attachment can be done manually or using your identity manager groups. (See the Snowflake documentation for details.) Note that Immuta is working towards supporting write policies, so this will not need to be separately managed for long.
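A minimal sketch of the write-role pattern, assuming a schema-scoped role: the Python helper below only generates the Snowflake statements so they can be reviewed before anyone runs them. The helper name and the role, database, schema, and user names are hypothetical.

```python
def write_role_statements(role, database, schema, user):
    """Return Snowflake SQL for a schema-scoped write role granted to one user."""
    target = f"{database}.{schema}"
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        # USAGE on the containers is needed before table privileges are useful.
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {target} TO ROLE {role};",
        f"GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA {target} TO ROLE {role};",
        f"GRANT ROLE {role} TO USER {user};",
    ]


# Hypothetical names, for illustration only.
for statement in write_role_statements("SALES_WRITE", "ANALYTICS", "SALES", "JDOE"):
    print(statement)
```

If you attach roles through identity manager groups instead, the final `GRANT ROLE ... TO USER` statement would be handled by that process rather than run by hand.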
Warehouses are granted to users to give them access to computing resources. Since this is directly tied to Snowflake’s consumption model, warehouses are typically linked to cost centers for (internal) billing purposes. Immuta recommends creating a role/warehouse per team/domain/cost center and granting this warehouse role to users using identity manager groups.
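The one-role-per-warehouse recommendation could be sketched as follows. The team names, warehouse names, and the `_WH_ROLE` naming convention are assumptions made for this example; granting the resulting roles to users is left to your identity manager groups.

```python
def warehouse_role_statements(team, warehouse):
    """Return Snowflake SQL that creates one warehouse role for a team or cost center."""
    role = f"{team}_WH_ROLE"  # naming convention is an assumption of this sketch
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
    ]


# Hypothetical mapping of cost centers to warehouses.
teams = {"FINANCE": "FINANCE_WH", "MARKETING": "MARKETING_WH"}
for team, warehouse in teams.items():
    for statement in warehouse_role_statements(team, warehouse):
        print(statement)
```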
Snowflake permissions are granted through system-defined roles like ACCOUNTADMIN or SECURITYADMIN. These are high-privilege roles that are only granted to administrators. This can be done manually or using AD groups.
Snowflake allows users to select a specific role, but you can also use all roles simultaneously. Immuta recommends using all roles since that helps to separate the different roles.
This feature is called ‘secondary roles’ and can be enabled using the following command in Snowflake: USE SECONDARY ROLES ALL
Alternatively, you could create personal roles and grant the warehouse-role/immuta-read-role and possibly the snowflake-permission-role and write-role to each personal role.
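The personal-role alternative can be sketched as a small generator of Snowflake statements. The `_PERSONAL` suffix and every role and user name below are hypothetical; in particular, the name of the Immuta-managed read role in your account will differ.

```python
def personal_role_statements(user, layer_roles):
    """Return Snowflake SQL that bundles a user's layer roles into one personal role."""
    personal = f"{user}_PERSONAL"  # naming convention is an assumption of this sketch
    statements = [f"CREATE ROLE IF NOT EXISTS {personal};"]
    # Granting a role to a role makes the personal role inherit its privileges.
    statements += [f"GRANT ROLE {role} TO ROLE {personal};" for role in layer_roles]
    statements.append(f"GRANT ROLE {personal} TO USER {user};")
    return statements


# Hypothetical role names standing in for the read, warehouse, and write layers.
layers = ["IMMUTA_READ_JDOE", "FINANCE_WH_ROLE", "SALES_WRITE"]
for statement in personal_role_statements("JDOE", layers):
    print(statement)
```

With secondary roles enabled through `USE SECONDARY ROLES ALL`, this bundling step is unnecessary because all of the user's granted roles are active at once.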
Immuta has two types of service accounts to connect to Snowflake:
Policy role: This role gives Immuta the power to create and apply policy. Immuta can create this policy role automatically, or you can run the provided bootstrap script manually to create the policy role.
Data ownership role: This role is used to register data sources. A service account/role is recommended so that when the user moves or leaves the organization, Immuta will still have the proper credentials to connect to Snowflake. You can follow one of the two best practices:
A central role for registration (recommended): It is recommended that you create a service role/user with USAGE permissions for all objects in Snowflake. This allows Immuta to register all the objects from Snowflake, populate the Immuta catalog, and scan the objects for sensitive data using Immuta Discover. Immuta will not apply policy directly by default, so no existing access will be impacted.
A role per team/domain (alternative): Alternatively, if you cannot create a role with USAGE permissions for all objects, you can allow the different domains or teams in the organization to use a service user/role scoped to their data to register data sources. This delegates metadata registration, aligns well with data mesh type use cases, and means every team is responsible for registering their data sets in Immuta.
To take advantage of all the capabilities of Immuta, you must make Immuta aware of your data metadata. This is done by registering your data with Immuta as data sources. It’s important to remember that Immuta is not reading your actual data at all; it is simply discovering your information schemas and pulling that information back as the foundation for everything else.
This section offers the best practices when onboarding data sources into Immuta.
If you have an external data catalog, configure the catalog integration first; then register your data in Immuta. This process will automatically tag your data with the external catalog tags as you register it.
Find more on this topic in the guide.
Use Immuta's no default subscription policy setting to onboard metadata without affecting your users' access. This means you can onboard all metadata in Immuta without any impact on current access, which gives you time to fully convert your operations to Immuta without causing unnecessary data downtime. Immuta will only take control when the first policies are applied. Because of this, register all tables.
While it can be tempting to start small and register only the pieces of data that you intend to protect, you must remember that Immuta is not just about access control. It’s important to register your data metadata so that Immuta can also track activity and understand where that sensitive data lies (with Immuta Detect). In other words, Immuta can’t tell you where you have problems unless you first tell it to look at your metadata.
Without the no default subscription policy, Immuta will set each data source's subscription policy to the most restrictive option which automatically locks data down during onboarding. To unlock the data and give your users access again, new subscription policies must be set.
If you are delegating the registration and control of data, then read our use case for more information.
Register a schema; then use schema monitoring to find new data sources and automatically register them.
One of the greatest benefits of a modern data platform is that you can manage all your data transformations at the data tier. This means that data is constantly changing in the data platform, which may result in the need for access control changes as well. This is why it is critical that you enable schema monitoring and column detection when registering metadata with Immuta. This will allow Immuta to constantly monitor and update for these changes.
It’s also important to understand that many data engineering tools make changes by destructively recreating tables and views, which results in all policies being dropped in the data platform. This is actually a good thing, because it gives Immuta the chance to update the access as the changes are found (policy uptime) while the only user that can see the data being recreated is the creator of that change (data downtime for all other users). This is why schema monitoring and column detection are so critical.
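Conceptually, schema monitoring comes down to comparing what is registered with what currently exists in the data platform. The sketch below shows that comparison as a plain set difference in Python. It is an illustration only, not Immuta's implementation, and the table names and the `diff_tables` helper are hypothetical.

```python
def diff_tables(registered, current):
    """Compare registered table names with the tables currently in the schema."""
    registered, current = set(registered), set(current)
    return {
        "to_register": sorted(current - registered),  # new tables found in the platform
        "missing": sorted(registered - current),      # registered tables no longer present
    }


# Hypothetical table names, for illustration only.
registered = ["SALES.ORDERS", "SALES.CUSTOMERS"]
current = ["SALES.ORDERS", "SALES.CUSTOMERS", "SALES.RETURNS"]
print(diff_tables(registered, current))
```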
Requirement:
Snowflake Enterprise Edition or higher
Classification frameworks enabled in Immuta. If you do not know whether they are enabled, work with your Immuta representative to turn them on in your Immuta tenant.
:
Users and Data Sources have been registered in Immuta:
Snowflake tables registered as Immuta data sources
Snowflake users registered in Immuta
Currently, Detect only supports filtering by tag and showing sensitivity of audit records for Snowflake.
This onboarding process is recommended for organizations that have not tagged any sensitive data yet. Immuta will identify, classify, and tag your data. After you are fully onboarded, you will see Detect dashboards with information on your organization's data use and data sensitivity.
After you are happy with the Detect dashboards on the select data sources you enabled, you can integrate Detect with more of your data environment.
: SDD will sample and tag your data based on the sensitive data detected. These tags are necessary for the classification framework tags in step 2 to be applied.
: Once you create and activate a framework, it will tag your data with classification tags. Specific classification tags contain the metadata required to assign your data sensitivity levels.
: After SDD and classification frameworks have been enabled and run, it may be necessary to adjust the output tags based on your organization's data, security, and compliance needs.
: Grant the appropriate users the AUDIT permission to view Immuta Detect dashboards.
: Once all tags are correctly applied, the Detect dashboards will reflect accurate audit information. Navigate through Immuta Detect and explore the dashboards that visualize the sensitive data in your data environment.
: If you already had SDD enabled before starting Detect onboarding, skip this step. Once you are satisfied that the SDD tags and classification tags applied to your selected data sources look correct, enable SDD for all data sources. This will add entity and classification tags to the rest of the data sources within your environment. You can choose to run SDD on all data sources, or run another payload with just a select few to gradually onboard the rest of your tables.
SaaS data security platforms are becoming an increasingly popular way to protect data at the speed of the cloud. In this guide, we'll explore seven ways the Immuta SaaS platform provides a reliable and versatile solution to data security complexity. You can skip this overview if you are already happily using Immuta SaaS, but if curious, read on!
Speed is an inherent ability in SaaS offerings because they are meant to be turnkey operations – you should just need to plug and play to leverage SaaS software to solve a specific problem. Immuta’s SaaS offering helps you to get to value fast without worrying about the IT department finding the time and energy to stand the software up. This helps organizations focus more on productive work and increases the organizations’ overall efficiency.
Self-managed solutions require long-term planning to scale operations and are often not the best option for growing businesses, as the IT staff has to constantly struggle in the upgrade loop. This could lead to significant restructuring costs as performance and functional demands increase. Additionally, upgrading or modifying existing systems can become costly due to potential downtime or other expenses associated with transitioning from one platform to another.
By contrast, Immuta’s SaaS platform software offers a more convenient way of optimizing operations across large corporate structures with minimal lead time needed for additional licenses or functionality additions. This allows you to easily scale according to business needs, so you don’t have to worry about how your internal IT team will keep up with future growth.
Organizations today are extremely cautious about avoiding data leaks and data or privacy breaches, and, therefore, need to invest in a powerful data security platform and trustworthy security solutions provider. Immuta’s processes, policies, and management system have been certified under the ISO 27001 and 27701 standards and SOC 2 Type 2 attestation, demonstrating that data security and privacy are important to Immuta.
The Immuta SaaS platform is deployed using industry-leading technologies, security controls, and deployment methodologies. The platform undergoes continuous vulnerability scanning and is penetration tested at least twice annually. Additionally, Immuta’s deploy-as-code methodology ensures that every system meets our baseline requirements before a system can be moved to production. Immuta’s SDLC requires that container images are continually hardened and tested several times a year in addition to our comprehensive penetration tests, ensuring the Immuta SaaS platform is continually evolving to face emerging threats.
In addition to our technical security controls, Immuta strives to minimize the data needed to deliver our services to clients. Only metadata is stored in Immuta’s SaaS platform to make policy enforcement decisions, meaning Immuta is not in the data path between your users and the data sources it protects, nor does it pull any of your data back at all. For the metadata Immuta does need, it is encrypted in transit using TLS 1.2 or newer and at rest using AES256 encryption.
Immuta offers automated backups with encryption at rest, eliminating the need for someone to own or set up the backups manually. This helps us guarantee 99.9% uptime for Immuta’s SaaS service.
One of the biggest benefits SaaS solutions offer is cost savings. Self-managed software, on the other hand, requires a cloud engineer or IT person to ensure that software is running well and upgraded on time. Immuta’s SaaS platform can provide notable savings in several different ways, including eliminating the cost of IT resources for installation. With Immuta, organizations can benefit from both short-term savings (i.e., installation costs) and long-term savings (i.e., operational costs).
SaaS also helps to maximize the return on investment (ROI) in the first year post-purchase by reducing the overall overhead and reaching value faster. To see for yourself how this works, check out our ROI calculator.
Leveraging the Immuta SaaS platform solution will improve availability and reliability by providing always-on data access while complying with data localization and sovereignty regulations. With a guaranteed uptime SLA of 99.9% and 24/7 monitoring by a world-class Site Reliability Engineering team, you’ll get the benefit of verified security and compliance capabilities, along with cost savings. Immuta also provides round-the-clock case support for enterprise customers, ensuring that experts are available to answer any product-related questions or educate users about features and use cases.
New features developed by Immuta’s Engineering team are deployed and available on SaaS first. This allows customers to get access to features faster, without worrying about planning major version upgrades and dealing with the hassle of downtime. With Immuta’s SaaS platform, customers can also access private preview features. SaaS also enables you to have security vulnerability patches as quickly as possible, without requiring self-managed manual upgrades.
Immuta’s SaaS platform eliminates the need to spend time and money on updating software by allowing customers to log on to already upgraded services. Upgrades and maintenance are done off hours with typically no downtime to ensure that Immuta’s capabilities are always available. The ability to monitor the dynamic threat landscape and to deliver patches directly through Immuta allows organizations to use data to focus on business.
Within the Immuta product ecosystem, Immuta Detect is responsible for surfacing and indexing a wide range of security-related events, making it a rich source of data security posture insights.
In a typical deployment, Immuta Detect efficiently surfaces and processes a vast number of data security events. While these events all have security relevance, it may be challenging to understand their potential impacts without manual investigation. At the same time, the sheer volume of events typically greatly exceeds what a team can manually explore.
Enter Immuta Discover: Immuta’s data discovery and security analysis engine can identify, categorize, and classify data. Immuta Discover analyzes data available within the operational context of an event in conjunction with applicable legal, regulatory, compliance, and security frameworks to make deep inferences about the status of the data. For example, in a medical context, Immuta Discover can understand the difference between anonymized and identified medical data.
With additional classification metadata powered by Immuta Discover, Immuta Detect analyzes data security events for sensitivity, ensuring that highly significant events remain highly visible. In the context of the previous example, Immuta Detect can detect and flag the accidental identification of anonymized medical data.
With Discover, Immuta Detect can provide insightful oversight of who accesses sensitive data, where it is stored, and how it is used, enabling
Rapid and exact compliance monitoring and assessment
Insights into data usage patterns for setting data access policy
Simplified and expedient audit responses
Context-aware analysis of data flows as seen through the lens of security or regulatory compliance frameworks
Immuta Discover works in three phases: identification, categorization, and classification.
In the first phase, data is identified by its kind – for example, a name or an age. This identification can be manually performed, externally provided, or automatically determined by Immuta Discover through column-level analysis. This is commonly termed entity identification.
In the second phase, data is categorized in the context where it appears, subject to any active data compliance or security frameworks. For example, a record occurring in a clinical context containing both a name and individual health entities is Protected Health Information under HIPAA.
Use frameworks to implement organization-specific compliance categories or other relevant high-level regulatory and compliance frameworks, such as those for categorizing data into categories defined under CCPA, GDPR, GLBA, HIPAA, etc. Think of categorization as a way to apply higher level categories to the fine-grained entities discovered in phase 1 through rules you can customize. These categories are presented as tags in Immuta, just like the entities in phase 1 and, thus, can be used for Immuta policies.
In the third and final phase, data is classified according to its sensitivity level (e.g., Customer Financial Data is Highly Sensitive). Just as categories are built from phase 1 entities, classification builds on the phase 2 categories. Customize this classification according to your organization's views and needs. These classifications are key to surfacing sensitive queries in Detect based on your definition of sensitive.
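The three phases can be pictured as a small pipeline: entity tags per column, rules that lift entities into categories, and a lookup from category to sensitivity. The Python sketch below is a toy model of that idea only. The tag names, rule contents, and the sensitivity assigned to PHI are assumptions for illustration and do not reflect Immuta's built-in frameworks.

```python
# Phase 2 rules (hypothetical): a category applies when all of its
# required entities are present somewhere in the table.
CATEGORY_RULES = {
    "PHI": {"Person Name", "Health Condition"},
    "Customer Financial Data": {"Person Name", "Account Number"},
}

# Phase 3 lookup (hypothetical): category -> sensitivity level.
SENSITIVITY = {
    "PHI": "Highly Sensitive",  # assumed for this sketch
    "Customer Financial Data": "Highly Sensitive",
}


def categorize(column_entities):
    """Phase 2: derive categories from the phase 1 entity tags on each column."""
    present = set().union(*column_entities.values())
    return sorted(cat for cat, required in CATEGORY_RULES.items() if required <= present)


def classify(categories):
    """Phase 3: map each category to its sensitivity level."""
    return {cat: SENSITIVITY.get(cat, "Unclassified") for cat in categories}


# Phase 1 output (hypothetical): entity tags identified per column.
columns = {
    "patient_name": {"Person Name"},
    "diagnosis": {"Health Condition"},
    "age": {"Age"},
}
categories = categorize(columns)
print(categories, classify(categories))
```

Because each phase only reads the output of the previous one, customizing a later phase (for example, changing what counts as Highly Sensitive) does not require re-identifying entities.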
There are good reasons to automate data discovery and analysis with Immuta Discover:
It formalizes the entire process, producing a coherent set of classification rules.
It makes it possible to automatically and uniformly scale compliance to new data sources.
It enables Immuta Detect to automatically detect additional threats, such as unauthorized or attempted access to sensitive data, and it supports soft enforcement of organizational data access policies (for example, a policy that access to personal information, direct identifiers, or login credentials be masked).
Speed. Automating data discovery and analysis with Immuta Discover enables faster access to data by removing the manual effort of tagging and classifying new tables and columns.
Be aware that
Some customization may be necessary. Although Immuta's sensitive data discovery discovers over 60 types of sensitive data, only some data elements may be relevant. Further, unique sensitive data elements may not be covered out of the box. In these cases, it is possible to create new sensitive data discovery identifiers to ensure data is properly discovered and tagged.
New global identification frameworks should be created to find only entities that are relevant to the organization. This will ensure extraneous tags are not added to data elements.
Some customers may already have an existing data catalog tagging data; Immuta’s sensitive data discovery can work in combination with the data catalog.
Because data environments are not static, it is imperative that data tagging is automatically performed with new or changed data so that policies can be enabled in real-time, lowering the risk of data leaks.
Many organizations have invested in an enterprise data catalog as part of their data governance programs. Entity tags from the data catalogs will be pulled into Immuta in a one-way sync because the catalog is the system of record for entity tags. The tags pulled in from the data catalog can later be mapped to categories in the same way that entities automatically discovered in phase 1 are mapped to categories. This in turn will associate the appropriate sensitivity via classification to the external tags.
For a concrete example, consider a scenario where the Collibra catalog has tags for Longitude and Latitude. The following example framework rule assigns Immuta Framework.Longitude to any column tagged Collibra.Location.Longitude.
Incorporating tags from external catalogs into rules is fairly straightforward. External tags are referenced in rules like any other tag, except that the source field identifies the external catalog. The value of the source field varies depending on the external catalog system. The correct value may be identified by examining the tag objects listed by the tags API, which include the source field.
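To make the role of the source field concrete, here is a toy Python model of a rule that matches on both the tag name and its source. This is not Immuta's framework rule format: the `Tag` and `Rule` classes, their field names, and the `"collibra"` source value are all hypothetical, and only the two tag names are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    name: str
    source: str  # system the tag came from (value here is a placeholder)


@dataclass(frozen=True)
class Rule:
    match_name: str    # tag name to look for on the column
    match_source: str  # the tag must also come from this source
    apply_name: str    # framework tag to assign when the rule matches


def apply_rules(column_tags, rules):
    """Return the framework tag names whose rule matches a tag on the column."""
    return [
        rule.apply_name
        for rule in rules
        for tag in column_tags
        if tag.name == rule.match_name and tag.source == rule.match_source
    ]


rules = [Rule("Collibra.Location.Longitude", "collibra", "Immuta Framework.Longitude")]
column_tags = [Tag("Collibra.Location.Longitude", "collibra")]
print(apply_rules(column_tags, rules))
```

Matching on the source as well as the name keeps a rule from firing on a tag with the same name that came from a different system.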
Immuta is not just a location to define your policy logic; Immuta also enforces that logic in your data platform. How that occurs varies based on each data platform, but the overall architecture remains consistent and follows the NIST Zero Trust framework. The below diagram describes the recommended architecture from NIST:
Immuta lives in the middle control plane. To do this, Immuta knows details about the subjects and enterprise resources, acts as the policy decision point through policies administered by policy administrators, and makes real-time policy decisions using the internal Immuta policy engine.
Lastly, and of importance to how Immuta Secure functions, Immuta also enables the policy enforcement point by administering the policies natively in your data platform in a way that can react to policy changes and live queries.
To use Immuta, you must configure the Immuta native integration, which will require some level of privileged access to administer policies in your data platform, depending on your data platform and how the Immuta integration works. Refer to Snowflake roles best practices for Snowflake before configuring the native integration.
Immuta Detect provides value from the moment the dashboards are visible, which can be enabled for organizations with Snowflake, Databricks Spark, and Databricks Unity Catalog integrations. Currently, organizations with Snowflake integrations can get even more value with data sensitivity and tagging. To determine and surface the sensitivity of your data access, enable and tune classification.
Completing all the steps below will fully onboard you with Detect and Discover:
Prerequisites:
The onboarding process assumes that these prerequisites have already been set up, but here are the Immuta features and configuration required to enable Detect. Each integration can be used alone or a Snowflake integration can be used with either Databricks Spark or Databricks Unity Catalog. Databricks Spark and Databricks Unity Catalog are not supported together with Detect:
For Snowflake integrations:
:
: This feature can be enabled when first configuring the integration or when editing the integration.
Table grants: While not required, it is recommended to enable this feature to properly audit unauthorized query events. Without it, unauthorized events will still show as successful. Project workspaces cannot be used with table grants, so if your organization relies on them, leave this feature disabled.
Snowflake tables and users registered in Immuta: Detect only audits events by users registered in Immuta on tables registered in Immuta. If you do not register the tables and users, their actions will not appear in the audit records or on the Detect dashboards.
Benefits and limitations of enabling table grants
With table grants enabled:
Unauthorized query events will be audited and present in the Detect dashboards.
Table grants will manage the privileges in Snowflake for Immuta tables, making it more efficient than without.
Without table grants:
Unauthorized events are unavailable because users will have successful queries of zero rows, even if they do not have access to the table.
You can use project workspaces. Table grants is not compatible with project workspaces. If your organization depends on that capability, table grants is not recommended.
For Databricks Spark integrations:
For Databricks Unity Catalog integrations:
Recommended:
This setting is not required for Detect, but can be used for better functionality:
Requirement:
Immuta permission USER_ADMIN
Actions:
To see sensitivity information using a Snowflake integration, proceed with the steps below.
Only available with Snowflake integrations: Discover classification is supported with Databricks and Snowflake integrations; however, the sensitivity can only be visualized in Detect dashboards with Snowflake integrations.
There are two options to tag data and activate classification frameworks to determine the sensitivity of your data:
After completing either of the tutorials above, data sources are tagged with entity tags and classification tags. Once users start querying data, and after the data latency with Snowflake, the Detect dashboards will show audit information with sensitivity information.
If you notice some sensitivity types are not appearing as you expect, proceed with the step below.
Only available with Snowflake integrations: Discover classification is supported with Databricks and Snowflake integrations; however, the sensitivity can only be visualized in Detect dashboards with Snowflake integrations.
Requirement:
Immuta permissions AUDIT and GOVERNANCE
Actions:
After Discover has run SDD and the classification frameworks, it may be necessary to adjust the resulting tags based on your organization's data, security, and compliance needs:
After completing the tutorials above, all data appears as the appropriate sensitivity type on the Detect dashboards with Snowflake data sources.
Detect supports the following integration for activity pages with dynamic query sensitivity that will determine and visualize the sensitivity of user queries:
Detect supports the following integrations for activity pages, but will not visualize any sensitivity:
Databricks Spark
To do this, see the guide.
with Note that it is enabled by default when configuring the integration.
: This feature sets the subscription policy of all new data sources to none when they are registered. Using this feature allows organizations to register all Snowflake tables in Immuta. Their audit information will appear in the Detect dashboards, but users' access to them will not be impacted by Immuta until a subscription policy is set.
Grant users the AUDIT permission to see the Detect dashboards.
Navigate through Immuta Detect and explore the dashboards that visualize user and query audit information for your data environment.
These actions will result in users seeing the dashboards containing information on the audit events in your data environment. These dashboards will not contain any information on the sensitivity of your data.
: This option is the smoothest onboarding experience because it is the most automated process. You will not need to manually tag your data, and the framework to determine sensitivity is already set to use the SDD tags.
: This option requires more manual configuration, but is best for organizations that have already configured tags for their tables. Contact your Immuta representative for guidance.
Detect activity pages will have active charts when configured correctly with supported integrations after audit logs have been ingested. The user viewing them must have the AUDIT permission.
Snowflake with
Databricks Unity Catalog with
See the for more information on the required configuration for each integration.
Query events sensitivity is determined by the tags with sensitivity metadata on the columns queried from Snowflake data sources. You must have a classification framework active with classification tags with the sensitivity metadata to see sensitivity in the Detect dashboards. Ensure you have completed the .