This guide is for anyone ready to begin their journey with Immuta who isn't using Snowflake. Specifically, it provides details for configuring Immuta that must be accomplished before moving on to the Secure your data use cases.
If you are using Snowflake with Immuta, see the Monitor and secure sensitive data platform query activity use case.
This use case provides the basics for setting up Immuta and helps you understand the logic and decisions behind the tasks.
Follow these steps to configure Immuta:
Ensure you have the Immuta software available to you. For the best experience, follow the steps below on Immuta SaaS because of the many SaaS benefits.
Configure your users in Immuta, using the user identity best practices.
Read the native integration architecture overview and connect Immuta to your databases. Consider the Databricks roles best practices if using Databricks.
Register data sources in order to start using Immuta features on the data sources.
SaaS data security platforms are becoming an increasingly popular way to protect data at the speed of the cloud. In this guide, we'll explore seven ways the Immuta SaaS platform provides a reliable and versatile solution to data security complexity. You can skip this overview if you are already happily using Immuta SaaS, but if curious, read on!
Speed is an inherent ability in SaaS offerings because they are meant to be turnkey operations – you should just need to plug and play to leverage SaaS software to solve a specific problem. Immuta’s SaaS offering helps you to get to value fast without worrying about the IT department finding the time and energy to stand the software up. This helps organizations focus more on productive work and increases the organizations’ overall efficiency.
Self-managed solutions require long-term planning to scale operations and are often not the best option for growing businesses, as the IT staff has to constantly struggle in the upgrade loop. This could lead to significant restructuring costs as performance and functional demands increase. Additionally, upgrading or modifying existing systems can become costly due to potential downtime or other expenses associated with transitioning from one platform to another.
By contrast, Immuta’s SaaS platform software offers a more convenient way of optimizing operations across large corporate structures with minimal lead time needed for additional licenses or functionality additions. This allows you to easily scale according to business needs, so you don’t have to worry about how your internal IT team will keep up with future growth.
Organizations today are extremely cautious about avoiding data leaks and data or privacy breaches, and, therefore, need to invest in a powerful data security platform and trustworthy security solutions provider. Immuta’s processes, policies, and management system have been certified under the ISO 27001 and 27701 standards and SOC 2 Type 2 attestation, demonstrating that data security and privacy are important to Immuta.
The Immuta SaaS platform is deployed using industry-leading technologies, security controls, and deployment methodologies. The platform undergoes continuous vulnerability scanning and is penetration tested at least twice annually. Additionally, Immuta’s deploy-as-code methodology ensures that every system meets our baseline requirements before a system can be moved to production. Immuta’s SDLC requires that container images are continually hardened and tested several times a year in addition to our comprehensive penetration tests, ensuring the Immuta SaaS platform is continually evolving to face emerging threats.
In addition to our technical security controls, Immuta strives to minimize the data needed to deliver our services to clients. Only metadata is stored in Immuta’s SaaS platform to make policy enforcement decisions, meaning Immuta is not in the data path between your users and the data sources it protects, nor does it pull any of your data back. The metadata Immuta does need is encrypted in transit using TLS 1.2 or newer and at rest using AES-256 encryption.
Immuta offers automated backups with encryption at rest, eliminating the need for someone to own or set up the backups manually. This helps us guarantee 99.9% uptime for Immuta’s SaaS service.
One of the biggest benefits SaaS solutions offer is cost savings. Self-managed software requires a cloud engineer or IT person to ensure that the software is running well and upgraded on time. Immuta’s SaaS platform can provide notable savings in several different ways, including eliminating the cost of IT resources for installation. With Immuta, organizations can benefit from both short-term savings (such as installation costs) and long-term savings (such as operational costs).
SaaS also helps to maximize the return on investment (ROI) in the first year post-purchase by reducing the overall overhead and reaching value faster. To see for yourself how this works, check out our ROI calculator.
Leveraging the Immuta SaaS platform solution will improve availability and reliability by providing always-on data access while complying with data localization and sovereignty regulations. With a guaranteed uptime SLA of 99.9% and 24/7 monitoring by a world-class Site Reliability Engineering team, you’ll get the benefit of verified security and compliance capabilities, along with cost savings. Immuta also provides round-the-clock case support for enterprise customers, ensuring that experts are available to answer any product-related questions or educate users about features and use cases.
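To make the 99.9% uptime figure concrete, the following minimal sketch converts an uptime percentage into a downtime budget. The measurement windows used here (a 365-day year and a 30-day month) are assumptions for illustration; the actual SLA terms define how availability is measured.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year (assumed window)


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget, in hours, implied by an uptime percentage."""
    return period_hours * (100.0 - uptime_percent) / 100.0


yearly_hours = allowed_downtime_hours(99.9)
monthly_minutes = allowed_downtime_hours(99.9, period_hours=30 * 24) * 60

print(f"{yearly_hours:.2f} hours per year")
print(f"{monthly_minutes:.1f} minutes per 30-day month")
```

Under these assumptions, 99.9% uptime leaves a budget of roughly 8.76 hours per year, or about 43 minutes per 30-day month.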
New features developed by Immuta’s Engineering team are deployed and available on SaaS first. This allows customers to get access to features faster, without worrying about planning major version upgrades and dealing with the hassle of downtime. With Immuta’s SaaS platform, customers can also access private preview features. SaaS also enables you to have security vulnerability patches as quickly as possible, without requiring self-managed manual upgrades.
Immuta’s SaaS platform eliminates the need to spend time and money on updating software by allowing customers to log on to already upgraded services. Upgrades and maintenance are done off hours, typically with no downtime, to ensure that Immuta’s capabilities are always available. Because Immuta monitors the changing threat landscape and delivers patches directly, organizations can focus on using their data for business.
This guide outlines best practices for managing user identities in Immuta with your identity manager, such as Active Directory and Okta.
Reusing information you have already captured today is a good practice. A lot of information about users is in your identity manager platform and can be used in Immuta for user onboarding and policies.
All users protected by Immuta must be registered in Immuta, even if they never log in to Immuta.
SAML is commonly used as a single sign-on mechanism for users to log in to Immuta. This means you can use your organization's SSO, which complies with your security standards.
Every user who will be protected by Immuta needs a user account on the platform so that policy can be enforced, regardless of whether they log in to Immuta. Use SCIM to provision users from your identity manager to Immuta automatically. The advantage is that end users do not need to log in to Immuta to create their accounts, and updates in your identity manager are automatically reflected in Immuta, which in turn updates access in your data platforms.
Details on how to configure your individual identity manager's protocols can be found here:
There are several different combinations of supported protocol configurations, so consider those as you plan your user synchronization.
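As an illustration of what SCIM provisioning sends, here is a minimal sketch that builds a user resource using the SCIM 2.0 core schema (RFC 7643). It assumes your identity manager speaks SCIM 2.0; the user values are made up, and which attributes Immuta actually consumes is not covered in this guide, so treat the field selection as an assumption.

```python
import json

# Core user schema URN defined by the SCIM 2.0 standard (RFC 7643).
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def scim_user(user_name, given_name, family_name, email, active=True):
    """Build a SCIM 2.0 user resource as a plain dictionary."""
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        # Setting active to False is how an identity manager typically
        # signals that the user has been deactivated.
        "active": active,
    }


# Hypothetical user, for illustration only.
example_user = scim_user("jdoe@example.com", "Jane", "Doe", "jdoe@example.com")
print(json.dumps(example_user, indent=2))
```

In practice your identity manager generates and sends these resources for you; the sketch only shows the shape of the data that keeps Immuta in sync.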
In Immuta, permissions control what actions a user is allowed to take through the API and UI. The different permissions can be found in the Immuta permissions guide.
We recommend using identity manager groups to manage permissions. When you configure the identity manager integration, you can enable group permissions. This allows you to control the permissions via identity manager groups and use the group assignment and approval process currently in place.
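The idea behind group permissions can be sketched as a simple lookup: a user's permissions are the union of the permissions attached to their identity manager groups. The group names and the mapping below are hypothetical, and the permission names are illustrative placeholders; consult the Immuta permissions guide for the real list.

```python
# Hypothetical mapping from identity manager groups to permissions.
# Both the group names and the permission names are illustrative.
GROUP_PERMISSIONS = {
    "data-governors": {"GOVERNANCE"},
    "data-owners": {"CREATE_DATA_SOURCE"},
    "auditors": {"AUDIT"},
}


def effective_permissions(user_groups):
    """Return the sorted union of permissions granted by a user's groups."""
    permissions = set()
    for group in user_groups:
        # Groups without an entry contribute no permissions.
        permissions |= GROUP_PERMISSIONS.get(group, set())
    return sorted(permissions)


print(effective_permissions(["data-owners", "auditors", "marketing"]))
```

With this model, moving a user between groups in the identity manager is the only step needed to change their permissions, which is why the existing group assignment and approval process can be reused.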
Immuta sits in the middle, in the control plane. To fill that role, Immuta knows details about the subjects and enterprise resources, acts as the policy decision point through policies administered by policy administrators, and makes real-time policy decisions using the internal Immuta policy engine.
Lastly, and of importance to how Immuta Secure functions, Immuta also enables the policy enforcement point by administering the policies natively in your data platform in a way that can react to policy changes and live queries.
Immuta is not just a location to define your policy logic; Immuta also enforces that logic in your data platform. How that occurs varies based on each data platform, but the overall architecture remains consistent and follows the NIST Zero Trust framework. The diagram below describes the recommended architecture from NIST:
To use Immuta, you must configure the Immuta native integration, which will require some level of privileged access to administer policies in your data platform, depending on your data platform and how the Immuta integration works. If using Databricks, please refer to the Databricks roles best practices before configuring the native integration.
Intermingling your pre-existing roles in Databricks with Immuta can be confusing at first. The sections below outline some best practices on how to think about roles in each platform.
Access to data, platform permissions, and the ability to use clusters and data warehouses are controlled in Databricks Unity Catalog with permissions to individual users or groups. Immuta can control those permissions to grant users permission to read data based on subscription policies.
This section discusses best practices for Databricks Unity Catalog permissions for end-users.
Users who consume data (directly in your Databricks workspace or through other applications) need permission to access objects. Permissions are also used to control write access, access to Databricks clusters and warehouses, and access to other object types that can be registered in Databricks Unity Catalog.
To manage this at scale, Immuta recommends taking a 3-layer approach, where you separate the different permissions into different privileges:
Privileges for read access (Immuta managed)
Privileges for write access (optional, soon supported by Immuta)
Privileges for warehouse and clusters, internal billing
Read access is managed by Immuta. By using subscription policies, data access can be controlled down to the table level. Attribute-based table GRANTs help you scale compared to RBAC, where access control is typically done on a schema or catalog level.
Since Immuta leverages native Databricks Unity Catalog GRANTs, you can combine Immuta’s grants with grants done manually in Databricks Unity Catalog. This means you can gradually migrate to an Immuta-protected Databricks workspace.
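The scaling argument for attribute-based grants can be sketched in a few lines: a single rule that compares user attributes to table metadata decides access for any number of tables. This is a conceptual model only, not Immuta's policy engine; the `Table` class, the `domain` attribute, and the matching rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str    # fully qualified table name
    domain: str  # hypothetical metadata, for example from a tag


def tables_to_grant(user_attributes, tables):
    """Return the tables a user should be able to read under one rule:
    the user's 'domain' attribute must include the table's domain."""
    allowed_domains = set(user_attributes.get("domain", []))
    return [t.name for t in tables if t.domain in allowed_domains]


catalog = [
    Table("main.sales.orders", "sales"),
    Table("main.sales.refunds", "sales"),
    Table("main.hr.salaries", "hr"),
]

print(tables_to_grant({"domain": ["sales"]}, catalog))
```

Adding a new sales table requires no new rule and no new role; it is covered as soon as its metadata is known, whereas a role-based setup would need the role definitions revisited.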
Write access is typically granted on a schema, catalog, or volume level. This makes it easy to manage in Databricks Unity Catalog through manual grants. We recommend creating groups that give INSERT, UPDATE, or DELETE permissions to a specific schema or catalog and attach this group to a user. This attachment can be done manually or using your identity manager groups. (See the Databricks documentation for details.) Note that Immuta is working toward supporting write policies, so this will not need to be separately managed for long.
Warehouses and clusters are granted to users to give them access to computing resources. Since this is directly tied to Databricks’ consumption model, warehouses and clusters are typically linked to cost centers for (internal) billing purposes. Immuta recommends creating a group per team/domain/cost center, applying this group for cluster/warehouse privileges, and granting this group to users using identity manager groups.
Immuta has two types of service accounts to connect to Databricks:
Policy role: Immuta needs to use a service principal to be able to push policies to Databricks Unity Catalog and to pull audits to Immuta (optional). This principal needs USE CATALOG and USE SCHEMA on all catalogs and schemas, and SELECT and MODIFY on all tables in the metastore managed by Immuta.
Data ownership role: You will also need a user/principal for the data source registration. A service account/principal is recommended so that when the user moves or leaves the organization, Immuta still has the proper credentials to connect to Databricks Unity Catalog. You can follow one of the two best practices:
A central role for registration (recommended): It is recommended that you create a service role/user with SELECT permissions for all objects in your metastore. Immuta can register all the tables and views from Databricks, populate the Immuta catalog, and scan the objects for sensitive data using Immuta Discover. Immuta will not apply policy directly by default, so no existing access will be impacted.
A service principal per domain (alternative): Alternatively, if you cannot create a service principal with SELECT permissions for all objects, you can allow the different domains or teams in the organization to use a service user/principal scoped to their data. This approach delegates metadata registration, aligns well with data mesh use cases, and makes every team responsible for registering its own data sets in Immuta.
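The privileges described above for the two service accounts can be expressed as Unity Catalog GRANT statements. The sketch below only generates the SQL text; the catalog and principal names are hypothetical. It relies on Unity Catalog privilege inheritance, where a privilege granted on a catalog applies to the schemas and tables inside it. For the registration role, USE CATALOG and USE SCHEMA are added alongside SELECT because Unity Catalog requires them to read a table; that addition goes beyond what the text above states.

```python
def quote(identifier):
    """Quote an identifier with backticks, escaping embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def grants(privileges, catalogs, principal):
    """Generate one catalog-level GRANT statement per catalog."""
    privilege_list = ", ".join(privileges)
    return [
        f"GRANT {privilege_list} ON CATALOG {quote(c)} TO {quote(principal)};"
        for c in catalogs
    ]


def policy_role_grants(catalogs, principal):
    # Privileges listed for the policy role in the text above.
    return grants(["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
                  catalogs, principal)


def registration_role_grants(catalogs, principal):
    # Read-only: SELECT plus the USE privileges needed to reach the tables
    # (the USE privileges are an assumption, see the lead-in).
    return grants(["USE CATALOG", "USE SCHEMA", "SELECT"], catalogs, principal)


# Hypothetical catalog and service principal names.
for statement in policy_role_grants(["main", "analytics"], "immuta-policy-sp"):
    print(statement)
for statement in registration_role_grants(["main"], "immuta-registration-sp"):
    print(statement)
```

Review the generated statements before running them in Databricks; for the per-domain alternative, the same helper can be called with only the catalogs that a given team owns.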
In order to take advantage of all the capabilities of Immuta, you must make Immuta aware of your data metadata. This is done by registering your data with Immuta as data sources. It’s important to remember that Immuta is not reading your actual data at all; it is simply discovering your information schemas and pulling that information back as the foundation for everything else.
This section offers the best practices when onboarding data sources into Immuta.
If you have an external data catalog, like Collibra or Alation, configure the catalog integration first; then register your data in Immuta. This process will automatically tag your data with the external catalog tags as you register it.
Use Immuta's no default subscription policy setting to onboard metadata without affecting your users' access. This means you can onboard all metadata in Immuta without any impact on current access, which gives you time to fully convert your operations to Immuta without causing unnecessary data downtime. Immuta will only take control when the first policies are applied. Because of this, register all tables.
While it can be tempting to start small and register only the pieces of data that you intend to protect, you must remember that Immuta is not just about access control. It’s important to register your data metadata so that Immuta can also track activity and understand where that sensitive data lies (with Immuta Detect). In other words, Immuta can’t tell you where you have problems unless you first tell it to look at your metadata.
Without the no default subscription policy setting, Immuta will set each data source's subscription policy to the most restrictive option, which automatically locks data down during onboarding. To unlock the data and give your users access again, new subscription policies must be set.
If you are delegating the registration and control of data, then please read our Data mesh use case for more information.
Use the /api/v2/data endpoint to register a schema; then use schema monitoring to find new data sources and automatically register them.
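A call to this endpoint could be prepared along the following lines. This is a sketch under stated assumptions: the POST method, the JSON content type, and the caller-supplied authentication headers are assumptions, and the request body is left as an opaque placeholder because its fields are not described in this guide. Take the real payload and authentication scheme from the Immuta API reference.

```python
import json
from urllib.request import Request


def build_registration_request(base_url, payload, auth_headers):
    """Prepare (but do not send) a request to the /api/v2/data endpoint."""
    url = base_url.rstrip("/") + "/api/v2/data"
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers)  # authentication scheme is caller-supplied
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical tenant URL and placeholder body, for illustration only.
request = build_registration_request(
    "https://immuta.example.com/",
    {"placeholder": "replace with the body from the API reference"},
    {"Authorization": "<your credential>"},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected safely first.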
One of the greatest benefits of a modern data platform is that you can manage all your data transformations at the data tier. This means that data is constantly changing in the data platform, which may result in the need for access control changes as well. This is why it is critical that you enable schema monitoring and column detection when registering metadata with Immuta. This will allow Immuta to constantly monitor and update for these changes.
It’s also important to understand that many data engineering tools make changes by destructively recreating tables and views, which results in all policies being dropped in the data platform. This is actually a good thing, because this gives Immuta a chance to update the access as the changes are found (policy uptime) while the only user that can see the data being recreated is the creator of that change (data downtime for all other users). This is why schema monitoring and column detection are so critical.
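Conceptually, schema monitoring and column detection amount to comparing the metadata that was registered earlier with the metadata the data platform reports now. The following sketch shows that comparison on plain dictionaries; it is an illustration of the idea, not Immuta's implementation, and the function name and data layout are hypothetical.

```python
def diff_schema(registered, current):
    """Compare two snapshots that map table name -> set of column names."""
    new_tables = sorted(current.keys() - registered.keys())
    dropped_tables = sorted(registered.keys() - current.keys())

    changed_columns = {}
    for table in sorted(current.keys() & registered.keys()):
        added = sorted(set(current[table]) - set(registered[table]))
        removed = sorted(set(registered[table]) - set(current[table]))
        if added or removed:
            changed_columns[table] = {"added": added, "removed": removed}

    return {
        "new_tables": new_tables,
        "dropped_tables": dropped_tables,
        "changed_columns": changed_columns,
    }


registered = {"orders": {"id", "amount"}, "customers": {"id", "email"}}
current = {"orders": {"id", "amount", "ssn"}, "refunds": {"id", "order_id"}}

print(diff_schema(registered, current))
```

In this example, the new `refunds` table and the new `ssn` column on `orders` are exactly the kinds of changes that would need policy attention, which is why monitoring must be enabled at registration time.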