Most users will not log in to Immuta directly; they will log in to a system in which you want to enforce data governance. However, Immuta needs to know who the users are in order to enforce access controls. To do so:
Use an IAM (such as Okta) as an SSO or SAML solution.
Use SCIM to push users' accounts (including their groups and attributes) to Immuta; SCIM then keeps those accounts in sync automatically. However, users will still need to log in to Immuta to use the web interface.
To learn more about Immuta’s IAM and SCIM integration, click here.
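As an illustration of what SCIM provisioning carries into Immuta, the sketch below shows a minimal SCIM 2.0 user payload being pushed with Python's requests library. The endpoint URL and token are placeholders, and the enterprise-extension attribute is only an example of the kind of user metadata your IAM might map; in practice your IAM (such as Okta) issues these calls automatically once SCIM provisioning is enabled.

```python
import requests

# Placeholder values -- substitute your own Immuta SCIM endpoint and token.
SCIM_ENDPOINT = "https://your-immuta.example.com/scim/v2/Users"  # hypothetical URL
API_TOKEN = "REPLACE_WITH_SCIM_TOKEN"

# A minimal SCIM 2.0 user payload; the enterprise extension carries an example
# attribute (department) that Immuta policies could later reference.
user_payload = {
    "schemas": [
        "urn:ietf:params:scim:schemas:core:2.0:User",
        "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User",
    ],
    "userName": "jane.doe@example.com",
    "displayName": "Jane Doe",
    "active": True,
    "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User": {
        "department": "Analytics",
    },
}

response = requests.post(
    SCIM_ENDPOINT,
    json=user_payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Provisioned user:", response.json().get("id"))
```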
You probably already have groups configured in your IAM, but attributes can make your policies more dynamic. Groups and roles in your existing IAM likely grant access to data, but they typically don’t describe that data or the access.
Use tags and attributes in Immuta that describe your data and your users, and then write a single policy that grants access to data automatically. For example, you could create tags that describe the domain of the table and its security classification. You could then write a policy stating that users whose attributes match the tags on a table or its columns automatically get access. Writing a single policy like this simplifies your overall policy creation and execution.
Analyze your policies and determine which user metadata is critical to understanding how policies should act, and then decorate your users with those descriptive attributes.
Leverage the approval workflows your organization already uses for group assignments to approve attribute assignments as well.
Manage attributes through Immuta or your IAM.
Schema monitoring checks for new tables that get added to your schema. This is a powerful tool to ensure that tables are all being governed by Immuta. Immuta will run a daily job to pick up and add any new tables.
Consider using Schema Monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.
Consider using Immuta’s API either to run the schema monitoring job when your ETL process adds new tables or to register those new tables directly (a sketch of such a call appears below).
Activate the New Column Added templated Global Policy to protect potentially sensitive data. This policy will NULL new columns until a Data Owner reviews them. This protects your data and prevents leaks from new columns being added without review.
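If you do trigger schema monitoring from an ETL pipeline rather than waiting on the daily job, the call is a simple authenticated HTTP request. The sketch below is a hypothetical shape only: the route and authorization header are placeholders, so confirm the actual endpoint and auth scheme against the Immuta API documentation.

```python
import requests

IMMUTA_URL = "https://your-immuta.example.com"               # placeholder
API_KEY = "REPLACE_WITH_API_KEY"                             # placeholder
SCHEMA_MONITORING_ROUTE = "/dataSource/detectRemoteChanges"  # hypothetical route

def trigger_schema_monitoring() -> dict:
    """Kick off a schema monitoring run right after the ETL job adds tables."""
    response = requests.post(
        f"{IMMUTA_URL}{SCHEMA_MONITORING_ROUTE}",
        headers={"Authorization": API_KEY},  # auth scheme is a placeholder
        timeout=60,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(trigger_schema_monitoring())
```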
You may already have an external data catalog tool like Collibra or Alation. Immuta can easily integrate with those catalogs and use your existing metadata to scale policy creation quickly.
If you do not already have a sensitive data tagging solution, allow Immuta to discover and tag sensitive data with our Sensitive Data Discovery capabilities. Should you have data types Immuta doesn’t discover out of the box, you can customize Immuta’s sensitive data discovery to do so.
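If you do need a custom identifier, the pattern is typically a regular expression. The snippet below is just a way to prototype and validate such a pattern before configuring it; the employee ID format is invented for illustration, and the actual custom identifier is registered through Immuta's sensitive data discovery configuration, not this script.

```python
import re

# Hypothetical internal format: "EMP-" followed by six digits.
EMPLOYEE_ID_PATTERN = re.compile(r"\bEMP-\d{6}\b")

sample_values = ["EMP-004821", "jane.doe@example.com", "EMP-99", "EMP-123456"]

# Check which sample values the pattern would classify as sensitive.
matches = [v for v in sample_values if EMPLOYEE_ID_PATTERN.search(v)]
print(matches)  # ['EMP-004821', 'EMP-123456']
```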
Add your first data sources and policies through the Immuta UI. Don't automate a process you haven't first executed successfully by hand.
When going to production with a large number of data sources, write a script to use Immuta’s API or CLI to automate onboarding of data sources.
The CLI is the best way to manage a single file for onboarding data sources and applying tags. Storing those file versions in a repository allows you to track versions and automate your pipeline.
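A minimal sketch of what that automation might look like is shown below, assuming a version-controlled manifest file listing the tables to register. The endpoint and payload are hypothetical; the exact request Immuta expects depends on your integration (Snowflake, Databricks, etc.), so treat this as a shape to adapt, not a drop-in implementation.

```python
import json
import requests

IMMUTA_URL = "https://your-immuta.example.com"   # placeholder
API_KEY = "REPLACE_WITH_API_KEY"                 # placeholder

def register_data_sources(manifest_path: str) -> None:
    """Register every table listed in a version-controlled manifest file."""
    with open(manifest_path) as f:
        # e.g., [{"schema": "sales", "table": "orders", "tags": ["PII"]}, ...]
        tables = json.load(f)

    for table in tables:
        # Hypothetical endpoint and payload -- confirm against the Immuta API docs.
        response = requests.post(
            f"{IMMUTA_URL}/dataSource",
            headers={"Authorization": API_KEY},
            json={
                "schema": table["schema"],
                "table": table["table"],
                "tags": table.get("tags", []),
            },
            timeout=60,
        )
        response.raise_for_status()
        print(f"Registered {table['schema']}.{table['table']}")

if __name__ == "__main__":
    register_data_sources("data_sources.json")
```

Keeping a manifest like data_sources.json in a repository gives you the version tracking and pipeline automation described above.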
Always register data sources with a system account, not a user account. If the registering account belongs to a user who later changes roles or leaves the organization, that account can be locked, and an account lock can leave data sources unhealthy; a system account keeps your data sources functioning regardless of personnel changes.
Add a secondary data owner to your data sources for easier management. The data owner(s) can be added to a data source after using a system account to register the data source.
Since ETL processes are typically completed through system accounts, Immuta should not be in your ETL process. Immuta governs user accounts, not system accounts that have full access to data.
Remove Immuta from your ETL process. For example, if you're using Databricks, your ETL process should use non-Immuta clusters.
Use Immuta’s native write capabilities to share derived data within a team setting, not as proxies for an ETL strategy.
Consult your Immuta representative on the best practice for your technology stack.
These suggestions provide general guidance for implementing scalable policies across your organization. Although every use case is different and each organization has unique needs and complexities, consider these best practices for your organization, regardless of scale or size.
Use Immuta's SaaS Platform: Use Immuta SaaS to automate installation, backups, and disaster recovery.
Write policies using Attribute-Based Access Control (ABAC): Write a single policy that enforces access controls based on who users are and why they're accessing data.
Use the power of your IAM and SCIM: Most users will not log in to Immuta directly; they will log in to a native system that you want to enforce data governance on (Snowflake, Databricks, etc.). SCIM provisions users in Immuta and adds their attributes and groups so that Immuta can enforce policies for these users.
Allow Immuta to discover and tag sensitive data with our Sensitive Data Discovery capabilities.
Integrate Immuta with your existing data catalog to scale policy creation quickly.
Coordinate closely with the teams that control this metadata. Writing policies relies on a good understanding of the tags that will be on the data, and the teams doing the tagging need the same understanding of your policies: if they remove a tag, what effect does that have on access?
Weigh policy complexity against performance. It’s a trade-off:
Consider using null (or a static value such as "REDACTED") in your masking policies instead of hashing. Nulling is a much simpler operation than hashing, so it will actually reduce overhead compared to a query with no policy at all. Hashing, on the other hand, will increase overhead slightly because of its complexity.
Avoid row-level security that relies on thousands or millions of attributes per user for filtering. Such policies are harder to understand, you take on the complexity of managing all of those attributes, and there is potential for performance impact at query run time.
Consider hashing when a “join” condition is required. Hashing allows for an end-user to join on that column when using Immuta projects.
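To make the trade-off concrete, the sketch below contrasts the two approaches in plain Python: nulling (or a static value) simply discards the original value, while deterministic hashing costs more compute but keeps equality intact, which is what makes joins on masked columns possible. These functions are illustrative stand-ins, not Immuta's own masking implementation.

```python
import hashlib

def mask_null(value):
    """Null/static masking: the cheapest option; the value is simply discarded."""
    return None  # or "REDACTED"

def mask_hash(value: str) -> str:
    """Deterministic hashing: more compute, but equal inputs stay equal,
    so masked columns remain joinable across tables."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# The same SSN in two tables still matches after hashing, so a join on the
# masked column continues to work.
assert mask_hash("123-45-6789") == mask_hash("123-45-6789")
# After null masking, the join key is gone entirely.
assert mask_null("123-45-6789") is None
```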
For more best practices around securing your data, process workflows, and organizing your data governance team, visit
Without Immuta, you may have been manually managing user access to tables, which increases time to data access. Address this challenge first by building automated Subscription Policies to grant access to tables. After you have built your Subscription Policies, start building Data Policies to restrict what data users can see once they've accessed and queried a table.
Global Policies can be applied to data sources across your organization, which prevents you from having to write or rewrite single policies for every data source added to Immuta. If you need to mask PII, you want it to apply to all tables that have PII, not just a single table. Global Policies enable scalability across your organization and across all data platforms.
Write policies based on who users are (in other words, their attributes: the state they live in, the project they are working on, etc.) and their purpose for accessing data.
For example, consider this policy: “a user is only allowed to see records for a state they live in.” Since there are 50 states:
With RBAC, you would need one group per state (50 groups), put people into those groups, and then write 50 policies that say “if someone is in the OHIO group, they are allowed to see rows that match OHIO in the column tagged STATE.”
With ABAC, you would pull the user’s attributes from your IAM and write a single policy in Immuta: “Only show rows where the user possesses an attribute STATE that matches the value in the column tagged STATE.”
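The difference is easy to see in code. The sketch below expresses that single ABAC policy as plain Python filtering logic: one rule driven entirely by the user's STATE attribute pulled from the IAM, instead of 50 group-specific rules. It only illustrates the concept; Immuta enforces the equivalent logic natively in your data platform.

```python
def visible_rows(rows, user_attributes):
    """Single ABAC rule: show a row only when the value in the column tagged
    STATE matches one of the user's STATE attributes."""
    allowed_states = set(user_attributes.get("STATE", []))
    return [row for row in rows if row["state"] in allowed_states]

rows = [
    {"customer": "A", "state": "OHIO"},
    {"customer": "B", "state": "TEXAS"},
]

# Attributes come from your IAM (via SCIM), not from 50 hand-maintained groups.
print(visible_rows(rows, {"STATE": ["OHIO"]}))  # -> only the OHIO row
```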
For details about the differences between ABAC and RBAC, see the Immuta blog What is ABAC? Attribute Based Access Control 101.
Immuta contains a Domain Specific Language (DSL) capability that pairs well with ABAC and provides a means of implementing more complex policy logic within a single statement (e.g., matching arrays of user attributes to table or column tagging). Contact your Immuta representative for guidance in writing these complex policies.
Match user attributes against values in the table being queried. Cross-table lookups are not currently supported because of the additional time required to look up values at query time.
Immuta is not in the data path at all. Immuta simply takes the ABAC policies you define and pushes them down to your compute system. The more complex the policy pushed to that system, the greater the potential impact on performance. For example, nulling (or using a static value such as “REDACTED”) is a simpler operation than hashing, so it will actually reduce overhead compared to a query with no policy at all. Hashing, on the other hand, will increase overhead slightly because of its complexity.
Use hashing when you need to join on a value. Hashing allows an end user to join on that column when using Immuta projects. However, hashing does take compute power, so consider the milliseconds per column multiplied by the rows in your table (a rough estimate follows below).
Avoid row-level security that relies on thousands or millions of attributes per user for filtering. Try to think of other schemes that either generalize those into fewer, coarser lookups or partition your data in a way that aligns with those lookups so they run faster.
For details about specific policies and their tradeoffs, see Immuta Masking Functions.
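For a rough sense of that hashing cost, the back-of-the-envelope calculation below uses an assumed per-value cost; the figure is illustrative only, so benchmark on your own platform before drawing conclusions.

```python
# Assumed figures for illustration only -- benchmark your own warehouse.
ms_per_hashed_value = 0.001      # ~1 microsecond per value as a ballpark
rows = 100_000_000               # rows in the table
hashed_columns = 3               # columns masked with hashing

added_seconds = ms_per_hashed_value * rows * hashed_columns / 1000
print(f"~{added_seconds:.0f} extra seconds of compute per full-table scan")  # ~300
```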
Use Immuta SaaS to automate installation, backups, and disaster recovery. Other benefits include:
receiving new features as they are released.
having Immuta manage scaling and infrastructure for you.
To use Kubernetes effectively:
Identify a team to set up, configure, and maintain a Kubernetes environment. Immuta will help you with the installation of our platform, but the Kubernetes environment is your company's responsibility. Review Kubernetes best practices here.
Use persistent volumes.
Only use the Immuta-provided default Nginx Ingress Controller if you are using the Immuta query engine. Otherwise, opt to use your own ingress controller or no controller at all.
For on-prem instances:
Configure backups to run daily.
Test your backups at least once a month.
Use cloud storage (e.g., S3, ADLS, GCS).
Create the proper IAM roles and IAM permissions (if using IAM roles). Your backups will fail if this is not configured correctly.
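Because backups will fail if the IAM roles and permissions are misconfigured, a small preflight check that writes and deletes a test object in the backup bucket can surface the problem before the nightly job does. The sketch below assumes S3 and boto3; the bucket name is a placeholder, and ADLS or GCS would use their own SDKs.

```python
import boto3
from botocore.exceptions import ClientError

BACKUP_BUCKET = "your-immuta-backups"   # placeholder bucket name

def verify_backup_access() -> bool:
    """Confirm the attached IAM role can write to and delete from the backup bucket."""
    s3 = boto3.client("s3")
    try:
        s3.put_object(Bucket=BACKUP_BUCKET, Key="immuta-backup-preflight", Body=b"ok")
        s3.delete_object(Bucket=BACKUP_BUCKET, Key="immuta-backup-preflight")
        return True
    except ClientError as err:
        print(f"Backup bucket check failed: {err}")
        return False

if __name__ == "__main__":
    print("Backups configured correctly:", verify_backup_access())
```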
Monitoring and compliance reporting is a valuable result of implementing Immuta, which provides rich query and access logs about your data.
Pull logs from Immuta on a daily basis. Whether you use Immuta SaaS or Immuta Self-Managed, you are responsible for pulling your audit logs. These logs contain all of the information you will need to support auditing and compliance for access, queries, and changes in your environment. (A sketch of a daily pull job appears at the end of this section.)
Use a tool like DataDog or Splunk for log aggregation. These tools are an enterprise best practice for monitoring applications and can be easily deployed alongside Immuta.
Store logs for at least 30 days in a log aggregator for monitoring and compliance.
Discuss with your compliance group or lines of business which fields you want to monitor or report on from the Immuta logs. Immuta captures a wealth of information each time a user logs in, changes a policy, or runs a query. Discuss with your team which items you want to capture in a log aggregation tool or store long-term.
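A daily pull job can be as simple as an authenticated request to Immuta's audit API that appends new records to whatever your log aggregator ingests. The sketch below is a hypothetical shape only: the route, auth header, pagination parameters, and response fields are placeholders to check against the Immuta API documentation, and the output file stands in for a DataDog or Splunk forwarder.

```python
import json
from datetime import date

import requests

IMMUTA_URL = "https://your-immuta.example.com"   # placeholder
API_KEY = "REPLACE_WITH_API_KEY"                 # placeholder

def pull_audit_logs(output_path: str) -> int:
    """Fetch recent audit records and write them as JSON lines for the aggregator."""
    # Hypothetical route and parameters -- verify against the Immuta API docs.
    response = requests.get(
        f"{IMMUTA_URL}/audit",
        headers={"Authorization": API_KEY},
        params={"pageSize": 500},
        timeout=120,
    )
    response.raise_for_status()
    records = response.json().get("hits", [])   # response shape is an assumption

    with open(output_path, "a") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return len(records)

if __name__ == "__main__":
    count = pull_audit_logs(f"immuta-audit-{date.today()}.jsonl")
    print(f"Pulled {count} audit records")
```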