Click the App Settings icon in the left sidebar.
Click the link in the App Settings panel to navigate to that section.
See the identity manager pages for a tutorial on connecting a Microsoft Entra ID, Okta, or OneLogin identity manager.
To configure Immuta to use any other existing IAM,
Click the Add IAM button.
Complete the Display Name field and select your IAM type from the Identity Provider Type dropdown: LDAP/Active Directory, SAML, or OpenID.
Once you have selected LDAP/Active Directory from the Identity Provider Type dropdown menu,
Adjust Default Permissions granted to users by selecting from the list in this dropdown menu, and then complete the required fields in the Credentials and Options sections. Note: Either User Attribute OR User Search Filter is required, not both. Completing one of these fields disables the other.
Opt to have Case-insensitive user names by clicking the checkbox.
Opt to Enable Debug Logging or Enable SSL by clicking the checkboxes.
In the Profile Schema section, map attributes in LDAP/Active Directory to automatically fill in a user's Immuta profile. Note: Fields that you specify in this schema will not be editable by users within Immuta.
Opt to Link SQL Account.
Opt to Enable scheduled LDAP Sync support for LDAP/Active Directory and Enable pagination for LDAP Sync. Once enabled, confirm the sync schedule, which is written as a cron expression; the default is every hour. Confirm the LDAP page size for pagination; the default is 1,000.
Opt to Sync groups from LDAP/Active Directory to Immuta. Once enabled, map attributes in LDAP/Active Directory to automatically pull information about the groups into Immuta.
Opt to Sync attributes from LDAP/Active Directory to Immuta. Once enabled, add attribute mappings in the attribute schema. The desired attribute prefix should be mapped to the relevant schema URN.
Opt to enable External Groups and Attributes Endpoint, Make Default IAM, or Migrate Users from another IAM by selecting the checkbox.
Then click the Test Connection button.
Once the connection is successful, click the Test User Login button.
Click the Test LDAP Sync button if scheduled sync has been enabled.
See the SAML protocol configuration guide.
Once you have selected OpenID from the Identity Provider Type dropdown menu,
Take note of the ID. You will need this value to reference the IAM in the callback URL in your identity provider, using the format <base url>/bim/iam/<id>/user/authenticate/callback.
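The callback URL format above can be illustrated with a small sketch. The base URL and IAM ID below are hypothetical examples, not values from your tenant:

```python
# Illustration of the callback URL format described above.
# The base URL and IAM ID are hypothetical placeholders.
def callback_url(base_url: str, iam_id: str) -> str:
    """Build the SSO callback URL for an Immuta IAM configuration."""
    return f"{base_url.rstrip('/')}/bim/iam/{iam_id}/user/authenticate/callback"

print(callback_url("https://immuta.example.com", "okta-oidc"))
# https://immuta.example.com/bim/iam/okta-oidc/user/authenticate/callback
```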
Note the SSO Callback URL shown. Navigate out of Immuta and register the client application with the OpenID provider. If prompted for client application type, choose web.
Adjust Default Permissions granted to users by selecting from the list in this dropdown menu.
Back in Immuta, enter the Client ID, Client Secret, and Discover URL in the form fields.
Configure OpenID provider settings. There are two options:
Set Discover URL to the /.well-known/openid-configuration URL provided by your OpenID provider.
If you are unable to use the Discover URL option, you can fill out Authorization Endpoint, Issuer, Token Endpoint, JWKS Uri, and Supported ID Token Signing Algorithms.
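When the Discover URL option is unavailable, the manual fields correspond to entries in the provider's discovery document. The field names below come from the OpenID Connect Discovery specification; the issuer URL is a hypothetical placeholder:

```python
import json

# A trimmed example of the JSON served at /.well-known/openid-configuration.
# Field names are standard OIDC Discovery metadata; the issuer URL is a
# hypothetical placeholder, not a real provider.
discovery_doc = json.loads("""
{
  "issuer": "https://idp.example.com",
  "authorization_endpoint": "https://idp.example.com/oauth2/authorize",
  "token_endpoint": "https://idp.example.com/oauth2/token",
  "jwks_uri": "https://idp.example.com/oauth2/keys",
  "id_token_signing_alg_values_supported": ["RS256"]
}
""")

# These entries map to the manual Immuta fields: Issuer, Authorization
# Endpoint, Token Endpoint, JWKS Uri, and Supported ID Token Signing
# Algorithms.
print(discovery_doc["issuer"])
print(discovery_doc["id_token_signing_alg_values_supported"])
```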
If necessary, add additional Scopes.
Opt to Enable SCIM support for OpenID by clicking the checkbox, which will generate a SCIM API Key.
In the Profile Schema section, map attributes in OpenID to automatically fill in a user's Immuta profile. Note: Fields that you specify in this schema will not be editable by users within Immuta.
Opt to Allow Identity Provider Initiated Single Sign On or Migrate Users from another IAM by selecting the checkboxes.
Click the Test Connection button.
Once the connection is successful, click the Test User Login button.
To set the default permissions granted to users when they log in to Immuta, click the Default Permissions dropdown menu, and then select permissions from this list.
See the External Catalogs page.
Select Add Workspace.
Use the dropdown menu to select the Databricks Workspace Type.
Before creating a workspace, the cluster must send its configuration to Immuta; to do this, run a simple query on the cluster (e.g., show tables). Otherwise, an error message will occur when users attempt to create a workspace.
The Databricks API Token used for native workspace access must be non-expiring. Using a token that expires risks losing access to projects that are created using that configuration.
Use the dropdown menu to select the Schema and refer to the corresponding tab below.
Enter the Name.
Click Add Workspace.
Enter the Hostname, Workspace ID, Account Name, Databricks API Token, and Storage Container.
Enter the Workspace Base Directory.
Click Test Workspace Directory.
Once the credentials are successfully tested, click Save.
Enter the Name.
Click Add Workspace.
Enter the Hostname, Workspace ID, Account Name, and Databricks API Token.
Use the dropdown menu to select the Google Cloud Region.
Enter the GCS Bucket.
Opt to enter the GCS Object Prefix.
Click Test Workspace Directory.
Once the credentials are successfully tested, click Save.
Select Add Native Integration.
Use the dropdown menu to select the Integration Type. Follow one of the guides below to finish configuring your integration:
To configure Immuta to protect data in a kerberized Hadoop cluster,
Upload your Kerberos Configuration File, and then you can add to or modify the Kerberos configuration in the window that appears.
Upload your Keytab File.
Enter the principal Immuta will use to authenticate with your KDC in the Username field. Note: This must match a principal in the Keytab file.
Adjust how often (in milliseconds) Immuta needs to re-authenticate with the KDC in the Ticket Refresh Interval field.
Click Test Kerberos Initialization.
Click the Generate Key button.
Save this API key in a secure location.
To enable Sensitive Data Discovery and configure its settings, see the Sensitive Data Discovery page.
By default, query text is included in native query audit events from Snowflake, Databricks, and Starburst (Trino).
When query text is excluded from audit events, Immuta will retain query event metadata such as the columns and tables accessed. However, the query text used to make the query will not be included in the event. This setting is a global control for all configured integrations.
To exclude query text from audit events,
Scroll to the Audit section.
Check the box to Exclude query text from audit events.
Click Save.
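Conceptually, this setting redacts one field from each audit event while keeping the rest of the metadata. The event shape below is illustrative only, not Immuta's actual audit schema:

```python
# Conceptual sketch of what the setting controls: event metadata is retained
# while the query text is dropped. Field names here are hypothetical, not
# Immuta's actual audit schema.
def redact_query_text(event: dict) -> dict:
    """Return a copy of an audit event with the query text removed."""
    return {k: v for k, v in event.items() if k != "queryText"}

event = {
    "user": "alice@example.com",
    "table": "analytics.orders",
    "columns": ["order_id", "total"],
    "queryText": "SELECT order_id, total FROM analytics.orders",
}
print(redact_query_text(event))
```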
Deprecation notice
The ability to configure the behavior of the default subscription policy has been deprecated. Once this configuration setting is removed from the app settings page, Immuta will not apply a subscription policy to registered data sources unless an existing global policy applies to them. To set an "Allow individually selected users" subscription policy on all data sources, create a global subscription policy with that condition that applies to all data sources or apply a local subscription policy to individual data sources.
Click the App Settings icon in the navigation menu.
Scroll to the Default Subscription Policy section.
Select the radio button to define the behavior of subscription policies when new data sources are registered in Immuta:
None: When this option is selected, Immuta will not apply any subscription policies to data sources when they are registered. Changing the default subscription policy to none will only apply to newly created data sources. Existing data sources will retain their existing subscription policies.
Allow individually selected users: When a data source is created, Immuta will apply a subscription policy to it that requires users to be individually selected to access the underlying table. In most cases, users who were able to query the table before the data source was created will no longer be able to query the table in the remote data platform until they are subscribed to the data source in Immuta.
Click Save and confirm your changes.
Immuta merges multiple Global Subscription policies that apply to a single data source; by default, users must meet all the conditions outlined in each policy to get access (i.e., the conditions of the policies are combined with AND). To change the default behavior to allow users to meet the condition of at least one policy that applies (i.e., the conditions of the policies are combined with OR),
Click the Default Subscription Merge Options text in the left pane.
Select the Default "allow shared policy responsibility" to be checked checkbox.
Click Save.
Note: Even with this setting enabled, Governors can opt to have their Global Subscription policies combined with AND during policy creation.
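The merge semantics can be sketched as follows. This is a minimal illustration of AND versus OR combination, not Immuta's implementation; the group names and conditions are hypothetical:

```python
# Minimal sketch of the policy merge semantics described above, not Immuta's
# implementation. Each policy condition is a predicate over a user's groups.
def can_subscribe(user_groups, policy_conditions, merge="AND"):
    """Evaluate merged subscription policy conditions for one data source."""
    results = [cond(user_groups) for cond in policy_conditions]
    return all(results) if merge == "AND" else any(results)

# Two hypothetical policies that both apply to the same data source.
policies = [
    lambda groups: "analysts" in groups,
    lambda groups: "emea" in groups,
]

# Default behavior: conditions combined with AND.
print(can_subscribe({"analysts"}, policies, merge="AND"))  # False
# With shared policy responsibility: conditions combined with OR.
print(can_subscribe({"analysts"}, policies, merge="OR"))   # True
```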
These options allow you to restrict the power individual users with the GOVERNANCE and USER_ADMIN permissions have in Immuta. Click the checkboxes to enable or disable these options.
You can create custom permissions that can then be assigned to users and leveraged when building subscription policies. Note: You cannot configure actions users can take within the console when creating a custom permission, nor can the actions associated with existing permissions in Immuta be altered.
To add a custom permission, click the Add Permission button, and then name the permission in the Enter Permission field.
To create a custom questionnaire that all users must complete when requesting access to a data source, fill in the following fields:
Opt for the questionnaire to be required.
Key: Any unique value that identifies the question.
Header: The text that will display on reports.
Label: The text that will display in the questionnaire for the user. They will be prompted to type the answer in a text box.
To create a custom message for the login page of Immuta, enter text in the Enter Login Message box. Note: The message can be formatted in markdown.
Opt to adjust the Message Text Color and Message Background Color by clicking in these dropdown boxes.
Without fingerprints, some policies will be unavailable
These policies will be unavailable until a data owner manually generates a fingerprint:
Masking with format preserving masking
Masking with K-Anonymization
Masking using randomized response
To disable the automatic collection of statistics with a particular tag,
Use the Select Tags dropdown to select the tag(s).
Click Save.
Query engine and legacy fingerprint required
K-anonymization policies require the query engine and legacy fingerprint service, which are disabled by default. If you need to use k-anonymization policies, work with your Immuta representative to enable the query engine and legacy fingerprint service when you deploy Immuta.
When a k-anonymization policy is applied to a data source, the columns targeted by the policy are queried under a fingerprinting process that generates rules enforcing k-anonymity. The results of this query, which may contain data that is subject to regulatory constraints such as GDPR or HIPAA, are stored in Immuta's metadata database.
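The k-anonymity property itself can be sketched as follows. This is a conceptual illustration of what the generated rules enforce, not Immuta's fingerprinting implementation; the columns and data are hypothetical:

```python
from collections import Counter

# Conceptual sketch of the k-anonymity property, not Immuta's fingerprinting
# implementation. A dataset is k-anonymous over a set of quasi-identifier
# columns if every combination of their values appears at least k times.
def satisfies_k_anonymity(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination appears at least k times."""
    combos = Counter(
        tuple(row[col] for col in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in combos.values())

# Hypothetical rows: zip and age_band are the quasi-identifiers.
rows = [
    {"zip": "02138", "age_band": "30-39", "diagnosis": "A"},
    {"zip": "02138", "age_band": "30-39", "diagnosis": "B"},
    {"zip": "02139", "age_band": "40-49", "diagnosis": "C"},
]

print(satisfies_k_anonymity(rows, ["zip", "age_band"], k=2))  # False
```

The last row's combination appears only once, so a k=2 policy would mask those quasi-identifier values.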
The location of the metadata database depends on your deployment:
Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.
To ensure this process does not violate your organization's data localization regulations, you need to first activate this masking policy type before you can use it in your Immuta tenant.
Click Other Settings in the left panel and scroll to the K-Anonymization section.
Select the Allow users to create masking policies using K-Anonymization checkbox to enable k-anonymization policies for your organization.
Click Save and confirm your changes.
Query engine and legacy fingerprint required
For all data platforms except Snowflake, randomized response policies require the query engine and legacy fingerprint service, which are disabled by default. If you need to use these policies, work with your Immuta representative to enable the query engine and legacy fingerprint service when you deploy Immuta.
When a randomized response policy is applied to a data source, the columns targeted by the policy are queried under a fingerprinting process. To enforce the policy, Immuta generates and stores predicates and a list of allowed replacement values that may contain data that is subject to regulatory constraints (such as GDPR or HIPAA) in Immuta's metadata database.
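The randomized response technique can be sketched as follows. This is a conceptual illustration, not Immuta's implementation; the noise rate and replacement values are hypothetical:

```python
import random

# Conceptual sketch of randomized response, not Immuta's implementation.
# With probability noise_rate the true value is replaced by a random member
# of the allowed replacement list; otherwise the true value is kept.
def randomize(value, replacements, noise_rate, rng):
    if rng.random() < noise_rate:
        return rng.choice(replacements)
    return value

rng = random.Random(0)  # seeded for reproducibility
replacements = ["A", "B", "C"]
observed = [
    randomize("A", replacements, noise_rate=0.25, rng=rng)
    for _ in range(1000)
]

# Most responses remain truthful, so aggregates stay estimable, while any
# individual response is plausibly deniable.
print(observed.count("A") / len(observed))
```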
The location of the metadata database depends on your deployment:
Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.
To ensure this process does not violate your organization's data localization regulations, you need to first activate this masking policy type before you can use it in your Immuta tenant.
Click Other Settings in the left panel and scroll to the Randomized Response section.
Select the Allow users to create masking policies using Randomized Response checkbox to enable use of these policies for your organization.
Click Save and confirm your changes.
If you enable any Preview features, provide feedback on how you would like these features to evolve.
Click Advanced Settings in the left panel, and scroll to the Preview Features section.
Check the Enable Policy Adjustments checkbox.
Click Save.
Click Advanced Settings in the left panel, and scroll to the Preview Features section.
Check the Allow Complex Data Types checkbox.
Click Save.
For instructions on enabling this feature, navigate to the Global Subscription Policies Advanced DSL Tutorial.
When you are ready to finalize your configuration changes, click the Save button at the bottom of the left panel, and then click Confirm to deploy your changes.
This section contains information about private connectivity options for Databricks integrations.
The Immuta SaaS platform supports private connectivity to Databricks accounts hosted in both AWS and Azure. This allows customers to meet security and compliance controls by ensuring that traffic to data sources from Immuta SaaS only traverses private networks, never the public internet.
Support for AWS PrivateLink is available in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
Support for Azure Private Link is available in .
Azure Private Link provides private connectivity from the Immuta SaaS platform, hosted on AWS, to customer-managed Azure Databricks accounts. It ensures that all traffic to the configured endpoints only traverses private networks over the Immuta Private Cloud Exchange.
This front-end Private Link connection allows users to connect to the Databricks web application, REST API, and Databricks Connect API over an Azure Private Endpoint. For details about Azure Private Link for Databricks and the network flow in a typical implementation, explore the Azure Databricks documentation.
Support for Azure Private Link is available in .
Ensure that your accounts meet the following requirements:
You have an Immuta SaaS tenant.
Contact your Immuta representative, and provide the following information for each Azure Databricks Workspace you wish to connect to:
Azure Region
Azure Databricks hostname
Azure Databricks Resource ID or Alias
This section contains information about application-wide settings and configurations.
Control application-wide settings on the Immuta app settings page.
Configure private networking: Leverage AWS PrivateLink or Azure Private Link for communication between the Immuta SaaS platform and an integration.
Databricks
Snowflake
Starburst (Trino)
Configure Immuta to enforce policies on the dashboards of your BI tools.
Filter the IP addresses users can log in from.
Export and download a system status bundle to assess and solve issues with the Immuta application.
AWS PrivateLink provides private connectivity from the Immuta SaaS platform to customer-managed Redshift Clusters hosted on AWS. It ensures that all traffic to the configured endpoints only traverses private networks.
This feature is supported in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
You have an Immuta SaaS tenant.
If you are using TLS, the presented certificate must have the Fully-Qualified Domain Name (FQDN) of your cluster as a Subject Alternative Name (SAN).
When creating the service, make sure that the Require Acceptance option is checked (this does not allow anyone to connect; all connections will be blocked until the Immuta Service Principal is added).
AWS Region
AWS Subnet Availability Zone IDs (e.g., use1-az3; these are not the account-specific identifiers like us-east-1a or eu-west-2c)
VPC Endpoint Service ID (e.g., vpce-0a02f54c1d339e98a)
Ports Used
AWS PrivateLink provides private connectivity from the Immuta SaaS platform to customer-managed Snowflake accounts hosted on AWS. It ensures that all traffic to the configured endpoints only traverses private networks.
This feature is supported in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
You have an Immuta SaaS tenant.
Your Snowflake account is hosted on AWS.
You have the ACCOUNTADMIN role on your Snowflake account to configure the Private Link connection.
In your Snowflake environment, run the following SQL query, which will return a JSON object with the connection information you will need to include in your support ticket:
Note that the privatelink-account-url from the JSON object in step one will be the Server when registering data sources.
This section contains information about private connectivity options for Snowflake integrations.
The Immuta SaaS platform supports private connectivity to Snowflake accounts hosted in both and . This allows customers to meet security and compliance controls by ensuring that traffic to data sources from Immuta SaaS only traverses private networks, never the public internet.
Support for AWS PrivateLink is available in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
Support for Azure Private Link is available in .
Your Azure Databricks workspace must be on the .
Azure Private Link for Databricks has been configured and enabled.
You have your Databricks account ID from the account console.
Your representative will inform you when the two Azure Private Link connections have been made available. Accept them in the Private Link Center of your Azure Portal.
using your standard azuredatabricks.net URL.
Register your tables as Immuta data sources. Note that the privatelink-account-url from the JSON object in step one will be the Server when registering data sources.
This reference guide describes the SaaS and self-managed deployment options, the global segments for deploying Immuta, and the IP addresses to authorize in your network firewall configuration to allow Immuta to connect to databases running in closed networks.
This reference guide describes the types of metadata and sample data Immuta processes.
This reference guide describes the data Immuta processes to enact policies and generate data source fingerprints, and the encryption practices applied to that data.
You have set up an AWS PrivateLink Service for your Redshift Cluster endpoints.
If you have configured Private DNS Hostnames on your PrivateLink Service, the domain ownership must be verifiable via a public DNS zone. This means that you cannot use a Top-Level Domain (TLD) that is not publicly resolvable, e.g., redshift.mycompany.internal.
Open a support ticket with Immuta Support with the following information:
Authorize the Service Principal provided by your representative so that Immuta can complete the VPC Endpoint configuration.
Your Snowflake account is on the Business Critical Edition.
You have enabled .
Copy the returned JSON object into a support ticket with Immuta Support to request for the feature to be enabled on your Immuta SaaS tenant.
This section contains information about private connectivity options for Starburst (Trino) integrations.
The Immuta SaaS platform supports private connectivity to Starburst (Trino) clusters hosted in both AWS and Azure. This allows customers to meet security and compliance controls by ensuring that traffic to data sources from Immuta SaaS only traverses private networks, never the public internet.
Support for AWS PrivateLink is available in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
Support for Azure Private Link is available in all Azure regions.
When creating a data source in Power BI, specify Microsoft Account as the authentication method, if available. This setting allows you to use your enterprise SSO to connect to your compute platform.
After connecting to the compute platform and the tables to use for your data source, select DirectQuery to connect to the data source. This setting is required for Immuta to enforce policies.
After you publish the datasets to the Power BI service, force users to use their personal credentials to connect to the compute platform by following the steps below.
Enable SSO in the tenant admin portal under Settings -> Admin portal -> Integration settings.
Find the option to manage Data source credentials under Settings -> Datasets.
For most connectors you can enable OAuth2 as the authentication method to the compute platform.
Enable the option Report viewers can only access this data source with their own Power BI identities using DirectQuery. This forces end-users to use their personal credentials.
The system status tab allows administrators to export a zip file called the Immuta status bundle. This bundle includes information helpful to assess and solve issues within an Immuta tenant by providing a snapshot of Immuta, associated services, and information about the remote source backing any of the selected data sources.
Click the App Settings icon.
Select the System Status tab.
Select the checkboxes for the information you want to export.
Click Generate Status Bundle to download the file.
Immuta SaaS supports private connectivity to customer data platforms over both AWS PrivateLink and Azure Private Link. Customers with security and/or compliance requirements to ensure that their data platforms are not routable over the public internet (even with a firewall in place) can have private networking configured to ensure that their standards are met.
Although AWS PrivateLink and Azure Private Link differ in their implementation details, they are fundamentally similar offerings. Customers can expose private services on AWS or Azure networks that Immuta can establish a connection to. How this is done can vary significantly by both data platform and hosting cloud provider, which is why this documentation has been broken down into specific instructions for each combination in the support matrix below.
Snowflake: AWS PrivateLink supported; Azure Private Link supported
Databricks: AWS PrivateLink supported; Azure Private Link supported
Starburst (Trino): AWS PrivateLink supported; Azure Private Link supported
Amazon Redshift: AWS PrivateLink supported; Azure Private Link N/A
Amazon S3: AWS PrivateLink supported; Azure Private Link N/A
Azure Synapse Analytics: AWS PrivateLink N/A; Azure Private Link not yet supported
Over time, the breadth and depth of private networking support will continue to grow. If there are specific data platforms and/or cloud providers that you require, which are either not listed or not yet supported, please contact your Immuta representative.
Immuta SaaS's global network is divided into large geographic regions called global segments. All Immuta SaaS tenants are deployed into an AWS region inside their chosen segment.
Occasionally, customers require that they be able to connect to data sources outside of that region. To meet those needs, Immuta SaaS supports both cross-region and cross-global-segment connectivity.
This involves connecting to data sources in a different region within a given global segment.
Examples:
a tenant in us-east-1 needs to connect to a Snowflake account in AWS's us-east-2 region.
a tenant in us-west-2 needs to connect to an Azure Databricks workspace in the westus2 region.
This involves connecting to data sources in a region outside of the tenant's global segment.
Examples:
a tenant in the EU Global Segment needs to connect to a Snowflake account in us-east-2.
a tenant in the AP Global Segment needs to connect to a Starburst instance hosted in Azure's eastus2 region.
Azure Private Link provides private connectivity from the Immuta SaaS platform, hosted on AWS, to customer-managed Snowflake Accounts on Azure. It ensures that all traffic to the configured endpoints only traverses private networks over the Immuta Private Cloud Exchange.
Support for Azure Private Link is available in all Snowflake-supported Azure regions.
You have an Immuta SaaS tenant.
Your Snowflake account is hosted on Azure.
Your Snowflake account is on the Business Critical Edition.
You have the ACCOUNTADMIN role on your Snowflake account to configure the Private Link connection.
Snowflake requires that an Azure temporary access token be used when configuring the Azure Private Link connection. Due to the constraint imposed by the 1-hour token expiration, your Immuta representative will ask for a time window in which you can accept the connection in your Snowflake account. During this window, the token will be generated by Immuta and provided to you when you're ready to run the following SQL query.
In your Snowflake environment, run the following SQL query, which will return a JSON object with the connection information you will need to include in your support ticket:
Copy the returned JSON object into a support ticket with Immuta Support to request for the feature to be enabled on your Immuta SaaS tenant.
Your Immuta representative will work with you to schedule a time in which to accept the connection in your Snowflake account. They will provide you with a SQL query to run using the ACCOUNTADMIN role. The SQL query will be in this format:
The query should return the following response: Private link access authorized.
Register your tables as Immuta data sources. Note that the privatelink-account-url from the JSON object in step one will be the Server when registering data sources.
AWS PrivateLink provides private connectivity from the Immuta SaaS platform to customer-managed Starburst (Trino) Clusters hosted on AWS. It ensures that all traffic to the configured endpoints only traverses private networks.
This feature is supported in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
You have an Immuta SaaS tenant.
Your Starburst (Trino) Cluster is hosted on AWS.
You have set up an AWS PrivateLink Service for your Starburst Cluster endpoints.
If you have configured Private DNS Hostnames on your PrivateLink Service, the domain ownership must be verifiable via a public DNS zone. This means that you cannot use a Top-Level Domain (TLD) that is not publicly resolvable, e.g., starburst.mycompany.internal.
If you are using TLS, the presented certificate must have the Fully-Qualified Domain Name (FQDN) of your cluster as a Subject Alternative Name (SAN).
When creating the service, make sure that the Require Acceptance option is checked (this does not allow anyone to connect; all connections will be blocked until the Immuta Service Principal is added).
Only TCP connections over IPv4 are supported.
Open a support ticket with Immuta Support with the following information:
AWS Region
AWS Subnet Availability Zone IDs (e.g., use1-az3; these are not the account-specific identifiers like us-east-1a or eu-west-2c)
VPC Endpoint Service ID (e.g., vpce-0a02f54c1d339e98a)
DNS Hostname
Ports Used
Authorize the Service Principal provided by your representative so that Immuta can complete the VPC Endpoint configuration.
Private preview: This feature is only available to select accounts.
Azure Private Link provides private connectivity from the Immuta SaaS platform, hosted on AWS, to customer-managed Starburst (Trino) clusters on Azure. It ensures that all traffic to the configured endpoints only traverses private networks over the Immuta Private Cloud Exchange.
Support for Azure Private Link is available in all Azure regions.
You have an Immuta SaaS tenant.
Your Starburst (Trino) cluster is hosted on Azure.
You have set up an Azure Private Link Service for your Starburst cluster.
The Private Link Service's Access Security should be set to Restricted by Subscription.
Open a support ticket with Immuta Support with the following information:
Azure Region
Azure Private Link Service Resource ID or Alias
DNS Hostname
Your Immuta representative will provide you with the Immuta Subscription ID that needs to be authorized to consume the service.
Once the Immuta Azure Subscription is authorized, inform your representative so that Immuta can complete Private Link Endpoint configuration.
Your representative will inform you when the two Azure Private Link connections have been made available. Accept them in the Private Link Center of your Azure Portal.
Immuta can enforce policies on data in your dashboards when your BI tools are connected directly to your compute layer.
This page provides recommendations for configuring the interaction between your database, BI tools, and users.
To ensure that Immuta applies access controls to your dashboards, connect your BI tools directly to the compute layer where Immuta enforces policies without using extracts. Different tools may call this feature different names (such as live connections in Tableau or DirectQuery in Power BI).
Connecting your tools directly to the compute layer without using extracts will not impact performance and provides a host of other benefits. For details, see Moving from legacy BI extracts to modern data security and engineering.
Personal credentials need to be used to query data from the BI tool so that Immuta can apply the correct policies for the user accessing the dashboard. Different authentication mechanisms are available, depending on the BI tool, connector, and compute layer. However, Immuta recommends using one of the following methods:
Use OAuth single sign-on (SSO) when available, as it offers the best user experience.
Use username and password authentication or personal access tokens as an alternative if OAuth is not supported.
Use impersonation if you cannot create and authenticate individual users in the compute layer. Native impersonation allows users to natively query data as another Immuta user. For details, see the user impersonation guide.
For configuration guidance, see Power BI configuration example and Tableau configuration example.
Immuta has verified several popular BI tool and compute platform combinations. The table below outlines these combinations and their recommended authentication methods. However, since these combinations depend on tools outside Immuta, consult the platform documentation to confirm these suggestions.
AWS Databricks + Power BI Service: The Databricks Power BI Connector does not work with OAuth or personal credentials. Use a Databricks PAT (personal access token) as an alternative.
Redshift + Tableau: Use username and password authentication or impersonation.
Starburst + Power BI Service: The Power BI connector for Starburst requires a gateway that shares credentials, so this combination is not supported.
Starburst + Tableau: Use username and password authentication or impersonation.
QuickSight: A shared service account is used to query data, so this tool is not supported.
AWS PrivateLink provides private connectivity from the Immuta SaaS platform to customer-managed Databricks accounts hosted on AWS. It ensures that all traffic to the configured endpoints only traverses private networks.
This front-end PrivateLink connection allows users to connect to the Databricks web application, REST API, and Databricks Connect API over a VPC interface endpoint. For details about AWS PrivateLink in Databricks and the network flow in a typical implementation, explore the Databricks documentation.
This feature is supported in most regions across Immuta's Global Segments (NA, EU, and AP); contact your Immuta account manager if you have questions about availability.
Ensure that your accounts meet the following requirements:
Your Databricks account is on the E2 version of the platform.
Your Databricks account is on the Enterprise pricing tier.
You have your Databricks account ID from the account console.
You have an Immuta SaaS tenant.
AWS PrivateLink for Databricks has been enabled.
Ensure that your workspace meets the following requirements:
Your workspace must be in an AWS region that supports the E2 version of the platform.
Your Databricks workspace must use Customer-managed VPC to add any PrivateLink connection.
Your workspaces must be configured with `private_access_settings` objects.
You cannot configure a connection to your workspace over the public internet if PrivateLink is enabled.
If you have PrivateLink configured on your workspace, Databricks will update the DNS records for that workspace URL to resolve to `<region>.privatelink.cloud.databricks.com`. Immuta SaaS uses these publicly-resolvable records to direct traffic to a PrivateLink endpoint on our network.
This means that if you have PrivateLink enabled on your workspace, you must follow these instructions to configure your integration. Even if your workspace is also publicly-routable, Databricks's DNS resolution forces the traffic over PrivateLink.
The two supported configurations are:
A workspace with no PrivateLink configuration, which resolves to public IP addresses.
A workspace with PrivateLink configuration, which allows access from the Immuta SaaS regional endpoint (listed below).
Contact your Databricks representative to enable AWS PrivateLink on your account.
Register the Immuta VPC endpoint for the applicable AWS region with your Databricks workspaces. The Immuta VPC endpoint IDs are listed in the table below.
| AWS region | Location | Immuta VPC endpoint ID |
|---|---|---|
| ap-northeast-1 | Asia Pacific (Tokyo) | vpce-08cadda15f0f70462 |
| ap-south-1 | Asia Pacific (Mumbai) | vpce-0efef886a4fbd9532 |
| ap-southeast-1 | Asia Pacific (Singapore) | vpce-07e9890053f5084b2 |
| ap-southeast-2 | Asia Pacific (Sydney) | vpce-0d363d9ea82658bec |
| ca-central-1 | Canada (Central) | vpce-01933bcf30ac4ed19 |
| eu-central-1 | Europe (Frankfurt) | vpce-0048e36edfb27d0aa |
| eu-west-1 | Europe (Ireland) | vpce-0783d9412b046df1f |
| eu-west-2 | Europe (London) | vpce-0f546cc413bf70baa |
| us-east-1 | US East (Virginia) | vpce-0c6e8f337e0753aa9 |
| us-east-2 | US East (Ohio) | vpce-00ba42c4e2be20721 |
| us-west-2 | US West (Oregon) | vpce-029306c6a510f7b79 |
Identify your private access level (either `ACCOUNT` or `ENDPOINT`) and configure your Databricks workspace accordingly.
If the `private_access_level` on your `private_access_settings` object is set to `ACCOUNT`, no additional configuration is required.
If the `private_access_level` on your `private_access_settings` object is set to `ENDPOINT`, add the Immuta VPC endpoint ID for your region (from the table above) to the `allowed_vpc_endpoint_ids` list inside your `private_access_settings` object in Databricks. For example,
This feature allows customers to limit the IP addresses from which a user can access an Immuta SaaS tenant, providing an additional layer of security. If IP filtering has been enabled for a tenant, requests from IP addresses not on the allowlist are blocked and receive a `403 Forbidden` HTTP response.
To configure IP filtering, contact your Immuta account manager.
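The allowlist behavior can be modeled with a short sketch (hypothetical, not Immuta's implementation): a request is accepted only if its source IP falls inside a configured CIDR range.

```python
import ipaddress

# Hypothetical allowlist of CIDR ranges configured for a tenant.
ALLOWLIST = [ipaddress.ip_network(c) for c in ("203.0.113.0/24", "198.51.100.17/32")]

def check_ip(source_ip: str) -> int:
    """Return the HTTP status an IP-filtered tenant would send: 200 if the
    caller's IP is inside an allowlisted range, 403 Forbidden otherwise."""
    addr = ipaddress.ip_address(source_ip)
    return 200 if any(addr in net for net in ALLOWLIST) else 403

assert check_ip("203.0.113.45") == 200   # inside 203.0.113.0/24
assert check_ip("192.0.2.10") == 403     # not on the allowlist
```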
When creating the data source in Tableau, specify the authentication method as Sign in using OAuth. This setting will allow you to use your enterprise SSO to connect to your compute platform.
After connecting to the compute platform, select the tables you will use for your data source. Then, select Live connection. This setting is required for Immuta to enforce policies.
To share your dashboard to your organization, publish your data sources. During this process, set the authentication method to Prompt user. This option ensures that dashboard viewers will see the data according to their personal policies.
Snowflake guide:
Databricks guides:
Redshift guide:
Terminology: Local Region
The Local Region is the customer's operating region; it determines where an Immuta tenant is deployed and where the Immuta Metadata Database lives. Immuta SaaS can deploy in these AWS regions.
To understand how Immuta processes data, it's imperative to understand the purpose of the Immuta components (illustrated in the diagram below) deployed in the Immuta Cloud infrastructure:
Fingerprint Service: When enabled, additional statistical queries made during the health check are distilled into summary statistics, called fingerprints. During this process, statistical query results and data samples (which may contain PII) are temporarily held in memory by the Fingerprint Service.
Immuta Tenant Metadata Database: The database specific to a customer's tenant that contains the tenant's metadata that powers the core functionality of Immuta, including policy data and attributes about data sources (tags, audit data, etc.).
Immuta Web Service: This component includes the Immuta UI and API and is responsible for all web-based user interaction with Immuta, metadata ingest, and the data fingerprinting process.
Immuta tenants are localized to the customer
An Immuta tenant and its components (Metadata Database, Fingerprint Service, and Web Service) are localized to the customer's region.
Data processed by Immuta falls into one of the following categories. For additional details, click a category to navigate to that section.
Immuta communicates with remote databases over a TCP connection.
Audit data includes metadata (e.g., who subscribes to a data source, when they access data, potentially what SQL queries were run, etc.) that is generated by a variety of actions and processes in Immuta. The most common processes are illustrated in the diagram below.
All audit logs flow from the Web Service to the Metadata Database (local to the customer's region), where they are stored for 90 days.
This process is only relevant to customers using an external identity provider service to manage user accounts in Immuta.
The initial Immuta user account is created on the Immuta SaaS tenant, and this data is stored in the tenant's Metadata Database.
A System Administrator configures an external IAM with Immuta.
User account information is collected from the external IAM and stored in the tenant's Metadata Database.
This data is processed to support data source creation, health checks, policy enforcement, and dictionary features.
A System Administrator configures the integration in Immuta.
A Data Owner registers data sources from their remote data platform with Immuta. Note: Data Owners can see sample data when editing a data source. However, this action requires the database password, and the small sample of data visible is only displayed in the UI and is not stored in Immuta.
When a data source is created or updated, the Metadata Database pulls in and stores statistics about the data source, including row count and high cardinality calculations.
The data source health check runs daily to ensure existing tables are still valid.
If an external catalog is enabled, the daily health check will pull in data source attributes (e.g., tags and definitions) and store them in the Metadata Database.
Policy decision data is transmitted to ensure end users querying data are limited to the appropriate access as defined by the policies in Immuta.
Spark plugin
In the Databricks Spark integration, the user, data source information, and query are sent to Immuta through the Spark Plugin to determine what policies need to be applied while the query is being processed. Data that travels from Immuta to the Databricks cluster could include
user attributes.
what columns to mask.
the entire predicate itself (for row-level policies).
A user runs a query against data in their environment.
The query is sent to the Immuta Web Service.
The Web Service queries the Metadata Database to obtain the policy definition, which includes data source metadata (tags, column names, etc.) and user entitlements (groups and attributes).
The policy information is transmitted to the remote data system for native policy enforcement.
Query results are displayed based on what policy definition was applied.
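The steps above can be illustrated with a toy sketch (all names and the policy shape are hypothetical, not Immuta's API): the policy definition combines data source metadata with user entitlements, and the result determines what the remote platform enforces.

```python
# Toy policy store (hypothetical shapes): mask `ssn` unless the user
# carries the `pii_access` attribute.
POLICIES = {"claims": {"mask_columns": ["ssn"], "unless_attribute": "pii_access"}}
USERS = {"alice": {"attributes": ["pii_access"]}, "bob": {"attributes": []}}

def columns_to_mask(table: str, user: str) -> list:
    """Steps 3-4: combine the policy definition with user entitlements
    to decide which columns the remote platform should mask."""
    policy = POLICIES.get(table, {})
    if policy.get("unless_attribute") in USERS[user]["attributes"]:
        return []
    return list(policy.get("mask_columns", []))

# Step 5: results differ per user under the same policy definition.
assert columns_to_mask("claims", "alice") == []      # entitled: no masking
assert columns_to_mask("claims", "bob") == ["ssn"]   # masked for bob
```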
When enabled, statistical queries made during data source registration are distilled into summary statistics, called fingerprints. Fingerprinting allows Immuta to implement advanced privacy enhancing masking and data policies.
During this process, statistical query results and data samples (which may contain PII) are temporarily held in memory by the Fingerprint Service only for the amount of time it takes to calculate the statistics needed. For Snowflake, no data sample is needed, and only statistics about the data are returned to Immuta (no PII).
The fingerprinting process checks for new tables through schema monitoring (when enabled) and captures summary statistics of changes to data sources, including when policies were applied, external views were created, or sensitive data elements were added.
Immuta does not sample data for row-level policies
Immuta does not sample data for row-level policies; Immuta only pulls samples of data to determine if a column is a candidate for randomized response and aggregates of user-defined cohorts for k-anonymization. Both datasets only exist in memory during the computation.
Sample data is processed when k-anonymization or randomized response policies are applied to data sources.
Sample data exists temporarily in memory in the Fingerprint Service during the computation.
Raw data is processed for masking, producing either a distinct set of values or aggregated groups of values.
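The cohort check underlying k-anonymity can be sketched as follows (an illustration of the general technique, not Immuta's implementation): group rows by the quasi-identifier columns and flag cohorts smaller than k.

```python
from collections import Counter

def k_anonymity_violations(rows, quasi_identifiers, k):
    """Count cohort sizes over the quasi-identifier columns and return
    the cohorts smaller than k (candidates for suppression)."""
    cohorts = Counter(tuple(row[c] for c in quasi_identifiers) for row in rows)
    return {cohort: size for cohort, size in cohorts.items() if size < k}

rows = [
    {"zip": "02139", "age_band": "30-39", "diagnosis": "A"},
    {"zip": "02139", "age_band": "30-39", "diagnosis": "B"},
    {"zip": "02139", "age_band": "30-39", "diagnosis": "C"},
    {"zip": "10001", "age_band": "40-49", "diagnosis": "D"},
]
# The ("10001", "40-49") cohort has only one row, so it violates k=3.
violations = k_anonymity_violations(rows, ["zip", "age_band"], k=3)
```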
Immuta collects a variety of metrics and details about app usage that are stored in a single, US-based region.
Data about activity within the tenant is aggregated nightly.
Aggregates create metrics (the number of policies created, number of users authenticated, number of tags created, etc.). This data is stored in our data warehouse, which resides in a single, US-based region (AWS us-east-1).
Telemetry Data (session ID, length, event properties, page views, etc.) is collected using Segment and Heap.
SaaS: This deployment option provides data access control through Immuta's integrations, with automatic software updates and no infrastructure or maintenance costs.
Self-Managed: Immuta supports self-managed deployments for users who store their data on premises or in private clouds, such as a virtual private cloud (VPC). Users can connect to on-premises data sources and cloud data platforms that run on Amazon Web Services, Microsoft Azure, and Google Cloud Platform.
Immuta SaaS tenants are deployed into global segments, which are groups of cloud provider regions. Global segments are designed to help customers with data locality restrictions meet their compliance needs, as data stored for a given tenant does not leave its global segment. Each global segment is built using multiple regions for disaster recovery purposes.
The IP addresses below must be authorized in your network firewall configuration to allow Immuta to connect.
3.106.147.24
13.114.123.176
13.238.102.94
13.55.39.43
13.55.159.66
18.176.33.125
35.74.60.182
35.79.88.51
52.63.167.163
52.68.224.114
54.79.122.121
52.196.249.32
3.9.86.222
3.72.210.214
3.74.143.208
3.126.62.66
13.41.25.70
18.158.34.136
18.169.27.181
18.169.63.243
18.169.76.225
18.195.8.208
35.158.87.229
35.179.66.5
52.16.117.91
52.211.82.12
54.171.9.121
63.34.27.228
63.34.155.116
108.129.45.8
34.192.38.214
34.223.179.107
35.155.223.131
35.163.162.139
44.205.48.68
44.237.62.198
52.2.174.14
54.68.252.84
54.88.42.98
54.205.215.251
54.225.122.15
100.20.168.64
Sample data is processed and aggregated or reduced during Immuta's fingerprinting process and specific policy processes. Note: Data Owners can see sample data when editing a data source. However, this action requires the database password, and the small sample of data visible is only displayed in the UI and is not stored in Immuta.
k-Anonymization Policies: At the time of its application, the columns of a k-anonymization policy are queried under a separate fingerprinting process that generates rules enforcing k-anonymity. The results of this query, which may contain PII, are temporarily held in memory by the Fingerprint Service. The final rules are stored in the Metadata Database as the policy definition for enforcement.
Randomized Response Policies: If the list of substitution values for a categorical column is not part of the policy specification (e.g., when specified via the API), a list is obtained via query and merged into the policy definition in the Metadata Database.
Audit logs include details about data access, such as who subscribes to a data source, when they access the data, and the queries they've run.
This data is stored in the tenant's Metadata Database.
This data includes user account data, such as email addresses, names, and entitlements.
This data is stored in the tenant's Metadata Database, unless a customer has opted to use an external identity provider.
This data includes column names, tags, free-text descriptions of columns, and health check results, such as row counts and high cardinality checks. Additionally, this data source metadata may include the schema, column data types, and information about the host.
This data is stored in the tenant's Metadata Database.
This data includes summary statistics regarding changes to data sources, including when policies have been applied, when external views have been created, when sensitive data elements have been added, and when users have enabled checks for new tables through schema monitoring.
This data is stored in the tenant's Metadata Database.
This data includes the metadata (such as usernames, group information, or other kinds of personal identifiers) sent to the Immuta Web Service to determine if a user has access. When such information is relevant for access determination, it may be retained as part of the policy definition.
This data is stored in the tenant's Metadata Database.
This data includes sample data that is processed and aggregated or reduced as part of the Immuta fingerprinting process and specific policy processes.
Data exists temporarily in memory in the Fingerprint Service.
This data includes tenant metrics -- statistics about activities occurring within Immuta, such as how many policies, projects, or tags have been created and how many users are authenticated within Immuta -- and user metrics, such as the user and session and event properties (user and session IDs, page views, and clicks).
This data is stored in a single, US-based region.
Asia Pacific (AP): Sydney, Tokyo
Europe (EU): Frankfurt, Ireland, London
North America (NA): N. Virginia, Oregon
Immuta captures metadata and stores it in an internal PostgreSQL database (Metadata Database). Customers can encrypt the volumes backing the database using an external Key Management Service to ensure that data is encrypted at rest.
To encrypt data in transit, Immuta uses TLS protocol, which is configured by the customer.
Immuta encrypts values with data encryption keys that are either system-generated or managed using an external key management service (KMS). Immuta recommends using a KMS to encrypt and decrypt data keys and supports the AWS Key Management Service. If no KMS is configured, Immuta generates a data encryption key on a user-defined rollover schedule, using the most recent data key to encrypt new values while preserving old data keys to decrypt old values.
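The rollover behavior can be sketched as key bookkeeping (illustrative only: `xor_bytes` is a stand-in for a real cipher such as AES, and all names are hypothetical). New values are always encrypted under the current key version, while old key versions are preserved so that previously encrypted values remain decryptable.

```python
import itertools
import os

KEYS = {1: os.urandom(32)}  # data keys by version
CURRENT = 1                 # the most recent key encrypts new values

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # Placeholder reversible transform standing in for AES.
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(key)))

def encrypt(value: bytes) -> tuple:
    """Encrypt under the current key version; store the version alongside."""
    return CURRENT, xor_bytes(value, KEYS[CURRENT])

def rollover() -> None:
    """Generate a new data key on schedule; old keys are kept for decryption."""
    global CURRENT
    CURRENT += 1
    KEYS[CURRENT] = os.urandom(32)

def decrypt(version: int, ciphertext: bytes) -> bytes:
    """Decrypt under whichever key version originally encrypted the value."""
    return xor_bytes(ciphertext, KEYS[version])

old = encrypt(b"old value")
rollover()
new = encrypt(b"new value")
assert decrypt(*old) == b"old value"   # old key still decrypts old values
assert decrypt(*new) == b"new value"   # new values use the new key
assert old[0] != new[0]
```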
Immuta employs three families of functions in its masking policies:
One-way Hashing: One-way (irreversible) hashing is performed via a salted SHA256 hash. A consistent salt is used for values throughout the data source, so users can count or track the specific values without revealing the true value. Since hashed values are different across data sources, users are unable to join on hashed values. Note: Joining on masked values can be enabled in Immuta Projects.
Reversible Masking: For reversible masking, values are encrypted using AES-256 CBC encryption. Encryption is performed using a cell-specific initialization vector. The resulting values can be unmasked by an authorized user. Note that this is dynamic encryption of individual fields as results are streamed to the querying system; Immuta is not modifying records in the data store.
Reversible Format Preserving Masking: Format preserving masking maintains the format of the data while masking the value and is achieved by initializing and applying the NIST standard method FF1 at the column level. The resulting values can be unmasked by an authorized user.
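The first two function families can be sketched in Python (a minimal illustration of the general techniques, not Immuta's implementation; the third-party `cryptography` package supplies the AES primitive):

```python
import hashlib
import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

SALT = os.urandom(16)  # one consistent salt per data source
KEY = os.urandom(32)   # AES-256 data encryption key (would come from a KMS)

def hash_mask(value: str) -> str:
    """One-way masking: salted SHA-256. Equal values hash consistently
    within a data source, so counts and tracking still work; a different
    salt per data source prevents joins across sources."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

def encrypt_cell(plaintext: bytes) -> tuple:
    """Reversible masking: AES-256-CBC with a fresh, cell-specific IV."""
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()
    encryptor = Cipher(algorithms.AES(KEY), modes.CBC(iv)).encryptor()
    return iv, encryptor.update(padded) + encryptor.finalize()

def decrypt_cell(iv: bytes, ciphertext: bytes) -> bytes:
    """Unmasking for authorized users: reverse the cell's encryption."""
    decryptor = Cipher(algorithms.AES(KEY), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()

# Hashing is consistent within the source; encryption round-trips.
assert hash_mask("alice@example.com") == hash_mask("alice@example.com")
iv, ct = encrypt_cell(b"123-45-6789")
assert decrypt_cell(iv, ct) == b"123-45-6789"
```

Because the salt is per data source, two sources mask the same value to different hashes, which is why joining on hashed columns across sources fails unless enabled through a project.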