Immuta Detect is a tool that monitors your data environment and provides analytic dashboards in the Immuta UI based on your data use. These dashboards offer visualizations of audit events, including user queries and, when Discover classification is enabled, the sensitivity of those queries, data sources, and columns. Detect works within your current Immuta integration.
Immuta Detect continually monitors your data environment to help answer questions about your most active data users, the most accessed data, and the events happening within your data environment. With Discover classification enabled, Detect provides even more value by answering questions about the sensitive data your users access and the tables that contain sensitive data. With this information, your organization can do the following:
Meet compliance requirements more effectively
Quickly decide what data access is allowed for what purposes
Reduce the effort and time to respond to auditors about data access in your company
Reduce the effort of classifying data within the scope of security or regulatory compliance frameworks
Recommended: Use Discover classification when using a Snowflake integration.
You have the option to use Immuta Detect on its own or, if you are using a Snowflake integration, to enable Discover to classify your data. There are benefits to both, but for the fullest functionality, greatest value, and best experience, it is recommended to enable and tune classification.
Only available with Snowflake integrations.
Dashboards with data activity patterns for data sources and users
Dynamic query sensitivity on joined tables, which calculates sensitivity based on the columns queried and their toxicity when joined
Dashboards to help users find the most recently accessed data sources and active columns
Immuta Detect uses several features of the Immuta platform to create user-friendly dashboards that are always available in the UI and do not need to be generated like Immuta reports. These dashboards are created by combining Snowflake audit events from registered users and the sensitivity of your data. Audit information and events are gathered from the Snowflake ACCOUNT_USAGE views into Immuta Detect. Additionally, Immuta Discover calculates the sensitivity of your data using Immuta built-in frameworks: the Data Security Framework and Risk Assessment Framework, which find sensitive data on a column-by-column basis using tags applied by SDD. Once Immuta does this work behind the scenes, users with the AUDIT permission will see dashboards that show the sensitive data within your organization’s data environment and what users are accessing that data.
With Discover classification enabled, Immuta qualifies both columns and queries as the following sensitivity types in the dashboards:
Highly sensitive: Includes data that can cause severe harm or loss with inappropriate access or misuse.
Sensitive: Includes personal data and data that could cause harm or loss with inappropriate access or misuse.
Non-sensitive: Includes publicly available information or data that would not typically cause harm or loss if disclosed.
Indeterminate: The sensitivity of the data is unknown. Immuta deems sensitivity indeterminate because of an error in the query or because sensitive data discovery (SDD) or classification had not finished processing when the query was run.
How does Immuta determine column sensitivity?
Column sensitivity is determined by the classification tags applied to the columns by the frameworks. The classification tags contain sensitivity metadata.
How does Immuta determine query sensitivity?
For queries that read from a single table, query sensitivity is determined by the column with the highest sensitivity in the query.
For a query that joins tables, Immuta uses the same classification rules applied to tables and applies those rules to columns of the query. Immuta applies a new set of classification tags to the query columns and calculates sensitivity for the query event in the audit record. These query classification tags are not included on the tables' data dictionary.
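The single-table rule above can be sketched in a few lines of Python. This is an illustration only, not Immuta's implementation: the `Sensitivity` enum, its numeric ordering, and the `single_table_query_sensitivity` function are hypothetical names introduced here. The sketch does not cover joined queries or the Indeterminate type.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical ordering: a higher value means more sensitive.
    NON_SENSITIVE = 0
    SENSITIVE = 1
    HIGHLY_SENSITIVE = 2


def single_table_query_sensitivity(column_levels):
    """Return the highest sensitivity among the columns a query reads."""
    levels = list(column_levels)
    if not levels:
        # Assumption: with no classified columns there is nothing to rank.
        raise ValueError("no column sensitivity available")
    return max(levels)


# Hypothetical columns read by one query against a single table.
queried = {
    "region": Sensitivity.NON_SENSITIVE,
    "name": Sensitivity.SENSITIVE,
    "ssn": Sensitivity.HIGHLY_SENSITIVE,
}
result = single_table_query_sensitivity(queried.values())
print(result.name)
```

Under these assumptions, a query that reads even one highly sensitive column is treated as highly sensitive, regardless of how many non-sensitive columns it also reads.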
Quicker and easier onboarding experience
Dashboards with data activity patterns for data sources and users
Dashboards to help users find the most recently accessed data sources and active columns
Immuta Detect uses several features of the Immuta platform to create user-friendly dashboards that are always available in the UI and do not need to be generated like Immuta reports. These dashboards are created from audit information and events gathered from Snowflake, Databricks Spark, and Databricks Unity Catalog into Immuta Detect. Immuta pulls audit information from Snowflake and Databricks Spark for data sources and users registered in Immuta; for Databricks Unity Catalog, Immuta pulls in audit information for all users and tables. Users with the AUDIT permission will see dashboards that show the data events within your organization’s data environment and what users are accessing that data.
Immuta Detect provides at-a-glance dashboards to monitor change in user activity, data access, and security posture.
Immuta’s universal audit model (UAM) and export features allow you to export the full audit logs to S3 and ADLS for long-term backup and processing with log data processors and tools. This capability fosters convenient integrations with log monitoring services and data pipelines.
With UAM, you can specify an S3 bucket destination where Immuta will periodically export audit logs. The events captured are only events relevant to user and system actions that affect Immuta or the integrated data platforms, such as creating data sources and running queries.
The Detect dashboard shows Immuta events, such as logins, policy changes, and data platform policy changes, in near real time. Query events are ingested from Snowflake and Databricks once a day, but you can manually trigger an immediate query retrieval by using either the ↻Native Query Audit button or the Load Audit Events button on the Audit page. To update your automatic query retrieval, edit your integration.
The most recent query history that is available to Immuta Detect depends on the underlying data platform latency. For example, there is up to three hours of latency between an executed query and recording the event on the Snowflake data platform side.
Detect with Databricks Spark and Databricks Unity Catalog does not support using Discover classification to determine query sensitivity at this time.
Unity Catalog native query audit brings in audit information for all tables and data sources, so some audit logs are created from activity by users not registered in Immuta. These audit records appear in Immuta with the username Unknown and still provide valuable information about activity. They can be seen on the audit page or in the user and data activity dashboards.
While the Immuta user is unknown, the user's Databricks Unity Catalog username can be found within the audit log. To view the user's data platform username:
Navigate to the event page.
Select View JSON.
The username can be found in the auditPayload.technologyContext.account.username field.
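If you copy the JSON from the event page (or read it from an exported audit log), the same lookup can be scripted. The sketch below assumes only the field path named in the steps above; the sample event, the username value, and the `data_platform_username` helper are hypothetical, and a real audit record contains many more fields.

```python
import json

# Field path taken from the steps above.
USERNAME_PATH = ("auditPayload", "technologyContext", "account", "username")


def data_platform_username(event):
    """Walk the nested audit event and return the username, or None if absent."""
    node = event
    for key in USERNAME_PATH:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical, heavily trimmed audit event for illustration only.
raw = """
{
  "auditPayload": {
    "technologyContext": {
      "account": {"username": "jane.doe@example.com"}
    }
  }
}
"""
event = json.loads(raw)
username = data_platform_username(event)
print(username)
```

Returning `None` instead of raising keeps the helper safe to run over a mix of event types, since not every audit event is expected to carry a data platform account.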
To improve your future audit records, ensure these users are properly registered and can be named in the logs:
If you have not registered any users, pull in users from your IAM.
If you have registered users but this user was missed, manually create the Immuta user.
If this user is in Immuta but not appearing in the audit record, map the user's Databricks username into Immuta.
The Immuta Detect dashboards are the visual answers to the questions about who is accessing data. Additionally, with data classification enabled, they show how much of your data, and of the data being accessed, is sensitive. Immuta Detect offers several dashboards to help you find the information you need, and the viewer can filter each one or set it to a specific date range.
Note that data classification to determine sensitivity is not supported with the Databricks Spark or Databricks Unity Catalog integrations.
The overview dashboard is available in the data tab of the UI. It provides a quick overview of your global data environment that allows you to determine the number of queries and data sources at a glance and includes the following:
A graph of the number of queries on all data sources for a set time
The total active data sources in your environment
The total tags that were applied to the columns of the data sources in your environment within the selected time frame
The total number of queries for the set date range
The total number of users who have made queries during the dashboard's specified date range
A table of the most accessed data sources in your environment, the number of users who were accessing them, and the total queries on them
With classification enabled to determine the sensitivity, the dashboard will reflect the sensitivity of the queries and data:
The graph will reflect not just the number of queries but their sensitivity as well.
An activity indicator to visualize the sensitivity of the data within each data source is added to the table of the most accessed data sources.
Each data source has a dashboard available when you select the data source on the data tab of the UI. This dashboard allows you to quickly determine what data has been accessed in a single data source over time and includes the following:
A graph of the number of queries on the data source for a set time
The current active policies on the data source
The number of column tags that were applied to the data dictionary within the selected time frame
The total queries on the data source
The total number of users who have made queries on the data source during the dashboard's specified date range
A table of the most recent queries, the user making the queries, and the number of columns they saw in the query
Dashboards for Snowflake data sources will also include a table of the columns most actively queried.
With classification enabled to determine the sensitivity, the dashboard will reflect the sensitivity of the queries and columns:
The graph will reflect not just the number of queries but their sensitivity as well.
An activity indicator to visualize the sensitivity of the data within queries is added to the table of the most recent queries.
An activity indicator to visualize the sensitivity of the data within columns is added to the table of the most actively queried columns.
The audit dashboard is available in the audit tab of the UI. It provides a single table with all of the authentication and query audit events for your Immuta instance. Each row of the table represents a single audit event and includes the following:
A link to the individual event dashboard through the Event Id
The actor, which is the user who completed the event, with a link to their activity summary
The action, which is the event the audit record represents
The target, which is the data source that was affected by the event
The time the event happened
The outcome of whether the event was successful or not
With classification enabled to determine the sensitivity, information about the sensitivity of the events will be added to the dashboard through an activity indicator that will visualize the sensitivity of the events.
Query audits each have a dashboard when you select the event ID listed in the audit page. This dashboard allows you to quickly understand information about the data accessed by the query and includes the following:
Details about the audit event
Details about the actor or user who made the query including a link to their user page, their username, and the user agent where they accessed data
If the actor is not registered in Immuta and the actor is listed as "unknown," their data platform username is included.
Details about the query itself: the data platform, the text, how long the query took, and when the query was made
A data source tab with information on the data sources queried, the tags applied to the data sources, etc.
Dashboards for Snowflake query audits will also include a column tab with information on the data sources queried, the columns within the data sources, and the tags applied to the columns.
With classification enabled to determine the sensitivity, the dashboard will reflect the sensitivity of the query and data through a visualization of the sensitivity of the data found by the query.
The following events are captured in UAM and will also show a detailed audit page with information about the event, the users involved, and the targets affected by the event:
Attribute events
Data source events
License events
Purpose events
Tag events
User management events
Webhook events
The people overview dashboard is available in the people tab of the UI. It allows you to view when users are most active in your data environment and easily see spikes or anomalies. This dashboard includes the following:
A graph of how many users are active during specific times
The total number of queries
The total number of users who have made queries during the dashboard's specified date range
A table of the most active users in your data environment with details about the number of queries they are running.
With classification enabled to determine the sensitivity, the dashboard will reflect the sensitivity of the queries:
A count of the total number of sensitive queries
The addition of the number of sensitive queries to the most active users table
Each user has a dashboard when you select the full name from the people overview dashboard. This dashboard allows you to quickly determine how frequently users are querying data and includes the following:
A graph of the number of queries the user has made for a set time range
The total number of queries that user has made in the time range
The number of tables the user has queried
A table of the most recent queries the user has made, which will have different content based on the integration type:
For organizations using a Snowflake integration, the table will include a link to the audit event dashboards, a link to the data source dashboard, the number of rows and columns in each query, and a timestamp.
For organizations using a Databricks Spark or Databricks Unity Catalog integration, the table will include a link to the audit event dashboards, a link to the data source dashboard, and a timestamp.
With classification enabled to determine the sensitivity, the dashboard will reflect the sensitivity of the queries:
The graph will reflect not just the number of queries but their sensitivity as well.
A count of the total number of sensitive queries is added to the dashboard.
An activity indicator to visualize the sensitivity of the data within queries is added to the table of the most recent queries.