Immuta Data Access Patterns

Audience: Data Owners, Data Users, and System Administrators

Content Summary: The Immuta data control plane does not require users to learn a new API or language to access the data it exposes. Immuta plugs into existing tools and ongoing work and stays invisible to downstream consumers by exposing data through the access patterns described on this page.

Azure Synapse Analytics

The Dynamic Azure Synapse Analytics access pattern allows Immuta to apply policies directly in Azure Synapse Analytics dedicated SQL pools, without users needing to go through the Immuta Query Engine. Users can keep working in Synapse Studio and have per-user policies applied dynamically at query time.

Users can configure multiple native Azure Synapse Analytics integrations, which give users direct access to views in a dedicated SQL pool in Synapse Studio.

Databricks

This native integration makes Databricks data sources exposed in Immuta available as tables in a Databricks cluster, so users can query these data sources from their notebooks. Like other integrations, policies are applied to the plan that Spark builds for a user's query, and all data access is native.

Databricks SQL (Public Preview)

Databricks SQL provides a simple experience for SQL users who want to run quick ad hoc queries on their data lake, create multiple visualization types to explore query results from different perspectives, and build and share dashboards.

Immuta's native Databricks SQL integration creates policy-enforced views that users can access in their Databricks SQL environment.

For an overview of this access pattern and its architecture, see the Databricks SQL Integration page. For installation instructions, see the Databricks SQL Integration tutorial.

Immuta Query Engine

Users are provided a basic Immuta PostgreSQL connection. The tables within this connection represent all the connected data across your organization. Those tables, however, are virtual tables that stay empty until a query is run. At query time, the SQL is proxied through the virtual Immuta table down to the native database, and the policy is enforced automatically. The Immuta SQL connection can be used within any Business Intelligence (BI) tool or integrated directly into code for interactive analysis.
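Because the Query Engine is exposed as a PostgreSQL connection, any PostgreSQL client can in principle connect to it with a standard connection URI. The following is a minimal sketch of building such a URI in Python. The host name, port, database name, and credentials are all hypothetical placeholders, not values taken from this page; use the connection details shown for your own Immuta instance.

```python
from urllib.parse import quote


def immuta_pg_uri(user, password, host, port=5432, database="immuta"):
    """Build a libpq-style connection URI.

    The default port and database name are assumptions for illustration.
    User name and password are percent-encoded so that characters such as
    '@' or '/' do not break the URI.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode=require"
    )


# Hypothetical credentials and host, for illustration only.
uri = immuta_pg_uri("analyst@example.com", "s3cret/pw", "immuta.example.com")
print(uri)
```

A URI of this form can be passed to a PostgreSQL driver or pasted into a BI tool's PostgreSQL connector. Queries then run against the virtual tables as they would against any PostgreSQL database.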

Redshift

With the native Redshift access pattern, Immuta applies policies directly in Redshift. This allows data analysts to query their data natively in Redshift instead of going through the Immuta Query Engine.

S3

Immuta supports an S3-style REST API, which allows users to communicate with Immuta the same way they would with S3. Consequently, Immuta easily integrates with tools users may already be using to work with S3.

S3 Access in Spark and Databricks

Immuta supports accessing object-backed data sources using an is3a file system with Spark and Databricks.

Snowflake

The Snowflake integration differs based on your Snowflake edition:

  • Snowflake Integration Using Snowflake Governance Features: With this integration, policies administered in Immuta are pushed down into Snowflake as Snowflake Governance features (row access policies and masking policies). This integration is in Public Preview and requires Snowflake Enterprise Edition or higher.
  • Snowflake Integration Without Snowflake Governance Features: With this integration, policies administered in Immuta are pushed down into Snowflake as views with a 1-to-1 relationship to the original table, and all policy logic is contained in that view.
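The view-based approach in the second bullet can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module rather than Snowflake, and it is not the SQL that Immuta generates. The table names, the entitlements table, and the session table standing in for a "current user" function are all hypothetical. The point is only to show one view per table, with the row filter and the masking both living inside the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, region TEXT, email TEXT);
    INSERT INTO customers VALUES
        (1, 'EU', 'ana@example.com'),
        (2, 'US', 'bob@example.com');

    -- Hypothetical entitlement table: which regions each user may see.
    CREATE TABLE entitlements (username TEXT, region TEXT);
    INSERT INTO entitlements VALUES ('analyst', 'EU');

    -- Stand-in for the database's "current user" function.
    CREATE TABLE session (username TEXT);
    INSERT INTO session VALUES ('analyst');

    -- One view per table: the row filter and the masking live in the view.
    CREATE VIEW customers_secure AS
    SELECT c.id, c.region, 'REDACTED' AS email
    FROM customers AS c
    WHERE c.region IN (
        SELECT e.region
        FROM entitlements AS e
        JOIN session AS s ON s.username = e.username
    );
""")

# Users query the view, never the underlying table.
rows = conn.execute("SELECT * FROM customers_secure").fetchall()
print(rows)
```

In this sketch the analyst sees only the EU row, with the email column masked, while the underlying table is left unchanged.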

SparkSQL

Users can access subscribed data sources within their Spark jobs by using Spark SQL with the ImmutaContext class. All tables are virtual and are not populated until a query is materialized. When a query is materialized, data from metastore-backed data sources, such as Hive and Impala, is read from the underlying files using standard Spark libraries. All other data source types access data through the Query Engine, which proxies the query to the native database technology. Policies for each data source are enforced automatically.

Trino

The Trino (previously PrestoSQL) access pattern enables Immuta to apply policies directly in Starburst and other Trino-based clusters without users going through the Immuta Query Engine. This means users can use their existing Trino tooling (querying, reporting, etc.) and have per-user policies dynamically applied at query time.

dbt Cloud Integration

While it is not a data access pattern, Immuta's dbt Cloud integration allows users to connect their data sources from various access patterns into Immuta using dbt Cloud. Integrating your data sources through dbt Cloud allows Immuta to keep your data sources in sync, while also populating the data source details through jobs run in dbt Cloud.