Databricks Access Pattern

Audience: Data Owners and Data Users

Content Summary: This page provides an overview of the Databricks access pattern.

Installation Details: If you are using the Immuta free trial, please configure Databricks using the Databricks quickstart button in the left sidebar of your Immuta instance:

[Figure: Databricks Quickstart Button]

Otherwise, see the Databricks Installation Guide.

Overview

This native integration makes Databricks data sources exposed in Immuta available as tables directly in Databricks, under the immuta database on the cluster; users can then query these data sources from their notebooks. As with other integrations, policies are applied to the plan that Spark builds for a user's query and are enforced live on the cluster. No new copy of the data is created.

Using Immuta with Databricks

Mapping Users

Usernames in Immuta must match usernames in Databricks. It is best practice to use the same identity manager for Immuta that you use for Databricks (Immuta supports all common identity manager protocols); however, for free trial users, it is easiest to simply ensure that usernames match between the two systems.
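As a rough illustration of the matching requirement, the sketch below compares two username lists and reports any that appear in only one system. The function name and the sample usernames are hypothetical, and the comparison is an exact, case-sensitive string match, which is an assumption rather than documented Immuta behavior. How you export the lists from each system is left open.

```python
def unmatched_usernames(immuta_users, databricks_users):
    """Return usernames that exist in only one of the two systems."""
    immuta_set = set(immuta_users)
    databricks_set = set(databricks_users)
    return {
        # Present in Immuta but missing from Databricks.
        "only_in_immuta": sorted(immuta_set - databricks_set),
        # Present in Databricks but missing from Immuta.
        "only_in_databricks": sorted(databricks_set - immuta_set),
    }


# Hypothetical sample data; replace with lists exported from each system.
report = unmatched_usernames(
    ["ana@example.com", "li@example.com"],
    ["ana@example.com", "sam@example.com"],
)
print(report)
```

Both lists in the report being empty means every username has a counterpart in the other system.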

Configuring Tables

The cluster must have Databricks table ACLs enabled, so that only administrators have access to the original tables or views. For example, because of table ACLs, the following table would be available only to the cluster administrator:

[Figure: Table ACLs]

Then, using the Immuta UI (or API), users would register this taxi_trip table in Immuta. For more details, see this Databricks Data Source Creation Tutorial.

Note: In the Immuta July 2020 release, we will no longer require table ACLs or the separate immuta database; you will be able to manage and query the tables in place in their original database, without changes to downstream queries and without that first manual GRANT step.

Database Access

After the data source is created in Immuta and you have granted users access (through table ACLs) to the immuta database, users can access that table directly in Databricks. This grant is a one-time step that opens the database; actual table controls (described below) are managed through Immuta. For example, to provide access to the immuta database to all users, you would run the following command (you can also do this for individual users):

[Figure: Database Access]
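A minimal sketch of such a grant, assuming the Databricks table ACL SQL syntax (`GRANT USAGE ON DATABASE ... TO ...`) and the built-in `users` group that contains all workspace users. The helper name and the example email address are hypothetical; the exact privilege your cluster requires may differ.

```python
def grant_usage_sql(database: str, principal: str) -> str:
    """Build a GRANT USAGE statement for a table ACL enabled cluster."""
    # Backticks quote the identifiers so that group names and
    # email-style usernames parse correctly.
    return f"GRANT USAGE ON DATABASE `{database}` TO `{principal}`"


# Grant to every user via the `users` group (assumed to exist).
all_users_stmt = grant_usage_sql("immuta", "users")
# Grant to a single, hypothetical user instead.
one_user_stmt = grant_usage_sql("immuta", "analyst@example.com")
print(all_users_stmt)

# In a Databricks notebook, where `spark` is predefined, an administrator
# could then run:
# spark.sql(all_users_stmt)
```

Building the statement separately from running it keeps the sketch runnable outside a cluster and makes the generated SQL easy to review before an administrator executes it.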

Table Access

Users can access tables once they are provided access to the data source through Immuta Subscription Policies configured in the UI. Additionally, users can be manually added to the data source from the Members tab.
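Once subscribed, a user would query the table through the immuta database from a notebook. The sketch below builds such a query for the taxi_trip example, assuming the data source is exposed under the table name `taxi_trip`; the helper name and the row limit are hypothetical.

```python
def immuta_table_query(table: str, limit: int = 10) -> str:
    """Build a SELECT against a table exposed in the immuta database."""
    # Reject anything that is not a plain identifier to avoid
    # accidentally building malformed SQL.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT * FROM immuta.{table} LIMIT {int(limit)}"


query = immuta_table_query("taxi_trip")
print(query)

# In a Databricks notebook, where `spark` and `display` are predefined:
# display(spark.sql(query))
```

A user who has not been subscribed to the data source in Immuta should not expect this query to return the table's data.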

Fine-grained Access Control

After the data source has been created in Immuta, you can build other Data Policies on the table, such as column masking or row-level security, to restrict what each user sees.
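To make the two policy types concrete, here is a purely conceptual sketch in plain Python. It is not how Immuta implements policies (those are applied to the Spark query plan); hashing with SHA-256 stands in for one possible masking technique, and the column names and sample rows are invented for illustration.

```python
import hashlib


def mask_value(value: str) -> str:
    """Replace a value with its SHA-256 hex digest (illustrative masking)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def apply_policies(rows, masked_columns, row_filter):
    """Return only the rows the filter allows, with selected columns masked."""
    visible = []
    for row in rows:
        # Row-level security: skip rows the user may not see.
        if not row_filter(row):
            continue
        # Column masking: hash the values of the protected columns.
        visible.append({
            key: mask_value(str(val)) if key in masked_columns else val
            for key, val in row.items()
        })
    return visible


# Invented sample rows, not real data.
trips = [
    {"driver_id": "D1", "borough": "Queens", "fare": 12.5},
    {"driver_id": "D2", "borough": "Bronx", "fare": 8.0},
]
result = apply_policies(trips, {"driver_id"}, lambda r: r["borough"] == "Queens")
print(len(result), "row(s) visible")
```

In this sketch the user sees only the Queens trip, and its driver_id appears as a hash rather than the original value.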

Limitations

  • Databricks Connect is not currently supported.