Native Databricks SQL Analytics Integration (Public Preview)

Audience: System Administrators

Content Summary: This page provides a tutorial for enabling the native Databricks SQL Analytics integration in Immuta. For an overview of the access pattern, see the Data Access Patterns documentation. Native SQL with Databricks SQL Analytics is currently in Public Preview. Please provide feedback on any issues you encounter, as well as insight regarding how you would like this feature to evolve in the future.

Prerequisites

  • A functional Databricks SQL Analytics environment: For guidance in setting up and using a Databricks SQL Analytics environment, see the Get started with Databricks SQL guide in the Databricks documentation.
  • Databricks personal access token: Your organization's SQL Analytics administrator must generate a Databricks personal access token that allows users to authenticate to the Databricks REST API and allows Immuta to connect to SQL endpoints. Databricks displays this token only once, so be sure to copy and save it. A token generated by anyone other than an administrator does not carry the privileges Immuta needs to create the Immuta database inside Databricks SQL Analytics; enabling the integration with such a token fails, and an error is displayed in the Immuta UI.

1 - Enable Databricks SQL Analytics in Immuta

  1. Log in to Immuta and click the App Settings icon in the left sidebar.
  2. Click Native Integrations in the Configuration panel on the left.

  3. Click + Add Native Integration and select Databricks SQL Analytics (Public Preview) from the dropdown menu.

  4. In Databricks, navigate to the Databricks SQL Analytics page in your Databricks workspace, click Endpoints, and then click the name of the SQL Analytics endpoint you want to configure in Immuta.

  5. Use the information on the Connection Details page to fill in the following information in the Immuta UI:

    • Host: Use the Server Hostname from Databricks (e.g., https://company.cloud.databricks.com)
    • HTTP Path: Use the HTTP Path from Databricks (e.g., /sql/1.0/endpoints/fff6d6eb3a9718cf9)
  6. The value in the Immuta Database field is the name of the database that Immuta creates in Databricks SQL Analytics. You can change the default name, provided the new name does not collide with an existing database in your Databricks environment.

  7. Enter the personal access token that was generated by a SQL Analytics administrator (not a user), and then click Test Databricks SQL Analytics Connection.

  8. Click Save. Note that if the personal access token you entered was generated by a SQL Analytics user rather than an administrator, the configuration will not save successfully.
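Copy-and-paste mistakes in the Host and HTTP Path fields are a common reason for a failed connection test. The following is a minimal sketch, not part of Immuta or Databricks, of a pre-check you could run on the two values before entering them. The function name check_connection_details and the rule that the path ends in endpoints/<endpoint-id> are assumptions inferred from the examples in step 5.

```python
from urllib.parse import urlparse


def check_connection_details(host: str, http_path: str) -> list[str]:
    """Return a list of problems found in the Host and HTTP Path values.

    Hypothetical helper: the checks are inferred from the example values
    in this tutorial, not from an official specification.
    """
    problems = []

    # Accept the host with or without a scheme, as long as a hostname is present.
    parsed = urlparse(host if "://" in host else f"//{host}")
    if not parsed.hostname:
        problems.append("Host is missing a hostname")
    if host != host.strip():
        problems.append("Host has leading or trailing whitespace")

    # The HTTP Path is expected to be an absolute path whose last two
    # segments are "endpoints" and a non-empty endpoint ID.
    if not http_path.startswith("/"):
        problems.append("HTTP Path should start with '/'")
    segments = http_path.split("/")
    if len(segments) < 3 or segments[-2] != "endpoints" or not segments[-1].strip():
        problems.append("HTTP Path should end with 'endpoints/<endpoint-id>'")
    if http_path != http_path.strip():
        problems.append("HTTP Path has leading or trailing whitespace")

    return problems
```

An empty list means that no obvious formatting problem was found. It does not guarantee that the endpoint exists or that the token is valid, so the Test Databricks SQL Analytics Connection button remains the authoritative check.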

Once Databricks SQL Analytics has been successfully enabled in Immuta, Immuta will perform the following automated tasks:

  • Create an Immuta database.
  • Revoke all privileges from users on the Immuta database.
  • Grant usage and select privileges to users on the Immuta database.
  • Create a system table on the Immuta database called <immuta_database_name>.__immuta_profiles.
  • Deny SELECT on <immuta_database_name>.__immuta_profiles to users.
  • Create a view called <immuta_database_name>.__immuta_profiles, which is equivalent to SELECT * FROM <immuta_database_name>.__immuta_profiles WHERE immuta__userid = current_user.
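To illustrate the effect of the WHERE immuta__userid = current_user filter in the last item, the sketch below reproduces it locally with Python's built-in sqlite3 module. It is only an analogy under stated assumptions: SQLite stands in for Databricks SQL Analytics, the profile column and the sample rows are hypothetical (the real schema of the system table is not described here), and the current user is passed as a query parameter because SQLite has no current_user function.

```python
import sqlite3

# In-memory stand-in for the system table; the "profile" column is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE __immuta_profiles (immuta__userid TEXT, profile TEXT)")
conn.executemany(
    "INSERT INTO __immuta_profiles VALUES (?, ?)",
    [
        ("alice@example.com", "profile-a"),
        ("bob@example.com", "profile-b"),
    ],
)


def visible_profiles(current_user: str) -> list[tuple]:
    """Return the rows a given user would see through the filtered view."""
    return conn.execute(
        "SELECT * FROM __immuta_profiles WHERE immuta__userid = ?",
        (current_user,),
    ).fetchall()
```

Calling visible_profiles("alice@example.com") returns only the row for that user, and a user with no entry in the table gets an empty result. Combined with the DENY SELECT on the underlying table, this is why each user can read only their own profile row.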

2 - Add Databricks SQL Analytics Users

Add your SQL Analytics user accounts in Databricks SQL Analytics and give them access to the SQL Analytics endpoint as you normally would in Databricks.

3 - Add a Databricks Data Source in Immuta

Warning

Immuta requires an underlying data source in SQL Analytics to have an owner. To test if an object has an owner, run SHOW GRANT ON <object-name>. If you do not see an entry with ActionType OWN, the object does not have an owner. When table access control is disabled on a cluster or SQL endpoint, owners are not registered when a database, table, or view is created. You must either enable table access control on your cluster and SQL endpoint, or an admin must assign an owner to the object.

To assign an owner to the object, run the following command:

ALTER TABLE <object-name> OWNER TO `<user-name>@<user-domain>.com`;
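If you need to check many objects, the ownership test can be scripted. The sketch below assumes you have already run SHOW GRANT ON <object-name> with whichever SQL client you use and collected the result as a list of dictionaries keyed by column name. The ActionType column name comes from the warning above; the function names and the sample values are hypothetical.

```python
def has_owner(grant_rows: list[dict]) -> bool:
    """Return True if any SHOW GRANT row has ActionType OWN."""
    return any(row.get("ActionType") == "OWN" for row in grant_rows)


def owner_statement(object_name: str, owner: str) -> str:
    """Build the ALTER TABLE ... OWNER TO statement shown above."""
    # The owner is wrapped in backticks, so it must not contain one itself.
    if "`" in owner:
        raise ValueError("owner must not contain a backtick")
    return f"ALTER TABLE {object_name} OWNER TO `{owner}`;"


# Sample rows shaped like a SHOW GRANT result (values are illustrative).
rows = [{"Principal": "users", "ActionType": "SELECT"}]
if not has_owner(rows):
    print(owner_statement("sales.orders", "admin@example.com"))
```

The script only builds the statement; an admin still has to run it in Databricks SQL Analytics.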

To add Databricks data sources in Immuta, follow this tutorial.