Getting Started with Databricks Spark
The how-to guides linked on this page illustrate how to integrate Databricks Spark with Immuta.
Requirements
If Databricks Unity Catalog is enabled in a Databricks workspace, you must use an Immuta cluster policy when you set up the Databricks Spark integration to create an Immuta-enabled cluster.
If Databricks Unity Catalog is not enabled in your Databricks workspace, you must disable Unity Catalog in your Immuta tenant before proceeding with your configuration of Databricks Spark:
1. Navigate to the App Settings page and click Integration Settings.
2. Uncheck the Enable Unity Catalog checkbox.
3. Click Save.
Connect your technology
These guides provide instructions for getting your data set up in Immuta.
Organize your data sources into domains and assign domain permissions to accountable teams (recommended): Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and sensitive data discovery.
Register your users
These guides provide instructions on setting up your users in Immuta.
Integrate an IAM with Immuta: Connect the IAM your organization already uses and allow Immuta to register your users for you.
Map external user IDs from Databricks to Immuta: Ensure the user IDs in Immuta, Databricks, and your IAM are aligned so that the right policies impact the right users.
Add data metadata
These guides provide instructions on getting your data metadata set up in Immuta for use in policies.
Connect an external catalog: Connect the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.
Run sensitive data discovery: Sensitive data discovery (SDD) allows you to automate data tagging using identifiers that detect certain data patterns.
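To illustrate the pattern-matching idea behind SDD identifiers, here is a minimal sketch in plain Python. The regex, function name, and 90% threshold are assumptions for illustration only; Immuta's built-in identifiers are more sophisticated than this.

```python
import re

# Hypothetical identifier: a regex that flags values resembling email
# addresses. Chosen only to illustrate how an identifier can tag a
# column based on a sample of its values.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def detect_email(values, threshold=0.9):
    """Tag a column as 'email' if most sampled values match the pattern."""
    matches = sum(1 for v in values if EMAIL_PATTERN.match(v))
    return matches / len(values) >= threshold

sample = ["alice@example.com", "bob@example.org", "not-an-email"]
print(detect_email(sample))  # 2 of 3 values match, below the 0.9 threshold
```

A real identifier would typically combine pattern matching with column-name hints and confidence scoring before applying a tag.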
Protect and monitor data access
These guides provide instructions on authoring policies and auditing data access.
Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that use your tags and apply to your tables. Subscription policies dictate who can access a data source.
Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and SDD tags, you can author policies proactively, knowing that they will apply to new data sources as automated tagging adds them to Immuta.
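The effect of a row-level data policy can be sketched as follows. This is a plain-Python illustration under an assumed rule ("only show rows whose country matches the user's country attribute"); the table, attribute name, and helper function are hypothetical, and Immuta actually enforces policies inside the query plan rather than in application code.

```python
# Hypothetical table and user attributes, for illustration only.
orders = [
    {"order_id": 1, "country": "US", "amount": 100},
    {"order_id": 2, "country": "DE", "amount": 250},
    {"order_id": 3, "country": "US", "amount": 75},
]

def apply_row_policy(rows, user_attributes):
    """Keep only rows whose country matches the querying user's attribute."""
    allowed = user_attributes.get("country")
    return [r for r in rows if r["country"] == allowed]

visible = apply_row_policy(orders, {"country": "US"})
print([r["order_id"] for r in visible])  # [1, 3]
```

The point of authoring such policies globally with tags is that the same rule applies automatically to any newly registered table carrying the relevant tag, with no per-table work.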
Configure audit: Once you have registered your data sources and users and created policies granting access, you can set up audit export, which exports audit logs of user queries, policy changes, and tagging updates.