Getting Started with Databricks Spark
The how-to guides linked on this page illustrate how to integrate Databricks Spark with Immuta.
Requirements
If Databricks Unity Catalog is enabled in a Databricks workspace, you must use an Immuta cluster policy when you set up the Databricks Spark integration to create an Immuta-enabled cluster.
If Databricks Unity Catalog is not enabled in your Databricks workspace, you must disable Unity Catalog in your Immuta tenant before proceeding with your configuration of Databricks Spark:
1. Navigate to the App Settings page and click Integration Settings.
2. Uncheck the Enable Unity Catalog checkbox.
3. Click Save.
Connect your technology
These guides provide instructions for getting your data set up in Immuta.
Organize your data sources into domains and assign domain permissions to accountable teams (recommended): Use domains to segment your data and assign responsibilities to the appropriate team members. These domains will then be used in policies, audit, and sensitive data discovery.
Register your users
These guides provide instructions on setting up your users in Immuta.
Integrate an IAM with Immuta: Connect the IAM your organization already uses and allow Immuta to register your users for you.
Map external user IDs from Databricks to Immuta: Ensure the user IDs in Immuta, Databricks, and your IAM are aligned so that the right policies impact the right users.
Add data metadata
These guides provide instructions on getting your data metadata set up in Immuta for use in policies.
Connect an external catalog: Connect the external catalog your organization already uses and allow Immuta to continually sync your tags with your data sources for you.
Run sensitive data discovery: Sensitive data discovery (SDD) allows you to automate data tagging using identifiers that detect certain data patterns.
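To illustrate the pattern-matching idea behind SDD identifiers, here is a minimal sketch in plain Python. The regex, function name, and 90% threshold are assumptions for illustration only; Immuta's built-in identifiers are more sophisticated than this.

```python
import re

# Hypothetical identifier: a regex that flags values resembling email
# addresses. Chosen only to illustrate how an identifier can tag a
# column based on a sample of its values.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def detect_email(values, threshold=0.9):
    """Tag a column as 'email' if most sampled values match the pattern."""
    matches = sum(1 for v in values if EMAIL_PATTERN.match(v))
    return matches / len(values) >= threshold

sample = ["alice@example.com", "bob@example.org", "not-an-email"]
print(detect_email(sample))  # 2 of 3 values match, below the 0.9 threshold
```

A real identifier would typically combine pattern matching with column-name hints and confidence scoring before applying a tag.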
Protect and monitor data access
These guides provide instructions on authoring policies and auditing data access.
Author a global subscription policy: Once you add your data metadata to Immuta, you can immediately create policies that use your tags and apply to your tables. Subscription policies dictate who can access a data source.
Author a global data policy: Data metadata can also be used to create data policies that apply to data sources as they are registered in Immuta. Data policies dictate what data a user can see once they are granted access to a data source. Using catalog and SDD tags, you can author policies proactively, knowing that they will apply to new data sources as automated tagging adds them to Immuta.
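The effect of a row-level data policy can be sketched as follows. This is a plain-Python illustration under an assumed rule ("only show rows whose country matches the user's country attribute"); the table, attribute name, and helper function are hypothetical, and Immuta actually enforces policies inside the query plan rather than in application code.

```python
# Hypothetical table and user attributes, for illustration only.
orders = [
    {"order_id": 1, "country": "US", "amount": 100},
    {"order_id": 2, "country": "DE", "amount": 250},
    {"order_id": 3, "country": "US", "amount": 75},
]

def apply_row_policy(rows, user_attributes):
    """Keep only rows whose country matches the querying user's attribute."""
    allowed = user_attributes.get("country")
    return [r for r in rows if r["country"] == allowed]

visible = apply_row_policy(orders, {"country": "US"})
print([r["order_id"] for r in visible])  # [1, 3]
```

The point of authoring such policies globally with tags is that the same rule applies automatically to any newly registered table carrying the relevant tag, with no per-table work.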
Configure audit: Once you have registered your data sources and users and created policies granting access, you can set up audit export, which exports audit logs of user queries, policy changes, and tagging updates.