Getting Started with Databricks Unity Catalog

The how-to guides linked on this page illustrate how to integrate Databricks Unity Catalog with Immuta and gain value from the Immuta modules: Detect, Discover, and Secure.

While each module can be used on its own, together they form a complete data security platform: Discover identifies which data types and sensitive data need to be secured, Secure protects that data through governance policies, and Detect monitors your users' activity once the data is secure so that risky access is caught and addressed through better policies. Complete all the sections below to onboard with all three modules, or see the Detect use case as an entry point to configuring Immuta.

Requirements:

  • Unity Catalog metastore created and attached to a Databricks workspace. Immuta supports configuring a single metastore for each configured integration, and that metastore may be attached to multiple Databricks workspaces.
  • Unity Catalog enabled on your Databricks cluster or SQL warehouse. All SQL warehouses have Unity Catalog enabled if your workspace is attached to a Unity Catalog metastore.

Configure your Databricks Unity Catalog integration

Configuring a Databricks Unity Catalog integration is required for Detect, Discover, and Secure. The guides below cover the recommended features to enable with Databricks Unity Catalog. For a comprehensive guide on the benefits of these features and other recommendations, see the Detect use case.

  1. Configure your Unity Catalog integration with the following feature enabled: Native query audit (enabled by default).
  2. Select None as your default subscription policy.
  3. Integrate an IAM with Immuta.
  4. Map external user IDs from Unity Catalog to Immuta.

Detect your user activity

These guides provide step-by-step instructions for auditing and detecting your users' activity. For a comprehensive guide on the benefits of these features and other recommendations, see the Detect use case.

  1. Set up audit export to S3 or ADLS Gen2 for your Databricks Unity Catalog audit logs.
  2. View the Detect dashboards to see the activity of your users on Databricks Unity Catalog tables.
  3. Set up monitors to track user activity and send notifications when that activity passes a certain threshold.
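
The threshold idea behind monitors in step 3 can be illustrated with a minimal sketch. This is not Immuta's monitor implementation or API; the `AuditEvent` record, the `users_over_threshold` function, and the sample data are hypothetical and only show the general pattern of counting activity per user and flagging anyone above a limit.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class AuditEvent(NamedTuple):
    # Hypothetical, simplified audit record; real audit logs carry more fields.
    user: str
    table: str


def users_over_threshold(events: Iterable[AuditEvent], threshold: int) -> Dict[str, int]:
    """Return users whose event count exceeds the threshold, with their counts."""
    counts = Counter(event.user for event in events)
    return {user: n for user, n in counts.items() if n > threshold}


# Illustrative sample data only.
events = [
    AuditEvent("alice", "sales.orders"),
    AuditEvent("alice", "sales.customers"),
    AuditEvent("alice", "hr.salaries"),
    AuditEvent("bob", "sales.orders"),
]

flagged = users_over_threshold(events, threshold=2)
print(flagged)  # {'alice': 3}
```

In practice, the monitors you configure in Immuta do this work for you; the sketch is only meant to clarify what "passes a certain threshold" means.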

Discover your data

These guides provide step-by-step instructions for discovering, classifying, and tagging your data.

  1. Enable sensitive data discovery (SDD).
  2. Register a subset of your tables to configure and validate SDD.
  3. Configure SDD to discover entities of interest for your policy needs.
  4. Validate that the SDD tags are applied correctly.
  5. Register your remaining tables at the schema level with schema monitoring turned on.
  6. Implement classification to categorize and tag sensitive data.
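
As a rough illustration of the validation in step 4, the sketch below compares the tags you expect on each column against the tags that were actually applied. It assumes you have already collected both sets yourself; the function name `find_tag_mismatches`, the column names, and the tag names are all hypothetical placeholders, not Immuta identifiers.

```python
from typing import Dict, Set

TagMap = Dict[str, Set[str]]


def find_tag_mismatches(expected: TagMap, applied: TagMap) -> Dict[str, Dict[str, Set[str]]]:
    """Compare expected and applied tags per column and report differences."""
    mismatches = {}
    for column in sorted(expected.keys() | applied.keys()):
        want = expected.get(column, set())
        have = applied.get(column, set())
        missing = want - have      # expected but not applied
        unexpected = have - want   # applied but not expected
        if missing or unexpected:
            mismatches[column] = {"missing": missing, "unexpected": unexpected}
    return mismatches


# Hypothetical column names and tag names, for illustration only.
expected_tags = {
    "customers.email": {"email"},
    "customers.phone": {"phone_number"},
    "customers.notes": set(),
}
applied_tags = {
    "customers.email": {"email"},
    "customers.phone": set(),
    "customers.notes": {"person_name"},
}

for column, diff in find_tag_mismatches(expected_tags, applied_tags).items():
    print(column, diff)
```

Running a check like this on the small subset of tables registered in step 2 keeps the validation manageable before you register the remaining tables in step 5.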

Secure your data

These guides provide step-by-step instructions for configuring and securing your data with governance policies. For a comprehensive guide on creating policies to fit your organization's use case, see the Secure use cases.

  1. Create a global subscription policy.
  2. Create a global data policy.
  3. Validate the policies. You do not have to validate every policy you create in Immuta; instead, examine a few to validate the behavior you expect to see.
  4. Once all Immuta policies are in place, remove or alter old permissions and revoke access to the ungoverned tables.
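
For step 4, one possible way to prepare the cleanup is to generate Databricks SQL `REVOKE` statements from a list of legacy grants and review them before running anything. The sketch below is an assumption about how you might organize that work, not an Immuta feature: the `Grant` record, the `build_revoke_statements` function, and every principal and table name are hypothetical, and the code assumes plain identifiers that need no extra escaping.

```python
from typing import Iterable, List, NamedTuple


class Grant(NamedTuple):
    # Hypothetical record describing one legacy permission.
    principal: str
    privilege: str
    table: str  # three-level name: catalog.schema.table


def build_revoke_statements(
    grants: Iterable[Grant], keep_principals: Iterable[str] = ()
) -> List[str]:
    """Build one REVOKE statement per grant, skipping principals to keep."""
    keep = set(keep_principals)
    return [
        f"REVOKE {g.privilege} ON TABLE {g.table} FROM `{g.principal}`;"
        for g in grants
        if g.principal not in keep
    ]


# Illustrative data only; the account name below is a placeholder.
legacy_grants = [
    Grant("analysts", "SELECT", "main.sales.orders"),
    Grant("immuta_system_account", "SELECT", "main.sales.orders"),
]

statements = build_revoke_statements(
    legacy_grants, keep_principals=["immuta_system_account"]
)
for statement in statements:
    print(statement)
# REVOKE SELECT ON TABLE main.sales.orders FROM `analysts`;
```

Generating the statements as text first lets you review them, and exclude any principal the integration itself depends on, before executing them in Databricks.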