Connect Data to Your Cluster
Immuta clusters use the configured metastore owner's personal access token (PAT) to interact with the Unity Catalog metastore. Before you register a table as a data source in Immuta, access to its catalog, schema, and the table itself must be granted to the configured Unity Catalog metastore owner so that the table is visible to Immuta. Use one of two methods:
Automatically grant access to everything with Privilege Model 1.0. Immuta recommends upgrading the Privilege Model for Unity Catalog to 1.0. This upgrade lets administrators and owners grant access to everything in a given catalog or schema with a single grant statement. See the Databricks documentation for instructions on enabling Privilege Model 1.0.
Manually grant access to specific tables.
Automatically Grant Access in Privilege Model 1.0
Automatically grant select access to everything in a catalog by running the SQL statement below as the metastore owner or catalog owner:
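A minimal sketch of such a grant in Databricks SQL, assuming Privilege Model 1.0 is enabled. The catalog name `my_catalog` and the principal `metastore_owner@example.com` are hypothetical placeholders; substitute the catalog being registered and the user whose PAT Immuta is configured with.

```sql
-- Hypothetical names: replace my_catalog and the principal with your own.
-- Under Privilege Model 1.0, privileges granted on a catalog are inherited
-- by the schemas and tables inside it, so one statement covers everything.
GRANT USE CATALOG, USE SCHEMA, SELECT
ON CATALOG my_catalog
TO `metastore_owner@example.com`;
```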
Manually Grant Access
If you are not using Privilege Model 1.0, manually grant access to specific tables by running the SQL statements below as the administrator or table owner:
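As a rough sketch, the manual grants might look like the following. It assumes the earlier privilege model, in which `USAGE` is the privilege name for catalogs and schemas and grants are not inherited, so each level is granted separately. All object names and the principal are hypothetical placeholders.

```sql
-- Hypothetical names throughout. Without privilege inheritance, each level
-- of the hierarchy (catalog, schema, table) needs its own grant.
GRANT USAGE ON CATALOG my_catalog TO `metastore_owner@example.com`;
GRANT USAGE ON SCHEMA my_catalog.my_schema TO `metastore_owner@example.com`;
GRANT SELECT ON TABLE my_catalog.my_schema.my_table TO `metastore_owner@example.com`;
```

Repeat the `GRANT SELECT ON TABLE` statement for every table you plan to register.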
Register Data Sources
To register a Databricks table as an Immuta data source, Immuta requires a running Databricks cluster that it can use to determine the schema and metadata of the table in Databricks. This cluster can be either:
a non-Immuta cluster: Use a non-Immuta cluster if you have over 1,000 tables to register as Immuta data sources. This is the fastest and least error-prone method to add many data sources at a time.
an Immuta-enabled cluster: Use an Immuta-enabled cluster if you have a few tables to register as Immuta data sources.
Limited enforcement (available until protected by policy access model) is not supported
You must set IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_READS and IMMUTA_SPARK_DATABRICKS_ALLOW_NON_IMMUTA_WRITES to false in your cluster policies, either manually or by selecting Protected until made available by policy in the Databricks integration section of the App Settings page. See the Databricks Spark integration with Unity Catalog support limitations for details.
Once your cluster is running:
Register your data from your non-Immuta or Immuta-enabled cluster.
If you used a non-Immuta cluster, convert the cluster to an Immuta cluster with Immuta cluster policies once data sources have been created.
Note: When the Unity Catalog integration is enabled, a schema must be specified when registering data sources backed by tables in the legacy hive_metastore.