Connect Data to Your Cluster
Immuta clusters use the configured metastore owner's personal access token (PAT) to interact with the Unity Catalog metastore. Before a table can be registered as a data source in Immuta, the catalog, schema, and table must be granted to the configured Unity Catalog metastore owner, using one of the two methods below, so that the table is visible to Immuta:
- Automatically grant access to everything with Privilege Model 1.0. Immuta recommends upgrading the privilege model for Unity Catalog to 1.0; this upgrade lets administrators and owners grant access to everything in a given catalog or schema with a single GRANT statement. See the Databricks documentation for instructions on enabling Privilege Model 1.0.
- Manually grant access to specific tables.
Automatically Grant Access in Privilege Model 1.0
Automatically grant SELECT access to everything in a catalog by running the SQL statement below as the metastore owner or catalog owner:
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG mycatalog TO `email@example.com`;
Manually Grant Access
If you are not using Privilege Model 1.0, manually grant access to specific tables by running the SQL statements below as an administrator or the table owner:
GRANT USE CATALOG ON CATALOG mycatalog TO `email@example.com`;
GRANT USE SCHEMA ON SCHEMA mycatalog.myschema TO `email@example.com`;
GRANT SELECT ON TABLE mycatalog.myschema.mytable TO `email@example.com`;
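When many tables need the manual grants, the three statements above can be generated programmatically. The sketch below is illustrative only; the helper name, table list, and principal are assumptions, not part of Immuta or Databricks:

```python
# Sketch: build the manual GRANT statements for a batch of tables.
# Helper name and inputs are illustrative; run the emitted SQL as an
# administrator or the table owner.

def manual_grant_statements(catalog, schema, tables, principal):
    """Return the USE CATALOG / USE SCHEMA / SELECT grants needed so the
    configured metastore owner can see each table."""
    stmts = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`;",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`;",
    ]
    stmts += [
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{t} TO `{principal}`;"
        for t in tables
    ]
    return stmts

for stmt in manual_grant_statements(
    "mycatalog", "myschema", ["mytable", "othertable"], "email@example.com"
):
    print(stmt)
```

The emitted statements can then be run in a Databricks SQL editor or notebook attached to the workspace.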
Register Data Sources
To register a Databricks table as an Immuta data source, Immuta requires a running Databricks cluster that it can use to determine the schema and metadata of the table in Databricks. This cluster can be either
- a non-Immuta cluster: Use a non-Immuta cluster if you have over 1,000 tables to register as Immuta data sources. This is the fastest and least error-prone method to add many data sources at a time.
- an Immuta-enabled cluster: Use an Immuta-enabled cluster if you have a few tables to register as Immuta data sources.
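The cluster-selection guidance above amounts to a simple threshold rule. The sketch below restates it as code; the function name and constant are illustrative, not an Immuta API:

```python
# Sketch of the cluster-selection guidance above. The 1,000-table cutoff
# mirrors the documented recommendation; names here are illustrative.

NON_IMMUTA_THRESHOLD = 1000  # above this table count, use a non-Immuta cluster

def recommended_cluster(table_count: int) -> str:
    """Return which cluster type to use when registering data sources."""
    if table_count > NON_IMMUTA_THRESHOLD:
        # Fastest, least error-prone path for bulk registration; convert the
        # cluster to an Immuta cluster once the data sources are created.
        return "non-Immuta cluster"
    return "Immuta-enabled cluster"

print(recommended_cluster(5000))  # bulk registration -> non-Immuta cluster
print(recommended_cluster(12))    # small registration -> Immuta-enabled cluster
```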
Limited enforcement (the "available until protected by policy" access model) is not supported. You must disable it, either by setting it to false in your cluster policies manually or by selecting Protected until made available by policy in the Databricks integration section of the App Settings page. See the Databricks Spark integration with Unity Catalog support limitations for details.
Once your cluster is running:
- Register your data from your non-Immuta or Immuta-enabled cluster.
- If you used a non-Immuta cluster, convert the cluster to an Immuta cluster with Immuta cluster policies once data sources have been created.
Note: When the Unity Catalog integration is enabled, a schema must be specified when registering data sources backed by tables in the legacy Hive metastore.