# Databricks Spark Integration Configuration

The Databricks Spark integration is one of two integrations[^1] Immuta offers for Databricks.

In this integration, Immuta installs an Immuta-maintained Spark plugin on your Databricks cluster. When a user queries data that has been registered in Immuta as a data source, the plugin injects policy logic into the query plan Spark builds, so the results returned include only the data that user is permitted to see.
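Conceptually, the plugin acts as a rewrite step that adds policy predicates to a query plan before it executes. The sketch below is a minimal, illustrative Python model of that idea only; it is not Immuta's plugin code (which operates on Spark's logical plan), and the sample policy rule, table, and column names are invented for the example.

```python
# Illustrative sketch of policy injection into a query plan.
# NOT Immuta's implementation: the policy, table, and columns are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Plan:
    """A toy stand-in for a Spark logical plan: a table scan plus filters."""
    table: str
    filters: list = field(default_factory=list)

def inject_policy(plan: Plan, user_groups: set) -> Plan:
    # Hypothetical row-level policy: members of "analysts" may only see
    # US rows. In the real integration, policies come from Immuta.
    if "analysts" in user_groups:
        plan.filters.append(lambda row: row["region"] == "US")
    return plan

def execute(plan: Plan, data: dict) -> list:
    # Apply every injected filter before returning rows to the user.
    rows = data[plan.table]
    for predicate in plan.filters:
        rows = [r for r in rows if predicate(r)]
    return rows

data = {"sales": [{"region": "US", "amt": 10}, {"region": "EU", "amt": 20}]}
plan = inject_policy(Plan("sales"), user_groups={"analysts"})
print(execute(plan, data))  # only the US row is returned
```

Because the rewrite happens at the plan level rather than in the user's query text, enforcement is transparent: the user submits an ordinary query and never sees or bypasses the policy logic.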

The reference guides in this section are written for Databricks administrators who are responsible for setting up the integration, securing Databricks clusters, and setting up users:

* [Installation and compliance](/SaaS/configuration/integrations/databricks/databricks-spark/reference-guides/databricks/installation-and-compliance.md): This guide describes what Immuta creates in your Databricks environment and how to secure your Databricks clusters.
* [Customizing the integration](/SaaS/configuration/integrations/databricks/databricks-spark/reference-guides/databricks/customizing-the-integration.md): Consult this guide for information about customizing the Databricks Spark integration settings.
* [Setting up users](/SaaS/configuration/integrations/databricks/databricks-spark/reference-guides/databricks/setting-up-users.md): Consult this guide for information about connecting data users and setting up user impersonation.
* [Spark environment variables](/SaaS/configuration/integrations/databricks/databricks-spark/reference-guides/databricks/configuration.md): This guide provides a list of Spark environment variables used to configure the integration.
* [Ephemeral overrides](/SaaS/configuration/integrations/databricks/databricks-spark/reference-guides/databricks/ephemeral-overrides.md): This guide describes ephemeral overrides[^2] and how to configure them to reduce the risk of a user's overrides pointing to a cluster (or clusters) that is not currently running.

[^1]: The Databricks Spark integration should be used if your data still exists in the Hive metastore. See the [Which integration should you use](/SaaS/configuration/integrations/databricks.md#which-integration-should-you-use) section for an overview of the two integrations and when to use each one.

[^2]: In the context of the Databricks Spark integration, Immuta uses the term *ephemeral* to describe data sources where the associated compute resources can vary over time. This means that the compute bound to these data sources is not fixed and can change. All Databricks data sources in Immuta are ephemeral.
