
# Databricks Spark

This integration enforces policies on Databricks tables registered as data sources in Immuta, allowing users to query policy-enforced data on Databricks clusters (including job clusters). Immuta applies policies to the plan that Spark builds for each user query, and the query then executes directly against the Databricks tables.

The guides in this section outline how to integrate Databricks with Immuta.

## How-to guides

* [Databricks configuration](/2024.2/data-and-integrations/databricks-spark/how-to-guides/configuration/simplified.md): Configure the Databricks Spark integration.
* [DBFS access](/2024.2/data-and-integrations/databricks-spark/how-to-guides/access-dbfs.md): Access DBFS in Databricks for non-sensitive data.
* [Limited enforcement in Databricks](/2024.2/data-and-integrations/databricks-spark/how-to-guides/limited-enforcement-scope.md): Allow Immuta users to access tables that are not protected by Immuta.
* [Hiding the Immuta database in Databricks](/2024.2/data-and-integrations/databricks-spark/how-to-guides/hide-immuta-database.md): Hide the Immuta database from users in Databricks, since user queries do not need to reference it.
* [Run spark-submit jobs on Databricks](/2024.2/data-and-integrations/databricks-spark/how-to-guides/spark-submit.md): Run R and Scala `spark-submit` jobs on your Databricks cluster.
* [Project UDFs cache settings](/2024.2/data-and-integrations/databricks-spark/how-to-guides/project-udfs.md): Raise the on-cluster caching and lower the cache timeouts for the Immuta web service to allow the use of project UDFs in Spark jobs.
* [External metastores](/2024.2/data-and-integrations/databricks-spark/how-to-guides/external-metastores.md): Use an existing Hive external metastore instead of the built-in metastore.

## Reference guides

* [Databricks Spark integration reference guide](/2024.2/data-and-integrations/databricks-spark/reference-guides/databricks.md): This guide describes the design and components of the integration.
* Configuration settings: These guides describe the integration settings you can configure, including [environment variables](/2024.2/data-and-integrations/databricks-spark/reference-guides/configuration-settings/configuration.md), cluster policies, and [performance](/2024.2/data-and-integrations/databricks-spark/reference-guides/configuration-settings/security-config-for-performance.md).
* [Databricks change data feed](/2024.2/data-and-integrations/databricks-spark/reference-guides/change-data-feed.md): This guide describes Immuta's support of Databricks change data feed.
* [Databricks libraries](/2024.2/data-and-integrations/databricks-spark/reference-guides/databricks-libraries.md): The trusted libraries feature allows Databricks cluster administrators to avoid Immuta security manager errors when using third-party libraries. This guide describes the feature and its configuration.
* [Delta Lake API](/2024.2/data-and-integrations/databricks-spark/reference-guides/delta-lake-api.md): The Delta Lake API does not go through the normal Spark execution path, so Immuta's Spark extensions do not provide protection for it. To ensure that Immuta has control over what a user can access, the Delta Lake API is blocked. This reference guide outlines the Spark SQL options that can be substituted for the Delta Lake API.
* [Spark direct file reads](/2024.2/data-and-integrations/databricks-spark/reference-guides/direct-file-reads.md): Immuta allows direct file reads in Spark for file paths. This guide describes that process.

