# DBFS Access

This page outlines how to enable access to DBFS in Databricks for non-sensitive data. Databricks administrators should place the desired configuration in the Spark environment variables.

## DBFS FUSE mount

This Databricks feature mounts DBFS to the local cluster filesystem at `/dbfs`. Although it is disabled when using process isolation, this feature can safely be enabled if raw, unfiltered data is not stored in DBFS and all users on the cluster are authorized to see each other’s files. When enabled, the entirety of DBFS essentially becomes a scratch path where users can read and write files in `/dbfs/path/to/my/file` as though they were local files.

{% hint style="info" %}
**DBFS FUSE mount limitation**: This feature cannot be used in environments with E2 Private Link enabled.
{% endhint %}

For example,

```bash
%sh echo "I'm creating a new file in DBFS" > /dbfs/my/newfile.txt
```

In Python,

```python
%python
with open("/dbfs/my/newfile.txt", "w") as f:
  f.write("I'm creating a new file in DBFS")
```

*Note: This solution also works in R and Scala.*
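
Because the FUSE mount exposes DBFS through ordinary file paths, the same write can be read back with standard file I/O. The following is a minimal sketch of that round trip. The helper `write_then_read` is hypothetical and not part of Databricks or Immuta, and a temporary directory stands in for `/dbfs` so the snippet runs outside a cluster.

```python
import os
import tempfile


def write_then_read(mount_root, relative_path, text):
    """Write text under mount_root and read it back with ordinary file I/O."""
    target = os.path.join(mount_root, relative_path)
    # Create parent directories, since open() will not create them.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as f:
        f.write(text)
    with open(target) as f:
        return f.read()


# On a cluster with the FUSE mount enabled, mount_root would be "/dbfs".
# A temporary directory stands in for it here (an assumption for illustration).
with tempfile.TemporaryDirectory() as fake_dbfs:
    content = write_then_read(fake_dbfs, "my/newfile.txt", "I'm creating a new file in DBFS")
    print(content)
```

On a cluster, passing `"/dbfs"` as `mount_root` should behave the same way, provided the mount is enabled and the user is allowed to write to the target path.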

### Enable DBFS FUSE mount

To enable the DBFS FUSE mount, set this configuration in the Spark environment variables: `IMMUTA_SPARK_DATABRICKS_DBFS_MOUNT_ENABLED=true`.
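
As a quick sanity check, a notebook cell could inspect whether the variable is set as expected. This is only a sketch under two assumptions: that Spark environment variables are visible to the notebook's Python process through `os.environ`, and that a case-insensitive comparison with `true` is an acceptable reading of the value. The helper `dbfs_mount_enabled` is hypothetical, and a `True` result does not by itself prove the mount is usable.

```python
import os


def dbfs_mount_enabled(env=None):
    """Return True if the variable is set to "true" (case-insensitive)."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    value = env.get("IMMUTA_SPARK_DATABRICKS_DBFS_MOUNT_ENABLED", "")
    return value.strip().lower() == "true"


print(dbfs_mount_enabled())
```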

{% hint style="info" %}
**Mounting a bucket**

* Users can [mount additional buckets to DBFS](https://docs.databricks.com/data/data-sources/aws/amazon-s3.html#mount-aws-s3) that can also be accessed using the FUSE mount.
* Mounting a bucket is a one-time action, and the mount will be available to all clusters in the workspace from that point on.
* Mounting must be performed from a non-Immuta cluster.
{% endhint %}

## Scala DBUtils (and %fs magic) with scratch paths

Scratch paths work when performing arbitrary remote filesystem operations with `%fs` magic or Scala `dbutils.fs` functions. For example,

```bash
%fs put -f s3://my-bucket/my/scratch/path/mynewfile.txt "I'm creating a new file in S3"
%scala dbutils.fs.put("s3://my-bucket/my/scratch/path/mynewfile.txt", "I'm creating a new file in S3")
```

### Configure Scala DBUtils (and %fs magic) with scratch paths

To support `%fs` magic and Scala DBUtils with scratch paths, set the following property:

```xml
<property>
   <name>immuta.spark.databricks.scratch.paths</name>
   <value>s3://my-bucket/my/scratch/path</value>
</property>
```

### Configure DBUtils in Python

To use `dbutils` in Python, set this configuration: `immuta.spark.databricks.py4j.strict.enabled=false`.

#### Example workflow

This section illustrates the workflow for getting a file from a remote scratch path, editing it locally with Python, and writing it back to a remote scratch path.

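```python
%python
import os
import shutil

# Remote file located under a configured scratch path.
s3ScratchFile = "s3://some-bucket/path/to/scratch/file"
# Indexing os.environ raises a KeyError if the variable is unset,
# instead of silently building a path that starts with "None".
localScratchDir = os.environ["IMMUTA_LOCAL_SCRATCH_DIR"]
localScratchFile = "{}/myfile.txt".format(localScratchDir)
localScratchFileCopy = "{}/myfile_copy.txt".format(localScratchDir)
```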

1. Get the file from remote storage:

   ```python
   dbutils.fs.cp(s3ScratchFile, "file://{}".format(localScratchFile))
   ```
2. If you want to edit the file, make a copy first, because `localScratchFile` is read-only and owned by root:

   ```python
   shutil.copy(localScratchFile, localScratchFileCopy)
   with open(localScratchFileCopy, "a") as f:
       f.write("Some appended file content")
   ```
3. Write the new file back to remote storage:

   ```python
   dbutils.fs.cp("file://{}".format(localScratchFileCopy), s3ScratchFile)
   ```
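
The three steps above can also be wrapped in a single helper, which makes the workflow easier to reuse and to test. This is a sketch rather than an Immuta-provided function: `append_to_remote_file` is a hypothetical name, and `fs` is assumed to be any object with a `cp(source, destination)` method, such as `dbutils.fs` in a notebook.

```python
import os
import shutil


def append_to_remote_file(fs, remote_path, local_dir, text):
    """Copy remote_path locally, append text to a writable copy, and copy it back.

    fs is any object with a cp(source, destination) method, such as dbutils.fs.
    """
    local_file = os.path.join(local_dir, "myfile.txt")
    local_copy = os.path.join(local_dir, "myfile_copy.txt")
    # Step 1: get the file from remote storage.
    fs.cp(remote_path, "file://{}".format(local_file))
    # Step 2: edit a copy, because the downloaded file is read-only.
    shutil.copy(local_file, local_copy)
    with open(local_copy, "a") as f:
        f.write(text)
    # Step 3: write the edited copy back to remote storage.
    fs.cp("file://{}".format(local_copy), remote_path)


# In a notebook, the call would look like this (dbutils exists only there):
# append_to_remote_file(dbutils.fs, s3ScratchFile, localScratchDir,
#                       "Some appended file content")
```

Passing the filesystem object in as a parameter keeps the helper independent of the notebook environment, so it can be exercised locally with a stand-in object.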

