# Securing Hive and Impala Without Sentry

Immuta offers both fine- and coarse-grained protection for Hive and Impala tables. However, additional protections are required to ensure that users cannot gain unauthorized access to data by connecting to Hive or Impala directly. Cloudera recommends using the Sentry service to secure access to Hive and Impala. As an alternative, this guide details steps that CDH cluster administrators can take to lock down Hive and Impala access without running the Sentry service.

{% hint style="info" %}
Each section in this guide is a **required** step to ensure that access to Hive and Impala is secured.
{% endhint %}

## Restricting Access to Hive

After installing Immuta on your cluster, users will still be able to connect to Hive via the `hive` shell, `beeline`, or JDBC/ODBC connections. To prevent users from circumventing Immuta and gaining unauthorized access to data, you can leverage HDFS access control lists (ACLs) without running Sentry.

### Enable HDFS Access Control Lists in Cloudera Manager

See the official [Cloudera Documentation](https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_sg_hdfs_ext_acls.html#concept_ifd_1nm_jw) to complete this step.

### Enable Hive Impersonation in Cloudera Manager

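Hive impersonation must be enabled before ACLs can be used to secure Hive. With impersonation enabled, HiveServer2 accesses HDFS as the connected user, so the ACLs are evaluated against that user rather than the `hive` service account. To enable Hive impersonation in Cloudera Manager, set `hive.server2.enable.impersonation, hive.server2.enable.doAs` to `true` in the Hive service configuration.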

### Configure Access Control Lists

{% hint style="info" %}
*Group* in this context refers to Linux groups, not Sentry groups.
{% endhint %}

For each HDFS location that will store Hive data, you must configure ACLs that restrict access to `hive`, `impala`, and the data owners that belong to a particular group. You can accomplish this by running the commands below.

```bash
hadoop fs -setfacl -m other::--- /user/hive/warehouse
hadoop fs -setfacl -m user::rwx /user/hive/warehouse
hadoop fs -setfacl -m group::rwx /user/hive/warehouse
hadoop fs -setfacl -m group:hive:rwx /user/hive/warehouse
hadoop fs -setfacl -m group:examplegroup:rwx /user/hive/warehouse
```

In this example, members of the `hive` and `examplegroup` groups are allowed to select from and insert into tables in Hive. Note that the `hive` group only contains the `hive` and `impala` users, while `examplegroup` contains the privileged users who would be considered potential data owners in Immuta.

By default, Hive stores data in HDFS under `/user/hive/warehouse`. If your cluster uses a different data storage location, substitute that path in the commands above.
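When Hive data lives in more than one HDFS location, the same five ACL entries have to be applied to each one. The following is a minimal sketch rather than an official Immuta or Cloudera script: the helper name `print_acl_commands` and the second path `/data/hive/external` are hypothetical. It only prints the `hadoop fs -setfacl` commands so they can be reviewed before anything is run against the cluster.

```bash
#!/usr/bin/env bash
# Sketch: print the setfacl commands from the example above for one or
# more HDFS locations. Nothing is executed against the cluster.
# print_acl_commands is a hypothetical helper name.

print_acl_commands() {
  local owner_group="$1"   # Linux group of potential data owners, e.g. examplegroup
  shift
  local path entry
  for path in "$@"; do
    for entry in "other::---" "user::rwx" "group::rwx" "group:hive:rwx" "group:${owner_group}:rwx"; do
      echo "hadoop fs -setfacl -m ${entry} ${path}"
    done
  done
}

# Example: the default warehouse plus a hypothetical second location.
print_acl_commands examplegroup /user/hive/warehouse /data/hive/external
```

After reviewing the printed commands, you can run them as shown, and then inspect the resulting ACL on each location with `hadoop fs -getfacl <path>`.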

## Restricting Access to Impala

After installing Immuta on your cluster, users will still be able to connect to Impala via `impala-shell` or JDBC/ODBC connections. To prevent users from circumventing Immuta and gaining unauthorized access to data, you can leverage policy configuration files for Impala without running Sentry.

### Create Policy Configuration File

{% hint style="info" %}
*Group* in this context refers to Linux groups, not Sentry groups.
{% endhint %}

The policy configuration file that will drive Impala's security must be in `.ini` format. The example below will grant users in group `examplegroup` the ability to read and write data in the `default` database. You can add additional groups and roles that correspond to different databases or tables.

```ini
[groups]
examplegroup = example_insert_role, example_select_role

[roles]
example_insert_role = server=server1->db=default->table=*->action=insert
example_select_role = server=server1->db=default->table=*->action=select
```

This policy configuration file assigns the group called `examplegroup` to the roles `example_insert_role` and `example_select_role`, which grant insert and select (write and read) privileges on all tables in the `default` database.
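If several groups or databases follow the same pattern, the file can be generated instead of written by hand. The sketch below is one possible approach, not a prescribed one: the helper name `write_policy_file` is hypothetical, and it derives role names from the group name (for example `examplegroup_insert_role`), which differs slightly from the names in the example above.

```bash
#!/usr/bin/env bash
# Sketch: generate a policy file in the same layout as the example above.
# write_policy_file is a hypothetical helper; adjust names to your cluster.

write_policy_file() {
  local group="$1"    # Linux group, e.g. examplegroup
  local db="$2"       # database the roles apply to, e.g. default
  local server="$3"   # server name used in the roles, e.g. server1
  local out="$4"      # local path to write, e.g. /tmp/policy.ini

  cat > "$out" <<EOF
[groups]
${group} = ${group}_insert_role, ${group}_select_role

[roles]
${group}_insert_role = server=${server}->db=${db}->table=*->action=insert
${group}_select_role = server=${server}->db=${db}->table=*->action=select
EOF
}

# Example: recreate a file equivalent to the one shown above.
write_policy_file examplegroup default server1 /tmp/policy.ini
cat /tmp/policy.ini
```

The function overwrites the output file on each call, so a file covering several groups would need the sections merged by hand or by a more complete script.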

See the official [Impala documentation](https://impala.apache.org/docs/build/html/topics/impala_authorization.html) for a detailed guide on policy configuration files. Note that while the guide mentions Sentry, running the Sentry service is not required to leverage policy configuration files.

Next, place the policy configuration file (we will call it `policy.ini`) in HDFS. The policy file should be owned by the `impala` user, and should only be accessible by the `impala` user. See below for an example.

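```bash
# Copy the local policy file into HDFS
hadoop fs -copyFromLocal /tmp/policy.ini /user/impala/
# Make the impala user and group the owner of the file
hadoop fs -chown impala:impala /user/impala/policy.ini
# Remove all permissions for users outside the owning user and group
hadoop fs -chmod o-rwx /user/impala/policy.ini
```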
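To confirm that the permissions took effect, you can inspect the file's listing. The sketch below assumes the input looks like a line of `hadoop fs -ls` output, where the first column is a permission string such as `-rw-r-----`; the helper name `other_bits_cleared` and the sample line are made up for illustration. It checks only the "other" bits and does not look at group permissions or group membership.

```bash
#!/usr/bin/env bash
# Sketch: check that the "other" permission bits in a listing line are empty.
# The input is assumed to look like a line of `hadoop fs -ls` output, whose
# first column is a permission string such as -rw-r-----.
# other_bits_cleared is a hypothetical helper name.

other_bits_cleared() {
  local line="$1"
  local perms="${line%% *}"    # first whitespace-delimited column
  local other="${perms:7:3}"   # characters 8-10 hold the "other" bits
  [ "$other" = "---" ]
}

# Example with a made-up listing line; on a cluster the line would come from
#   hadoop fs -ls /user/impala/policy.ini
sample='-rw-r-----   3 impala impala        214 2024-01-01 12:00 /user/impala/policy.ini'
if other_bits_cleared "$sample"; then
  echo "policy file is not accessible to other users"
else
  echo "policy file is still accessible to other users"
fi
```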

### Configure Impala to use Policy Configuration File

You can configure Impala to leverage your new policy file by navigating to Impala's configuration in Cloudera Manager and modifying `Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)` with the snippet below.

```bash
-server_name=server1
-authorization_policy_file=/user/impala/policy.ini
```

You must restart the Impala service in Cloudera Manager for the policy changes to take effect. Note that `server_name` must match the `server` that you define in your policy roles, and that each key-value pair must be placed on its own line in the configuration snippet.
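Because a mismatch between `server_name` and the `server` value in the roles is easy to overlook, it can help to check a local copy of the policy file before uploading it. The following is a minimal sketch under that assumption; the helper name `check_server_name` is hypothetical, and it treats every line containing `server=` as a role definition.

```bash
#!/usr/bin/env bash
# Sketch: confirm every role in a local copy of the policy file uses the
# server name passed to Impala via -server_name.
# check_server_name is a hypothetical helper name.

check_server_name() {
  local policy_file="$1"
  local server_name="$2"
  local mismatches
  # Select role lines (those containing "server="), then keep the ones
  # that do not reference the expected server.
  mismatches="$(grep 'server=' "$policy_file" | grep -v "server=${server_name}->" || true)"
  if [ -n "$mismatches" ]; then
    echo "roles not using server=${server_name}:"
    echo "$mismatches"
    return 1
  fi
  echo "all roles use server=${server_name}"
}

# Example against a throwaway copy of the roles from this guide.
policy="$(mktemp)"
printf '%s\n' \
  '[roles]' \
  'example_insert_role = server=server1->db=default->table=*->action=insert' \
  'example_select_role = server=server1->db=default->table=*->action=select' > "$policy"
check_server_name "$policy" server1
rm -f "$policy"
```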

