Redshift Pre-Configuration Details

This page describes the Redshift integration, its configuration options, and its features. For a tutorial on enabling this integration, see the installation guide.

Feature Availability

Project Workspaces: ❌
Tag Ingestion: ❌
User Impersonation: ✅
Query Audit: ❌
Multiple Integrations: ✅

Prerequisite

For automated installations, the credentials provided must belong to a Redshift superuser or to a user who can create databases and users and modify grants.

Supported Features

Redshift datashares
Redshift Serverless
Redshift Spectrum: For configuration and data source registration instructions, see the configuration page.

Authentication Methods

The Redshift integration supports the following authentication methods to configure the integration and create data sources:

Username and Password: Users can authenticate with their Redshift username and password.
AWS Access Key: Users can authenticate with an AWS access key.

Tag Ingestion

Immuta cannot ingest tags from Redshift, but you can connect any of these supported external catalogs to work with your integration.

User Impersonation

Required Redshift privileges

Setup User:

OWNERSHIP ON GROUP IMMUTA_IMPERSONATOR_ROLE
CREATE GROUP

Immuta System Account:

GRANT EXECUTE ON PROCEDURE grant_impersonation
GRANT EXECUTE ON PROCEDURE revoke_impersonation

Impersonation allows users to query data as another Immuta user in Redshift. To enable user impersonation, see the User Impersonation page.

Multiple Integrations

Users can enable multiple Redshift integrations with a single Immuta tenant.

Redshift Limitations

The host of the data source must match the host of the connection for the view to be created.
When using multiple Redshift integrations, a user must have the same user account across all hosts.
Case sensitivity of database, table, and column identifiers is not supported. The enable_case_sensitive_identifier parameter must be set to false (default setting) for your Redshift cluster to configure the integration and register data sources.
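
The first and third limitations above can be checked before registering a data source. The following is a minimal sketch, not part of Immuta or Redshift: the function name preflight_issues and its parameters are hypothetical, and it assumes that hostnames are compared after trimming whitespace and lowercasing, and that the parameter value has already been read from the cluster as text (for example with SHOW enable_case_sensitive_identifier). The actual comparison the integration performs may be stricter.

```python
def preflight_issues(connection_host, data_source_host, case_sensitive_setting):
    """Return a list of problems found; an empty list means none were detected.

    All names here are hypothetical and exist only for illustration.
    """
    issues = []

    # Assumption: hosts are considered equal after trimming and lowercasing.
    if connection_host.strip().lower() != data_source_host.strip().lower():
        issues.append("data source host does not match the connection host")

    # Assumption: the setting arrives as text; several spellings of
    # "disabled" are accepted because the exact format may vary.
    if case_sensitive_setting.strip().lower() not in ("false", "off", "0"):
        issues.append("enable_case_sensitive_identifier must be set to false")

    return issues


# Example with placeholder hostnames.
print(preflight_issues("cluster-a.example.com", "cluster-b.example.com", "true"))
```

An empty list from this helper only means that these two conditions hold; it does not verify the multi-host user account requirement.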

Python UDF Specific Limitations

For most policy types in Redshift, Immuta uses SQL clauses to implement enforcement logic; however, the Redshift integration uses Python UDFs to implement the following masking policies:

Masking using a regular expression
Reversible masking
Format-preserving masking
Randomized response

The number of Python UDFs that can run concurrently per Redshift cluster is limited to one-fourth of the total concurrency level for the cluster. For example, if the Redshift cluster is configured with a concurrency of 15, a maximum of three Python UDFs can run concurrently. After the limit is reached, Python UDFs are queued for execution within workload management queues.
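
The arithmetic behind this limit can be sketched as follows. The function name is hypothetical, and the sketch assumes the one-fourth figure is rounded down to a whole number, which is what the example of 15 yielding three implies.

```python
def max_concurrent_python_udfs(wlm_concurrency: int) -> int:
    """Estimate how many Python UDFs can run at once on a cluster.

    Assumption: one-fourth of the total concurrency level, rounded down.
    """
    if wlm_concurrency < 0:
        raise ValueError("concurrency level cannot be negative")
    return wlm_concurrency // 4


print(max_concurrent_python_udfs(15))  # prints 3
```

Under the same assumption, a cluster with a total concurrency below 4 would allow no Python UDFs to run concurrently, so it is worth confirming the behavior for small values against the Redshift documentation.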

The SVL_QUERY_QUEUE_INFO view in Redshift, which is visible to a Redshift superuser, summarizes details for queries that spent time in a workload management (WLM) query queue. A query appears in the SVL_QUERY_QUEUE_INFO view only after it has completed.
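
As an illustration of inspecting that view, the sketch below builds a SQL statement listing completed queries that waited in a WLM queue. The helper name queue_time_query is hypothetical, and the column names (query, service_class, queue_elapsed, exec_elapsed, wlm_total_elapsed) are taken from memory of the AWS documentation for SVL_QUERY_QUEUE_INFO, so verify them against your cluster before relying on the output.

```python
def queue_time_query(min_queue_seconds: int = 0) -> str:
    """Build a SQL statement listing queries that waited in a WLM queue.

    Assumption: queue_elapsed is reported in seconds; check the AWS docs.
    """
    if not isinstance(min_queue_seconds, int) or min_queue_seconds < 0:
        raise ValueError("min_queue_seconds must be a non-negative integer")
    return (
        "SELECT query, service_class, queue_elapsed, exec_elapsed, wlm_total_elapsed\n"
        "FROM svl_query_queue_info\n"
        f"WHERE queue_elapsed > {min_queue_seconds}\n"
        "ORDER BY queue_elapsed DESC;"
    )


# Print the statement so it can be run as a superuser in any SQL client.
print(queue_time_query())
```

The threshold is validated as an integer before being placed in the statement, so the generated SQL contains no free-form user input.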

If queries on Immuta-built views are spending time in the workload management (WLM) query queue, either edit your Redshift cluster configuration to increase concurrency or use fewer of the masking policies that rely on Python UDFs. For more information on increasing concurrency, see the Redshift docs on implementing workload management.
