Redshift Pre-Configuration Details
Audience: System Administrators
Content Summary: This page describes the Redshift integration, configuration options, and features.
For a tutorial to enable this integration, see the installation guide.
| Project Workspaces | Tag Ingestion | User Impersonation | Query Audit | Multiple Integrations |
A Redshift cluster with an RA3 node is required. Immuta relies on cross-database views, which Redshift supports only on RA3 instance types.
For automated installations, the credentials provided must belong to a superuser or to a user who can create databases and users and modify grants.
Immuta cannot ingest tags from Redshift, but you can connect any supported external catalog to work with your integration.
Impersonation allows users to query data as another Immuta user in Redshift. To enable user impersonation, see the User Impersonation page.
Users can enable multiple Redshift integrations with a single Immuta instance.
Redshift Feature Availability
- Immuta supports Redshift Serverless.
- The host of the data source must match the host of the native connection for the native view to be created.
- When using multiple Redshift integrations, a user must have the same user account on every host.
Python UDF Specific Limitations
For most policy types in Redshift, Immuta uses SQL clauses to implement enforcement logic; however, Immuta uses Python UDFs in the Redshift integration to implement the following masking policies:
- Masking using a regular expression
- Reversible masking
- Format-preserving masking
- Randomized response
The number of Python UDFs that can run concurrently per Redshift cluster is limited to one-fourth of the total concurrency level for the cluster. For example, if the Redshift cluster is configured with a concurrency of 15, a maximum of three Python UDFs can run concurrently. After the limit is reached, Python UDFs are queued for execution within workload management queues.
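The arithmetic behind that limit can be sketched as follows. This is a minimal illustration, not part of Immuta or Redshift: the function name `max_concurrent_python_udfs` is hypothetical, and rounding down is an assumption inferred from the example above (a concurrency of 15 yields three UDFs).

```python
def max_concurrent_python_udfs(wlm_concurrency: int) -> int:
    """Estimate the Python UDF concurrency cap for a cluster.

    Assumption: the cap is one-fourth of the total concurrency level,
    rounded down, which matches the documented 15 -> 3 example.
    """
    if wlm_concurrency < 0:
        raise ValueError("concurrency must be non-negative")
    # Integer division drops the fractional part (15 // 4 == 3).
    return wlm_concurrency // 4


if __name__ == "__main__":
    # Print the estimated cap for a few sample concurrency levels.
    for level in (5, 15, 50):
        print(level, max_concurrent_python_udfs(level))
```

Under this assumption, a cluster needs a total concurrency of at least 4 before even one Python UDF can run without queueing.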
The SVL_QUERY_QUEUE_INFO view in Redshift, which is visible to a Redshift superuser, summarizes details for queries that spent time in a workload management (WLM) query queue. Only completed queries appear in its results.
If you find that queries on Immuta-built views are spending time in the workload management (WLM) query queue, either edit your Redshift cluster configuration to increase concurrency or use fewer of the masking policies that rely on Python UDFs. For more information on increasing concurrency, see the Redshift docs on implementing workload management.
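One way to check for queueing is to query SVL_QUERY_QUEUE_INFO from a superuser session. The sketch below is illustrative only: `queued_queries` is a hypothetical helper, it assumes a Python DB-API cursor whose driver uses `%s` placeholders (psycopg2, for example), and the column names (`queue_elapsed`, `exec_elapsed`, `service_class`, `querytxt`) should be verified against the Redshift documentation for your cluster version.

```python
# Assumed column names; confirm them against the Redshift system view reference.
QUEUE_INFO_SQL = """
    SELECT query, service_class, queue_elapsed, exec_elapsed, querytxt
    FROM svl_query_queue_info
    WHERE queue_elapsed >= %s
    ORDER BY queue_elapsed DESC
    LIMIT %s
"""


def queued_queries(cursor, min_queue_seconds=1, limit=20):
    """Return completed queries that waited at least min_queue_seconds in a WLM queue.

    `cursor` is any DB-API cursor connected to Redshift as a superuser.
    """
    # Parameters are passed separately so the driver handles quoting.
    cursor.execute(QUEUE_INFO_SQL, (min_queue_seconds, limit))
    return cursor.fetchall()
```

With a live connection, calling `queued_queries(conn.cursor(), min_queue_seconds=5)` would list the longest-waiting completed queries first. If queries against Immuta-built views show up there repeatedly, that suggests the Python UDF concurrency limit is being reached.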