Bulk Create Snowflake Data Sources
Private preview
This feature is only available to select accounts. Reach out to your Immuta representative to enable this feature.
Requirements
Snowflake Enterprise Edition
Snowflake X-Large or Large warehouse is strongly recommended
Create Snowflake data sources
Set the default subscription policy to None for bulk data source creation. This will simplify the data source creation process by not automatically applying policies.
Make a request to the Immuta V2 API create data source endpoint, as the Immuta UI does not support creating more than 1,000 data sources. The following options must be specified in your request to ensure the maximum performance benefits of bulk data source creation. The Skip Stats Job tag is only required if you are using specific policies that require stats; otherwise, Snowflake data sources automatically skip the stats job.
Specifying disableSensitiveDataDiscovery as true ensures that sensitive data discovery will not be applied when the new data sources are created in Immuta, regardless of how it is configured for the Immuta tenant. Disabling sensitive data discovery improves performance during data source creation.
Applying the Skip Stats Job tag using the tableTag value will ensure that some jobs that are not vital to data source creation are skipped, specifically the fingerprint and high cardinality check jobs.
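As a rough illustration, the two options above could be assembled into a request body as in the following sketch. Only disableSensitiveDataDiscovery, tableTag, and the Skip Stats Job tag come from this page. Nesting them under an options object, passing the tag as a list, and the connectionKey and connection fields are assumptions made for the example, so check the V2 API reference for the exact schema before using it.

```python
import json

SKIP_STATS_TAG = "Skip Stats Job"


def build_bulk_payload(connection_key, connection, skip_stats_tag=True):
    """Return a create-data-source request body with the bulk options set."""
    options = {
        # Turn off sensitive data discovery for the new data sources.
        "disableSensitiveDataDiscovery": True,
    }
    if skip_stats_tag:
        # Key name taken from the text above; the list shape is an assumption.
        options["tableTag"] = [SKIP_STATS_TAG]
    return {
        "connectionKey": connection_key,  # hypothetical field name
        "connection": connection,         # hypothetical field name
        "options": options,               # assumed nesting
    }


payload = build_bulk_payload(
    "example-snowflake",                                 # placeholder value
    {"handler": "Snowflake", "database": "EXAMPLE_DB"},  # placeholder values
)
print(json.dumps(payload, indent=2))
```

Passing skip_stats_tag=False leaves the tag out, which matches the case where no policies require stats and Snowflake data sources skip the stats job on their own.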
When the Snowflake bulk data source creation feature is configured, the create data source endpoint operates asynchronously and responds immediately with a bulkId that can be used for monitoring progress.
Monitor progress
To monitor the progress of the background jobs for the bulk data source creation, make the following request using the bulkId from the response of the previous step:
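The original request example is not reproduced here. The sketch below shows one way such a request might be built with Python's standard library. The /jobs path, the bulkId query parameter name, the bearer-token header, and the host name are all assumptions for illustration rather than confirmed details of the Immuta API.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_status_request(base_url, bulk_id, token, path="/jobs"):
    """Build (but do not send) a GET request for the status of a bulk job."""
    # The path and the query parameter name are assumed, not confirmed.
    query = urlencode({"bulkId": bulk_id})
    url = f"{base_url.rstrip('/')}{path}?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_status_request(
    "https://immuta.example.com",  # placeholder host
    "example-bulk-id",             # bulkId returned by the create request
    "EXAMPLE_TOKEN",               # placeholder credential
)
print(req.get_method(), req.full_url)
```

The request can then be sent with urllib.request.urlopen(req) and the body parsed with json.load.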
The response will contain a list of job states and the number of jobs currently in each state. If errors were encountered during processing, a list of errors will be included in the response:
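The exact response format is not shown on this page. As a hedged sketch, the helper below tallies job counts from a response and reports how many jobs remain. The states and errors keys, the state names, and the sample values are hypothetical and only illustrate the idea of counting jobs per state.

```python
def summarize_bulk_status(status, finished_states=("completed",)):
    """Summarize job counts per state from a (hypothetical) status response."""
    # Counts may arrive as strings, so normalize them to integers.
    counts = {state: int(n) for state, n in status.get("states", {}).items()}
    total = sum(counts.values())
    finished = sum(n for state, n in counts.items() if state in finished_states)
    return {
        "total": total,
        "finished": finished,
        "remaining": total - finished,
        "errors": list(status.get("errors", [])),
    }


# Made-up sample in the assumed shape, not real Immuta output.
sample = {
    "states": {"completed": 7, "active": 2, "created": 1},
    "errors": ["example error message"],
}
summary = summarize_bulk_status(sample)
print(summary)
```

A polling loop could call this helper after each status request and stop once remaining reaches zero, surfacing any entries in errors along the way.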
With these recommended configurations, bulk creating 100,000 Snowflake data sources takes between six and seven hours, including the time for all associated jobs to complete.