Bulk Create Snowflake Data Sources

Private preview

This feature is only available to select accounts. Reach out to your Immuta representative to enable this feature.

Requirements

  • Snowflake Enterprise Edition

  • Snowflake X-Large or Large warehouse is strongly recommended

Create Snowflake data sources

  1. Set the default subscription policy to None for bulk data source creation. This simplifies the data source creation process because policies are not applied automatically.

  2. Make a request to the Immuta V2 API create data source endpoint, as the Immuta UI does not support creating more than 1,000 data sources. Specify the disableSensitiveDataDiscovery and tableTags options in your request to get the maximum performance benefit from bulk data source creation. The Skip Stats Job tag is only required if you are using specific policies that require stats; otherwise, Snowflake data sources automatically skip the stats job.

Specifying disableSensitiveDataDiscovery as true ensures that sensitive data discovery will not be applied when the new data sources are created in Immuta, regardless of how it is configured for the Immuta tenant. Disabling sensitive data discovery improves performance during data source creation.

Applying the Skip Stats Job tag through the tableTags option skips jobs that are not vital to data source creation, specifically the fingerprint and high cardinality check jobs.
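
As a minimal sketch of how these two settings combine, the Python helper below adds the options object to an existing request body. The helper name with_bulk_options and the uses_stats_policies flag are hypothetical; only disableSensitiveDataDiscovery, tableTags, and the Skip Stats Job tag come from this page. The rest of the payload is assumed to be whatever your create data source request already contains.

```python
import copy
import json


def with_bulk_options(payload: dict, uses_stats_policies: bool) -> dict:
    """Return a copy of `payload` with the bulk-creation options set.

    `payload` is assumed to be your existing create-data-source request body;
    its other fields are left untouched.
    """
    result = copy.deepcopy(payload)
    options = result.setdefault("options", {})

    # Always turn off sensitive data discovery for bulk creation.
    options["disableSensitiveDataDiscovery"] = True

    # The tag is only needed when policies that require stats are in use.
    if uses_stats_policies:
        tags = options.setdefault("tableTags", [])
        if "Skip Stats Job" not in tags:
            tags.append("Skip Stats Job")

    return result


if __name__ == "__main__":
    print(json.dumps(with_bulk_options({}, uses_stats_policies=True), indent=4))
```

The helper works on a copy, so the original payload can be reused for other requests.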

When the Snowflake bulk data source creation feature is configured, the create data source endpoint operates asynchronously and responds immediately with a bulkId that can be used for monitoring progress.

Monitor progress

To monitor the progress of the background jobs for the bulk data source creation, make a request to the jobs endpoint, passing the bulkId from the response of the previous step as the bulkId query parameter.
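
For scripted monitoring, a request object for the jobs endpoint could be built as in the sketch below. The function name build_jobs_request and its parameters are hypothetical. The /jobs path, the bulkId query parameter, and the headers mirror the curl example on this page, and the default HTTP method simply copies that example, so treat it as an assumption and confirm it against the API reference.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_jobs_request(base_url: str, bulk_id: str, token: str, method: str = "POST") -> Request:
    """Build (but do not send) a request for the status of a bulk operation."""
    # urlencode escapes characters in the bulkId that are unsafe in a query string.
    query = urlencode({"bulkId": bulk_id})
    url = f"{base_url.rstrip('/')}/jobs?{query}"
    return Request(
        url,
        method=method,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request.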

The response contains the job states and the number of jobs currently in each state. If errors were encountered during processing, the response also includes a list of errors; otherwise, errors is null.
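
The sketch below shows one way to interpret such a response and poll until the work is finished. It assumes the field names from the sample response on this page (total, completed, failed, pending, errors) and assumes the operation is finished once completed plus failed equals total; that completion rule is a guess, not documented behavior. The names summarize_jobs, wait_for_bulk, and fetch are hypothetical, and fetch stands for any function that returns the response body as a string.

```python
import json
import time
from typing import Callable


def summarize_jobs(body: str) -> dict:
    """Convert a job-status response into integers plus a `done` flag."""
    data = json.loads(body)
    # The sample response reports counts as strings, so cast them to int.
    summary = {
        key: int(data.get(key) or 0)
        for key in ("total", "completed", "failed", "pending")
    }
    # `errors` is null when nothing went wrong; normalize that to an empty list.
    summary["errors"] = data.get("errors") or []
    # Assumption: every job has ended once completed + failed reaches total.
    summary["done"] = summary["completed"] + summary["failed"] == summary["total"]
    return summary


def wait_for_bulk(fetch: Callable[[], str], interval_seconds: float = 60.0, max_polls: int = 1000) -> dict:
    """Call `fetch` until the jobs are done or `max_polls` is reached."""
    summary = summarize_jobs(fetch())
    for _ in range(max_polls - 1):
        if summary["done"]:
            break
        time.sleep(interval_seconds)
        summary = summarize_jobs(fetch())
    return summary
```

Keeping the HTTP call behind fetch lets the polling logic be exercised with canned responses before it is pointed at a real tenant.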

With these recommended configurations, bulk creating 100,000 Snowflake data sources will take between six and seven hours for all associated jobs to complete.
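
That figure works out to roughly 4 to 4.6 data sources per second. For rough planning at other sizes, the hypothetical helper below scales the six-to-seven-hour range linearly. Linear scaling is an assumption, not something this page states; actual duration will depend on warehouse size and load.

```python
def estimate_hours(n_sources: int) -> tuple[float, float]:
    """Scale the documented 6-7 hour range for 100,000 data sources.

    Assumes duration grows linearly with the number of data sources.
    """
    scale = n_sources / 100_000
    return (6.0 * scale, 7.0 * scale)


if __name__ == "__main__":
    low, high = estimate_hours(25_000)
    print(f"Estimated duration: {low} to {high} hours")
```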

Request options for step 2:
"options": {
    "disableSensitiveDataDiscovery": true,
    "tableTags": [
        "Skip Stats Job"
    ]
}

Request to monitor progress:
curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example_payload.json \
    "https://your-immuta-url.com/jobs?bulkId=<your-bulkId>"

Example response:
{
  "total":"99893",
  "completed":"99892",
  "failed":"0",
  "pending":"1",
  "errors":null
}