1 of 6

Immuta V2 API

Policy as code benefits

Reduces complexity: The data source API has been simplified to only require the connection information in most instances and one endpoint for all database technologies.
Maintains less state: Whether updating or creating, the same endpoint is used, and the same data is passed. No ids are required, so no additional state is required.
Requires fewer steps: Only an API key is required; no additional authentication step is required before using the API.
Integrates with Git: Define data sources and policies in files that can be tracked in Git and easily pushed to Immuta. Both JSON and YAML are supported for more flexibility. (For example, use YAML to add comments in files.)

Authentication

Before using the Immuta API, users need to authenticate with an API key. To generate an API key, complete the following steps in the Immuta UI.

Click the user icon and select Profile from the menu.
Go to the API Keys tab and then click Generate Key.
Complete the required fields in the modal and click Create.

Pass the key that is provided in the Authorization header:

curl --data-binary project.yaml -H "Authorization: ${IMMUTA_API_KEY}" -H "Content-Type: text/plain" https:/<immuta-url>/api/v2/project

Endpoints and details

All of the API endpoints described below take either JSON or YAML, and the endpoint and payload are the same for both creating and updating data sources, policies, projects, and purposes.

Create a data source

The V2 API is built to easily enable an “as-code” approach to managing your data sources, so each time you POST data to this endpoint, you are expected to provide complete details of what you want in Immuta. The two examples below illustrate this design:

If you POST once explicitly defining a single table under sources, and then POST a second time with a different table, this will result in a single data source in Immuta pointing to the second table and the first data source will be deleted or disabled (depending on the value specified for hardDelete).
If you POST once with two tableTags specified (e.g., Tag.A and Tag.B) and do a follow-up POST with tableTags: [Tag.C], only Tag.C will exist on all of the tables specified; tags Tag.A and Tag.B will be removed from all the data sources. Note: If you are frequently using the v2 API to update data tags, consider using the custom REST catalog integration instead.

Through this endpoint, you can create or update all data sources for a given schema or database.

POST /api/v2/data

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

wait

number The number of seconds to wait for data sources to be created before returning. Anything less than 0 will wait indefinitely. Default: 0

Payload

Attribute

Description

connectionKey

string A key/name to uniquely identify this collection of data sources.

object Connection information.

object Supply a template to override naming conventions. Immuta will use the system default if not supplied.

object Override options for these sources. If not provided, system defaults will all be used.

object Specify owners for all data sources created. If an empty array is provided, all data owners (other than the calling user) will be removed from the data source. To allow for an external process (or the UI) to control data owners, if the element is completely missing from the payload, data owners will not be modified.

object Configure which sources are created. If not provided, all sources from the given connection will be created.

Note: See Create Data Source Payload Attribute Details for more details about these attributes.

Request payload examples

Basic Data Source
Databricks Data Source (Override Naming Convention)
Impala Data Source (with userFile)
Basic Postgres Data Source
Snowflake Data Source (Specifies Sources)

Create a policy

POST /api/v2/policy

Requirements:

Immuta permission GOVERNANCE

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

reCertify

boolean If true (and if the certification has changed), someone will need to re-certify this policy on all impacted data sources. Default: false

Payload

Attribute

Description

policyKey

string A key/name to uniquely identify this policy.

name

string The name of the policy.

type

subscription or data The type of policy.

actions

ownerRestrictions (optional)

object[] Object identifying the entities to which this global policy should be restricted.

circumstances (optional)

object When this policy should get applied

circumstanceOperator (optional)

all or any Specify whether "all" of the circumstances must be met for the policy to be applied, or just "any" of them.

staged (optional)

boolean Whether or not this global policy is in a staged. status. Default: false

certification (optional)

object Certification information for the global policy.

Note: See Policy Request Payload Examples for payload details.

Create a project

POST /api/v2/project

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

deleteDataSourcesOnWorkspaceDelete

boolean If true, will delete all data and the data sources associated with a project workspace when the workspace is deleted. Default: false

Payload

Attribute

Description

projectKey

string A key/name to uniquely identify this project.

name

string The name of the project.

description (optional)

string A short description for the project.

documentation (optional)

object Markdown-supported documentation for this project.

allowedMaskedJoins (optional)

boolean If true, will allow joining on masked columns between data sources in this project. Only certain policies allow masked join. Default: false

purposes (optional)

string[] The list of purposes to add to this project.

datasources (optional)

string[] The list of data sources to add to this project.

subscriptionPolicy (optional)

object The policy for which users can subscribe to this project. Default: manual subscription policy

workspace (optional)

object If this is a workspace project, this is the workspace configuration. The project will automatically be equalized.

equalization (optional)

boolean If true, will normalize all users to the same entitlements so that everyone sees the same data. Default: false

tags (optional)

string[] Tags to add to the project.

Note: See Project Request Payload Examples for payload details.

Request payload examples

Basic Project
Project: Anyone Can Subscribe
Project: Anyone Who is Approved
Project: Users with Specific Groups or Attributes
Project with Databricks Spark Workspace
Project with Snowflake Workspace

Create a purpose

POST /api/v2/purpose

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

reAcknowledgeRequired

boolean If true, will require all users of any projects using this purpose to re-acknowledge any updated acknowledgement statements. Default: false

Payload

Attribute

Description

name

string The name of the purpose.

description (optional)

string A short description for the purpose.

acknowledgement (optional)

string The acknowledgement that users must agree to when joining a project with this purpose. If not provided, the system default will be used.

kAnonNoiseReduction (optional)

string The level of reduction allowed when doing policy adjustments on data sources in projects with this purpose.

Note: See Purposes Request Payload Examples for payload details.

Request payload examples

Basic Purpose
Advanced Purpose
Sub-Purpose

Best practices

Register all tables in a schema by enabling schema monitoring. Schema monitoring will negate the need to re-call the V2 /data endpoint when you have new tables because schema monitoring will automatically recognize and register them.
To frequently update data tags on a data source, use the custom REST catalog integration instead of the V2/data endpoint.
Use the Data engineering with limited policy downtime guide. Rather than relying on re-calling the V2 /data endpoint after a dbt run to update your data sources, follow the dbt and transform workflow and use schema monitoring to recognize changes to your data sources and reapply policies.

Data Source Payload Attribute Details

`connectionKey`

The connectionKey is a unique identifier for the collection of data sources being created. If an existing connectionKey is used with new connection information, it will delete the old data sources and create new ones from the new information in the payload.

`connection`

Attribute

Description

Required or optional

handler

Snowflake

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Required

schema

string The schema in the remote database.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

warehouse

string The default pool of compute resources Immuta will use to run queries and other Snowflake operations.

Required

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

authenticationMethod

string The type of authentication method to use. Options include userPassword, keyPair, and oAuthClientCredentials.

Required

username

string The username used to connect to the remote database.

Required if using userPassword or keyPair.

password

string The password used to connect to the remote database.

Required if using userPassword.

useCertificate

boolean Set to true when using client certificate credentials to request an access token. Otherwise, set to false to use client secret.

Required if using oAuthClientCredentials.

userFiles

object Details about the files required for the request.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

keyName

string The connection name of the key file. Must be PRIV_KEY_FILE if using keyPair, or must be oauth client certificate if using oAuthClientCredentials.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

content

string The content of the file, base-64 encoded.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

userFilename

string The name of the file - for display purposes in the UI.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

Attribute

Description

Required or optional

handler

Databricks

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

authenticationMethod

string The type of authentication method to use. Options include oAuthM2M and token.

Required

token

string The Databricks personal access token for the service principal created for Immuta.

Required if using token authentication.

useCertificate

boolean True when using client certificate credentials to request an access token. Otherwise, client secret.

Required if using oAuthM2M.

clientId

string The client identifier of the Immuta service principal you configured. This is the client ID displayed in Databricks when creating the client secret for the service principal.

Required if using oAuthM2M.

audience

string The audience for the OAuth Client Credential token request.

Required if using oAuthM2M.

clientSecret

string An application password an app can use in place of a certificate to identity itself.

Required if using oAuthM2M and useCertificate is set to false.

certificateThumbprint

string The certificate thumbprint to use to generate the JWT for the OAuth Client Credential request.

Required if using oAuthM2M and useCertificate is set to true.

scope

Optional

httpPath

string The HTTP path of your Databricks cluster or SQL warehouse.

Required

Attribute

Description

Required or optional

handler

Redshift

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Optional

schema

string The schema in the remote database.

Required

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

authenticationMethod

string The type of authentication method to use. Options include userPassword and okta.

Required

username

string The username used to connect to the remote database.

Required

password

string The password used to connect to the remote database.

Required

idpHost

string The Okta identity provider host URL.

Required if using okta.

appID

string The Okta application ID.

Required if using okta.

role

string The Okta role.

Required if using okta.

Attribute

Description

handler

Google BigQuery, Presto, and Trino

ssl

boolean Set to true to enable SSL communication with the remote database.

database

string The database name.

schema

string The schema in the remote database.

userFiles

array Array of objects; each object must have keyName (corresponds to a connection string option), content (base-64 encoded content), and userFilename (the name of the file - for display purposes in the app).

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

hostname

string The hostname of the remote database instance.

port

number The port of the remote database instance.

authenticationMethod

string The type of authentication method to use. Options include userPassword, keyPair, oAuthClientCredentials, token, oAuthM2M, keyFile, auto and okta.

username

string The username used to connect to the remote database.

password

string The password used to connect to the remote database.

sid

string For Google BigQuery, the BigQuery project used to build the connection string.

Special Cases

BigQuery: Does not require hostname and password. Requires sid, which is the GCP project ID, and userFiles with the keyName of KeyFilePath and the base64-encoded keyfile.json.
Trino: authenticationMethod can be No Authentication, LDAP Authentication, or Kerberos Authentication.

`nameTemplate`

Attribute

Description

dataSourceFormat

string Format to be used to name the data sources created in this group.

schemaFormat

string Format to be used to name the Immuta schema created in this group.

tableFormat

string Format to be used to name the Immuta table created in this group.

schemaProjectNameFormat

string Format to be used to name the Immuta schema project created in this group.

Available templates include

<tablename>
<schema>
<database>

All cases of the name in Immuta should be lowercase.

For example, consider a table TPC.CUSTOMER that is given the following nameTemplate:

dataSourceFormat: <schema> <tablename>
tableFormat: <tablename>
schemaFormat: <schema>
schemaProjectNameFormat: <schema>

This nameTemplate will produce a data source named tpc.customer in a schema project named tpc.

`options`

Attribute

Description

staleDataTolerance

integer The length in seconds that data for these sources can be cached.

disableSensitiveDataDiscovery

domainCollectionId

hardDelete

boolean If true, when the table backing the data source is no longer available, the data source in Immuta is deleted. If this is false, the data source will be disabled. Default: false.

tableTags

array An array of tags (strings) to place at the data source level on every data source.

`owners`

Attribute

Description

type

group or user The type of owner that is being added.

name

string The name of the group or the user (username they log in with).

iam (optional)

string The ID of the identity manager system the user or group comes from. If excluded, any user/group that matches will be added as an owner.

`sources`

Best practice: Use Subscription Policies to Control Access

If you are not tagging individual columns, omit sources to create data sources for all tables in the schema or database, and then use Subscription Policies to control access to the tables instead of excluding them from Immuta.

This attribute configures which sources are created. If sources is not provided, all sources from the given connection will be created.

There are 3 types of sources than can be specified:

all tables
query
table

Recommended: Specify All Tables

If you specify any sources (either tables or queries), but you still want to create data sources for the rest of the tables in the schema or database, you can specify all as a source:

sources:
  - all: true

Best practice: Use schema monitoring

Excluding sources or specifying all: true will turn on automatic schema monitoring in Immuta. As tables are added or removed, Immuta will look for those changes on a schedule (by default, once a day) and either disable or delete data sources for removed tables or create data sources for new tables. New tables will be tagged New so that you can build a policy to restrict access to new tables until they are evaluated by data owners. Data owners will be notified of new tables, and all subscribers will be notified if data sources are disabled or deleted.

Specify a Query

Immuta recommends creating a view in your remote database instead of using this option, but if that is not possible, you can create data sources based on SQL statements:

sources:
  - query: “select * from table”
    naming:
      datasource: “My Source”,
      table: “my_source”,
      schema: “queries”

Specify a Table

If you want to select specific tables to be created as data sources, or if you want to tag individual data sources or columns within a data source, you need to leverage this parameter:

sources:
  - table: name_of_table
    schema: name_of_schema

Additional Options

When specifying a table or query there are other options that can be specified:

Option

Description

columnDescriptions

description

A short description for the data source.

documentation

Markdown-supported documentation for the data source.

naming

owners

Specify owners for an individual data source. The payload is the same as owners at the root level.

Columns

If any columns are specified, those are the only columns that will be available in the data source.
If no columns are specified, Immuta will look for new or removed columns on a schedule (by default, once a day) and add or remove columns from the data sources automatically as needed.
New columns will be tagged New, so you can build a policy to automatically mask new columns until they are approved.
Data Owners will be notified when columns are added or removed.

columns is an array of objects for each column:

Attribute

Description

name

The column name.

dataType

The data type.

nullable

Whether or not the column contains null.

remoteType

The actual data type in the remote database.

primaryKey

Specify whether this is the primary key of the remote table.

description

Describe the column.

Column Descriptions

You can add descriptions to columns without having to specify all the columns in the data source. columnDescriptions is an array of objects with the following schema:

Attribute

Description

columnName

string The column name.

description

string The description of the column.

columnDescriptions:
  - columnName: acct_num
    description: The account number

Data Source Request Payload Examples

Audience: Data Engineers
Content Summary: This page contains example request payloads for creating data sources.

Basic Data Source

connectionKey: my-databricks
connection:
    hostname: your.databricks.hostname.com
    port: 443
    ssl: true
    database: tpc
    username: token
    password: "${DATABRICKS_PASSWORD}"
    httpPath: sql/protocolv1/o/0/11101101
    handler: Databricks

Data Source (with More Options)

connectionKey: my-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-11123
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: data
    tags:
      table:
        - PCI
        - SENSITIVE
      columns:
      - columnName: transaction_date
        tags:
          - PCI
          - DATE
  - table: crime_data
    schema: data
    naming:
        datasource: Crime Data
        table: crime_data
        schema: databricks

Databricks Data Source (M2M OAuth - Azure Databricks)

connectionKey: my-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
  schemaProjectNameFormat: <schema>
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  authenticationMethod: oAuthM2M
  useCertificate: false
  clientId: "${service_principal_clientId}"
  audience: https://your.databricks.hostname.com/oidc/v1/token 
  scope: all-apis
  clientSecret: "${clientSecret}"
  httpPath: sql/protocolv1/o/0/1110-11123
  handler: Databricks

Databricks Data Source (Override Naming Convention)

connectionKey: ebock-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: ebock
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-185737-wove
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: ebock
  - table: crime_data_delta
    schema: ebock
    naming:
        datasource: Crime Data
        table: crime_data
        schema: databricks
  - table: hipaa_data
    schema: ebock

Redshift Spectrum Data Sources

Your nativeSchemaFormat must contain _immuta to avoid schema name conflicts.

connectionKey: redshift
connection:
  hostname: your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com
  port: 5439
  ssl: true
  database: your_database_with_external_schema
  username: awsuser
  password: your_password
  handler: Redshift
  schema: external_schema
nameTemplate:
  dataSourceFormat: <Tablename>
  schemaFormat: <schema>
  tableFormat: <tablename>
  schemaProjectNameFormat: <Schema>
  nativeSchemaFormat: <schema>_immuta
  nativeViewFormat: <tablename>
sources:
  - all: true

Snowflake Data Source (Specify Sources)

connectionKey: tpc-snowflake
nameTemplate:
  dataSourceFormat: Snowflake <Tablename>
  tableFormat: <tablename>
  schemaFormat: snowflake
connection:
  hostname: example.hostname.snowflakecomputing.com
  port: 443
  ssl: true
  database: TPC
  username: USERA
  password: "${SNOWFLAKE_PASSWORD}"
  schema: PUBLIC
  warehouse: IT_WH
  handler: Snowflake
sources:
  - table: CASE
    schema: PUBLIC
  - table: CASE2
    schema: PUBLIC
  - table: CUSTOMER
    schema: PUBLIC
  - table: WEB_SALES
    schema: PUBLIC

Create Policies API Examples

Subscription Policies

name: Anyone
policyKey: subscription anyone
type: subscription
actions:
  type: anyone
  automaticSubscription: false
  description: Rationale
circumstances:
- type: tags
  tag: Discovered

name: Approval
policyKey: subscription approval
type: subscription
actions:
  type: approval
  approvals:
  - specificApproverRequired: false
    requiredPermission: OWNER
  - specificApproverRequired: true
    requiredPermission: GOVERNANCE
  description: Rationale
circumstances:
- type: columnTags
  columnTag: Discovered

Users with Specific Groups or Attributes

name: Entitlement
policyKey: subscription entitlements
type: subscription
actions:
  type: entitlements
  entitlements:
    operator: any
    groups:
    - Employee
    attributes:
    - name: auth1
      value: SOMETHING_ELSE
  automaticSubscription: true
  allowDiscovery: false
  description: Some description here
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: false
staged: false

Users with Specific Groups or Attributes (Advanced)

name: Advanced Entitlement
policyKey: subscription entitlements advanced boolean
type: subscription
actions:
  type: entitlements
  advanced: "@isInGroups('Engineers', 'Founders'') AND @hasAttribute('Auth1', 'Super Secret')"
  automaticSubscription: true
  allowDiscovery: false
  description: Some description here
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: false
staged: false

Individual Users You Select

name: Manual
policyKey: subscription manual
type: subscription
actions:
  type: manual
  description: Rationale

Data Policies

Data Owner Restrictions

name: Owner Restricted Policy
policyKey: data owner restriction
type: data
ownerRestrictions:
  users:
  - iamid: bim
    username: user@example.com
  groups:
  - engineers
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Hash
circumstances:
- type: columnTags
  columnTag: Discovered.Passport

Masking Policies

Conditional Masking

name: Conditional Masking
policyKey: data conditional masking
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      conditionalPredicate: "@columnTagged('Discovered.Country') = 'USA'"
      maskingConfig:
        type: Hash
circumstanceOperator: all
circumstances:
- type: columnTags
  columnTag: Discovered.Passport
- type: columnTags
  columnTag: Discovered.Country

Conditional Masking (Using Otherwise Clause)

name: Conditional
policyKey: data mask otherwise
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      maskingConfig:
        type: "Null"
    inclusions:
      groups:
      - Employee
  - type: Masking
    exceptions:
      purposes:
      - Re-identification Prohibited
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      maskingConfig:
        type: Hash
circumstances:
- type: columnTags
  columnTag: Discovered.Country

With a Constant

name: Mask with Constant
policyKey: data mask constant
type: data
actions:
- rules:
  - type: Masking
    exceptions:
      operator: any
      attributes:
      - name: auth
        value: SOMETHING_ELSE
      - name: auth1
        value: super secret
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Constant
        constant: REDACTED
circumstanceOperator: any
circumstances:
- type: columnTags
  columnTag: Discovered.Country
- type: columnTags
  columnTag: Discovered.Passport

Format Preserving Masking

Support limitation: This policy is only supported in Snowflake integrations.

name: Format Preserving Masking
policyKey: data mask fpe
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered
      maskingConfig:
        type: Format Preserving Masking
circumstances:
- type: columnTags
  columnTag: Discovered

With Hashing (No Tags)

name: Hashing
policyKey: data mask hashing
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: noTags
      maskingConfig:
        type: Hash
circumstances:
  - type: noTags

K-Anonymization (Using Fingerprint)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

When a k-anonymization policy is applied to a data source, the columns targeted by the policy are queried under a fingerprinting process that generates rules enforcing k-anonymity. The results of this query, which may contain data that is subject to regulatory constraints such as GDPR or HIPAA, are stored in Immuta's metadata database.

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: K-Anonymization Using Fingerprint on any tags
policyKey: masking kanon using fingerprint
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
circumstances:
- type: anyTag

K-Anonymization (by Specifying K)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: K-Anonymization using kLevel
policyKey: data mask kanon specifying k
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
        kLevel: 5
circumstances:
- type: anyTag

K-Anonymization (by Specifying Re-identification Probability)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: K-Anonymization using reIdProbability
policyKey: data mask kanon specifying re-id
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
        reIdProbability: 15
circumstances:
- type: anyTag

Make Null Using Column Regex

name: Null using column regex
policyKey: data mask null
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnRegex
        regex: ssn
        caseInsensitive: true
      maskingConfig:
        type: "Null"
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: true

Randomized Response

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of randomized response policies

When a randomized response policy is applied to a data source, the columns targeted by the policy are queried under a fingerprinting process. To enforce the policy, Immuta generates and stores predicates and a list of allowed replacement values that may contain data that is subject to regulatory constraints (such as GDPR or HIPAA) in Immuta's metadata database.

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: Random Categorical
policyKey: data mask random response
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: allColumns
      maskingConfig:
        type: Randomized Response
        replacementRatePercent: 10

Randomized Response (by Specifying Standard Deviation)

Support limitation: This policy is only supported in Snowflake integrations.

name: Random Numeric
policyKey: data mask random response specifying stddev
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: allColumns
      maskingConfig:
        type: Randomized Response
        stddev: 2
        clip: false

Using a Regex

name: Regex
policyKey: data mask regex
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Postal Code
      maskingConfig:
        type: Regular Expression
        regex: "(\\d{4})(\\d)"
        replacement: "$1X"
        caseInsensitive: true
        global: true
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Postal Code

With Reversibility

Support limitation: This policy is only supported in Snowflake integrations.

name: Mask using Reversible
policyKey: data mask reversible
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Social Security Number
      maskingConfig:
        type: Reversible
    exceptions:
      groups:
      - founders
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Social Security Number

Using Rounding (Date)

name: RoundingDate
policyKey: data mask rounding by date
type: data
actions:
- rules:
  - type: Masking
    exceptions:
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
        timePrecision: MONTH
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Using Rounding (Using Fingerprint)

Support limitation: Rounding using the fingerprint is only supported in Snowflake integrations.

name: RoundingFingerprint
policyKey: data mask round using fingerprint
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Using Rounding (Numeric)

name: RoundingNumeric
policyKey: data mask round numeric
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
        bucketSize: 10
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Minimize Data Created Between

name: Minimize
policyKey: data minimize
type: data
actions:
- rules:
  - type: Minimization
    config:
      percent: 15
circumstances:
- type: time
  startDate: '2020-12-01T16:23:54.734Z'
  endDate: '2020-12-31T16:23:54.745Z'

Purpose Restrictions

Any Purpose

name: Purpose
policyKey: data purpose restriction
type: data
actions:
- rules:
  - type: Purpose Restriction
    config:
        operator: any
        purposes:
        - "<ANY PURPOSE>"

Purpose in Server

name: Purpose in a specific server
policyKey: data server circumstance
type: data
actions:
- rules:
  - type: Purpose Restriction
    config:
        purposes:
          - Re-identification Prohibited
circumstances:
- type: server
  server: your@server.example.com:5432/tpc

Row-level Policy

By Time

name: Row Level By Time
policyKey: data row-level
type: data
actions:
- rules:
  - type: Time Restriction
    config:
      isOlderOrNewer: newer
      time: 2592000
circumstances:
- type: tags
  tag: Discovered.PCI

Where User

name: Row Level Where User
policyKey: data where user
type: data
actions:
- rules:
  - type: Row Restriction By User Entitlements
    config:
      operator: all
      matches:
        type: group
        tag: Discovered.Entity
circumstanceOperator: ANY
circumstances:
- type: columnTags
  columnTag: Discovered.Entity

Custom Where Clause

name: Row Level Where
policyKey: data custom where
type: data
actions:
- rules:
  - type: Row Restriction by Custom Where Clause
    config:
      predicate: "@columnTagged('Discovered.Country')  in ('USA', 'CANADA', 'MEXICO')"
circumstances:
- type: tags
  tag: Discovered.Country

Multiple Policies

name: Multiple
policyKey: data multiple
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Hash
  description: 'Passport Rule'
- rules:
  - type: Minimization
    config:
      percent: 25
  description: 'Passport Rule, also'
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Healthcare NPI
      maskingConfig:
        type: "Null"
  description: 'Healthcare NPI Rule'
circumstanceOperator: any
circumstances:
- type: columnTags
  columnTag: Discovered.Passport
- type: columnTags
  columnTag: Discovered.Person Name

Create Projects API Examples

Basic Project

name: A Bare Bones Project
projectKey: simplest possible project

name: Anyone Project
projectKey: Anyone project
documentation: "# Anyone Can See This"
description: "Anyone can join this project"
allowMaskedJoins: false
subscriptionPolicy:
    type: anyone
    automaticSubscription: true
    description: "Auto-subscribe everyone"
tags:
  - Discovered.Person Name
purposes:
  - Use Purposes
  - Purpose Hierarchy.Child 2.Grandchild 2

Project: Anyone Who is Approved

name: Approval Project
projectKey: Approval project
description: "Need approval to join this project"
allowMaskedJoins: true
subscriptionPolicy:
    type: approval
    approvals:
      - requiredPermission: GOVERNANCE
        specificApproverRequired: true
      - requiredPermission: ADMIN
        specificApproverRequired: false

Project: Users with Specific Groups or Attributes

name: Entitlement Project
projectKey: entitlement project
description: "Need specific entitlements to join this project"
subscriptionPolicy:
    type: entitlements
    automaticSubscription: false
    allowDiscovery: true
    entitlements:
      operator: any
      groups:
        - Engineers
        - Founders
      attributes:
        - name: Auth1
          value: super secret

Project with Databricks Spark Workspace

name: Databricks Spark Project
projectKey: databricks spark project
datasources:
  - Crime Data
  - Databricks Credit Card Transactions
  - Databricks Hipaa Data
purposes:
  - Use Purposes
workspace:
  type: databricks
  config:
    database: native
    directory: native
    workspaceConfigurationName: S3

Project with Snowflake Workspace

name: Snowflake Project
projectKey: snowflake project
datasources:
  - Snowflake Case
  - Snowflake Customer
  - Snowflake Web Sales
workspace:
  type: snowflake
  config:
    schema:
      SNOWFLAKE_NATIVE
    warehouses:
      - DEMO_WH
tags:
  - Discovered.Passport

Create Purposes API Examples

Basic Purpose

name: A basic purpose

Advanced Purpose

name: Use Purposes
acknowledgement: You promise to be good
description: For use with projects to do policy adjustments
kAnonNoiseReduction: Medium

Sub-Purpose

name: Purpose Hierarchy
acknowledgement: The root acknowledgement
subpurposes:
  - name: Purpose Hierarchy.Child 1
    acknowledgement: Override the root acknowledgement
  - name: Purpose Hierarchy.Child 2
  - name: Purpose Hierarchy.Child 1.Grandchild 1
  - name: Purpose Hierarchy.Child 1.Grandchild 2
  - name: Purpose Hierarchy.Child 2.Grandchild 1
  - name: Purpose Hierarchy.Child 2.Grandchild 2

Data Source Payload Attribute Details

`connectionKey`

`connection`

Attribute

Description

Required or optional

handler

Snowflake

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Required

schema

string The schema in the remote database.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

warehouse

string The default pool of compute resources Immuta will use to run queries and other Snowflake operations.

Required

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

authenticationMethod

string The type of authentication method to use. Options include userPassword, keyPair, and oAuthClientCredentials.

Required

username

string The username used to connect to the remote database.

Required if using userPassword or keyPair.

password

string The password used to connect to the remote database.

Required if using userPassword.

useCertificate

boolean Set to true when using client certificate credentials to request an access token. Otherwise, set to false to use client secret.

Required if using oAuthClientCredentials.

userFiles

object Details about the files required for the request.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

keyName

string The connection name of the key file. Must be PRIV_KEY_FILE if using keyPair, or must be oauth client certificate if using oAuthClientCredentials.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

content

string The content of the file, base-64 encoded.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

userFilename

string The name of the file - for display purposes in the UI.

Required if using keyPair or oAuthClientCredentials with useCertificate set to true.

Attribute

Description

Required or optional

handler

Databricks

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

authenticationMethod

string The type of authentication method to use. Options include oAuthM2M and token.

Required

token

string The Databricks personal access token for the service principal created for Immuta.

Required if using token authentication.

useCertificate

boolean True when using client certificate credentials to request an access token. Otherwise, client secret.

Required if using oAuthM2M.

clientId

string The client identifier of the Immuta service principal you configured. This is the client ID displayed in Databricks when creating the client secret for the service principal.

Required if using oAuthM2M.

audience

string The audience for the OAuth Client Credential token request.

Required if using oAuthM2M.

clientSecret

string An application password an app can use in place of a certificate to identity itself.

Required if using oAuthM2M and useCertificate is set to false.

certificateThumbprint

string The certificate thumbprint to use to generate the JWT for the OAuth Client Credential request.

Required if using oAuthM2M and useCertificate is set to true.

scope

clientSecret The scope limits the operations and roles allowed in Databricks by the access token. See the for details about scopes.

Optional

httpPath

string The HTTP path of your Databricks cluster or SQL warehouse.

Required

Attribute

Description

Required or optional

handler

Redshift

Required

ssl

boolean Set to true to enable SSL communication with the remote database.

Optional

database

string The database name.

Optional

schema

string The schema in the remote database.

Required

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

Optional

hostname

string The hostname of the remote database instance.

Required

port

number The port of the remote database instance.

Optional

authenticationMethod

string The type of authentication method to use. Options include userPassword and okta.

Required

username

string The username used to connect to the remote database.

Required

password

string The password used to connect to the remote database.

Required

idpHost

string The Okta identity provider host URL.

Required if using okta.

appID

string The Okta application ID.

Required if using okta.

role

string The Okta role.

Required if using okta.

Attribute

Description

handler

Google BigQuery, Presto, and Trino

ssl

boolean Set to true to enable SSL communication with the remote database.

database

string The database name.

schema

string The schema in the remote database.

userFiles

connectionStringOptions

string Additional connection string options to be used when connecting to the remote database.

hostname

string The hostname of the remote database instance.

port

number The port of the remote database instance.

authenticationMethod

string The type of authentication method to use. Options include userPassword, keyPair, oAuthClientCredentials, token, oAuthM2M, keyFile, auto and okta.

username

string The username used to connect to the remote database.

password

string The password used to connect to the remote database.

sid

string For Google BigQuery, the BigQuery project used to build the connection string.

Special Cases

BigQuery: Does not require hostname and password. Requires sid, which is the GCP project ID, and userFiles with the keyName of KeyFilePath and the base64-encoded keyfile.json.
Trino: authenticationMethod can be No Authentication, LDAP Authentication, or Kerberos Authentication.

`nameTemplate`

Attribute

Description

dataSourceFormat

string Format to be used to name the data sources created in this group.

schemaFormat

string Format to be used to name the Immuta schema created in this group.

tableFormat

string Format to be used to name the Immuta table created in this group.

schemaProjectNameFormat

string Format to be used to name the Immuta schema project created in this group.

Available templates include

<tablename>
<schema>
<database>

All cases of the name in Immuta should be lowercase.

For example, consider a table TPC.CUSTOMER that is given the following nameTemplate:

dataSourceFormat: <schema> <tablename>
tableFormat: <tablename>
schemaFormat: <schema>
schemaProjectNameFormat: <schema>

This nameTemplate will produce a data source named tpc.customer in a schema project named tpc.

`options`

Attribute

Description

staleDataTolerance

integer The length in seconds that data for these sources can be cached.

disableSensitiveDataDiscovery

boolean If true, Immuta will not perform . Default: false.

domainCollectionId

string The ID of the domain to assign the data sources to. Use the to retrieve domains and domain IDs.

hardDelete

boolean If true, when the table backing the data source is no longer available, the data source in Immuta is deleted. If this is false, the data source will be disabled. Default: false.

tableTags

array An array of tags (strings) to place at the data source level on every data source.

`owners`

Attribute

Description

type

group or user The type of owner that is being added.

name

string The name of the group or the user (username they log in with).

iam (optional)

string The ID of the identity manager system the user or group comes from. If excluded, any user/group that matches will be added as an owner.

`sources`

Best practice: Use Subscription Policies to Control Access

This attribute configures which sources are created. If sources is not provided, all sources from the given connection will be created.

There are 3 types of sources than can be specified:

all tables
query
table

Recommended: Specify All Tables

If you specify any sources (either tables or queries), but you still want to create data sources for the rest of the tables in the schema or database, you can specify all as a source:

sources:
  - all: true

Best practice: Use schema monitoring

Specify a Query

Immuta recommends creating a view in your remote database instead of using this option, but if that is not possible, you can create data sources based on SQL statements:

sources:
  - query: “select * from table”
    naming:
      datasource: “My Source”,
      table: “my_source”,
      schema: “queries”

Specify a Table

If you want to select specific tables to be created as data sources, or if you want to tag individual data sources or columns within a data source, you need to leverage this parameter:

sources:
  - table: name_of_table
    schema: name_of_schema

Additional Options

When specifying a table or query there are other options that can be specified:

Option

Description

columnDescriptions

description

A short description for the data source.

documentation

Markdown-supported documentation for the data source.

naming

See the example above in . This is required for query-based sources, but is optional for table-based sources and can be used to override the nameTemplate provided for the whole database/schema.

owners

Specify owners for an individual data source. The payload is the same as owners at the root level.

Columns

If any columns are specified, those are the only columns that will be available in the data source.
If no columns are specified, Immuta will look for new or removed columns on a schedule (by default, once a day) and add or remove columns from the data sources automatically as needed.
New columns will be tagged New, so you can build a policy to automatically mask new columns until they are approved.
Data Owners will be notified when columns are added or removed.

columns is an array of objects for each column:

Attribute

Description

name

The column name.

dataType

The data type.

nullable

Whether or not the column contains null.

remoteType

The actual data type in the remote database.

primaryKey

Specify whether this is the primary key of the remote table.

description

Describe the column.

Column Descriptions

You can add descriptions to columns without having to specify all the columns in the data source. columnDescriptions is an array of objects with the following schema:

Attribute

Description

columnName

string The column name.

description

string The description of the column.

columnDescriptions:
  - columnName: acct_num
    description: The account number

Create Policies API Examples

Subscription Policies

name: Anyone
policyKey: subscription anyone
type: subscription
actions:
  type: anyone
  automaticSubscription: false
  description: Rationale
circumstances:
- type: tags
  tag: Discovered

name: Approval
policyKey: subscription approval
type: subscription
actions:
  type: approval
  approvals:
  - specificApproverRequired: false
    requiredPermission: OWNER
  - specificApproverRequired: true
    requiredPermission: GOVERNANCE
  description: Rationale
circumstances:
- type: columnTags
  columnTag: Discovered

Users with Specific Groups or Attributes

name: Entitlement
policyKey: subscription entitlements
type: subscription
actions:
  type: entitlements
  entitlements:
    operator: any
    groups:
    - Employee
    attributes:
    - name: auth1
      value: SOMETHING_ELSE
  automaticSubscription: true
  allowDiscovery: false
  description: Some description here
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: false
staged: false

Users with Specific Groups or Attributes (Advanced)

name: Advanced Entitlement
policyKey: subscription entitlements advanced boolean
type: subscription
actions:
  type: entitlements
  advanced: "@isInGroups('Engineers', 'Founders'') AND @hasAttribute('Auth1', 'Super Secret')"
  automaticSubscription: true
  allowDiscovery: false
  description: Some description here
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: false
staged: false

Individual Users You Select

name: Manual
policyKey: subscription manual
type: subscription
actions:
  type: manual
  description: Rationale

Data Policies

Data Owner Restrictions

name: Owner Restricted Policy
policyKey: data owner restriction
type: data
ownerRestrictions:
  users:
  - iamid: bim
    username: user@example.com
  groups:
  - engineers
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Hash
circumstances:
- type: columnTags
  columnTag: Discovered.Passport

Masking Policies

Conditional Masking

name: Conditional Masking
policyKey: data conditional masking
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      conditionalPredicate: "@columnTagged('Discovered.Country') = 'USA'"
      maskingConfig:
        type: Hash
circumstanceOperator: all
circumstances:
- type: columnTags
  columnTag: Discovered.Passport
- type: columnTags
  columnTag: Discovered.Country

Conditional Masking (Using Otherwise Clause)

name: Conditional
policyKey: data mask otherwise
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      maskingConfig:
        type: "Null"
    inclusions:
      groups:
      - Employee
  - type: Masking
    exceptions:
      purposes:
      - Re-identification Prohibited
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      maskingConfig:
        type: Hash
circumstances:
- type: columnTags
  columnTag: Discovered.Country

With a Constant

name: Mask with Constant
policyKey: data mask constant
type: data
actions:
- rules:
  - type: Masking
    exceptions:
      operator: any
      attributes:
      - name: auth
        value: SOMETHING_ELSE
      - name: auth1
        value: super secret
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Country
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Constant
        constant: REDACTED
circumstanceOperator: any
circumstances:
- type: columnTags
  columnTag: Discovered.Country
- type: columnTags
  columnTag: Discovered.Passport

Format Preserving Masking

Support limitation: This policy is only supported in Snowflake integrations.

name: Format Preserving Masking
policyKey: data mask fpe
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered
      maskingConfig:
        type: Format Preserving Masking
circumstances:
- type: columnTags
  columnTag: Discovered

With Hashing (No Tags)

name: Hashing
policyKey: data mask hashing
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: noTags
      maskingConfig:
        type: Hash
circumstances:
  - type: noTags

K-Anonymization (Using Fingerprint)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

To ensure this process does not violate your organization's data localization regulations, you need to first activate this masking policy type before you can use it in your Immuta tenant. To enable k-anonymization for your account, see the .

name: K-Anonymization Using Fingerprint on any tags
policyKey: masking kanon using fingerprint
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
circumstances:
- type: anyTag

K-Anonymization (by Specifying K)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: K-Anonymization using kLevel
policyKey: data mask kanon specifying k
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
        kLevel: 5
circumstances:
- type: anyTag

K-Anonymization (by Specifying Re-identification Probability)

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of k-anonymization policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: K-Anonymization using reIdProbability
policyKey: data mask kanon specifying re-id
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: anyTag
      maskingConfig:
        type: K-Anonymization
        reIdProbability: 15
circumstances:
- type: anyTag

Make Null Using Column Regex

name: Null using column regex
policyKey: data mask null
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnRegex
        regex: ssn
        caseInsensitive: true
      maskingConfig:
        type: "Null"
circumstances:
- type: columnRegex
  regex: ssn
  caseInsensitive: true

Randomized Response

Support limitation: This policy is only supported in Snowflake integrations.

Sample data is processed during computation of randomized response policies

The location of the metadata database depends on your deployment:

Self-managed Immuta deployment: The metadata database is located in the server where you have your external metadata database deployed.
SaaS Immuta deployment: The metadata database is located in the AWS global segment you have chosen to deploy Immuta.

name: Random Categorical
policyKey: data mask random response
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: allColumns
      maskingConfig:
        type: Randomized Response
        replacementRatePercent: 10

Randomized Response (by Specifying Standard Deviation)

Support limitation: This policy is only supported in Snowflake integrations.

name: Random Numeric
policyKey: data mask random response specifying stddev
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: allColumns
      maskingConfig:
        type: Randomized Response
        stddev: 2
        clip: false

Using a Regex

name: Regex
policyKey: data mask regex
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Postal Code
      maskingConfig:
        type: Regular Expression
        regex: "(\\d{4})(\\d)"
        replacement: "$1X"
        caseInsensitive: true
        global: true
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Postal Code

With Reversibility

Support limitation: This policy is only supported in Snowflake integrations.

name: Mask using Reversible
policyKey: data mask reversible
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Social Security Number
      maskingConfig:
        type: Reversible
    exceptions:
      groups:
      - founders
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Social Security Number

Using Rounding (Date)

name: RoundingDate
policyKey: data mask rounding by date
type: data
actions:
- rules:
  - type: Masking
    exceptions:
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
        timePrecision: MONTH
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Using Rounding (Using Fingerprint)

Support limitation: Rounding using the fingerprint is only supported in Snowflake integrations.

name: RoundingFingerprint
policyKey: data mask round using fingerprint
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Using Rounding (Numeric)

name: RoundingNumeric
policyKey: data mask round numeric
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Entity.Date
      maskingConfig:
        type: Grouping
        bucketSize: 10
circumstances:
- type: columnTags
  columnTag: Discovered.Entity.Date

Minimize Data Created Between

name: Minimize
policyKey: data minimize
type: data
actions:
- rules:
  - type: Minimization
    config:
      percent: 15
circumstances:
- type: time
  startDate: '2020-12-01T16:23:54.734Z'
  endDate: '2020-12-31T16:23:54.745Z'

Purpose Restrictions

Any Purpose

name: Purpose
policyKey: data purpose restriction
type: data
actions:
- rules:
  - type: Purpose Restriction
    config:
        operator: any
        purposes:
        - "<ANY PURPOSE>"

Purpose in Server

name: Purpose in a specific server
policyKey: data server circumstance
type: data
actions:
- rules:
  - type: Purpose Restriction
    config:
        purposes:
          - Re-identification Prohibited
circumstances:
- type: server
  server: your@server.example.com:5432/tpc

Row-level Policy

By Time

name: Row Level By Time
policyKey: data row-level
type: data
actions:
- rules:
  - type: Time Restriction
    config:
      isOlderOrNewer: newer
      time: 2592000
circumstances:
- type: tags
  tag: Discovered.PCI

Where User

name: Row Level Where User
policyKey: data where user
type: data
actions:
- rules:
  - type: Row Restriction By User Entitlements
    config:
      operator: all
      matches:
        type: group
        tag: Discovered.Entity
circumstanceOperator: ANY
circumstances:
- type: columnTags
  columnTag: Discovered.Entity

Custom Where Clause

name: Row Level Where
policyKey: data custom where
type: data
actions:
- rules:
  - type: Row Restriction by Custom Where Clause
    config:
      predicate: "@columnTagged('Discovered.Country')  in ('USA', 'CANADA', 'MEXICO')"
circumstances:
- type: tags
  tag: Discovered.Country

Multiple Policies

name: Multiple
policyKey: data multiple
type: data
actions:
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Passport
      maskingConfig:
        type: Hash
  description: 'Passport Rule'
- rules:
  - type: Minimization
    config:
      percent: 25
  description: 'Passport Rule, also'
- rules:
  - type: Masking
    config:
      fields:
      - type: columnTags
        columnTag: Discovered.Healthcare NPI
      maskingConfig:
        type: "Null"
  description: 'Healthcare NPI Rule'
circumstanceOperator: any
circumstances:
- type: columnTags
  columnTag: Discovered.Passport
- type: columnTags
  columnTag: Discovered.Person Name

Immuta V2 API

Policy as code benefits

Reduces complexity: The data source API has been simplified to only require the connection information in most instances and one endpoint for all database technologies.
Maintains less state: Whether updating or creating, the same endpoint is used, and the same data is passed. No ids are required, so no additional state is required.
Requires fewer steps: Only an API key is required; no additional authentication step is required before using the API.
Integrates with Git: Define data sources and policies in files that can be tracked in Git and easily pushed to Immuta. Both JSON and YAML are supported for more flexibility. (For example, use YAML to add comments in files.)

Authentication

Before using the Immuta API, users need to authenticate with an API key. To generate an API key, complete the following steps in the Immuta UI.

Click the user icon and select Profile from the menu.
Go to the API Keys tab and then click Generate Key.
Complete the required fields in the modal and click Create.

Pass the key that is provided in the Authorization header:

curl --data-binary project.yaml -H "Authorization: ${IMMUTA_API_KEY}" -H "Content-Type: text/plain" https:/<immuta-url>/api/v2/project

Endpoints and details

All of the API endpoints described below take either JSON or YAML, and the endpoint and payload are the same for both creating and updating data sources, policies, projects, and purposes.

Create a data source

If you POST once explicitly defining a single table under sources, and then POST a second time with a different table, this will result in a single data source in Immuta pointing to the second table and the first data source will be deleted or disabled (depending on the value specified for hardDelete).
If you POST once with two tableTags specified (e.g., Tag.A and Tag.B) and do a follow-up POST with tableTags: [Tag.C], only Tag.C will exist on all of the tables specified; tags Tag.A and Tag.B will be removed from all the data sources. Note: If you are frequently using the v2 API to update data tags, consider using the custom REST catalog integration instead.

Through this endpoint, you can create or update all data sources for a given schema or database.

POST /api/v2/data

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

wait

number The number of seconds to wait for data sources to be created before returning. Anything less than 0 will wait indefinitely. Default: 0

Payload

Attribute

Description

connectionKey

string A key/name to uniquely identify this collection of data sources.

object Connection information.

object Supply a template to override naming conventions. Immuta will use the system default if not supplied.

object Override options for these sources. If not provided, system defaults will all be used.

object Configure which sources are created. If not provided, all sources from the given connection will be created.

Note: See Create Data Source Payload Attribute Details for more details about these attributes.

Request payload examples

Basic Data Source
Databricks Data Source (Override Naming Convention)
Impala Data Source (with userFile)
Basic Postgres Data Source
Snowflake Data Source (Specifies Sources)

Create a policy

POST /api/v2/policy

Requirements:

Immuta permission GOVERNANCE

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

reCertify

boolean If true (and if the certification has changed), someone will need to re-certify this policy on all impacted data sources. Default: false

Payload

Attribute

Description

policyKey

string A key/name to uniquely identify this policy.

name

string The name of the policy.

type

subscription or data The type of policy.

actions

object The actual rules for this policy ().

ownerRestrictions (optional)

object[] Object identifying the entities to which this global policy should be restricted.

circumstances (optional)

object When this policy should get applied

circumstanceOperator (optional)

all or any Specify whether "all" of the circumstances must be met for the policy to be applied, or just "any" of them.

staged (optional)

boolean Whether or not this global policy is in a staged. status. Default: false

certification (optional)

object Certification information for the global policy.

Note: See Policy Request Payload Examples for payload details.

Request payload examples**

Subscription Policies
Data Policies:
Multiple Policies

Create a project

POST /api/v2/project

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

deleteDataSourcesOnWorkspaceDelete

boolean If true, will delete all data and the data sources associated with a project workspace when the workspace is deleted. Default: false

Payload

Attribute

Description

projectKey

string A key/name to uniquely identify this project.

name

string The name of the project.

description (optional)

string A short description for the project.

documentation (optional)

object Markdown-supported documentation for this project.

allowedMaskedJoins (optional)

boolean If true, will allow joining on masked columns between data sources in this project. Only certain policies allow masked join. Default: false

purposes (optional)

string[] The list of purposes to add to this project.

datasources (optional)

string[] The list of data sources to add to this project.

subscriptionPolicy (optional)

object The policy for which users can subscribe to this project. Default: manual subscription policy

workspace (optional)

object If this is a workspace project, this is the workspace configuration. The project will automatically be equalized.

equalization (optional)

boolean If true, will normalize all users to the same entitlements so that everyone sees the same data. Default: false

tags (optional)

string[] Tags to add to the project.

Note: See Project Request Payload Examples for payload details.

Request payload examples

Basic Project
Project: Anyone Can Subscribe
Project: Anyone Who is Approved
Project: Users with Specific Groups or Attributes
Project with Databricks Spark Workspace
Project with Snowflake Workspace

Create a purpose

POST /api/v2/purpose

Query parameters

Parameter

Description

dryRun

boolean If true, no updates will actually be made. Default: false

reAcknowledgeRequired

boolean If true, will require all users of any projects using this purpose to re-acknowledge any updated acknowledgement statements. Default: false

Payload

Attribute

Description

name

string The name of the purpose.

description (optional)

string A short description for the purpose.

acknowledgement (optional)

string The acknowledgement that users must agree to when joining a project with this purpose. If not provided, the system default will be used.

kAnonNoiseReduction (optional)

string The level of reduction allowed when doing policy adjustments on data sources in projects with this purpose.

Note: See Purposes Request Payload Examples for payload details.

Request payload examples

Basic Purpose
Advanced Purpose
Sub-Purpose

Best practices

Register all tables in a schema by enabling schema monitoring. Schema monitoring will negate the need to re-call the V2 /data endpoint when you have new tables because schema monitoring will automatically recognize and register them.
To frequently update data tags on a data source, use the custom REST catalog integration instead of the V2/data endpoint.
Use the Data engineering with limited policy downtime guide. Rather than relying on re-calling the V2 /data endpoint after a dbt run to update your data sources, follow the dbt and transform workflow and use schema monitoring to recognize changes to your data sources and reapply policies.

Immuta V2 API

Policy as code benefits

Authentication

Endpoints and details

Create a data source

Query parameters

Payload

Request payload examples

Create a policy

Query parameters

Payload

Request payload examples**

Create a project

Query parameters

Payload

Request payload examples

Create a purpose

Query parameters

Payload

Request payload examples

Best practices

Data Source Payload Attribute Details

connectionKey

connection

Special Cases

nameTemplate

options

owners

sources

Recommended: Specify All Tables

Specify a Query

Specify a Table

Additional Options

Columns

Column Descriptions

Tags

Data Source Request Payload Examples

Basic Data Source

Data Source (with More Options)

Databricks Data Source (M2M OAuth - Azure Databricks)

Databricks Data Source (Override Naming Convention)

Redshift Spectrum Data Sources

Snowflake Data Source (Specify Sources)

Create Policies API Examples

Subscription Policies

Anyone Can Subscribe

Anyone Can Subscribe When Approved

Users with Specific Groups or Attributes

Users with Specific Groups or Attributes (Advanced)

Individual Users You Select

Data Policies

Data Owner Restrictions

Masking Policies

Conditional Masking

Conditional Masking (Using Otherwise Clause)

With a Constant

Format Preserving Masking

With Hashing (No Tags)

K-Anonymization (Using Fingerprint)

K-Anonymization (by Specifying K)

K-Anonymization (by Specifying Re-identification Probability)

Make Null Using Column Regex

Randomized Response

Randomized Response (by Specifying Standard Deviation)

Using a Regex

With Reversibility

Using Rounding (Date)

Using Rounding (Using Fingerprint)

Using Rounding (Numeric)

Minimize Data Created Between

Purpose Restrictions

Any Purpose

Purpose in Server

Row-level Policy

By Time

Where User

Custom Where Clause

Multiple Policies

Create Projects API Examples

Basic Project

`connectionKey`

`connection`

`nameTemplate`

`options`

`owners`

`sources`

`connectionKey`

`connection`

`nameTemplate`

`options`

`owners`

`sources`