Sensitive Data Discovery (SDD) API Reference Guide

Note

Earlier documentation refers to rules and patterns as classifiers or identifiers, and to frameworks as templates. The API paths and attribute names on this page still use classifier and template.

Warning

Users with native SDD enabled can use the UI to complete the actions documented on this page; however, it is not recommended to use both the API and the UI to configure native SDD.

Workflow

  1. Create a pattern.
  2. Create an identification framework containing one or more rules.
  3. Search for patterns or identification frameworks.
  4. Apply an identification framework to one or more data sources.
  5. Run SDD on one or more data sources; tags are applied to columns where rules were matched.
  6. Update patterns or identification frameworks.
  7. Delete patterns or identification frameworks.
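
As a rough orientation, the workflow steps can be mapped to the endpoints documented on this page. The Python sketch below is illustrative only: the `WORKFLOW` dictionary and the `endpoint_url` helper are hypothetical names, while the methods and paths are copied from the endpoint tables on this page. The delete step is not covered by this sketch.

```python
# Workflow steps mapped to (HTTP method, path) pairs from the endpoint tables.
WORKFLOW = {
    "create pattern": [("POST", "sdd/classifier")],
    "create framework": [("POST", "sdd/template")],
    "search": [("GET", "sdd/classifier"), ("GET", "sdd/template")],
    "apply framework": [("PUT", "sdd/template/apply")],
    "run SDD": [("POST", "sdd/run")],
    "update": [
        ("PUT", "sdd/classifier/{classifierName}"),
        ("PUT", "sdd/template/{templateName}"),
    ],
}


def endpoint_url(base_url: str, path: str, **path_params: str) -> str:
    """Join the base URL and a path, filling in any {placeholders}."""
    return base_url.rstrip("/") + "/" + path.format(**path_params)


method, path = WORKFLOW["update"][0]
url = endpoint_url(
    "https://your-immuta-url.immuta.com", path, classifierName="MY_REGEX_PATTERN"
)
```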

Create a pattern

Endpoint

Method Path Purpose
POST sdd/classifier Create a pattern.

Query Parameters

None.

Payload Parameters

Attribute Description Required
name string Unique, request-friendly pattern name. Yes
displayName string Unique, human-readable pattern name. Yes
description string The pattern description. Yes
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. Yes
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. Yes
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied. This attribute is not supported with native SDD. No
config.tags array[string] The default resulting tags to apply when the pattern is matched; they must begin with the prefix Discovered. (for example, Discovered.regex-example). No
config.regex string A case-insensitive regular expression to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. No
config.columnNameRegex string A case-insensitive regular expression to match against column names. The pattern must have regex, columnNameRegex, or values to match to data. No
config.values array[string] The list of words included in the dictionary to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. No
config.caseSensitive boolean Indicates whether or not values are case sensitive. Defaults to false. No
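
The constraints in the table above can be checked on the client before a payload is sent. The following is a minimal sketch, assuming a hypothetical helper named `validate_pattern_payload`; it only mirrors the rules stated in the table, and the server's own validation may be stricter or differ in detail.

```python
ALLOWED_TYPES = {"regex", "dictionary", "columnNameRegex", "builtIn"}
REQUIRED_FIELDS = ("name", "displayName", "description", "type", "config")
MATCHERS = ("regex", "columnNameRegex", "values")


def validate_pattern_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in payload:
            problems.append(f"missing required attribute: {field}")
    if "type" in payload and payload["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {payload['type']}")
    config = payload.get("config") or {}
    # The pattern needs at least one way to match data.
    if not any(config.get(key) for key in MATCHERS):
        problems.append("config needs regex, columnNameRegex, or values")
    # Every default tag must carry the Discovered. prefix.
    for tag in config.get("tags", []):
        if not tag.startswith("Discovered."):
            problems.append(f"tag does not begin with 'Discovered.': {tag}")
    return problems


good = {
    "name": "MY_REGEX_PATTERN",
    "displayName": "My Regex Pattern",
    "description": "A regex pattern example",
    "type": "regex",
    "config": {"regex": "^[A-Z][a-z]+", "tags": ["Discovered.regex-example"]},
}
bad = {"name": "X", "type": "glob", "config": {"tags": ["PII"]}}
good_problems = validate_pattern_payload(good)
bad_problems = validate_pattern_payload(bad)
```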

Response Parameters

Attribute Description
createdBy array Includes details about the user who created the pattern, such as their profile id, name, and email.
name string Unique, request-friendly pattern name.
displayName string Unique, human-readable pattern name.
description string The pattern description.
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn.
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags.
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied.
config.tags array[string] The default resulting tags to apply to the data source when the pattern is matched.
config.columnNameRegex string A case-insensitive regular expression to match against column names.
config.regex string A case-insensitive regular expression to match against column values.
config.values array[string] The list of words included in the dictionary to match against column values.
config.caseSensitive boolean Indicates whether or not values are case sensitive.
createdAt date When the pattern was created.
updatedAt date When the pattern was last updated.

Request example

The following request creates a pattern from the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier
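
For readers scripting against the API, the same request can be prepared in Python with the standard library. This is a sketch rather than an official client: `build_create_pattern_request` is a hypothetical helper, the URL and token are the placeholder values used in the curl example, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_pattern_request(base_url, token, payload):
    """Prepare (but do not send) the POST request to sdd/classifier."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdd/classifier",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


payload = {
    "name": "MY_REGEX_PATTERN",
    "displayName": "My Regex Pattern",
    "description": "A regex pattern example",
    "type": "regex",
    "config": {"regex": "^[A-Z][a-z]+", "tags": ["Discovered.regex-example"]},
}
request = build_create_pattern_request(
    "https://your-immuta-url.immuta.com", "dea464c07bd07300095caa8", payload
)
# Passing `request` to urllib.request.urlopen would send it; that call is omitted.
```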

Payload examples

Regex pattern payload
{
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}
Dictionary pattern payload
{
  "name": "MY_DICTIONARY_PATTERN",
  "displayName": "My Dictionary Pattern",
  "description": "A dictionary pattern example",
  "type": "dictionary",
  "config": {
    "values": ["Bob", "Eve"],
    "caseSensitive": true,
    "minConfidence": 0.6,
    "tags": ["Discovered.dictionary-example", "Discovered.dictionary-pattern-example"]
  }
}
Column name pattern payload
{
  "name": "MY_COLUMN_NAME_PATTERN",
  "displayName": "My Column Name Pattern",
  "description": "A column name pattern example",
  "type": "columnNameRegex",
  "config": {
    "columnNameRegex": "ssn|social ?security",
    "tags": ["Discovered.column-name-regex"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-14T18:48:56.289Z"
}

Create an identification framework

Endpoint

Method Path Purpose
POST sdd/template Create an identification framework.

Query Parameters

None.

Payload Parameters

Attribute Description Required
name string Unique, request-friendly framework name. Yes
displayName string Unique, human-readable framework name. Yes
description string The framework description. Yes
classifiers array The patterns to include in the framework and any additional overrides for those patterns. Yes
classifiers.name string The name of the pattern to include in the framework. Yes
classifiers.overrides array The overrides to modify the pattern for this framework. No
classifiers.overrides.minConfidence string The minimum percentage confidence that the pattern must match for the resulting tags to be applied. This attribute is not supported with native SDD. No
classifiers.overrides.tags array The resulting tags to apply when the pattern is matched. These tags override the pattern's default tags and must begin with the prefix Discovered. (for example, Discovered.Entity.Age). No
sampleSize integer The number of records to sample from the data source to discover pattern matches. This attribute is not supported with native SDD. No
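
A framework payload is mostly a list of pattern names with optional per-pattern overrides. The sketch below assembles one; `framework_payload` and its arguments are hypothetical names chosen for illustration, and sampleSize is left out because the table notes it is not supported with native SDD.

```python
def framework_payload(name, display_name, description, patterns, tag_overrides=None):
    """Build a framework payload.

    patterns: iterable of pattern names to include.
    tag_overrides: optional dict mapping a pattern name to replacement tags.
    """
    tag_overrides = tag_overrides or {}
    classifiers = []
    for pattern_name in patterns:
        entry = {"name": pattern_name}
        tags = tag_overrides.get(pattern_name)
        if tags:
            bad = [tag for tag in tags if not tag.startswith("Discovered.")]
            if bad:
                raise ValueError(f"override tags must begin with 'Discovered.': {bad}")
            entry["overrides"] = {"tags": list(tags)}
        classifiers.append(entry)
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": classifiers,
    }


payload = framework_payload(
    "MY_FIRST_FRAMEWORK",
    "My First Framework",
    "This is the first framework I've created.",
    ["MY_COLUMN_NAME_PATTERN", "MY_REGEX_PATTERN", "AGE"],
    tag_overrides={"AGE": ["Discovered.Entity.Age"]},
)
```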

Response Parameters

Attribute Description
id integer The unique ID of the framework.
createdBy array Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

The following request creates an identification framework that contains 2 rules, saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template

Payload example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_PATTERN"
    },
    {
      "name": "MY_REGEX_PATTERN"
    },
    {
      "name": "AGE",
      "overrides": {
        "tags": [
          "Discovered.Entity.Age"
        ]
      }
    }
  ],
  "sampleSize": 100
}

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-14T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_PATTERN",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}

Search for patterns or identification frameworks

Method Path Purpose
GET sdd/classifier List or search patterns.
GET sdd/template List or search identification frameworks.
GET sdd/classifier/{classifierName} View a specific pattern by name.
GET sdd/template/{templateName} View a specific identification framework by name.
GET sdd/template/global View the current global framework.

List or search for patterns

Endpoint

Method Path Purpose
GET sdd/classifier List or search patterns.

Query Parameters

Attribute Description Required
sortField string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. No
sortOrder string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. No
offSet integer Use in combination with limit to fetch pages. No
limit integer Limits the number of results displayed per page. No
type array[string] Searches based on pattern type: regex, dictionary, builtIn, or columnNameRegex. No
searchText string A partial, case-insensitive search on name. No
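
Query parameters are easiest to get right when they are encoded programmatically, since characters such as spaces and `&` need escaping. The sketch below builds the list URL with `urllib.parse.urlencode`; `list_patterns_url` is a hypothetical helper, and the array-valued type parameter is left out because its wire encoding is not described here.

```python
from urllib.parse import urlencode


def list_patterns_url(base_url, **params):
    """Build the GET sdd/classifier URL, dropping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url.rstrip("/") + "/sdd/classifier"
    return f"{url}?{query}" if query else url


url = list_patterns_url(
    "https://your-immuta-url.immuta.com",
    sortField="name",
    sortOrder="asc",
    limit=5,
    searchText="social security",
)
# url is now:
# https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5&searchText=social+security
```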

Response Parameters

Attribute Description
count integer The number of patterns found matching the search criteria.
createdBy array Includes details about the user who created the pattern, such as their profile id, name, and email.
name string Unique, request-friendly pattern name.
displayName string Unique, human-readable pattern name.
description string The pattern description.
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn.
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags.
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied.
config.tags array[string] The default resulting tags to apply when the pattern is matched.
config.columnNameRegex string A case-insensitive regular expression to optionally match against column names.
config.regex string A case-insensitive regular expression to match against column values.
config.values array[string] The list of words included in the dictionary to match against column values.
config.caseSensitive boolean Indicates whether or not values are case sensitive.
createdAt date When the pattern was created.
updatedAt date When the pattern was last updated.

Request example

The following request lists 5 patterns.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    "https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5"

Response example

{
  "count": 67,
  "hits": [
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AGE",
      "displayName": "Age",
      "description": "Matches numeric strings between 10 and 199.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Indirect",
          "Discovered.PHI",
          "Discovered.Entity.Age"
        ],
        "conditionalTags": {}
      },
      "id": 3,
      "createdAt": "2021-10-28T07:34:58.761Z",
      "updatedAt": "2021-10-28T07:34:58.761Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "ARGENTINA_DNI_NUMBER",
      "displayName": "Argentina DNI Number",
      "description": "Matches strings consistent with Argentina National Identity (DNI) Number.  Requires an eight digit number with optional periods between the second and third and fifth and sixth digit.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Argentina",
          "Discovered.PHI",
          "Discovered.Entity.DNI Number"
        ],
        "conditionalTags": {}
      },
      "id": 4,
      "createdAt": "2021-10-28T07:34:58.769Z",
      "updatedAt": "2021-10-28T07:34:58.769Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_MEDICARE_NUMBER",
      "displayName": "Australia Medicare Number",
      "description": "Matches numeric strings consistent with Australian Medicare Number.  Requires a ten or eleven digit number.  The starting digit must be between 2 and 6, inclusive.  Optional spaces can be placed between the fourth and fifth and ninth and tenth digit.  Optional 11th separated by a `/` can be present.  A checksum is required.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Medicare Number"
        ],
        "conditionalTags": {}
      },
      "id": 5,
      "createdAt": "2021-10-28T07:34:58.779Z",
      "updatedAt": "2021-10-28T07:34:58.779Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_PASSPORT",
      "displayName": "Australia Passport",
      "description": "Matches strings consistent with Australian Passport number.  A 8 or 9 character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Passport"
        ],
        "conditionalTags": {}
      },
      "id": 26,
      "createdAt": "2021-10-28T07:34:59.010Z",
      "updatedAt": "2021-10-28T07:34:59.010Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_TAX_FILE_NUMBER",
      "displayName": "Australia Tax File Number",
      "description": "Matches strings consistent with Australia Tax File Number.  Requires a nine digit number with optional spaces between the third and fourth and sixth and seventh digits.  A checksum is also required",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Tax File Number"
        ],
        "conditionalTags": {}
      },
      "id": 6,
      "createdAt": "2021-10-28T07:34:58.789Z",
      "updatedAt": "2021-10-28T07:34:58.789Z"
    }
  ]
}
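
Because count reports the total number of matches while hits holds only the current page, a client can work out how many further requests it needs. The sketch below is a hedged illustration: `page_offsets` and `pattern_names` are hypothetical helpers, and it assumes that offSet counts records from zero rather than pages, which this page does not state.

```python
def page_offsets(count: int, limit: int) -> list:
    """Return the offsets needed to page through `count` results (assumes record offsets)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return list(range(0, count, limit))


def pattern_names(response: dict) -> list:
    """Extract the pattern names from one page of search results."""
    return [hit["name"] for hit in response.get("hits", [])]


# A trimmed-down version of the response shown above.
response = {
    "count": 67,
    "hits": [{"name": "AGE"}, {"name": "ARGENTINA_DNI_NUMBER"}],
}
offsets = page_offsets(response["count"], 5)
names = pattern_names(response)
```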

List or search for identification frameworks

Endpoint

Method Path Purpose
GET sdd/template List or search identification frameworks.

Query Parameters

Attribute Description Required
sortField string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. No
sortOrder string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. No
offSet integer Use in combination with limit to fetch pages. No
limit integer Limits the number of results displayed per page. No
classifiers array[string] Filters framework results to those containing the specified patterns. No
searchText string A partial, case-insensitive search on the framework name. No

Response Parameters

Attribute Description
count integer The number of identification frameworks found matching the search criteria.
id integer The unique ID of the framework.
createdBy array Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

The following request lists all identification frameworks.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template

Response example

{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_FRAMEWORK",
      "displayName": "My First Framework",
      "description": "This is the first framework I've created.",
      "sampleSize": 100,
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "id": 1,
      "createdAt": "2021-10-14T19:12:22.092Z",
      "updatedAt": "2021-10-14T19:12:22.092Z",
      "classifiers": [
        {
          "name": "MY_COLUMN_NAME_PATTERN",
          "overrides": {}
        },
        {
          "name": "MY_REGEX_PATTERN",
          "overrides": {}
        }
      ]
    }
  ]
}

View a pattern by name

Endpoint

Method Path Purpose
GET sdd/classifier/{classifierName} Get a pattern by name.

Path Parameters

Attribute Description Required
classifierName string The name of the pattern. Yes

Response Parameters

Attribute Description
id integer The unique ID of the pattern.
createdBy array Includes details about the user who created the pattern, such as their profile id, name, and email.
name string Unique, request-friendly pattern name.
displayName string Unique, human-readable pattern name.
description string The pattern description.
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn.
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags.
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied.
config.tags array[string] The name of the resulting tags to apply to the data source.
config.columnNameRegex string A case-insensitive regular expression to optionally match against column names.
config.regex string A case-insensitive regular expression to match against column values.
config.values array[string] The list of words included in the dictionary to match against column values.
config.caseSensitive boolean Indicates whether or not values are case sensitive.
createdAt date When the pattern was created.
updatedAt date When the pattern was last updated.

Request example

This request gets the pattern named MY_REGEX_PATTERN.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_PATTERN

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-18T16:48:18.819Z",
  "updatedAt": "2021-10-18T16:48:18.819Z"
}

View an identification framework by name

Endpoint

Method Path Purpose
GET sdd/template/{templateName} Get an identification framework by name.

Path Parameters

Attribute Description Required
templateName string The name of the identification framework. Yes

Response Parameters

Attribute Description
id integer The unique ID of the framework.
createdBy array Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

This request gets the identification framework named MY_FIRST_FRAMEWORK.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_FRAMEWORK

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-18T16:54:24.920Z",
  "updatedAt": "2021-10-18T16:54:24.920Z",
  "classifiers": [
    {
      "name": "MY_DICTIONARY_PATTERN",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}

View the current global framework

Endpoint

Method Path Purpose
GET sdd/template/global View the current global framework.

Query Parameters

None.

Response Parameters

Attribute Description
id integer The unique ID of the framework.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdBy array Includes details about the user who created the framework, such as their profile id, name, and email.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

This request gets the current global framework information.

curl -X 'GET' \
  'https://demo.immuta.com/sdd/template/global' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer 9ba76f3c64c345ad817fa467d7110556'

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 2,
    "name": "Jane Doe",
    "email": "jane.doe@immuta.com"
  },
  "id": 1,
  "createdAt": "2022-08-10T20:35:43.252Z",
  "updatedAt": "2022-08-10T20:35:43.252Z",
  "classifiers": [
    {
      "name": "AGE",
      "overrides": {}
    },
    {
      "name": "ETHNIC_GROUP",
      "overrides": {}
    }
  ]
}

Apply identification frameworks to data sources

Endpoint

Method Path Purpose
PUT sdd/template/apply Apply an identification framework to a set of data sources.

Query Parameters

None.

Payload Parameters

Attribute Description Required
template string The name of the identification framework to apply to the data sources. Pass null to clear the currently applied framework so that the data sources use the global framework. Yes
sources array[string] The names of the data sources to apply the framework to. Yes
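
When the payload is generated from Python, passing `None` for the framework name serializes to the JSON null that clears the applied framework. A minimal sketch, assuming a hypothetical helper named `apply_framework_payload`:

```python
import json


def apply_framework_payload(sources, template=None):
    """Build the sdd/template/apply body; template=None becomes JSON null."""
    if not sources:
        raise ValueError("at least one data source name is required")
    return json.dumps({"template": template, "sources": list(sources)})


apply_body = apply_framework_payload(["Public Case"], "MY_FIRST_FRAMEWORK")
clear_body = apply_framework_payload(["Public Case"])
# clear_body is '{"template": null, "sources": ["Public Case"]}'
```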

Response Parameters

Attribute Description
success boolean When true, the request was successful.

Request example

This request applies the MY_FIRST_FRAMEWORK framework to the Public Case data source.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/apply

Payload example

{
  "template": "MY_FIRST_FRAMEWORK",
  "sources": [
    "Public Case"
  ]
}

Response example

{
  "success": true
}

Run SDD on data sources

Endpoint

Method Path Purpose
POST sdd/run Run SDD on specified data sources.

Query Parameters

None.

Payload Parameters

Attribute Description Required
sources array[string] The names of the data sources to run SDD on. Yes, unless all is true
all boolean If true, SDD will run on all Immuta data sources. No
wait integer The number of seconds to wait for the SDD jobs to finish. The value -1 will wait until the jobs complete. Default is -1. No
dryRun boolean When true, SDD will not update the tags on the data source(s). Instead of applying tags, SDD returns the tags that would be applied to the data source. This allows users to evaluate whether or not rules or frameworks are applying tags correctly without updating the data source. Default is false. No
template string If passed, Immuta will run SDD with this framework instead of the applied framework on the data source(s). Passing template when dryRun is false will cause an error. No
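
The interaction between these attributes can be enforced before the request is made. The sketch below uses a hypothetical `run_payload` helper; the rule that template requires dryRun comes from the table above, while treating "sources or all" as mandatory is an assumption inferred from the payload examples on this page.

```python
def run_payload(sources=None, run_all=False, wait=None, dry_run=False, template=None):
    """Build the sdd/run body, rejecting combinations the table rules out."""
    if not sources and not run_all:
        # Assumption: the API needs either explicit sources or all=true.
        raise ValueError("pass data source names or set run_all=True")
    if template is not None and not dry_run:
        raise ValueError("template can only be passed when dry_run is True")
    payload = {}
    if sources:
        payload["sources"] = list(sources)
    if run_all:
        payload["all"] = True
    if wait is not None:
        payload["wait"] = wait
    if dry_run:
        payload["dryRun"] = True
    if template is not None:
        payload["template"] = template
    return payload


dry_run_body = run_payload(["Medical Claims"], dry_run=True, template="PII_REVISION")
all_body = run_payload(run_all=True)
```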

Response Parameters

Attribute Description
id string The unique identifier of the job.
state string The job state: created, retry, active, completed, expired, cancelled, or failed.
output array[string] Information about the tags applied on the data source, including diff (added and removed tags) and the current state of allTags on all columns in the data sources.

Request example: Run SDD on a single data source

This request runs SDD on the data source Insurance Data.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "sources": [
    "Insurance Data"
  ]
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}
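
The response is keyed by data source name, so a short helper can condense it into the job state and the per-column tag changes. This is a sketch based only on the shape of the example above; `summarize_run` is a hypothetical name, and real responses may contain fields not handled here.

```python
def summarize_run(response: dict) -> dict:
    """Map each data source to its job state and per-column tag changes."""
    summary = {}
    for source, job in response.items():
        diff = (job.get("output") or {}).get("diff", {})
        summary[source] = {
            "state": job.get("state"),
            "added": diff.get("addedTags", {}),
            "removed": diff.get("removedTags", {}),
        }
    return summary


# A trimmed-down version of the response shown above.
response = {
    "Insurance Data": {
        "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
        "state": "completed",
        "output": {
            "diff": {
                "addedTags": {"ssn": ["Discovered.PII"], "email": ["Discovered.PII"]},
                "removedTags": {"ssn": ["Discovered.Country.US"]},
            }
        },
    }
}
summary = summarize_run(response)
```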

Request example: Run SDD on all data sources

This request runs SDD on all your data sources.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "all": true
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  },
  "Finance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-695e896d59s",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}

Request example: Test run SDD on a single data source

This request runs SDD on the Medical Claims data source with the PII_REVISION framework, but will not tag any columns if matches are found.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "sources": [
    "Medical Claims"
  ],
  "dryRun": true,
  "template": "PII_REVISION"
}

Response example

{
  "Medical Claims": {
    "id": "86fc4f70-380f-11ec-a432-81748c911385",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {},
        "removedTags": {
          "dob": [
            "Discovered.Entity.Date",
            "Discovered.Entity.Date of Birth",
            "Discovered.Identifier Indirect",
            "Discovered.PHI",
            "Discovered.PII"
          ],
          "ssn": [
            "Discovered.Country.US",
            "Discovered.Entity.Social Security Number",
            "Discovered.Identifier Direct",
            "Discovered.PHI"
          ],
          "state": [
            "Discovered.Country.US",
            "Discovered.Entity.Location",
            "Discovered.Entity.State",
            "Discovered.Identifier Indirect"
          ],
          "gender": [
            "Discovered.Entity.Gender",
            "Discovered.Identifier Indirect",
            "Discovered.PHI",
            "Discovered.PII"
          ],
          "date_of_service": [
            "Discovered.Entity.Date",
            "Discovered.Identifier Indirect",
            "Discovered.PHI",
            "Discovered.PII"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.PII"
        ]
      }
    }
  }
}

Update patterns or identification frameworks

Method Path Purpose
PUT sdd/classifier/{classifierName} Update a pattern. Partial updates are not supported.
POST sdd/classifier/template/{templateName}/clone Clone an identification framework.
PUT sdd/template/{templateName} Update an identification framework.

Update a pattern

Endpoint

Method Path Purpose
PUT sdd/classifier/{classifierName} Update a pattern. Partial updates are not supported.

Path Parameters

Attribute Description Required
classifierName string The name of the pattern to update. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly pattern name. Yes
displayName string Unique, human-readable pattern name. Yes
description string The pattern description. Yes
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. Yes
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. Yes
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied. This attribute is not supported with native SDD. No
config.tags array[string] The default resulting tags to apply when the pattern is matched; they must begin with "Discovered." (including the period). No
config.regex string A case-insensitive regular expression to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. No
config.columnNameRegex string A case-insensitive regular expression to match against column names. The pattern must have regex, columnNameRegex, or values to match to data. No
config.values array[string] The list of words included in the dictionary to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. No
config.caseSensitive boolean Indicates whether or not values are case sensitive. Defaults to false. No

Response Parameters

Attribute Description
createdBy object Includes details about the user who created the pattern, such as their profile id, name, and email.
name string Unique, request-friendly pattern name.
displayName string Unique, human-readable pattern name.
description string The pattern description.
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn.
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags.
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied.
config.tags array[string] The default resulting tags to apply to the data source when the pattern is matched.
config.columnNameRegex string A case-insensitive regular expression to match against column names.
config.regex string A case-insensitive regular expression to match against column values.
config.values array[string] The list of words included in the dictionary to match against column values.
config.caseSensitive boolean Indicates whether or not values are case sensitive.
createdAt date When the pattern was created.
updatedAt date When the pattern was last updated.

Request example

The following request updates the name and description of the MY_REGEX_PATTERN pattern.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_PATTERN

Payload example

{
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expressions",
  "description": "A pattern example using regex.",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expressions",
  "description": "A pattern example using regex.",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-19T12:48:56.289Z"
}
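
Because partial updates are not supported, the PUT payload has to carry every required attribute, not only the ones that change. The following Python sketch shows one way to assemble such a payload offline. The function name build_pattern_update and the sample existing pattern are hypothetical, and the two checks simply mirror the notes in the Payload Parameters table; the API's own validation may differ.

```python
import json

TAG_PREFIX = "Discovered."
PAYLOAD_KEYS = ("name", "displayName", "description", "type")
MATCHER_KEYS = ("regex", "columnNameRegex", "values")


def build_pattern_update(existing, changes):
    """Return a complete PUT payload: the existing pattern with changes applied."""
    unknown = set(changes) - set(PAYLOAD_KEYS) - {"config"}
    if unknown:
        raise ValueError(f"not a payload attribute: {sorted(unknown)}")

    # Copy only the attributes listed under Payload Parameters; response-only
    # fields (id, createdBy, createdAt, updatedAt) are left out.
    payload = {key: existing[key] for key in PAYLOAD_KEYS}
    payload["config"] = dict(existing.get("config", {}))

    for key, value in changes.items():
        if key == "config":
            payload["config"].update(value)  # merge config attributes one by one
        else:
            payload[key] = value

    config = payload["config"]
    # The table says a pattern needs regex, columnNameRegex, or values.
    if not any(config.get(key) for key in MATCHER_KEYS):
        raise ValueError("config needs regex, columnNameRegex, or values")
    # The table says resulting tags must begin with "Discovered.".
    bad_tags = [tag for tag in config.get("tags", []) if not tag.startswith(TAG_PREFIX)]
    if bad_tags:
        raise ValueError(f"tags must begin with {TAG_PREFIX!r}: {bad_tags}")
    return payload


# Hypothetical current state of the pattern, shaped like a response body.
existing = {
    "id": 67,
    "name": "MY_REGEX_PATTERN",
    "displayName": "My Regex Pattern",
    "description": "Old description.",
    "type": "regex",
    "config": {"regex": "^[A-Z][a-z]+", "tags": ["Discovered.regex-example"]},
}
payload = build_pattern_update(existing, {"description": "A pattern example using regex."})
print(json.dumps(payload, indent=2))
```

The printed JSON could be saved as example-payload.json and sent with the curl command shown in the request example.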

Clone an identification framework

Endpoint

Method Path Purpose
POST sdd/template/{templateName}/clone Clone an identification framework.

Query Parameters

Attribute Description Required
templateName string The name of the identification framework to clone. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly framework name for the cloned framework. Yes
displayName string Unique, human-readable framework name for the cloned framework. Yes
description string The cloned framework description. No

Response Parameters

Attribute Description
id integer The unique ID of the framework.
createdBy object Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

This request clones the MY_FIRST_FRAMEWORK identification framework.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_FRAMEWORK/clone

Payload example

{
  "name": "CLONE_OF_FIRST_FRAMEWORK",
  "displayName": "Clone of My First Framework",
  "description": "This is a clone of my first framework."
}

Response example

{
  "name": "CLONE_OF_FIRST_FRAMEWORK",
  "displayName": "Clone of My First Framework",
  "description": "This is a clone of my first framework.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 4,
  "createdAt": "2021-10-19T16:21:17.660Z",
  "updatedAt": "2021-10-19T16:21:17.660Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_PATTERN",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}
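
For readers scripting the clone call outside of curl, here is a minimal Python sketch that builds the same POST request with the standard library. The helper name build_clone_request and the token value are hypothetical, and the host is the placeholder used in the examples above. The sketch only constructs the request; it does not send it.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://your-immuta-url.immuta.com"  # placeholder host from the examples


def build_clone_request(template_name, name, display_name, token, description=None):
    """Build (but do not send) the POST request that clones a framework."""
    payload = {"name": name, "displayName": display_name}
    if description is not None:
        payload["description"] = description  # optional per the Payload Parameters table

    # Percent-encode the framework name so it is safe to place in the URL path.
    url = f"{BASE_URL}/sdd/template/{quote(template_name, safe='')}/clone"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_clone_request(
    "MY_FIRST_FRAMEWORK",
    name="CLONE_OF_FIRST_FRAMEWORK",
    display_name="Clone of My First Framework",
    token="YOUR_API_TOKEN",  # hypothetical placeholder
    description="This is a clone of my first framework.",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; error handling and response parsing are left out of this sketch.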

Update an identification framework

Endpoint

Method Path Purpose
PUT sdd/template/{templateName} Update an identification framework.

Query Parameters

Attribute Description Required
templateName string The name of the identification framework to update. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly framework name. Yes
displayName string Unique, human-readable framework name. Yes
description string The framework description. Yes
classifiers array The patterns to include in the framework and any additional overrides for those patterns. Yes
classifiers.name string The name of the pattern to include in the framework. Yes
classifiers.overrides object The overrides to modify the pattern for this framework. No
classifiers.overrides.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied. This attribute is not supported with native SDD. No
classifiers.overrides.tags array The resulting tags to apply when the pattern is matched. These tags will override the pattern's default tags and must begin with "Discovered." (including the period). No
sampleSize integer The number of records to sample from the data source to discover pattern matches. This attribute is not supported with native SDD. No

Response Parameters

Attribute Description
id integer The unique ID of the framework.
createdBy object Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

The following request updates the name, description, and rules of the MY_FIRST_FRAMEWORK identification framework.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_FRAMEWORK

Payload example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This framework uses the column regex and regex patterns.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_PATTERN"
    },
    {
      "name": "REGULAR_EXPRESSION"
    }
  ],
  "sampleSize": 100
}

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This framework uses the column regex and regex patterns.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-20T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_PATTERN",
      "overrides": {}
    },
    {
      "name": "REGULAR_EXPRESSION",
      "overrides": {}
    }
  ]
}
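
The payload example above includes no overrides. As an illustration of the classifiers.overrides.tags attribute, the following Python sketch assembles a framework payload in which one rule overrides its pattern's default tags. The helper names are hypothetical, and the shape of overrides (a JSON object holding a tags list) is an assumption inferred from the attribute names and the empty "overrides": {} objects in the response examples. minConfidence and sampleSize are left out because the table marks them as unsupported with native SDD.

```python
import json

TAG_PREFIX = "Discovered."


def framework_rule(pattern_name, override_tags=None):
    """Build one entry of the classifiers array, optionally overriding tags."""
    rule = {"name": pattern_name}
    if override_tags is not None:
        # Override tags must begin with "Discovered." per the parameter table.
        bad_tags = [tag for tag in override_tags if not tag.startswith(TAG_PREFIX)]
        if bad_tags:
            raise ValueError(f"tags must begin with {TAG_PREFIX!r}: {bad_tags}")
        rule["overrides"] = {"tags": list(override_tags)}  # assumed object shape
    return rule


def build_framework_update(name, display_name, description, rules):
    """Assemble the required attributes of the framework PUT payload."""
    if not rules:
        raise ValueError("a framework needs at least one rule")
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": list(rules),
    }


framework_payload = build_framework_update(
    "HEALTH_DATA",
    "Health Data",
    "This framework uses the column regex and regex patterns.",
    [
        framework_rule("MY_COLUMN_NAME_REGEX_PATTERN"),
        framework_rule("REGULAR_EXPRESSION", override_tags=["Discovered.PHI"]),
    ],
)
print(json.dumps(framework_payload, indent=2))
```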

Delete patterns or identification frameworks

Method Path Purpose
DELETE /sdd/classifier/{classifierName} Delete a pattern.
DELETE /sdd/template/{templateName} Delete an identification framework.

Delete a pattern

Endpoint

Method Path Purpose
DELETE sdd/classifier/{classifierName} Delete a pattern.

Query Parameters

Attribute Description Required
classifierName string The name of the pattern to delete. Yes

Response Parameters

Attribute Description
createdBy object Includes details about the user who created the pattern, such as their profile id, name, and email.
name string Unique, request-friendly pattern name.
displayName string Unique, human-readable pattern name.
description string The pattern description.
type string The type of pattern: regex, dictionary, columnNameRegex, or builtIn.
config object The configuration of the pattern, which includes config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags.
config.minConfidence number The minimum percentage confidence that the pattern must match for the resulting tags to be applied.
config.tags array[string] The default resulting tags to apply to the data source when the pattern is matched.
config.columnNameRegex string A case-insensitive regular expression to match against column names.
config.regex string A case-insensitive regular expression to match against column values.
config.values array[string] The list of words included in the dictionary to match against column values.
config.caseSensitive boolean Indicates whether or not values are case sensitive.
createdAt date When the pattern was created.
updatedAt date When the pattern was last updated.

Request example

The following request deletes the REGULAR_EXPRESSION pattern.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/REGULAR_EXPRESSION

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expression",
  "description": "This pattern uses regular expression",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-19T15:54:28.695Z",
  "updatedAt": "2021-10-19T16:00:02.329Z"
}

Delete an identification framework

Endpoint

Method Path Purpose
DELETE sdd/template/{templateName} Delete an identification framework.

Query Parameters

Attribute Description Required
templateName string The name of the identification framework to delete. Yes

Response Parameters

Attribute Description
id integer The unique ID of the framework.
createdBy object Includes details about the user who created the framework, such as their profile id, name, and email.
name string Unique, request-friendly framework name.
displayName string Unique, human-readable framework name.
description string The framework description.
classifiers array The rules in the framework and any overrides for those rules.
sampleSize integer The number of records to sample from the data source to discover pattern matches.
createdAt date When the framework was created.
updatedAt date When the framework was last updated.

Request example

The following request deletes the HEALTH_DATA identification framework.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/HEALTH_DATA

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This is a framework for health data.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-19T16:07:39.356Z",
  "updatedAt": "2021-10-19T16:07:39.356Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}
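
The two delete endpoints differ only in their path segment (classifier for patterns, template for frameworks). The following Python sketch captures that in one hypothetical helper, build_delete_request, using the placeholder host from the examples and a placeholder token. It only builds the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://your-immuta-url.immuta.com"  # placeholder host from the examples

# Path segment used by each kind of resource in the endpoints above.
PATH_SEGMENTS = {"pattern": "classifier", "framework": "template"}


def build_delete_request(kind, name, token):
    """Build (but do not send) a DELETE request for a pattern or a framework."""
    if kind not in PATH_SEGMENTS:
        raise ValueError(f"kind must be one of {sorted(PATH_SEGMENTS)}")
    # Percent-encode the resource name so it is safe to place in the URL path.
    url = f"{BASE_URL}/sdd/{PATH_SEGMENTS[kind]}/{quote(name, safe='')}"
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {token}"})


delete_framework = build_delete_request("framework", "HEALTH_DATA", "YOUR_API_TOKEN")
delete_pattern = build_delete_request("pattern", "REGULAR_EXPRESSION", "YOUR_API_TOKEN")
print(delete_framework.get_method(), delete_framework.full_url)
print(delete_pattern.get_method(), delete_pattern.full_url)
```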