Manage Sensitive Data Discovery (SDD)

Sensitive data discovery (SDD) API reference guide

Previous documentation, and the endpoint paths and attribute names in this guide, refer to a rule or pattern as a classifier or identifier and to a framework as a template.

Workflow

Create a pattern

To run this pattern against your data, ensure it is added to a framework.

POST /sdd/classifier

Create a pattern.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| name | string Unique, request-friendly pattern name. | Yes |
| displayName | string Unique, human-readable pattern name. | Yes |
| description | string The pattern description. | Yes |
| type | string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. | Yes |
| config | object The configuration of the pattern, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. | Yes |
| config.tags | array[string] The default resulting tags to apply when the pattern is matched. Each tag must begin with the prefix Discovered. (for example, Discovered.regex-example). | No |
| config.regex | string A case-insensitive regular expression to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. | No |
| config.columnNameRegex | string A case-insensitive regular expression to match against column names. The pattern must have regex, columnNameRegex, or values to match to data. | No |
| config.values | array[string] The list of words included in the dictionary to match against column values. The pattern must have regex, columnNameRegex, or values to match to data. | No |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. Defaults to false. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| createdBy | object Includes details about the user who created the pattern, such as their profile id, name, and email. |
| name | string Unique, request-friendly pattern name. |
| displayName | string Unique, human-readable pattern name. |
| description | string The pattern description. |
| type | string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the pattern, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The default resulting tags to apply to the data source when the pattern is matched. |
| config.columnNameRegex | string A case-insensitive regular expression to match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| createdAt | date When the pattern was created. |
| updatedAt | date When the pattern was last updated. |

Request example

The following request creates a pattern using the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier

Payload example

{
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "tags": ["Discovered.regex-example"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+"
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-14T18:48:56.289Z"
}
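
The payload constraints described above (required attributes, at least one of config.regex, config.columnNameRegex, or config.values, and tags that begin with Discovered.) can be checked locally before the request is sent. The following is a minimal Python sketch under those stated constraints; validate_pattern_payload is a hypothetical helper, not part of the SDD API, and the server's response remains authoritative.

```python
import json

# Top-level attributes marked as required in the payload parameters above.
REQUIRED_FIELDS = ("name", "displayName", "description", "type", "config")
# A pattern needs at least one of these config keys to match data.
MATCHER_KEYS = ("regex", "columnNameRegex", "values")
TAG_PREFIX = "Discovered."


def validate_pattern_payload(payload):
    """Return a list of problems in a pattern payload (empty when none are found).

    Hypothetical client-side check; the server's response is authoritative.
    """
    problems = ["missing required field: " + field
                for field in REQUIRED_FIELDS if field not in payload]
    config = payload.get("config") or {}
    if not any(config.get(key) for key in MATCHER_KEYS):
        problems.append("config needs regex, columnNameRegex, or values")
    for tag in config.get("tags", []):
        if not tag.startswith(TAG_PREFIX):
            problems.append("tag must begin with " + TAG_PREFIX + ": " + tag)
    return problems


payload = {
    "name": "MY_REGEX_PATTERN",
    "displayName": "My Regex Pattern",
    "description": "A regex pattern example",
    "type": "regex",
    "config": {"regex": "^[A-Z][a-z]+", "tags": ["Discovered.regex-example"]},
}

problems = validate_pattern_payload(payload)
if not problems:
    # This JSON text is what would be saved as example-payload.json.
    print(json.dumps(payload, indent=2))
else:
    print("\n".join(problems))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a payload at once.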

Create an identification framework

POST /sdd/template

Create an identification framework.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| name | string Unique, request-friendly framework name. | Yes |
| displayName | string Unique, human-readable framework name. | Yes |
| description | string The framework description. | Yes |
| classifiers | array The patterns to include in the framework and any additional overrides for those patterns. | Yes |
| classifiers.name | string The name of the pattern to include in the framework. | Yes |
| classifiers.overrides | object The overrides to modify the pattern for this framework. | No |
| classifiers.overrides.tags | array[string] The resulting tags to apply when the pattern is matched. These tags override the pattern's default tags and must begin with the prefix Discovered. (for example, Discovered.Entity.Age). | No |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The rules in the framework and any overrides for those rules. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

The following request creates an identification framework using the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template

Payload example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_PATTERN"
    },
    {
      "name": "MY_REGEX_PATTERN"
    },
    {
      "name": "AGE",
      "overrides": {
        "tags": [
          "Discovered.Entity.Age"
        ]
      }
    }
  ]
}

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-14T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_PATTERN",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}
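
A framework payload can also be assembled programmatically, which makes it easy to enforce the Discovered. prefix on override tags. The sketch below is illustrative only: build_framework_payload is a hypothetical helper, and its input format (a mapping from pattern name to override tags, or None to keep the defaults) is an assumption made for this example.

```python
import json


def build_framework_payload(name, display_name, description, patterns):
    """Build a POST /sdd/template payload.

    `patterns` maps each pattern name to a list of override tags,
    or to None to keep the pattern's default tags (hypothetical input format).
    """
    classifiers = []
    for pattern_name, override_tags in patterns.items():
        entry = {"name": pattern_name}
        if override_tags is not None:
            bad = [tag for tag in override_tags if not tag.startswith("Discovered.")]
            if bad:
                raise ValueError("override tags must begin with Discovered.: %s" % bad)
            entry["overrides"] = {"tags": list(override_tags)}
        classifiers.append(entry)
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": classifiers,
    }


payload = build_framework_payload(
    "MY_FIRST_FRAMEWORK",
    "My First Framework",
    "This is the first framework I've created.",
    {
        "MY_COLUMN_NAME_PATTERN": None,
        "MY_REGEX_PATTERN": None,
        "AGE": ["Discovered.Entity.Age"],
    },
)
print(json.dumps(payload, indent=2))
```

The printed JSON has the same shape as the payload example above and could be saved as example-payload.json.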

Search for patterns or identification frameworks

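| Method | Path | Purpose |
| --- | --- | --- |
| GET | /sdd/classifier | List or search for patterns |
| GET | /sdd/template | List or search for identification frameworks |
| GET | /sdd/classifier/{classifierName} | View a pattern by name |
| GET | /sdd/template/{templateName} | View an identification framework by name |
| GET | /sdd/template/global | View the current global framework |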

List or search for patterns

GET /sdd/classifier

List or search patterns.

Query parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sortField | string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. | No |
| sortOrder | string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. | No |
| offSet | integer Use in combination with limit to fetch pages. | No |
| limit | integer Limits the number of results displayed per page. | No |
| type | array[string] Searches based on pattern type: regex, dictionary, builtIn, or columnNameRegex. | No |
| searchText | string A partial, case-insensitive search on name. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| count | integer The number of patterns found matching the search criteria. |
| createdBy | object Includes details about the user who created the pattern, such as their profile id, name, and email. |
| name | string Unique, request-friendly pattern name. |
| displayName | string Unique, human-readable pattern name. |
| description | string The pattern description. |
| type | string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the pattern, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The default resulting tags to apply when the pattern is matched. |
| config.columnNameRegex | string A case-insensitive regular expression to optionally match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| createdAt | date When the pattern was created. |
| updatedAt | date When the pattern was last updated. |

Request example

The following request lists the first 5 patterns, sorted by name in ascending order.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    "https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5"

Response example

{
  "count": 67,
  "hits": [
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AGE",
      "displayName": "Age",
      "description": "Matches numeric strings between 10 and 199.",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Indirect",
          "Discovered.PHI",
          "Discovered.Entity.Age"
        ],
        "conditionalTags": {}
      },
      "id": 3,
      "createdAt": "2021-10-28T07:34:58.761Z",
      "updatedAt": "2021-10-28T07:34:58.761Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "ARGENTINA_DNI_NUMBER",
      "displayName": "Argentina DNI Number",
      "description": "Matches strings consistent with Argentina National Identity (DNI) Number.  Requires an eight digit number with optional periods between the second and third and fifth and sixth digit.",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Argentina",
          "Discovered.PHI",
          "Discovered.Entity.DNI Number"
        ],
        "conditionalTags": {}
      },
      "id": 4,
      "createdAt": "2021-10-28T07:34:58.769Z",
      "updatedAt": "2021-10-28T07:34:58.769Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_MEDICARE_NUMBER",
      "displayName": "Australia Medicare Number",
      "description": "Matches numeric strings consistent with Australian Medicare Number.  Requires a ten or eleven digit number.  The starting digit must be between 2 and 6, inclusive.  Optional spaces can be placed between the fourth and fifth and ninth and tenth digit.  Optional 11th separated by a `/` can be present.  A checksum is required.",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Medicare Number"
        ],
        "conditionalTags": {}
      },
      "id": 5,
      "createdAt": "2021-10-28T07:34:58.779Z",
      "updatedAt": "2021-10-28T07:34:58.779Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_PASSPORT",
      "displayName": "Australia Passport",
      "description": "Matches strings consistent with Australian Passport number.  A 8 or 9 character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Passport"
        ],
        "conditionalTags": {}
      },
      "id": 26,
      "createdAt": "2021-10-28T07:34:59.010Z",
      "updatedAt": "2021-10-28T07:34:59.010Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_TAX_FILE_NUMBER",
      "displayName": "Australia Tax File Number",
      "description": "Matches strings consistent with Australia Tax File Number.  Requires a nine digit number with optional spaces between the third and fourth and sixth and seventh digits.  A checksum is also required",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Tax File Number"
        ],
        "conditionalTags": {}
      },
      "id": 6,
      "createdAt": "2021-10-28T07:34:58.789Z",
      "updatedAt": "2021-10-28T07:34:58.789Z"
    }
  ]
}
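
The query parameters above can be combined into a request URL and paged with limit and offSet. The following Python sketch only builds URLs and makes no network calls. pattern_search_url and page_offsets are hypothetical helpers, and two behaviors are assumptions rather than documented facts: that offSet is a zero-based number of results to skip, and that array parameters such as type are sent as repeated keys.

```python
from urllib.parse import urlencode

BASE_URL = "https://your-immuta-url.immuta.com"  # placeholder host used in the examples


def pattern_search_url(base_url, **params):
    """Build a GET /sdd/classifier URL, skipping parameters that are None."""
    query = {key: value for key, value in params.items() if value is not None}
    # doseq=True repeats the key for list values (type=regex&type=dictionary).
    # Assumption: the API accepts repeated keys for array parameters.
    encoded = urlencode(query, doseq=True)
    return base_url + "/sdd/classifier" + ("?" + encoded if encoded else "")


def page_offsets(count, limit):
    """Offsets needed to page through `count` results, `limit` at a time.

    Assumption: offSet is a zero-based number of results to skip.
    """
    return range(0, count, limit)


first_page = pattern_search_url(BASE_URL, sortField="name", sortOrder="asc", limit=5)
print(first_page)

# The response example reports a count of 67, so 14 pages of 5 cover every pattern.
for offset in page_offsets(67, 5):
    print(pattern_search_url(BASE_URL, sortField="name", limit=5, offSet=offset))
```

Because the resulting URLs contain &, they need to be quoted when passed to curl from a shell.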

List or search for identification frameworks

GET /sdd/template

List or search identification frameworks.

Query parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sortField | string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. | No |
| sortOrder | string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. | No |
| offSet | integer Use in combination with limit to fetch pages. | No |
| limit | integer Limits the number of results displayed per page. | No |
| classifiers | array[string] Filters framework results to those containing the specified patterns. | No |
| searchText | string A partial, case-insensitive search on the framework name. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| count | integer The number of identification frameworks found matching the search criteria. |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The rules in the framework and any overrides for those rules. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

The following request lists all identification frameworks.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template

Response example

{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_FRAMEWORK",
      "displayName": "My First Framework",
      "description": "This is the first framework I've created.",
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "id": 1,
      "createdAt": "2021-10-14T19:12:22.092Z",
      "updatedAt": "2021-10-14T19:12:22.092Z",
      "classifiers": [
        {
          "name": "MY_COLUMN_NAME_PATTERN",
          "overrides": {}
        },
        {
          "name": "MY_REGEX_PATTERN",
          "overrides": {}
        }
      ]
    }
  ]
}
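
The classifiers query parameter filters frameworks on the server. When a full list response is already in hand, the same question can be answered client-side by walking the hits array. The sketch below parses a trimmed copy of the response example above; frameworks_containing is a hypothetical helper.

```python
import json

# Trimmed copy of the response example above.
response_text = """
{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_FRAMEWORK",
      "id": 1,
      "classifiers": [
        {"name": "MY_COLUMN_NAME_PATTERN", "overrides": {}},
        {"name": "MY_REGEX_PATTERN", "overrides": {}}
      ]
    }
  ]
}
"""


def frameworks_containing(response, pattern_name):
    """Return the names of frameworks in `response` that include `pattern_name`."""
    return [
        hit["name"]
        for hit in response.get("hits", [])
        if any(entry["name"] == pattern_name for entry in hit.get("classifiers", []))
    ]


response = json.loads(response_text)
print(frameworks_containing(response, "MY_REGEX_PATTERN"))  # ['MY_FIRST_FRAMEWORK']
print(frameworks_containing(response, "AGE"))               # []
```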

View a pattern by name

GET /sdd/classifier/{classifierName}

Get a pattern by name.

Path parameters

| Attribute | Description | Required |
| --- | --- | --- |
| classifierName | string The name of the pattern. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the pattern. |
| createdBy | object Includes details about the user who created the pattern, such as their profile id, name, and email. |
| name | string Unique, request-friendly pattern name. |
| displayName | string Unique, human-readable pattern name. |
| description | string The pattern description. |
| type | string The type of pattern: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the pattern, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The names of the resulting tags to apply to the data source. |
| config.columnNameRegex | string A case-insensitive regular expression to optionally match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| createdAt | date When the pattern was created. |
| updatedAt | date When the pattern was last updated. |

Request example

This request gets the pattern named MY_REGEX_PATTERN.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_PATTERN

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_PATTERN",
  "displayName": "My Regex Pattern",
  "description": "A regex pattern example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+"
  },
  "id": 67,
  "createdAt": "2021-10-18T16:48:18.819Z",
  "updatedAt": "2021-10-18T16:48:18.819Z"
}
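
Because the pattern name is part of the URL path, a name built from user input should be percent-encoded before it is placed in the request. The sketch below shows one way to do that with the Python standard library; named_resource_url is a hypothetical helper, and whether the API permits names containing spaces or slashes is not stated in this guide.

```python
from urllib.parse import quote

BASE_URL = "https://your-immuta-url.immuta.com"  # placeholder host used in the examples


def named_resource_url(base_url, resource, name):
    """Build /sdd/{resource}/{name}, percent-encoding the name as one path segment."""
    if resource not in ("classifier", "template"):
        raise ValueError("resource must be 'classifier' or 'template'")
    # safe="" also encodes "/", so the name cannot spill into another path segment.
    return base_url + "/sdd/" + resource + "/" + quote(name, safe="")


print(named_resource_url(BASE_URL, "classifier", "MY_REGEX_PATTERN"))
print(named_resource_url(BASE_URL, "template", "MY_FIRST_FRAMEWORK"))
```

The same helper covers the framework lookup at /sdd/template/{templateName}.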

View an identification framework by name

GET /sdd/template/{templateName}

Get an identification framework by name.

Path parameters

| Attribute | Description | Required |
| --- | --- | --- |
| templateName | string The name of the identification framework. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The rules in the framework and any overrides for those rules. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

This request gets the identification framework named MY_FIRST_FRAMEWORK.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_FRAMEWORK

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-18T16:54:24.920Z",
  "updatedAt": "2021-10-18T16:54:24.920Z",
  "classifiers": [
    {
      "name": "MY_DICTIONARY_PATTERN",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_PATTERN",
      "overrides": {}
    }
  ]
}

View the current global framework

GET /sdd/template/global

View the current global framework.

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The rules in the framework and any overrides for those rules. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

This request gets the current global framework information.

curl -X 'GET' \
  'https://demo.immuta.com/sdd/template/global' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer 9ba76f3c64c345ad817fa467d7110556'

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 2,
    "name": "Jane Doe",
    "email": "jane.doe@immuta.com"
  },
  "id": 1,
  "createdAt": "2022-08-10T20:35:43.252Z",
  "updatedAt": "2022-08-10T20:35:43.252Z",
  "classifiers": [
    {
      "name": "AGE",
      "overrides": {}
    },
    {
      "name": "ETHNIC_GROUP",
      "overrides": {}
    }
  ]
}
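
Each entry in classifiers carries an overrides object, and override tags replace a pattern's default tags. The following sketch shows how a client might work out which tags a pattern would apply within a framework. effective_tags is a hypothetical helper, and it assumes that an empty overrides object (as in the response above) means the pattern's default tags apply.

```python
def effective_tags(pattern, framework_entry):
    """Tags a pattern applies within a framework.

    Assumption: an empty or missing overrides.tags means the pattern's defaults apply.
    """
    override_tags = (framework_entry.get("overrides") or {}).get("tags")
    if override_tags:
        return list(override_tags)
    return list(pattern.get("config", {}).get("tags", []))


# Default tags of the built-in AGE pattern, copied from the pattern list response example.
age_pattern = {
    "name": "AGE",
    "config": {
        "tags": [
            "Discovered.PII",
            "Discovered.Identifier Indirect",
            "Discovered.PHI",
            "Discovered.Entity.Age",
        ]
    },
}

# Entry from the global framework response above: no overrides, so defaults apply.
print(effective_tags(age_pattern, {"name": "AGE", "overrides": {}}))

# Entry from the framework payload example: the override replaces the defaults.
print(effective_tags(age_pattern, {"name": "AGE", "overrides": {"tags": ["Discovered.Entity.Age"]}}))
```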

Apply identification frameworks to data sources

PUT /sdd/template/apply

Apply an identification framework to a set of data sources.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| template | string The name of the identification framework to apply to the data sources. Pass null to clear the current framework; the data sources will then use the global framework. | Yes |
| sources | array[string] The names of the data sources to apply the framework to. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| success | boolean When true, the request was successful. |

Request example

This request applies the MY_FIRST_FRAMEWORK framework to the Public Case data source.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/apply

Payload example

{
  "template": "MY_FIRST_FRAMEWORK",
  "sources": [
    "Public Case"
  ]
}

Response example

{
  "success": true
}
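
Clearing a framework uses the same payload shape with template set to null. In Python, None serializes to JSON null, so one helper can cover both cases. This is a minimal sketch; build_apply_payload is a hypothetical helper, and accepting a single string for sources is a convenience added here, not API behavior.

```python
import json


def build_apply_payload(framework_name, sources):
    """Build a PUT /sdd/template/apply payload.

    Pass framework_name=None to clear the applied framework; json.dumps
    serializes None as null.
    """
    if isinstance(sources, str):
        sources = [sources]  # accept a single data source name for convenience
    if not sources:
        raise ValueError("at least one data source name is required")
    return {"template": framework_name, "sources": list(sources)}


# Apply MY_FIRST_FRAMEWORK to the Public Case data source.
print(json.dumps(build_apply_payload("MY_FIRST_FRAMEWORK", "Public Case")))

# Clear the applied framework so Public Case falls back to the global framework.
print(json.dumps(build_apply_payload(None, ["Public Case"])))
```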

Run SDD on data sources

POST /sdd/run

Run SDD on specified data sources.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sources | array[string] The names of the data sources to run SDD on. | Yes |
| all | boolean If true, SDD will run on all Immuta data sources. | No |
| wait | integer The number of seconds to wait for the SDD jobs to finish. The value -1 will wait until the jobs complete. Default is -1. | No |
| dryRun | boolean When true, SDD will not update the tags on the data source(s). Instead of applying tags, SDD returns the tags that would be applied to the data source. This allows users to evaluate whether or not rules or frameworks are applying tags correctly without updating the data source. Default is false. | No |
| template | string If passed, Immuta will run SDD with this framework instead of the applied framework on the data source(s). Passing template when dryRun is false will cause an error. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| id | string The unique identifier of the job. |
| state | string The job state: created, retry, active, completed, expired, cancelled, or failed. |
| output | array[string] Information about the tags applied on the data source, including diff (added and removed tags) and the current state of allTags on all columns in the data sources. |
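
The payload rules above include one combination that is documented to fail: passing template while dryRun is false. A client can reject that combination before sending the request. The sketch below is a hypothetical helper, not part of the SDD API. It assumes that a request needs either explicit sources or all set to true, and that an empty sources list is acceptable when all is true; neither assumption is confirmed by this guide.

```python
import json


def build_run_payload(sources=None, run_all=False, wait=-1, dry_run=False, template=None):
    """Build a POST /sdd/run payload, rejecting combinations the guide says will fail."""
    if template is not None and not dry_run:
        # Documented: passing template when dryRun is false causes an error.
        raise ValueError("template requires dryRun to be true")
    if not sources and not run_all:
        # Assumption: either explicit sources or all=true must be given.
        raise ValueError("pass sources or set run_all=True")
    payload = {"sources": list(sources or []), "wait": wait, "dryRun": dry_run}
    if run_all:
        payload["all"] = True
    if template is not None:
        payload["template"] = template
    return payload


# Dry run of MY_FIRST_FRAMEWORK against Public Case: tags are reported, not applied.
print(json.dumps(build_run_payload(["Public Case"], dry_run=True, template="MY_FIRST_FRAMEWORK")))
```

The defaults mirror the documented ones: wait is -1 (wait until the jobs complete) and dryRun is false.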

Request example: Run SDD on a single data source

This request runs SDD on the data source Public Case.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header &q