Manage Sensitive Data Discovery (SDD)

Sensitive data discovery (SDD) API reference guide

Workflow

Create an identifier

To run this identifier against your data, ensure it is added to a framework.

POST /sdd/classifier

Create an identifier.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| name | string Unique, request-friendly identifier name. Must be uppercase letters or numbers. | Yes |
| displayName | string Unique, human-readable identifier name. | Yes |
| description | string The identifier description. | Yes |
| type | string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. | Yes |
| config | object The configuration of the identifier, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. The config object must include one of the following: config.regex, config.columnNameRegex, or config.values. | Yes |
| config.tags | array[string] The default resulting tags to apply when the identifier is matched. Each tag must begin with "Discovered." (for example, Discovered.regex-example). | No |
| config.regex | string A case-insensitive regular expression to match against column values. | No |
| config.columnNameRegex | string A case-insensitive regular expression to match against column names. | No |
| config.values | array[string] The list of words included in the dictionary to match against column values. | No |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. Defaults to false. | No |
| config.minConfidence | number Apply tags when the identifier match is at least this confidence, expressed as a value between 0 and 1. Not supported for native SDD. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| createdBy | object Includes details about the user who created the identifier, such as their profile id, name, and email. |
| name | string Unique, request-friendly identifier name. |
| displayName | string Unique, human-readable identifier name. |
| description | string The identifier description. |
| type | string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the identifier, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The default resulting tags to apply to the data source when the identifier is matched. |
| config.columnNameRegex | string A case-insensitive regular expression to match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| config.minConfidence | number Apply tags when the identifier match is at least this confidence, expressed as a value between 0 and 1. Not supported for native SDD. |
| createdAt | date When the identifier was created. |
| updatedAt | date When the identifier was last updated. |

Request example

The following request creates an identifier from the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier

Payload example

{
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "A regex identifier example",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "tags": ["Discovered.regex-example"],
    "minConfidence": 0.9
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "A regex identifier example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.9
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-14T18:48:56.289Z"
}
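
The payload rules listed for this endpoint can be checked on the client before the request is sent. The following is a minimal Python sketch, not an official client. The function name build_identifier_payload is hypothetical, and its checks are assumptions drawn from the parameter descriptions: underscores in name are accepted only because the examples use them, and the server remains the authority on what is valid.

```python
import json
import re

# Assumption: the documented rule is "uppercase letters or numbers"; underscores
# are allowed here only because the example names contain them.
NAME_PATTERN = re.compile(r"^[A-Z0-9_]+$")
MATCHER_KEYS = ("regex", "columnNameRegex", "values")


def build_identifier_payload(name, display_name, description, id_type, config):
    """Return a dict for POST /sdd/classifier, raising ValueError on obvious mistakes."""
    if not NAME_PATTERN.match(name):
        raise ValueError("name must use uppercase letters, numbers, or underscores")
    if not any(key in config for key in MATCHER_KEYS):
        raise ValueError("config needs one of: " + ", ".join(MATCHER_KEYS))
    for tag in config.get("tags", []):
        if not tag.startswith("Discovered."):
            raise ValueError("tag must begin with 'Discovered.': " + tag)
    confidence = config.get("minConfidence")
    if confidence is not None and not 0 <= confidence <= 1:
        raise ValueError("minConfidence must be between 0 and 1")
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "type": id_type,
        "config": config,
    }


payload = build_identifier_payload(
    "MY_REGEX_IDENTIFIER",
    "My Regex Identifier",
    "A regex identifier example",
    "regex",
    {"regex": "^[A-Z][a-z]+", "tags": ["Discovered.regex-example"], "minConfidence": 0.9},
)
# Serialized body, suitable for saving as example-payload.json.
body = json.dumps(payload, indent=2)
print(body)
```

Catching these mistakes locally avoids a round trip to the server for errors such as a tag missing the Discovered. prefix.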

Create an identification framework

POST /sdd/template

Create an identification framework.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| name | string Unique, request-friendly framework name. Must be uppercase letters or numbers. | Yes |
| displayName | string Unique, human-readable framework name. | Yes |
| description | string The framework description. | Yes |
| classifiers | array The identifiers to include in the framework and any additional overrides for those identifiers. | Yes |
| classifiers.name | string The name of the identifier to include in the framework. | Yes |
| classifiers.overrides | object The overrides to modify the identifier for this framework. | No |
| classifiers.overrides.tags | array The resulting tags to apply when the identifier is matched. These tags override the identifier's default tags, and each must begin with "Discovered." (for example, Discovered.Entity.Age). | No |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The identifiers in the framework and any overrides for those identifiers. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

The following request creates an identification framework from the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template

Payload example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_IDENTIFIER"
    },
    {
      "name": "MY_REGEX_IDENTIFIER"
    },
    {
      "name": "AGE",
      "overrides": {
        "tags": [
          "Discovered.Entity.Age"
        ]
      }
    }
  ]
}

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-14T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}
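
A framework payload is a list of identifier names, each with optional tag overrides, so it can be assembled from a simple mapping. The sketch below is illustrative only: build_framework_payload is a hypothetical helper, and the prefix check mirrors the rule stated for classifiers.overrides.tags rather than any confirmed server-side validation.

```python
import json


def build_framework_payload(name, display_name, description, identifiers):
    """Return a dict for POST /sdd/template.

    identifiers maps each identifier name to a list of override tags,
    or to None when the identifier's default tags should be kept.
    """
    classifiers = []
    for identifier_name, override_tags in identifiers.items():
        entry = {"name": identifier_name}
        if override_tags is not None:
            for tag in override_tags:
                if not tag.startswith("Discovered."):
                    raise ValueError("override tag must begin with 'Discovered.': " + tag)
            entry["overrides"] = {"tags": list(override_tags)}
        classifiers.append(entry)
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": classifiers,
    }


framework = build_framework_payload(
    "MY_FIRST_FRAMEWORK",
    "My First Framework",
    "This is the first framework I've created.",
    {
        "MY_COLUMN_NAME_IDENTIFIER": None,
        "MY_REGEX_IDENTIFIER": None,
        "AGE": ["Discovered.Entity.Age"],  # override the built-in identifier's tags
    },
)
print(json.dumps(framework, indent=2))
```

With these inputs the helper reproduces the payload example for this endpoint, including the tag override on AGE.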

Search for identifiers or identification frameworks

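| Method | Path | Purpose |
| --- | --- | --- |
| GET | /sdd/classifier | List or search for identifiers |
| GET | /sdd/template | List or search for identification frameworks |
| GET | /sdd/classifier/{classifierName} | View an identifier by name |
| GET | /sdd/template/{templateName} | View an identification framework by name |
| GET | /sdd/template/global | View the current global framework |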

List or search for identifiers

GET /sdd/classifier

List or search identifiers.

Query parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sortField | string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. | No |
| sortOrder | string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. | No |
| offSet | integer Use in combination with limit to fetch pages. | No |
| limit | integer Limits the number of results displayed per page. | No |
| type | array[string] Searches based on identifier type: regex, dictionary, builtIn, or columnNameRegex. | No |
| searchText | string A partial, case-insensitive search on name. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| count | integer The number of identifiers found matching the search criteria. |
| createdBy | object Includes details about the user who created the identifier, such as their profile id, name, and email. |
| name | string Unique, request-friendly identifier name. |
| displayName | string Unique, human-readable identifier name. |
| description | string The identifier description. |
| type | string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the identifier, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The default resulting tags to apply when the identifier is matched. |
| config.columnNameRegex | string A case-insensitive regular expression to optionally match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| createdAt | date When the identifier was created. |
| updatedAt | date When the identifier was last updated. |

Request example

The following request lists 5 identifiers.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    "https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5"

Response example

{
  "count": 67,
  "hits": [
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AGE",
      "displayName": "Age",
      "description": "Matches numeric strings between 10 and 199.",
      "type": "builtIn",
      "config": {
        "tags": ["Discovered.Entity.Age"],
        "conditionalTags": {}
      },
      "id": 3,
      "createdAt": "2021-10-28T07:34:58.761Z",
      "updatedAt": "2021-10-28T07:34:58.761Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "ARGENTINA_DNI_NUMBER",
      "displayName": "Argentina DNI Number",
      "description": "Matches strings consistent with Argentina National Identity (DNI) Number.  Requires an eight digit number with optional periods between the second and third and fifth and sixth digit.",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.Country.Argentina",
          "Discovered.Entity.DNI Number"
        ],
        "conditionalTags": {}
      },
      "id": 4,
      "createdAt": "2021-10-28T07:34:58.769Z",
      "updatedAt": "2021-10-28T07:34:58.769Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_MEDICARE_NUMBER",
      "displayName": "Australia Medicare Number",
      "description": "Matches numeric strings consistent with Australian Medicare Number.  Requires a ten or eleven digit number.  The starting digit must be between 2 and 6, inclusive.  Optional spaces can be placed between the fourth and fifth and ninth and tenth digit.  Optional 11th separated by a `/` can be present.  A checksum is required.",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.Country.Australia",
          "Discovered.Entity.Medicare Number"
        ],
        "conditionalTags": {}
      },
      "id": 5,
      "createdAt": "2021-10-28T07:34:58.779Z",
      "updatedAt": "2021-10-28T07:34:58.779Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_PASSPORT",
      "displayName": "Australia Passport",
      "description": "Matches strings consistent with Australian Passport number.  A 8 or 9 character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.Country.Australia",
          "Discovered.Entity.Passport"
        ],
        "conditionalTags": {}
      },
      "id": 26,
      "createdAt": "2021-10-28T07:34:59.010Z",
      "updatedAt": "2021-10-28T07:34:59.010Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_TAX_FILE_NUMBER",
      "displayName": "Australia Tax File Number",
      "description": "Matches strings consistent with Australia Tax File Number.  Requires a nine digit number with optional spaces between the third and fourth and sixth and seventh digits.  A checksum is also required",
      "type": "builtIn",
      "config": {
        "tags": [
          "Discovered.Country.Australia",
          "Discovered.Entity.Tax File Number"
        ],
        "conditionalTags": {}
      },
      "id": 6,
      "createdAt": "2021-10-28T07:34:58.789Z",
      "updatedAt": "2021-10-28T07:34:58.789Z"
    }
  ]
}
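
Because the query string contains & characters, it is easier to build the URL programmatically than to quote it by hand. The following Python sketch shows one way to do so and to compute page offsets. The helper names are hypothetical, the offSet parameter is spelled as in the query parameter table, and the sketch assumes that offSet counts results rather than pages, which the table does not state explicitly.

```python
from urllib.parse import urlencode

# Placeholder host taken from the curl examples; replace with your own.
BASE_URL = "https://your-immuta-url.immuta.com"


def classifier_search_url(sort_field="name", sort_order="asc", limit=5,
                          offset=None, search_text=None):
    """Build the GET /sdd/classifier URL with encoded query parameters."""
    params = {"sortField": sort_field, "sortOrder": sort_order, "limit": limit}
    if offset is not None:
        # Parameter name spelled as in the query parameter table.
        params["offSet"] = offset
    if search_text is not None:
        params["searchText"] = search_text
    return BASE_URL + "/sdd/classifier?" + urlencode(params)


def page_offsets(count, limit):
    """Assumption: offSet counts results, so pages start at 0, limit, 2*limit, ..."""
    return list(range(0, count, limit))


first_page = classifier_search_url()
offsets = page_offsets(67, 5)  # 67 is the count from the response example
print(first_page)
print(len(offsets), "pages")
```

The first URL matches the curl request for this endpoint; later pages would pass each value from offsets in turn.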

List or search for identification frameworks

GET /sdd/template

List or search identification frameworks.

Query parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sortField | string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. | No |
| sortOrder | string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. | No |
| offSet | integer Use in combination with limit to fetch pages. | No |
| limit | integer Limits the number of results displayed per page. | No |
| classifiers | array[string] Filters framework results to those containing the specified identifiers. | No |
| searchText | string A partial, case-insensitive search on the framework name. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| count | integer The number of identification frameworks found matching the search criteria. |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The identifiers in the framework and any overrides for those identifiers. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

The following request lists all identification frameworks.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template

Response example

{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_FRAMEWORK",
      "displayName": "My First Framework",
      "description": "This is the first framework I've created.",
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "id": 1,
      "createdAt": "2021-10-14T19:12:22.092Z",
      "updatedAt": "2021-10-14T19:12:22.092Z",
      "classifiers": [
        {
          "name": "MY_COLUMN_NAME_IDENTIFIER",
          "overrides": {}
        },
        {
          "name": "MY_REGEX_IDENTIFIER",
          "overrides": {}
        }
      ]
    }
  ]
}

View an identifier by name

GET /sdd/classifier/{classifierName}

Get an identifier by name.

Path parameters

| Attribute | Description | Required |
| --- | --- | --- |
| classifierName | string The name of the identifier. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the identifier. |
| createdBy | object Includes details about the user who created the identifier, such as their profile id, name, and email. |
| name | string Unique, request-friendly identifier name. |
| displayName | string Unique, human-readable identifier name. |
| description | string The identifier description. |
| type | string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. |
| config | object The configuration of the identifier, which includes config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. |
| config.tags | array[string] The names of the resulting tags to apply to the data source. |
| config.columnNameRegex | string A case-insensitive regular expression to optionally match against column names. |
| config.regex | string A case-insensitive regular expression to match against column values. |
| config.values | array[string] The list of words included in the dictionary to match against column values. |
| config.caseSensitive | boolean Indicates whether or not values are case sensitive. |
| createdAt | date When the identifier was created. |
| updatedAt | date When the identifier was last updated. |

Request example

This request gets the identifier named MY_REGEX_IDENTIFIER.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_IDENTIFIER

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "A regex identifier example",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+"
  },
  "id": 67,
  "createdAt": "2021-10-18T16:48:18.819Z",
  "updatedAt": "2021-10-18T16:48:18.819Z"
}

View an identification framework by name

GET /sdd/template/{templateName}

Get an identification framework by name.

Path parameters

| Attribute | Description | Required |
| --- | --- | --- |
| templateName | string The name of the identification framework. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The identifiers in the framework and any overrides for those identifiers. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

This request gets the identification framework named MY_FIRST_FRAMEWORK.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_FRAMEWORK

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-18T16:54:24.920Z",
  "updatedAt": "2021-10-18T16:54:24.920Z",
  "classifiers": [
    {
      "name": "MY_DICTIONARY_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}
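
The two by-name endpoints above differ only in the path segment, so one helper can build either request. The sketch below uses only the Python standard library and constructs the request without sending it. The function name get_by_name_request and the token value are placeholders, and percent-encoding the name is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote
from urllib.request import Request


def get_by_name_request(base_url, kind, name, token):
    """Build (but do not send) a GET request for /sdd/classifier/{name} or /sdd/template/{name}."""
    if kind not in ("classifier", "template"):
        raise ValueError("kind must be 'classifier' or 'template'")
    # Percent-encode the name so unexpected characters cannot alter the path.
    url = f"{base_url}/sdd/{kind}/{quote(name, safe='')}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


identifier_request = get_by_name_request(
    "https://your-immuta-url.immuta.com", "classifier", "MY_REGEX_IDENTIFIER", "YOUR_TOKEN"
)
framework_request = get_by_name_request(
    "https://your-immuta-url.immuta.com", "template", "MY_FIRST_FRAMEWORK", "YOUR_TOKEN"
)
print(identifier_request.full_url)
print(framework_request.full_url)
```

Passing either object to urllib.request.urlopen would send the request; that step is omitted here so the sketch runs without network access.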

View the current global framework

GET /sdd/template/global

View the current global framework.

Response parameters

| Attribute | Description |
| --- | --- |
| id | integer The unique ID of the framework. |
| name | string Unique, request-friendly framework name. |
| displayName | string Unique, human-readable framework name. |
| description | string The framework description. |
| classifiers | array The identifiers in the framework and any overrides for those identifiers. |
| createdBy | object Includes details about the user who created the framework, such as their profile id, name, and email. |
| createdAt | date When the framework was created. |
| updatedAt | date When the framework was last updated. |

Request example

This request gets the current global framework information.

curl -X 'GET' \
  'https://demo.immuta.com/sdd/template/global' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer 9ba76f3c64c345ad817fa467d7110556'

Response example

{
  "name": "MY_FIRST_FRAMEWORK",
  "displayName": "My First Framework",
  "description": "This is the first framework I've created.",
  "createdBy": {
    "id": 2,
    "name": "Jane Doe",
    "email": "jane.doe@immuta.com"
  },
  "id": 1,
  "createdAt": "2022-08-10T20:35:43.252Z",
  "updatedAt": "2022-08-10T20:35:43.252Z",
  "classifiers": [
    {
      "name": "AGE",
      "overrides": {}
    },
    {
      "name": "ETHNIC_GROUP",
      "overrides": {}
    }
  ]
}

Apply identification frameworks to data sources

PUT /sdd/template/apply

Apply an identification framework to a set of data sources.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| template | string The name of the identification framework to apply to the data sources. Pass null to clear the current framework; the data sources will then use the global framework. | Yes |
| sources | array[string] The names of the data sources to apply the framework to. | Yes |

Response parameters

| Attribute | Description |
| --- | --- |
| success | boolean When true, the request was successful. |

Request example

This request applies the MY_FIRST_FRAMEWORK framework to the Public Case data source.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/apply

Payload example

{
  "template": "MY_FIRST_FRAMEWORK",
  "sources": [
    "Public Case"
  ]
}

Response example

{
  "success": true
}
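
The template attribute accepts either a framework name or null, and sources is a list even when only one data source is targeted. A minimal Python sketch of building this payload follows; build_apply_payload is a hypothetical helper, and wrapping a single string into a list is a convenience of the sketch, not API behavior.

```python
import json


def build_apply_payload(sources, template=None):
    """Return a dict for PUT /sdd/template/apply.

    template=None serializes to JSON null, which clears the applied
    framework so the data sources fall back to the global framework.
    """
    if isinstance(sources, str):
        sources = [sources]  # convenience: accept one data source name
    if not sources:
        raise ValueError("at least one data source name is required")
    return {"template": template, "sources": list(sources)}


apply_body = json.dumps(build_apply_payload("Public Case", "MY_FIRST_FRAMEWORK"))
clear_body = json.dumps(build_apply_payload("Public Case"))
print(apply_body)
print(clear_body)
```

The first body applies MY_FIRST_FRAMEWORK to Public Case; the second sends null for template to revert the same data source to the global framework.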

Run SDD on data sources

POST /sdd/run

Run SDD on specified data sources.

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| sources | array[string] The names of the data sources to run SDD on. | Yes |
| all | boolean If true, SDD will run on all Immuta data sources. | No |
| wait | integer The number of seconds to wait for the SDD jobs to finish. The value -1 will wait until the jobs complete. Default is -1. | No |
| dryRun | boolean When true, SDD will not update the tags on the data source(s). Instead of applying tags, SDD returns the tags that would be applied to the data source. This allows users to evaluate whether or not identifiers or frameworks are applying tags correctly without updating the data source. Default is false. | No |
| template | string If passed, Immuta will run SDD with this framework instead of the applied framework on the data source(s). Passing template when dryRun is false will cause an error. | No |

Response parameters

| Attribute | Description |
| --- | --- |
| id | string The unique identifier of the job. |
| state | string The job state. Possible states are created, retry, active, completed, expired, cancelled, or failed. |
| output | object Information about the tags applied on the data source, including diff (added and removed tags) and the current state of allTags on all columns in the data sources. |

Request example: Run SDD on a single data source

This request runs SDD on the data source Insurance Data.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "sources": [
    "Insurance Data"
  ]
}
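
The payload parameters for this endpoint include one cross-field rule: template is only accepted when dryRun is true. The Python sketch below enforces that rule before a request is sent. The helper name build_run_payload is hypothetical, and always including wait and dryRun with their documented defaults is a choice made for clarity, not something the API requires.

```python
import json


def build_run_payload(sources, wait=-1, dry_run=False, template=None):
    """Return a dict for POST /sdd/run, rejecting template without dryRun."""
    if not sources:
        raise ValueError("at least one data source name is required")
    if template is not None and not dry_run:
        # The parameter table states that this combination causes an error.
        raise ValueError("template may only be passed when dryRun is true")
    payload = {"sources": list(sources), "wait": wait, "dryRun": dry_run}
    if template is not None:
        payload["template"] = template
    return payload


run_payload = build_run_payload(["Insurance Data"])
preview_payload = build_run_payload(
    ["Insurance Data"], dry_run=True, template="MY_FIRST_FRAMEWORK"
)
print(json.dumps(run_payload))
print(json.dumps(preview_payload))
```

The second payload previews the tags MY_FIRST_FRAMEWORK would produce on Insurance Data without changing the data source.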

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.Social Security Number"