Manage Sensitive Data Discovery Identifiers

Note

Earlier documentation referred to identifiers as classifiers. The term identifier is now used because it is more accurate and avoids confusion with the Immuta data classification and frameworks feature. The CLI commands on this page still use the word classifier.

Prerequisite

Sensitive data discovery must be enabled.

Command overview: immuta sdd classifier

This command manages identifiers, which apply tags during SDD to data that matches the patterns you specify. The table below lists the subcommands and their aliases.

Subcommand   Aliases    Description
create       save       Create an identifier.
delete       None       Delete the passed identifier.
get          None       Get an identifier.
search       ls, list   Search all identifiers.
update       None       Update an identifier.

Options

Use these options to get more details about the sdd classifier command or any of its subcommands:

  • -h
  • --help
$ immuta sdd classifier -h
Manage Sensitive Data Discovery Classifiers

Usage:
  immuta sdd classifier [command]

Available Commands:
  create      Create an SDD classifier
  delete      Delete the passed SDD classifier
  get         Get an SDD classifier
  search      Search all classifiers
  update      Update an SDD classifier

Flags:
  -h, --help   Help for classifier

Global Flags:
      --config string    Config file (default $HOME/.immutacfg.yaml)
  -p, --profile string   Specifies the profile for what instance/api the cli will use (default "default")

Use "immuta sdd classifier [command] --help" for more information about a command.

Create an identifier

  1. Save your identifier to a valid YAML or JSON file using these attributes.

    Attribute          Type            Required   Description
    name               string          Yes        Unique, request-friendly identifier name.
    displayName        string          Yes        Unique, human-readable identifier name.
    description        string          Yes        The identifier description.
    type               string          Yes        The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
    config             object          Yes        May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. Attributes marked * below are set inside config.
    minConfidence*     number          Yes        Tags are applied when the detection confidence is at least this value.
    tags*              array[string]   Yes        The names of the tags to apply to the data source.
    regex*             string          No         A case-insensitive regular expression to match against column values.
    columnNameRegex*   string          No         A case-insensitive regular expression to match against column names.
    values*            array[string]   No         The list of words to include in the dictionary.
    caseSensitive*     boolean         No         Indicates whether values are case sensitive. Defaults to false.

    Examples are provided below.

    Regex identifier

    {
      "name": "MY_REGEX_IDENTIFIER",
      "displayName": "My Regex Identifier",
      "description": "An identifier using regex",
      "type": "regex",
      "config": {
        "regex": "^[A-Z][a-z]+",
        "minConfidence": 0.5,
        "tags": ["Discovered.regex-example"]
      }
    }
    

    Dictionary identifier

    {
      "name": "MY_DICTIONARY_IDENTIFIER",
      "displayName": "My Dictionary Identifier",
      "description": "An identifier using dictionary",
      "type": "dictionary",
      "config": {
        "values": ["Bob", "Eve"],
        "caseSensitive": true,
        "minConfidence": 0.6,
        "tags": ["Discovered.dictionary-example", "Discovered.dictionary-identifier-example"]
      }
    }
    

    Column name regex identifier

    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
      "displayName": "My Column Name Regex Identifier",
      "description": "An identifier using column name regex",
      "type": "columnNameRegex",
      "config": {
        "columnNameRegex": "ssn|social ?security",
        "tags": ["Discovered.column-name-example"]
      }
    }
    
  2. Run immuta sdd classifier create <filepath> [flags], referencing the file you just created. You can specify the following options:

    • -h or --help: Get more information about the command.
    • -o or --output json | yaml: Specify the output format.
    • --outputTemplate string: Format the response using a Go template.

Example

$ immuta sdd classifier create ./account-classifier.json
Creating classifier from ./account-classifier...
Create successful.
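
Before running create, it can help to check the file locally. The sketch below builds an identifier definition, runs a few sanity checks based on the attribute table above, and writes it out as JSON. The helper names check_identifier and write_identifier are hypothetical, the checks are illustrative rather than Immuta's own validation, and compiling the pattern with Python's re module is only a rough syntax check because the regex engine Immuta uses may differ.

```python
import json
import re
import tempfile
from pathlib import Path

# Attribute names come from the table above; the checks themselves are
# illustrative assumptions, not Immuta's server-side validation.
REQUIRED_TOP_LEVEL = ("name", "displayName", "description", "type", "config")
PATTERN_KEY_BY_TYPE = {
    "regex": "regex",
    "dictionary": "values",
    "columnNameRegex": "columnNameRegex",
}


def check_identifier(identifier: dict) -> list:
    """Return a list of problems found in an identifier definition."""
    problems = [f"missing attribute: {key}"
                for key in REQUIRED_TOP_LEVEL if key not in identifier]
    config = identifier.get("config")
    if not isinstance(config, dict):
        config = {}
    if not config.get("tags"):
        problems.append("config.tags is missing or empty")
    type_name = identifier.get("type")
    pattern_key = PATTERN_KEY_BY_TYPE.get(type_name) if isinstance(type_name, str) else None
    if pattern_key and pattern_key not in config:
        problems.append(f"config.{pattern_key} is expected for type {type_name}")
    # Rough syntax check only: Python's regex dialect may not match Immuta's.
    for key in ("regex", "columnNameRegex"):
        if key in config:
            try:
                re.compile(config[key])
            except re.error as exc:
                problems.append(f"config.{key} does not compile: {exc}")
    return problems


def write_identifier(identifier: dict, path) -> None:
    """Write the definition as JSON, refusing to write if checks fail."""
    problems = check_identifier(identifier)
    if problems:
        raise ValueError("; ".join(problems))
    Path(path).write_text(json.dumps(identifier, indent=2) + "\n", encoding="utf-8")


identifier = {
    "name": "ACCOUNT_NUMBER_IDENTIFIER",
    "displayName": "Account Number Identifier",
    "description": "This identifier recognizes account numbers using a regex.",
    "type": "regex",
    "config": {
        "regex": "^[0-9]{9}-[0-9]{3}-[0-9]{1}$",
        "minConfidence": 0.5,
        "tags": ["Discovered.account-number"],
    },
}

# A temporary directory keeps the demo from overwriting a real file.
with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "account-classifier.json"
    write_identifier(identifier, target)
    print(target.read_text(encoding="utf-8"))
```

The resulting file is what you would then pass as <filepath> to immuta sdd classifier create.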

Get an identifier

Run immuta sdd classifier get <classifierName> [flags], specifying the name of the identifier you would like to get. You can specify the following options:

  • -h or --help: Get more information about the command.
  • -o or --output json | yaml: Specify the output format.
  • --outputTemplate string: Format the response using a Go template.

Example

The example below illustrates a user getting an identifier called ACCOUNT_NUMBER_IDENTIFIER.

$ immuta sdd classifier get ACCOUNT_NUMBER_IDENTIFIER
Getting classifier ACCOUNT_NUMBER_IDENTIFIER...
{
  "createdBy": {
    "id": 1,
    "name": "Example User",
    "email": "user@example.com"
  },
  "name": "ACCOUNT_NUMBER_IDENTIFIER",
  "displayName": "Account Number Identifier",
  "description": "This identifier recognizes account numbers using a regex",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.account-number"
    ],
    "regex": "^[0-9]{9}-[0-9]{3}-[0-9]{1}$",
    "minConfidence": 0.5
  },
  "id": 69,
  "createdAt": "2022-03-28T14:52:14.004Z",
  "updatedAt": "2022-03-28T14:52:14.004Z"
}
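
To get a feel for how the regex and minConfidence values in this identifier interact, the sketch below tries the pattern against a few made-up sample values. It rests on two assumptions that this page does not confirm: that detection confidence behaves roughly like the fraction of sampled column values that match, and that Python's re module is close enough to the engine Immuta uses for this pattern. The sample values and the match_fraction helper are hypothetical.

```python
import re

# Pattern and threshold copied from the ACCOUNT_NUMBER_IDENTIFIER example.
ACCOUNT_REGEX = r"^[0-9]{9}-[0-9]{3}-[0-9]{1}$"
MIN_CONFIDENCE = 0.5


def match_fraction(pattern: str, values: list) -> float:
    """Fraction of values matched by pattern (case-insensitive, as documented).

    ASSUMPTION: this stands in for "detection confidence"; how Immuta
    actually computes confidence is not described on this page.
    """
    if not values:
        return 0.0
    compiled = re.compile(pattern, re.IGNORECASE)
    hits = sum(1 for value in values if compiled.search(value))
    return hits / len(values)


# Hypothetical column sample: two well-formed account numbers, two misses.
sample = ["123456789-001-7", "987654321-123-0", "12345-001-7", "n/a"]

fraction = match_fraction(ACCOUNT_REGEX, sample)
print(f"match fraction: {fraction}")
print(f"would meet minConfidence: {fraction >= MIN_CONFIDENCE}")
```

Trying a pattern on representative values like this is a cheap way to catch an overly strict or overly loose regex before creating or updating the identifier.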

Search identifiers

Run immuta sdd classifier search [string] [flags] to list all identifiers or search identifiers by name. You can specify the following options:

  • -h, --help: Help for search.
  • --limit int: The search limit for pagination (default 25).
  • --offset int: The search offset for pagination.
  • --order asc | desc: The sort order.
  • -o, --output json | yaml: The output format.
  • --outputTemplate string: Format the response using a Go template.
  • -s, --sort id | name | displayName | type | createdAt | updatedAt: Field to sort by.
  • --type regex | columnNameRegex | dictionary | builtIn: Limit results to the specified identifier type.

Example

The example below illustrates a user searching for identifiers whose name contains account.

$ immuta sdd classifier search account
Searching all classifiers...
ACCOUNT_NUMBER_IDENTIFIER This identifier recognizes account numbers using a regex.
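
When there are more identifiers than one page holds, the --limit and --offset flags can be combined to walk through the results. The sketch below only builds the command lines and prints them; it does not run the CLI. It assumes --offset counts items rather than pages, which this page does not state, and the helper names search_command and page_commands are hypothetical.

```python
import shlex

# Allowed values copied from the option list above.
SORT_FIELDS = {"id", "name", "displayName", "type", "createdAt", "updatedAt"}
ORDERS = {"asc", "desc"}
TYPES = {"regex", "columnNameRegex", "dictionary", "builtIn"}


def search_command(term=None, limit=25, offset=0, sort=None, order=None,
                   identifier_type=None):
    """Build the argument list for one `immuta sdd classifier search` call."""
    if sort is not None and sort not in SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {sort}")
    if order is not None and order not in ORDERS:
        raise ValueError(f"unsupported order: {order}")
    if identifier_type is not None and identifier_type not in TYPES:
        raise ValueError(f"unsupported identifier type: {identifier_type}")
    args = ["immuta", "sdd", "classifier", "search"]
    if term:
        args.append(term)
    args += ["--limit", str(limit), "--offset", str(offset)]
    if sort:
        args += ["--sort", sort]
    if order:
        args += ["--order", order]
    if identifier_type:
        args += ["--type", identifier_type]
    return args


def page_commands(term, page_size, pages):
    """Commands for consecutive pages.

    ASSUMPTION: --offset is an item offset, so page n starts at n * page_size.
    """
    return [search_command(term, limit=page_size, offset=page * page_size)
            for page in range(pages)]


for command in page_commands("account", 25, 3):
    print(shlex.join(command))
```

Each printed line can be pasted into a shell, or the argument lists can be handed to subprocess.run if you want to script the search.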

Update an identifier

  1. Update your identifier in a valid YAML or JSON file using these attributes:

    Attribute          Type            Required   Description
    name               string          Yes        Unique, request-friendly identifier name.
    displayName        string          Yes        Unique, human-readable identifier name.
    description        string          Yes        The identifier description.
    type               string          Yes        The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
    config             object          Yes        May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. Attributes marked * below are set inside config.
    minConfidence*     number          Yes        Tags are applied when the detection confidence is at least this value.
    tags*              array[string]   Yes        The names of the tags to apply to the data source.
    regex*             string          No         A case-insensitive regular expression to match against column values.
    columnNameRegex*   string          No         A case-insensitive regular expression to match against column names.
    values*            array[string]   No         The list of words to include in the dictionary.
    caseSensitive*     boolean         No         Indicates whether values are case sensitive. Defaults to false.
  2. Run immuta sdd classifier update <classifierName> <filepath> [flags], referencing the file you just updated. You can specify the following options:

    • -h or --help: Get more information about the command.
    • -o or --output json | yaml: Specify the output format.
    • --outputTemplate string: Format the response using a Go template.

Example

The example below illustrates a user updating an identifier named ACCOUNT_NUMBER_IDENTIFIER.

$ immuta sdd classifier update ACCOUNT_NUMBER_IDENTIFIER ./account-classifier -o json
{
  "createdBy": {
    "id": 1,
    "name": "Example User",
    "email": "user@example.com"
  },
  "name": "ACCOUNT_NUMBER_IDENTIFIER",
  "displayName": "Account Number Identifier",
  "description": "This identifier recognizes account numbers using a regex.",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.account-number"
    ],
    "regex": "^[0-9]{9}-[0-9]{3}-[0-9]{1}$",
    "minConfidence": 0.5
  },
  "id": 69,
  "createdAt": "2022-03-28T14:52:14.004Z",
  "updatedAt": "2022-03-28T15:25:28.575Z"
}
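
One way to prepare the file for an update is to start from the JSON that get returns and keep only the attributes listed in the table above. The sketch below does that and changes one config value. Whether the CLI tolerates extra fields such as id, createdBy, createdAt, and updatedAt in the update file is not stated on this page, so dropping them is an assumption made here for safety; to_update_payload is a hypothetical helper and the new minConfidence value is arbitrary.

```python
import copy
import json

# Only the attributes documented in the table above are carried over.
DOCUMENTED_ATTRIBUTES = ("name", "displayName", "description", "type", "config")


def to_update_payload(current: dict, config_changes: dict) -> dict:
    """Copy the documented attributes from `current` and apply config changes.

    ASSUMPTION: server-managed fields (id, createdBy, createdAt, updatedAt)
    do not belong in the update file, so they are left out.
    """
    payload = {key: copy.deepcopy(current[key])
               for key in DOCUMENTED_ATTRIBUTES if key in current}
    payload.setdefault("config", {}).update(config_changes)
    return payload


# Abridged output of `immuta sdd classifier get ACCOUNT_NUMBER_IDENTIFIER`.
current = json.loads("""
{
  "name": "ACCOUNT_NUMBER_IDENTIFIER",
  "displayName": "Account Number Identifier",
  "description": "This identifier recognizes account numbers using a regex.",
  "type": "regex",
  "config": {
    "tags": ["Discovered.account-number"],
    "regex": "^[0-9]{9}-[0-9]{3}-[0-9]{1}$",
    "minConfidence": 0.5
  },
  "id": 69,
  "createdAt": "2022-03-28T14:52:14.004Z",
  "updatedAt": "2022-03-28T14:52:14.004Z"
}
""")

payload = to_update_payload(current, {"minConfidence": 0.75})
print(json.dumps(payload, indent=2))
```

Saving the printed JSON to a file gives you the <filepath> argument for immuta sdd classifier update.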

Delete an identifier

Run immuta sdd classifier delete <classifierName> [flags] to delete the identifier. You can specify the following options:

  • -h or --help: Get more information about the command.
  • -o or --output json | yaml: Specify the output format.
  • --outputTemplate string: Format the response using a Go template.

Example

$ immuta sdd classifier delete ACCOUNT_NUMBER_IDENTIFIER -o json
{
  "createdBy": {
    "id": 1,
    "name": "Example User",
    "email": "user@example.com"
  },
  "name": "ACCOUNT_NUMBER_IDENTIFIER",
  "displayName": "Account Number Identifier",
  "description": "This identifier recognizes account numbers using a regex.",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.account-number"
    ],
    "regex": "^[0-9]{9}-[0-9]{3}-[0-9]{1}$",
    "minConfidence": 0.5
  },
  "id": 69,
  "createdAt": "2022-03-28T14:52:14.004Z",
  "updatedAt": "2022-03-28T15:25:28.575Z"
}