Create a Rule with a Dictionary Pattern

Note

Previous documentation refers to a rule as a classifier or identifier and to a framework as a template. The endpoint used on this page (sdd/classifier) reflects the older term.

Use case: Custom dictionary pattern

Scenario: Your data includes the names of the rooms where employees' desks are located across your organization. Although these locations may be considered sensitive in particular datasets, Immuta's built-in patterns would not recognize them.

In this scenario, you can create a rule with a dictionary pattern specific to your desk locations. Immuta's sensitive data discovery (SDD) will use your new dictionary pattern to match values in the dataset and tag the columns that contain them. The tutorial below uses this scenario to illustrate creating the rule and pattern.
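
To make the idea concrete, the sketch below shows one way dictionary matching with a confidence threshold could work. It is an illustration only and not Immuta's actual detection algorithm: the functions `dictionary_confidence` and `tags_for_column` are hypothetical, and the assumption that confidence is the fraction of column values that exactly match a dictionary entry is ours. The configuration keys (`values`, `caseSensitive`, `minConfidence`, `tags`) mirror the rule attributes described on this page.

```python
def dictionary_confidence(column_values, dictionary, case_sensitive=False):
    """Fraction of non-null column values that exactly match a dictionary entry.

    ASSUMPTION: this definition of confidence is illustrative, not Immuta's.
    """
    if case_sensitive:
        lookup = set(dictionary)
        normalize = str
    else:
        # Compare case-insensitively by folding both sides.
        lookup = {word.casefold() for word in dictionary}
        normalize = str.casefold

    values = [v for v in column_values if v is not None]
    if not values:
        return 0.0
    hits = sum(1 for v in values if normalize(v) in lookup)
    return hits / len(values)


def tags_for_column(column_values, rule_config):
    """Return the rule's tags when the confidence reaches minConfidence."""
    confidence = dictionary_confidence(
        column_values,
        rule_config["values"],
        rule_config.get("caseSensitive", False),
    )
    if confidence >= rule_config["minConfidence"]:
        return list(rule_config["tags"])
    return []


if __name__ == "__main__":
    config = {
        "values": ["Research Lab", "Blue Room", "Purple Room"],
        "caseSensitive": False,
        "minConfidence": 0.6,
        "tags": ["Discovered.desk-location"],
    }
    column = ["blue room", "Research Lab", "Purple Room", "Cafeteria"]
    print(tags_for_column(column, config))
```

In this sketch, three of the four sample values match case-insensitively, so the column clears the 0.6 threshold and receives the tag. With `caseSensitive` set to true, "blue room" would no longer match and the column would fall below the threshold.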

Attributes of a rule with a dictionary pattern

Attributes of all rules are provided on the Sensitive data discovery API page. However, attributes specific to the custom dictionary pattern are outlined in the table below.

| Attribute | Type | Description |
| --- | --- | --- |
| name | string | Unique, request-friendly rule name. |
| displayName | string | Unique, human-readable rule name. |
| description | string | The rule description. |
| type | string | The type of pattern: dictionary. |
| config | object | The configuration of the rule, which includes config.minConfidence, config.tags, config.values, and config.caseSensitive. |
| config.minConfidence | number | The confidence threshold: when the detection confidence is at least this value, tags are applied. The example payload below uses 0.6. |
| config.tags | array[string] | The names of the resulting tags to apply to the column. Note: All tags must start with Discovered. (including the trailing period). |
| config.values | array[string] | The list of words to include in the dictionary. |
| config.caseSensitive | boolean | Indicates whether or not values are case sensitive. Defaults to false. |
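
As a quick way to catch mistakes before sending a payload, the attributes above can be checked locally. The sketch below is a hypothetical helper, not part of Immuta's API or CLI: `validate_dictionary_rule` and `EXAMPLE_PAYLOAD` are names invented for illustration, and treating every attribute except `config.caseSensitive` as required is an assumption. The API itself remains the authority on what is valid.

```python
# Hypothetical example payload, using the desk-location scenario from this page.
EXAMPLE_PAYLOAD = {
    "name": "EMPLOYEE_DESK_LOCATION_RULE",
    "displayName": "Employee Desk Location Rule",
    "description": "This rule detects when an employee's desk location appears in a dataset.",
    "type": "dictionary",
    "config": {
        "values": ["Research Lab", "Blue Room", "Purple Room"],
        "caseSensitive": False,
        "minConfidence": 0.6,
        "tags": ["Discovered.desk-location"],
    },
}


def validate_dictionary_rule(payload):
    """Return a list of problems found in the payload (empty if none).

    ASSUMPTION: all attributes except config.caseSensitive are treated as required.
    """
    problems = []
    for field in ("name", "displayName", "description"):
        if not isinstance(payload.get(field), str):
            problems.append(f"{field} must be a string")
    if payload.get("type") != "dictionary":
        problems.append('type must be "dictionary"')

    config = payload.get("config")
    if not isinstance(config, dict):
        problems.append("config must be an object")
        return problems

    values = config.get("values")
    if not (isinstance(values, list) and all(isinstance(v, str) for v in values)):
        problems.append("config.values must be an array of strings")

    tags = config.get("tags")
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        problems.append("config.tags must be an array of strings")
    elif not all(t.startswith("Discovered.") for t in tags):
        problems.append("every tag in config.tags must start with 'Discovered.'")

    min_conf = config.get("minConfidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(min_conf, bool) or not isinstance(min_conf, (int, float)):
        problems.append("config.minConfidence must be a number")

    if not isinstance(config.get("caseSensitive", False), bool):
        problems.append("config.caseSensitive must be a boolean")
    return problems


if __name__ == "__main__":
    print(validate_dictionary_rule(EXAMPLE_PAYLOAD))
```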

Create a rule with a dictionary pattern

  1. Generate your API key on the API Keys tab on your profile page and save the API key somewhere secure. You will include this API key in the authorization header when you make a request to the Immuta API or use it to configure your instance with the Immuta CLI.

  2. Save the payload for the rule in a .json file (the commands in the next step assume it is named example-payload.json). The dictionary below contains the words Research Lab, Blue Room, and Purple Room.

    {
      "name": "EMPLOYEE_DESK_LOCATION_RULE",
      "displayName": "Employee Desk Location Rule",
      "description": "This rule detects when an employee's desk location appears in a dataset.",
      "type": "dictionary",
      "config": {
        "values": ["Research Lab", "Blue Room", "Purple Room"],
        "caseSensitive": false,
        "minConfidence": 0.6,
        "tags": ["Discovered.desk-location"]
      }
    }
    
  3. Create the rule using one of these methods:

    Immuta CLI

    immuta api sdd/classifier -X POST --input ./example-payload.json
    

    HTTP API

    curl \
        --request POST \
        --header "Content-Type: application/json" \
        --header "Authorization: 12345678900000" \
        --data @example-payload.json \
        https://your-immuta-url.immuta.com/sdd/classifier
    
  4. If the request is successful, you will receive a response that contains details about the rule.

    {
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "name": "EMPLOYEE_DESK_LOCATION_RULE",
      "displayName": "Employee Desk Location Rule",
      "description": "This rule detects when an employee's desk location appears in a dataset.",
      "type": "dictionary",
      "config": {
        "tags": [
          "Discovered.desk-location"
        ],
        "values": [
          "Research Lab",
          "Blue Room",
          "Purple Room"
        ],
        "caseSensitive": false,
        "minConfidence": 0.6
      },
      "id": 68,
      "createdAt": "2021-10-20T17:57:51.696Z",
      "updatedAt": "2021-10-20T17:57:51.696Z"
    }
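
The HTTP request in step 3 can also be issued from a script. The following is a minimal sketch using only the Python standard library; it mirrors the curl command above (same endpoint path, headers, and JSON body) but is not an official Immuta client. The function names `build_create_rule_request` and `create_rule` are hypothetical, and the base URL and API key are placeholders you would replace with your own.

```python
import json
import urllib.request


def build_create_rule_request(base_url, api_key, payload):
    """Build (but do not send) the POST request that creates a rule."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdd/classifier",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Same header as the curl example: the API key is passed as-is.
            "Authorization": api_key,
        },
    )


def create_rule(base_url, api_key, payload):
    """Send the request and return the parsed JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_create_rule_request(base_url, api_key, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values; load your own payload file and credentials.
    with open("example-payload.json", encoding="utf-8") as f:
        rule = json.load(f)
    created = create_rule("https://your-immuta-url.immuta.com", "12345678900000", rule)
    print(created["id"], created["name"])
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body without contacting the server.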
    

What's next

Continue to one of the following tutorials:

  • Run sensitive data discovery on data sources: Trigger SDD to run on specified data sources.
  • Create a framework: Only data governors can create rules, but data owners can add rules to frameworks and then apply those frameworks to their data sources. Within a framework, they can override minConfidence or tags for the rules it contains.