Built-in Pattern Reference
Earlier documentation referred to rules and patterns as classifiers or identifiers. The terminology is being changed to rule for accuracy and to avoid confusion with Detect classification.
Immuta comes with a set of built-in patterns that look for common data types. These patterns were written by Immuta's research and development team and cannot be deleted or edited by users. However, users can build their own rules from these built-in patterns to customize the resulting tags to the organization's needs.
When using SDD with classification frameworks, it is recommended to use the default resulting tags listed in the table below for these built-in patterns. This ensures that the framework rules apply sensitivity tags as intended.
Pattern descriptions and default resulting tags
AGE
Matches numeric strings between 10 and 199.
Discovered.PII
Discovered.Identifier Indirect
Discovered.PHI
Discovered.Entity.Age
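The range check in the AGE description can be approximated with a regular expression. This is a minimal sketch and not Immuta's implementation; the name `looks_like_age` is hypothetical, and the sketch assumes values carry no leading zeros or surrounding whitespace.

```python
import re

# Assumption: two-digit values 10-99 or three-digit values 100-199,
# with no leading zeros.
AGE_RE = re.compile(r"(?:[1-9][0-9]|1[0-9]{2})")

def looks_like_age(value: str) -> bool:
    """Return True if value is a numeric string between 10 and 199."""
    return AGE_RE.fullmatch(value) is not None
```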
ARGENTINA_DNI_NUMBER
Matches strings consistent with Argentina National Identity (DNI) Number. Requires an eight-digit number with optional periods between the second and third and fifth and sixth digit.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Argentina
Discovered.PHI
Discovered.Entity.DNI Number
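The DNI format described above (eight digits with optional periods after the second and fifth digits) can be sketched as a regular expression. The name `looks_like_dni` is hypothetical, and the built-in pattern may apply further checks that are not documented here.

```python
import re

# Eight digits, with an optional period after the 2nd and after the 5th digit.
DNI_RE = re.compile(r"[0-9]{2}\.?[0-9]{3}\.?[0-9]{3}")

def looks_like_dni(value: str) -> bool:
    """Return True if value has the DNI shape described in the reference."""
    return DNI_RE.fullmatch(value) is not None
```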
AUSTRALIA_MEDICARE_NUMBER
Matches numeric strings consistent with an Australian Medicare number. Requires a ten- or eleven-digit number. The starting digit must be between 2 and 6, inclusive. Optional spaces can be placed between the fourth and fifth and ninth and tenth digits. An optional eleventh digit, separated by a /, can be present. A checksum is required.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Australia
Discovered.PHI
Discovered.Entity.Medicare Number
AUSTRALIA_PASSPORT
Matches strings consistent with Australian Passport number. An 8- or 9-character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two-character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Australia
Discovered.PHI
Discovered.Entity.Passport
BELGIUM_NATIONAL_ID_CARD_NUMBER
Matches numeric strings consistent with Belgium's National ID card. Requires a twelve-digit number with a hyphen (-) between the third and fourth digits and between the tenth and eleventh digits. A two-digit checksum is required.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Belgium
Discovered.PHI
Discovered.Entity.National ID Card Number
BITCOIN_INVOICE_ADDRESS
Matches strings consistent with the following Bitcoin Invoice Address formats: P2PKH, P2SH, and Bech32. P2PKH and P2SH must start with a 1 or a 3, respectively, followed by 25 to 34 alphanumeric characters, excluding l, I, O, and 0. Bech32 formats must begin with bc1 and be followed by 39 characters. To be identified, any address must have a valid checksum.
Discovered.Entity.CRYPTO
Discovered.PCI
BRAZIL_CPF_NUMBER
Matches a numeric string consistent with Brazil's CPF (Cadastro Pessoal de Pessoa Física) number. Requires an eleven-digit numeric string with non-numeric separators after the third, sixth, and ninth digits. A two-digit checksum is required.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Brazil
Discovered.PHI
Discovered.Entity.CPF Number
CANADA_BC_PHN
Matches numeric strings consistent with British Columbia's Personal Health Number (PHN). Requires a ten-digit numeric string with optional hyphens (-) or spaces after the fourth and seventh digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Canada
Discovered.PHI
Discovered.Entity.British Columbia Health Network Number
CANADA_OHIP
Matches alphanumeric strings consistent with Ontario's Health Insurance Plan (OHIP). Requires a twelve-character alphanumeric code. Optional hyphens (-) or spaces can appear after the fourth, seventh, and tenth characters. The final two characters are a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Canada
Discovered.PHI
Discovered.Entity.Ontario Health Insurance Number
CANADA_PASSPORT
Matches strings consistent with the Canadian Passport Number format.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Canada
Discovered.PHI
Discovered.Entity.Passport
CANADA_QUEBEC_HIN
Matches alphanumeric strings consistent with Quebec's Health Insurance Number (HIN). Requires four alphabetic characters followed by an optional space or hyphen (-), and then eight digits with an optional hyphen or space after the fourth digit.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Canada
Discovered.PHI
Discovered.Entity.Quebec Health Insurance Number
CREDIT_CARD_NUMBER
Matches strings consistent with a credit card number with prefixes matching major credit card companies. Must include a valid checksum.
Discovered.PCI
Discovered.Entity.Credit Card Number
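The checksum requirement for CREDIT_CARD_NUMBER can be illustrated with the Luhn algorithm, which is the check-digit scheme used by payment card numbers. This is a sketch under that assumption: the reference does not name the algorithm, the function name `luhn_valid` is hypothetical, and the issuer-prefix check mentioned in the description is not covered.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits of number pass the Luhn checksum."""
    # Assumption: spaces and hyphens are the only separators tolerated.
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, the widely used test number `4111 1111 1111 1111` passes this check, while changing its last digit to 2 makes it fail.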
DATE
Matches strings consistent with dates. These can include days of the week, dates, and date times.
Discovered.Entity.Date
DENMARK_CPR_NUMBER
Matches numeric strings consistent with Denmark's Personal Identification Number (CPR-number or Person-number). Requires a ten-digit number in either a DDMMYY-SSSS or DDMMYYSSSS format. The first six digits are an individual's birth date in Day, Month, Year format. The final four digits comprise the sequence number.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Denmark
Discovered.PHI
Discovered.Entity.CPR Number
DOMAIN_NAME
Matches domain names using a very broad pattern.
Discovered.Entity.Domain Name
EMAIL_ADDRESS
Matches strings consistent with an email address. Requires a username of fewer than 255 characters, followed by @, a domain of fewer than 255 characters, and a top level domain of between 2 and 20 characters.
Discovered.PHI
Discovered.Entity.Electronic Mail Address
Discovered.Identifier Direct
ETHNIC_GROUP
Matches strings consistent with the US Census race designations.
Discovered.PII
Discovered.Entity.Ethnic Group
FDA_CODE
Matches a string consistent with a drug or ingredient registered with the Food and Drug Administration (FDA). Must start with 4 to 6 digits, followed by a hyphen, followed by 3 to 4 digits, followed by a hyphen, and finishing with one to two digits.
Discovered.Country.US
Discovered.Entity.FDA Code
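The three hyphen-separated digit groups in the FDA_CODE description translate directly into a regular expression. This is an illustrative sketch with a hypothetical function name, not the built-in pattern itself.

```python
import re

# 4-6 digits, hyphen, 3-4 digits, hyphen, 1-2 digits.
FDA_CODE_RE = re.compile(r"[0-9]{4,6}-[0-9]{3,4}-[0-9]{1,2}")

def looks_like_fda_code(value: str) -> bool:
    """Return True if value has the segment lengths described in the reference."""
    return FDA_CODE_RE.fullmatch(value) is not None
```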
FINLAND_NATIONAL_ID_NUMBER
Matches a string consistent with Finland's National ID number. Requires an eleven-character string in a DDMMYYCZZZQ format. The first six digits are an individual's birth date in Day, Month, Year format. The C character is a century of birth indicator (+ for the years 1800-1899, - for years 1900-1999, and A for years 2000-2099). ZZZ is an individual ID number, and Q is a required checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Finland
Discovered.PHI
Discovered.Entity.National ID Number
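The DDMMYYCZZZQ layout and its checksum can be sketched as follows. The sketch assumes the publicly documented scheme for Finnish identity codes, in which the nine digits DDMMYYZZZ are read as one number and its remainder modulo 31 selects the check character; the reference itself does not spell out the algorithm. The function name is hypothetical, and the birth date is not validated.

```python
import re

# Six birth-date digits, century marker, three-digit individual number, check character.
FINLAND_ID_RE = re.compile(r"([0-9]{6})([+\-A])([0-9]{3})([0-9A-Y])")

# Assumption: lookup table of the 31 check characters (G, I, O, Q, and Z are unused).
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"

def finland_id_valid(value: str) -> bool:
    """Return True if value matches DDMMYYCZZZQ and Q is the expected check character."""
    match = FINLAND_ID_RE.fullmatch(value)
    if match is None:
        return False
    birth, _century, individual, check = match.groups()
    return CHECK_CHARS[int(birth + individual) % 31] == check
```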
FRANCE_CNI
Matches numeric strings consistent with the French National ID card number (carte nationale d'identité). Requires a twelve-digit numeric string.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.France
Discovered.PHI
Discovered.Entity.CNI
FRANCE_NIR
Matches numeric strings consistent with France's National ID number (Numéro d'Inscription au Répertoire). Requires a fifteen-digit numeric string. An optional hyphen (-) or space can appear after the 13th digit. The 14th and 15th digits act as a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.France
Discovered.PHI
Discovered.Entity.NIR
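The role of the 14th and 15th digits can be illustrated with the standard NIR control key, which equals 97 minus the remainder of the first thirteen digits divided by 97. This is a sketch under that assumption, with a hypothetical function name; it handles only fully numeric values, as in the description above.

```python
import re

# Thirteen digits, an optional hyphen or space, then the two-digit key.
NIR_RE = re.compile(r"([0-9]{13})[- ]?([0-9]{2})")

def france_nir_valid(value: str) -> bool:
    """Return True if the two-digit key matches 97 - (first 13 digits mod 97)."""
    match = NIR_RE.fullmatch(value)
    if match is None:
        return False
    body, key = match.groups()
    return 97 - int(body) % 97 == int(key)
```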
FRANCE_PASSPORT
Matches alphanumeric strings consistent with the French Passport number. Requires two numbers followed by two upper case letters and ends with 5 digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.France
Discovered.PHI
Discovered.Entity.Passport
GENDER
Matches strings consistent with gender or gender abbreviations.
Discovered.PII
Discovered.Identifier Indirect
Discovered.PHI
Discovered.Entity.Gender
GERMANY_DRIVERS_LICENSE_NUMBER
Matches alphanumeric strings consistent with Germany's Driver's License number. Requires an eleven-element string, with a digit or a letter followed by two digits, 6 digits or letters, one digit, and one digit or letter.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Germany
Discovered.PHI
Discovered.Entity.Drivers License Number
GERMANY_IDENTITY_CARD_NUMBER
Matches alphanumeric strings consistent with Germany's Identity Card number. Requires a single letter followed by eight digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Germany
Discovered.PHI
Discovered.Entity.Identity Card Number
IBAN_CODE
Matches strings consistent with an International Bank Account Number (IBAN). Must contain a valid country code.
Discovered.Entity.IBAN Code
ICD10_CODE
Matches strings consistent with codes from the International Statistical Classification of Diseases and Related Health Problems (ICD), as drawn from the Clinical Modification lexicon from the year 2020.
Discovered.Entity.ICD10 Code
IMEI_HARDWARE_ID
Matches strings consistent with an International Mobile Equipment Identity (IMEI) number. Must contain 15 digits with optional hyphens or spaces after the second, 8th, and 14th digits.
Discovered.Entity.IMEI
IP_ADDRESS
Matches IP Addresses in the V4 and V6 formats.
Discovered.Entity.IP Address
LOCATION
Matches strings consistent with Countries, States, Addresses, or Municipalities. By default focuses on locations in the United States.
Discovered.Entity.Location
MAC_ADDRESS
Matches strings consistent with a Media Access Control (MAC) address. Must contain twelve hexadecimal digits, with every two digits separated by a colon.
Discovered.Entity.MAC Address
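The colon-separated layout described for MAC_ADDRESS can be written as a regular expression. This is a minimal sketch with a hypothetical function name; accepting both upper- and lower-case hexadecimal digits is an assumption.

```python
import re

# Six pairs of hexadecimal digits separated by colons (twelve digits in total).
MAC_RE = re.compile(r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5}")

def looks_like_mac(value: str) -> bool:
    """Return True if value is six colon-separated hexadecimal pairs."""
    return MAC_RE.fullmatch(value) is not None
```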
MAC_ADDRESS_LOCAL
Matches strings consistent with a local Media Access Control (MAC) address.
Discovered.Entity.MAC Address Local
PERSON_NAME
Matches strings consistent with a dictionary of people's names. Names are drawn from the US Social Security database.
Discovered.PII
Discovered.PHI
Discovered.Entity.Person Name
Discovered.Identifier Indirect
PHONE_NUMBER
Matches strings consistent with telephone numbers. Primarily looks for strings consistent with the United States telephone numbers naming convention.
Discovered.Entity.Telephone Number
POSTAL_CODE
Matches strings consistent with a valid US zip code with an optional +4. Only valid 5 digit zip codes are detected.
Discovered.Entity.Postal Code
SPAIN_NIE_NUMBER
Matches strings consistent with Spain's Foreigner Identification number. Requires an eight-character string. The initial character must be X, Y, or Z, followed by seven digits, then by an optional hyphen or space and a single checksum character.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Spain
Discovered.PHI
Discovered.Entity.NIE Number
SPAIN_NIF_NUMBER
Matches strings consistent with Spain's Tax Identification number. Requires eight digits followed by an optional hyphen or space and a single checksum character.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Spain
Discovered.PHI
Discovered.Entity.NIF Number
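The single checksum character of a NIF can be sketched with the standard Spanish control-letter table, where the eight-digit number modulo 23 selects the letter. The reference does not state the algorithm, so this is an assumption, and the function name is hypothetical.

```python
import re

# Eight digits, an optional hyphen or space, then the control letter.
NIF_RE = re.compile(r"([0-9]{8})[- ]?([A-Z])")

# Assumption: standard control-letter table indexed by (number mod 23).
NIF_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

def spain_nif_valid(value: str) -> bool:
    """Return True if the trailing letter matches the number modulo 23."""
    match = NIF_RE.fullmatch(value)
    if match is None:
        return False
    number, letter = match.groups()
    return NIF_LETTERS[int(number) % 23] == letter
```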
SPAIN_PASSPORT
Matches strings consistent with Spain's Passport number. Requires an eight- or nine-character string, starting with either two or three letters followed by six digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Spain
Discovered.PHI
Discovered.Entity.Passport
STREET_ADDRESS
Matches strings consistent with street addresses. Primarily looks for strings consistent with the United States street naming convention.
Discovered.Entity.Location
SWEDEN_NATIONAL_ID_NUMBER
Matches numeric strings consistent with Sweden's National ID number. Requires a ten- or twelve-digit string that must start with a date in either the YYMMDD or YYYYMMDD format. An optional - or + character then separates four ending digits. The final digit is a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Sweden
Discovered.PHI
Discovered.Entity.National ID Number
SWEDEN_PASSPORT
Matches numeric strings consistent with Sweden's Passport number. Requires an 8-digit number.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Sweden
Discovered.PHI
Discovered.Entity.Passport
SWIFT_CODE
Matches alphanumeric strings consistent with a SWIFT code (or Bank Identifier Code (BIC)) format.
Discovered.Entity.Swift Code
THAILAND_NATIONAL_ID_NUMBER
Matches strings consistent with Thailand's National ID number. Requires a 13-digit number with optional spaces or hyphens (-) after the first, fifth, tenth, and twelfth digits. The final digit is a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.Thailand
Discovered.PHI
Discovered.Entity.National ID Number
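The 13-digit layout and final check digit can be sketched as below. The sketch assumes the commonly documented weighting for Thai national ID numbers: the first twelve digits are multiplied by 13 down to 2, summed, and the check digit is the last digit of 11 minus the sum modulo 11. The function name is hypothetical.

```python
import re

# Digit groups of 1, 4, 5, 2, and 1, each optionally separated by a space or hyphen.
THAI_ID_RE = re.compile(
    r"([0-9])[- ]?([0-9]{4})[- ]?([0-9]{5})[- ]?([0-9]{2})[- ]?([0-9])"
)

def thailand_id_valid(value: str) -> bool:
    """Return True if value has the expected grouping and check digit."""
    match = THAI_ID_RE.fullmatch(value)
    if match is None:
        return False
    digits = [int(char) for char in "".join(match.groups())]
    # Weights run from 13 (first digit) down to 2 (twelfth digit).
    weighted = sum(d * (13 - i) for i, d in enumerate(digits[:12]))
    return (11 - weighted % 11) % 10 == digits[12]
```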
TIME
Matches strings consistent with times. Can contain both date and time pieces.
Discovered.Entity.Date
UK_DRIVERS_LICENSE_NUMBER
Matches alphanumeric strings consistent with the United Kingdom's Driver's License number. Requires either a 16- or 18-character string. The first five characters represent the driver's surname, padded with 9s, followed by a single digit for decade of birth, two digits for month of birth (incremented by 50 for female drivers), two digits for day of birth, one digit for year of birth, two letters, an arbitrary digit, and two digits. Two additional digits can be present for each license issuance.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.UK
Discovered.PHI
Discovered.Entity.Drivers License Number
UK_NATIONAL_INSURANCE_NUMBER
Matches alphanumeric strings consistent with the United Kingdom's National Insurance number. Requires a nine-character string. The first two characters must be letters, followed by an optional space, then six digits with optional spaces or hyphens (-) every two digits, ending with a letter.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.UK
Discovered.PHI
Discovered.Entity.National Insurance Number
UK_TAXPAYER_REFERENCE
Matches ten-digit numeric strings consistent with UK Taxpayer Reference (UTR) numbers. The final digit is a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.UK
Discovered.PHI
Discovered.Entity.Taxpayer Reference
URL
Matches strings consistent with a Uniform Resource Locator (URL). The string must begin with http://, https://, ftp://, file:///, or mailto:, followed by a string and ending with a top level domain of no more than 128 characters.
Discovered.Entity.URL
US_ADOPTION_TAXPAYER_IDENTIFICATION_NUMBER
Matches a numeric string consistent with a United States Adoption Taxpayer Identification Number (ATIN). Requires a string similar in format to a US Social Security Number, but starting with a 9 in the Area Number and having 93 as an allowed Group Number.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.PHI
Discovered.Entity.Adoption Taxpayer ID Number
US_BANK_ROUTING_MICR
Matches numeric strings consistent with an American Bankers Association (ABA) Routing Number. Must be a nine-digit number starting with 0, 1, 2, 3, 6, or 7, followed by eight digits. The final digit is a checksum.
Discovered.Country.US
Discovered.Entity.Bank Routing MICR
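The leading-digit rule and checksum for ABA routing numbers can be sketched with the standard 3-7-1 weighting, in which the weighted sum of all nine digits must be a multiple of 10. The reference does not name the algorithm, so treat this as an assumption; the function name is hypothetical.

```python
import re

# Nine digits; the first must be 0, 1, 2, 3, 6, or 7 per the description above.
ABA_RE = re.compile(r"[012367][0-9]{8}")

def aba_routing_valid(value: str) -> bool:
    """Return True if value has an allowed first digit and passes the 3-7-1 checksum."""
    if ABA_RE.fullmatch(value) is None:
        return False
    digits = [int(char) for char in value]
    weights = [3, 7, 1] * 3
    return sum(d * w for d, w in zip(digits, weights)) % 10 == 0
```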
US_DEA_NUMBER
Matches alphanumeric strings consistent with a Drug Enforcement Administration (DEA) number that is assigned to a health care provider. Must be nine characters long. The first two characters must be alphanumeric, and the last seven characters must be digits. The final digit is a checksum.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.Entity.DEA Number
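The DEA number layout and check digit can be sketched as below. The sketch assumes the commonly documented checksum: add the first, third, and fifth digits to twice the sum of the second, fourth, and sixth digits, and compare the last digit of the result with the seventh digit. The function name is hypothetical.

```python
import re

# Two alphanumeric characters followed by seven digits.
DEA_RE = re.compile(r"[A-Za-z0-9]{2}([0-9]{7})")

def dea_number_valid(value: str) -> bool:
    """Return True if value has the DEA layout and a matching check digit."""
    match = DEA_RE.fullmatch(value)
    if match is None:
        return False
    d = [int(char) for char in match.group(1)]
    total = (d[0] + d[2] + d[4]) + 2 * (d[1] + d[3] + d[5])
    return total % 10 == d[6]
```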
US_EMPLOYER_IDENTIFICATION_NUMBER
Matches numeric strings consistent with a United States Employer Identification Number (EIN). Strings must contain nine digits with a hyphen after the second digit.
Discovered.Country.US
Discovered.Entity.Employer ID Number
US_HEALTHCARE_NPI
Matches numeric strings consistent with US National Provider Identifier (NPI). Strings must be either 10 or 15 digits with the final digit being a valid checksum.
Discovered.PII
Discovered.Country.US
Discovered.Entity.Healthcare NPI
Discovered.Identifier Undetermined
US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER
Matches a numeric string consistent with a United States Individual Taxpayer Identification Number (ITIN). Requires a string similar in format to a US Social Security Number, but starting with a 9 in the Area Number and having a limited set of allowed Group Numbers.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.PHI
Discovered.Entity.Individual Taxpayer ID Number
US_PASSPORT
Matches numeric strings consistent with United States Passport number. Strings must contain nine digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.PHI
Discovered.Entity.Passport
US_PREPARER_TAXPAYER_IDENTIFICATION_NUMBER
Matches strings consistent with a Preparer Taxpayer ID number. Strings must have nine characters, starting with a P that is followed by 8 digits.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.Entity.Preparer Taxpayer ID Number
US_SOCIAL_SECURITY_NUMBER
Matches strings consistent with a US Social Security Number. Strings must contain nine digits and comprise three parts: the three left-most digits designating the area number, the middle two digits designating the group number, and the four right-most digits designating the serial number. For a column to be tagged, none of these parts can contain all zeroes, and area numbers must not be 666 or in the range of 900-999.
Discovered.PII
Discovered.Identifier Direct
Discovered.Country.US
Discovered.PHI
Discovered.Entity.Social Security Number
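The Social Security Number rules above (no all-zero part, no 666 area, no 900-999 area) can be sketched as a small validator. The function name is hypothetical, and accepting optional hyphens between the three parts is an assumption that the reference does not state.

```python
import re

# Area (3 digits), group (2 digits), serial (4 digits); hyphens are optional here.
SSN_RE = re.compile(r"([0-9]{3})-?([0-9]{2})-?([0-9]{4})")

def ssn_valid(value: str) -> bool:
    """Return True if value satisfies the area, group, and serial rules."""
    match = SSN_RE.fullmatch(value)
    if match is None:
        return False
    area, group, serial = match.groups()
    # No part may consist entirely of zeroes.
    if area == "000" or group == "00" or serial == "0000":
        return False
    # Area numbers 666 and 900-999 are excluded.
    return area != "666" and not area.startswith("9")
```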
US_STATE
Matches strings consistent with either a full name or two-letter abbreviation of a US state or territory.
Discovered.Country.US
Discovered.Entity.State
US_TOLLFREE_PHONE_NUMBER
Matches strings consistent with a US toll-free telephone number. Allowed area codes are 800, 88+any digit, or 899.
Discovered.Country.US
Discovered.Entity.Tollfree Telephone Number
VEHICLE_IDENTIFICATION_NUMBER
Matches strings consistent with Vehicle Identification Numbers. A checksum is required as well as a valid World Manufacturer Identifier.
Discovered.Country.US
Discovered.Entity.Vehicle Identifier or Serial Number