Data Processing, Encryption, and Masking Practices
Policy decision data is transmitted to ensure end users querying data are limited to the appropriate access as defined by the policies in Immuta.
A user runs a query against data in their environment.
The query is sent to the Immuta Web Service.
The Web Service queries the Metadata Database to obtain the policy definition, which includes data source metadata (tags, column names, etc.) and user entitlements (groups and attributes).
The policy information is transmitted to the remote data system for policy enforcement.
Query results are displayed based on the policy definition that was applied.
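The flow above can be sketched roughly as follows. This is an illustration only: the names `PolicyDefinition`, `MetadataDatabase`, and `handle_query`, and the in-memory dictionaries, are hypothetical stand-ins rather than Immuta's API. It shows that the lookup returns policy metadata and user entitlements, not the rows being queried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyDefinition:
    """Hypothetical shape of what is sent to the remote data system."""
    column_tags: dict   # data source metadata: column name -> tags
    groups: tuple       # user entitlements: group membership
    attributes: dict    # user entitlements: key/value attributes


class MetadataDatabase:
    """Stand-in for the Metadata Database; holds metadata only, no rows."""

    def __init__(self, column_tags, entitlements):
        self._column_tags = column_tags    # {data_source: {column: [tags]}}
        self._entitlements = entitlements  # {user: {"groups": [...], "attributes": {...}}}

    def policy_for(self, user, data_source):
        ent = self._entitlements[user]
        return PolicyDefinition(
            column_tags=self._column_tags[data_source],
            groups=tuple(ent["groups"]),
            attributes=dict(ent["attributes"]),
        )


def handle_query(metadata_db, user, data_source):
    """Web Service step: look up the policy definition to transmit."""
    return metadata_db.policy_for(user, data_source)


db = MetadataDatabase(
    column_tags={"customers": {"email": ["PII"], "region": []}},
    entitlements={"alice": {"groups": ["analysts"], "attributes": {"dept": "sales"}}},
)
policy = handle_query(db, "alice", "customers")
print(policy.column_tags)  # column names and tags only, never row values
```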
Sample data is processed and aggregated or reduced during Immuta's fingerprinting process and when specific policies are applied. Note: Data Owners can see sample data when editing a data source. However, this action requires the database password, and the small sample of data shown is displayed only in the UI and is not stored in Immuta.
In the Snowflake integration, statistical queries made during data source registration are distilled into summary statistics, called fingerprints. Fingerprinting allows Immuta to implement advanced privacy-enhancing masking and data policies.
During this process, query results return statistics (not data samples) about the data to Immuta (no PII is included). The fingerprinting process checks for new tables through schema monitoring (when enabled) and captures summary statistics of changes to data sources, including when policies were applied, external views were created, or sensitive data elements were added.
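To make the distinction between statistics and data samples concrete, here is a minimal sketch of reducing a column to summary statistics. The function name `fingerprint_column` and the particular statistics chosen are assumptions for illustration. The source does not say which statistics Immuta collects, and in the Snowflake integration they would come from queries run in the remote system rather than from Python.

```python
import statistics


def fingerprint_column(values):
    """Reduce a numeric column to summary statistics (hypothetical set).

    Only aggregates are returned; no individual cell value is kept.
    """
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "mean": statistics.fmean(non_null) if non_null else None,
        "stdev": statistics.pstdev(non_null) if non_null else None,
    }


ages = [34, 51, None, 34, 29]
summary = fingerprint_column(ages)
print(summary)
```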
Sample data is processed when randomized response policies are applied to Snowflake data sources.
Sample data exists temporarily in memory during the computation.
Raw data is processed for masking, producing either a distinct set of values or aggregated groups of values.
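As a rough illustration of the general randomized response technique on a categorical column (not Immuta's actual implementation), the sketch below first derives the distinct set of values and then, for each cell, either keeps the true value or substitutes one drawn from that set. The function name, the `keep_probability` parameter, and the uniform substitution rule are all assumptions.

```python
import random


def randomized_response(values, keep_probability, rng=None):
    """Keep each value with `keep_probability`, else substitute a random one."""
    rng = rng or random.Random()
    domain = sorted(set(values))  # distinct set of substitution values
    masked = []
    for value in values:
        if rng.random() < keep_probability:
            masked.append(value)               # report the true value
        else:
            masked.append(rng.choice(domain))  # report a random substitute
    return masked


states = ["OH", "CA", "OH", "NY", "CA", "OH"]
masked = randomized_response(states, keep_probability=0.7, rng=random.Random(0))
print(masked)
```

Because the substitution list in this sketch is drawn from the real column, the list itself can contain sensitive values.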
Immuta captures metadata and stores it in an internal PostgreSQL database. Organizations can encrypt the volumes backing the database using an external Key Management Service to ensure that data is encrypted at rest.
To encrypt data in transit, Immuta uses the TLS protocol, which the user configures per installation.
Immuta encrypts values with data encryption keys that are either system-generated or managed through an external key management service (KMS). Immuta recommends using a KMS to encrypt and decrypt data keys and supports the AWS Key Management Service. If no KMS is configured, Immuta generates a data encryption key on a user-defined rollover schedule, using the most recent data key to encrypt new values while preserving old data keys to decrypt old values.
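The rollover behavior can be sketched as a key ring in which the newest key encrypts and every retained key can still decrypt. Everything here is hypothetical: the `KeyRing` class, the key-ID scheme, and especially the cipher, which is a toy stand-in built from `hashlib` so that the example runs with the standard library alone. It is not the cipher Immuta uses and should not be used to protect real data.

```python
import hashlib
import secrets


def _keystream_xor(key, nonce, data):
    """Toy stream cipher (SHA-256 in counter mode). Placeholder only:
    not Immuta's cipher and not suitable for production use."""
    out = bytearray()
    for block_start in range(0, len(data), 32):
        counter = (block_start // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[block_start:block_start + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class KeyRing:
    """Hypothetical key ring: newest key encrypts, all keys can decrypt."""

    def __init__(self):
        self._keys = {}        # key_id -> key bytes
        self._current_id = 0
        self.rollover()

    def rollover(self):
        """Generate a new data key; older keys are kept for decryption."""
        self._current_id += 1
        self._keys[self._current_id] = secrets.token_bytes(32)

    def encrypt(self, plaintext):
        """Encrypt with the most recent key and record which key was used."""
        nonce = secrets.token_bytes(16)
        body = _keystream_xor(self._keys[self._current_id], nonce, plaintext)
        return (self._current_id, nonce, body)

    def decrypt(self, token):
        """Decrypt with whichever key the token names, old or new."""
        key_id, nonce, body = token
        return _keystream_xor(self._keys[key_id], nonce, body)


ring = KeyRing()
old_token = ring.encrypt(b"value written before rollover")
ring.rollover()
new_token = ring.encrypt(b"value written after rollover")
print(old_token[0], new_token[0])  # key IDs 1 and 2
print(ring.decrypt(old_token))     # the old key still decrypts the old value
```

Storing the key ID alongside each encrypted value is what lets old values remain readable after a rollover without re-encrypting them.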
Immuta employs three families of functions in its masking policies:
One-way Hashing: One-way (irreversible) hashing is performed via a salted SHA-256 hash. A consistent salt is used for values throughout the data source, so users can count or track specific values without revealing the true value. Because hashed values differ across data sources, users cannot join on hashed values. Note: Joining on masked values can be enabled in Immuta projects.
Reversible Masking: For reversible masking, values are encrypted using AES-256 CBC encryption. Encryption is performed using a cell-specific initialization vector. The resulting values can be unmasked by an authorized user. Note that this is dynamic encryption of individual fields as results are streamed to the querying system; Immuta is not modifying records in the data store.
Reversible Format Preserving Masking: Format preserving masking maintains the format of the data while masking the value, and is achieved by initializing and applying the NIST standard method FF1 at the column level. The resulting values can be unmasked by an authorized user. Note: This policy type is only available in the Snowflake integration.
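The one-way hashing behavior described above can be sketched with Python's standard library. The `DataSourceHasher` class is a hypothetical helper, and the way the salt is generated and combined with the value (random bytes prepended before hashing) is an assumption; the source specifies only that the hash is salted SHA-256 with a salt that is consistent within a data source.

```python
import hashlib
import secrets


class DataSourceHasher:
    """Hypothetical helper: one random salt per data source."""

    def __init__(self):
        self._salt = secrets.token_bytes(16)

    def mask(self, value):
        """Return the salted SHA-256 digest of a string value as hex."""
        return hashlib.sha256(self._salt + value.encode("utf-8")).hexdigest()


orders = DataSourceHasher()
customers = DataSourceHasher()

# Same value, same data source: identical hashes, so counting still works.
assert orders.mask("alice@example.com") == orders.mask("alice@example.com")
# Same value, different data source: different salts give different hashes,
# so the masked columns cannot be joined.
assert orders.mask("alice@example.com") != customers.mask("alice@example.com")
```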
Immuta communicates with remote databases over a TCP connection.
If a randomized response policy targets a column that contains PII, Immuta stores that PII in the Metadata Database in order to enforce the policy. If the list of substitution values for a categorical column is not part of the policy specification (e.g., when specified via the API), a list is obtained via query (which may contain PII) and merged into the policy definition in the Metadata Database. Immuta requires that you .