Google BigQuery Integration
Private preview: This integration is available to select accounts. Reach out to your Immuta representative for details.
The Google BigQuery integration allows users to query policy protected data directly in BigQuery as secure views within an Immuta-created dataset. Immuta controls who can see what within the views, allowing data governors to create complex ABAC policies and data users to query the right data within the BigQuery console.
Configuration
Google BigQuery is configured through the Immuta console and a script provided by Immuta. While you can complete some steps within the BigQuery console, it is easiest to install using gcloud and the Immuta script.
Protect your data
Once Google BigQuery has been configured, BigQuery admins can start creating subscription and data policies to meet compliance requirements and users can start querying policy protected data directly in BigQuery.
Revoke user access to the original datasets and grant users access to the Immuta created datasets in BigQuery.
Users query data from the Immuta created datasets directly in BigQuery.
FAQs
What permissions will Immuta have in my BigQuery environment?
What integration features will Immuta support for BigQuery?
For private preview, Immuta supports a basic version of the BigQuery integration where Immuta can enforce specific policies on data in a single BigQuery project. At this time, workspaces, tag ingestion, user impersonation, native query audit, and multiple integrations are not supported.
Google BigQuery integration conceptual overview
In this policy push integration, Immuta creates views that contain all policy logic. Each view has a 1-to-1 relationship with the original table. Access controls are applied in the view, allowing customers to leverage Immuta’s powerful set of attribute-based policies and query data directly in BigQuery.
BigQuery is organized by projects (which can be thought of as databases), datasets (which can be compared to schemas), tables, and views. When you enable the integration, an Immuta dataset is created in BigQuery that contains the Immuta-required user entitlements information. These objects within the Immuta dataset are intended to only be used and altered by the Immuta application.
After data sources are registered, Immuta uses the custom user and role, created before the integration is enabled, to push the Immuta data sources as views into a mirrored dataset of the original table. Immuta manages grants on the created view to ensure only users subscribed to the Immuta data source will see the data.
Secure views
Managing access
Following the principle of least privilege, Immuta does not have permission to manage Google Cloud Platform users, specifically in granting or denying access to a project and its datasets. This means that data governors should limit user access to original datasets to ensure data users are accessing the data through the Immuta created views and not the backing tables. The only users who need to have access to the backing tables are the credentials used to register the tables in Immuta.
Additionally, a data governor must grant users access to the mirrored datasets that Immuta will create and populate with views. Immuta and BigQuery’s best practice recommendation is to grant access via groups in Google Cloud Platform. Because users still must be registered in Immuta and subscribed to an Immuta data source to be able to query Immuta views, all Immuta users can be granted access to the mirrored datasets that Immuta creates.
Limitations
This integration can only be enabled through a manual bootstrap using the Immuta API.
This integration can only be enabled to work in a single region.
Supported policies
This integration supports the following policy types:
Column masking
Mask using hashing (SHA256())
Mask by making NULL
Mask using constant
Mask using a regular expression
Mask by date rounding
Mask by numeric rounding
Mask using custom functions
Row-level masking
Row visibility based on user attributes and/or object attributes
Only show rows that fall within a given time window
Minimize rows
Filter rows using custom WHERE clause
Always hide rows
Additional resources
See the resources below to start implementing and using the BigQuery integration:
Configure the Google BigQuery integration
Follow this guide to connect your Google BigQuery data warehouse to Immuta.
Prerequisites
Immuta SaaS or Immuta v2023.1 or newer with Google BigQuery integration (PrPr) enabled.
Google Cloud service account and role used by Immuta to connect to Google BigQuery
The Google BigQuery integration requires you to create a Google Cloud service account and role that will be used by Immuta to
create a Google BigQuery dataset that will be used to store a table of user entitlements, UDFs for policy enforcement, etc.
manage the table of user entitlements via updates when entitlements change in Immuta.
create datasets and secure views with access control policies enforced, which mirror tables inside of datasets you ingest as Immuta data sources.
You have two options to create the required Google Cloud service account and role:
The Immuta script
A new Google Cloud IAM role
A new Google Cloud service account, which will be granted the newly-created role
A JSON keyfile for the newly-created service account
Google Cloud IAM roles required to run the script
To execute bootstrap.sh
from your command line, you must be authenticated to the gcloud CLI utility as a user with all of the following roles:
roles/iam.roleAdmin
roles/iam.serviceAccountAdmin
roles/serviceusage.serviceUsageAdmin
Having these three roles is the least-privilege set of Google Cloud IAM roles required to successfully run the bootstrap.sh
script from your command line. However, having either of the following Google Cloud IAM roles will also allow you to run the script successfully:
roles/editor
roles/owner
Create a service account and role by running the script provided by Immuta
Set the account property in the core section for Google Cloud CLI to the account gcloud should use for authentication. (You can run gcloud auth list to see your currently available accounts):
In Immuta, navigate to the App Settings page and click the Integrations tab.
Click Add Native Integration and select Google BigQuery from the dropdown menu.
Click Select Authentication Method and select Key File.
Click Download Script(s).
Before you run the script, update your permissions to execute it:
Run the script, where
PROJECT_ID is the Google Cloud Platform project to operate on.
ROLE_ID is the name of the custom role to create.
NAME will create a service account with the provided name.
OUTPUT_FILE is the path where the resulting private key should be written. File system write permission will be checked on the specified path prior to the key creation.
undelete-role (optional) will undelete the custom role from the project. Roles that have been deleted for a long time can't be undeleted. This option can fail for the following reasons:
The role specified does not exist.
The active user does not have permission to access the given role.
enable-api (optional) provided you’ve been granted access to enable the Google BigQuery API, will enable the service.
Create a service account and role by using Google Cloud console
Alternatively, you may use the Google Cloud Console to create the prerequisite role, service account, and private key file for the integration to connect to Google BigQuery.
bigquery.datasets.create
bigquery.datasets.delete
bigquery.datasets.get
bigquery.datasets.update
bigquery.jobs.create
bigquery.jobs.get
bigquery.jobs.list
bigquery.jobs.listAll
bigquery.routines.create
bigquery.routines.delete
bigquery.routines.get
bigquery.routines.list
bigquery.routines.update
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.export
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
bigquery.tables.setCategory
bigquery.tables.update
bigquery.tables.updateData
bigquery.tables.updateTag
Enable the Google BigQuery integration
In Immuta, navigate to the App Settings page and click the Integrations tab.
Click Add Native Integration and select Google BigQuery from the dropdown menu.
Click Select Authentication Method and select Key File.
Project Id: The Google Cloud Platform project to operate on, where your Google BigQuery data warehouse is located. A new dataset will be provisioned in this Google BigQuery project to store the integration configuration.
Complete the following fields:
Immuta Dataset: The name of the Google BigQuery dataset to provision inside of the project. Important: if you are using multiple environments in the same Google BigQuery project, this dataset to provision must be unique across environments.
Dataset Suffix: The suffix that will be postfixed to the name of each dataset created to store secure views, one per dataset that you ingest a table for as a data source in Immuta. Important: if you are using multiple environments in the same Google BigQuery project, this suffix must be unique across environments.
GCP Location: The dataset’s location. After a dataset is created, the location can't be changed. Note that
If you choose EU for the dataset location, your Core BigQuery Customer Data resides in the EU.
Click Test Google BigQuery Integration.
Click Save.
GCP location must match dataset region
The region set for the GCP location must match the region of your datasets. Set GCP location to a general region (for example, US
) to include child regions.
Disable the Google BigQuery integration
You can disable the Google BigQuery integration automatically or manually.
Automatically disable integration
Click the App Settings icon, and then click the Integrations tab.
Select the Google BigQuery integration you would like to disable, and select the Disable Integration checkbox.
Click Save.
Manually disable integration
The privileges required to run the cleanup script are the same as the Google Cloud IAM roles required to run the bootstrap.sh
script.
Click the App Settings icon, and then click the Integrations tab.
Select the Google BigQuery integration you would like to disable, and click Download Scripts.
Click Save. Wait until Immuta has finished saving your configuration changes before proceeding.
Before you run the script, update your permissions to execute it:
Run the cleanup script.
Next steps
Last updated