Register an AWS Lake Formation Connection

Design partner: This connection is available to select accounts. Contact your Immuta representative for details.

The connection API is a REST API that allows users to register an AWS Lake Formation connection with Immuta using a single set of credentials rather than configuring an integration and creating data sources separately. Immuta can then manage and enforce access controls on your data through that connection. To manage your connection, see the Manage a connection reference guide.

Requirements

  • Immuta permission: APPLICATION_ADMIN

  • The AWS account credentials or role you provide for the Immuta service principal must have permissions to perform the following actions to register data and apply policies:

    • Glue Data Catalog actions

      • glue:GetDatabase

      • glue:GetTables

      • glue:GetDatabases

      • glue:GetTable

    • Lake Formation actions

      • lakeformation:ListPermissions

      • lakeformation:BatchGrantPermissions

      • lakeformation:BatchRevokePermissions

      • lakeformation:CreateLFTag

      • lakeformation:UpdateLFTag

      • lakeformation:DeleteLFTag

      • lakeformation:AddLFTagsToResource

      • lakeformation:RemoveLFTagsFromResource
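
As a quick sanity check before registering the connection, the required actions above can be compared against an IAM policy document. The following is a minimal sketch, not an Immuta or AWS tool: the function name missing_actions is hypothetical, and the check is deliberately simplified (it only looks at Action entries in Allow statements and ignores Deny statements, NotAction, conditions, and resource scoping).

```python
import fnmatch

# Actions listed in the Requirements section above.
REQUIRED_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetTables",
    "glue:GetDatabases",
    "glue:GetTable",
    "lakeformation:ListPermissions",
    "lakeformation:BatchGrantPermissions",
    "lakeformation:BatchRevokePermissions",
    "lakeformation:CreateLFTag",
    "lakeformation:UpdateLFTag",
    "lakeformation:DeleteLFTag",
    "lakeformation:AddLFTagsToResource",
    "lakeformation:RemoveLFTagsFromResource",
]


def missing_actions(policy):
    """Return the required actions not covered by any Allow statement.

    Simplified on purpose: wildcards in Action entries are honored, but
    Deny statements, NotAction, conditions, and Resource are not evaluated.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]

    patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM allows a single action string
            actions = [actions]
        # IAM action names are matched case-insensitively.
        patterns.extend(action.lower() for action in actions)

    return [
        required
        for required in REQUIRED_ACTIONS
        if not any(fnmatch.fnmatchcase(required.lower(), p) for p in patterns)
    ]
```

An empty list means every required action is matched by at least one Allow statement; it does not prove the policy is effective in your account.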

Prerequisites

  • Data lake is set up in AWS Lake Formation. The account in which this is set up is referred to as the admin account. This is the account that you will use to initially configure IAM and AWS Lake Formation permissions to give the Immuta service principal access to perform operations. The user in this account must be able to manage IAM permissions and Lake Formation permissions for all data in the Glue Data Catalog.

  • No AWS Lake Formation connections configured in the same Immuta instance for the same Glue Data Catalog.

  • The databases and tables you want Immuta to govern must be configured in AWS to respect the AWS Lake Formation permissions. Immuta cannot govern resources that use IAM access control or hybrid access mode.

1. Set up the Immuta service principal

The Immuta service principal is the IAM role that Immuta assumes to perform operations in your AWS account. This role must have all the necessary permissions in AWS Glue and AWS Lake Formation to allow Immuta to register data sources and apply policies.

  1. Create an AWS Account IAM role (select AWS account as the trusted entity type) that can be used by Immuta to set up the connection and orchestrate AWS Lake Formation policies. Immuta will assume this IAM role from Immuta's AWS account in order to perform any operations in your AWS account. Before proceeding, contact your Immuta representative for the AWS account to add to your trust policy. Then, complete the steps below.

  2. Add the following IAM permissions to the service principal from the admin account. These permissions will allow the service principal to register data sources and apply policies on Immuta's behalf.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "glue:GetDatabase",
            "glue:GetTables",
            "glue:GetDatabases",
            "glue:GetTable",
            "lakeformation:ListPermissions",
            "lakeformation:BatchGrantPermissions",
            "lakeformation:BatchRevokePermissions",
            "lakeformation:CreateLFTag",
            "lakeformation:UpdateLFTag",
            "lakeformation:DeleteLFTag",
            "lakeformation:AddLFTagsToResource",
            "lakeformation:RemoveLFTagsFromResource"
          ],
          "Resource": "*"
        }
      ]
    }

  3. Grant the service principal permissions on any tables that will be registered in Immuta. There are two ways to give the service principal these permissions: either create a new LF-Tag that gives the appropriate permissions and apply it to all databases or tables that Immuta will manage, or make the role a superuser in Lake Formation.
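
Step 1 above mentions a trust policy without showing one. The sketch below builds a standard cross-account trust policy document of the kind AWS uses for sts:AssumeRole, with an optional external ID condition. It is illustrative only: the function name build_trust_policy is hypothetical, the account ID must come from your Immuta representative, and trusting the account root principal is an assumption (Immuta may ask you to trust a more specific principal).

```python
import json


def build_trust_policy(trusted_account_id, external_id=None):
    """Build a cross-account trust policy for the Immuta service principal role.

    trusted_account_id: the AWS account ID supplied by your Immuta
        representative (not known to this sketch).
    external_id: optional value enforced through an sts:ExternalId condition.
    """
    statement = {
        "Effect": "Allow",
        # Assumption: the whole account is trusted via its root principal.
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # Placeholder values; replace them before using the output.
    policy = build_trust_policy("<immuta-aws-account-id>", "<your-external-id>")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the role's trust relationship. If you set an external ID here, the same value goes in connection.externalId when you create the connection.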

2. Create the connection in Immuta

POST /data/connection

Copy the request and update the <placeholder_values> with your connection details. Then submit the request.

Find descriptions of the editable attributes in the table below.

Test run: To validate the create connection payload before creating the connection, you can optionally send it as a dry run to POST /data/connection/test.

curl -X 'POST' \
    'https://<your-immuta-url>/data/connection' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: <your-bearer-token>' \
    -d '{
     "connectionKey": "<your-connection-key-name>",
     "connection": {
       "technology": "Glue",
       "authenticationType": "accessKey",
       "serviceARN": "<your-aws-glue-catalog-arn>",
       "accessKeyId": "<your-access-key-id>",
       "secretAccessKey": "<your-secret-access-key>"
     },
     "settings": {
        "isActive": false
     },
     "options": {
        "forceRecursiveCrawl": true
     }
    }'
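
For scripted use, the same request can be sketched in Python with only the standard library. This is an illustrative equivalent of the curl command, not an official client: build_connection_request is a hypothetical helper, the dry_run flag simply switches the path to the test endpoint named above, and the Authorization header is passed through exactly as in the curl example.

```python
import json
import urllib.request


def build_connection_request(base_url, token, payload, dry_run=False):
    """Build (but do not send) the POST request for creating a connection.

    With dry_run=True the payload is sent to /data/connection/test instead
    of /data/connection.
    """
    path = "/data/connection/test" if dry_run else "/data/connection"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": token,
        },
    )


# Payload mirroring the curl example; replace the placeholders first.
payload = {
    "connectionKey": "<your-connection-key-name>",
    "connection": {
        "technology": "Glue",
        "authenticationType": "accessKey",
        "serviceARN": "<your-aws-glue-catalog-arn>",
        "accessKeyId": "<your-access-key-id>",
        "secretAccessKey": "<your-secret-access-key>",
    },
    "settings": {"isActive": False},
    "options": {"forceRecursiveCrawl": True},
}

# To send the request (this performs a real network call):
# request = build_connection_request("https://immuta.example.com", "<your-bearer-token>", payload)
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to run the dry run first and then reuse the identical payload for the real call.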

Payload parameters

| Attribute | Description | Required |
| --- | --- | --- |
| connectionKey string | A unique name for the connection. | Yes |
| connection object | Configuration attributes of the AWS Lake Formation connection. | Yes |
| connection.technology string | The technology backing the new connection. | Yes |
| connection.authenticationType string | The authentication type used to register the connection, such as accessKey or assumedRole. | Yes |
| connection.serviceARN string | The Amazon resource name of the Glue Data Catalog that contains the data you want to register. | Yes |
| connection.accessKeyId string | The access key ID of an AWS account with the AWS permissions listed in the Set up the Immuta service principal section. | Required if authenticationType is accessKey. |
| connection.secretAccessKey string | The secret access key of an AWS account with the AWS permissions listed in the Set up the Immuta service principal section. | Required if authenticationType is accessKey. |
| connection.roleARN string | The Amazon resource name of the role Immuta will assume from Immuta's AWS account in order to perform any operations in your AWS account. | Required if authenticationType is assumedRole. |
| connection.externalId string | The external ID provided in a condition on the trust relationship for the cross-account IAM role specified above. | No |
| settings object | Specifications of the connection's settings, including active status. | No |
| settings.isActive boolean | When false, data objects will be inactive by default when created in Immuta. Set to false for the recommended configuration. | No |
| options object | Specification of the connection's default behavior for object crawls. | No |
| options.forceRecursiveCrawl boolean | If false, only active objects will be crawled. If true, both active and inactive data objects will be crawled; any child objects from inactive objects will be set as inactive. Set to true for the recommended configuration. | No |
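
The required and conditionally required attributes above can be checked client-side before submitting a payload. The sketch below is a hypothetical helper (validate_payload is not part of the Immuta API) that encodes only the rules stated in the table; the server remains the authority on what is valid. The example payload uses the assumedRole authentication type with placeholder values.

```python
# Attributes that become required for a given authenticationType,
# taken from the "Required" column of the table above.
CONDITIONAL_FIELDS = {
    "accessKey": ("accessKeyId", "secretAccessKey"),
    "assumedRole": ("roleARN",),
}


def validate_payload(payload):
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []
    if not payload.get("connectionKey"):
        errors.append("connectionKey is required")

    connection = payload.get("connection")
    if not isinstance(connection, dict):
        errors.append("connection is required")
        return errors

    for field in ("technology", "authenticationType", "serviceARN"):
        if not connection.get(field):
            errors.append(f"connection.{field} is required")

    auth_type = connection.get("authenticationType")
    for field in CONDITIONAL_FIELDS.get(auth_type, ()):
        if not connection.get(field):
            errors.append(
                f"connection.{field} is required when "
                f"authenticationType is {auth_type}"
            )
    return errors


# Example payload for the assumedRole authentication type (placeholders).
assumed_role_payload = {
    "connectionKey": "<your-connection-key-name>",
    "connection": {
        "technology": "Glue",
        "authenticationType": "assumedRole",
        "serviceARN": "<your-aws-glue-catalog-arn>",
        "roleARN": "<your-role-arn>",
        "externalId": "<your-external-id>",
    },
    "settings": {"isActive": False},
    "options": {"forceRecursiveCrawl": True},
}
```

Calling validate_payload(assumed_role_payload) returns an empty list; removing roleARN from the connection object produces a single error naming that attribute.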

Response schema

| Attribute | Description |
| --- | --- |
| objectPath array | The list of names that uniquely identify the path to a data object in the remote platform's hierarchy. The first element should be the associated connectionKey. |
| bulkId string | A bulk ID that can be used to search for the status of background jobs triggered by this request. |

Example response

{
  "objectPath": ["<your-connection-key-name>"],
  "bulkId": "a-new-uuid"
}
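
A script that creates connections will usually want to keep the bulkId for later status lookups. The following minimal sketch parses a response body shaped like the example above; the ConnectionResponse class and parse_connection_response function are hypothetical names, and the sketch assumes the body is valid JSON containing both fields.

```python
import json
from dataclasses import dataclass


@dataclass
class ConnectionResponse:
    object_path: list  # path elements; the first should be the connectionKey
    bulk_id: str       # identifier for background jobs triggered by the request


def parse_connection_response(body):
    """Parse a create-connection response body (a JSON string)."""
    data = json.loads(body)
    return ConnectionResponse(
        object_path=list(data["objectPath"]),
        bulk_id=data["bulkId"],
    )


if __name__ == "__main__":
    example = '{"objectPath": ["my-connection"], "bulkId": "a-new-uuid"}'
    response = parse_connection_response(example)
    print(response.object_path[0], response.bulk_id)
```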
