Register an AWS Lake Formation Connection
The connection API is a REST API that lets you register an AWS Lake Formation connection with Immuta using a single set of credentials, rather than configuring an integration and creating data sources separately. Immuta can then manage and enforce access controls on your data through that connection. To manage your connection, see the Manage a connection reference guide.
Requirements
Immuta permission: APPLICATION_ADMIN
The AWS account credentials or role you provide for the Immuta service principal must have permissions to perform the following actions to register data and apply policies:
glue:GetDatabase
glue:GetTables
glue:GetDatabases
glue:GetTable
lakeformation:ListPermissions
lakeformation:BatchGrantPermissions
lakeformation:BatchRevokePermissions
lakeformation:CreateLFTag
lakeformation:UpdateLFTag
lakeformation:DeleteLFTag
lakeformation:AddLFTagsToResource
lakeformation:RemoveLFTagsFromResource
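The actions above can be collected into a single IAM policy document. The following is a minimal sketch, not an official Immuta policy: the function name `build_immuta_policy` and the default `Resource` of `"*"` are assumptions made for illustration, and you may prefer to scope the resources more tightly.

```python
import json

# Actions copied from the requirements list above.
GLUE_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
]
LAKE_FORMATION_ACTIONS = [
    "lakeformation:ListPermissions",
    "lakeformation:BatchGrantPermissions",
    "lakeformation:BatchRevokePermissions",
    "lakeformation:CreateLFTag",
    "lakeformation:UpdateLFTag",
    "lakeformation:DeleteLFTag",
    "lakeformation:AddLFTagsToResource",
    "lakeformation:RemoveLFTagsFromResource",
]


def build_immuta_policy(resource="*"):
    """Return an IAM policy document allowing the listed actions.

    The resource scope is an assumption; narrow it to suit your account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": GLUE_ACTIONS + LAKE_FORMATION_ACTIONS,
                "Resource": resource,
            }
        ],
    }


# JSON text that could be pasted into the IAM console's policy editor.
policy_json = json.dumps(build_immuta_policy(), indent=2)
print(policy_json)
```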
Prerequisites
Data lake is set up in AWS Lake Formation. The account in which this is set up is referred to as the admin account. This is the account that you will use to initially configure IAM and AWS Lake Formation permissions to give the Immuta service principal access to perform operations. The user in this account must be able to manage IAM permissions and Lake Formation permissions for all data in the Glue Data Catalog.
No AWS Lake Formation connections configured in the same Immuta instance for the same Glue Data Catalog.
The databases and tables you want Immuta to govern must be configured in AWS to respect the AWS Lake Formation permissions. Immuta cannot govern resources that use IAM access control or hybrid access mode.
1. Set up the Immuta service principal
The Immuta service principal is the IAM role that Immuta assumes to perform operations in your AWS account. This role must have all the necessary permissions in AWS Glue and AWS Lake Formation to allow Immuta to register data sources and apply policies.
Create an AWS Account IAM role (select AWS account as the trusted entity type) that can be used by Immuta to set up the connection and orchestrate AWS Lake Formation policies. Immuta will assume this IAM role from Immuta's AWS account in order to perform any operations in your AWS account. Before proceeding, contact your Immuta representative for the AWS account to add to your trust policy. Then, complete the steps below.
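As a rough illustration of the trust policy mentioned in this step, the sketch below builds a standard cross-account trust document. The helper name `build_trust_policy`, the account ID `000000000000`, and the external ID value are placeholders, not values from Immuta; the principal your Immuta representative gives you may also take a different form than an account root ARN.

```python
import json


def build_trust_policy(immuta_account_id, external_id=None):
    """Return a trust policy letting another AWS account assume this role.

    immuta_account_id is a placeholder for the account your Immuta
    representative provides. external_id is optional and, when set, adds
    the usual sts:ExternalId condition.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{immuta_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder values only; substitute the real account ID and external ID.
trust_policy = build_trust_policy("000000000000", external_id="example-external-id")
print(json.dumps(trust_policy, indent=2))
```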
Add the IAM permissions listed under Requirements to the service principal from the admin account. These permissions will allow the service principal to register data sources and apply policies on Immuta's behalf.
Grant the service principal permissions on any tables that will be registered in Immuta. There are two ways to give the service principal these permissions: either make a new LF-Tag that gives the appropriate permissions and apply it to all databases or tables that Immuta will manage, or make the role a superuser in Lake Formation.
The LF-Tag method follows the principle of least privilege and is the most flexible way of granting permissions to the service principal. LF-Tags cascade down from databases to tables while allowing for exceptions: when you apply the tag to a database, it automatically applies to all tables within that database, and you can remove it from any tables that should be outside the scope of Immuta’s governance.
Create a new LF-Tag, giving yourself permissions to grant that tag to a user, which will ultimately be your service principal.
In the Lake Formation console, navigate to LF-Tags and permissions and click Add LF-Tag.
Create a tag key and value.
On the LF-Tag key-value pair, grant the ASSOCIATE LF-Tag permission to your own IAM principal.
Grant this tag to the Immuta service principal.
In the Lake Formation console, navigate to Data permissions and click Grant.
Enter the service principal’s IAM role.
Add the key-value pair of the LF-Tag you created earlier.
Under Table Permissions, select the following grantable permissions: SELECT, DESCRIBE, INSERT, DELETE.
Click Grant.
Apply this LF-Tag to the resources you would like Immuta to govern. The Immuta service principal will now have the minimum required permissions on these resources.
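The console steps above could also be scripted. The sketch below only builds the keyword arguments for the corresponding Lake Formation API operations, so it runs without AWS access. The function name `lf_tag_requests` and all ARNs, tag names, and database names are hypothetical, and setting both `Permissions` and `PermissionsWithGrantOption` to the same four values is an assumption about how the console selection maps to the API.

```python
def lf_tag_requests(role_arn, admin_arn, tag_key, tag_value, database):
    """Return (operation_name, kwargs) pairs mirroring the console steps.

    Operation names match boto3 Lake Formation client methods.
    """
    table_permissions = ["SELECT", "DESCRIBE", "INSERT", "DELETE"]

    def tag():
        # Fresh dict each time so the requests do not share mutable state.
        return {"TagKey": tag_key, "TagValues": [tag_value]}

    return [
        # Create the LF-Tag key and value.
        ("create_lf_tag", tag()),
        # Let your own IAM principal associate the tag with resources.
        ("grant_permissions", {
            "Principal": {"DataLakePrincipalIdentifier": admin_arn},
            "Resource": {"LFTag": tag()},
            "Permissions": ["ASSOCIATE"],
        }),
        # Grant the tag-based table permissions to the service principal.
        ("grant_permissions", {
            "Principal": {"DataLakePrincipalIdentifier": role_arn},
            "Resource": {
                "LFTagPolicy": {"ResourceType": "TABLE", "Expression": [tag()]}
            },
            "Permissions": list(table_permissions),
            "PermissionsWithGrantOption": list(table_permissions),
        }),
        # Apply the tag to a database; it cascades to the tables within it.
        ("add_lf_tags_to_resource", {
            "Resource": {"Database": {"Name": database}},
            "LFTags": [tag()],
        }),
    ]


requests_to_send = lf_tag_requests(
    role_arn="arn:aws:iam::111122223333:role/immuta-service-principal",
    admin_arn="arn:aws:iam::111122223333:user/lake-admin",
    tag_key="immuta",
    tag_value="governed",
    database="sales",
)
for operation, kwargs in requests_to_send:
    print(operation, kwargs)
```

With boto3 installed and admin-account credentials configured, each pair could be sent with `getattr(boto3.client("lakeformation"), operation)(**kwargs)`.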
2. Create the connection in Immuta
POST /data/connection
Copy the request and update the <placeholder_values> with your connection details. Then submit the request.
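A request of this kind might be assembled as in the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented facts: the host `immuta.example.com`, the `Bearer` authorization scheme, the helper name `build_connection_request`, and the choice to send `settings` and `options` as JSON objects. The `technology` value is left as a placeholder because its accepted values are not stated here.

```python
import json
import urllib.request


def build_connection_request(base_url, api_key, payload):
    """Build (but do not send) a POST request to /data/connection."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/data/connection",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header format; check how your instance expects API keys.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Placeholder values throughout; replace them with your connection details.
payload = {
    "connectionKey": "<your-connection-key>",
    "connection": {
        "technology": "<technology>",
        "authenticationType": "assumedRole",
        "serviceARN": "<glue-data-catalog-arn>",
        "roleARN": "<service-principal-role-arn>",
        "externalId": "<external-id>",
    },
    "settings": {"isActive": False},
    "options": {"forceRecursiveCrawl": True},
}

request = build_connection_request("https://immuta.example.com", "<api-key>", payload)
# urllib.request.urlopen(request) would submit it; omitted so the sketch
# has no side effects.
print(request.get_method(), request.full_url)
```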
Find descriptions of the editable attributes in the table below.
Payload parameters
connectionKey (string): A unique name for the connection. Required: Yes
connection (object): Configuration attributes of the AWS Lake Formation connection. Required: Yes
connection.technology (string): The technology backing the new connection. Required: Yes
connection.authenticationType (string): The authentication type to register the connection. Required: Yes
connection.serviceARN (string): The Amazon resource name of the Glue Data Catalog that contains the data you want to register. Required: Yes
connection.accessKeyId (string): The access key ID of an AWS account with the AWS permissions listed in the set up the Immuta service principal section. Required: if authenticationType is accessKey
connection.secretAccessKey (string): The secret access key of an AWS account with the AWS permissions listed in the set up the Immuta service principal section. Required: if authenticationType is accessKey
connection.roleARN (string): The Amazon resource name of the role Immuta will assume from Immuta's AWS account in order to perform any operations in your AWS account. Required: if authenticationType is assumedRole
connection.externalId (string): The external ID provided in a condition on the trust relationship for the cross-account IAM role specified above. Required: No (optional)
settings (array): Specifications of the connection's settings, including active status. Required: No
settings.isActive (boolean): When false, data objects will be inactive by default when created in Immuta. Set to false for the recommended configuration. Required: No
options (array): Specification of the connection's default behavior for object crawls. Required: No
options.forceRecursiveCrawl (boolean): If false, only active objects will be crawled. If true, both active and inactive data objects will be crawled; any child objects from inactive objects will be set as inactive. Set to true for the recommended configuration. Required: No
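The conditional requirements in the parameter list can be checked before submitting a request. The following is a small sketch of such a check; the function name `missing_connection_fields` is hypothetical and the logic only reflects the "Required" notes above, not any validation Immuta itself performs.

```python
# Fields that become required for each authenticationType, per the list above.
REQUIRED_BY_AUTH_TYPE = {
    "accessKey": ("accessKeyId", "secretAccessKey"),
    "assumedRole": ("roleARN",),
}
ALWAYS_REQUIRED = ("technology", "authenticationType", "serviceARN")


def missing_connection_fields(payload):
    """Return the dotted names of required attributes absent from payload."""
    missing = []
    if not payload.get("connectionKey"):
        missing.append("connectionKey")

    connection = payload.get("connection") or {}
    for name in ALWAYS_REQUIRED:
        if not connection.get(name):
            missing.append(f"connection.{name}")

    # Add the fields that depend on the chosen authentication type.
    auth_type = connection.get("authenticationType")
    for name in REQUIRED_BY_AUTH_TYPE.get(auth_type, ()):
        if not connection.get(name):
            missing.append(f"connection.{name}")
    return missing


example = {
    "connectionKey": "my-lake-formation",
    "connection": {
        "technology": "<technology>",
        "authenticationType": "assumedRole",
        "serviceARN": "<glue-data-catalog-arn>",
    },
}
print(missing_connection_fields(example))  # ['connection.roleARN']
```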
Response schema
objectPath (string): The list of names that uniquely identify the path to a data object in the remote platform's hierarchy. The first element should be the associated connectionKey.
bulkId (string): A bulk ID that can be used to search for the status of background jobs triggered by this request.
Example response
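A response consistent with the schema above might be handled as in the sketch below. The sample body is invented for illustration (the connection key and bulk ID are not real output), and the helper name `check_connection_response` is hypothetical.

```python
import json


def check_connection_response(body, connection_key):
    """Validate the documented response fields and return the bulk ID."""
    data = json.loads(body)

    object_path = data.get("objectPath")
    if not isinstance(object_path, list) or not object_path:
        raise ValueError("objectPath should be a non-empty list")
    # The schema says the first element should be the connectionKey.
    if object_path[0] != connection_key:
        raise ValueError("objectPath does not start with the connectionKey")

    bulk_id = data.get("bulkId")
    if not isinstance(bulk_id, str):
        raise ValueError("bulkId should be a string")
    return bulk_id


# Invented sample that matches the response schema.
sample_body = '{"objectPath": ["my-lake-formation"], "bulkId": "example-bulk-id"}'
bulk_id = check_connection_response(sample_body, "my-lake-formation")
print(bulk_id)  # example-bulk-id
```

The returned bulk ID is the value you would use to look up the status of the background jobs triggered by the request.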