S3 Integration

Audience: Data Owners, Data Users, and System Administrators

Content Summary: Immuta supports an S3-style REST API, which allows you to communicate with Immuta the same way you would with S3. Consequently, Immuta easily integrates with tools you may already be using to work with S3.

S3 as a Filesystem

In this integration, Immuta implements a single bucket, with each data source exposed as a subdirectory under that bucket, because some S3 tools support only the newer virtual-hosted-style requests, in which the bucket name is part of the hostname.

The three APIs outlined below support basic S3 functionality; the request and response formats for each are identical to those in S3.

GET Bucket

This request returns the bucket configured within Immuta.

Method    Path    Successful Status Code
GET       /s3p    200

GET Bucket Contents

This request returns the contents of the given bucket.

Method    Path             Successful Status Code
GET       /s3p/{bucket}    200

GET Object

This request returns a stream from the requested object within Immuta.

Method    Path                                 Successful Status Code
GET       /s3p/{bucket}/{dataSource}/{key*}    200

Example Request:

curl \
    --request GET \
    --header "Authorization: AWS <API KEY>:immuta" \
    https://demo.immuta.com/s3p/immuta/my_data_source/path/to/file/myfile.json
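
The same GET Object request can be assembled from Python with only the standard library. This is a minimal sketch rather than an Immuta-provided client: the helper name `build_get_object_request` is hypothetical, and the host, data source, and key are the placeholder values from the curl example above.

```python
from urllib.parse import quote
from urllib.request import Request


def build_get_object_request(base_url, api_key, data_source, key, bucket="immuta"):
    """Build (but do not send) a GET Object request for the S3-style API."""
    # Path layout listed above: /s3p/{bucket}/{dataSource}/{key*}
    # quote() leaves "/" intact, so nested keys keep their structure.
    path = "/".join([bucket, quote(data_source), quote(key)])
    url = f"{base_url.rstrip('/')}/s3p/{path}"
    # Same header as the curl example: "AWS <API KEY>:immuta"
    headers = {"Authorization": f"AWS {api_key}:immuta"}
    return Request(url, headers=headers, method="GET")


request = build_get_object_request(
    "https://demo.immuta.com",
    "<API KEY>",
    "my_data_source",
    "path/to/file/myfile.json",
)
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request and return the object as a readable stream.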

Example: HTTP Request and Response

GET Bucket Example Request:

curl \
    --request GET \
    --header "Authorization: AWS <API KEY>:immuta" \
    "https://demo.immuta.com/s3p/immuta?delimiter=/&prefix=my_data_source/path/to/file"

Note: There is a single file in the requested directory.

GET Bucket Example Response:

<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://doc.s3.amazonaws.com/2006-03-01/">
    <IsTruncated>false</IsTruncated>
    <Marker></Marker>
    <Name>immuta</Name>
    <Prefix>my_data_source/path/to/file</Prefix>
    <MaxKeys>1000</MaxKeys>
    <Delimiter>/</Delimiter>
    <Contents>
        <Key>my_data_source/path/to/file/myfile.json</Key>
        <LastModified>2018-11-05T21:25:04.000Z</LastModified>
        <ETag>5b0810c82a69a70e552cece19b20585fc94b67fe4eaa8b</ETag>
        <Size>389</Size>
        <StorageClass>STANDARD</StorageClass>
        <Owner>
            <ID>Immuta</ID>
            <DisplayName>Immuta</DisplayName>
        </Owner>
    </Contents>
</ListBucketResult>
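
Because the response is standard S3 listing XML, it can be parsed with Python's built-in XML module. The sketch below is illustrative only: the function name `list_keys_and_sizes` is hypothetical, the sample document is a trimmed copy of the response above, and the namespace is taken from that response.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example response above.
NS = {"s3": "http://doc.s3.amazonaws.com/2006-03-01/"}


def list_keys_and_sizes(xml_bytes):
    """Return (key, size_in_bytes) pairs from a ListBucketResult document."""
    root = ET.fromstring(xml_bytes)
    return [
        (
            item.findtext("s3:Key", namespaces=NS),
            int(item.findtext("s3:Size", namespaces=NS)),
        )
        for item in root.findall("s3:Contents", NS)
    ]


# Trimmed copy of the example response, as bytes (as an HTTP body would be).
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://doc.s3.amazonaws.com/2006-03-01/">
    <Name>immuta</Name>
    <Contents>
        <Key>my_data_source/path/to/file/myfile.json</Key>
        <Size>389</Size>
    </Contents>
</ListBucketResult>"""

print(list_keys_and_sizes(sample))
```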

Example: Using Boto 3 to Download Objects

Boto 3 is the official Amazon Web Services client SDK for Python and is widely used by developers for accessing S3 objects. With Immuta's S3 integration, Immuta users can use boto3 to download policy-enforced files or tables.

The first step is to create a Session object and, from it, an S3 client that points to your Immuta endpoint and is authenticated with a user-specific API key.

import boto3

session = boto3.session.Session()

# The access key is your Immuta API key; the secret key is the literal string 'immuta'.
s3_client = session.client(
    service_name='s3',
    aws_access_key_id='<YOUR_USER_API_KEY>',
    aws_secret_access_key='immuta',
    endpoint_url='https://<YOUR_IMMUTA_URL>:443/s3p'
)

To find out what objects are available for download, you can list the objects in the immuta bucket. To filter down to a particular data source, pass in a Prefix that corresponds to the SQL table name of your Immuta data source.

bucket_contents = s3_client.list_objects(
    Bucket = 'immuta',
    Delimiter = '/',
    Prefix = '<SQL_TABLE_NAME>'
).get("Contents")

print(bucket_contents[0])

Example output:

{
    'Key': '<SQL_TABLE_NAME>/<SINGLE_OBJECT_KEY>',
    'ETag': 'aa0492082b95c5d8bb90377a006e...',
    'StorageClass': 'STANDARD',
    'Owner': {'DisplayName': 'Immuta', 'ID': 'Immuta'}
}
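
The listing step above can be wrapped in a small helper that returns only the object keys. This is a sketch under assumptions: `list_object_keys` is a hypothetical name, and the empty-list default guards against a response that has no "Contents" entry, which is how boto3 reports an empty S3 listing and may also apply here.

```python
def list_object_keys(s3_client, table_name, bucket="immuta"):
    """Return the object keys listed under one data source's prefix."""
    response = s3_client.list_objects(
        Bucket=bucket,
        Delimiter="/",
        Prefix=table_name,
    )
    # "Contents" may be missing when nothing matches the prefix,
    # so fall back to an empty list instead of failing on None.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

Calling `list_object_keys(s3_client, '<SQL_TABLE_NAME>')` with the client created earlier would return a plain list of keys that can be passed to `download_file`.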

Once you have an object key, you can use the download_file method to download the object to your local development environment.

s3_client.download_file(
    Bucket = "immuta",
    Key = "<SQL_TABLE_NAME>/<SINGLE_OBJECT_KEY>",
    Filename = "<OUTPUT_FILE_PATH>"
)
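
To download several objects at once, the `download_file` call can be placed in a loop. The following is a minimal sketch, not part of the Immuta integration: `download_objects` and its parameters are hypothetical, and it assumes each key should be saved under a local directory that mirrors the key's path.

```python
import os


def download_objects(s3_client, keys, dest_dir, bucket="immuta"):
    """Download each key into dest_dir, mirroring the key's directory layout."""
    saved = []
    for key in keys:
        # Keys use "/" separators; map them onto the local filesystem.
        local_path = os.path.join(dest_dir, *key.split("/"))
        # Create parent directories before boto3 writes the file.
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
        s3_client.download_file(Bucket=bucket, Key=key, Filename=local_path)
        saved.append(local_path)
    return saved
```

The function returns the local paths it wrote, which makes it easy to hand the files to whatever processing step comes next.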