Object-Backed Data Source Tutorial

Audience: Data Owners

Content Summary: Object-backed data sources are storage technologies that do not support SQL; they range from NoSQL databases to blob stores, filesystems, and APIs. Object-backed data sources act like key/value stores and are often called ingested sources because Immuta must ingest metadata about the data source before it can provide access and apply policy restrictions. Data Owners give Immuta metadata about the blobs they are exposing so that Immuta knows how to reach the blobs and apply policies to them.
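To make the key/value idea concrete, here is a minimal Python sketch of a metadata catalog. It is illustrative only and is not Immuta's API: the names BlobMetadata and MetadataCatalog, and every field on them, are hypothetical. The sketch only shows how keeping one metadata record per blob key lets a system locate and filter blobs without reading the blobs themselves.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BlobMetadata:
    """Hypothetical metadata record for one blob (field names are assumptions)."""
    key: str                 # identifier used to fetch the blob from the backing store
    size_bytes: int
    last_modified: datetime
    tags: tuple = ()         # free-form labels that a policy could match on


class MetadataCatalog:
    """Hypothetical in-memory catalog: blob key -> metadata record."""

    def __init__(self) -> None:
        self._records: dict = {}

    def ingest(self, record: BlobMetadata) -> None:
        # Re-ingesting the same key overwrites the older record.
        self._records[record.key] = record

    def lookup(self, key: str) -> Optional[BlobMetadata]:
        # Returns None when no metadata has been ingested for the key.
        return self._records.get(key)

    def keys_with_tag(self, tag: str) -> list:
        # Filter on metadata alone; the blobs themselves are never opened.
        return sorted(k for k, r in self._records.items() if tag in r.tags)


catalog = MetadataCatalog()
catalog.ingest(BlobMetadata(
    key="reports/2020/q1.csv",
    size_bytes=2048,
    last_modified=datetime(2020, 4, 1, tzinfo=timezone.utc),
    tags=("finance",),
))
catalog.ingest(BlobMetadata(
    key="images/logo.png",
    size_bytes=512,
    last_modified=datetime(2020, 1, 15, tzinfo=timezone.utc),
    tags=("public",),
))
```

In this sketch, a blob with no ingested record is simply invisible to lookups, which mirrors why metadata must be ingested before a blob can be exposed.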

This guide outlines the process of creating object-backed data sources, such as Amazon S3, Apache HDFS, Azure Blob Storage, Custom, ElasticSearch, FTP, MongoDB, and Persisted.

If your storage technology is not listed above, refer to the Query-backed Data Source Tutorial.

Step 1: Create a New Data Source

To create a new data source:

  1. Click the plus button in the bottom left corner of the Immuta console.
  2. Select the Data Source icon.

Alternatively:

  1. Navigate to the My Data Sources page.
  2. Click the New Data Source button.

Step 2: Select Your Storage Technology

Select the storage technology containing the data you wish to expose by clicking a tile. Please note that the list of enabled technologies is configurable and may differ from the image below.

Data Source Creation Select Backend

Select from the list below for specific instructions on creating a data source for your chosen storage technology. If your storage technology is not listed, please refer to the Query-backed Data Source Tutorial.

Step 3: Enter Basic Information

Here you provide information about your source that makes it discoverable to users.

  1. Complete the Data Source Name field, which will be the name shown in the Immuta UI.
  2. Enter the Immuta S3 Folder, which is the name of the Immuta S3 folder that corresponds to this data source. Note that for object-backed data sources, this folder will only store metadata about blobs in this data source.

    Data Source Creation Basic Information

Step 4: Manually Re-crawl Data Sources

Some object-backed data sources can be manually re-crawled to fetch fresh metadata about the data objects. If your data source is not set up to ingest the metadata automatically, you may need to perform this action from time to time.

  1. Navigate to the Data Source Overview page.
  2. Click the menu icon in the upper right corner and select Re-crawl.

    Data Source Re-crawl
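As a rough illustration of what a re-crawl accomplishes, the following Python sketch walks a local directory, records size and modification time for each file, and compares the result with an earlier snapshot. This is an assumption-laden analogy and not how Immuta implements re-crawling: the functions crawl and diff_snapshots are hypothetical, and a local filesystem stands in for whatever storage technology backs the data source.

```python
import os


def crawl(root):
    """Return {relative_path: (size_bytes, mtime_ns)} for every file under root.

    Hypothetical helper: a local directory stands in for the backing store.
    """
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            # Normalize separators so keys look the same on every platform.
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            snapshot[rel_path] = (info.st_size, info.st_mtime_ns)
    return snapshot


def diff_snapshots(old, new):
    """Compare two crawl results and report what a fresh crawl would pick up."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}
```

Calling crawl on the same directory before and after files change, then passing both results to diff_snapshots, shows which metadata entries would be stale until the next crawl. That is the situation a manual re-crawl is meant to correct when metadata is not ingested automatically.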