Create a Data Source
The V2 API is built to easily enable an “as-code” approach to managing your data sources, so each time you POST data to this endpoint, you must provide complete details of what you want in Immuta. The two examples below illustrate this design:
If you POST once explicitly defining a single table under sources, and then POST a second time with a different table, this will result in a single data source in Immuta pointing to the second table, and the first data source will be deleted or disabled (depending on the value specified for hardDelete).
If you POST once with two tableTags specified (e.g., Tag.A and Tag.B) and do a follow-up POST with tableTags: [Tag.C], only Tag.C will exist on all of the tables specified; tags Tag.A and Tag.B will be removed from all the data sources.
Note: If you are frequently using the v2 API to update data tags, consider using the custom REST catalog integration instead.
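To make the tableTags scenario concrete, the two request bodies below could be POSTed one after the other. This is only a sketch: the connection values are the placeholders used in the examples on this page, and it assumes tableTags is supplied under the options object, as described in the options section.

```yaml
# First POST: every registered table is tagged Tag.A and Tag.B.
connectionKey: my-databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: tpc
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/11101101
  handler: Databricks
options:
  tableTags:
    - Tag.A
    - Tag.B
---
# Follow-up POST with the same connectionKey: the payload is treated as the
# complete desired state, so Tag.A and Tag.B are removed and only Tag.C remains.
connectionKey: my-databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: tpc
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/11101101
  handler: Databricks
options:
  tableTags:
    - Tag.C
```

The `---` separator is used here only to show both bodies in one listing; each body would be sent in its own request.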
Through this endpoint, you can create or update all data sources for a given schema or database.
POST /api/v2/data
Create or update data sources.
Required Immuta permission: CREATE_DATA_SOURCE
# Example 1: no sources are listed, so all objects from the connection are registered.
connectionKey: my-databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: tpc
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/11101101
  handler: Databricks

# Example 2: only the listed tables are registered, with tags and naming overrides.
connectionKey: my-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-11123
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: data
    tags:
      table:
        - PCI
        - SENSITIVE
      columns:
        - columnName: transaction_date
          tags:
            - PCI
            - DATE
  - table: crime_data
    schema: data
    naming:
      datasource: Crime Data
      table: crime_data
      schema: databricks
Path parameters
dryRun boolean
If true, no updates will actually be made.
Optional
Default: false
wait number
The number of seconds to wait for data sources to be created before returning. Anything less than 0 will wait indefinitely.
Optional
Default: 0
Body parameters
The body of the request contains the details of the data source you want to create. The following table describes the attributes you can include in the body.
connectionKey string
A key/name to uniquely identify this collection of data sources.
Required
nameTemplate object
A template to override naming conventions. If not provided, system defaults will be used.
Optional
options object
Override options for these data sources. If not provided, system defaults will be used.
Optional
sources array
Configure which data sources are created. If not provided, all objects from the given connection will be created.
Optional
connection object
The connection object specifies the connection details required to connect to your data source. The tables below describe its child attributes.
handler
Set to Snowflake.
Required
ssl boolean
Set to true to enable SSL communication with the remote database.
Optional
database string
The database name.
Required
schema string
The schema in the remote database.
Optional
hostname string
The hostname of the remote database instance.
Required
port number
The port of the remote database instance.
Optional
warehouse string
The default pool of compute resources Immuta will use to run queries and other Snowflake operations.
Required
connectionStringOptions string
Additional connection string options to be used when connecting to the remote database.
Optional
authenticationMethod string
The type of authentication method to use. Options include userPassword, keyPair, and oAuthClientCredentials.
Required
username string
The username used to connect to the remote database.
Required if using userPassword or keyPair.
password string
The password used to connect to the remote database.
Required if using userPassword.
useCertificate boolean
Set to true when using client certificate credentials to request an access token. Otherwise, set to false to use client secret.
Required if using oAuthClientCredentials.
userFiles object
Details about the files required for the request.
Required if using keyPair or oAuthClientCredentials with useCertificate set to true.
keyName string
The connection name of the key file. Must be PRIV_KEY_FILE if using keyPair, or must be oauth client certificate if using oAuthClientCredentials.
Required if using keyPair or oAuthClientCredentials with useCertificate set to true.
content string
The content of the file, base-64 encoded.
Required if using keyPair or oAuthClientCredentials with useCertificate set to true.
userFilename string
The name of the file - for display in the UI.
Required if using keyPair or oAuthClientCredentials with useCertificate set to true.
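A minimal sketch of a Snowflake connection object using userPassword authentication, built only from the attributes listed above. The hostname, database, schema, warehouse, and username values are placeholders, and `${SNOWFLAKE_PASSWORD}` is a hypothetical variable name that follows the substitution style of the examples earlier on this page.

```yaml
connection:
  handler: Snowflake
  hostname: example.snowflakecomputing.com   # placeholder account hostname
  port: 443
  ssl: true
  database: ANALYTICS                        # placeholder database
  schema: PUBLIC                             # optional
  warehouse: IMMUTA_WH                       # placeholder warehouse
  authenticationMethod: userPassword
  username: IMMUTA_SERVICE_USER              # placeholder user
  password: "${SNOWFLAKE_PASSWORD}"          # hypothetical variable name
```

With keyPair or oAuthClientCredentials, the password attribute would be replaced by the userFiles details described above.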
handler
Set to Databricks.
Required
ssl boolean
Set to true to enable SSL communication with the remote database.
Optional
database string
The database name.
Optional
hostname string
The hostname of the remote database instance.
Required
port number
The port of the remote database instance.
Optional
connectionStringOptions string
Additional connection string options to be used when connecting to the remote database.
Optional
authenticationMethod string
The type of authentication method to use. Options include oAuthM2M and token.
Required
token string
The Databricks personal access token for the service principal created for Immuta.
Required if using token authentication.
useCertificate boolean
Set to true when using client certificate credentials to request an access token. Otherwise, set to false to use a client secret.
Required if using oAuthM2M.
clientId string
The client identifier of the Immuta service principal you configured. This is the client ID displayed in Databricks when creating the client secret for the service principal.
Required if using oAuthM2M.
audience string
The audience for the OAuth Client Credential token request.
Required if using oAuthM2M.
clientSecret string
An application password that an app can use in place of a certificate to identify itself.
Required if using oAuthM2M and useCertificate is set to false.
certificateThumbprint string
The certificate thumbprint to use to generate the JWT for the OAuth Client Credential request.
Required if using oAuthM2M and useCertificate is set to true.
scope string
The scope limits the operations and roles allowed in Databricks by the access token. See the OAuth 2.0 documentation for details about scopes.
Optional
httpPath string
The HTTP path of your Databricks cluster or SQL warehouse.
Required
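As an illustration of how the Databricks attributes above fit together, here is a sketch of a connection object using oAuthM2M with a client secret. The hostname and httpPath reuse the placeholder values from the examples on this page; the clientId and audience values and the `${DATABRICKS_CLIENT_SECRET}` variable name are hypothetical.

```yaml
connection:
  handler: Databricks
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  httpPath: sql/protocolv1/o/0/1110-11123
  authenticationMethod: oAuthM2M
  useCertificate: false                         # false selects the client secret
  clientId: "<client-id-of-service-principal>"  # placeholder
  audience: "<oauth-audience>"                  # placeholder
  clientSecret: "${DATABRICKS_CLIENT_SECRET}"   # hypothetical variable name
```

If useCertificate were set to true, certificateThumbprint would be supplied instead of clientSecret.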
handler
Set to Redshift.
Required
ssl boolean
Set to true to enable SSL communication with the remote database.
Optional
database string
The database name.
Optional
schema string
The schema in the remote database.
Required
connectionStringOptions string
Additional connection string options to be used when connecting to the remote database.
Optional
hostname string
The hostname of the remote database instance.
Required
port number
The port of the remote database instance.
Optional
authenticationMethod string
The type of authentication method to use. Options include userPassword and okta.
Required
username string
The username used to connect to the remote database.
Required
password string
The password used to connect to the remote database.
Required
idpHost string
The Okta identity provider host URL.
Required if using okta.
appID string
The Okta application ID.
Required if using okta.
role string
The Okta role.
Required if using okta.
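The Redshift attributes above could be combined as in the following sketch, which uses okta authentication. Every value is a placeholder chosen for illustration (5439 is the usual Redshift port), and `${REDSHIFT_PASSWORD}` is a hypothetical variable name.

```yaml
connection:
  handler: Redshift
  hostname: your-cluster.example.redshift.amazonaws.com  # placeholder
  port: 5439
  ssl: true
  database: dev                        # placeholder database
  schema: public                       # placeholder schema
  authenticationMethod: okta
  username: immuta_service_user        # placeholder user
  password: "${REDSHIFT_PASSWORD}"     # hypothetical variable name
  idpHost: your-org.okta.com           # placeholder Okta host
  appID: "<okta-application-id>"       # placeholder
  role: "<okta-role>"                  # placeholder
```

With userPassword authentication, the idpHost, appID, and role attributes would be omitted.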
handler
Set to Google BigQuery, Presto, or Trino.
ssl boolean
Set to true to enable SSL communication with the remote database.
database string
The database name.
schema string
The schema in the remote database.
userFiles array
Array of objects; each object must have keyName (corresponds to a connection string option), content (base-64 encoded content), and userFilename (the name of the file - for display purposes in the app).
connectionStringOptions string
Additional connection string options to be used when connecting to the remote database.
hostname string
The hostname of the remote database instance.
port number
The port of the remote database instance.
authenticationMethod string
The type of authentication method to use. Starburst (Trino) and Trino (Presto) options include No Authentication, LDAP Authentication, or Kerberos Authentication. Google BigQuery (Google BigQuery) option is keyFile.
username string
The username used to connect to the remote database.
password string
The password used to connect to the remote database.
sid string
The BigQuery project ID used to build the connection string. Required for Google BigQuery.
nameTemplate object
The nameTemplate object uses the backing table, schema, or database names to systematically name the Immuta data sources created through the connection. All names will default to lowercase. The table below describes its child attributes.
dataSourceFormat string
Format to be used to name the data sources created in this group.
Accepted values: <tablename>, <schema>, <database>, or any string
schemaFormat string
Format to be used to name the Immuta schema created in this group.
Accepted values: <tablename>, <schema>, <database>, or any string
tableFormat string
Format to be used to name the Immuta table created in this group.
Accepted values: <tablename>, <schema>, <database>, or any string
schemaProjectNameFormat string
Format to be used to name the Immuta schema project created in this group.
Accepted values: <tablename>, <schema>, <database>, or any string
Example
Suppose the table TPC.CUSTOMER is given the following nameTemplate:
dataSourceFormat: <schema> <tablename>
tableFormat: <tablename>
schemaFormat: <schema>
schemaProjectNameFormat: <schema>
This nameTemplate will produce a data source named tpc.customer in a schema project named tpc.
options object
The options object allows you to override the default options for the data sources created through this connection. If not provided, Immuta will use the system defaults. The table below describes its child attributes.
staleDataTolerance integer
The length in seconds that data for these data sources can be cached.
Default: none
disableSensitiveDataDiscovery boolean
If true, Immuta will not perform identification for the data sources created through this connection.
Default: false
domainCollectionId string
The ID of the domain to assign the data sources to. Use the GET /domain endpoint to retrieve domains and domain IDs.
Default: none
hardDelete boolean
If true, when the table backing the data source is no longer available, the data source in Immuta is deleted. If this is false, the data source will be disabled.
Default: false
tableTags array
An array of tags (strings) to place at the data source level on every data source.
Default: none
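For illustration, an options object that overrides several of these defaults might look like the sketch below. The values are arbitrary examples (the tag names are hypothetical), and domainCollectionId is left out because its value must come from your own Immuta instance.

```yaml
options:
  staleDataTolerance: 3600             # data may be cached for up to one hour
  disableSensitiveDataDiscovery: true  # skip identification for these data sources
  hardDelete: false                    # disable, rather than delete, when a table disappears
  tableTags:                           # applied at the data source level to every data source
    - Finance
    - Internal
```

Because each POST describes the complete desired state, the tableTags listed here replace any tags set by an earlier POST for the same connection.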
owners object
There are three options for the owners object when POSTing to the /data endpoint:
Include the object with data owners.
Include the object, but leave the type, name, and iam out. This will remove all data owners from the data source (other than the calling user).
Exclude the object from the payload. This will not impact your data owners and allows you to manage data owners through external processes or the UI.
The owners object is an array of objects for each owner. The table below describes its child attributes.
type string
The type of owner that is being added.
Accepted values: group, user
name string
The name of the group or the username of the user.
-
iam string
The ID of the identity manager system the user or group comes from. If excluded, any user/group that matches will be added as an owner.
-
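The first option, including the object with data owners, could look like the following sketch. The group name, username, and identity manager ID are hypothetical placeholders.

```yaml
owners:
  - type: group
    name: data-stewards                # hypothetical group name
    iam: "<identity-manager-id>"       # placeholder; restricts the match to one identity manager
  - type: user
    name: jane.doe@example.com         # hypothetical username
    # iam omitted: any user matching this name is added as an owner
```

Leaving the owners key out of the payload entirely keeps the existing owners untouched.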
sources array
The sources array determines which tables are registered as data sources. The table below describes its child attributes.
all boolean
If true, all tables will be registered in Immuta and schema monitoring will be on.
Required
table string
The specific table to register in Immuta as a data source.
Optional
schema string
The specific schema to monitor with schema monitoring.
Optional
description string
A short description for the data source.
Optional
documentation string
Markdown-supported documentation for the data source.
Optional
naming object
Use this object to override the nameTemplate provided for the whole database/schema. This object's attributes are the same as the nameTemplate object.
Optional
owners object
Specify owners for an individual data source. This object is the same as the owners object.
Optional
Examples
This will register all tables and turn on schema monitoring.
sources:
  - all: true
This will register specific tables and add tags and column descriptions.
sources:
  - table: name_of_table
    schema: name_of_schema
    tags:
      table:
        - Sensitive
        - Marketing
      columns:
        - columnName: acct_num
          tags:
            - unique_id
    columnDescriptions:
      - columnName: acct_num
        description: The account number
columns object
There are three options for the columns object when POSTing to the /data endpoint:
Include the object with column details. Only the columns listed will be in the Immuta data source.
Include the object, but leave it empty. This will turn on column detection, and Immuta will update the columns once a day to be accurate to the backing table.
Exclude the object from the payload. This will register all the columns in the table, but column detection will be off.
The columns object is an array of objects for each column. The table below describes its child attributes.
name string
The column name.
dataType string
The data type.
nullable boolean
If true, the column can contain null values.
remoteType string
The actual data type in the remote database.
primaryKey string
Specifies whether this is the primary key of the remote table.
description string
Describes the column.
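A sketch of the first option, listing columns explicitly so that only those columns appear in the Immuta data source. It assumes the columns array sits on an individual entry of sources; the table, schema, and column names are placeholders, and the dataType and remoteType values are assumptions rather than a list of supported types.

```yaml
sources:
  - table: name_of_table
    schema: name_of_schema
    columns:
      - name: acct_num
        dataType: text           # assumed type label
        remoteType: varchar      # assumed type in the remote database
        nullable: false
        description: The account number
      - name: opened_on
        dataType: date           # assumed type label
        remoteType: date         # assumed type in the remote database
        nullable: true
        description: The date the account was opened
```

Any column of the backing table that is not listed here would be left out of the data source.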
columnDescriptions array
You can add descriptions to columns without having to specify all the columns in the data source. columnDescriptions is an array of objects with the following schema:
columnName string
The column name.
description string
The description of the column.
tags object
You can add tags to columns or data sources. tags is an object with the following schema:
table array
An array of tags (strings) to add to this table.
columns array
An array of objects that specifies columnName (string) and tags (an array of tags). The listed tags will be applied to the columns.