This page describes the dataSource endpoint, through which users can subscribe to data sources, make unmasking requests, and manage data source tasks. To create data sources, see the specific handler endpoints.
Additional fields may be included in some responses you receive; however, these attributes are for internal purposes and are therefore undocumented.
Search for data sources
GET/dataSource
Search for data sources.
Query Parameters
Attribute
Description
Required
blobHandlerType
array[string] Describes the type of underlying blob handler that will be used with this data source (e.g., Custom, MS SQL).
No
subscription
array[string] The requesting user's subscription status: pending, owner, subscribed, not_subscribed, expert, or ingest.
No
status
array[string] The data source status: passed or failed.
No
tag
array[string] Filters data sources by tags associated with the data sources.
No
searchText
string Searches for data source names using the provided string.
No
column
array[string] Searches for data source column names.
No
connectionString
array[string] Searches by connection string.
No
schema
string Searches for data source schema.
No
nameOnly
boolean When true, searchText will only search data source names. Default is false.
No
idOnly
boolean When true, only returns the ID of the data source and the user's subscription status.
No
dataSourceIds
array[integer] Searches for the provided data source IDs.
No
selectFields
array[string] This field accepts the values id, name, and columnEvolutionEnabled. When id or name is provided, the request will return only the ID or name of the data source and the subscription status. If columnEvolutionEnabled is provided, the response will also include information about the policies, policy conflicts, and workspaces associated with the data sources.
No
offset
integer Used in combination with size to fetch pages. Default is 0.
No
size
integer The number of results to return per page. Default is 10.
No
sortField
string Used to sort results by field, which must be createdAt, name, blobHandlerType, subscriptionStatus, recordCount, status, policy, or editable.
No
sortOrder
string Sorts results by order, which must be asc or desc.
No
excludedProjects
array[integer] Filter out any data sources that belong to the specified projects.
No
ephemeral
boolean When true, returns ephemeral data sources.
No
clusterName
string The name of the remote cluster the data source is connected to.
No
mode
integer Specifies the query mode, which must be 0 (FULL), 1 (COUNT), 4 (TAG), 5 (MIN_MAX), or 6 (STATUS).
No
globalPolicy
string Filter by data sources that have this Global Policy applied.
No
hostname
string Searches data sources by hostname.
No
Response Parameters
Attribute
Description
name
string Data source name.
id
integer Data source ID.
deleted
boolean If true, the data source has been deleted.
description
string The data source description.
createdAt
timestamp The date and time the data source was created.
subscriptionPolicy
array Details the type of Subscription Policy applied to the data source.
schemaEvolutionId
integer The schema evolution ID.
recordCount
integer The record count.
status
array[string] Accepted statuses are passed or failed.
subscriptionStatus
array[string] Accepted statuses are subscribed or unsubscribed.
blobHandlerType
array[string] Describes the type of underlying blob handler of this data source (e.g., Custom, MS SQL).
subscriptionType
string The type of subscription policy on the data source. The type can be automatic (which allows anyone to subscribe), approval (which requires the subscriber to be manually approved), policy (which only allows users with specific groups or attributes to subscribe), or manual (which requires users to be manually added).
connectionString
string The connection string information.
sqlSchemaName
string The schema name.
policy
string When this value is none, there are no data policies applied to the data source. Otherwise, this field indicates whether or not there are policy conflicts among the data policies applied to the data source.
policyHandlerType
string The policy handler type, such as None or Builder.
string The data format of blobs in the data source, such as json, xml, html, or jpeg.
description
string The description of the data source.
policyHandler
array The ID of the policy handler and details about the data policies enforced on the data source.
sqlSchemaName
string A string that represents this data source's schema name in the Query Engine.
sqlTableName
string The SQL table name in the Query Engine.
blobHandler
array[object] A list of full URLs providing the locations of all blob store handlers to use with this data source.
blobHandlerType
string Describes the type of underlying blob handler that will be used with this data source (e.g., MS SQL).
createdBy
integer The ID of the profile creating the data source.
deleted
boolean If true, the data source was deleted.
type
string The data source type, such as queryable or ingested.
rowCount
integer The number of rows.
documentation
string Documentation associated with the data source.
id
integer The data source ID.
policyHandlerType
string The type of policy handler applied to the data source: Builder.
subscriptionType
string The type of subscription policy on the data source. The type can be automatic (which allows anyone to subscribe), approval (which requires the subscriber to be manually approved), policy (which only allows users with specific groups or attributes to subscribe), or manual (which requires users to be manually added).
subscriptionPolicy
array Details about the Subscription Policy applied to the data source.
globalPolicies
string Details about the Global Policies applied to the data source.
status
string The data source health status.
Request example
The following request gets a data source based on the ID 22.
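A request like the one described above might be prepared as in the sketch below. It only constructs the request object and does not send it. The /dataSource/{dataSourceId} path is the one referenced elsewhere on this page; the host name, the API key placeholder, and the use of a bare Authorization header are assumptions for illustration.

```python
from urllib.request import Request


def data_source_request(base_url, data_source_id, api_key):
    """Prepare (but do not send) a GET request for a single data source."""
    url = f"{base_url.rstrip('/')}/dataSource/{int(data_source_id)}"
    # Assumption: the key is passed in the Authorization header as-is.
    return Request(url, headers={"Authorization": api_key}, method="GET")


# Hypothetical host and key; 22 is the data source ID from the example above.
req = data_source_request("https://immuta.example.com", 22, "<api-key>")
print(req.get_method(), req.full_url)
```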
string The data format of blobs in the data source, such as json, xml, html, or jpeg.
description
string The description of the data source.
policyHandler
array The ID of the policy handler and details about the data policies enforced on the data source.
sqlSchemaName
string A string that represents this data source's schema name in the Query Engine.
sqlTableName
string The SQL table name in the Query Engine.
blobHandler
array[object] A list of full URLs providing the locations of all blob store handlers to use with this data source.
blobHandlerType
string Describes the type of underlying blob handler that will be used with this data source (e.g., MS SQL).
createdBy
integer The ID of the profile creating the data source.
deleted
boolean If true, the data source was deleted.
type
string The data source type, such as queryable or ingested.
rowCount
integer The number of rows.
documentation
string Documentation associated with the data source.
id
integer The data source ID.
policyHandlerType
string The type of policy handler applied to the data source: Builder.
subscriptionType
string The type of subscription policy on the data source. The type can be automatic (which allows anyone to subscribe), approval (which requires the subscriber to be manually approved), policy (which only allows users with specific groups or attributes to subscribe), or manual (which requires users to be manually added).
subscriptionPolicy
array Details about the Subscription Policy applied to the data source.
globalPolicies
string Details about the Global Policies applied to the data source.
status
string The data source health status.
Request example
The following request gets a data source based on the name Public Barfoo.
string The data format of blobs in the data source, such as json, xml, html, or jpeg.
description
string The description of the data source.
policyHandler
array The ID of the policy handler and details about the data policies enforced on the data source.
sqlSchemaName
string A string that represents this data source's schema name in the Query Engine.
sqlTableName
string The SQL table name in the Query Engine.
blobHandler
array[object] A list of full URLs providing the locations of all blob store handlers to use with this data source.
blobHandlerType
string Describes the type of underlying blob handler that will be used with this data source (e.g., MS SQL).
createdBy
integer The ID of the profile creating the data source.
deleted
boolean If true, the data source was deleted.
type
string The data source type, such as queryable or ingested.
rowCount
integer The number of rows.
documentation
string Documentation associated with the data source.
id
integer The data source ID.
policyHandlerType
string The type of policy handler applied to the data source: Builder.
subscriptionType
string The type of subscription policy on the data source. The type can be automatic (which allows anyone to subscribe), approval (which requires the subscriber to be manually approved), policy (which only allows users with specific groups or attributes to subscribe), or manual (which requires users to be manually added).
subscriptionPolicy
array Details about the Subscription Policy applied to the data source.
globalPolicies
string Details about the Global Policies applied to the data source.
status
string The data source health status.
Request example
The following request gets a data source based on the SQL table name customer_data.
{"author":1,"parentId":null,"resolved":false,"body":"Should this data source be part of the Medical Claims project?","id":2,"createdAt":"2021-09-02T14:14:31.228Z","updatedAt":"2021-09-02T14:14:31.228Z"}
Get All Comments for a Data Source
GET/dataSource/{dataSourceId}/comments
Get all the comments for the data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
Response Parameters
Attribute
Description
author
array The id, name, and email of the author.
resolved
boolean If true, the comment has been resolved.
id
integer The comment ID.
createdAt
timestamp The date and time the comment was created.
updatedAt
timestamp The date and time the comment was updated.
models
array The modelType (such as datasource), modelId, and modelName.
totalreplies
integer The number of replies to the comment.
lastreply
timestamp The date and time of the last reply.
public
boolean If true, the comment is public.
Request example
The following request gets all the comments for the data source.
[{"author": {"id":2,"name":"Jane Doe","email":"jane.doe@immuta.com" },"body":"Actually, Billing does not need access, but Customer Service does.","resolved":false,"id":8,"createdAt":"2021-10-21T17:03:31.174Z","updatedAt":"2021-10-21T17:03:31.174Z","models": [{"modelType":"datasource","modelId":"22","modelName":"Fake Medical Claims 2017" }],"totalreplies":0,"lastreply":"0001-01-01T00:00:00.000Z","public":true }, {"author": {"id":2,"name":"Jane Doe","email":"jane.doe@immuta.com" },"body":"This data source should be accessible to the Docs team and Billing.","resolved":false,"id":7,"createdAt":"2021-10-21T17:02:41.390Z","updatedAt":"2021-10-21T17:02:41.390Z","models": [{"modelType":"datasource","modelId":"22","modelName":"Fake Medical Claims 2017" }],"totalreplies":0,"lastreply":"0001-01-01T00:00:00.000Z","public":true }]
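A response shaped like the one above could be processed on the client as in this sketch, which lists unresolved comments from oldest to newest. The sample is a trimmed copy of the response above, and the helper name unresolved_oldest_first is hypothetical.

```python
import json

# Trimmed copy of the response shown above; unused fields are omitted.
sample = """[
  {"author": {"id": 2, "name": "Jane Doe"},
   "body": "Actually, Billing does not need access, but Customer Service does.",
   "resolved": false, "id": 8, "createdAt": "2021-10-21T17:03:31.174Z"},
  {"author": {"id": 2, "name": "Jane Doe"},
   "body": "This data source should be accessible to the Docs team and Billing.",
   "resolved": false, "id": 7, "createdAt": "2021-10-21T17:02:41.390Z"}
]"""


def unresolved_oldest_first(comments):
    """Return unresolved comments ordered by creation time."""
    open_comments = [c for c in comments if not c["resolved"]]
    # The timestamps share one fixed-width UTC format, so comparing them
    # as strings orders them chronologically.
    return sorted(open_comments, key=lambda c: c["createdAt"])


comments = json.loads(sample)
for comment in unresolved_oldest_first(comments):
    print(comment["id"], comment["author"]["name"], comment["body"])
```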
Count the comments for a data source
GET/dataSource/{dataSourceId}/comments/count
Count the comments for a data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
columns
boolean When true, retrieves comments for columns.
No
queries
array[string] The queries for which to retrieve comments.
No
resolved
boolean If true, will retrieve only resolved comments. If false, will retrieve only unresolved comments. If not set, will retrieve all comments.
No
Response Parameters
Attribute
Description
modelId
integer The model ID.
modelType
string The model type.
count
integer The number of comments on the data source.
Request example
The following request counts the comments for data source 1.
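The sketch below shows one way such a URL might be built, including the three-way behavior of resolved (true, false, or not set). The host name and the helper name comment_count_url are assumptions, as is the serialization of booleans as lowercase strings.

```python
from urllib.parse import urlencode


def comment_count_url(base_url, data_source_id, resolved=None, columns=None):
    """Build the URL for GET /dataSource/{dataSourceId}/comments/count."""
    params = {}
    # Leaving a flag as None omits it, which for resolved means all
    # comments are counted, per the parameter table above.
    if columns is not None:
        params["columns"] = "true" if columns else "false"
    if resolved is not None:
        params["resolved"] = "true" if resolved else "false"
    path = f"{base_url.rstrip('/')}/dataSource/{int(data_source_id)}/comments/count"
    return f"{path}?{urlencode(params)}" if params else path


# Hypothetical host; data source 1 matches the example above.
print(comment_count_url("https://immuta.example.com", 1))
print(comment_count_url("https://immuta.example.com", 1, resolved=False))
```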
Get all users with the provided access level for this data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
states
array[string] The status levels to include when querying for user access.
No
approved
boolean Denotes whether the returned access objects should be approved.
No
searchText
string A string used to filter returned users. The query is executed with a wildcard prefix and suffix.
No
size
integer The number of results to return.
No
offset
integer The number of results to skip (for paging).
No
sortField
string The field on which to sort the result set.
No
sortOrder
string The order in which to sort the results.
No
expandGroups
boolean If true, returns the individual members of any subscribed group.
No
ignoreSystemGenerated
boolean If true, will not return system generated accounts.
No
filterBySchemaEvolution
boolean If true, will only return users who have the specified level of access across ALL data sources within the same schema evolution group as this one.
No
Response Parameters
Attribute
Description
count
integer The number of users with access to the data source.
users
string The metadata regarding the users with access to the data source.
Request example
The following request gets all users with the provided access level for this data source.
Retrieves the visibilities, masking information, and filters that the given user has access to in the specified data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
profileId
integer The profile ID of the user.
Yes
projectId
integer The project ID. If provided, this project will be used when evaluating the user's visibilities.
No
Response Parameters
Attribute
Description
visibilities
array Details of the user's visibilities, including anyKey.
visibilityRuleApplies
boolean If true, a visibility rule exists and the user is not excepted from it.
masked
array Masking information for the data source, including metadata, name, type, and actionType.
additionalFilters
array Policy information for the data source, including customWhere, differentialPrivacy, eventTimeColumn, minimization, time, filterSeconds, and isOlderOrNewer.
Request example
The following request gets the visibility information for the user with the profile ID 2 on the data source with the data source ID 16.
Retrieves a summary of total records, total visibilities (the unique values contained in a column protected by a row-level security policy that allow Immuta to determine whether or not a user can see a given row if they possess an attribute that matches the visibility of that row), and visibilities a given user has access to.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
profileId
integer The profile ID of the user.
Yes
informationOnly
boolean If true, the query will just return information for the UI and will skip running some queries for ephemeral data sources.
No
includeNestedColumns
boolean If true, the query will return just information for the dictionary page, including the masking policies for nested columns.
No
Response Parameters
Attribute
Description
noVisibilities
boolean If true, the data source has no row-level security or purpose-based restriction policies applied to it.
dataSourceVisibilitiesCount
integer The total number of possible visibilities the given data source has.
userVisibilitiesCount
integer The number of visibilities the current user can see for the given data source.
masked
array Masking information for the data source, including metadata, name, type, and actionType.
dataSource
integer The data source ID.
dataSourceName
string The data source name.
additionalFilters
array Policy information for the data source, including customWhere, differentialPrivacy, eventTimeColumn, minimization, time, filterSeconds, and isOlderOrNewer.
allowMaskedJoins
boolean If true, the data source allows masked joins.
policySet
array Details about the policies on the data source.
Request example
The following request gets all of the visibility information for the user with the profile ID 2 on the data source with the data source ID 16.
Retrieves a summary of total records, total visibilities (the unique values contained in a column protected by a row-level security policy that allow Immuta to determine whether or not a user can see a given row if they possess an attribute that matches the visibility of that row), and visibilities the current user has access to for a specified data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
Response Parameters
Attribute
Description
noVisibilities
boolean If true, the data source has no row-level security or purpose-based restriction policies applied to it.
dataSourceVisibilitiesCount
integer The total number of possible visibilities the given data source has.
userVisibilitiesCount
integer The number of visibilities the current user can see for the given data source.
denialReason
string Reason the user was denied visibility.
masked
array Masking information for the data source, including metadata, name, type, and actionType.
dataSource
integer The data source ID.
dataSourceName
string The data source name.
additionalFilters
array Policy information for the data source, including customWhere, differentialPrivacy, eventTimeColumn, minimization, time, filterSeconds, and isOlderOrNewer.
allowMaskedJoins
boolean If true, the data source allows masked joins.
policySet
array Details about the policies on the data source.
Request example
The following request gets all of the visibility information for the current user on the data source with the data source ID 16.
array The ID of the data source the user is subscribing to.
No
approvals
array Includes details about the Subscription policy on the data source: requiredPermissions, specificApproverRequired, specificApprover, and ownerModelId.
No
Response Parameters
Attribute
Description
body
array Contains details about the data source, including the data source ID, subscription status of the user, the profile ID of the user, and the dates the data source was created and updated.
Request example
The following request subscribes to the data source with ID 22.
string The status of the user: subscribed, owner, expert, or ingest.
Yes
profileId
integer The profile ID of the user being added to the data source.
Yes
groupId
integer The ID of the group being added to the data source.
No
approvals
array Details about the user approving access: requiredPermission, specificApproverRequired, and specificApprover.
No
expiration
date The date the user's data source subscription ends.
No
Response Parameters
Attribute
Description
id
integer The user's subscription ID.
modelId
integer The model ID.
modelType
string The model type.
state
string The user's data source role, such as subscribed.
denialReasoning
string If the user was denied access, the reason for denial.
profile
integer The user's profile ID.
group
integer If a group was added, the group ID.
expiration
date The date the user's subscription to the data source will expire.
acknowledgeRequired
boolean If the data source is associated with a project, this value will be true if the user needs to confirm they have read the project acknowledgment.
createdAt
timestamp The date and time of creation.
updatedAt
timestamp The date and time of update.
approved
boolean When true, the user's request has been approved.
Request example
The following request adds a user (saved in example-payload.json) to this data source.
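As a rough sketch, a payload such as example-payload.json could be generated as follows. The key name state is an assumption inferred from the response table above, because the attribute name for the user's status is not shown on this page; the helper name and the ISO date format for expiration are also assumptions.

```python
import json
from datetime import date

# Status values listed in the payload table above.
STATES = {"subscribed", "owner", "expert", "ingest"}


def build_access_payload(state, profile_id, group_id=None, expiration=None):
    """Build the body for adding a user (or group) to a data source."""
    if state not in STATES:
        raise ValueError(f"unsupported state: {state!r}")
    # Assumption: the status is sent under the key "state".
    payload = {"state": state, "profileId": profile_id}
    if group_id is not None:
        payload["groupId"] = group_id
    if expiration is not None:
        # Assumption: dates are sent as ISO 8601 strings (YYYY-MM-DD).
        payload["expiration"] = expiration.isoformat()
    return payload


# Hypothetical profile ID and expiration date.
print(json.dumps(build_access_payload("subscribed", 2,
                                      expiration=date(2025, 1, 31)), indent=2))
```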
string The action to perform on the data sources: add-users, disable, restore, delete, or tags.
Yes
Payload Parameters
Attribute
Description
Required
ids
array[integer] The IDs of the data sources to update.
Yes
update
array[object] Only required for add-users (includes metadata about the users' profiles: id and state) and tags (includes metadata about the tags: name and source) types.
No
Response Parameters
Attribute
Description
bulkId
string The ID of the bulk data source update.
jobsCreated
integer The number of jobs created.
Request example
The following request adds the Address.email tag to two data sources.
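A payload for such a request might be assembled as in this sketch, which enforces that update is present for the add-users and tags actions. The data source IDs, the tag source value "curated", and the helper name are placeholders chosen for illustration; this page does not list valid source values.

```python
import json

# Actions and the subset that requires an update list, per the tables above.
ACTIONS = {"add-users", "disable", "restore", "delete", "tags"}
NEEDS_UPDATE = {"add-users", "tags"}


def build_bulk_payload(action, ids, update=None):
    """Build the body of a bulk data source update for the given action."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not ids:
        raise ValueError("ids must list at least one data source")
    if action in NEEDS_UPDATE and not update:
        raise ValueError(f"update is required for {action}")
    payload = {"ids": list(ids)}
    if update:
        payload["update"] = list(update)
    return payload


# Hypothetical IDs; "curated" is a placeholder source value.
body = build_bulk_payload("tags", [22, 23],
                          update=[{"name": "Address.email", "source": "curated"}])
print(json.dumps(body, indent=2))
```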
Trigger the schema monitoring job for the specified detection group, or all groups if no payload parameters are given.
Payload Parameters
Attribute
Description
Required
dataSourceIds
array[integer] The data source IDs to run the column detection job on. Leave empty to run this job globally on all data sources. This parameter cannot be included in the payload if schemaEvolutionId or any combination of hostname, database, port, or table is included.
No
hostname
string The hostname of the data sources. This parameter cannot be included in the payload if dataSourceIds or schemaEvolutionId is included.
No
port
integer The port used to connect the data sources to Immuta. This parameter cannot be included in the payload if dataSourceIds or schemaEvolutionId is included.
No
database
string The database name. This runs schema monitoring on the database provided. If data sources were initially registered via the V2 API, including this parameter will locate new schemas that contain tables Immuta has the ability to access, and Immuta will create a new schema project associated with these newly discovered schemas and create data sources for each table located. If data sources were initially registered via the V1 API, including this parameter will only update the columns and tables of registered schema and tables of the specified database; it will not register any new schemas. This parameter cannot be included in the payload if dataSourceIds or schemaEvolutionId is included.
No
table
string The table name. This will run column detection to just update the columns in this table. This parameter cannot be included in the payload if dataSourceIds or schemaEvolutionId is included.
No
schemaEvolutionId
integer The ID of the schema to run the schema monitoring job on. This will run on all tables associated with the specified ID. The schema ID can be found in the response body of /dataSource/{dataSourceId}. This parameter cannot be included in the payload if dataSourceIds or any combination of hostname, database, port, or table is included.
No
skipColumnDetection
boolean When true, Immuta will only pull new tables from the source server. This parameter can only be paired with schemaEvolutionId.
No
overrides.httpPath
string If Databricks ephemeral overrides are configured, provide the alternative HTTP path to trigger schema monitoring on that ephemeral cluster.
No
Response Parameters
Attribute
Description
schemaDetection
string Metadata regarding the jobs.
columnDetection
string Metadata regarding the jobs.
bulkId
string The unique identifier of the jobs running schema monitoring and column detection.
Request example
The following request triggers the schema monitoring job for the specified detection group.
The tabs below illustrate payloads for triggering schema monitoring on a host, database, or table.
The request will run schema monitoring for all databases registered under the hostname provided in the payload.
The request will run schema monitoring on the database provided in the payload. If data sources were initially registered via the V2 API, this request will locate new schemas that contain tables Immuta has the ability to access, and Immuta will create a new schema project associated with these newly discovered schemas and create data sources for each table located. If data sources were initially registered via the V1 API, this request will only update the columns and tables of registered schema and tables of the specified database; it will not register any new schemas.
{
"database": "public"
}
The request will run column detection and update the columns on the table specified in the payload.
{
"database": "public",
"table": "healthcare"
}
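The payload rules in the parameter table (dataSourceIds, schemaEvolutionId, and the hostname, port, database, and table group are mutually exclusive, and skipColumnDetection is only valid with schemaEvolutionId) can be checked before sending a request. The following is a minimal sketch of such a check; the function name validate_detection_payload is hypothetical and the server may enforce additional rules.

```python
# Connection-style selectors that may be combined with each other.
CONNECTION_KEYS = {"hostname", "port", "database", "table"}


def validate_detection_payload(payload):
    """Raise ValueError if the payload mixes mutually exclusive selectors."""
    groups = []
    if "dataSourceIds" in payload:
        groups.append("dataSourceIds")
    if "schemaEvolutionId" in payload:
        groups.append("schemaEvolutionId")
    if CONNECTION_KEYS.intersection(payload):
        groups.append("hostname/port/database/table")
    if len(groups) > 1:
        raise ValueError("cannot combine: " + ", ".join(groups))
    if "skipColumnDetection" in payload and "schemaEvolutionId" not in payload:
        raise ValueError("skipColumnDetection requires schemaEvolutionId")
    # An empty payload is allowed: it triggers the job for all groups.
    return payload


# The table-level payload from the example above passes the check.
print(validate_detection_payload({"database": "public", "table": "healthcare"}))
```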
Response examples
The tabs below illustrate the example response for each example payload provided above.
object Indicates whether or not the blob was successfully crawled.
columnEvolution
object Indicates whether or not the job run to check for columns added to or removed from the data source passed and when it was last run.
externalCatalog
object Indicates whether or not the external catalog was successfully linked to the data source.
fingerprint
object Indicates whether or not the fingerprint job was successful (passed) and when it was last run. The fingerprint captures summary statistics of the data source.
globalPolicy
object Indicates whether or not global policies were successfully applied to the data source.
highCardinality
object Indicates whether or not the job run to calculate the data source's high cardinality column passed and when it was last run.
schemaEvolution
object Indicates whether or not the job run to check if a new table had been added in the remote database passed and when it was last run. If a new table was added, Immuta automatically creates a new data source. Correspondingly, if a remote table is removed, that data source will be disabled in the console.
sdd
object Indicates whether or not sensitive data discovery was successfully run on the data source.
sql
object Indicates whether or not the SQL query run to check the data source's health passed and when it was last run.
stats
object Indicates whether or not the job run to calculate the number of rows in the data source passed and when it was last run.
Get all of the recent policy activities for a given data source.
Query Parameters
Attribute
Description
Required
dataSourceId
integer The data source ID.
Yes
offset
integer The number of results to skip (for paging).
No
size
integer The number of results to return per page.
No
Response Parameters
Attribute
Description
count
integer The number of results.
activities
array Includes details about the policy and the data source, including the policy and data source type, when the activity notification was triggered, and whether or not the policy change was triggered by a Global policy.
actionBy
array Details about who triggered the action.
targetUser
array Information about the user who received the notification.
Request example
The following request gets all of the recent policy activities for a given data source.
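Because results are paged with offset and size, a client usually loops until a short page comes back. The sketch below shows that loop against a stand-in fetch_page function; fetch_page, fake_fetch_page, and the sample data are hypothetical and replace the real HTTP call, which is not shown here.

```python
def iter_activities(fetch_page, size=10):
    """Yield every activity by walking pages with offset and size."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, size=size)
        activities = page["activities"]
        yield from activities
        # A page shorter than size (including an empty one) is the last page.
        if len(activities) < size:
            break
        offset += len(activities)


# Stand-in for the real request: serves slices of 25 fake activities.
FAKE_ACTIVITIES = list(range(25))


def fake_fetch_page(offset, size):
    chunk = FAKE_ACTIVITIES[offset:offset + size]
    return {"count": len(FAKE_ACTIVITIES), "activities": chunk}


print(len(list(iter_activities(fake_fetch_page, size=10))))
```

Stopping on a short page avoids relying on the exact meaning of count, at the cost of one extra request when the total is an exact multiple of size.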