Connections allow you to register your data objects in a technology through a single connection, making data registration more scalable for your organization. Instead of registering schemas and databases individually, you can register them all at once and allow Immuta to monitor your data platform for changes so that data sources are added and removed automatically to reflect the state of data on your data platform.
Once you register your connection, Immuta presents a hierarchical view of your data that reflects the hierarchy of objects in your data platform:
Account (Snowflake) or Metastore (Databricks Unity Catalog)
Database
Schema
Tables: These represent the individual objects in your data platform, and when enabled, become data sources
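The hierarchy above can be sketched as a path from the account or metastore down to a table. This is a minimal illustration only; the names and the `/` separator are assumptions, not Immuta's internal representation.

```python
def object_path(connection, database=None, schema=None, table=None):
    """Build a path identifying an object at any level of the hierarchy.

    Leading levels are kept up to the deepest one provided, mirroring how
    connections expose account/metastore -> database -> schema -> table.
    """
    path = []
    for part in (connection, database, schema, table):
        if part is None:
            break
        path.append(part)
    return "/".join(path)
```

For example, `object_path("prod-snowflake", "SALES", "PUBLIC", "ORDERS")` identifies a single table, while `object_path("prod-snowflake", "SALES")` identifies the whole database.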
Beyond making the registration of your data more intuitive, connections provide more control. Instead of performing operations on individual schemas or tables, you can perform operations (such as object sync) at the connection level.
See the how-to guides for a list of requirements:
See the integration's reference guide for the supported object types for each technology:
Immuta will ensure the objects in your database stay in sync with the registered objects in Immuta. To do this, Immuta uses the account credentials provided during registration to check the remote technology for object changes, such as a table being created, new columns being added to a table, or a table being deleted.
If tables are added, new data sources are created in Immuta.
If remote tables are deleted, the corresponding data sources in Immuta will become disabled; however, the data object representing the table will still appear in the connections view until manually deleted.
If a column changes in a table, those changes will be reflected in the Immuta data source data dictionary.
Your connection can be synced in two ways:
Periodic object sync: This happens once every 24 hours (at 1:00 AM UTC). This schedule is not currently configurable.
On-demand object sync: You can manually run object sync on your whole connection or on any object in your connection.
Object sync is designed to pull in the user's data objects from the connected backing technology, so it specifically excludes internal Immuta-managed objects. These objects reside within the Immuta database or catalog and are used solely for Immuta's internal processes. Because these objects exist only for Immuta processes and cannot be queried by users, object sync ignores them and does not ingest them into Immuta.
All data sources within the registered connection and found by object sync will get an automated tag that represents the connection. These tags can be used like any other tag in Immuta; however, they cannot be edited or deleted.
The tag will be formatted as follows and applied to data sources from table data objects:
Immuta Connections . The Technology . Your Connection Name . Your Database . Your Schema
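As a rough illustration, the tag described above can be assembled from the connection metadata. This sketch assumes a literal `.` separator and illustrative placeholder values; it is not Immuta's internal tag builder.

```python
def connection_tag(technology, connection_name, database, schema):
    """Assemble the connection tag in the documented order:
    Immuta Connections . Technology . Connection Name . Database . Schema
    """
    return ".".join(["Immuta Connections", technology, connection_name, database, schema])
```

For example, a Snowflake connection named "My Connection" would yield `Immuta Connections.Snowflake.My Connection.MY_DATABASE.MY_SCHEMA`.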
When new columns are detected and added to Immuta, they will be automatically tagged with the New tag. This allows governors to use the New Column Added global policy to mask columns with the New tag, since they could contain sensitive data.
The New Column Added global policy is staged (inactive) by default. Activate this seeded global policy if you want any columns with the New tag to be automatically masked.
When there is an active policy that targets the New tag, Immuta sends validation requests to data owners for the following changes made in the remote data platform:
Column added: Immuta applies the New tag on the column that has been added and sends a request to the data owner to validate if the new column contains sensitive data. Once the data owner confirms they have validated the content of the column, Immuta removes the New tag from it and as a result any policy that targets the New column tag no longer applies.
Column deleted: Immuta deletes the column from the data source's data dictionary in Immuta. Then, Immuta sends a request to the data owner to validate the deleted column.
For instructions on how to view and manage your tasks and requests in the Immuta UI, see the . To view and manage your tasks and requests via the Immuta API, see the section of the API documentation.
When registering a connection, Immuta sets the connection to the recommended default settings to protect your data. The recommended settings are described below:
Object sync: This setting allows Immuta to monitor the connection for changes. When Immuta identifies a new table, a data source will automatically be created. Similarly, if remote tables are deleted, the corresponding data sources and data objects will be deleted in Immuta. This setting is enabled by default and cannot be disabled.
Default run schedule: This sets the time interval for Immuta to check for new objects. By default, this schedule is set to 24 hours.
Impersonation: This setting enables and defines the role for user impersonation, available with . This setting is disabled by default.
If you want all data objects from connections to have data tags ingested from the data provider into Immuta, ensure the credentials provided on the Immuta app settings page for the external catalog feature can access all the data objects. Any data objects the credentials cannot access will not be tagged in Immuta. In practice, it is recommended to use the same credentials for the connection and tag ingestion.
Within the connection, the Data Owner permission can be granted on any data object, and will allow that user to manage that object and any within it. For example, granting a user Data Owner on a schema will grant them Data Owner on tables within that schema as well. Data owners can complete the following actions:
View the connections UI
Access any connection where they are granted Data Owner anywhere in the hierarchy
Trigger object sync for their data objects
Delete their data objects
Deregistering a connection automatically deletes all of its child objects in Immuta. However, Immuta will not remove the objects in your Snowflake or Databricks account.
Project workspaces: This setting enables Snowflake project workspaces. If you use Snowflake secure data sharing with Immuta, enable this setting, as project workspaces are required. If you use Snowflake table grants, disable this setting; project workspaces cannot be used when Snowflake table grants are enabled. Project workspaces are not supported in the Databricks Unity Catalog integration. This setting is disabled by default.
Requirement: GOVERNANCE or APPLICATION_ADMIN global permission or Data Owner within the hierarchy
Prerequisite: A connection registered
Click Data and select the Connections tab in the navigation menu.
Click the more actions menu for the connection you want and select Run Object Sync.
Opt to click the checkbox to Also scan all disabled data objects.
Click Run Object Sync.
Click Data and select the Connections tab in the navigation menu.
Select the connection.
Click the more actions menu in the Action column for the database you want to sync and select Run Object Sync.
Opt to click the checkbox to Also scan all disabled data objects.
Click Data and select the Connections tab in the navigation menu.
Select the connection.
Select the database.
Click the more actions menu in the Action column for the schema you want to sync and select Run Object Sync.
You can run object sync from the data source health check or from the connection.
Click Data and select the Connections tab in the navigation menu.
Select the connection.
Select the database.
Select the schema.
Click Run Object Sync.
Opt to click the checkbox to Also scan all disabled data objects.
Click Run Object Sync.
Click the more actions menu in the Action column for the data object you want to sync and select Run Object Sync.
Opt to click the checkbox to Also scan all disabled data objects.
Click Run Object Sync.
Connections are an improvement over the existing process, not only for onboarding your data sources but also for managing the integration. However, there are some differences between the two processes that should be noted and understood before you start the upgrade.
API changes: See the API changes page for a complete breakdown of the APIs that will not work once you begin the upgrade. These changes will mostly affect users with automated API calls around schema monitoring and data source registration.
Automated data source names: Previously, you could name data sources manually. However, data sources from connections are automatically named using the information (database, schema, table) and casing from your data platform. For example, on Snowflake this will typically mean that my_table will become My Connection.MY_DATABASE.MY_SCHEMA.MY_TABLE.
If you are leveraging Immuta APIs, you may need to adjust code to allow for the new data source names.
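As a rough illustration of the naming scheme (the exact generation logic is internal to Immuta; this sketch just mirrors the example above, including the fact that Snowflake stores unquoted identifiers in uppercase):

```python
def connection_data_source_name(connection_display_name, database, schema, table):
    """Generated data source name: the connection display name followed by the
    database, schema, and table names, using the casing from the data platform.
    On Snowflake, an unquoted identifier like my_table is stored as MY_TABLE."""
    return f"{connection_display_name}.{database}.{schema}.{table}"
```

So a script that previously looked up a hand-named data source would instead search for `My Connection.MY_DATABASE.MY_SCHEMA.MY_TABLE`.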
Schema projects phased out: With integrations, many settings and the connection info for data sources were controlled in the schema project. This functionality is no longer needed with connections and now you can control connection details in a central spot.
New hierarchy display: With integrations, tables were brought in as data sources and presented as a flat list on the data source list page. With connections, databases and schemas are displayed as objects too.
Change from schema monitoring to object sync: Object metadata synchronization between Immuta and your data platform is no longer optional but always required:
If schema monitoring is off before the upgrade: Once the connection is registered, everything the system user can see will be pulled into Immuta and, if it didn't already exist in Immuta, it will be a disabled object. These disabled objects exist so you can see them, but policy is not protecting the objects, and they will not appear as data sources.
If schema monitoring is on before the upgrade: Once the connection is registered, everything the system user can see will be pulled into Immuta. If it already existed in Immuta, it will be an enabled object and continue to appear as data source.
Enabling a connection will enable all databases, schemas, and tables in the hierarchy: If the connection is disabled after completing your upgrade to connections, only enable it if you want to enable all databases, schemas, and tables within it.
Enabling a table that is ordinarily disabled will elevate it to a data source. Immuta will then apply data and subscription policies on that data source.
Snowflake
An integration enabled on the Immuta app settings page
Data sources registered
Immuta global GOVERNANCE and APPLICATION_ADMIN permissions
Select Data and then Upgrade Manager in the navigation menu. This tab will only be available if you have integrations ready for upgrade.
Click Start Upgrade.
Display Name: The display name is the unique name of your connection and will be used as a prefix in the names of all data objects associated with this connection. It will also appear as the display name in the UI and will be used in all API calls made to update or delete the connection.
Click Next.
Ensure Immuta has the correct credentials to connect to Databricks Unity Catalog or Snowflake. Select the tab below for more information:
Click Validate Credentials to ensure the access token can connect Immuta to Databricks Unity Catalog.
Create a Snowflake role with a minimum of the following permissions:
USAGE on all databases and schemas with registered data sources.
REFERENCES on all tables and views registered in Immuta.
SELECT on all tables and views registered in Immuta.
Grant the role to the Immuta system user in your Snowflake environment.
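The grants above can be scripted. This is a hedged sketch: the role, database, and table names are placeholders, and it emits standard Snowflake GRANT statements matching the privileges listed (USAGE on databases and schemas, REFERENCES and SELECT on tables).

```python
def immuta_role_grants(role, databases, tables):
    """Emit the Snowflake GRANT statements for the Immuta role.

    databases: database names holding registered data sources.
    tables: fully qualified table/view names registered in Immuta.
    """
    stmts = []
    for db in databases:
        stmts.append(f"GRANT USAGE ON DATABASE {db} TO ROLE {role};")
        stmts.append(f"GRANT USAGE ON ALL SCHEMAS IN DATABASE {db} TO ROLE {role};")
    for tbl in tables:
        stmts.append(f"GRANT REFERENCES ON TABLE {tbl} TO ROLE {role};")
        stmts.append(f"GRANT SELECT ON TABLE {tbl} TO ROLE {role};")
    return stmts
```

Run the emitted statements in Snowflake as a role that can grant on those objects, then grant the new role to the Immuta system user.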
Enter the new Snowflake role in the textbox.
Click Validate Credentials to ensure the role has been granted to the right user.
Click Next.
Click Upgrade Connection.
Click the link to the docs to understand the impacts of the upgrade.
Click the checkbox to confirm understanding of the upgrade effects, and click Yes, Upgrade Connection.
The upgrade manager will then begin connecting your data sources with the tables in the backing technology. This may take some time to complete.
While most upgrades will complete without any additional intervention, it may be necessary to resolve data sources that are not easily matched to the backing tables. See the Troubleshooting guide if you are prompted to Resolve in the upgrade manager.
Your connection remains in an upgrade state until you finalize it. In this state, policy is still applied to your data sources, but object sync is not running. To allow Immuta to discover new objects and create data sources for them, finalize your upgrade.
Select Data and then Upgrade Manager in the navigation menu. This tab will only be available if you have integrations ready for upgrade.
Click Finalize for the finished connection.
Changing the status of a parent object
If changing the status of a parent object, all the relevant child objects' statuses will be changed. This may take time to complete with a large number of child objects.
You can check the status of the job with the gear icon in the UI, which will be spinning if jobs are active, or use the bulkId to call the .
Click Data in the navigation menu and select Connections.
Navigate to the connection and go to the level of data object you want to change the status of.
Go to the Settings tab and change the Data Object switch to the status you want:
To update the status using the API, see the .
Databricks Unity Catalog behavior
If you enable a data object and it has no subscription policy set on it, Immuta will REVOKE access to the data in Databricks for all Immuta users, even if they had been directly granted access to the table in Unity Catalog.
If you disable a Unity Catalog data source in Immuta, all existing grants and policies on that object will be removed in Databricks for all Immuta users. All existing grants and policies will be removed, regardless of whether they were set in Immuta or in Unity Catalog directly.
If a user is not registered in Immuta, Immuta will have no effect on that user's access to data in Unity Catalog. See the
Click Data in the navigation menu and select Connections.
Navigate to the connection and go to the level of data object you want to change the settings of.
Go to the Settings tab and change the Object Sync switch to the status you want:
To update the status using the API, see the .
Databricks Unity Catalog behavior
If you enable a data object and it has no subscription policy set on it, Immuta will REVOKE access to the data in Databricks for all Immuta users, even if they had been directly granted access to the table in Unity Catalog.
If you disable a Unity Catalog data source in Immuta, all existing grants and policies on that object will be removed in Databricks for all Immuta users. All existing grants and policies will be removed, regardless of whether they were set in Immuta or in Unity Catalog directly.
If a user is not registered in Immuta, Immuta will have no effect on that user's access to data in Unity Catalog. See the
Click Data in the navigation menu and select Connections.
Navigate to the connection and go to the level of data object you want to assign permissions to.
Go to the Permissions tab and click + Add Permissions.
Choose how to assign the permission:
To assign permissions using the API, see the .
Enable: The data object will be enabled until manually changed. If the data object is also a data source, policies will impact the data source.
Inherit: The data object will automatically inherit the status of the parent data object. So if it is a table data object, it will inherit the status of the parent schema data object.
Review your changes and click Save Changes.
Enable: All new data objects found within this data object will be registered in an enabled state. If the data object is also a data source, policies will impact the data source.
Inherit: All new data objects found within the data object will be registered as the same status as the data object.
Review your changes and click Save Changes.
Individual Users: Select this option from the dropdown and then search for individual users to grant the permission to.
Users in Group: Select this option from the dropdown and then search for groups to grant the permission to.
Choose the permission to assign:
Data Owner permission to allow them to manage a data object and its child objects.
Review your changes and click Grant Permissions.
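Permission assignment can also be done through the settings endpoint mentioned in the API changes section (`PUT /data/settings/{objectPath*}` with `settings: dataOwners`). The payload shape below is an assumption for illustration; consult the Immuta API documentation for your version before relying on it.

```python
def data_owner_request(object_path, owner_ids):
    """Sketch of a request assigning Data Owner on a data object.

    object_path: the path to the object within the connection hierarchy.
    owner_ids: user identifiers to grant the permission to (hypothetical shape).
    """
    path = "/data/settings/" + object_path
    payload = {"settings": {"dataOwners": [{"type": "user", "id": i} for i in owner_ids]}}
    return "PUT", path, payload
```

Because Data Owner is inherited, granting it on a schema path also covers the tables within that schema.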
If you attempted the upgrade and receive the message that your upgrade is Partially Complete, find the un-upgraded data sources by navigating to the Upgrade Manager and clicking the number in the Available column for the relevant connection.
Use the options below to resolve those un-upgraded data sources in order to finish your upgrade. See the linked how-tos for more details on the actions to take.
Note that these un-upgraded data sources still exist and are still protected by policy.
Delete the un-upgraded data sources: The easiest solution is to delete the data sources that did not upgrade. Note that disabled data sources that no longer exist in your data platform will never be upgraded. Only do this if you no longer need these data sources in Immuta.
Expand privileges in Snowflake or Databricks (recommended): Extend the Immuta system user's privileges in your data platform by granting it access to all remaining un-upgraded data sources.
Change the system user credentials used by Immuta: You can also provide Immuta with a different set of credentials that already have the required privileges on the un-upgraded data sources.
Ensure that the role you specified in the upgrade has the required privileges to register a Snowflake connection and has been granted to the Immuta system user.
Ensure the Databricks service principal you created and connected with Immuta has the required privileges to register a Databricks Unity Catalog connection.
Schema monitoring must be turned off in the schema project to disable and delete data sources that did not upgrade.
View the data sources that were not upgraded
Find the un-upgraded data sources by navigating to the Upgrade Manager and clicking the number in the Available column.
Disable the data sources
From this data source list page, disable all the data sources to delete.
Check the top checkbox in the data source list table. Deselect the checkbox for any data sources you do not want to delete.
Click More Actions.
Click Disable and then Confirm.
Delete the data sources
From this data source list page, delete the data sources.
Check the top checkbox in the data source list table. Deselect the checkbox for any data sources you do not want to delete.
Click More Actions.
Finalize the upgrade
Once the un-upgraded data sources are deleted, you should be able to complete the upgrade.
Navigate to the Upgrade Manager.
Click Finalize.
Check your role privileges
To find the role you specified, do the following in the Immuta UI:
Navigate to Connections.
Select the connection you are trying to upgrade.
Navigate to the Connections tab.
Note the Role listed for the connection.
Now, ensure that role has the required privileges for each data source that was not successfully upgraded. Add the privileges where needed.
Grant your role to the system account
To find the system account you specified, do the following in the Immuta UI:
Navigate to Connections.
Select the connection you are trying to upgrade.
Run object sync
Navigate to Connections.
Click on the more actions menu for the connection you are trying to upgrade.
Select Run Object Sync.
Finalize the upgrade
Once the un-upgraded data sources are resolved, you can complete the upgrade.
Navigate to the Upgrade Manager.
Click Finalize.
Check your service principal privileges
To find the service principal you specified, do the following in the Immuta UI:
Navigate to Connections.
Select the connection you are trying to upgrade.
Navigate to the Connections tab.
Now, ensure that service principal has the required privileges for each data source that was not successfully upgraded. Add the privileges where needed.
Run object sync
Navigate to Connections.
Click on the more actions menu for the connection you are trying to upgrade.
Select Run Object Sync.
Finalize the upgrade
Once the un-upgraded data sources are resolved, you can complete the upgrade.
Navigate to the Upgrade Manager.
Click Finalize.
If you have another set of credentials on hand with wider privileges, you can edit the connection to use these credentials instead to resolve the un-upgraded data sources.
Edit the connection
Navigate to Connections.
Select the connection you are trying to upgrade.
Navigate to the Connections tab.
Click Edit and then Next.
Enter the new credentials in the textbox and continue to the end to save.
Run object sync
Navigate to Connections.
Click on the more actions menu for the connection you are trying to upgrade.
Select Run Object Sync.
Finalize the upgrade
Once the un-upgraded data sources are resolved, you can complete the upgrade.
Navigate to the Upgrade Manager.
Click Finalize.
See the Setup: Username.
Now, in Snowflake, grant the role to the system account:
Click the checkbox to Also scan all disabled data objects.
Click Run Object Sync.
Now, navigate back to the Upgrade Manager tab, and if all your data sources are successfully upgraded, finalize the upgrade.
GRANT ROLE <name> TO USER <user_name>;
Register a connection:
Snowflake: Register a connection with a Snowflake account and register the data objects within it.
Databricks Unity Catalog: Register a connection with a Databricks Unity Catalog metastore and register the data objects within it.
Manage a connection:
Connection settings: Change the object sync settings and manage user permissions for the connection.
Run object sync: Trigger object sync manually for the entire connection or a single object to sync your remote data platform objects with Immuta.
Upgrade to connections: Complete the upgrade path from the existing integrations and data sources to a connection.
Connections reference guide: This reference guide discusses the major concepts, design, and settings of connections.
Upgrade to connections reference guide: This reference guide discusses the differences when upgrading from the existing integrations and data sources to a connection.
The following endpoints have been deprecated with connections. Use the recommended endpoint instead.
If you have any automated actions using the following APIs, make the required changes after the upgrade to ensure they continue working as expected.
Create a single data source
POST /{technology}/handler
POST /api/v2/data
Step 1: Ensure your system user has been granted access to the relevant object in the data platform.
Step 2: Wait until the next object sync or manually trigger a metadata crawl using POST /data/crawl/{objectPath*}.
Step 3: If the parent schema has activateNewChildren: false, call PUT /data/settings/{objectPath*} with settings: isActive: true.
Bulk create data sources
POST /{technology}/handler
POST /api/v2/data
Step 1: Ensure your system user has been granted access to the relevant object in the data platform.
Step 2: Wait until the next object sync or manually trigger a metadata crawl using POST /data/crawl/{objectPath*}.
Step 3: If the parent schema has activateNewChildren: false, call PUT /data/settings/{objectPath*} with settings: isActive: true.
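The replacement flow above (crawl, then activate) can be sketched as request builders. Only the method and path come from this document; the base URL, authentication, and any response handling are deliberately omitted because they depend on your deployment.

```python
def crawl_request(object_path):
    """Trigger a metadata crawl for an object (replaces the handler endpoints)."""
    return "POST", "/data/crawl/" + object_path

def activate_request(object_path):
    """Enable the object after the crawl, if the parent schema does not
    auto-activate new children (activateNewChildren: false)."""
    return "PUT", "/data/settings/" + object_path, {"settings": {"isActive": True}}
```

An automation that previously POSTed to /{technology}/handler would instead issue the crawl request and, when needed, follow up with the activate request.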
Edit a data source connection
POST /api/v2/data
No substitute. Data sources no longer have their own separate connection details but are tied to the parent connection.
Bulk edit data sources' connections
PUT /{technology}/bulk
POST /api/v2/data
PUT /{technology}/handler/{handlerId}
No substitute. Data sources no longer have their own separate connection details but are tied to the parent connection.
Run schema detection (object sync)
PUT /dataSource/detectRemoteChanges
Delete a data source
DELETE /dataSource/{dataSourceId}
Bulk delete data sources
PUT /dataSource/bulk/{delete}
DELETE /api/v2/data/{connectionKey}
DELETE /{technology}/handler/{handlerId}
DELETE /dataSource/{dataSourceId}
Enable a single data source
PUT /dataSource/{dataSourceId}
PUT /data/settings/{objectPath*} with settings: isActive: true
Bulk enable data sources
PUT /dataSource/bulk/{restore}
PUT /data/settings/{objectPath*} with settings: isActive: true
Disable a single data source
PUT /dataSource/{dataSourceId}
PUT /data/settings/{objectPath*} with settings: isActive: false
Bulk disable data sources
PUT /dataSource/bulk/{disable}
PUT /data/settings/{objectPath*} with settings: isActive: false
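The enable and disable replacements above collapse into a single settings call keyed by object path instead of data source ID. A minimal sketch (request shape assumed, deployment details omitted):

```python
def set_active_request(object_path, active):
    """Enable (active=True) or disable (active=False) a data object.

    Replaces the per-data-source PUT /dataSource/{dataSourceId} and the
    bulk restore/disable endpoints with one path-addressed settings call.
    """
    return "PUT", "/data/settings/" + object_path, {"settings": {"isActive": active}}
```

Bulk behavior falls out of the hierarchy: targeting a schema path changes the status of the objects beneath it, subject to the inheritance rules described elsewhere in this guide.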
Edit a data source name
PUT /dataSource/{dataSourceId}
No substitute. Data source names are automatically generated based on information from your data platform.
Edit a display name
POST /api/v2/data/{connectionKey}
No substitute. Data sources no longer have their own separate connection details but are tied to the parent connection.
Override a host name
PUT /dataSource/{dataSourceId}/overrideHost
No substitute. Data sources no longer have their own separate connection details but are tied to the parent connection.
Create an integration/connection
POST /integrations
Update an integration/connection
PUT /integrations/{integrationId}
Delete an integration/connection
DELETE /integrations/{integrationId}
Delete and update a data dictionary
DELETE /dictionary/{dataSourceId}
POST /dictionary/{dataSourceId}
PUT /dictionary/{dataSourceId}
No substitute. Data source dictionaries are automatically generated based on information from your data platform.
Update a data source owner
PUT /dataSource/{dataSourceId}/access/{id}
DELETE /dataSource/{dataSourceId}/unsubscribe
PUT /data/settings/{objectPath*} with settings: dataOwners
Respond to a data source owner request
POST /subscription/deny
POST /subscription/deny/bulk
PUT /data/settings/{objectPath*} with settings: dataOwners
Search for a data source
Data source names will change with the upgrade. Update {dataSourceName} in the request with the new data source name.
Data source names will change with the upgrade. Update the searchText in the payload with the new data source name.
Apply identification frameworks to data sources
Data source names will change with the upgrade. Update the sources in the payload with the new data source names.
Run SDD on data sources
Data source names will change with the upgrade. Update the sources in the payload with the new data source names.
Search schema names
This endpoint will not search the schemas of connection data sources. Instead use the .
Connections allow you to register your data objects in a technology through a single connection, making data registration more scalable for your organization. Instead of registering schema and databases individually, you can register them all at once and allow Immuta to monitor your data platform for changes so that data sources are added and removed automatically to reflect the state of data on your data platform.
This document is meant to guide you to connections from a configured integration. If you are a new user without any current integrations, see the registration how-to guides instead.
Do not upgrade to connections if you meet any of the criteria below:
You are using the Databricks Spark integration
You are using the workspace-catalog binding capability with Databricks Unity Catalog
Integrations are now connections. Once the upgrade is complete, you will control most integration settings at the connection level via the Connections tab in Immuta.
Integrations are set up from the Immuta app settings page or via the API. These integrations establish a relationship between Immuta and your data platform for policy orchestration. Then tables are registered as data sources through an additional step with separate credentials. Schemas and databases are not reflected in the UI.
Integrations and data sources are set up together with a single connection per account between Immuta and your data platform. Based on the privileges granted to the Immuta system user, metadata from databases, schemas, and tables is automatically pulled into Immuta and continuously monitored for any changes.
Snowflake OAuth
Username and password
Key pair
Personal Access Token
M2M OAuth
Unsupported technologies
The following technologies are not yet supported with connections:
Azure Synapse Analytics
Databricks Spark
Google BigQuery
Redshift
S3
Starburst (Trino)
Additional connection string options
When registering data sources using the legacy method, there is a field for Additional Connection String Options that your Immuta representative may have instructed you to use. If you did enter any additional connection information there, check to ensure the information you included is supported with connections. Only the following Additional Connection String Options input is supported:
Snowflake data sources with the private key file password set using Additional Connection String Options.
The tables below outline Immuta features, their availability with integrations, and their availability with connections.
Snowflake
Snowflake lineage: Supported with integrations; supported with connections.
Query audit: Supported with integrations; supported with connections.
Tag ingestion: Supported with integrations; supported with connections.
Connection tags: Not supported with integrations; supported with connections.
Databricks Unity Catalog
Query audit: Supported with integrations; supported with connections.
Tag ingestion: Supported with integrations; supported with connections.
Connection tags: Not supported with integrations; supported with connections.
Workspace-catalog binding: Supported with integrations; not supported with connections.
There will be no policy downtime on your data sources while performing the upgrade.
See the integration's reference guide for the supported object types for each technology:
With connections, your data sources are ingested and presented to reflect the infrastructure hierarchy of your connected data platform. For example, this is what the new hierarchy will look like for a Snowflake connection:
Integration: Databases and schemas are not represented; tables are registered as data sources.
Connection: Databases and schemas are represented as data objects; tables are data objects that, once enabled, become data sources available for policy enforcement.
Connections will not change any tags currently applied on your data sources.
If you want all data objects from connections to have data tags ingested from the data provider into Immuta, ensure the credentials provided on the Immuta app settings page for the external catalog feature can access all the data objects. Any data objects the credentials do not have access to will not be tagged in Immuta. In practice, it is recommended to just use the same credentials for the connection and tag ingestion.
If you previously ingested data sources using the V2 /data endpoint, this limitation applies to you.
The V2 /data endpoint allows users to register data sources and attach a tag automatically when the data sources are registered in Immuta.
The V2 /data endpoint is not supported with a connection, and there is no substitution for this behavior at this time. If you require default tags for newly onboarded data sources, please reach out to your Immuta support professional before upgrading.
With integrations:
APPLICATION_ADMIN: Configure the integration. Applies at the integration level.
CREATE_DATA_SOURCE: Register tables. Applies at the data source level.
Data owner: Manage data sources. Applies at the data source level.
With connections:
APPLICATION_ADMIN: Register the connection. Applies at the connection, database, schema, and data source levels.
GOVERNANCE or APPLICATION_ADMIN: Manage all connections. Applies at the connection, database, schema, and data source levels.
Data owner: Manage data objects. Applies at the connection, database, schema, and data source levels.
Schema monitoring is renamed to object sync with connections, as it can also monitor for changes at the database and connection levels.
During object sync, Immuta crawls your connection to ingest metadata for every database, schema, and table that the Snowflake role or Databricks account credentials you provided during configuration have access to. Upon completion of the upgrade, each table's state depends on your previous schema monitoring settings:
If you had schema monitoring enabled on a schema: All tables from that schema will be registered in Immuta as enabled data sources.
If you had schema monitoring disabled on a schema: All tables from that schema (that were not already registered in Immuta) will be registered as disabled data objects. They are visible from the Data Objects tab in Immuta, but are not listed as data sources until they are enabled.
After the initial upgrade, object sync runs on your connection every 24 hours (at 1:00 AM UTC) to keep your tables in Immuta in sync. You can also run object sync manually via the UI or API.
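The fixed schedule can be illustrated with a short calculation of the next run time. This is only a sketch of the documented schedule (daily at 1:00 AM UTC), not code that interacts with Immuta.

```python
# Sketch: compute the next daily object sync run at 1:00 AM UTC,
# mirroring the documented (non-configurable) schedule.
from datetime import datetime, timedelta, timezone


def next_object_sync(now: datetime) -> datetime:
    run = now.replace(hour=1, minute=0, second=0, microsecond=0)
    if run <= now:
        run += timedelta(days=1)  # today's run already happened
    return run


now = datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
print(next_object_sync(now))  # 2024-05-02 01:00:00+00:00
```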
With integrations, many settings for data sources, including schema monitoring and connection details, were controlled in the schema project. With connections, that functionality is no longer needed, and you can manage connection details in one central place.
Schema project owners
With integrations, schema project owners could manage schema monitoring, control connection settings, and manage subscription policies on the schema project.
These schema project owners are not carried over to connections; if you want them to have similar abilities, you must make them data owners on the schema.
Object sync provides additional controls compared to schema monitoring:
Object status: Connections, databases, schemas, and tables can be marked enabled or disabled; enabling a table makes it appear as a data source. By default, these statuses are inherited by all lower objects, but the inherited status can be overridden. For example, if you disable a database, all schemas and tables within that database inherit the disabled status. However, if you want one of those tables to be a data source, you can manually enable it.
Enable new data objects: This setting controls the state in which new objects are registered in Immuta when object sync finds them.
Enable: New data objects found by object sync are automatically enabled, and tables are registered as data sources.
Disable: This is the default. New data objects found by object sync are disabled.
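The inheritance rules above can be sketched as a small tree model. This is a conceptual illustration, not Immuta code: each object either has an explicit enabled/disabled status or inherits its parent's, and the default for brand-new objects is disabled.

```python
# Conceptual model of object status inheritance with manual overrides.
# None means "inherit from parent"; the root default mirrors the documented
# behavior that new objects are disabled unless enabled.
from __future__ import annotations
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataObject:
    name: str
    explicit_enabled: Optional[bool] = None  # None = inherit
    parent: Optional["DataObject"] = None
    children: List["DataObject"] = field(default_factory=list)

    def add(self, child: "DataObject") -> "DataObject":
        child.parent = self
        self.children.append(child)
        return child

    @property
    def enabled(self) -> bool:
        if self.explicit_enabled is not None:
            return self.explicit_enabled
        if self.parent is not None:
            return self.parent.enabled
        return False  # default: new objects are disabled


conn = DataObject("connection", explicit_enabled=True)
db = conn.add(DataObject("analytics", explicit_enabled=False))  # disable database
schema = db.add(DataObject("public"))          # inherits: disabled
orders = schema.add(DataObject("orders"))
orders.explicit_enabled = True                  # manual override: data source
```

Here the disabled database cascades down to its schema, but the manually enabled table still surfaces as a data source, matching the override behavior described above.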
| | Schema monitoring and column detection | Object sync |
|---|---|---|
| Where to turn on | Enable (optionally) when configuring a data source | Enabled by default |
| Where to update the feature | Enable or disable from the schema project | Object sync cannot be disabled |
| Default schedule | Every 24 hours | Every 24 hours (at 1:00 AM UTC) |
Connections use a new architectural pattern that improves performance when monitoring for metadata changes in your data platform, particularly with large numbers of data sources. The following scenarios are regularly tested in an isolated environment to provide a benchmark. Note that these numbers can vary based on factors such as (but not limited to) the number and type of policies applied, overall API and user activity in the system, and connection latency to your data platform.
With integrations, users had to manually create the schema monitoring job in Databricks. With connections, this job is fully automated, and that step is no longer necessary.
Consolidating integration setup and data source registration into a single connection significantly simplifies programmatic interaction with the Immuta API. Actions that used to be managed through multiple different endpoints can now be performed through a single, standardized endpoint. As a result, several API endpoints are blocked once you upgrade your connection.
All blocked endpoints return an error indicating "400 Bad Request - [...]. Use the /data endpoint." This error means you need to update any processes calling the Immuta APIs to use the new /data endpoint instead. For details, see the API changes page.
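Automation that still calls a blocked endpoint can detect this error and fail with actionable guidance. The sketch below only checks for the documented error text; the exact response shape is an assumption, and `check_blocked` is a hypothetical helper, not part of any Immuta client library.

```python
# Hypothetical guard for scripts calling the Immuta API after an upgrade:
# blocked endpoints return a 400 whose message points at the /data endpoint;
# surface that as a clear migration error instead of a generic failure.
def check_blocked(status_code: int, message: str) -> None:
    if status_code == 400 and "Use the /data endpoint" in message:
        raise RuntimeError(
            "Endpoint blocked after upgrading to connections; "
            "migrate this call to the /data endpoint."
        )


check_blocked(200, "ok")  # fine, no exception
try:
    check_blocked(400, "400 Bad Request - deprecated. Use the /data endpoint.")
except RuntimeError as exc:
    print(f"caught: {exc}")
```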
The following features are not supported with connections:

| Feature | Snowflake | Databricks Unity Catalog |
|---|---|---|
| Project workspaces | Not supported | Not supported |
| User impersonation | Not supported | Not supported |
| | Schema monitoring and column detection | Object sync |
|---|---|---|
| Can you adjust the default schedule? | No | No |
| New tags applied automatically | For a data source being created, a column being added, or a column type being updated on an existing data source | For a column being added or a column type being updated on an existing data source |
| Scenario | Average duration |
|---|---|
| Scenario 1: Running object sync on a schema with 10,000 data sources with 50 columns each | 172.2 seconds |
| Scenario 2: Running object sync on a schema with 1,000 data sources with 10 columns each | 9.38 seconds |
| Scenario 3: Running object sync on a schema with 1 data source with 50 columns | 0.512 seconds |
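For a rough sense of scale, the benchmark figures above imply the following throughput. These are derived values (simple division of the quoted numbers), not published metrics.

```python
# Derived throughput from the benchmark scenarios (data sources per second).
scenarios = {
    "10,000 data sources, 50 columns each": (10_000, 172.2),
    "1,000 data sources, 10 columns each": (1_000, 9.38),
    "1 data source, 50 columns": (1, 0.512),
}
for label, (count, seconds) in scenarios.items():
    print(f"{label}: {count / seconds:.1f} data sources/sec")
```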