# Data Source Health Checks Reference Guide

When an Immuta data source is created, background jobs use the connection information provided to compute health checks. Which checks run depends on the type of data source and how it was configured. These data source health checks include the following:

* **blob crawl status**: indicates whether the blob was successfully crawled.
* **column detection status**: indicates whether the job that detects columns added to or removed from the remote table registered as an Immuta data source ran successfully.
* **external catalog link status**: indicates whether or not the external catalog was successfully linked to the data source.
* **fingerprint generation status**: indicates whether or not the data source fingerprint was successfully generated.
* **framework classification status**: indicates whether classification was successfully run on the data source to determine the sensitivity of the data source.
* **global policy applied status**: indicates whether global policies were successfully applied to the data source.
* **high cardinality calculation status**: indicates whether the data source's high cardinality column was successfully calculated.
* **SQL sync status** (for Snowflake data sources): indicates whether Snowflake governance policies have been successfully synced.
* **SQL view creation status** (for Redshift data sources): indicates whether views were properly created for Redshift tables registered in Immuta.
* **row count status**: indicates whether the number of rows in the data source was successfully calculated.
* **schema detection status**: indicates whether the job that detects remote tables added to or removed from the schema ran successfully.
* **sensitive data discovery status**: indicates whether sensitive data discovery was successfully run on the data source.

After these jobs complete, the health status for each is updated to indicate whether the status check passed, was skipped, is unknown, or failed.
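The four outcomes above can be modeled as a small enumeration when reasoning about a data source's checks. The following is a minimal sketch only: `HealthStatus`, `summarize`, `needs_attention`, and the example check names are hypothetical and do not represent Immuta's API or response format. Treating "unknown" as needing attention alongside "failed" is also an assumption.

```python
from enum import Enum


class HealthStatus(Enum):
    """The four outcomes named above; the string values are illustrative."""
    PASSED = "passed"
    SKIPPED = "skipped"
    UNKNOWN = "unknown"
    FAILED = "failed"


def summarize(checks: dict[str, HealthStatus]) -> dict[str, list[str]]:
    """Group check names by status so failed or unknown checks stand out."""
    grouped: dict[str, list[str]] = {status.value: [] for status in HealthStatus}
    for name, status in checks.items():
        grouped[status.value].append(name)
    return grouped


def needs_attention(checks: dict[str, HealthStatus]) -> list[str]:
    """Return checks that failed or are unknown (an assumed triage rule)."""
    return sorted(
        name
        for name, status in checks.items()
        if status in (HealthStatus.FAILED, HealthStatus.UNKNOWN)
    )


# Hypothetical statuses for one data source.
example = {
    "row count": HealthStatus.PASSED,
    "fingerprint generation": HealthStatus.FAILED,
    "schema detection": HealthStatus.SKIPPED,
    "column detection": HealthStatus.UNKNOWN,
}
print(needs_attention(example))
```

Skipped checks are deliberately left out of `needs_attention` here, since a check may be skipped simply because it does not apply to the data source type.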

These background jobs can be disabled during data source creation by adding a specific tag that prevents automatic table statistics. A system administrator can set this prevent statistics tag on the [app settings page](/2024.3/application-settings/how-to-guides/config-builder-guide.md#prevent-automatic-table-statistics). However, when automatic table statistics are disabled, the following policies are unavailable until the data source owner [manually generates the fingerprint](/2024.3/secure-your-data/data-consumers/subscribe-to-data-source.md#manually-run-health-jobs):

* Masking with format preserving masking
* Masking with k-anonymization
* Masking using randomized response

## Unhealthy Databricks data sources

Databricks data sources may appear unhealthy because their row count queries can fail when they run against a cluster that has the Databricks query watchdog enabled.

## Limitations

Health checks are not run on data sources with over 1600 columns, yet these data sources still appear as healthy. For these data sources, health checks cannot be run either automatically or manually.
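Because of this limitation, a healthy status on a very wide data source does not mean any check actually ran. As a minimal sketch, a script that audits data sources could treat the status as unverified above the column limit. The function name and its inputs are hypothetical; how the column count and reported status are obtained is left out and is not part of any Immuta API shown here.

```python
# From the limitation above: data sources with over 1600 columns get no health checks.
COLUMN_LIMIT = 1600


def health_is_verified(column_count: int, reported_healthy: bool) -> bool:
    """Return True only when a healthy status is backed by checks that could run."""
    if column_count > COLUMN_LIMIT:
        # No checks ran, so the reported "healthy" status carries no information.
        return False
    return reported_healthy


print(health_is_verified(1600, True))
print(health_is_verified(1601, True))
```

A data source with exactly 1600 columns is still within the limit in this sketch, matching the "over 1600" wording.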

