Data Source Request Payload Examples
Basic Data Source
connectionKey: my-databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: tpc
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/11101101
  handler: Databricks
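The password fields in these payloads hold a `${DATABRICKS_PASSWORD}`-style placeholder rather than a literal secret. How that placeholder is resolved depends on the tool that submits the payload. As a minimal sketch, assuming you need to fill it in yourself from environment variables before sending, the standard library's `string.Template` can do the substitution. The helper name `expand_placeholders` is hypothetical and not part of any Immuta tooling.

```python
import os
from string import Template


def expand_placeholders(payload_text, env=None):
    """Replace ${NAME} placeholders with values from env (defaults to os.environ)."""
    values = os.environ if env is None else env
    # substitute() raises KeyError when a placeholder has no value, so a
    # missing secret fails loudly instead of being sent as literal text.
    return Template(payload_text).substitute(values)
```

Calling `expand_placeholders(text)` with no second argument reads from the process environment; passing a dictionary is convenient for testing.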
Data Source (with More Options)
connectionKey: my-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-11123
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: data
    tags:
      table:
        - PCI
        - SENSITIVE
      columns:
        - columnName: transaction_date
          tags:
            - PCI
            - DATE
  - table: crime_data
    schema: data
    naming:
      datasource: Crime Data
      table: crime_data
      schema: databricks
Databricks Data Source (Override Naming Convention)
connectionKey: ebock-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: ebock
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-185737-wove
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: ebock
  - table: crime_data_delta
    schema: ebock
    naming:
      datasource: Crime Data
      table: crime_data
      schema: databricks
  - table: hipaa_data
    schema: ebock
Impala Data Source (with userFile)
connectionKey: cdh-impala
nameTemplate:
  dataSourceFormat: Impala <Tablename>
  tableFormat: <tablename>
  schemaFormat: impala
connection:
  hostname: your.hadoop.hostname.example.com
  port: 21050
  ssl: true
  database: default
  handler: Apache Impala
  authenticationMethod: kerberos
  username: usera
  userFiles:
    - keyName: TrustedCerts
      content: <Base64 encoded contents of file go here>
      userFilename: tls-ca-bundle.pem
sources:
  - table: medical_records_parquet
    schema: default
  - table: nyc_taxi_fare_parquet
    schema: default
  - table: nyc_taxi_trip_parquet
    schema: default
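The `content` field of a `userFiles` entry expects the file's contents Base64 encoded. A minimal sketch of producing that value with Python's standard library follows; the helper names `encode_user_file` and `build_user_file_entry` are hypothetical and shown for illustration only.

```python
import base64
from pathlib import Path


def encode_user_file(path):
    """Return the file's raw bytes as a Base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_user_file_entry(key_name, path):
    """Build one userFiles entry using the keys shown in the payload above."""
    return {
        "keyName": key_name,
        "content": encode_user_file(path),
        # Only the file name is kept, not the local directory.
        "userFilename": Path(path).name,
    }
```

For the payload above, `build_user_file_entry("TrustedCerts", "tls-ca-bundle.pem")` would produce the entry to place under `userFiles`, assuming the certificate bundle sits in the current directory.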
Redshift Spectrum Data Sources
Your nativeSchemaFormat must contain _immuta to avoid schema name conflicts.
connectionKey: redshift
connection:
  hostname: your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com
  port: 5439
  ssl: true
  database: your_database_with_external_schema
  username: awsuser
  password: your_password
  handler: Redshift
  schema: external_schema
nameTemplate:
  dataSourceFormat: <Tablename>
  schemaFormat: <schema>
  tableFormat: <tablename>
  schemaProjectNameFormat: <Schema>
  nativeSchemaFormat: <schema>_immuta
  nativeViewFormat: <tablename>
sources:
  - all: true
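The requirement that nativeSchemaFormat contain _immuta is easy to check before a payload is submitted. The sketch below assumes the payload has already been parsed into a Python dictionary; the function name `check_native_schema_format` is hypothetical, and the check covers only this one rule, not the rest of the payload.

```python
def check_native_schema_format(payload):
    """Raise ValueError if nameTemplate.nativeSchemaFormat lacks '_immuta'."""
    name_template = payload.get("nameTemplate") or {}
    native_format = name_template.get("nativeSchemaFormat", "")
    if "_immuta" not in native_format:
        raise ValueError(
            "nativeSchemaFormat must contain '_immuta', got: %r" % native_format
        )
    return native_format
```

With the Redshift Spectrum payload above, the function returns `<schema>_immuta`; a template such as `<schema>` alone raises `ValueError`.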
Snowflake Data Source (Specify Sources)
connectionKey: tpc-snowflake
nameTemplate:
  dataSourceFormat: Snowflake <Tablename>
  tableFormat: <tablename>
  schemaFormat: snowflake
connection:
  hostname: example.hostname.snowflakecomputing.com
  port: 443
  ssl: true
  database: TPC
  username: USERA
  password: "${SNOWFLAKE_PASSWORD}"
  schema: PUBLIC
  warehouse: IT_WH
  handler: Snowflake
sources:
  - table: CASE
    schema: PUBLIC
  - table: CASE2
    schema: PUBLIC
  - table: CUSTOMER
    schema: PUBLIC
  - table: WEB_SALES
    schema: PUBLIC