
Data Source Request Payload Examples

Audience: Data Engineers

Content Summary: This page contains example request payloads for creating data sources.

Basic Data Source

connectionKey: my-databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: tpc
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/11101101
  handler: Databricks
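
The payloads on this page write the password as a `${...}` placeholder rather than a literal value. The page does not say how that placeholder is resolved, so the following is only a minimal sketch, assuming the payload has already been loaded into a Python dictionary and that the value should come from an environment variable of the same name before the request is sent. The helper name `expand_placeholders` is hypothetical and not part of any Immuta tooling.

```python
import os
from string import Template


def expand_placeholders(value, env=None):
    """Return a copy of `value` with ${NAME} placeholders filled from `env`.

    Assumption: placeholders map to environment variables of the same name.
    A missing variable raises KeyError instead of sending a blank password.
    Note that string.Template treats "$" specially, so a literal dollar sign
    in a value must be written as "$$".
    """
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    if isinstance(value, str):
        return Template(value).substitute(env)
    # Numbers and booleans (port, ssl) pass through unchanged.
    return value
```

Failing on a missing variable is a deliberate choice here: a request with an unexpanded or empty password would otherwise fail later with a less obvious authentication error.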

Data Source (with More Options)

connectionKey: my-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: data
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-11123
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: data
    tags:
      table:
        - PCI
        - SENSITIVE
      columns:
        - columnName: transaction_date
          tags:
            - PCI
            - DATE
  - table: crime_data
    schema: data
    naming:
      datasource: Crime Data
      table: crime_data
      schema: databricks

Databricks Data Source (Override Naming Convention)

connectionKey: ebock-databricks
nameTemplate:
  dataSourceFormat: Databricks <Tablename>
  tableFormat: <tablename>
  schemaFormat: databricks
connection:
  hostname: your.databricks.hostname.com
  port: 443
  ssl: true
  database: ebock
  username: token
  password: "${DATABRICKS_PASSWORD}"
  httpPath: sql/protocolv1/o/0/1110-185737-wove
  handler: Databricks
sources:
  - table: credit_card_transactions
    schema: ebock
  - table: crime_data_delta
    schema: ebock
    naming:
      datasource: Crime Data
      table: crime_data
      schema: databricks
  - table: hipaa_data
    schema: ebock

Impala Data Source (with userFile)

connectionKey: cdh-impala
nameTemplate:
  dataSourceFormat: Impala <Tablename>
  tableFormat: <tablename>
  schemaFormat: impala
connection:
  hostname: your.hadoop.hostname.example.com
  port: 21050
  ssl: true
  database: default
  handler: Apache Impala
  authenticationMethod: kerberos
  username: usera
  userFiles:
    - keyName: TrustedCerts
      content: <Base64 encoded contents of file go here>
      userFilename: tls-ca-bundle.pem
sources:
  - table: medical_records_parquet
    schema: default
  - table: nyc_taxi_fare_parquet
    schema: default
  - table: nyc_taxi_trip_parquet
    schema: default
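
The `content` field under `userFiles` in the Impala payload expects the Base64-encoded contents of the file. One way to produce that value is sketched below, assuming standard (not URL-safe) Base64 and a certificate bundle on local disk. The helper name `encode_user_file` is hypothetical; the dictionary keys mirror the ones shown in the payload.

```python
import base64
from pathlib import Path


def encode_user_file(path, key_name):
    """Build one userFiles entry from a local file.

    Assumption: `content` is standard Base64 of the raw file bytes.
    """
    file_path = Path(path)
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return {
        "keyName": key_name,
        "content": encoded,
        "userFilename": file_path.name,
    }
```

For example, `encode_user_file("tls-ca-bundle.pem", "TrustedCerts")` returns an entry that can be placed in the `userFiles` list in place of the placeholder text.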

Redshift Spectrum Data Sources

The nativeSchemaFormat value in nameTemplate must contain _immuta to avoid schema name conflicts.

connectionKey: redshift
connection:
  hostname: your-redshift-cluster.djie25k.us-east-1.redshift.amazonaws.com
  port: 5439
  ssl: true
  database: your_database_with_external_schema
  username: awsuser
  password: your_password
  handler: Redshift
  schema: external_schema
nameTemplate:
  dataSourceFormat: <Tablename>
  schemaFormat: <schema>
  tableFormat: <tablename>
  schemaProjectNameFormat: <Schema>
  nativeSchemaFormat: <schema>_immuta
  nativeViewFormat: <tablename>
sources:
  - all: true
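
The `_immuta` requirement on `nativeSchemaFormat` for Redshift Spectrum can be checked locally before a payload is submitted. This is a minimal sketch, assuming the `nameTemplate` section has been loaded into a Python dictionary. The function name `check_native_schema_format` is hypothetical, and the check covers only the substring rule stated on this page, not any other validation the API may perform.

```python
def check_native_schema_format(name_template):
    """Raise ValueError if nativeSchemaFormat lacks the "_immuta" marker.

    Only the rule stated on this page is checked: the format string
    must contain "_immuta" somewhere.
    """
    native_format = name_template.get("nativeSchemaFormat", "")
    if "_immuta" not in native_format:
        raise ValueError(
            "nativeSchemaFormat must contain '_immuta', got: %r" % native_format
        )
```

With the payload above, `check_native_schema_format({"nativeSchemaFormat": "<schema>_immuta"})` returns without error, while a format of `<schema>` alone raises `ValueError`.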

Snowflake Data Source (Specify Sources)

connectionKey: tpc-snowflake
nameTemplate:
  dataSourceFormat: Snowflake <Tablename>
  tableFormat: <tablename>
  schemaFormat: snowflake
connection:
  hostname: example.hostname.snowflakecomputing.com
  port: 443
  ssl: true
  database: TPC
  username: USERA
  password: "${SNOWFLAKE_PASSWORD}"
  schema: PUBLIC
  warehouse: IT_WH
  handler: Snowflake
sources:
  - table: CASE
    schema: PUBLIC
  - table: CASE2
    schema: PUBLIC
  - table: CUSTOMER
    schema: PUBLIC
  - table: WEB_SALES
    schema: PUBLIC