
Delta Lake Writer

The component supports two access modes:

1. Direct Access to Delta Tables

Direct access to Delta tables in your blob storage. We currently support the following providers:

In this mode, the Delta table path is defined by specifying the bucket/container and blob location where the table data is stored. A sketch of what such a direct write can look like follows.
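
A minimal sketch of a direct write, assuming the open-source `deltalake` (delta-rs) Python package and an Azure Blob Storage backend. The account, container, path, and key below are placeholders, not values from this README, and the component's internal implementation may differ.

```python
# Direct write to a Delta table in blob storage (illustrative placeholders).
import pandas as pd
from deltalake import write_deltalake

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

write_deltalake(
    "abfss://my-container@myaccount.dfs.core.windows.net/tables/orders",
    df,
    mode="append",  # see Load Type under Data Destination Options
    storage_options={
        "azure_storage_account_name": "myaccount",
        "azure_storage_account_key": "<storage-account-key>",
    },
)
```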

2. Unity Catalog

Currently, we support only the Azure Blob Storage backend.

Setup Requirements:

Unity Catalog supports two table types:

  • External tables - The component writes directly to the underlying blob storage and updates the table metadata in Unity Catalog.
  • Native (Databricks) tables - Data is loaded using the selected DBX Warehouse, as in the sketch below.
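
A minimal sketch of loading rows into a native table through a Databricks SQL warehouse, assuming the `databricks-sql-connector` package. The hostname, HTTP path, token, and table name are placeholders; the component's actual loading mechanism may differ.

```python
# Load data into a native Unity Catalog table via a DBX SQL warehouse.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        # Insert into a catalog.schema.table managed by Unity Catalog.
        cursor.execute("INSERT INTO main.sales.orders VALUES (1, 'a')")
```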

Input Mapping

The component accepts either one mapped input table, or one or more Parquet files with the same schema (multiple Parquet files are not supported by the native Databricks write mode). One possible way of handling such input is sketched below.
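
A minimal sketch of combining several same-schema Parquet files and streaming them as record batches rather than loading everything into memory at once, assuming `pyarrow` and `deltalake`. File paths and the target path are placeholders.

```python
# Stream multiple Parquet files (one shared schema) into a Delta table.
import pyarrow.dataset as ds
from deltalake import write_deltalake

dataset = ds.dataset(
    ["in/files/part-0001.parquet", "in/files/part-0002.parquet"],
    format="parquet",  # all files must share the same schema
)

write_deltalake(
    "/data/out/delta/orders",
    dataset.scanner(batch_size=65_536).to_reader(),  # batching caps memory use
    mode="append",
)
```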

Data Destination Options

  • Load Type - Append, Overwrite, Upsert (supported only for native tables; uses the primary key of the input table), or Raise error when existing (supported only for external tables).
  • Columns to partition by [optional] - list of columns to partition the table by.
  • Warehouse - DBX Warehouse to use for loading data (native tables only).
  • Batch size - a larger batch increases speed but also makes out-of-memory issues more likely.
  • Preserve Insertion Order - disabling this option may help prevent out-of-memory issues.
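
For external tables, a sketch of how these options could map onto a `deltalake` write call. The option values, paths, and column names are illustrative assumptions; batch size and insertion order are handled by the component internally and have no counterpart in this call.

```python
# Destination options translated to a write call for an external table.
import pyarrow.parquet as pq
from deltalake import write_deltalake

table = pq.read_table("in/files/part-0001.parquet")

write_deltalake(
    "/data/out/delta/orders",
    table,
    mode="overwrite",             # Load Type: Append -> "append",
                                  # Overwrite -> "overwrite",
                                  # Raise error when existing -> "error"
    partition_by=["order_date"],  # Columns to partition by
)
```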
