- Load/Import csv from
ABStoSnowflake - Load/Import csv from
GCStoBigquery - Unload/Export table from
SnowflaketoABS
- Full load - destination table is truncated before load
- Incremental load - data are merged
- Primary key dedup for all engines
- Convert empty values to NULL (using convertEmptyValuesToNull option)
- Full unload - destination csv is always rewriten
Prepare .env (copy of .env.dist) and set up AWS keys which has access to keboola-drivers bucket in order to build this image. Also add this user to group ci-php-import-export-lib witch will allow you to work with newly created bucket for tests.
User can be created in Dev - Main legacy, where are also groups for keboola-drivers and ci-php-import-export-lib.
Then run docker compose build
The AWS credentials have to also have access to bucket specified in AWS_S3_BUCKET. This bucket has to contain testing data. Run docker compose run --rm dev composer loadS3 to load them up.
Run terraform to create resources for tests. See Terraform.md for details.
- Create storage account template can be found in provisioning ABS create template
- Create container in storage account
Blob service -> Containersnote: for tests this step can be skiped container is created withloadAbscmd - Fill env variables in .env file
ABS_ACCOUNT_NAME=storageAccount
ABS_ACCOUNT_KEY=accountKey
ABS_CONTAINER_NAME=containerName
- Upload test fixtures to ABS
docker compose run --rm dev composer loadAbs
-
Create bucket in GCS set bucket name in .env variable
GCS_BUCKET_NAME -
Create service account in IAM
-
In bucket permissions grant service account admin access to bucket
-
Create new service account key
-
Convert key to string
awk -v RS= '{$1=$1}1' <key_file>.json >> .env(orcat file.json | jq -c | jq -R) -
Set content on last line of .env as variable
GCS_CREDENTIALS -
Upload test fixtures to GCS
docker compose run --rm dev composer loadGcs-bigqueryordocker compose run --rm dev composer loadGcs-snowflake(depending on backend)
Role, user, database and warehouse are required for tests. You can create them:
CREATE ROLE "KEBOOLA_DB_IMPORT_EXPORT";
CREATE DATABASE "KEBOOLA_DB_IMPORT_EXPORT";
GRANT ALL PRIVILEGES ON DATABASE "KEBOOLA_DB_IMPORT_EXPORT" TO ROLE "KEBOOLA_DB_IMPORT_EXPORT";
GRANT USAGE ON WAREHOUSE "DEV" TO ROLE "KEBOOLA_DB_IMPORT_EXPORT";
CREATE USER "KEBOOLA_DB_IMPORT_EXPORT"
PASSWORD = 'Password'
DEFAULT_ROLE = "KEBOOLA_DB_IMPORT_EXPORT";
GRANT ROLE "KEBOOLA_DB_IMPORT_EXPORT" TO USER "KEBOOLA_DB_IMPORT_EXPORT";
-- For GCS create storage integration https://docs.snowflake.com/en/user-guide/data-load-gcs-config.html#creating-a-custom-iam-role
CREATE STORAGE INTEGRATION "KEBOOLA_DB_IMPORT_EXPORT"
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = GCS
ENABLED = TRUE
STORAGE_ALLOWED_LOCATIONS = ('gcs://<your gcs bucket>/');
-- set integration name to env GCS_INTEGRATION_NAME in .env file
-- get service account id `STORAGE_GCP_SERVICE_ACCOUNT`
DESC STORAGE INTEGRATION "KEBOOLA_DB_IMPORT_EXPORT";
-- continue according manual ^ in snflk documentation assign roles for Data loading and unloadingInstall Google Cloud client (via Brew), initialize it and log in to generate default credentials.
To prepare the backend you can use Terraform template.
You must have the resourcemanager.folders.create permission for the organization.
# you can copy it to a folder somewhere and make an init
terraform initRun terraform apply with following variables:
- folder_id: Go to GCP Resource Manager and select your team dev folder ID (e.g. find 'KBC Team Dev' and copy ID)
- backend_prefix: your_name, all resources will create with this prefix
- billing_account_id: Go to Billing and copy your Billing account ID
terraform apply -var folder_id=<folder_id> -var backend_prefix=<your_prefix> -var billing_account_id=<billing_account_id>For missing pieces see Connection repository. After terraform apply ends go to the service project in folder created by terraform.
- convert key to string and save to
.envfile:awk -v RS= '{$1=$1}1' <key_file>.json >> .env - set content on the last line of
.envas variableBQ_KEY_FILE - set env variable
BQ_BUCKET_NAMEgenerated from TF templatefile_storage_bucket_id
Run tests with following command.
note: azure credentials must be provided and fixtures uploaded
docker compose run --rm dev composer tests
Unit and functional test can be run sepparetly
#unit test
docker compose run --rm dev composer tests-unit
#functional test
docker compose run --rm dev composer tests-functional
#phplint
docker compose run --rm dev composer phplint
#phpcs
docker compose run --rm dev composer phpcs
#phpstan
docker compose run --rm dev composer phpstan
This command will run all checks load fixtures and run tests
docker compose run --rm dev composer ci
ABS -> Snowflake import/load
use Keboola\Db\ImportExport\Backend\Snowflake\Importer;
use Keboola\Db\ImportExport\ImportOptions;
use Keboola\Db\ImportExport\Storage;
$absSourceFile = new Storage\ABS\SourceFile(...);
$snowflakeDestinationTable = new Storage\Snowflake\Table(...);
$importOptions = new ImportOptions(...);
(new Importer($snowflakeConnection))->importTable(
$absSourceFile,
$snowflakeDestinationTable,
$importOptions
);Snowflake -> Snowflake copy
use Keboola\Db\ImportExport\Backend\Snowflake\Importer;
use Keboola\Db\ImportExport\ImportOptions;
use Keboola\Db\ImportExport\Storage;
$snowflakeSourceTable = new Storage\Snowflake\Table(...);
$snowflakeDestinationTable = new Storage\Snowflake\Table(...);
$importOptions = new ImportOptions(...);
(new Importer($snowflakeConnection))->importTable(
$snowflakeSourceTable,
$snowflakeDestinationTable,
$importOptions
);Snowflake -> ABS export/unload
use Keboola\Db\ImportExport\Backend\Snowflake\Exporter;
use Keboola\Db\ImportExport\ExportOptions;
use Keboola\Db\ImportExport\Storage;
$snowflakeSourceTable = new Storage\Snowflake\Table(...);
$absDestinationFile = new Storage\ABS\DestinationFile(...);
$exportOptions = new ExportOptions(...);
(new Exporter($snowflakeConnection))->exportTable(
$snowflakeSourceTable,
$absDestinationFile,
$exportOptions
);Library consists of few simple interfaces.
Importer, Exporter Interface must be implemented in new Backed
Keboola\Db\ImportExport\Backend\ImporterInterface
Keboola\Db\ImportExport\Backend\ExporterInterface
For each backend there is corresponding adapter which supports own combination of SourceInterface and DestinationInterface. Custom adapters can be set with setAdapters method.
Storage is now file storage ABS|S3 (in future) or table storage Snowflake.
Storage can have Source and Destination which must implement SourceInterface or DestinationInterface. These interfaces are empty and it's up to adapter to support own combination.
In general there is one Import/Export adapter per FileStorage <=> TableStorage combination.
Adapter must implement:
Keboola\Db\ImportExport\Backend\BackendImportAdapterInterfacefor importKeboola\Db\ImportExport\Backend\BackendExportAdapterInterfacefor export
Backend can require own extended AdapterInterface.
MIT licensed, see LICENSE file.