GitHub - KRLY05/etl_example

ETL job example

An example of a simple ETL job.

The job transfers data from Postgres to Redshift.

Requirements

Docker
Docker Compose
Redshift
S3: create a bucket named etl (or set your bucket name in config.py)

Deploy

Fill in etl.env with your AWS and Redshift credentials
Run make deploy

Execute job

To execute the whole job, run make run_etl

The ETL job consists of the following steps, each of which can be triggered separately:

Creating tables in source database (postgres)

make create_tables
Seeding source database with random data (seed size can be adjusted in config.py)

make seed_db
Exporting data from source database to local csv file

make export_csv
Uploading file to s3 bucket

make upload_s3
Copying data from s3 to Redshift

make copy_rs
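
The data-handling side of these steps can be sketched in plain Python. This is only an illustration, not the repository's code: the table name, column names, file key, and IAM role ARN are hypothetical placeholders, and the sketch assumes Redshift loads the file with a COPY statement authorized by an IAM role (the repository may pass AWS keys from etl.env instead). The S3 upload itself is left out because it needs an AWS client library.

```python
import csv
import io
import random
import string


def random_rows(n, seed=None):
    """Generate n rows of random data as (id, name, value) tuples.

    The three-column layout is a hypothetical schema for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(1, n + 1):
        name = "".join(rng.choices(string.ascii_lowercase, k=8))
        rows.append((i, name, rng.randint(0, 1000)))
    return rows


def rows_to_csv(rows, header):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def build_copy_statement(table, bucket, key, iam_role):
    """Build a Redshift COPY statement for a CSV file stored in S3.

    IGNOREHEADER 1 skips the header line written by rows_to_csv.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )
```

For example, `build_copy_statement("events", "etl", "export.csv", role_arn)` yields a statement that reads `s3://etl/export.csv`, where `events`, `export.csv`, and `role_arn` are assumed names. Building the statement separately from executing it keeps the step easy to inspect before it is run against the cluster.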

PySpark job example

Simple example of word count job using PySpark

Execute job

The easiest way to demonstrate the PySpark script is to run it in a PySpark Docker container with Jupyter notebook:

Start the container: make pyspark
Go to http://127.0.0.1:8888, adding the token as instructed in the terminal output
Open the notebook work/wordcount.ipynb
Copy the input data file into the ./pyspark folder
Run the code in the notebook
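
For orientation, a word count in PySpark usually follows the pattern below. This is a minimal sketch and not necessarily what wordcount.ipynb contains: the tokenization rule, the function names, and the input path are assumptions. The `word_count` function expects an existing SparkSession, such as the `spark` object typically available in a PySpark notebook.

```python
import re
from operator import add


def tokenize(line):
    """Split a line into lowercase words (assumed rule: letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", line.lower())


def word_count(spark, path):
    """Count word occurrences in a text file using the RDD API.

    spark: an active SparkSession
    path:  location of the input text file (hypothetical, e.g. "work/input.txt")
    Returns a list of (word, count) pairs.
    """
    lines = spark.sparkContext.textFile(path)
    counts = (
        lines.flatMap(tokenize)          # one record per word
        .map(lambda word: (word, 1))     # pair each word with a count of 1
        .reduceByKey(add)                # sum the counts per word
    )
    return counts.collect()
```

In a notebook cell this could be called as `word_count(spark, "work/input.txt")`, with the path replaced by wherever the input file was copied.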

SQL and JQ example

Directory sql_and_jq contains:

Examples of DDL and a SELECT query for the apps table
An example of a JSON transform script using JQ

