DBKnitter

Knitting DBs for cross-database queries 🧶

Experiments

Prerequisites:

Docker
Bash
Python3
Make (Optional)

The following instructions measure the accuracy of DBKnitter's generated cross-database programs across four query languages: SQL, the official TPC-H description, a manual English translation, and GPT's English translation.

Optionally, set up a Python virtual environment:

make virtualenv
source .venv/bin/activate

Unzip the data.

unzip tpch.zip

Acquire an API key from OpenAI: https://platform.openai.com/api-keys. You may need to add funds to your credit balance first: https://platform.openai.com/account/billing/overview.

Generate cross-database programs for 27 table-platform mappings over 22 queries across 4 baselines.

API_KEY=YOUR_API_KEY_HERE
ALL_MAPPINGS="00000000,11111111,22222222,00001111,00002222,11110000,11112222,22220000,22221111,01010101,02020202,10101010,12121212,20202020,21212121,01201201,02102102,10210210,12012012,20120120,21021021,00011122,00022211,11100022,11122200,22200011,22211100"

python dbknitter/gpt_tpch.py batch --output_dir platforms/client/source/s01/exp_sql --db_splits ${ALL_MAPPINGS} --query_language sql --api_key ${API_KEY}
python dbknitter/gpt_tpch.py batch --output_dir platforms/client/source/s01/exp_engoff --db_splits ${ALL_MAPPINGS} --query_language eng-official --api_key ${API_KEY}
python dbknitter/gpt_tpch.py batch --output_dir platforms/client/source/s01/exp_engman --db_splits ${ALL_MAPPINGS} --query_language eng-manual --api_key ${API_KEY}
python dbknitter/gpt_tpch.py batch --output_dir platforms/client/source/s01/exp_enggpt --db_splits ${ALL_MAPPINGS} --query_language eng-gpt --api_key ${API_KEY}
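As a sanity check on the numbers above, the following sketch parses the comma-separated mapping list and counts the programs a full run would generate. It assumes each mapping is an 8-character string over the digits 0, 1 and 2; reading each digit as "one TPC-H table assigned to one of three platforms" is an interpretation, not something this README states. The helper name parse_mappings is hypothetical and is not part of dbknitter.

```python
# Mapping list copied verbatim from the ALL_MAPPINGS variable above.
ALL_MAPPINGS = (
    "00000000,11111111,22222222,00001111,00002222,11110000,11112222,"
    "22220000,22221111,01010101,02020202,10101010,12121212,20202020,"
    "21212121,01201201,02102102,10210210,12012012,20120120,21021021,"
    "00011122,00022211,11100022,11122200,22200011,22211100"
)

NUM_QUERIES = 22    # queries, per the text above
NUM_BASELINES = 4   # sql, eng-official, eng-manual, eng-gpt


def parse_mappings(csv_list):
    """Split a comma-separated mapping list and check its shape.

    Assumption: every mapping is 8 characters long and uses only 0, 1, 2.
    """
    mappings = csv_list.split(",")
    for m in mappings:
        if len(m) != 8 or set(m) - set("012"):
            raise ValueError(f"unexpected mapping: {m!r}")
    if len(set(mappings)) != len(mappings):
        raise ValueError("duplicate mapping in list")
    return mappings


mappings = parse_mappings(ALL_MAPPINGS)
total_programs = len(mappings) * NUM_QUERIES * NUM_BASELINES

print(len(mappings))       # 27
print(total_programs)      # 2376
# Space-separated form, as used by the shell loops in the grading step.
print(" ".join(mappings))
```

The last line prints the same list in the space-separated form that the grading step uses, so the two variables can be kept in sync from a single source.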

Grade those cross-database programs.

ALL_MAPPINGS="00000000 11111111 22222222 00001111 00002222 11110000 11112222 22220000 22221111 01010101 02020202 10101010 12121212 20202020 21212121 01201201 02102102 10210210 12012012 20120120 21021021 00011122 00022211 11100022 11122200 22200011 22211100"

for m in ${ALL_MAPPINGS}; do echo ">>> ${m}"; bash cloudlab/grade_by_mapping.sh ${m} exp_sql; done
for m in ${ALL_MAPPINGS}; do echo ">>> ${m}"; bash cloudlab/grade_by_mapping.sh ${m} exp_engoff; done
for m in ${ALL_MAPPINGS}; do echo ">>> ${m}"; bash cloudlab/grade_by_mapping.sh ${m} exp_engman; done
for m in ${ALL_MAPPINGS}; do echo ">>> ${m}"; bash cloudlab/grade_by_mapping.sh ${m} exp_enggpt; done
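The four loops above differ only in the experiment directory, so they can also be expressed as one nested loop. The sketch below only builds and prints the commands (a dry run); the function name grading_commands is hypothetical, and the script path and argument order are taken from the shell loops above.

```python
import shlex

# Experiment directories used in the generation step.
EXPERIMENTS = ["exp_sql", "exp_engoff", "exp_engman", "exp_enggpt"]


def grading_commands(mappings, experiments=EXPERIMENTS):
    """Return one grade_by_mapping.sh invocation per (experiment, mapping)."""
    return [
        ["bash", "cloudlab/grade_by_mapping.sh", mapping, experiment]
        for experiment in experiments
        for mapping in mappings
    ]


# Dry run with two mappings; pass the full list of 27 for a real run.
cmds = grading_commands(["00000000", "11111111"])
for cmd in cmds:
    print(shlex.join(cmd))
```

To actually run the grading from Python, each command list could be passed to subprocess.run(cmd, check=True) from the repository root.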

The grading results are written to grade_output/exp_[sql|engoff|engman|enggpt]/*/*.txt. Copy the lines starting with Score. into the corresponding places in print_plots.py. Then print the measurements and compile them with LaTeX.

python print_plots.py
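Collecting the score lines by hand across every mapping and experiment is tedious, so a small helper along these lines may be useful. It is a sketch, not part of the repository: the function name collect_score_lines is hypothetical, the directory layout is taken from the path pattern above, and it matches on the prefix "Score" without assuming anything about what follows it.

```python
import glob
import os


def collect_score_lines(root="grade_output", prefix="Score"):
    """Map each grading result file to its lines that start with `prefix`.

    Assumes the layout <root>/exp_*/<mapping>/<file>.txt described above.
    """
    results = {}
    pattern = os.path.join(root, "exp_*", "*", "*.txt")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as f:
            lines = [line.rstrip("\n") for line in f if line.startswith(prefix)]
        if lines:
            results[path] = lines
    return results


# Print every score line together with the file it came from.
for path, lines in collect_score_lines().items():
    for line in lines:
        print(f"{path}: {line}")
```

Run it from the repository root after grading; the printed lines still have to be pasted into print_plots.py manually, as described above.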

