Assertions & Relations #8
This PR adds the following functionality:
$ python src/prepare_data.py --threshold=50 --keyword_string="semantic parsing" --output_path="data/selected_paper_abstracts.jsonl"
$ python src/get_assertions_and_relations.py --task=assertions --model_name_or_path=openai-gpt-4o --input_data_path=data/selected_paper_abstracts.jsonl --output_data_path=data/generated_assertions.jsonl
$ python src/get_assertions_and_relations.py --task=relations --model_name_or_path=openai-gpt-4o --input_data_path=data/generated_assertions.jsonl --output_data_path=data/generated_relations.jsonl

EDIT: I've also added a script that works directly with OpenAI; you just need to set OPENAI_API_KEY in the environment. The usage is the same as in the examples above, only the script name changes to get_assertions_and_relations_openai.py. This way you do not need Unsloth+GPU or Grazie-Api-Gateway-Client :)

$ python src/get_assertions_and_relations_openai.py --task=relations --model_name_or_path=gpt-4o --input_data_path=data/generated_assertions.jsonl --output_data_path=data/generated_relations.jsonl

I have also uploaded some samples with 3 annotated examples (using the --debug option with src/get_assertions_and_relations.py) and can add/annotate more if needed. The examples are located in the data folder.
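
For anyone who wants to run the three steps in one go, here is a rough sketch of a driver, not part of this PR. The helper names build_pipeline and run_pipeline are hypothetical; the script paths, flags, and file names are copied from the commands above. It assumes the OpenAI variant accepts --task=assertions the same way it accepts --task=relations, which follows from "the usage is the same" but is not shown explicitly.

```python
import os
import shlex
import subprocess
import sys


def build_pipeline(keyword, threshold=50, data_dir="data", use_openai_script=False):
    """Build the three commands from the PR description as argument lists.

    Hypothetical helper: only the script names and flags come from the PR.
    """
    if use_openai_script:
        script = "src/get_assertions_and_relations_openai.py"
        model = "gpt-4o"
    else:
        script = "src/get_assertions_and_relations.py"
        model = "openai-gpt-4o"

    abstracts = f"{data_dir}/selected_paper_abstracts.jsonl"
    assertions = f"{data_dir}/generated_assertions.jsonl"
    relations = f"{data_dir}/generated_relations.jsonl"

    return [
        # Step 1: select paper abstracts matching the keyword string.
        [sys.executable, "src/prepare_data.py",
         f"--threshold={threshold}",
         f"--keyword_string={keyword}",
         f"--output_path={abstracts}"],
        # Step 2: generate assertions from the selected abstracts.
        [sys.executable, script, "--task=assertions",
         f"--model_name_or_path={model}",
         f"--input_data_path={abstracts}",
         f"--output_data_path={assertions}"],
        # Step 3: generate relations from the generated assertions.
        [sys.executable, script, "--task=relations",
         f"--model_name_or_path={model}",
         f"--input_data_path={assertions}",
         f"--output_data_path={relations}"],
    ]


def run_pipeline(commands, use_openai_script=False):
    """Run the commands in order, stopping at the first failure."""
    if use_openai_script and not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY must be set for the OpenAI script")
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_pipeline(...) to execute them.
    for cmd in build_pipeline("semantic parsing", use_openai_script=True):
        print(shlex.join(cmd))
```

Each step reads the file written by the previous one, so check=True is used to stop the chain as soon as a step fails rather than feeding a missing or partial file to the next step.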