From 2c50e04f591f67c943948626b1fdba10fe3cc1d2 Mon Sep 17 00:00:00 2001 From: GP Saggese Date: Fri, 9 May 2025 22:05:16 -0400 Subject: [PATCH 1/9] Update MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-commit checks: - 'check_master' passed - 'check_author' passed - 'check_file_size' passed - 'check_python_compile' passed - 'check_gitleaks' passed All checks passed ✅ --- docs/tools/all.llm_transform.how_to_guide.md | 98 +++++++++++++++++--- docs/tools/all.llm_transform.reference.md | 15 +++ 2 files changed, 98 insertions(+), 15 deletions(-) create mode 100644 docs/tools/all.llm_transform.reference.md diff --git a/docs/tools/all.llm_transform.how_to_guide.md b/docs/tools/all.llm_transform.how_to_guide.md index c5d424042..1cb1dd471 100644 --- a/docs/tools/all.llm_transform.how_to_guide.md +++ b/docs/tools/all.llm_transform.how_to_guide.md @@ -1,15 +1,83 @@ -# Available prompt tags: -code_1_unit_test -code_apply_refactoring -code_comment -code_docstring -code_propose_refactoring -code_review -code_review_and_improve -code_type_hints -code_unit_test -md_rewrite -md_summarize_short -slide_colorize -slide_colorize_points -slide_improve \ No newline at end of file +I'll analyze the structure of these three files and explain how they work together. Let me break this down systematically. + +1. First, let me search for relevant code to understand the structure better. + + + +Search files... +Based on the code analysis, I can explain the structure of these three files and how they work together. Here's a comprehensive breakdown: + +1. **Overall Architecture** +The system is designed to apply LLM-based transformations to code and text files, with a focus on code quality, documentation, and formatting. The architecture consists of three main components: + +2. 
**llm_transform.py** (Main Entry Point) +- This is the primary script that users interact with +- Key features: + - Handles input/output from either stdin/stdout or files + - Manages Docker container setup and execution + - Provides command-line interface for various transformations +- Main functions: + - `_parse()`: Sets up command-line argument parsing + - `_run_dockerized_llm_transform()`: Manages Docker container execution + - `_main()`: Orchestrates the overall transformation process + +3. **dockerized_llm_transform.py** (Docker Container Script) +- Runs inside a Docker container +- Key features: + - Handles the actual LLM transformation + - Manages dependencies within the container + - Processes input/output files +- Main functions: + - `_parse()`: Parses command-line arguments + - `_main()`: Executes the transformation using LLM prompts + +4. **llm_prompts.py** (Prompt Management) +- Contains all the transformation prompts and their configurations +- Key features: + - Defines various transformation types (code fixes, reviews, markdown processing) + - Manages pre and post-transformation steps + - Handles prompt templates and contexts +- Main components: + - Prompt definitions (e.g., `code_fix_docstrings`, `code_review_correctness`) + - Transformation management (`run_prompt()`) + - Context definitions (`_CODING_CONTEXT`, `_MD_CONTEXT`) + +5. **Transformation Types** +The system supports several categories of transformations: +- Code Fixes: + - Docstring improvements + - Type hint additions + - Logging statement fixes + - String formatting fixes +- Code Reviews: + - Correctness review + - Refactoring suggestions +- Markdown Processing: + - Document rewriting + - Summarization + - How-to guide formatting + - Explanation document formatting + +6. **Transformation Process Flow** +1. User calls `llm_transform.py` with input/output files and transformation type +2. `llm_transform.py` sets up Docker container and passes control to `dockerized_llm_transform.py` +3. 
`dockerized_llm_transform.py` uses `llm_prompts.py` to: + - Apply pre-transformations + - Execute LLM transformation + - Apply post-transformations +4. Results are written back to the output file + +7. **Docker Integration** +- Uses Python 3.12 Alpine as base image +- Installs necessary dependencies (PyYAML, requests, pandas, openai) +- Manages file paths between host and container +- Handles environment variables (e.g., OPENAI_API_KEY) + +8. **Error Handling and Logging** +- Comprehensive logging system +- Type checking and assertions +- Error handling for file operations +- Docker container management error handling + +This architecture provides a robust and flexible system for applying LLM-based transformations to code and text files, with proper isolation of dependencies through Docker and a clear separation of concerns between the different components. + diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md new file mode 100644 index 000000000..66f05712f --- /dev/null +++ b/docs/tools/all.llm_transform.reference.md @@ -0,0 +1,15 @@ +- Explain how to run from stdin / stdout + + +# Basic Usage +> llm_transform.py -i input.txt -o output.txt -p uppercase + +# List of transforms +> llm_transform.py -i input.txt -o output.txt -p list + +# Code review +> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_review + +# Propose refactoring +> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_propose_refactoring +" From 421c035d48eaa90d4ccc88d756e0eecbe926dd99 Mon Sep 17 00:00:00 2001 From: Sameer Kumar Date: Tue, 20 May 2025 06:36:24 -0400 Subject: [PATCH 2/9] Modified reference and how-to-guide of llm_transform.py MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-commit checks: - 'check_master' passed - 'check_author' passed - 'check_file_size' passed - 'check_python_compile' passed - 'check_gitleaks' passed 
All checks passed ✅ --- docs/tools/all.llm_transform.how_to_guide.md | 79 +++- docs/tools/all.llm_transform.reference.md | 404 ++++++++++++++++++- 2 files changed, 469 insertions(+), 14 deletions(-) diff --git a/docs/tools/all.llm_transform.how_to_guide.md b/docs/tools/all.llm_transform.how_to_guide.md index 1cb1dd471..906cd6de8 100644 --- a/docs/tools/all.llm_transform.how_to_guide.md +++ b/docs/tools/all.llm_transform.how_to_guide.md @@ -1,14 +1,20 @@ -I'll analyze the structure of these three files and explain how they work together. Let me break this down systematically. - -1. First, let me search for relevant code to understand the structure better. + +- [Structure of files](#structure-of-files) +- [Step-by-step execution](#step-by-step-execution) + * [Example](#example) + +# Structure of files Search files... -Based on the code analysis, I can explain the structure of these three files and how they work together. Here's a comprehensive breakdown: +Based on the code analysis, I can explain the structure of these three files +and how they work together. Here's a comprehensive breakdown: 1. **Overall Architecture** -The system is designed to apply LLM-based transformations to code and text files, with a focus on code quality, documentation, and formatting. The architecture consists of three main components: +The system is designed to apply LLM-based transformations to code and text +files, with a focus on code quality, documentation, and formatting. +The architecture consists of three main components: 2. **llm_transform.py** (Main Entry Point) - This is the primary script that users interact with @@ -60,7 +66,8 @@ The system supports several categories of transformations: 6. **Transformation Process Flow** 1. User calls `llm_transform.py` with input/output files and transformation type -2. `llm_transform.py` sets up Docker container and passes control to `dockerized_llm_transform.py` +2. 
`llm_transform.py` sets up Docker container and passes control to + `dockerized_llm_transform.py` 3. `dockerized_llm_transform.py` uses `llm_prompts.py` to: - Apply pre-transformations - Execute LLM transformation @@ -79,5 +86,63 @@ The system supports several categories of transformations: - Error handling for file operations - Docker container management error handling -This architecture provides a robust and flexible system for applying LLM-based transformations to code and text files, with proper isolation of dependencies through Docker and a clear separation of concerns between the different components. +This architecture provides a robust and flexible system for applying LLM-based +transformations to code and text files, with proper isolation of dependencies +through Docker and a clear separation of concerns between the different components. + +# Step-by-step execution + +Note that the same steps apply to every transformation. +The input and output can be .txt, .py, or .md files, or stdin/stdout, +depending on the type of transformation. + +- Choose the input file + - Locate the file you want to transform. + - Alternatively, you may use stdin to input the code manually. +- Specify the output file + - Provide a path to save the transformed output. + - Example path: research_amp/causal_kg/scrape_fred_metadata.new +- Open a terminal and run the following command: + ```bash + > llm_transform.py -i input-file -o output-file -p + ``` +- Verify the result + - The transformed output is written to the output file. + +## Example + +- In case of an input file + - step 1: Input file is 'research_amp/causal_kg/scrape_fred_metadata.py'. + - Suppose the file contains the following imports: + ```python + from utils import parser + from helpers import hopenai + ``` + - step 2: Output file is 'research_amp/causal_kg/scrape_fred_metadata_new.py'. 
+ - step 3: Run the following command: + ```bash + > llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py \ + -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports + ``` + - step 4: The following output is saved in the output file specified above: + ```python + import utils.parser + import helpers.hopenai + ``` +- In case of stdin + - step 1: Input is stdin + - step 2: Output is stdout + - step 3: Run the following command: + ```bash + > llm_transform.py -i - -o - -p code_fix_from_imports + # input: + from utils import parser + from helpers import hopenai + [press Ctrl + D] + ``` + - step 4: The following output is printed to stdout: + ```python + import utils.parser + import helpers.hopenai + ``` diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md index 66f05712f..f2348d63d 100644 --- a/docs/tools/all.llm_transform.reference.md +++ b/docs/tools/all.llm_transform.reference.md @@ -1,15 +1,405 @@ -- Explain how to run from stdin / stdout + +- [Synopsis](#synopsis) +- [Run from stdin stdout](#run-from-stdin-stdout) +- [Basic Usage](#basic-usage) +- [List of transforms](#list-of-transforms) +- [Code Fixes](#code-fixes) + * [Code fix by using f strings](#code-fix-by-using-f-strings) + + [Example](#example) + * [Code fix by using perc strings](#code-fix-by-using-perc-strings) + + [Example](#example) + * [Code fix csfy style](#code-fix-csfy-style) + + [Example](#example) + * [Code fix docstrings](#code-fix-docstrings) + + [Example](#example) + * [Code fix by existing comments](#code-fix-by-existing-comments) + + [Example](#example) + * [Code fix from imports](#code-fix-from-imports) + + [Example](#example) + * [Code fix improve comments](#code-fix-improve-comments) + + [Example](#example) + * [Code fix log string](#code-fix-log-string) + + [Example](#example) + * [Code fix logging statements](#code-fix-logging-statements) + + [Example](#example) + * [Code fix 
star before optional parameters](#code-fix-star-before-optional-parameters) + + [Example](#example) + * [Code fix type hints](#code-fix-type-hints) + + [Example](#example) +- [Code review and refactoring](#code-review-and-refactoring) + * [Code review correctness](#code-review-correctness) + * [Code review refactoring](#code-review-refactoring) +- [Markdown Processing](#markdown-processing) + * [Md clean up how to guide](#md-clean-up-how-to-guide) + * [Md summarize short](#md-summarize-short) + + + +# Run from stdin stdout + +```bash +> llm_transform.py -i - -o - +``` +There is no need to even specify '-' as by default the input and output are +stdin and stdout respectively. + +# Synopsis + +The script is capable of performing certain transformations using LLM like +OpenAI on the input text or the stdin. The transformed output is then stored in +the output file or the stdout depending upon the arguments passed by the user. # Basic Usage + +```bash > llm_transform.py -i input.txt -o output.txt -p uppercase +``` +The script will produce output from a llm based on the prompt tage provided by +the user and the transformation takes place on the input file. # List of transforms -> llm_transform.py -i input.txt -o output.txt -p list -# Code review -> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_review +The above line will generate a list of available prompt tags that a user can +select to transform the input file. 
+ +```bash +> llm_transform.py -p list +# Available prompt tags: +code_fix_by_using_f_strings +code_fix_by_using_perc_strings +code_fix_csfy_style +code_fix_docstrings +code_fix_existing_comments +code_fix_from_imports +code_fix_improve_comments +code_fix_log_string +code_fix_logging_statements +code_fix_star_before_optional_parameters +code_fix_type_hints +code_fix_unit_test +code_review_correctness +code_review_refactoring +code_transform_apply_csfy_style +code_transform_apply_linter_instructions +code_transform_remove_redundancy +code_write_1_unit_test +code_write_unit_test +md_clean_up_how_to_guide +md_rewrite +md_summarize_short +scratch_categorize_topics +slide_bold +slide_elaborate +slide_improve +slide_improve2 +slide_reduce +test +``` + +# Code Fixes + +## Code fix by using f strings + +Rewrites the code to use f-strings instead of conventional string formatting. It +also applies a post-transformation that removes the code delimiters. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_by_using_f_strings + ``` +- Suppose input.py contains the following: + ```python + "Hello, %s. You are %d years old." % (name, age) + ... + ``` +- Expect the following output in output.py + ```python + f"Hello, {name}. You are {age} years old." + ... + ``` + +## Code fix by using perc strings + +This is the exact opposite of 'code_fix_by_using_f_strings': the +transformation uses % formatting instead of f-strings. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_by_using_perc_strings + ``` +- Suppose input.py contains the following: + ```python + f"Hello, {name}. You are {age} years old." + ... + ``` +- Expect the following output in output.py + ```python + "Hello, %s. You are %d years old." % (name, age) + ... + ``` + +## Code fix csfy style + +Apply all the transformations required to write code according to the +Causify conventions. 
+ +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_csfy_style + ``` + +## Code fix docstrings + +Ensures each function has a properly structured REST-style docstring. It adds +missing docstrings or completes partial ones according to best practices. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_docstrings + ``` +- Suppose input.py contains the following: + ```python + def format_greeting(name: str, *, greeting: str = "Hello") -> str: + ... + ``` +- Expect the following output in output.py + ```python + def format_greeting(name: str, *, greeting: str = "Hello") -> str: + """ + Format a greeting message with the given name. + + :param name: the name to include in the greeting (e.g., "John") + :param greeting: the base greeting message to use (e.g., "Ciao") + :return: formatted greeting (e.g., "Hello John") + """ + ... + ``` + +## Code fix by existing comments + +Fix the already existing comments in the Python code. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_existing_comments + ``` + +## Code fix from imports + +Fix code to use imports instead of "from import" statements. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_from_imports + ``` +- Suppose input.py contains the following: + ```python + from X import Y + ... + ``` +- Expect the following output in output.py + ```python + import X.Y + ... + ``` + +## Code fix improve comments + +Add comments to Python code. + +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_improve_comments + ``` +- Suppose input.py contains the following: + ```python + import pandas + import numpy + import scipy + ... + ``` +- Expect the following output in output.py + ```python + # Import libraries + import pandas + import numpy + import scipy + ... 
+ ``` + +## Code fix log string + +Fix logging statements and dassert statements by using % formatting instead +of f-strings (formatted string literals) + +### Example + +- Command:, + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_log_string + ``` +- Suppose input.py contains the following: + ```python + _LOG.info(f"env_var='{str(env_var)}' is not in env_vars=\ + '{str(os.environ.keys())}'") + ``` +- Expect the following output in output.py + ```python + _LOG.info("env_var='%s' is not in env_vars='%s'", env_var, \ + str(os.environ.keys())) + ``` + +## Code fix logging statements + +Add logging statements to Python code. + +### Example + +- Command:, + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_logging_statements + ``` +- Suppose input.py contains the following: + ```python + def get_text_report(self) -> str: + """ + Generate a text report listing each module's dependencies. + + :return: Text report of dependencies, one per line. + """ + ``` +- Expect the following output in output.py + ```python + def get_text_report(self) -> str: + """ + Generate a text report listing each module's dependencies. + + :return: Text report of dependencies, one per line. + """ + _LOG.debug(hprint.func_signature_to_str()) + ``` + +## Code fix star before optional parameters + +Fix code missing the star before optional parameters. + +### Example + +- Command:, + ```bash + > llm_transform.py -i input.py -o output.py + -p code_fix_star_before_optional_parameters + ``` +- Suppose input.py contains the following: + ```python + def format_greeting(name: str, greeting: str = "Hello") -> str: + ... + ``` +- Expect the following output in output.py + ```python + def format_greeting(name: str, *, greeting: str = "Hello") -> str: + ... + ``` + +## Code fix type hints + +Add type hints to the Python code passed. 
+ +### Example + +- Command: + ```bash + > llm_transform.py -i input.py -o output.py -p code_fix_type_hints + ``` +- Suppose input.py contains the following: + ```python + def process_data(data, threshold=0.5): + results = [] + for item in data: + if item > threshold: + results.append(item) + return results + ``` +- Expect the following output in output.py + ```python + def process_data(data: List[float], *, \ + threshold: float = 0.5) -> List[float]: + results: List[float] = [] + for item in data: + if item > threshold: + results.append(item) + return results + ``` + +# Code review and refactoring + +## Code review correctness + +Designed to review a Python script for correctness and quality, then output +line-numbered suggestions using a structured format. + +- Command:, + ```bash + > llm_transform.py -i input.py -o cfile -p code_review_correctness + ``` + +## Code review refactoring + +Review the code for refactoring opportunities. + +- Command:, + ```bash + > llm_transform.py -i input.py -o cfile -p code_review_refactoring + ``` + +# Markdown Processing + +## Md clean up how to guide + +Format the text to rewrite as a how_to_guide and contain the sections like + - Goal / Use Case + - Assumptions / Requirements + - Step-by-Step Instructions + - Alternatives or Optional Steps + - Troubleshooting + +- Command:, + ```bash + > llm_transform.py -i input.md -o output.md -p md_clean_up_how_to_guide + ``` + +## Md rewrite + + Rewrite the text passed in a technical document style to + increase clarity and readability. + +- Command:, + ```bash + > llm_transform.py -i input.md -o output.md -p md_rewrite + ``` + +## Md summarize short + + Summarize the text in less than 30 words. 
+ +- Command:, + ```bash + > llm_transform.py -i input.md -o output.md -p md_summarize_short + ``` -# Propose refactoring -> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_propose_refactoring -" + \ No newline at end of file From a5f93b53d2988cbab260e571eeacd9c49ebb07d0 Mon Sep 17 00:00:00 2001 From: Sameer Kumar Date: Wed, 21 May 2025 17:34:11 -0400 Subject: [PATCH 3/9] Slightly modified reference markdown MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-commit checks: All checks passed ✅ --- docs/tools/all.llm_transform.reference.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md index f2348d63d..60999a2b8 100644 --- a/docs/tools/all.llm_transform.reference.md +++ b/docs/tools/all.llm_transform.reference.md @@ -53,10 +53,10 @@ the output file or the stdout depending upon the arguments passed by the user. # Basic Usage ```bash -> llm_transform.py -i input.txt -o output.txt -p uppercase +> llm_transform.py -i input.txt -o output.txt -p ``` -The script will produce output from a llm based on the prompt tage provided by -the user and the transformation takes place on the input file. +The script generates output from an LLM based on the user-specified prompt tag, +pplying the transformation to the input file. 
# List of transforms From 8835ba5878e611fb79e0a4d5913f7c85ae8c0cb2 Mon Sep 17 00:00:00 2001 From: Indrayudd Roy Chowdhury Date: Fri, 6 Jun 2025 00:52:34 -0400 Subject: [PATCH 4/9] HelpersTask702: add llm_transform reference doc --- docs/tools/all.llm_transform.reference.md | 1551 ++++++++++++++++----- 1 file changed, 1191 insertions(+), 360 deletions(-) diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md index 60999a2b8..e617b5495 100644 --- a/docs/tools/all.llm_transform.reference.md +++ b/docs/tools/all.llm_transform.reference.md @@ -1,405 +1,1236 @@ -- [Synopsis](#synopsis) -- [Run from stdin stdout](#run-from-stdin-stdout) -- [Basic Usage](#basic-usage) -- [List of transforms](#list-of-transforms) -- [Code Fixes](#code-fixes) - * [Code fix by using f strings](#code-fix-by-using-f-strings) - + [Example](#example) - * [Code fix by using perc strings](#code-fix-by-using-perc-strings) - + [Example](#example) - * [Code fix csfy style](#code-fix-csfy-style) - + [Example](#example) - * [Code fix docstrings](#code-fix-docstrings) - + [Example](#example) - * [Code fix by existing comments](#code-fix-by-existing-comments) - + [Example](#example) - * [Code fix from imports](#code-fix-from-imports) - + [Example](#example) - * [Code fix improve comments](#code-fix-improve-comments) - + [Example](#example) - * [Code fix log string](#code-fix-log-string) - + [Example](#example) - * [Code fix logging statements](#code-fix-logging-statements) - + [Example](#example) - * [Code fix star before optional parameters](#code-fix-star-before-optional-parameters) - + [Example](#example) - * [Code fix type hints](#code-fix-type-hints) - + [Example](#example) -- [Code review and refactoring](#code-review-and-refactoring) - * [Code review correctness](#code-review-correctness) - * [Code review refactoring](#code-review-refactoring) -- [Markdown Processing](#markdown-processing) - * [Md clean up how to guide](#md-clean-up-how-to-guide) - * [Md 
summarize short](#md-summarize-short) +- [`llm_transform.py`](#llm_transformpy) + * [Interface](#interface) + * [Basic Usage](#basic-usage) + * [List of transforms](#list-of-transforms) + * [Prompt Tags](#prompt-tags) + + [Code Prompts](#code-prompts) + - [`code_apply_cfile`](#code_apply_cfile) + * [Input](#input) + * [Output](#output) + - [`code_fix_by_using_f_strings`](#code_fix_by_using_f_strings) + * [Input](#input-1) + * [Output](#output-1) + - [`code_fix_by_using_perc_strings`](#code_fix_by_using_perc_strings) + * [Input](#input-2) + * [Output](#output-2) + - [`code_fix_code`](#code_fix_code) + * [Input](#input-3) + * [Output](#output-3) + - [`code_fix_comments`](#code_fix_comments) + * [Input](#input-4) + * [Output](#output-4) + - [`code_fix_complex_assignments`](#code_fix_complex_assignments) + * [Input](#input-5) + * [Output](#output-5) + - [`code_fix_docstrings`](#code_fix_docstrings) + * [Input](#input-6) + * [Output](#output-6) + - [`code_fix_from_imports`](#code_fix_from_imports) + * [Input](#input-7) + * [Output](#output-7) + - [`code_fix_function_type_hints`](#code_fix_function_type_hints) + * [Input](#input-8) + * [Output](#output-8) + - [`code_fix_log_string`](#code_fix_log_string) + * [Input](#input-9) + * [Output](#output-9) + - [`code_fix_logging_statements`](#code_fix_logging_statements) + * [Input](#input-10) + * [Output](#output-10) + - [`code_fix_star_before_optional_parameters`](#code_fix_star_before_optional_parameters) + * [Input](#input-11) + * [Output](#output-11) + - [`code_fix_unit_test`](#code_fix_unit_test) + * [Input](#input-12) + * [Output](#output-12) + - [`code_transform_apply_csfy_style`](#code_transform_apply_csfy_style) + * [Input](#input-13) + * [Output](#output-13) + - [`code_transform_apply_linter_instructions`](#code_transform_apply_linter_instructions) + * [Input](#input-14) + * [Output](#output-14) + - [`code_transform_remove_redundancy`](#code_transform_remove_redundancy) + * [Input](#input-15) + * [Output](#output-15) 
+ - [`code_write_1_unit_test`](#code_write_1_unit_test) + * [Input](#input-16) + * [Output](#output-16) + - [`code_write_unit_test`](#code_write_unit_test) + * [Output for the same input as above](#output-for-the-same-input-as-above) + + [Documentation Prompts](#documentation-prompts) + - [`latex_rewrite`](#latex_rewrite) + * [Input](#input-17) + * [Output](#output-17) + - [`md_add_good_bad_examples`](#md_add_good_bad_examples) + * [Input](#input-18) + * [Output](#output-18) + - [`md_clean_up_how_to_guide`](#md_clean_up_how_to_guide) + * [Input](#input-19) + * [Output (Sections Added)](#output-sections-added) + - [`md_convert_table_to_bullet_points`](#md_convert_table_to_bullet_points) + * [Input](#input-20) + * [Output](#output-19) + - [`md_convert_text_to_bullet_points`](#md_convert_text_to_bullet_points) + * [Input](#input-21) + * [Output](#output-20) + - [`md_create_bullets`](#md_create_bullets) + * [Input](#input-22) + * [Output](#output-21) + - [`md_expand`](#md_expand) + * [Input](#input-23) + * [Output](#output-22) + - [`md_format`](#md_format) + * [Input](#input-24) + * [Output](#output-23) + - [`md_remove_formatting`](#md_remove_formatting) + * [Input](#input-25) + * [Output](#output-24) + - [`md_rewrite`](#md_rewrite) + * [Input](#input-26) + * [Output](#output-25) + - [`md_summarize_short`](#md_summarize_short) + * [Input](#input-27) + * [Output](#output-26) + + [Slide Prompts](#slide-prompts) + - [`slide_add_figure`](#slide_add_figure) + * [Input](#input-28) + * [Output](#output-27) + - [`slide_bold`](#slide_bold) + * [Input](#input-29) + * [Output](#output-28) + - [`slide_check`](#slide_check) + * [Input](#input-30) + * [Output](#output-29) + - [`slide_expand`](#slide_expand) + * [Input](#input-31) + * [Output](#output-30) + - [`slide_reduce`](#slide_reduce) + * [Input](#input-32) + * [Output](#output-31) + - [`slide_reduce_bullets`](#slide_reduce_bullets) + * [Input](#input-33) + * [Output](#output-32) + - 
[`slide_smart_colorize`](#slide_smart_colorize) + * [Input](#input-34) + * [Output](#output-33) + - [`slide_to_bullet_points`](#slide_to_bullet_points) + * [Input](#input-35) + * [Output](#output-34) + + [Text Prompts](#text-prompts) + - [`text_idea`](#text_idea) + * [Input](#input-36) + * [Output](#output-35) + - [`text_rephrase`](#text_rephrase) + * [Input](#input-37) + * [Output](#output-36) + - [`text_rewrite`](#text_rewrite) + * [Input](#input-38) + * [Output](#output-37) + + [Review Prompts](#review-prompts) + - [`review_correctness`](#review_correctness) + * [Input](#input-39) + * [Output](#output-38) + - [`review_linter`](#review_linter) + * [Input](#input-40) + * [Output](#output-39) + - [`review_llm`](#review_llm) + * [Input](#input-41) + * [Output](#output-40) + - [`review_refactoring`](#review_refactoring) + * [Input](#input-42) + * [Output](#output-41) + + [Miscellaneous Prompts](#miscellaneous-prompts) + - [`misc_categorize_topics`](#misc_categorize_topics) + * [Input](#input-43) + * [Output](#output-42) + - [`test`](#test) + * [Input](#input-44) + * [Output](#output-43) -# Run from stdin stdout +# `llm_transform.py` -```bash -> llm_transform.py -i - -o - -``` -There is no need to even specify '-' as by default the input and output are -stdin and stdout respectively. +The script is capable of performing certain transformations using OpenAI's LLMs +on the input text or the stdin. The transformed output is then stored in the +output file or the stdout depending upon the arguments passed by the user. -# Synopsis +## Interface -The script is capable of performing certain transformations using LLM like -OpenAI on the input text or the stdin. The transformed output is then stored in -the output file or the stdout depending upon the arguments passed by the user. 
+```bash +llm_transform.py -h +usage: llm_transform.py [-h] -i INPUT -o OUTPUT -p PROMPT + [--compare] + [-b | --bold_first_level_bullets] + [-s | --skip-post-transforms] + [--dockerized_force_rebuild] + [--dockerized_use_sudo] + [-v {TRACE,DEBUG,INFO,WARNING,ERROR,CRITICAL}] + +Apply an LLM prompt‑based transformation to text input and save the result. + +options: + -h, --help show this help message and exit + -i INPUT, --input INPUT + Source text ("-" = stdin) + -o OUTPUT, --output OUTPUT + Destination ("-" = stdout) + -p PROMPT, --prompt PROMPT + Prompt tag to apply (`list`, `code_review`, `slide_colorize`, ...) + -c, --compare Print both the original & transformed blocks to stdout + -b, --bold_first_level_bullets + Post‑format tweak for slide prompts + -s, --skip-post-transforms + Return raw LLM output, skip prettier/cleanup + --dockerized_force_rebuild + Force rebuild of the Docker container + --dockerized_use_sudo + Run the container with sudo inside + -v {TRACE,DEBUG,INFO,WARNING,ERROR,CRITICAL} + Set the logging level +``` -# Basic Usage +## Basic Usage ```bash > llm_transform.py -i input.txt -o output.txt -p +# or +> llm_transform.py -i - -o - ``` -The script generates output from an LLM based on the user-specified prompt tag, -pplying the transformation to the input file. -# List of transforms +The script generates output from an LLM based on the user-specified prompt tag, +applying the transformation to the input file. Using `-` for `-i` and `-o` +signifies stdin and stdout. + +> Note: Use the -s flag to see the LLM output without any post-processing. -The above line will generate a list of available prompt tags that a user can +## List of transforms + +The command below lists the available prompt tags that a user can select to transform the input file. 
```bash > llm_transform.py -p list # Available prompt tags: +code_apply_cfile code_fix_by_using_f_strings code_fix_by_using_perc_strings -code_fix_csfy_style +code_fix_code +code_fix_comments +code_fix_complex_assignments code_fix_docstrings -code_fix_existing_comments code_fix_from_imports -code_fix_improve_comments +code_fix_function_type_hints code_fix_log_string code_fix_logging_statements code_fix_star_before_optional_parameters -code_fix_type_hints code_fix_unit_test -code_review_correctness -code_review_refactoring code_transform_apply_csfy_style code_transform_apply_linter_instructions code_transform_remove_redundancy code_write_1_unit_test code_write_unit_test +latex_rewrite +md_add_good_bad_examples md_clean_up_how_to_guide +md_convert_table_to_bullet_points +md_convert_text_to_bullet_points +md_create_bullets +md_expand +md_format +md_remove_formatting md_rewrite md_summarize_short -scratch_categorize_topics +misc_categorize_topics +review_correctness +review_linter +review_llm +review_refactoring +slide_add_figure slide_bold -slide_elaborate -slide_improve -slide_improve2 +slide_check +slide_expand slide_reduce +slide_reduce_bullets +slide_smart_colorize +slide_to_bullet_points test +text_idea +text_rephrase +text_rewrite +``` + +## Prompt Tags + +### Code Prompts + +#### `code_apply_cfile` + +- Correct `from ... 
import ...` statements + +##### Input + +```python +from X import Y as y +``` + +##### Output + +```python +import X + +y = X.Y +``` + +#### `code_fix_by_using_f_strings` + +- Replace `%`/`.format()` strings with f‑strings + +##### Input + +```python +print("Hello %s, you are %d years old" % (name, age)) +``` + +##### Output + +```python +print(f"Hello {name}, you are {age} years old") +``` + +#### `code_fix_by_using_perc_strings` + +- Convert f‑strings to `%` formatting + +##### Input + +```python +_LOG.debug(f"Rows processed: {rows}") +``` + +##### Output + +```python +_LOG.debug("Rows processed: %s" % rows) +``` + +#### `code_fix_code` + +- Apply the standard bundle of minor Causify code fixes + +##### Input + +```python +from os.path import join + +def foo(x,y=2):return x+y +``` + +##### Output + +```python +import os + +def foo(x: int, *, y: int = 2) -> int: + """ + Add two numbers, with the second number being optional. + + :param x: the first number to add + :param y: the second number to add, default is 2 + :return: the sum of the two numbers + """ + return x + y +``` + +#### `code_fix_comments` + +- Rewrite and add clear imperative comments + +##### Input + +```python +# does sum +result = a+b +``` + +##### Output + +```python +# Calculate the sum of `a` and `b` and store it in `result`. +result = a + b +``` + +//TODO(indro): Replace with a better generated example. + +#### `code_fix_complex_assignments` + +- Expand inline conditionals or comprehensions into explicit blocks + +##### Input + +```python +sign = 1 if value >= 0 else -1 +``` + +##### Output + +```python +if value >= 0: + sign = 1 +else: + sign = -1 +``` + +#### `code_fix_docstrings` + +- Insert or correct REST‑style docstrings + +##### Input + +```python +def inc(x): + return x+1 +``` + +##### Output + +```python +def inc(x): + """ + Increment the given number by one. 
+ + :param x: the number to be incremented + :return: the incremented number + """ + return x + 1 +``` + +#### `code_fix_from_imports` + +- Corrects `from ... import ...` statements + +##### Input + +```python +from X import Y as y +``` + +##### Output + +```python +import X + +y = X.Y +``` + +#### `code_fix_function_type_hints` + +- Add type hints to function signatures + +##### Input + +```python +def area(r): + return 3.14*r*r +``` + +##### Output + +```python +def area(r: float) -> float: + return 3.14 * r * r +``` + +#### `code_fix_log_string` + +- Convert eager f‑string logs to lazy `%` logging format + +##### Input + +```python +_LOG.info(f"Loaded {n_rows} rows from {path}") +``` + +##### Output + +```python +_LOG.info("Loaded %d rows from %s", n_rows, path) +``` + +#### `code_fix_logging_statements` + +- Inject informative logging statements + +##### Input + +```python +data = load(path) +``` + +##### Output + +```python +data = load(path) +print(f"[DEBUG] Loaded data from {path}: {data}") +``` + +#### `code_fix_star_before_optional_parameters` + +- Enforce \*, before keyword‑only optional parameters + +##### Input + +```python +def greet(name, greeting="Hi"): + ... +``` + +##### Output + +```python +def greet(name, *, greeting="Hi"): + ... +``` + +#### `code_fix_unit_test` + +- Repair broken unit‑test expectations or structure + +##### Input + +```python +``` + +##### Output + +```python +``` + +//TODO(indro): Add example. + +#### `code_transform_apply_csfy_style` + +- Reformat Python code to match Causify's style + +##### Input + +```python +from os.path import join + +def Foo(x):return x+1 +``` + +##### Output + +```python +from os.path import join as tecode + +def Foo(x): + return x + 1 +``` + +#### `code_transform_apply_linter_instructions` + +- Auto‑fix code based on provided linter error list + +##### Input + +```python +``` + +##### Output + +```python +``` + +//TODO(indro): Add examples after getting the command working. 
+ +#### `code_transform_remove_redundancy` + +- Remove duplicated or dead code logic + +##### Input + +```python +if mode=="train": + verbose=True +elif mode=="train": + verbose=True +``` + +##### Output + +```python +The code contains redundancy because the same condition and action are repeated. You can refactor it by removing the duplicate `elif` block: + +'''python +if mode == "train": + verbose = True +''' + +This single `if` statement is sufficient to achieve the desired behavior. +``` + +//TODO(indro): Improve example. + +#### `code_write_1_unit_test` + +- Generate a single illustrative unit test + +##### Input + +```python +def add(a,b):return a+b +``` + +##### Output + +```python +import unittest +import uut + +class TestAddFunction(unittest.TestCase): + def test_add_positive_numbers(self): + self.assertEqual(uut.add(2, 3), 5) + +if __name__ == '__main__': + unittest.main() +``` + +#### `code_write_unit_test` + +- Generate a full unit‑test suite + +##### Output for the same input as above + +```python +import unittest +import uut + +class TestAddFunction(unittest.TestCase): + + def test_add_positive_numbers(self): + self.assertEqual(uut.add(2, 3), 5, "Adding 2 and 3 should be 5") + + def test_add_negative_numbers(self): + self.assertEqual(uut.add(-1, -1), -2, "Adding -1 and -1 should be -2") + + def test_add_positive_and_negative(self): + self.assertEqual(uut.add(5, -3), 2, "Adding 5 and -3 should be 2") + + def test_add_zero(self): + self.assertEqual(uut.add(0, 5), 5, "Adding 0 and 5 should be 5") + + def test_add_large_numbers(self): + self.assertEqual(uut.add(1000000, 2000000), 3000000, "Adding 1000000 and 2000000 should be 3000000") + +if __name__ == '__main__': + unittest.main() +``` + +### Documentation Prompts + +#### `latex_rewrite` + +- Rewrite LaTeX prose for clarity while preserving math + +##### Input + +```latex +We shows that $x$ converges. +``` + +##### Output + +```latex +We demonstrate that $x$ converges. 
+``` + +#### `md_add_good_bad_examples` + +- Append Good/Bad example bullets to guidelines text + +##### Input + +```md +Variable names should be descriptive. +``` + +##### Output + +```md +Variable names should be descriptive. + +- Good: `userAge`, `totalPrice`, `isVerified` +- Bad: `x`, `temp`, `data` +``` + +#### `md_clean_up_how_to_guide` + +- Restructure markdown into a how‑to guide format + +##### Input + +```md +Install foo by cloning repo. Run make. +``` + +##### Output (Sections Added) + +```md +# How to Install Foo by Cloning the Repository + +## Goal / Use Case + +The goal of this guide is to help you install the software package "Foo" by +cloning its repository and compiling it using `make`. This is useful if you want +to access the latest version of Foo directly from the source or if the package +is not available through your package manager. + +## Assumptions / Requirements + +Before you begin, ensure you have the following: + +- A Unix-like operating system (Linux, macOS, etc.) +- Git installed on your system to clone the repository +- `make` utility installed to compile the software +- Internet connection to access the repository + +## Step-by-Step Instructions + +1. **Open a Terminal:** + - Access your terminal application to enter commands. + +2. **Clone the Repository:** + - Use the following command to clone the Foo repository: bash git clone + + - Replace `` with the actual URL of the Foo repository. + +3. **Navigate to the Cloned Directory:** (...) + +By following these steps, you should be able to successfully install Foo from +its repository. If you encounter any issues, refer to the troubleshooting +section or consult the repository's documentation for further assistance. 
+``` + +#### `md_convert_table_to_bullet_points` + +- Convert markdown tables into bullet lists + +##### Input + +```md +| Key | Desc | +| --- | -------- | +| a | letter a | +``` + +##### Output + +```md +- Key: + - a +- Desc: + - letter a +``` + +#### `md_convert_text_to_bullet_points` + +- Split prose paragraphs into bullet points + +##### Input + +```md +Foo is fast. Foo is easy. Foo is scalable. +``` + +##### Output + +```md +- Foo + - Fast + - Easy + - Scalable +``` + +#### `md_create_bullets` + +- Convert a chunk of text directly into bullet points + +##### Input + +```md +Foo is fast and easy. It simplifies a complicated process. +``` + +##### Output + +```md +- Foo + - is fast and easy + - simplifies a complicated process +``` + +#### `md_expand` + +- Add missing bullets or examples without altering structure + +##### Input + +```md +Use `foo()` to parse data. +``` + +##### Output + +```md +- Use `foo()` to parse data + - `foo()` is a function designed to process and interpret data + - It can handle various data formats, such as JSON, XML, or CSV + - The function is efficient and optimized for large datasets + - It provides error handling to manage unexpected data formats E.g., if you + have a JSON string, you can use `foo()` to convert it into a Python + dictionary for easier manipulation: json_data = '{"name": "John", "age": + 30}' parsed_data = foo(json_data) print(parsed_data) # Output: {'name': + 'John', 'age': 30} +- Ensure that the data passed to `foo()` is correctly formatted to avoid parsing + errors +- Consider using `foo()` in conjunction with other data processing functions for + comprehensive data analysis +``` + +#### `md_format` + +- Normalize markdown bullet syntax and formatting + +##### Input + +```md +* item one +``` + +##### Output + +```md +- item one +``` + +#### `md_remove_formatting` + +- Strip bold/italic and other non‑essential markup + +##### Input + +```md +**Bold text** _italicized text_ +``` + +##### Output + +```md +Bold text 
+italicized text +``` + +#### `md_rewrite` + +- Rewrite markdown for improved clarity and flow + +##### Input + +```md +Foo is very very powerful tool. ``` -# Code Fixes +##### Output -## Code fix by using f strings +```md +Foo is an incredibly powerful tool. +``` + +#### `md_summarize_short` + +- Summarize content in 30 words or fewer + +##### Input + +```md +Foo accelerates data processing by parallelising I/O operations and distributing +computational tasks simultaneously. This means that instead of executing +processes sequentially, the system can handle multiple data streams at once, +which significantly reduces wait times and improves overall throughput. + +- Key benefits include: + - Reduced latency due to concurrent data access. + - Improved efficiency in utilizing CPU and other hardware resources. + - Enhanced scalability when handling large or complex datasets. + +By leveraging parallelisation, Foo optimizes both data input/output and internal +processing, ensuring faster execution and more responsive performance even under +heavy loads. +``` + +##### Output + +```md +Foo enhances data processing by parallelizing I/O operations, reducing latency, +improving resource efficiency, and increasing scalability for large datasets, +resulting in faster and more responsive performance. +``` + +### Slide Prompts + +#### `slide_add_figure` + +- Generate a Graphviz figure to illustrate concepts + +##### Input + +```md +A simple ETL pipeline has Extract, Transform, Load stages. 
+``` + +##### Output + +```dot +digraph ETL_Pipeline { + node [style=filled, shape=box, color="#FFD1A6", fontname="Helvetica"]; + + Extract [label="Extract", shape=box, style=rounded, color="#A6E7F4"]; + Transform [label="Transform", shape=box, style=rounded, color="#A6C8F4"]; + Load [label="Load", shape=box, style=rounded, color="#B2E2B2"]; + + Extract -> Transform -> Load; +} +``` + +#### `slide_bold` + +- Bold the most critical phrases in slide markdown + +##### Input + +```md +- Foo improves accuracy +``` + +##### Output + +```md +- **Foo improves accuracy** +``` + +#### `slide_check` + +- Assess slide clarity and suggest concise fixes + +##### Input +``` +Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris pharetra, velit id sollicitudin placerat, lectus augue facilisis lacus, ut pulvinar eros urna non sem. Cras ut purus vitae metus convallis accumsan. Suspendisse potenti. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Sed tempus, justo id vestibulum malesuada, urna justo pretium libero, a feugiat justo nulla a sem. +``` + +##### Output +``` +- Is the content of the slide clear and correct? + - The slide is not clear + +- Is there anything that can be clarified? + - The slide uses placeholder text (Lorem Ipsum) which does not convey any meaningful information. + - Replace the placeholder text with actual content relevant to the topic being discussed. + - Ensure that the slide has a clear title or heading to indicate its subject. + - Provide context or examples if necessary to support the information presented. + - Check for any specific terminology or jargon that may need explanation for clarity. 
+``` + +#### `slide_expand` + +- Add supporting bullets or examples to slides + +##### Input + +```md +- Foo benefits +``` + +##### Output + +```md +- Foo benefits + - Foo is a versatile tool that can be used in various applications + - It enhances productivity by automating repetitive tasks + - It is user-friendly and easy to integrate with existing systems + - It provides robust security features to protect data + - It is cost-effective, reducing the need for additional resources + - E.g., In a software development environment, Foo can automate the build and + deployment process, saving time and reducing errors + - E.g., In a data analysis context, Foo can quickly process large datasets, + providing insights faster than manual methods +``` + +#### `slide_reduce` + +- Condense verbose slide text while keeping meaning + +##### Input + +```md +- Foo benefits + - Foo is a versatile tool that can be used in various applications + - It enhances productivity by automating repetitive tasks + - It is user-friendly and easy to integrate with existing systems + - It provides robust security features to protect data + - It is cost-effective, reducing the need for additional resources + - E.g., In a software development environment, Foo can automate the build and + deployment process, saving time and reducing errors + - E.g., In a data analysis context, Foo can quickly process large datasets, + providing insights faster than manual methods +``` + +##### Output + +```md +- Foo benefits + - Foo is versatile for various applications + - Enhances productivity by automating tasks + - User-friendly and integrates with systems + - Provides robust security for data + - Cost-effective, reducing resource needs + - E.g., In software development, Foo automates build and deployment, saving + time and reducing errors + - E.g., In data analysis, Foo processes large datasets quickly, providing + insights faster than manual methods +``` -It fixes the code to use f-string insted of 
conventional string formatting. It a -lso uses a post transformation of removal of code delimiter. +#### `slide_reduce_bullets` -### Example +- Remove redundant bullets, keep essentials -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_by_using_f_strings - ``` -- Suppose input.txt contains the following: - ```python - "Hello, %s. You are %d years old." % (name, age) - ... - ``` -- Expect the following output in output.txt - ```python - "Hello, {name}. You are {age} years old." - .... - ``` - -## Code fix by using perc strings - -This is exactly opposite to the 'code_fix_by_using_f_string' where the -transformation is to use %formatting instead of f-strings. - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_by_using_perc_strings - ``` -- Suppose input.py contains the following: - ```python - "Hello, {name}. You are {age} years old." - ... - ``` -- Expect the following output in output.py - ```python - "Hello, %s. You are %d years old." % (name, age) - .... - ``` - -## Code fix csfy style - -Apply all the transformations required to write code according to the -Causify conventions. - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_csfy_style - ``` - -## Code fix docstrings - -Ensures each function has a properly structured REST-style docstring. It adds -missing docstrings or complete partial ones according to the best practice. - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_docstrings - ``` -- Suppose input.py contains the following: - ```python - def format_greeting(name: str, *, greeting: str = "Hello") -> str: - ... - ``` -- Expect the following output in output.py - ```python - def format_greeting(name: str, *, greeting: str = "Hello") -> str: - """ - Format a greeting message with the given name. 
- - :param name: the name to include in the greeting (e.g., "John") - :param greeting: the base greeting message to use (e.g., "Ciao") - :return: formatted greeting (e.g., "Hello John") - """ - .... - ``` - -## Code fix by existing comments - -Fix the already existing comments in the Python code. - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_by_existing_comments - ``` - -## Code fix from imports - -Fix code to use imports instead of "from import" statements. - -### Example - -- Command: - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_from_imports - ``` -- Suppose input.py contains the following: - ```python - from X import Y - ... - ``` -- Expect the following output in output.py - ```python - import X.Y - .... - ``` - -## Code fix improve comments - -Add comments to python code. - -### Example - -- Command: - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_improve_comments - ``` -- Suppose input.txt contains the following: - ```python - import pandas - import numpy - import scipy - ... - ``` -- Expect the following output in output.txt - ```python - # Import libraries - import pandas - import numpy - import scipy - .... - ``` - -## Code fix log string - -Fix logging statements and dassert statements by using % formatting instead -of f-strings (formatted string literals) - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_log_string - ``` -- Suppose input.py contains the following: - ```python - _LOG.info(f"env_var='{str(env_var)}' is not in env_vars=\ - '{str(os.environ.keys())}'") - ``` -- Expect the following output in output.py - ```python - _LOG.info("env_var='%s' is not in env_vars='%s'", env_var, \ - str(os.environ.keys())) - ``` - -## Code fix logging statements - -Add logging statements to Python code. 
- -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_logging_statements - ``` -- Suppose input.py contains the following: - ```python - def get_text_report(self) -> str: - """ - Generate a text report listing each module's dependencies. - - :return: Text report of dependencies, one per line. - """ - ``` -- Expect the following output in output.py - ```python - def get_text_report(self) -> str: - """ - Generate a text report listing each module's dependencies. - - :return: Text report of dependencies, one per line. - """ - _LOG.debug(hprint.func_signature_to_str()) - ``` - -## Code fix star before optional parameters - -Fix code missing the star before optional parameters. - -### Example - -- Command:, - ```bash - > llm_transform.py -i input.py -o output.py - -p code_fix_star_before_optional_parameters - ``` -- Suppose input.py contains the following: - ```python - def format_greeting(name: str, greeting: str = "Hello") -> str: - ... - ``` -- Expect the following output in output.py - ```python - def format_greeting(name: str, *, greeting: str = "Hello") -> str: - ... - ``` - -## Code fix type hints - -Add type hints to the Python code passed. - -### Example - -- Command: - ```bash - > llm_transform.py -i input.py -o output.py -p code_fix_type_hints - ``` -- Suppose input.py contains the following: - ```python - def process_data(data, threshold=0.5): - results = [] - for item in data: - if item > threshold: - results.append(item) - return results - ``` -- Expect the following output in output.py - ```python - def process_data(data: List[float], *, \ - threshold: float = 0.5) -> List[float]: - results: List[float] = [] - for item in data: - if item > threshold: - results.append(item) - return results - ``` - -# Code review and refactoring - -## Code review correctness - -Designed to review a Python script for correctness and quality, then output -line-numbered suggestions using a structured format. 
- -- Command:, - ```bash - > llm_transform.py -i input.py -o cfile -p code_review_correctness - ``` - -## Code review refactoring - -Review the code for refactoring opportunities. - -- Command:, - ```bash - > llm_transform.py -i input.py -o cfile -p code_review_refactoring - ``` - -# Markdown Processing - -## Md clean up how to guide - -Format the text to rewrite as a how_to_guide and contain the sections like - - Goal / Use Case - - Assumptions / Requirements - - Step-by-Step Instructions - - Alternatives or Optional Steps - - Troubleshooting - -- Command:, - ```bash - > llm_transform.py -i input.md -o output.md -p md_clean_up_how_to_guide - ``` - -## Md rewrite - - Rewrite the text passed in a technical document style to - increase clarity and readability. - -- Command:, - ```bash - > llm_transform.py -i input.md -o output.md -p md_rewrite - ``` - -## Md summarize short - - Summarize the text in less than 30 words. - -- Command:, - ```bash - > llm_transform.py -i input.md -o output.md -p md_summarize_short - ``` - - \ No newline at end of file +##### Input + +```md +- Foo benefits + - Foo is versatile for various applications + - Enhances productivity by automating tasks + - User-friendly and integrates with systems + - Provides robust security for data + - Cost-effective, reducing resource needs + - E.g., In software development, Foo automates build and deployment, saving + time and reducing errors + - E.g., In data analysis, Foo processes large datasets quickly, providing + insights faster than manual methods +``` + +##### Output + +```md +- Foo benefits + - Enhances productivity by automating tasks + - User-friendly and integrates with systems + - Provides robust security for data + - Cost-effective, reducing resource needs + - E.g., In software development, Foo automates build and deployment, saving + time and reducing errors + - E.g., In data analysis, Foo processes large datasets quickly, providing + insights faster than manual methods +``` + +#### 
`slide_smart_colorize` + +- Apply color‑tag markup to highlight repeated tokens + +##### Input + +```md +P(A|B) = P(B|A)P(A)/P(B) +``` + +##### Output + +```md +\red{P(A|B)} = \frac{\green{P(B|A)}\blue{P(A)}}{\violet{P(B)}} +``` + +#### `slide_to_bullet_points` + +- Convert prose into structured slide bullet points + +##### Input + +```md +Foo is fast. Foo is scalable. +``` + +##### Output + +```md +- Foo is fast +- Foo is scalable +``` + +### Text Prompts + +#### `text_idea` + +- Brainstorm or list creative ideas on a topic + +##### Input +``` +Write about edge‑AI. +``` + +##### Output + +```md +Edge AI refers to the deployment of artificial intelligence algorithms and +models directly on devices at the edge of a network, rather than relying on +centralized cloud-based systems. This approach allows data to be processed +locally on devices such as smartphones, IoT devices, sensors, and other edge +hardware, enabling faster decision-making, reduced latency, and improved +privacy. + +### Key Features of Edge AI: + +1. **Low Latency**: By processing data locally, edge AI reduces the time it + takes to analyze data and make decisions. This is crucial for applications + requiring real-time responses, such as autonomous vehicles, industrial + automation, and augmented reality. + +2. **Improved Privacy and Security**: Since data is processed on the device + itself, sensitive information does not need to be transmitted to the cloud, + reducing the risk of data breaches and enhancing user privacy. + +(...) +``` + +#### `text_rephrase` + +- Paraphrase text while preserving meaning + +##### Input +``` +Turkey is a melting pot of cultures. +``` + +##### Output +``` +Yes, Turkey is indeed a melting pot of cultures, owing to its unique geographical location and rich history. Situated at the crossroads of Europe and Asia, Turkey has been a bridge between different civilizations for centuries. 
This has resulted in a diverse cultural heritage that includes influences from the Greeks, Romans, Byzantines, and Ottomans, among others. + +(...) +``` + +#### `text_rewrite` + +- Rewrite text for clarity and readability + +##### Input +``` +Yes, Turkey is indeed a melting pot of cultures, owing to its unique geographical location and rich history. Situated at the crossroads of Europe and Asia, Turkey has been a bridge between different civilizations for centuries. This has resulted in a diverse cultural heritage that includes influences from the Greeks, Romans, Byzantines, and Ottomans, among others. +``` + +##### Output +``` +Turkey is a true melting pot of cultures, thanks to its unique geographical position and rich historical background. Located at the intersection of Europe and Asia, Turkey has served as a bridge between various civilizations for centuries. This has led to a diverse cultural heritage, with influences from: + +- Greeks +- Romans +- Byzantines +- Ottomans + +And many others. +``` + +### Review Prompts + +#### `review_correctness` + +- Report potential logical errors in code + +##### Input + +```py +return x/len(lst) +``` + +##### Output +``` +/app/helpers_root/docs/tools/scratch.txt:1: Add error handling for division by zero and ensure `lst` is not empty before performing the division. +``` + +#### `review_linter` + +- Report style/lint violations in vim‑cfile format + +##### Input + +```python +from os.path import join + +def Foo(x):return x+1 +``` + +##### Output +``` +/app/helpers_root/docs/tools/scratch.txt:3: Naming-93: Function name `Foo` should be a verb or verb/action, not a noun. +``` + +#### `review_llm` + +- Fact‑check or critique statements with LLM guidelines + +##### Input + +```python +``` + +##### Output +``` +``` + +//TODO(indro): Add examples after prompt works. 
+ +#### `review_refactoring` + +- Highlight refactoring opportunities for readability/DRY + +##### Input + +```python +if mode==1: do_a() else: do_a() +``` + +##### Output +``` +/app/helpers_root/docs/tools/scratch.txt:1: Simplify the conditional statement since both branches execute the same function `do_a()`. +``` + +### Miscellaneous Prompts + +#### `misc_categorize_topics` + +- Classify article titles into predefined topics + +##### Input +``` +Reinforcement Learning with Large Language Models +``` + +##### Output +``` +Reinforcement Learning with | LLM Reasoning +``` + +#### `test` + +- Return SHA‑256 hash of the input + +##### Input +``` +Information +``` + +##### Output +``` +d25c105adac26e714d55906fdb5d3451a12483f71948db1d7fd6cbdaa8ee231a +``` From 463a562c6d1c2806f400f4d1fa47d4c99db92a7b Mon Sep 17 00:00:00 2001 From: Indrayudd Roy Chowdhury Date: Fri, 6 Jun 2025 01:15:43 -0400 Subject: [PATCH 5/9] HelpersTask702: Slight changes in formatting --- docs/tools/all.llm_transform.reference.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md index e617b5495..30c4a5793 100644 --- a/docs/tools/all.llm_transform.reference.md +++ b/docs/tools/all.llm_transform.reference.md @@ -830,8 +830,7 @@ Use `foo()` to parse data. 
##### Output ```md -Bold text -italicized text +Bold text italicized text ``` #### `md_rewrite` @@ -1048,7 +1047,7 @@ P(A|B) = P(B|A)P(A)/P(B) ##### Output -```md +```latex \red{P(A|B)} = \frac{\green{P(B|A)}\blue{P(A)}}{\violet{P(B)}} ``` From e17f8fae03f64b822f8e2bfb626ed49cd488522b Mon Sep 17 00:00:00 2001 From: Indrayudd Roy Chowdhury Date: Fri, 6 Jun 2025 01:44:28 -0400 Subject: [PATCH 6/9] HelpersTask702: Make a how-to guide for llm_transform --- docs/tools/all.llm_transform.how_to_guide.md | 253 ++++++++----------- 1 file changed, 112 insertions(+), 141 deletions(-) diff --git a/docs/tools/all.llm_transform.how_to_guide.md b/docs/tools/all.llm_transform.how_to_guide.md index 906cd6de8..f9db24682 100644 --- a/docs/tools/all.llm_transform.how_to_guide.md +++ b/docs/tools/all.llm_transform.how_to_guide.md @@ -1,148 +1,119 @@ -- [Structure of files](#structure-of-files) -- [Step-by-step execution](#step-by-step-execution) - * [Example](#example) +- [`llm_transform.py` - How-to Guide](#llm_transformpy---how-to-guide) + * [Goal / Use Case](#goal--use-case) + * [Assumptions / Requirements](#assumptions--requirements) + * [Step-by-Step Instructions](#step-by-step-instructions) + * [Examples](#examples) + + [Using an Input File](#using-an-input-file) + + [Using `stdin`](#using-stdin) + + [Using Vim](#using-vim) + * [Prompts](#prompts) -# Structure of files -Search files... -Based on the code analysis, I can explain the structure of these three files -and how they work together. Here's a comprehensive breakdown: - -1. **Overall Architecture** -The system is designed to apply LLM-based transformations to code and text -files, with a focus on code quality, documentation, and formatting. -The architecture consists of three main components: - -2. 
**llm_transform.py** (Main Entry Point) -- This is the primary script that users interact with -- Key features: - - Handles input/output from either stdin/stdout or files - - Manages Docker container setup and execution - - Provides command-line interface for various transformations -- Main functions: - - `_parse()`: Sets up command-line argument parsing - - `_run_dockerized_llm_transform()`: Manages Docker container execution - - `_main()`: Orchestrates the overall transformation process - -3. **dockerized_llm_transform.py** (Docker Container Script) -- Runs inside a Docker container -- Key features: - - Handles the actual LLM transformation - - Manages dependencies within the container - - Processes input/output files -- Main functions: - - `_parse()`: Parses command-line arguments - - `_main()`: Executes the transformation using LLM prompts - -4. **llm_prompts.py** (Prompt Management) -- Contains all the transformation prompts and their configurations -- Key features: - - Defines various transformation types (code fixes, reviews, markdown processing) - - Manages pre and post-transformation steps - - Handles prompt templates and contexts -- Main components: - - Prompt definitions (e.g., `code_fix_docstrings`, `code_review_correctness`) - - Transformation management (`run_prompt()`) - - Context definitions (`_CODING_CONTEXT`, `_MD_CONTEXT`) - -5. **Transformation Types** -The system supports several categories of transformations: -- Code Fixes: - - Docstring improvements - - Type hint additions - - Logging statement fixes - - String formatting fixes -- Code Reviews: - - Correctness review - - Refactoring suggestions -- Markdown Processing: - - Document rewriting - - Summarization - - How-to guide formatting - - Explanation document formatting - -6. **Transformation Process Flow** -1. User calls `llm_transform.py` with input/output files and transformation type -2. `llm_transform.py` sets up Docker container and passes control to - `dockerized_llm_transform.py` -3. 
`dockerized_llm_transform.py` uses `llm_prompts.py` to: - - Apply pre-transformations - - Execute LLM transformation - - Apply post-transformations -4. Results are written back to the output file - -7. **Docker Integration** -- Uses Python 3.12 Alpine as base image -- Installs necessary dependencies (PyYAML, requests, pandas, openai) -- Manages file paths between host and container -- Handles environment variables (e.g., OPENAI_API_KEY) - -8. **Error Handling and Logging** -- Comprehensive logging system -- Type checking and assertions -- Error handling for file operations -- Docker container management error handling - -This architecture provides a robust and flexible system for applying LLM-based -transformations to code and text files, with proper isolation of dependencies -through Docker and a clear separation of concerns between the different components. - -# Step-by-step execution - -Please note that same steps are followed for any transformation execution. -The input and output file extensions could be .txt, .py, .md, stdout -depending upon the type of transformation. - -- Choose the input file - - Locate the Python file you want to transform. - - Alternatively, you may use stdin to input the code manually. -- Specify the output file - - Provide a path to save the transformed output. - - Example path: research_amp/causal_kg/scrape_fred_metadata.new -- Open a terminal and run the following command: +# `llm_transform.py` - How-to Guide + +## Goal / Use Case + +This guide explains how to use `llm_transform.py` on code and text files. The +tool focuses on improving code quality, documentation, and formatting through +various transformations such as code fixes, reviews, and markdown processing. + +## Assumptions / Requirements + +* Docker is installed and properly configured on your system. +* An OpenAI API key is available in your environment variables. + +## Step-by-Step Instructions + +1. 
**Choose the Input File** + +* Locate the Python or text file you want to transform. +* Alternatively, you can use `stdin` to input the code manually. + +2. **Specify the Output File** + +* Provide a path to save the transformed output. +* Example path: `research_amp/causal_kg/scrape_fred_metadata.new` + +3. **Run the Transformation Command** + +* Open a terminal and execute: + + ```bash + llm_transform.py -i -o -p + ``` + +4. **Verify the Result** + +* The transformed output is saved to the specified output file. + +## Examples + +### Using an Input File + +* **Input file:** `research_amp/causal_kg/scrape_fred_metadata.py` + + Example content: + + ```python + from utils import parser + from helpers import hopenai + ``` + +* **Output file:** `research_amp/causal_kg/scrape_fred_metadata_new.py` + +* **Command:** + + ```bash + llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports + ``` + +* **Resulting output:** + + ```python + import utils.parser + import helpers.hopenai + ``` + +### Using `stdin` + +* **Input:** `stdin` + +* **Output:** `stdout` + +* **Command:** + ```bash - > llm_transform.py -i input-file -o output-file -p + llm_transform.py -i - -o - -p code_fix_from_imports + # input: + from utils import parser + from helpers import hopenai + # press Ctrl + D + ``` + +* **Resulting output:** + + ```python + import utils.parser + import helpers.hopenai ``` -- Verify the result - - The output would be copied to the output file. - -## Example - -- In case of an input file - - step 1: Input file is 'research_amp/causal_kg/scrape_fred_metadata.py'. - - Suppose, the file contains the following imports: - ```python - from utils import parser - from helpers import hopenai - ``` - - step 2: Output file is 'research_amp/causal_kg/scrape_fred_metadata_new.py'. 
- - step 3: Run the following command: - ```bash - > llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py - -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports - ``` - - step 4: The output is following and saved in the above-mentioned output file: - ```python - import utils.parser - import helpers.hopenai - ``` - -- In case of stdin - - step 1: Input is stdin - - step 2: Output is stdout - - step 3: Run the following command: - ```bash - > llm_transform.py -i - -o - -p code_fix_from_imports - # input: - from utils import parser - from helpers import hopenai - [press Ctrl + D] - ``` - - step 4: The output is following and saved in the above-mentioned output file: - ```python - import parser.utils - import helpers.hopenai - ``` + +### Using Vim + +Transform selected lines directly within **Vim**: + +```vim +:'<,'>!llm_transform.py -p summarize -i - -o - +``` + +This command pipes the current visual selection (denoted by `'<,'>`) to +`llm_transform.py` with the `summarize` prompt and replaces the selection with +the transformed text. + +## Prompts + +Different transformation types are selected by specifying a `` +value. Available tags include transformations for code fixes, code reviews, +markdown processing, and more, as detailed in the reference documentation. 
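The `code_fix_from_imports` examples in the how-to guide above turn `from X import Y` statements into `import X.Y`. The sketch below reproduces that rewrite deterministically, only to make the expected result concrete. It is not how `llm_transform.py` works (the tool sends the text to an LLM), and the `rewrite_from_imports` helper and its regex are hypothetical names introduced for illustration.

```python
import re

# Hypothetical helper, NOT part of llm_transform.py: the real tool delegates
# this rewrite to an LLM prompt. The regex only covers the simple
# single-name form `from <package> import <module>`.
_FROM_IMPORT = re.compile(r"^from\s+([\w.]+)\s+import\s+(\w+)\s*$")


def rewrite_from_imports(source: str) -> str:
    """Rewrite `from X import Y` lines as `import X.Y`, leaving other lines as-is."""
    rewritten = []
    for line in source.splitlines():
        match = _FROM_IMPORT.match(line)
        if match:
            package, module = match.groups()
            rewritten.append(f"import {package}.{module}")
        else:
            rewritten.append(line)
    return "\n".join(rewritten)


print(rewrite_from_imports("from utils import parser\nfrom helpers import hopenai"))
```

This sketch assumes the imported name is a module; `import X.Y` is not valid when `Y` is a function or class, and call sites that use the bare name would also need updating, which the sketch does not attempt.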
From afa1ce20167b4a66cba358e0457d5c13d222279f Mon Sep 17 00:00:00 2001 From: Indrayudd Roy Chowdhury Date: Fri, 6 Jun 2025 04:36:11 -0400 Subject: [PATCH 7/9] HelpersTask702: Add architecture md --- docs/tools/all.llm_transform.explanation.md | 118 ++++++++++++++++++++ 1 file changed, 118 insertions(+) create mode 100644 docs/tools/all.llm_transform.explanation.md diff --git a/docs/tools/all.llm_transform.explanation.md b/docs/tools/all.llm_transform.explanation.md new file mode 100644 index 000000000..c64e8acea --- /dev/null +++ b/docs/tools/all.llm_transform.explanation.md @@ -0,0 +1,118 @@ + + +- [llm_transform.py - Architecture & Flow Explanation](#llm_transformpy---architecture--flow-explanation) + * [High Level Flow](#high-level-flow) + * [Architecture Diagrams (C4)](#architecture-diagrams-c4) + + [System Context](#system-context) + + [Container](#container) + + [Component](#component) + + + +# llm_transform.py - Architecture & Flow Explanation + +## High Level Flow + +- **Argument parsing** – uses [`/helpers/hparser.py`](/helpers/hparser.py) to + normalise CLI flags. +- **Input acquisition** – [`/helpers/hio.py`](/helpers/hio.py) resolves `‑i` or + stdin and reads bytes. +- **Prompt selection** – `llm_prompts.py` maps the `‑p/--prompt-tag` value to a + concrete system/assistant prompt. +- **LLM invocation** – the request is handed to the generic client in + [`/helpers/hserver.py`](/helpers/hserver.py) (through `llm_prompts.py`). +- **Post‑processing** – raw LLM text may be re‑formatted by + [`/helpers/hmarkdown.py`](/helpers/hmarkdown.py) (e.g. bold top‑level + bullets). +- **Output emission** – [`/helpers/hio.py`](/helpers/hio.py) writes to stdout or + the `‑o` file. +- **Optional Dockerisation** – if `‑‑dockerize` is set, control reroutes via + `dockerized_llm_transform.py`, which uses + [`/helpers/hdocker.py`](/helpers/hdocker.py) to spin up a container and + re‑invoke the script inside it. 
+ +## Architecture Diagrams (C4) + +### System Context + +```mermaid +C4Context + title LLM Transform – System Context + Person(dev, "Developer", "Invokes CLI to transform code/text") + System_Boundary(causify, "Causify CLI Tools") { + Container(llm_cli, "llm_transform.py", "Python CLI", "Coordinates LLM transformations") + } + System_Ext(openai, "LLM Provider", "REST API", "e.g. OpenAI") + Rel(dev, llm_cli, "Runs", "CLI") + Rel(llm_cli, openai, "Sends prompt & receives completion", "HTTPS") +``` + +### Container + +```plantuml +@startuml + title LLM Transform – Containers + + ' Components + component [LLM API] as openai_api + note top of openai_api : HTTPS – External large-language-model + + ' Databases + database "Local FS" as filesystem + note top of filesystem : Text/Code files\nInput & output artefacts + + ' Containers + node "llm_transform.py\n(Python – Core orchestration CLI)" as llm_transform + node "dockerized_llm_transform.py\n(Python – Optional container bootstrapper)" as docker_wrapper + + ' Relationships + llm_transform --> openai_api : Calls + llm_transform --> filesystem : Reads/Writes + docker_wrapper --> llm_transform : Executes inside container +@enduml +``` + +### Component + +```plantuml +@startuml + title llm_transform.py – Internal Components + + ' Components + component [OpenAI API] as OpenAI_API + note top of OpenAI_API : REST-based LLM provider. 
+ + ' Containers + node "llm_transform.py\n(Python CLI)" as llm_transform_container { + [llm_transform.py] as llm_main + note left of llm_main: Main entrypoint / Coordinator + + [helpers/hparser.py] as hparser + note left of hparser: Argument parsing + + [helpers/hio.py] as hio + note left of hio: File / STDIN I/O + + [llm_prompts.py] as llm_prompts + note left of llm_prompts: Prompt templates & dispatch + + [helpers/hmarkdown.py] as hmarkdown + note left of hmarkdown: Markdown post-processing + + [helpers/hgit.py] as hgit + note left of hgit: Git diff utilities + + [helpers/hdocker.py] as hdocker + note left of hdocker: Docker helpers + } + + ' Edge labels + llm_main --> hparser : Parses flags → supplies prompt-tag + llm_main --> hio : Reads/Writes files or STDIN/STDOUT + llm_main --> llm_prompts : Selects prompt template + llm_prompts --> OpenAI_API : Calls LLM provider + llm_main --> hmarkdown : Formats output as Markdown + llm_main --> hgit : Optionally computes Git diff + llm_main --> hdocker : Spawns container run (when --dockerize) +@enduml +``` From abb2ad3d8cfcad3df334a08f1977e7b52a925647 Mon Sep 17 00:00:00 2001 From: Indrayudd Roy Chowdhury Date: Fri, 6 Jun 2025 04:44:00 -0400 Subject: [PATCH 8/9] HelpersTask702: Format architecture md --- docs/tools/all.llm_transform.explanation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/tools/all.llm_transform.explanation.md b/docs/tools/all.llm_transform.explanation.md index c64e8acea..49285b734 100644 --- a/docs/tools/all.llm_transform.explanation.md +++ b/docs/tools/all.llm_transform.explanation.md @@ -20,7 +20,7 @@ - **Prompt selection** – `llm_prompts.py` maps the `‑p/--prompt-tag` value to a concrete system/assistant prompt. - **LLM invocation** – the request is handed to the generic client in - [`/helpers/hserver.py`](/helpers/hserver.py) (through `llm_prompts.py`). + [`/helpers/hserver.py`](/helpers/hserver.py) through `llm_prompts.py`. 
- **Post‑processing** – raw LLM text may be re‑formatted by [`/helpers/hmarkdown.py`](/helpers/hmarkdown.py) (e.g. bold top‑level bullets). From 9ac48160e13b18e496f2c6c4631177909814d5f9 Mon Sep 17 00:00:00 2001 From: GP Saggese Date: Mon, 9 Jun 2025 19:26:12 -0400 Subject: [PATCH 9/9] Review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-commit checks: All checks passed ✅ --- docs/tools/all.llm_transform.how_to_guide.md | 114 ++++++++++--------- docs/tools/all.llm_transform.reference.md | 27 +++-- 2 files changed, 72 insertions(+), 69 deletions(-) diff --git a/docs/tools/all.llm_transform.how_to_guide.md b/docs/tools/all.llm_transform.how_to_guide.md index f9db24682..b8133f31a 100644 --- a/docs/tools/all.llm_transform.how_to_guide.md +++ b/docs/tools/all.llm_transform.how_to_guide.md @@ -16,61 +16,44 @@ ## Goal / Use Case -This guide explains how to use `llm_transform.py` on code and text files. The -tool focuses on improving code quality, documentation, and formatting through -various transformations such as code fixes, reviews, and markdown processing. +- This guide explains how to use `llm_transform.py` on code and text files + - This tool focuses on improving code quality, documentation, and formatting + through various transformations (such as code fixes, reviews, and markdown + processing) using LLMs and Python code ## Assumptions / Requirements -* Docker is installed and properly configured on your system. -* An OpenAI API key is available in your environment variables. +- Docker is installed and properly configured on your system. +- An OpenAI API key is available in your environment variables. ## Step-by-Step Instructions -1. **Choose the Input File** - -* Locate the Python or text file you want to transform. -* Alternatively, you can use `stdin` to input the code manually. - -2. **Specify the Output File** - -* Provide a path to save the transformed output. 
-* Example path: `research_amp/causal_kg/scrape_fred_metadata.new` - -3. **Run the Transformation Command** - -* Open a terminal and execute: - +- Run the transformation command: ```bash - llm_transform.py -i -o -p + > llm_transform.py -i -o -p ``` -4. **Verify the Result** - -* The transformed output is saved to the specified output file. - -## Examples +- You can use an input file or `stdin` ### Using an Input File -* **Input file:** `research_amp/causal_kg/scrape_fred_metadata.py` +- **Input file:** `research_amp/causal_kg/scrape_fred_metadata.py` - Example content: + - Example content: + ```python + from utils import parser + from helpers import hopenai + ``` - ```python - from utils import parser - from helpers import hopenai - ``` - -* **Output file:** `research_amp/causal_kg/scrape_fred_metadata_new.py` +- **Output file:** `research_amp/causal_kg/scrape_fred_metadata_new.py` -* **Command:** +- **Command:** ```bash - llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports + > llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports ``` -* **Resulting output:** +- **Resulting output:** ```python import utils.parser @@ -79,11 +62,7 @@ various transformations such as code fixes, reviews, and markdown processing. ### Using `stdin` -* **Input:** `stdin` - -* **Output:** `stdout` - -* **Command:** +- **Command:** ```bash llm_transform.py -i - -o - -p code_fix_from_imports @@ -93,27 +72,52 @@ various transformations such as code fixes, reviews, and markdown processing. 
# press Ctrl + D ``` -* **Resulting output:** +- **Resulting output:** ```python import utils.parser import helpers.hopenai ``` -### Using Vim - -Transform selected lines directly within **Vim**: - -```vim -:'<,'>!llm_transform.py -p summarize -i - -o - -``` +- You can transform selected lines directly within **Vim**: + ```vim + :'<,'>!llm_transform.py -p summarize -i - -o - + ``` -This command pipes the current visual selection (denoted by `'<,'>`) to -`llm_transform.py` with the `summarize` prompt and replaces the selection with -the transformed text. +- This command pipes the current visual selection (denoted by `'<,'>`) to + `llm_transform.py` with the `summarize` prompt and replaces the selection with + the transformed text. ## Prompts -Different transformation types are selected by specifying a `` -value. Available tags include transformations for code fixes, code reviews, -markdown processing, and more, as detailed in the reference documentation. +- Different transformation types are selected by specifying a `` + value + - Available tags include transformations for code fixes, code reviews, markdown + processing, and more, as detailed in the reference documentation. + +- You can get the current list with: + ``` + > llm_transform.py -p list + # Available prompt tags: + code_apply_cfile + code_fix_by_using_f_strings + code_fix_by_using_perc_strings + code_fix_code + code_fix_comments + code_fix_complex_assignments + code_fix_docstrings + code_fix_from_imports + code_fix_function_type_hints + code_fix_log_string + code_fix_logging_statements + code_fix_star_before_optional_parameters + code_fix_unit_test + code_transform_apply_csfy_style + code_transform_apply_linter_instructions + code_transform_remove_redundancy + code_write_1_unit_test + code_write_unit_test + latex_check + latex_rewrite + ... 
+ ``` diff --git a/docs/tools/all.llm_transform.reference.md b/docs/tools/all.llm_transform.reference.md index 30c4a5793..27eaa2162 100644 --- a/docs/tools/all.llm_transform.reference.md +++ b/docs/tools/all.llm_transform.reference.md @@ -153,14 +153,14 @@ # `llm_transform.py` -The script is capable of performing certain transformations using OpenAI's LLMs -on the input text or the stdin. The transformed output is then stored in the -output file or the stdout depending upon the arguments passed by the user. +- The script is capable of performing certain transformations using OpenAI's LLMs + on the input text or the stdin. The transformed output is then stored in the + output file or the stdout depending upon the arguments passed by the user. ## Interface ```bash -llm_transform.py -h +> llm_transform.py -h usage: llm_transform.py [-h] -i INPUT -o OUTPUT -p PROMPT [--compare] [-b | --bold_first_level_bullets] @@ -192,17 +192,16 @@ options: Set the logging level ``` -## Basic Usage +- The basic usage + ```bash + > llm_transform.py -i input.txt -o output.txt -p + # or + > llm_transform.py -i - -o - + ``` -```bash -> llm_transform.py -i input.txt -o output.txt -p -# or -> llm_transform.py -i - -o - -``` - -The script generates output from an LLM based on the user-specified prompt tag, -pplying the transformation to the input file. Using - for `-i` and `-o` -signifies stdin and stdout. +- The script generates output from an LLM based on the user-specified prompt tag, + applying the transformation to the input file. Using - for `-i` and `-o` + signifies stdin and stdout. > Note: Use the -s flag to see the LLM output without any post-processing.
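The stdin/stdout form of the basic usage (`-i - -o -`) also makes it possible to call the script from other Python code. The following is a minimal sketch under stated assumptions: `llm_transform.py` is on `PATH`, Docker and the OpenAI API key are set up as the how-to guide requires, and `build_command` and `transform_text` are hypothetical wrapper names, not part of the tool. Only the `-i`, `-o`, and `-p` flags shown in the usage text are used.

```python
import subprocess
from typing import List


def build_command(
    prompt_tag: str, input_path: str = "-", output_path: str = "-"
) -> List[str]:
    """Build the argv for llm_transform.py; "-" selects stdin or stdout."""
    return ["llm_transform.py", "-i", input_path, "-o", output_path, "-p", prompt_tag]


def transform_text(text: str, prompt_tag: str) -> str:
    """Pipe `text` through llm_transform.py and return what it writes to stdout.

    Assumes the script is on PATH; raises CalledProcessError on a non-zero exit.
    """
    result = subprocess.run(
        build_command(prompt_tag),
        input=text,           # sent to the script's stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # fail loudly if the script errors out
    )
    return result.stdout
```

A call such as `transform_text("from utils import parser\n", "code_fix_from_imports")` would then mirror the `stdin` example from the how-to guide. Keeping command construction separate from execution makes the argument list easy to inspect without starting a container.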