A small command-line tool for compressing prompts with LLMLingua so they use fewer tokens while keeping the important parts.

Use a virtual environment so the project dependencies stay isolated.

    python3 -m venv llmenv
    source llmenv/bin/activate
    pip install -r requirements.txt

On Windows PowerShell:

    python -m venv llmenv
    .\llmenv\Scripts\Activate.ps1
    pip install -r requirements.txt

LLMLingua downloads its model the first time the script runs. Make sure you have an internet connection and enough free disk space.

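If you want to check free disk space before the first run, a minimal sketch could look like the following. The 2 GiB threshold is a placeholder assumption, since this README does not state the model size, and the helper name `has_enough_disk_space` is hypothetical. The home directory is used as the default path on the assumption that the model cache lives somewhere beneath it.

```python
import os
import shutil

# Assumed threshold: the actual model size is not documented here,
# so 2 GiB is only a placeholder. Adjust it for the model you use.
REQUIRED_FREE_BYTES = 2 * 1024**3


def has_enough_disk_space(path=None, required_bytes=REQUIRED_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    if path is None:
        # Assumption: the model cache is stored under the user's home directory.
        path = os.path.expanduser("~")
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


if not has_enough_disk_space():
    print("Warning: free disk space is below the assumed threshold.")
```
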

Compress direct text:

    python3 compress.py --text "Create a README file for this project." --show-stats

Compress a file:

    python3 compress.py --file prompt.txt --rate 0.6

Compress from stdin:

    cat prompt.txt | python3 compress.py --rate 0.7

Preserve important words or phrases:

    python3 compress.py \
      --file prompt.txt \
      --force-token "API key" \
      --force-token "database,backup,production"

| Option | Description |
|---|---|
| `--text` | Prompt text to compress. |
| `--file` | Text file containing the prompt. |
| `--rate` | Fraction of tokens to keep. `0.7` keeps about 70%. |
| `--force-token` | Token or comma-separated tokens that must be preserved. |
| `--model` | Hugging Face model name. |
| `--device` | Device map for loading the model. Default is `cpu`. |
| `--show-stats` | Print original and compressed word counts. |

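To illustrate how the options in the table fit together, here is a rough `argparse` sketch. It is not the actual `compress.py`. The function names `build_parser` and `split_force_tokens`, the `--rate` default of `0.7`, and the `force_tokens` attribute name are all assumptions made for this example. Only the `cpu` default for `--device` comes from the table.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options table (not the real script)."""
    parser = argparse.ArgumentParser(description="Compress a prompt.")
    parser.add_argument("--text", help="Prompt text to compress.")
    parser.add_argument("--file", help="Text file containing the prompt.")
    # The default rate is an assumption; the README does not state one.
    parser.add_argument("--rate", type=float, default=0.7,
                        help="Fraction of tokens to keep, e.g. 0.7.")
    # The flag may be repeated; each use appends one string to the list.
    parser.add_argument("--force-token", dest="force_tokens", action="append",
                        help="Token or comma-separated tokens to preserve.")
    parser.add_argument("--model", help="Hugging Face model name.")
    parser.add_argument("--device", default="cpu",
                        help="Device map for loading the model.")
    parser.add_argument("--show-stats", action="store_true",
                        help="Print original and compressed word counts.")
    return parser


def split_force_tokens(values):
    """Flatten repeated, comma-separated --force-token values into one list."""
    tokens = []
    for value in values or []:
        tokens.extend(part.strip() for part in value.split(",") if part.strip())
    return tokens


# Parse an explicit argument list instead of sys.argv to keep the demo self-contained.
args = build_parser().parse_args([
    "--file", "prompt.txt",
    "--force-token", "API key",
    "--force-token", "database,backup,production",
])
print(split_force_tokens(args.force_tokens))
```

In this sketch a value without commas, such as `"API key"`, stays a single entry, while a comma-separated value is split into its parts.
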
| Rate | Meaning |
|---|---|
| `0.8` | Light compression; keeps more of the original prompt. |
| `0.6` | Balanced compression. |
| `0.3` | Aggressive compression; review the output carefully. |
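
When reviewing output, it can help to compare how much text was actually kept against the rate you asked for. The sketch below computes word-count statistics similar in spirit to what `--show-stats` is described as printing. The function name `word_stats` and the returned keys are hypothetical. Note that the rate applies to tokens while this counts whitespace-separated words, so the two numbers should be expected to differ somewhat.

```python
def word_stats(original, compressed):
    """Compare whitespace-separated word counts before and after compression."""
    original_words = len(original.split())
    compressed_words = len(compressed.split())
    # Guard against an empty original prompt to avoid division by zero.
    kept = compressed_words / original_words if original_words else 0.0
    return {
        "original_words": original_words,
        "compressed_words": compressed_words,
        "kept_fraction": kept,
    }


# Illustrative strings only; these are not real LLMLingua output.
stats = word_stats(
    "please create a short README file for this small project today",
    "create README file for this project",
)
print(stats)
```
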