Honest benchmark: does DSPy beat a hand-written prompt, and at what cost? Compares a manual prompt against BootstrapFewShot and MIPROv2 on three real Brazilian Portuguese (PT-BR) tasks, reporting accuracy gains and cost in USD.
python benchmark nl2sql cost-analysis dspy llm prompt-engineering llm-evaluation prompt-optimization mipro portuguese-nlp
Updated Jun 10, 2026 - Python