Refactor input control with shared _common library and add EPR, CPO, GEPA, PRewrite methods #18
Open
emiehling wants to merge 9 commits into
Rename the shared state_control library from common/ to _common/ to mark it private, and update all importing controls (act_add, caa, cast, iti) and tests to the new path. Pure rename plus import-path updates; no behavior change. Signed-off-by: Erik Miehling <emiehling@gmail.com>
Introduce the input_control _common library (memory, formatters, proposers, scorers, selectors, pareto/budget utilities) and harden the InputControl base (adapt / adapt_messages application). Add supporting evaluation-metric helpers (judge base, short-answer match) that the scorers build on, plus core type and pipeline updates. Add econml/einops dependencies. Includes _common and core tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
Rework FewShot to route adapt/adapt_messages through the shared FewShotBlockFormatter and pluggable selectors, replace the template arg with a formatter arg, and add the EPR learned dense retriever (Rubin et al. 2021) as a BaseSelector. Includes FewShot and EPR tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
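A minimal sketch of the selector-plus-formatter split described in this commit. All names here (`Example`, `Selector`, `OverlapSelector`, `format_block`, `adapt`) are hypothetical and are not the PR's actual classes. The toy selector ranks by word overlap, whereas EPR would plug in a learned dense retriever behind the same interface.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Example:
    question: str
    answer: str


class Selector(Protocol):
    """Hypothetical selector interface: pick k demonstrations for a query."""

    def select(self, query: str, pool: Sequence[Example], k: int) -> Sequence[Example]: ...


class OverlapSelector:
    """Toy selector that ranks pool examples by word overlap with the query."""

    def select(self, query: str, pool: Sequence[Example], k: int) -> Sequence[Example]:
        query_words = set(query.lower().split())

        def overlap(ex: Example) -> int:
            return len(query_words & set(ex.question.lower().split()))

        # sorted() is stable, so tied examples keep their pool order.
        return sorted(pool, key=overlap, reverse=True)[:k]


def format_block(examples: Sequence[Example]) -> str:
    """Hypothetical formatter: render demonstrations as a Q/A block."""
    return "\n\n".join(f"Q: {ex.question}\nA: {ex.answer}" for ex in examples)


def adapt(prompt: str, pool: Sequence[Example], selector: Selector, k: int = 2) -> str:
    """Prepend the selected demonstrations to the prompt."""
    demos = selector.select(prompt, pool, k)
    return f"{format_block(demos)}\n\nQ: {prompt}\nA:"


pool = [
    Example("capital of France", "Paris"),
    Example("two plus two", "4"),
    Example("capital of Spain", "Madrid"),
]
adapted = adapt("What is the capital of Italy", pool, OverlapSelector(), k=2)
```

Keeping selection and formatting behind separate interfaces means a different retriever can be swapped in without touching how the block is rendered.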
Add PRewrite (Kong et al. 2024): an LLM instruction rewriter with greedy inference and best-of-K search strategies, plus an optional GRPO-trained rewriter using a metric-in-the-loop reward. Add PPO and GRPO TRL wrappers (and remove the unused SPPO wrapper) to support rewriter training. Includes PRewrite, PPO, and GRPO wrapper tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
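The best-of-K strategy mentioned above can be sketched as follows, assuming the rewriter and the metric are plain callables. `best_of_k`, `toy_rewrite`, and `toy_score` are hypothetical names for illustration. In the PR the rewriter would be an LLM and the score would come from the task metric.

```python
from typing import Callable, List, Tuple


def best_of_k(
    instruction: str,
    rewrite: Callable[[str, int], str],
    score: Callable[[str], float],
    k: int,
) -> Tuple[str, float]:
    """Sample k rewrites of `instruction` and return the best one with its score."""
    if k < 1:
        raise ValueError("k must be at least 1")
    candidates: List[str] = [rewrite(instruction, i) for i in range(k)]
    scored = [(score(c), c) for c in candidates]
    # max() keeps the first candidate on ties, so the result is deterministic.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins (hypothetical): a real rewriter would be an LLM call and a
# real scorer would run the task metric on a development set.
TEMPLATES = ["{} Answer briefly.", "{} Think step by step.", "Please note: {}"]


def toy_rewrite(instruction: str, i: int) -> str:
    return TEMPLATES[i % len(TEMPLATES)].format(instruction)


def toy_score(prompt: str) -> float:
    return float(prompt.count("step"))


best, best_score = best_of_k("Solve the problem.", toy_rewrite, toy_score, k=3)
```

Greedy inference corresponds to taking a single rewrite without scoring. Best-of-K spends K rewriter calls plus K metric evaluations in exchange for a scored choice.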
Add CPO (Chen et al. 2026): offline causal reward training (Double ML over PCA-reduced embeddings, GBR fallback) with per-query tree search, routed through the shared TaskEvaluationScorer. Includes CPO tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
Add GEPA (Agrawal et al. 2025): reflective genetic prompt optimization (single-system-prompt variant) with Pareto-based parent selection over a held-out set and a budgeted reflective-mutation loop. Includes GEPA tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
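A rough sketch of Pareto-based parent selection over per-example held-out scores. This is a simplification under stated assumptions: it keeps every candidate that no other candidate dominates and samples uniformly among them, which may differ from the PR's actual rule. The function names and the score table are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the candidates that no other candidate dominates."""
    return [
        name
        for name, own in scores.items()
        if not any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


def select_parent(scores: Dict[str, Sequence[float]], rng: random.Random) -> str:
    """Pick the next prompt to mutate, uniformly from the Pareto front."""
    return rng.choice(pareto_front(scores))


# Hypothetical per-example scores for four candidate prompts on a 3-example held-out set.
held_out_scores = {
    "seed": [0.2, 0.5, 0.1],
    "a": [0.9, 0.4, 0.1],
    "b": [0.1, 0.8, 0.3],
    "c": [0.1, 0.7, 0.3],  # dominated by "b"
}
front = pareto_front(held_out_scores)
parent = select_parent(held_out_scores, random.Random(0))
```

Selecting from the front rather than by mean score keeps candidates that are best on only a few examples, which gives the mutation loop more diverse parents.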
Add reference pages and example notebooks for FewShot/PRewrite/CPO/GEPA, wire all four into the nav and examples landing page, and rewrite the structural/state/output sections of controls.md to the per-method bullet style with reference links and citations. Fix the STEERING_METHOD registry name and field in the add-a-method tutorials and the _common building-blocks path. Signed-off-by: Erik Miehling <emiehling@gmail.com>
…oint The self-built attention mask used a square (seq_len, seq_len) shape from the current hidden states, which is wrong during decoding where the single query token must attend to the full kv cache. SDPA's fused CUDA kernel rejected the mismatched bias (reported as a contiguity error), failing all PASTA CUDA tests. The mask now spans the cached key positions via cache_position. Also replace the bare 'except: breakpoint()' in get_hooks with a chained ValueError so substring re-tokenization failures surface clearly.
Signed-off-by: Erik Miehling <emiehling@gmail.com>
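The shape issue behind the mask fix can be illustrated with a small sketch. It assumes `cache_position` holds the absolute positions of the current query tokens and uses nested lists in place of tensors. `causal_mask` is a hypothetical helper, not the function changed in this commit.

```python
from typing import List, Sequence


def causal_mask(cache_position: Sequence[int], kv_len: int) -> List[List[bool]]:
    """Build a (q_len, kv_len) mask; True means the query may attend to that key.

    A query at absolute position `pos` may attend to every cached key at
    position <= pos.
    """
    return [[key <= pos for key in range(kv_len)] for pos in cache_position]


# Prefill: three query tokens, three keys. The mask is square and lower-triangular.
prefill = causal_mask([0, 1, 2], kv_len=3)

# Decoding: one new query token at position 3 attends to all four cached keys.
# A square mask sized from the hidden states alone would be 1 x 1 here.
decode = causal_mask([3], kv_len=4)
```

In the decoding case the mask has one row and as many columns as there are cached keys, which is the shape the attention bias needs to match.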
Input control
- `input_control/_common` library: formatters, selectors, proposers, scorers, memory, pareto, and budget primitives reused across methods.
- `FewShot` onto `_common`: pluggable selectors (replacing the template arg with a formatter arg) and add the EPR learned dense retriever (Rubin et al. 2021).
- `CPO` (causal prompt optimization).
- `GEPA` (reflective prompt evolution).
- `PRewrite` (prompt rewriter with optional GRPO-trained rewriter).

Other
- `generate` in `SteeringPipeline` (+ `types.py`).
- `state_control/common/` to `_common/` for consistency with input control.
- `short_answer_match` generic metric; refactor `base_judge`.
- `_common` libraries.