Refactor input control with shared _common library and add EPR, CPO, GEPA, PRewrite methods by emiehling · Pull Request #18 · IBM/AISteer360

emiehling · 2026-06-25T14:37:30Z

Input control

Add shared input_control/_common library: formatters, selectors, proposers, scorers, memory, pareto, and budget primitives reused across methods.
Rework FewShot onto _common with pluggable selectors, replace the template arg with a formatter arg, and add the EPR learned dense retriever (Rubin et al. 2021).
Add CPO (causal prompt optimization).
Add GEPA (reflective prompt evolution).
Add PRewrite (prompt rewriter with optional GRPO-trained rewriter).

Other

Polymorphic generate in SteeringPipeline (+ types.py).
Rename state_control/common/ to _common/ for consistency with input control.
Replace the SPPO trainer wrapper with GRPO and PPO trainer wrappers.
Add short_answer_match generic metric; refactor base_judge.
New control notebooks (CPO, GEPA, PRewrite) and re-run of existing notebooks.
Documentation for new methods and the _common libraries.
Test coverage for all new controls, common libraries, wrappers, and the polymorphic generate path.

Rename the shared state_control library from common/ to _common/ to mark it private, and update all importing controls (act_add, caa, cast, iti) and tests to the new path. Pure rename plus import-path updates; no behavior change. Signed-off-by: Erik Miehling <emiehling@gmail.com>

Introduce the input_control _common library (memory, formatters, proposers, scorers, selectors, pareto/budget utilities) and harden the InputControl base (adapt / adapt_messages application). Add supporting evaluation-metric helpers (judge base, short-answer match) that the scorers build on, plus core type and pipeline updates. Add econml/einops dependencies. Includes _common and core tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>

Rework FewShot to route adapt/adapt_messages through the shared FewShotBlockFormatter and pluggable selectors, replace the template arg with a formatter arg, and add the EPR learned dense retriever (Rubin et al. 2021) as a BaseSelector. Includes FewShot and EPR tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
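The selector/formatter split described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `Example`, `Selector`, `RandomSelector`, `OverlapSelector`, `BlockFormatter`, and `build_prompt` are hypothetical names, not the library's `BaseSelector` or `FewShotBlockFormatter` interfaces, which this description does not show. The word-overlap selector is a toy stand-in for a learned retriever such as EPR.

```python
import random
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Example:
    """One demonstration (hypothetical container, not the library's type)."""
    question: str
    answer: str


class Selector(Protocol):
    """Hypothetical selector interface: choose k demonstrations for a query."""

    def select(self, query: str, pool: Sequence[Example], k: int) -> list[Example]: ...


class RandomSelector:
    """Pick k demonstrations uniformly at random (seeded for repeatability)."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def select(self, query: str, pool: Sequence[Example], k: int) -> list[Example]:
        return self._rng.sample(list(pool), min(k, len(pool)))


class OverlapSelector:
    """Rank demonstrations by word overlap with the query (toy retriever)."""

    def select(self, query: str, pool: Sequence[Example], k: int) -> list[Example]:
        query_words = set(query.lower().split())
        ranked = sorted(
            pool,
            key=lambda ex: len(query_words & set(ex.question.lower().split())),
            reverse=True,
        )
        return ranked[:k]


class BlockFormatter:
    """Render the selected demonstrations and the query as one prompt block."""

    def format(self, examples: Sequence[Example], query: str) -> str:
        shots = "\n\n".join(f"Q: {ex.question}\nA: {ex.answer}" for ex in examples)
        return f"{shots}\n\nQ: {query}\nA:"


def build_prompt(query: str, pool: Sequence[Example], selector: Selector,
                 formatter: BlockFormatter, k: int = 2) -> str:
    """Selection and formatting are independent, so either can be swapped."""
    return formatter.format(selector.select(query, pool, k), query)


pool = [
    Example("What is 2 + 2?", "4"),
    Example("What is the capital of Italy?", "Rome"),
    Example("What is the capital of Spain?", "Madrid"),
]
prompt = build_prompt("What is the capital of France?", pool,
                      OverlapSelector(), BlockFormatter(), k=1)
```

Swapping `OverlapSelector()` for `RandomSelector()` changes which demonstrations are chosen without touching the formatting code, which is the point of keeping the two arguments separate.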

Add PRewrite (Kong et al. 2024): an LLM instruction rewriter with greedy inference and best-of-K search strategies, plus an optional GRPO-trained rewriter using a metric-in-the-loop reward. Add PPO and GRPO TRL wrappers (and remove the unused SPPO wrapper) to support rewriter training. Includes PRewrite, PPO, and GRPO wrapper tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
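As a rough illustration of the best-of-K search strategy named above, the sketch below samples K rewrites and keeps the one that scores highest under a metric. The names `best_of_k`, `rewrite_fn`, and `score_fn` are hypothetical: `rewrite_fn` stands in for the rewriter LLM and `score_fn` for the evaluation metric, and neither reflects the actual PRewrite implementation in this PR.

```python
from typing import Callable


def best_of_k(
    instruction: str,
    rewrite_fn: Callable[[str, int], str],
    score_fn: Callable[[str], float],
    k: int = 4,
) -> tuple[str, float]:
    """Return the highest-scoring of k candidate rewrites and its score.

    rewrite_fn(instruction, i) produces the i-th candidate; in practice this
    would be a sampled LLM generation. score_fn rates a candidate instruction,
    for example by running it on a small development set.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    candidates = [rewrite_fn(instruction, i) for i in range(k)]
    scored = [(score_fn(candidate), candidate) for candidate in candidates]
    # max() keeps the first candidate on ties, so results are deterministic
    # for a deterministic rewrite_fn.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins: a fixed list of "rewrites" and a word-count "metric".
variants = ["Answer.", "Answer briefly.", "Answer briefly and cite sources."]
result = best_of_k(
    "Answer the question",
    rewrite_fn=lambda text, i: variants[i % len(variants)],
    score_fn=lambda text: float(len(text.split())),
    k=3,
)
```

With `k=1` and a deterministic `rewrite_fn`, the same function reduces to a single-candidate (greedy-style) rewrite.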

Add CPO (Chen et al. 2026): offline causal reward training (Double ML over PCA-reduced embeddings, GBR fallback) with per-query tree search, routed through the shared TaskEvaluationScorer. Includes CPO tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>

Add GEPA (Agrawal et al. 2025): reflective genetic prompt optimization (single-system-prompt variant) with Pareto-based parent selection over a held-out set and a budgeted reflective-mutation loop. Includes GEPA tests. Signed-off-by: Erik Miehling <emiehling@gmail.com>
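One way to picture Pareto-based parent selection over a held-out set is sketched below. It assumes a per-instance scheme: keep candidates that are best on at least one held-out instance, drop any that another kept candidate dominates, and sample a parent with weight equal to its number of wins. The function names and the exact weighting are assumptions for illustration, not the code added in this PR.

```python
import random
from typing import Sequence


def pareto_parent_weights(scores: Sequence[Sequence[float]]) -> dict[int, int]:
    """Map each surviving candidate index to its number of per-instance wins.

    scores[c][i] is the score of candidate c on held-out instance i.
    """
    if not scores or not scores[0]:
        raise ValueError("scores must be a non-empty candidate x instance table")
    n_cand, n_inst = len(scores), len(scores[0])

    # Best score achieved on each instance, and who achieves it (ties included).
    best = [max(scores[c][i] for c in range(n_cand)) for i in range(n_inst)]
    wins = {
        c: [i for i in range(n_inst) if scores[c][i] == best[i]]
        for c in range(n_cand)
    }
    front = {c for c, won in wins.items() if won}

    def dominates(a: int, b: int) -> bool:
        """a is at least as good as b everywhere and strictly better somewhere."""
        pairs = list(zip(scores[a], scores[b]))
        return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

    front = {
        c for c in front
        if not any(dominates(other, c) for other in front if other != c)
    }
    return {c: len(wins[c]) for c in sorted(front)}


def sample_parent(scores: Sequence[Sequence[float]], rng: random.Random) -> int:
    """Sample a parent index with probability proportional to its win count."""
    weights = pareto_parent_weights(scores)
    candidates = list(weights)
    return rng.choices(candidates, weights=[weights[c] for c in candidates], k=1)[0]


# Four candidate prompts scored on three held-out instances.
table = [
    [1.0, 0.0, 0.5],  # best on instances 0 and 2
    [0.0, 1.0, 0.5],  # best on instances 1 and 2
    [0.4, 0.4, 0.4],  # never best, so never a parent
    [1.0, 0.0, 0.0],  # ties on instance 0 but is dominated by candidate 0
]
weights = pareto_parent_weights(table)
parent = sample_parent(table, random.Random(0))
```

Selecting by per-instance wins rather than by mean score keeps specialists in the pool: candidate 1 has the same mean as candidate 0 here, and a candidate with a lower mean would still survive if it were the only one to solve some instance.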

Add reference pages and example notebooks for FewShot/PRewrite/CPO/GEPA, wire all four into the nav and examples landing page, and rewrite the structural/state/output sections of controls.md to the per-method bullet style with reference links and citations. Fix the STEERING_METHOD registry name and field in the add-a-method tutorials and the _common building-blocks path. Signed-off-by: Erik Miehling <emiehling@gmail.com>

…oint

The self-built attention mask used a square (seq_len, seq_len) shape from the current hidden states, which is wrong during decoding where the single query token must attend to the full kv cache. SDPA's fused CUDA kernel rejected the mismatched bias (reported as a contiguity error), failing all PASTA CUDA tests. The mask now spans the cached key positions via cache_position. Also replace the bare 'except: breakpoint()' in get_hooks with a chained ValueError so substring re-tokenization failures surface clearly.

Signed-off-by: Erik Miehling <emiehling@gmail.com>
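The shape problem described in this commit can be shown with a dependency-free sketch. It uses nested lists instead of tensors, and `causal_mask` is a hypothetical helper, not the PASTA code changed here. The assumption is that `cache_position` holds the absolute position of each current query token, so the mask has one row per query token and one column per cached key position.

```python
def causal_mask(cache_position: list[int], kv_len: int) -> list[list[bool]]:
    """Build a boolean attention mask of shape (q_len, kv_len).

    True means the query token may attend to that key position. A query at
    absolute position p may attend to every key position <= p.
    """
    return [
        [key_pos <= query_pos for key_pos in range(kv_len)]
        for query_pos in cache_position
    ]


# Prefill: three prompt tokens and no cache yet, so the mask is square (3, 3).
prefill = causal_mask([0, 1, 2], kv_len=3)

# Decoding: one new token at position 3, with four key positions in the cache.
# The mask must be (1, 4); a square mask sized from the current hidden states
# would be (1, 1) and would not match the key/value length.
decode = causal_mask([3], kv_len=4)
decode_shape = (len(decode), len(decode[0]))
```

During prefill the two constructions agree, which is why a square mask only fails once decoding with a kv cache begins.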

emiehling added 9 commits June 24, 2026 10:37

chore: final re-run of relevant notebooks

716bca3

Signed-off-by: Erik Miehling <emiehling@gmail.com>

emiehling requested a review from ingelise June 25, 2026 14:37

emiehling wants to merge 9 commits into IBM:main from emiehling:input-control-refactor
