
feat(steering): dynamic steering — activation-conditioned steering#180

Draft
RhizoNymph wants to merge 125 commits into feat/integration from feat/dynamic-steering

@RhizoNymph RhizoNymph commented Jun 17, 2026

Owner

Draft. Ties activation capture to activation steering so activations decide when/how to steer. Three controller tiers (async → sync → in-graph), each configuring the one below. Design authority: docs/design/dynamic_steering.md (+ dynamic_steering_apc_notification.md, dynamic_steering_row_gating.md).

What's here

  • Phase 0 — async transport. In-process steering_action_queue (bounded, decode-tier-only validation), drained at the top of _update_steering_buffers.
  • Phase 1a — sync consumers + per-request actuation. execution="sync" consumer axis (every TP rank, 1-step latency); dynamic-override row pool (pure routing); observability + GET /v1/steering/dynamic; event-based on_step timing.
  • Phase 1b — gain primitives. Per-row strength scale (§5.3) + dedicated-gather dynamic additive tier (§5.4).
  • Phase 2 — in-graph monitor. Graph-safe monitor op computes a per-token gate sigmoid(sharpness·(residual·probe − threshold)) and modulates the §5.4 tier within the same forward pass. When gate_rows is set, it also gates per-request rows (decode-only; prefill is protected via a decode mask).
  • APC correctness. Worker→scheduler effective-decode-steering-signature notification so decode KV produced under dynamic steering is not falsely reused — resolves the streaming-continuation prefix-cache hole.
  • Example controller emit_mode = scale | monitor.
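
For reviewers who want the Phase 2 gate formula in concrete form, here is a minimal plain-Python sketch of sigmoid(sharpness·(residual·probe − threshold)) applied to an additive steering vector. It is illustrative only: the function names (`monitor_gate`, `apply_gated_additive`) and the list-based representation are hypothetical and do not come from this PR, whose actual monitor is a graph-safe op. Row gating and the decode mask are not modelled here.

```python
import math
from typing import List, Sequence


def monitor_gate(residual: Sequence[float], probe: Sequence[float],
                 threshold: float, sharpness: float) -> float:
    """Per-token gate: sigmoid(sharpness * (residual . probe - threshold)).

    Hypothetical helper for illustration; not the PR's monitor op.
    """
    score = sum(r * p for r, p in zip(residual, probe))
    z = sharpness * (score - threshold)
    # Numerically stable sigmoid: never exponentiate a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def apply_gated_additive(residuals: Sequence[Sequence[float]],
                         probe: Sequence[float],
                         steer_vec: Sequence[float],
                         threshold: float,
                         sharpness: float) -> List[List[float]]:
    """Add gate * steer_vec to each token's residual (assumed additive form)."""
    out = []
    for res in residuals:
        gate = monitor_gate(res, probe, threshold, sharpness)
        out.append([r + gate * s for r, s in zip(res, steer_vec)])
    return out


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [3.0, 0.0], [-1.0, 0.0]]
    steered = apply_gated_additive(tokens, probe=[1.0, 0.0],
                                   steer_vec=[0.0, 2.0],
                                   threshold=1.0, sharpness=5.0)
    for before, after in zip(tokens, steered):
        print(before, "->", after)
```

With a large `sharpness` the gate approaches a hard threshold on the probe projection; with a small one, steering strength varies smoothly with how far the projection sits from `threshold`.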

Status / validation

GPU-validated on gemma4-31B: tp=1 (per-request actuation, tier, APC reuse), tp=2 cross-node (rank-replication + APC re-keying), pp=2, active in-graph monitor (tier + row gating), and row-gating kernel/op/cudagraph parity. Extensive CPU suites.

Notes for review

…d activation_reward_producer

Co-authored-by: Claude
RhizoNymph and others added 30 commits July 1, 2026 22:58
fix(steering): content-keyed bounded probe tensor cache
fix(steering): bridged overrides preserve compose_admitted
fix(steering): fail-safe declarative gate resolution at admission
…arity

fix(steering): port declarative override parity (compose+precedence) to v2 runner
test(steering): update stale fixtures for post-#217/#219 runner state
…declarative-gate-fail-closed

# Conflicts:
#	tests/v1/test_steering_schema.py
fix(steering): declarative probe gates fail closed
chore(steering): typed RowOwner state + refcount-0 purge + dirty-state grouping
…ness

test(steering): cross-runner conformance harness for the control plane
fix(capture): port client_request_id sidecar + streaming metadata refresh to v1
…cksum

feat(steering): cross-rank applied-action checksum in dynamic status
fix(steering): warmup matches runtime row-monitor specialization; single-source op args
feat(steering): worker-registered named vectors + latch-by-reference
test(steering): drop stale second arg from scheduler override hook call
…e/steering-trust-hardening

# Conflicts:
#	vllm/v1/worker/steering_model_runner_mixin.py
test(steering): conformance harness tracks typed RowOwner keys
Bound latch memory + document dynamic-steering trust model