voice context reset + boot-time sub-agent discovery#7

Open
zdql wants to merge 1 commit into main from feature/voice-context-reset-and-agent-discovery

Conversation

Owner

@zdql zdql commented May 27, 2026

Summary

Two related additions, sharing the control socket and progress store already plumbed by #4.

1. Voice context reset

Long realtime sessions accumulate enough conversation history that the voice model starts drifting and (occasionally) the server rejects further turns. Three new ways to wipe the conversation back to a safe baseline without disturbing audio or the websocket itself:

  • Voice tool reset_voice_context(reason?) — the realtime model can self-reset when it senses overload or the user asks for a fresh start. Acked via function_call_output so the model keeps speaking naturally afterward.
  • Control socket Request::Reset exposed as gamechat reset [--reason TEXT] from a second terminal.
  • Auto-reset once tracked items exceed settings.auto_reset_after_items (clamped to [20, 2000]; omit or set to 0 to disable; disabled by default).
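
The clamping rule for auto-reset can be sketched as below. This is a minimal illustration, not the code in this PR: the function and constant names are hypothetical, and "exceed" is read as strictly greater than.

```rust
/// Bounds quoted in the description above.
const MIN_ITEMS: usize = 20;
const MAX_ITEMS: usize = 2000;

/// Hypothetical helper: returns `None` when auto-reset is disabled
/// (setting omitted or set to 0), otherwise the configured value
/// clamped into [MIN_ITEMS, MAX_ITEMS].
pub fn effective_auto_reset_threshold(configured: Option<usize>) -> Option<usize> {
    match configured {
        None | Some(0) => None,
        Some(n) => Some(n.clamp(MIN_ITEMS, MAX_ITEMS)),
    }
}

/// Hypothetical helper: true once the tracked item count exceeds the
/// effective threshold. Always false while auto-reset is disabled.
pub fn should_auto_reset(tracked_items: usize, threshold: Option<usize>) -> bool {
    matches!(threshold, Some(t) if tracked_items > t)
}
```

Under this reading, a configured value of 5 behaves like 20, and 9999 behaves like 2000.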

Mechanics (src/voice_loop/reset.rs):

  1. response.cancel if a response is in flight, so the server stops emitting audio for a turn whose item id we're about to delete.
  2. conversation.item.delete for every id observed via conversation.item.created (tracked in ConversationItemTracker, FIFO-capped at 4000 entries).
  3. Re-send the original session.update to re-baseline instructions, voice, tools, and turn detection.
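
A tracker with dedup and FIFO eviction, as described in step 2, might look roughly like this. The type and method names are illustrative assumptions and are not taken from `ConversationItemTracker` itself.

```rust
use std::collections::{HashSet, VecDeque};

/// Illustrative stand-in for the item tracker: remembers item ids in
/// arrival order, ignores duplicates, and evicts the oldest id once
/// `capacity` is reached.
pub struct ItemTracker {
    capacity: usize,
    order: VecDeque<String>,
    seen: HashSet<String>,
}

impl ItemTracker {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), seen: HashSet::new() }
    }

    /// Records an id from a `conversation.item.created` event.
    /// Returns false when the id was already tracked (or capacity is 0).
    pub fn observe(&mut self, id: &str) -> bool {
        if self.capacity == 0 || self.seen.contains(id) {
            return false;
        }
        if self.order.len() == self.capacity {
            // FIFO eviction: the oldest id is forgotten, so a later
            // reset can no longer delete it on the server.
            if let Some(oldest) = self.order.pop_front() {
                self.seen.remove(&oldest);
            }
        }
        self.seen.insert(id.to_owned());
        self.order.push_back(id.to_owned());
        true
    }

    pub fn len(&self) -> usize {
        self.order.len()
    }

    /// Hands back every tracked id, oldest first, and clears the tracker.
    pub fn drain(&mut self) -> Vec<String> {
        self.seen.clear();
        self.order.drain(..).collect()
    }
}
```

Pairing a `VecDeque` with a `HashSet` keeps both the duplicate check and the eviction constant-time, at the cost of storing each id twice.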

The local playback buffer and mic input are deliberately left alone, so audio already handed to cpal finishes playing while the next turn comes back clean. Any queued response.create events that pre-dated the reset are dropped — the conversation they would have continued no longer exists.
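
The full reset sequence, including the dropped `response.create` events, can be sketched as a pure planning function. The `ClientEvent` enum and `plan_reset` are hypothetical names for illustration; only the event type strings come from the description above.

```rust
/// Hypothetical model of outbound realtime events.
#[derive(Debug, Clone, PartialEq)]
pub enum ClientEvent {
    ResponseCancel,
    ItemDelete { item_id: String },
    SessionUpdate { session_json: String },
    ResponseCreate,
    /// Any other outbound event, carried by its type name.
    Other(String),
}

impl ClientEvent {
    pub fn type_name(&self) -> &str {
        match self {
            ClientEvent::ResponseCancel => "response.cancel",
            ClientEvent::ItemDelete { .. } => "conversation.item.delete",
            ClientEvent::SessionUpdate { .. } => "session.update",
            ClientEvent::ResponseCreate => "response.create",
            ClientEvent::Other(name) => name.as_str(),
        }
    }
}

/// Builds the ordered reset plan: cancel (only if a response is in
/// flight), one delete per tracked id, then the baseline session.update.
/// Queued response.create events are dropped from `pending` because the
/// conversation they would have continued no longer exists.
pub fn plan_reset(
    response_in_flight: bool,
    tracked_ids: &[String],
    baseline_session_json: &str,
    pending: &mut Vec<ClientEvent>,
) -> Vec<ClientEvent> {
    pending.retain(|ev| *ev != ClientEvent::ResponseCreate);

    let mut plan = Vec::with_capacity(tracked_ids.len() + 2);
    if response_in_flight {
        plan.push(ClientEvent::ResponseCancel);
    }
    plan.extend(
        tracked_ids
            .iter()
            .cloned()
            .map(|item_id| ClientEvent::ItemDelete { item_id }),
    );
    plan.push(ClientEvent::SessionUpdate {
        session_json: baseline_session_json.to_owned(),
    });
    plan
}
```

Keeping the plan separate from the websocket write makes the cancel, deletes, session.update ordering easy to assert in a unit test without a live connection.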

2. Boot-time sub-agent discovery

When a new gamechat --realtime starts, it now scans runtime_dir() for sockets owned by other live gamechats and asks each peer for its active slug list (750ms per-peer timeout). Discovered slugs are stamped into the local ProgressStore under peer_<pid>_<slug> with a discovered:<pid> provider tag so they are visible in gamechat inspect but unambiguously not locally owned.
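
The naming scheme for discovered slugs can be illustrated with a small sketch. A plain `BTreeMap` stands in for `ProgressStore` here, and `peer_key`, `provider_tag`, and `seed_peer_slugs` are hypothetical helpers, not functions from this PR.

```rust
use std::collections::BTreeMap;

/// Store key for a slug reported by a peer: `peer_<pid>_<slug>`.
pub fn peer_key(pid: u32, slug: &str) -> String {
    format!("peer_{pid}_{slug}")
}

/// Provider tag marking an entry as discovered, not locally owned.
pub fn provider_tag(pid: u32) -> String {
    format!("discovered:{pid}")
}

/// Stamps every slug from one peer into the store (key -> provider tag)
/// and returns how many entries were newly added.
pub fn seed_peer_slugs(
    store: &mut BTreeMap<String, String>,
    pid: u32,
    slugs: &[&str],
) -> usize {
    let tag = provider_tag(pid);
    let mut added = 0;
    for slug in slugs {
        if store.insert(peer_key(pid, slug), tag.clone()).is_none() {
            added += 1;
        }
    }
    added
}
```

Because the pid is part of the key, two peers reporting the same slug cannot collide with each other in the local store.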

Standalone CLI: gamechat discover walks the runtime dir and prints every peer's slugs in one table — useful when triaging "who's running the background work I see in claude?" across multiple terminals.

Enabled by default; flip settings.discover_existing_subagents: false to opt out.

Drive-by

  • BASE_INSTRUCTIONS now tells the voice model when to call reset_voice_context and not to announce it.
  • Removed a duplicate spawn_server call in run_realtime_voice (the second bind was always failing on EADDRINUSE).

Risks

  • conversation.item.delete race. The server may have already produced audio for items we're deleting. The local playback buffer is intentionally untouched, so the cancelled turn's audio finishes; the next turn starts from a clean conversation. If OpenAI later changes conversation.item.delete semantics for items currently being voiced, behaviour could regress; reset_trigger_string_repr_is_stable and the e2e reset event order test will surface that on the next API smoke test.
  • Item tracker capacity (4000) is heuristic. Long sessions that go past it before a reset will fail to delete the oldest items, leaving them in server context. Acceptable: auto-reset is clamped to at most 2000 items, so whenever it is enabled the tracker stays well inside the cap.
  • Peer discovery does NOT authenticate sockets. Any process that can bind a *.sock under $XDG_RUNTIME_DIR/gamechat-$USER/ can answer the List probe. Same trust boundary as the existing inspect/tail clients — single-user runtime dir.
  • Leaked provider tags. seed_discovered_subagents uses Box::leak to satisfy register_job's &'static str requirement. Bounded by the number of distinct peer pids ever seen during the binary's lifetime — at most a handful.
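
One way to keep the leak bounded by distinct peer pids is to intern the tag per pid, as in the sketch below. `ProviderTags` is a hypothetical type, and whether `seed_discovered_subagents` caches like this is an assumption, not something the PR states.

```rust
use std::collections::HashMap;

/// Hypothetical interner: leaks at most one tag string per distinct pid.
#[derive(Default)]
pub struct ProviderTags {
    by_pid: HashMap<u32, &'static str>,
}

impl ProviderTags {
    /// Returns a `&'static str` tag for `pid`, leaking a new allocation
    /// only the first time that pid is seen.
    pub fn tag_for(&mut self, pid: u32) -> &'static str {
        *self.by_pid.entry(pid).or_insert_with(|| {
            // Box::leak gives up ownership for the rest of the process
            // lifetime, which is what makes the reference 'static.
            let leaked: &'static str =
                Box::leak(format!("discovered:{pid}").into_boxed_str());
            leaked
        })
    }

    /// Number of strings leaked so far.
    pub fn leaked_count(&self) -> usize {
        self.by_pid.len()
    }
}
```

Repeated lookups for the same pid return the same pointer, so the total leaked memory grows with the number of peers rather than the number of discovered slugs.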

Usage

# from a second terminal
gamechat reset                       # silent reset, conversation cleared
gamechat reset --reason ux_request   # logs the reason on the server
gamechat discover                    # table of every peer's active slugs
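
Argument handling for `gamechat reset [--reason TEXT]` could be sketched as follows. `parse_reset_args` is a hypothetical helper; the positional-argument message mirrors the one quoted in the review thread on src/main.rs, while the missing-value message is an assumption.

```rust
/// Parses the arguments that follow `gamechat reset`.
/// Returns the optional reason, or an error message for bad input.
pub fn parse_reset_args(args: &[&str]) -> Result<Option<String>, String> {
    let mut reason = None;
    let mut positional: Vec<&str> = Vec::new();

    let mut iter = args.iter().copied();
    while let Some(arg) = iter.next() {
        if arg == "--reason" {
            match iter.next() {
                Some(text) => reason = Some(text.to_owned()),
                // Assumed wording; the real message is not shown in the PR.
                None => return Err("--reason requires a value".to_owned()),
            }
        } else {
            positional.push(arg);
        }
    }

    if !positional.is_empty() {
        return Err(format!(
            "reset takes no positional arguments, got: {}. Use --reason <text> if you want to record one.",
            positional.join(" ")
        ));
    }
    Ok(reason)
}
```

With this shape, `gamechat reset` yields `Ok(None)` and `gamechat reset --reason ux_request` yields `Ok(Some("ux_request"))`.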

Optional ~/.config/gamechat/settings.json:

{
  "auto_reset_after_items": 400,
  "discover_existing_subagents": true
}

The voice model can also self-reset: it will call reset_voice_context (with reason="context_overload" or reason="user_requested") per the updated BASE_INSTRUCTIONS.

Test plan

  • `cargo build` clean, no warnings
  • `cargo test` — 71 tests pass (30 new), including:
    • reset event ordering (cancel → deletes → session.update)
    • ConversationItemTracker dedup + FIFO eviction at capacity
    • auto-reset threshold clamping (below min, above max, zero disables)
    • reset_voice_context tool definition + call handler (both event shapes)
    • control-socket Reset handler (dispatched + channel-closed cases)
    • discovery seed (multi-entry, empty input, missing last_message)
    • peer-query timeout when socket has no listener
  • Manual: run gamechat --realtime, talk for a while, run gamechat reset from another terminal, verify voice keeps speaking but next turn doesn't reference earlier context
  • Manual: start two gamechat --realtime processes; run gamechat discover and verify both peers' slugs appear
  • Manual: set auto_reset_after_items: 20 and confirm the auto-reset log line fires once the threshold is crossed

🤖 Generated with Claude Code

Two related additions, sharing the control socket and progress store
already plumbed by #4.

== voice context reset ==

Long realtime sessions accumulate enough conversation history that the
voice model starts drifting and (occasionally) the server rejects
further turns. Three new ways to wipe the conversation back to a safe
baseline without disturbing audio or the websocket itself:

  - voice tool `reset_voice_context(reason?)` — the realtime model can
    self-reset when it senses overload or the user asks for a fresh
    start. Acked with a function_call_output so the model keeps
    speaking naturally afterward.
  - control socket `Request::Reset { reason }` exposed as
    `gamechat reset [--reason TEXT]` from a second terminal.
  - auto-reset once tracked items exceed
    `settings.auto_reset_after_items` (clamped to [20, 2000]; omit or
    set to 0 to disable). Disabled by default.

Mechanics (see `src/voice_loop/reset.rs`):

  1. `response.cancel` if a response is in flight, so the server stops
     emitting audio for a turn whose item id we're about to delete.
  2. `conversation.item.delete` for every id we've observed via
     `conversation.item.created` (tracked in `ConversationItemTracker`,
     capped at 4000 entries with FIFO eviction).
  3. Re-send the original `session.update` to re-baseline instructions,
     voice, tools, and turn detection.

The local playback buffer and mic input are deliberately left alone, so
audio already handed to cpal finishes playing while the next turn comes
back clean. Any queued `response.create` events that pre-dated the
reset are dropped — the conversation they would have continued no
longer exists.

== boot-time sub-agent discovery ==

When a new `gamechat --realtime` starts, it now scans `runtime_dir()`
for sockets owned by other live gamechats and asks each peer for its
active slug list (with a 750ms per-peer timeout). Discovered slugs are
stamped into the local `ProgressStore` under `peer_<pid>_<slug>` with a
`discovered:<pid>` provider tag so they are visible in `gamechat
inspect` but unambiguously not locally owned.

Standalone CLI: `gamechat discover` walks the runtime dir and prints
every peer's slugs in one table — useful when triaging "who's running
the background work I see in claude?" across multiple terminals.

Discovery is enabled by default; flip
`settings.discover_existing_subagents: false` to opt out.

== other ==

  - `BASE_INSTRUCTIONS` updated to tell the voice model when to call
    `reset_voice_context` and not to announce it.
  - duplicate `spawn_server` call in `run_realtime_voice` removed (was
    binding the control socket twice, second bind always failed).
  - 30 new tests covering the reset event sequence, item tracker
    bounding/dedup, auto-reset threshold clamping, discovery seeding,
    peer-query timeout, the new `reset_voice_context` tool definition,
    and the control-socket Reset handler (both happy path and
    voice-loop-channel-closed). Total: 71 tests, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/main.rs
return Err(format!(
"reset takes no positional arguments, got: {}. Use --reason <text> if you want to record one.",
positional.join(" ")
));
Owner Author

Doesn't make any sense, gamechat can't call itself?
