Hermes Runner: data race on scanner + r.running across Prompt/Start #20

@rafeegnash

Problem

The Hermes Runner's Prompt goroutine calls r.scanner.Scan() without holding r.mu, and Start uses the same scanner during the init handshake. Two overlapping Prompt calls can therefore race on the scanner: each call matches responses by ID, but only after consuming lines that may belong to the other call, so a response can be delivered to the wrong consumer. r.running is also read and written without any synchronization.

go test -race would flag this, but the race detector is not currently run on this package.

Where

internal/hermes/runner.go:268 - Prompt reading r.scanner.Scan()
internal/hermes/runner.go:115-181 - Start handshake on same scanner
internal/hermes/runner.go - r.running writes/reads without atomic or mutex

Fix

  1. Move scanner reads into a single goroutine started in Start (a dispatch loop)
  2. Maintain a map[int64]chan response under r.mu keyed by request ID
  3. Each Prompt allocates an ID, registers a channel, sends the request, and reads its own channel
  4. Convert r.running to atomic.Bool

Acceptance criteria

  • go test -race ./internal/hermes/... clean
  • New test: two concurrent Prompt calls on the same Runner each get their own response, no cross-talk
  • Bridge process death is observed by the dispatch loop and propagated to all waiting Prompt channels via context cancel

Labels

bug, priority: critical (use defaults available in clanker-cli)
