Description
Bug report
If the provider connection closes after a streamed tool call has been delivered but before a clean message completion, Crush retries the provider interaction instead of treating the turn as failed.
Before/after behavior
Before: the local mock provider streams a Bash tool call and then closes the SSE connection mid-stream.
After: Crush makes repeated provider requests against the same partial turn before exiting non-zero. In the latest local verification run, the reproducer observed 9 mid-stream connection closures and Crush exited with code 1.
Expected: Crush should fail the turn after a bounded number of attempts and must not replay the same partial streamed tool call as a fresh action.
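To make the expected behavior concrete, here is a minimal sketch of the retry decision described above. It is not Crush's implementation (Crush is not written in Python). The names `TurnState`, `should_retry`, and the default `max_attempts=3` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnState:
    """Hypothetical bookkeeping for one provider turn."""
    attempts: int = 0                  # provider requests already made for this turn
    tool_call_delivered: bool = False  # True once any streamed tool-call delta arrived


def should_retry(state: TurnState, max_attempts: int = 3) -> bool:
    """Decide whether a transport error on this turn may be retried.

    Assumption: once a tool call has been streamed to the client, a retry
    could replay it as a fresh action, so the turn is failed instead.
    """
    if state.tool_call_delivered:
        return False
    # Otherwise retry, but only up to a fixed bound.
    return state.attempts < max_attempts
```

Under this sketch, a disconnect before any tool-call delta is retried up to the bound, while a disconnect after a tool-call delta fails the turn immediately.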
Minimal reproducible example
Prerequisites: Docker, Python 3, and the GitHub CLI (gh) for the clone command below. The linked reproducer is self-contained and uses only Python standard-library modules plus Docker. It builds @charmland/crush@0.76.0 from the public npm package, starts a local mock provider, and runs the CLI in an isolated workspace. The Docker run is limited to 2 CPUs and 4 GiB RAM by default.
Complete self-contained reproducer: https://gist.github.com/N0zoM1z0/f6763bf4b628ed5beea9080e81c7b39d
gh gist clone f6763bf4b628ed5beea9080e81c7b39d crush-post-tool-transient-disconnect-reproducer
cd crush-post-tool-transient-disconnect-reproducer
python3 crush-post-tool-transient-disconnect.reproduce.py
To reuse an already-built local image:
python3 crush-post-tool-transient-disconnect.reproduce.py --skip-build
Expected successful reproduction output includes:
provider_requests=9
closed_mid_stream=9
process_exit=1
REPRODUCED
The target stderr includes:
Agent processing failed: failed to start agent processing stream: retry error: stream transport error: unexpected EOF
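For readers who do not want to open the gist, the mock-provider behavior described above can be sketched roughly as follows. This is an illustrative approximation, not the code from the linked reproducer. The handler name `TruncatingHandler`, the helper names, the tool name `bash`, the call id, and the command are all assumptions. The sketch sends one OpenAI-style streamed tool-call delta as a chunked SSE response and then drops the connection without the terminating chunk or a `[DONE]` event.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def tool_call_frame() -> bytes:
    """Build one SSE frame carrying an OpenAI-style streamed tool-call delta."""
    delta = {
        "choices": [{
            "index": 0,
            "delta": {"tool_calls": [{
                "index": 0,
                "id": "call_1",  # hypothetical id
                "type": "function",
                # Tool name and arguments are assumptions for illustration.
                "function": {"name": "bash", "arguments": "{\"command\": \"true\"}"},
            }]},
            "finish_reason": None,
        }]
    }
    return b"data: " + json.dumps(delta).encode() + b"\n\n"


class TruncatingHandler(BaseHTTPRequestHandler):
    """Streams a single tool-call delta, then closes the socket mid-stream."""

    protocol_version = "HTTP/1.1"
    requests_seen = 0  # plays the role of provider_requests in the output above

    def do_POST(self):
        TruncatingHandler.requests_seen += 1
        # Drain the request body so the client finishes sending.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        frame = tool_call_frame()
        # Write one HTTP chunk, then deliberately omit the final "0\r\n\r\n".
        self.wfile.write(b"%x\r\n%s\r\n" % (len(frame), frame))
        self.wfile.flush()
        self.close_connection = True  # server closes the socket after returning

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), TruncatingHandler)
```

Calling `make_server().serve_forever()` and pointing an OpenAI Chat-compatible client at the bound port should yield a truncated stream on every request, which is the condition this report is about.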
Version
0.76.0
Environment
OS/arch: Linux x86_64 Docker, using a node:24-bookworm-based target image; Interface: CLI; Provider/model: OpenAI Chat-compatible local mock provider, gpt-4