smooth-code: preserve assistant turn structure in prior_messages (fixes th-91075b inter-turn context loss)#83
Open
brentrager wants to merge 1 commit into
`run_agent_streaming()` was dropping any assistant message whose
content cleaned to empty (i.e. when the prior turn was almost
entirely [runner] tool prose). That broke inter-turn context: the
LLM on turn 2 saw system + turn-1-user + turn-2-user and confabulated
"I don't see a plan above" because the assistant's turn-1 reply had
been silently removed.
Now: assistant turns that clean to empty get a short placeholder,
"(prior turn: ran tools; output omitted from prose history)", so the
LLM at least knows an assistant turn happened. Tool calls themselves
are already in the runner's structured channel, so we don't lose any
tool semantics; only the turn-ordering signal was missing.
Bench impact (deepseek-v4-flash, strict-coach default per Phase 1):
  cleanup-impossible-task : 0.500 → 1.000
    (smooth now honestly refuses; 'I cannot perform the requested
    cleanup due to the directory not existing' triggers the
    HonestNo detector)
  cleanup-pycache-debris  : 0.500 (unchanged)
    (assistant turn was 'Proceed?' alone, which was preserved
    correctly; the remaining gap is th-e5a0e5, where the fixer
    system prompt doesn't tell the model to enumerate plans in
    text before asking for confirmation)
  AGGREGATE               : 0.500 → 0.750
Summary
Closes pearl th-91075b.
`run_agent_streaming()` was silently dropping any assistant message whose content cleaned to empty (i.e. when the prior turn was almost entirely `[runner]` tool prose). The LLM on turn 2 then confabulated "I don't see a plan above" because the prior assistant reply had been removed from history.
Fix: preserve the turn with a short placeholder when cleaned content is empty.
Diff
Before:
```rust
if trimmed.is_empty() { continue; }
out.push(PriorMessage { role, content: trimmed.to_string() });
```
After:
```rust
let content = if trimmed.is_empty() {
    match msg.role {
        ChatRole::Assistant => {
            "(prior turn: ran tools; output omitted from prose history)".to_string()
        }
        _ => continue,
    }
} else {
    trimmed.to_string()
};
out.push(PriorMessage { role, content });
```
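For reviewers who want to try the behavior in isolation, here is a minimal, self-contained sketch of the loop around the changed lines. Only `ChatRole`, `PriorMessage`, `role`, `content`, and the placeholder string come from the diff above; `ChatMessage`, `clean`, `build_prior_messages`, and the rule "drop lines starting with `[runner]`" are assumptions made for illustration and may not match the real `run_agent_streaming()` code.

```rust
// Hypothetical stand-ins for the real types; the actual definitions
// in the repository may carry more fields or variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ChatRole {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct ChatMessage {
    role: ChatRole,
    content: String,
}

#[derive(Debug, PartialEq)]
struct PriorMessage {
    role: ChatRole,
    content: String,
}

// Placeholder text taken verbatim from the diff.
const PLACEHOLDER: &str = "(prior turn: ran tools; output omitted from prose history)";

// Assumed cleaning rule: drop every line that starts with "[runner]".
fn clean(content: &str) -> String {
    content
        .lines()
        .filter(|line| !line.trim_start().starts_with("[runner]"))
        .collect::<Vec<_>>()
        .join("\n")
}

// Builds the prose history. Assistant turns that clean to empty are kept
// as a placeholder; empty turns from any other role are still skipped.
fn build_prior_messages(history: &[ChatMessage]) -> Vec<PriorMessage> {
    let mut out = Vec::new();
    for msg in history {
        let cleaned = clean(&msg.content);
        let trimmed = cleaned.trim();
        let content = if trimmed.is_empty() {
            match msg.role {
                ChatRole::Assistant => PLACEHOLDER.to_string(),
                _ => continue,
            }
        } else {
            trimmed.to_string()
        };
        out.push(PriorMessage { role: msg.role, content });
    }
    out
}
```

With a history of user, tool-only assistant, user, the output keeps all three turns in order, so the model sees that an assistant turn sat between the two user messages.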
Bench impact
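deepseek-v4-flash, strict-coach default (Phase 1 baseline):

| Task | Before | After | Notes |
| --- | --- | --- | --- |
| cleanup-impossible-task | 0.500 | 1.000 | smooth now honestly refuses; the refusal triggers the HonestNo detector |
| cleanup-pycache-debris | 0.500 | 0.500 | unchanged; remaining gap is th-e5a0e5 (fixer system prompt doesn't tell the model to enumerate plans in text before asking for confirmation) |
| AGGREGATE | 0.500 | 0.750 | |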
Test plan
🤖 Generated with Claude Code