ARIA: suppress '[image — caption failed]' stub from active threads #344
Open
holoduke wants to merge 1 commit into
…active threads
Stop polluting working-memory active-thread surface with literal
"[image — caption failed]" text on uncaptioned WhatsApp images.
Per consolidated insight n_cba1a7f0, the stub:
1. Falsely suggested unread conversation content on every reflect/think tick,
2. Resembled the prompt-injection refusal captions (n_imginject1, n_imginj02),
making real injection events harder to spot in the haystack,
3. Survived the Apr 27 saveConversationDigest fix and reappeared Apr 30 / May 9.
Changes:
- backend/integrations/whatsapp.ts: emit a clean "[image]" marker (no error
preamble) when describeImage returns null/refusal.
- backend/observer.ts: sanitizeImageCaption() now strips legacy stubs and
emits "[image]" instead. Added assertNoCaptionFailedStub() runtime
assertion in recordObservation() — strips and logs any observation whose
text matches /caption failed/i so future regressions are visible.
- backend/memory/working-memory.ts: updateConversationThreads() skips bare
media markers ("[image]" / "[voice]" / "[document]") for new threads, never
overwrites an informative topic with a bare marker, and rewrites any
persisted "caption failed" topic from prior runs to "(image)".
Verified with: npx tsc --noEmit (passes).
Intent-summary: Image-only WhatsApp messages whose vision-LLM caption fails were emitting a literal "[image — caption failed]" stub that polluted active-thread topics and resembled prompt-injection text.
Intent-tokens: image, caption, stub, pollution, threads, vision, marker
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
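The observer.ts change described in the commit message could look roughly like the sketch below. It is a minimal illustration, not the code in this PR: the function names come from the commit message, but the `Observation` shape, the `source` field, the `log` parameter, and the exact stub pattern are assumptions.

```typescript
// Hypothetical sketch: names mirror the commit message, bodies are assumed.
const CAPTION_FAILED = /caption failed/i;
// Assumed shape of the legacy stub: a bracketed "[image ... caption failed ...]".
const LEGACY_STUB = /\[image[^\]]*caption failed[^\]]*\]/gi;
const CLEAN_IMAGE_MARKER = "[image]";

interface Observation {
  source: string; // hypothetical: which writer produced the observation
  text: string;
}

/** Replace legacy stubs with the clean marker; leave other text untouched. */
function sanitizeImageCaption(text: string): string {
  return text.replace(LEGACY_STUB, CLEAN_IMAGE_MARKER);
}

/** Runtime guard: strip the stub and log the writer so regressions are traceable. */
function assertNoCaptionFailedStub(
  obs: Observation,
  log: (msg: string) => void = console.warn,
): Observation {
  if (!CAPTION_FAILED.test(obs.text)) return obs;
  log(`caption-failed stub written by ${obs.source}; replaced with ${CLEAN_IMAGE_MARKER}`);
  const cleaned = sanitizeImageCaption(obs.text);
  // If the phrase survives outside a bracketed stub, fall back to the bare marker.
  return { ...obs, text: CAPTION_FAILED.test(cleaned) ? CLEAN_IMAGE_MARKER : cleaned };
}
```

In this sketch the guard repairs and logs rather than throwing, which matches the "strips and logs" behaviour the commit message describes for recordObservation().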
Summary
Stops the literal "[image — caption failed]" text from polluting working-memory active-thread topics and from resembling prompt-injection captions in the observation stream.

Per consolidated insight n_cba1a7f0, the original Apr 27 fix did not hold: +9 stubs landed Apr 30 and a fresh one appeared May 9 (~11:16 CEST) on an Ilse image. This PR addresses the recurrence at three layers.

Changes
- backend/integrations/whatsapp.ts: when describeImage() returns null/refusal for an uncaptioned image, emit a clean "[image]" marker instead of "[image — caption failed]". The clean marker keeps the structured prefix that downstream consumers already recognise but drops the misleading error text.
- backend/observer.ts: sanitizeImageCaption() now strips any legacy stub text (including replayed observations from observations.jsonl) and emits "[image]". The assertNoCaptionFailedStub() runtime assertion runs inside recordObservation(). If any observation's text ever matches /caption failed/i again, it is stripped to a clean image marker and the writer is logged so future regressions are immediately traceable.
- backend/memory/working-memory.ts: updateConversationThreads():
  - Skips bare media markers ("[image]", "[voice]", "[document]") when creating new threads, so uncaptioned images from known contacts no longer pollute the active-thread surface.
  - Rewrites any persisted "caption failed" topic from prior runs to "(image)", so the active-thread surface stops surfacing the stub on every reflect/think tick.

Why option (a) over (b)
The task offered two paths: (a) suppress the stub, (b) replace with a structured marker. The implementation effectively does both: the bare "[image]" is the structured marker, and skipping bare-marker observations from thread building is the suppression. Captioned images ("[image] <description>") remain valid topics.

Test plan
- npx tsc --noEmit passes
- … "[image]" (no stub); active-thread surface does not gain a "[image — caption failed]" topic
- … working-memory.json rewrite to "(image)" on next think tick
- … observer.ts is still sanitized (defense in depth retained)
- Captioned images ("[image] description") continue to drive thread topics normally

🤖 Generated with Claude Code
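For reviewers, the thread-building rules in working-memory.ts can be pictured with the sketch below. It is illustrative only and assumes a simple contact-to-topic map; the real thread structure, the function signature, and the names `Threads`, `BARE_MARKERS`, and `STUB_TOPIC` are hypothetical and do not come from the PR.

```typescript
// Hypothetical shapes: the real working-memory.ts types are not shown in this PR.
type Threads = Map<string, string>; // contact -> current topic

const BARE_MARKERS = new Set(["[image]", "[voice]", "[document]"]);
const STUB_TOPIC = /caption failed/i;

function updateConversationThreads(
  threads: Threads,
  contact: string,
  text: string,
): Threads {
  const next: Threads = new Map(threads);

  // Rewrite stub topics persisted by prior runs.
  for (const [who, topic] of next) {
    if (STUB_TOPIC.test(topic)) next.set(who, "(image)");
  }

  const incoming = text.trim();
  // A bare media marker neither opens a new thread nor replaces an existing topic.
  if (BARE_MARKERS.has(incoming)) return next;

  // Informative text, including captioned images, drives the topic as usual.
  next.set(contact, incoming);
  return next;
}
```

The sketch is stricter than the description in one respect: it never lets a bare marker replace any existing topic, informative or not. That simplification is an assumption made to keep the example short.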