
ARIA: suppress '[image — caption failed]' stub from active threads #344

Open
holoduke wants to merge 1 commit into main from aria/suppress-image-caption-failed-stub

Conversation

holoduke commented May 9, 2026

Owner

Summary

Stops the literal [image — caption failed] text from polluting working-memory active-thread topics and from resembling prompt-injection captions in the observation stream.

Per consolidated insight n_cba1a7f0, the original Apr 27 fix did not hold: +9 stubs landed Apr 30 and a fresh one appeared May 9 (~11:16 CEST) on an Ilse image. This PR addresses the recurrence at three layers.

Changes

  • backend/integrations/whatsapp.ts — when describeImage() returns null/refusal for an uncaptioned image, emit a clean [image] marker instead of [image — caption failed]. The clean marker keeps the structured prefix that downstream consumers already recognise but drops the misleading error text.
  • backend/observer.ts
    • sanitizeImageCaption() now strips any legacy stub text (including replayed observations from observations.jsonl) and emits [image].
    • New assertNoCaptionFailedStub() runtime assertion runs inside recordObservation(). If any observation's text ever matches /caption failed/i again, it is stripped to a clean image marker and the writer is logged so future regressions are immediately traceable.
  • backend/memory/working-memory.ts (updateConversationThreads()):
    • Skips bare media markers ([image], [voice], [document]) when creating new threads — uncaptioned images from known contacts no longer pollute the active-thread surface.
    • Never overwrites an informative topic with a bare marker.
    • Cleans any persisted caption failed topic from prior runs (rewrites to (image)) so the active-thread surface stops surfacing the stub on every reflect/think tick.
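
For reviewers, the first two layers can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names sanitizeImageCaption() and assertNoCaptionFailedStub() come from the description above, but their signatures and bodies here are assumed, and imageObservationText() and looksLikeRefusal() are hypothetical helpers invented for the example.

```typescript
// Hypothetical sketch: signatures and bodies are assumptions, not the PR's code.
const CLEAN_IMAGE_MARKER = "[image]";
const CAPTION_FAILED = /caption failed/i;

// Assumed stand-in for whatever refusal detection the real integration uses.
function looksLikeRefusal(caption: string): boolean {
  return /^(i can't|i cannot|sorry)/i.test(caption.trim());
}

// Integration layer: build observation text from a (possibly missing) caption.
function imageObservationText(caption: string | null): string {
  if (caption === null || caption.trim() === "" || looksLikeRefusal(caption)) {
    return CLEAN_IMAGE_MARKER; // bare marker, no error text
  }
  return `${CLEAN_IMAGE_MARKER} ${caption.trim()}`;
}

// Observer layer: rewrite the legacy stub (em dash, en dash, or hyphen variant).
function sanitizeImageCaption(text: string): string {
  return text.replace(/\[image\s*[—–-]\s*caption failed\]/gi, CLEAN_IMAGE_MARKER);
}

// Runtime guard: if stub text still slips through, log the writer and strip it.
function assertNoCaptionFailedStub(
  text: string,
  writer: string,
  log: (msg: string) => void = console.warn,
): string {
  if (!CAPTION_FAILED.test(text)) return text;
  log(`caption-failed stub written by ${writer}; stripped to ${CLEAN_IMAGE_MARKER}`);
  return CLEAN_IMAGE_MARKER;
}
```

The guard returns the cleaned text rather than throwing, which matches the "strip and log" behaviour described above; whether the real assertion does anything more is not shown here.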

Why option (a) over (b)

The task offered two paths: (a) suppress the stub, (b) replace with a structured marker. The implementation effectively does both — the bare [image] is the structured marker, and skipping bare-marker observations from thread building is the suppression. Captioned images ([image] <description>) remain valid topics.
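
The bare-marker versus captioned-image distinction can be illustrated with a small sketch of the thread-update rules. The ThreadTopics shape (contact mapped to topic) and the function name updateConversationThreadsSketch() are assumptions made for this example; the real updateConversationThreads() may store threads differently, and may treat a bare marker differently when the existing topic is itself uninformative.

```typescript
// Hypothetical sketch of the three thread rules; the data shape is assumed.
type ThreadTopics = Record<string, string>; // contact -> current topic

const BARE_MARKER = /^\[(image|voice|document)\]$/;
const CAPTION_FAILED = /caption failed/i;
const CLEANED_TOPIC = "(image)";

function isBareMarker(text: string): boolean {
  return BARE_MARKER.test(text.trim());
}

function updateConversationThreadsSketch(
  topics: ThreadTopics,
  contact: string,
  text: string,
): ThreadTopics {
  const next: ThreadTopics = {};

  // Rewrite any persisted stub topic from prior runs to "(image)".
  Object.keys(topics).forEach((c) => {
    const topic = topics[c] ?? "";
    next[c] = CAPTION_FAILED.test(topic) ? CLEANED_TOPIC : topic;
  });

  // A bare marker neither creates a thread nor overwrites an existing topic.
  // (Assumption: treating it as a no-op covers both rules.)
  if (isBareMarker(text)) return next;

  // Anything else, including "[image] <description>", is a valid topic.
  next[contact] = text;
  return next;
}
```

For example, calling it with an empty map and the text `[image]` leaves the map empty, while `[image] a cat on a sofa` creates a thread with that topic.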

Test plan

  • npx tsc --noEmit passes
  • Next incoming uncaptioned image from any contact: observation text is [image] (no stub); active-thread surface does not gain a "[image — caption failed]" topic
  • Existing polluted threads in working-memory.json rewrite to (image) on next think tick
  • Vision-LLM refusal text that reaches observer.ts is still sanitized (defense in depth retained)
  • Captioned images ([image] description) continue to drive thread topics normally
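
A quick way to check the first two items by hand is to scan recorded observations for leftover stub text. The helper below is a hypothetical sketch: it assumes each line of observations.jsonl is a JSON object with a string `text` field, which this PR does not confirm, and it falls back to matching the raw line when that assumption does not hold.

```typescript
// Hypothetical checker; the JSONL record shape ({ text: string }) is an assumption.
function findCaptionFailedStubs(jsonl: string): number[] {
  const offending: number[] = [];
  jsonl.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // skip blank lines
    let text = line; // fall back to the raw line if parsing fails
    try {
      const parsed = JSON.parse(line);
      if (parsed && typeof parsed.text === "string") text = parsed.text;
    } catch {
      // Not valid JSON: keep matching against the raw line.
    }
    if (/caption failed/i.test(text)) offending.push(index + 1); // 1-based line numbers
  });
  return offending;
}
```

An empty result after the next uncaptioned image arrives would be consistent with the fix holding; a non-empty result gives the line numbers to inspect.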

🤖 Generated with Claude Code

…active threads

Stop polluting working-memory active-thread surface with literal
"[image — caption failed]" text on uncaptioned WhatsApp images.

Per consolidated insight n_cba1a7f0, the stub:
1. Falsely suggested unread conversation content on every reflect/think tick,
2. Resembled the prompt-injection refusal captions (n_imginject1, n_imginj02),
   making real injection events harder to spot in the haystack,
3. Survived the Apr 27 saveConversationDigest fix and reappeared Apr 30 / May 9.

Changes:
- backend/integrations/whatsapp.ts: emit a clean "[image]" marker (no error
  preamble) when describeImage returns null/refusal.
- backend/observer.ts: sanitizeImageCaption() now strips legacy stubs and
  emits "[image]" instead. Added assertNoCaptionFailedStub() runtime
  assertion in recordObservation() — strips and logs any observation whose
  text matches /caption failed/i so future regressions are visible.
- backend/memory/working-memory.ts: updateConversationThreads() skips bare
  media markers ("[image]" / "[voice]" / "[document]") for new threads, never
  overwrites an informative topic with a bare marker, and rewrites any
  persisted "caption failed" topic from prior runs to "(image)".

Verified with: npx tsc --noEmit (passes).

Intent-summary: Image-only WhatsApp messages whose vision-LLM caption fails were emitting a literal "[image — caption failed]" stub that polluted active-thread topics and resembled prompt-injection text.
Intent-tokens: image, caption, stub, pollution, threads, vision, marker

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>