Skip to content

feat: fix memory conflict detection and persist tool inputs/outputs#91

Merged
cyyeh merged 1 commit into
mainfrom
feature/memory-and-tool-persistence
Mar 3, 2026
Merged

feat: fix memory conflict detection and persist tool inputs/outputs#91
cyyeh merged 1 commit into
mainfrom
feature/memory-and-tool-persistence

Conversation

@cyyeh
Copy link
Copy Markdown
Owner

@cyyeh cyyeh commented Mar 3, 2026

Summary

  • Add recall-before-save instructions to agent system prompt to prevent contradictory memories (e.g. "User likes X" and "User does not like X" coexisting)
  • Fix backend tool result parsing so non-SQL tool outputs (memory tools, etc.) are sent to the frontend instead of being silently dropped
  • Persist toolInput and output fields in SQLite metadata so tool call details survive conversation switches
  • Add vega-themes Power BI theme for Vega-Lite charts (light and dark mode)
  • Expand sidecar allowedTools with Bash, Read, Write, Edit, Glob, Grep

Test plan

  • Save a memory, then save a contradictory one — verify the agent recalls first, forgets the old, then saves the new
  • Check thinking block shows tool inputs and outputs for recall_memories, save_memory, forget_memory
  • Switch to another conversation and back — verify tool inputs/outputs are still visible
  • Verify Vega-Lite charts render with Power BI theme in both light and dark mode

🤖 Generated with Claude Code

- Add recall-before-save instruction to prevent contradictory memories
- Fix tool result parsing to preserve non-SQL output (memory tools, etc.)
- Persist toolInput and output fields in SQLite for conversation reload
- Add vega-themes Power BI theme for Vega-Lite charts
- Expand sidecar allowedTools with Bash, Read, Write, Edit, Glob, Grep

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 48fcc0da71d7
@cyyeh cyyeh merged commit 1c2df0e into main Mar 3, 2026
2 checks passed
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Mar 3, 2026

Claude finished @cyyeh's task in 2m 48s —— View job


Code Review for PR #91

  • Gather context and understand the request
  • Review backend/app/agent.py changes
  • Review frontend/src/utils/exportConversation.ts changes
  • Review frontend/src/components/VegaLiteChartWidget.tsx changes
  • Review sidecar/src/server.ts changes
  • Review frontend/package.json and package-lock.json changes
  • Compile and post comprehensive review

Summary

This PR delivers three independent improvements: memory conflict prevention, tool I/O persistence, and a Vega-Lite theme upgrade. The changes are well-scoped and address real UX gaps. A few issues are worth addressing in follow-up work.


✅ What's Good

Memory conflict detection (backend/app/agent.py:164-170)
The recall-before-save instruction sequence is the right pattern. The natural language instructions are explicit and actionable, and the "never leave contradictory memories" constraint is clearly stated.

Non-SQL tool result fix (backend/app/agent.py:714-722)
The new else branch correctly identifies non-SQL success results (memory tools, etc.) and extracts a meaningful output string rather than silently dropping them. The layered fallback parsed.get("memories") or parsed.get("message") or text is pragmatic.

VegaLiteChartWidget.tsx cleanup
The cancelled flag pattern prevents the viewRef from being set on an already-unmounted component. The finalize() call on cleanup is correct and prevents Vega's internal event listeners from leaking. The refactor from an ad-hoc theme object to powerbi from vega-themes is cleaner and more maintainable.

toolInput persistence (agent.py:561-563)

if not sql and tool_input:
    tool_seg["toolInput"] = tool_input

Correctly captures inputs for non-SQL tool calls at segment creation time, making them survive conversation switches.


⚠️ Issues to Address

1. Redundant toolInput assignment for ask_user_question (minor)
agent.py:581-584 sets seg["toolInput"] = tool_input again for ask_user_question, but this was already set at line 562 when the segment was created. This is a no-op now and should be removed to avoid confusion.

Fix this →

2. Brittle field-name hard-coding for non-SQL tool outputs
agent.py:716:

result_data["output"] = parsed.get("memories") or parsed.get("message") or text

This works for current MCP tool responses, but any new non-SQL tool that returns neither memories nor message will silently fall back to the raw JSON string in text. A more defensive approach would be to fall back to text directly (the last or text already does this, but relying on magic field names in the middle is fragile). Consider making all non-SQL tools return a consistent message field in their MCP response contract, or document this assumption.

3. as never type cast in VegaLiteChartWidget
VegaLiteChartWidget.tsx:59:

embed(containerRef.current, fullSpec as never, {

as never is a full type safety escape hatch that hides potential mismatches between your spec shape and vega-embed's expected TopLevelSpec. Prefer as TopLevelSpec from vega-lite/build/src/spec to retain some type safety:

import type { TopLevelSpec } from 'vega-lite';
// ...
embed(containerRef.current, fullSpec as unknown as TopLevelSpec, { ... })

Fix this →

4. DRY violation — dark mode overrides duplicated
The dark mode colour overrides are defined verbatim in two places:

  • VegaLiteChartWidget.tsx:8-26 (TypeScript object)
  • exportConversation.ts:216-222 (inline JS string in a template literal)

Since exportConversation.ts embeds these as a JavaScript string fragment for the exported HTML, full module sharing isn't possible. But the hex values and property names should at least be consistent and ideally referenced from a shared constants file to avoid drift.

5. Missing version pin for vega-themes CDN
exportConversation.ts:176:

<script src="https://cdn.jsdelivr.net/npm/vega-themes"><\/script>

All other CDN scripts in the same function include explicit versions (plotly-2.35.2.min.js, vega@5, vega-lite@5, vega-embed@6). An unpinned vega-themes could silently break exported conversations if a major version is published. Pin it to match the installed version from package.json (currently ^3.0.0 → pin to vega-themes@3).

Fix this →


🧪 Testing Gaps

No new tests were added for the changed functionality:

  • build_system_prompt memory instructions — the new recall-before-save instructions could be verified with a simple string assertion in a test for build_system_prompt.
  • Non-SQL tool result parsing path — the else branch at agent.py:714-716 (the memory tool output extraction) is not covered by any existing test.
  • toolInput/output persistence in segments — the segment population logic (agent.py:555-563, 770-781) is not unit-tested; it would benefit from a test that mocks the SSE stream and verifies the resulting persisted_segments structure.

📝 Nit

agent.py:238: There's an extra blank line between _build_message_with_history and stream_chat (three blank lines where two would be PEP 8 standard).


Overall this is solid, focused work. The memory conflict fix in particular closes a real correctness gap. The items above are mostly refinements rather than blockers — the PR was already merged, so these can be tracked as follow-up improvements.

@cyyeh cyyeh deleted the feature/memory-and-tool-persistence branch March 3, 2026 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant