✨(conversation) summarize messages by camilleAND · Pull Request #421 · suitenumerique/conversations

camilleAND · 2026-04-23T13:30:57Z

Summary

This PR bounds conversation context size by combining a persisted incremental summary, a message index checkpoint, and cleanup of old tool results, while keeping the full history in the database for the UI and persistence.

Summary generation is token-budget driven, not triggered on a fixed turn interval.

What changed

Backend — history compaction

New history_processors.py module that:
- compacts older ToolReturnPart entries (short trace; only the latest tool cycle stays in full detail),
- estimates token weight of the active history,
- generates an incremental summary when active history exceeds the conversation token budget,
- only summarizes messages not yet covered: messages[checkpoint:next_checkpoint], merged with the existing history_summary.
Persistence on ChatConversation:
- history_summary: current summary injected as system instructions,
- history_summary_checkpoint: end-exclusive index of pydantic messages already included in that summary.
Runtime history window after summarization:
- messages from max(0, checkpoint - CONVERSATION_SUMMARY_CONTEXT_TURNS) (by message index, not user-turn count),
- the summary carries the rest of the context.
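
As a rough illustration of the checkpoint and window rules listed above, the sketch below uses plain strings in place of pydantic-ai messages. The names `estimate_tokens`, `active_window` and `plan_compaction`, the four-characters-per-token estimate, and the choice of `next_checkpoint = len(messages)` are assumptions made for this example; they are not taken from history_processors.py.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the context-window setting named in the description.
CONTEXT_MESSAGES = 2


def estimate_tokens(messages: list[str]) -> int:
    """Crude estimate (assumption): roughly four characters per token."""
    return sum(len(m) for m in messages) // 4


def active_window(messages: list[str], checkpoint: int,
                  context: int = CONTEXT_MESSAGES) -> list[str]:
    """Messages still sent verbatim: everything from max(0, checkpoint - context)."""
    return messages[max(0, checkpoint - context):]


@dataclass
class Compaction:
    to_summarize: list[str]   # messages[checkpoint:next_checkpoint]
    next_checkpoint: int      # end-exclusive index covered by the summary
    active: list[str]         # what the model would receive besides the summary


def plan_compaction(messages: list[str], checkpoint: int, budget: int) -> Compaction:
    """Decide whether a new incremental summary is needed for this turn."""
    active = active_window(messages, checkpoint)
    if estimate_tokens(active) <= budget:
        # Under budget: keep the current checkpoint, summarize nothing.
        return Compaction([], checkpoint, active)
    # Over budget (assumption): fold everything not yet covered into the summary.
    next_checkpoint = len(messages)
    return Compaction(
        messages[checkpoint:next_checkpoint],
        next_checkpoint,
        active_window(messages, next_checkpoint),
    )
```

With five 40-character messages and a budget of 30 estimated tokens, the first call summarizes all five and leaves the last two in the active window; a following call with one more message stays under budget and does not summarize again.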

Backend — Pydantic AI integration

Before agent.iter(): _apply_history_cleanup() cleans tools, checks the budget, generates a summary when needed, updates summary/checkpoint, and passes only the active history to the agent. The new summary is therefore available in the same turn via history_summary_instructions().
During the run (internal pydantic-ai tool steps): _history_processor() only runs clean_tool_history() — no re-summarization and no DB checkpoint manipulation on an already compacted history.
UI summarize event with summary_scope: "conversation" emitted before generation; result sent once summarization completes (before the main model call).

Token budget (conversation vs documents)

Split aligned with document context (PR #396 / hybrid document context):

usable_context = max(max_token_context - DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS, 0)
conversation_budget = usable_context × (1 - DOCUMENT_CONTEXT_BUDGET_RATIO)
document_budget     = usable_context × DOCUMENT_CONTEXT_BUDGET_RATIO

The security buffer is subtracted once from max_token_context; the remainder is shared between conversation and documents.
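
A minimal sketch of this split, assuming integer token counts. The function name `split_context_budget` and its parameter names are hypothetical; only the arithmetic follows the formulas above.

```python
def split_context_budget(
    max_token_context: int,
    security_buffer_tokens: int,
    document_budget_ratio: float,
) -> tuple[int, int]:
    """Return (conversation_budget, document_budget) in tokens."""
    # The buffer is subtracted exactly once, before the split.
    usable_context = max(max_token_context - security_buffer_tokens, 0)
    document_budget = int(usable_context * document_budget_ratio)
    # Taking the remainder keeps the two budgets summing to usable_context
    # even when the ratio does not divide evenly.
    conversation_budget = usable_context - document_budget
    return conversation_budget, document_budget
```

For example, with a context of 8000 tokens, a buffer of 1000 and a ratio of 0.25, the usable context is 7000 tokens: 1750 for documents and 5250 for the conversation.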

Frontend

Dedicated label for conversation summarization: “Summarizing conversation...” (summary_scope === 'conversation').

Tests

test_history_processors.py: budget, checkpoint, active window, no re-summarization when checkpoint is current, tool compaction.
Document context tests updated for the new budget split.

Why

Long conversations (especially with tools) were inflating the prompt quickly: large tool returns, full pydantic history sent to the model. We keep the full history in the DB for display and continuity, but limit what the model receives each turn via summary + recent window + lighter tools.

Test plan

pytest src/backend/chat/tests/agents/test_history_processors.py
Long conversation: budget exceeded → one summary per run, not on every following message while the active window stays under budget
After summarization: system prompt includes the new summary in the same turn
Tool calls: correct loading state, “Summarizing conversation...” label for conversation scope
Migration: history_summary + history_summary_checkpoint on existing conversations
With documents: verify conversation and documents correctly share usable_context (no double subtraction of the buffer)

Notes for reviewers

During tool runs, pydantic-ai may call the history processor again on an already reduced history; this is intentional and limited to tool cleanup only.

Summary by CodeRabbit

New Features
- Automatic conversation summarization and history compaction to keep chat context within token budgets; conversation summaries are persisted and checkpointed.
Improvements
- More accurate document/context budgeting and model-configuration parsing.
- UI messaging clarifies conversation vs general summarization.
Tests
- New comprehensive tests for history cleanup, summarization triggers, and document-context behavior.
Documentation
- Updated docs and changelog describing summarization and configuration settings.
Chores
- Database migrations to add summary fields.

coderabbitai · 2026-04-23T13:31:10Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 282c7469-bf2f-4d13-94a2-019d49e99735

📥 Commits

Reviewing files that changed from the base of the PR and between 9452fb5 and 89dde54.

📒 Files selected for processing (16)

CHANGELOG.md
docs/attachments.md
docs/env.md
src/backend/chat/agents/history_processors.py
src/backend/chat/clients/pydantic_ai.py
src/backend/chat/document_context_builder.py
src/backend/chat/llm_configuration.py
src/backend/chat/migrations/0008_chatconversation_history_summary.py
src/backend/chat/migrations/0009_chatconversation_history_summary_checkpoint.py
src/backend/chat/models.py
src/backend/chat/tests/agents/test_history_processors.py
src/backend/chat/tests/clients/pydantic_ai/test_document_context_window.py
src/backend/chat/tests/clients/pydantic_ai/test_thinking_history_stripping.py
src/backend/chat/tests/test_document_context_builder.py
src/backend/conversations/settings.py
src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx

Walkthrough

Adds history compaction and checkpointed summarization, integrates summarization into the Pydantic‑AI agent pipeline with in-memory/persisted summary state, adjusts document-context budgeting and settings, adds migrations and tests, and refines a frontend loader text/selection.

Changes

History processing and integration

| Layer / File(s) | Summary |
| --- | --- |
| History processors (core) `src/backend/chat/agents/history_processors.py` | `HistoryCleanupResult` dataclass; stringification helpers; `conversation_summarization` calling SummarizationAgent; `clean_tool_history` and `safe_clean_tool_history`; token estimation; active-history builders; `maybe_summarize_history` and `should_generate_conversation_summary`. |
| History processors tests `src/backend/chat/tests/agents/test_history_processors.py` | Pytest suite with fixtures capturing provider-received ModelMessages and tests for compaction, summarization triggers, checkpoint advancement, and fallback behavior; related test updates in document-context and thinking-history suites. |
| Pydantic-AI client integration `src/backend/chat/clients/pydantic_ai.py` | AIAgentService loads model configuration once, tracks `_history_summary` and `_history_summary_checkpoint`, registers history_processors, snapshots/validates `raw_history`, computes `usable_token_context`, decides/emits `summarize` tool calls, applies cleanup via `_apply_history_cleanup`, persists advances, and injects `SUMMARY_SYSTEM_PREFIX`+summary into dynamic instructions. |
| LLModel config & document budgeting `src/backend/chat/llm_configuration.py`, `src/backend/chat/document_context_builder.py` | `validate_max_token_context` accepts Any and resolves strings/env values; `build_document_context_instruction` drops `security_buffer_tokens` argument and computes `document_budget = max(int(max_token_context * budget_ratio), 0)`. |
| Conversation summary settings & docs `src/backend/conversations/settings.py`, `docs/attachments.md`, `docs/env.md` | Adds `CONVERSATION_SUMMARY_CONTEXT_MESSAGES` and `CONVERSATION_SUMMARY_MAX_TOKENS`, increases `DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS` default, and documents the summarization trigger, budgeting split (usable_context → documents vs messages), and related env vars. |
| Database fields and migrations `src/backend/chat/models.py`, `src/backend/chat/migrations/*` | Adds `ChatConversation.history_summary` (TextField) and `history_summary_checkpoint` (PositiveIntegerField) with migrations `0008`/`0009`. |
| Frontend: MessageItem summarize UI `src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx` | Select active tool invocation by reverse scan filtering `document_parsing` and `state==='result'`; require truthy `activeToolInvocation` to show loader; loader text distinguishes conversation-scoped summarization. |
| Changelog entry `CHANGELOG.md` | Added unreleased entry: ✨(conversation) summarize messages. |

Sequence Diagram

sequenceDiagram
  participant Client as ConversationAgent / AIAgentService
  participant Cleaner as clean_tool_history
  participant Decider as should_generate_conversation_summary
  participant Summarizer as SummarizationAgent
  participant DB as ChatConversation (persist)
  Client->>Cleaner: cleaned = clean_tool_history(raw_history)
  Cleaner->>Decider: check token budget & active window
  alt need summary
    Decider->>Summarizer: summarize(messages[from checkpoint:])
    Summarizer->>Client: summary text
    Client->>DB: persist history_summary and history_summary_checkpoint
    Client->>Client: return HistoryCleanupResult(history, summary, checkpoint)
  else no summary
    Cleaner->>Client: return HistoryCleanupResult(cleaned_history, None, None)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

suitenumerique/conversations#412: Related changes to the summarize tool registration/prompt metadata used by conversation summarization.
suitenumerique/conversations#200: Overlaps in Pydantic‑AI instruction assembly paths that this PR modifies to inject summary context.
suitenumerique/conversations#396: Overlaps on document-context instruction budgeting and the build_document_context_instruction signature changes.

Suggested labels

backend, enhancement, frontend

Suggested reviewers

providenz
qbey

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 73.02% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main feature addition: conversation message summarization. It directly reflects the primary change across the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (9)

src/backend/chat/agents/history_processors.py (2)

28-41: Dead fallback: _extract_latest_summary_from_instructions.

conversation_summarization only falls back to this helper when previous_summary is None, but history_cleanup always forwards previous_summary from the caller, and pydantic_ai.py always provides it from conversation.history_summary. The only scenario where this fallback runs is direct external callers of conversation_summarization, which don't exist in this PR. Consider either:

removing the fallback (and the helper) to simplify, or

adding a test that exercises it so it doesn't bit-rot.

Also on line 30, marker = f"{SUMMARY_SYSTEM_PREFIX}" can be marker = SUMMARY_SYSTEM_PREFIX (no interpolation).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/agents/history_processors.py` around lines 28 - 41, The
helper _extract_latest_summary_from_instructions is effectively dead
(conversation_summarization only uses it when previous_summary is None but
history_cleanup and pydantic_ai always supply previous_summary), so either
remove the helper and its call sites to simplify the codebase (delete
_extract_latest_summary_from_instructions and any logic that invokes it), or add
a unit test that exercises _extract_latest_summary_from_instructions via an
external caller to prevent bit-rot; additionally, simplify the marker assignment
inside that function by replacing marker = f"{SUMMARY_SYSTEM_PREFIX}" with
marker = SUMMARY_SYSTEM_PREFIX to avoid unnecessary interpolation.

19-19: Long line; consider splitting for readability.

SUMMARY_SYSTEM_PREFIX is a single ~100-char string literal. Not a bug, but breaking it over two lines keeps grep-ability while respecting line length:
-SUMMARY_SYSTEM_PREFIX = "[Conversation summary from previous turns] (context only, not a user request):\n"
+SUMMARY_SYSTEM_PREFIX = (
+    "[Conversation summary from previous turns] "
+    "(context only, not a user request):\n"
+)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/agents/history_processors.py` at line 19,
SUMMARY_SYSTEM_PREFIX is a long single-line string; split it into two (or more)
shorter adjacent string literals inside parentheses so Python concatenates them
(preserving the exact value) to improve readability while keeping the literal
searchable; update the declaration of SUMMARY_SYSTEM_PREFIX to use this
parenthesized multi-part string form so the prefix text remains identical but is
wrapped across lines.

src/backend/chat/tests/agents/test_history_processors.py (2)

113-127: Nit: lexical ordering of messages in the closure.

fake_summary references the outer messages before it's bound on line 121 — this works because the closure is called later (during history_cleanup), but flipping the order makes the dependency obvious and avoids tripping readers/linters:

♻️ Proposed reorder

 async def test_history_cleanup_turn_6_updates_summary_and_keeps_last_two_turns(monkeypatch):
+    messages = _build_turns(6)
+
     async def fake_summary(summary_messages, *, max_tokens=300, previous_summary=None):
         assert previous_summary == "summary-v1"
         assert len(summary_messages) == 6
         assert summary_messages == messages[-6:]
         return "summary-v2"

     monkeypatch.setattr(history_processors, "conversation_summarization", fake_summary)
-    messages = _build_turns(6)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/tests/agents/test_history_processors.py` around lines 113 -
127, In the test
test_history_cleanup_turn_6_updates_summary_and_keeps_last_two_turns, define
messages before the fake_summary closure so the closure doesn't capture an
unbound outer variable; move the messages = _build_turns(6) line above the async
def fake_summary(...) that asserts and references messages, then keep the
monkeypatch.setattr(history_processors, "conversation_summarization",
fake_summary) and the call to history_processors.history_cleanup(...) as-is so
fake_summary uses the already-bound messages when invoked.
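
The reason the original ordering works can be shown with a small standalone sketch, unrelated to the test suite itself (the function names here are made up for illustration). A closure looks up the enclosing variable when it is called, not when it is defined, so binding the name afterwards is valid as long as it happens before the call.

```python
def build_check():
    def check(received):
        # `expected` is resolved when check() runs, not when it is defined.
        return received == expected

    expected = [1, 2, 3]  # bound after the closure is defined, before any call
    return check


def call_too_early():
    def read():
        return expected

    try:
        read()  # the enclosing name is not bound yet at this point
    except NameError:
        return "unbound"
    expected = 1
    return "bound"
```

Calling `build_check()([1, 2, 3])` succeeds, while `call_too_early()` hits the `NameError` branch. Defining the variable first, as the review suggests, removes the need to reason about call timing at all.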

58-158: Cadence coverage LGTM, but consider adding a no-previous-summary-at-turn-5 case.

The suite exercises turn 3/4/5/6/8/9 well. One gap: behavior at a non-interval turn (e.g., turn 5) with previous_summary=None. Per history_cleanup, that path returns HistoryCleanupResult(history=cleaned_history) with the full history — worth locking in with a test so future refactors that trim unconditionally are caught.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/tests/agents/test_history_processors.py` around lines 58 -
158, Add a test that verifies behavior at turn 5 when previous_summary is None:
create a new async test (e.g.,
test_history_cleanup_turn_5_no_previous_summary_keeps_full_history) that builds
messages via _build_turns(5), monkeypatches
history_processors.conversation_summarization to raise if called (ensuring
summarization is not invoked), calls await
history_processors.history_cleanup(messages, previous_summary=None), and asserts
result.summary is None and result.history == messages (full history preserved).

src/backend/chat/llm_configuration.py (2)

150-158: Unit/range validation.

int(value) will happily accept 0 or negatives, which are nonsensical for a context window. Since the field is optional, consider either using Field(gt=0) on the attribute or asserting a positive value inside the validator, so misconfigurations fail loudly at config load.

♻️ Proposed tightening

-    max_token_context: int | None = None
+    max_token_context: int | None = Field(default=None, gt=0)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/llm_configuration.py` around lines 150 - 158, The validator
validate_max_token_context for the max_token_context field should reject zero or
negative values; update it to parse the value as now but then assert it's > 0
(raising a ValueError with a clear message) before returning, or alternatively
add a pydantic Field(gt=0) constraint to the max_token_context attribute so
pydantic enforces positivity; ensure None is still allowed and that
_get_setting_or_env_or_value(value) is used before the positivity check.
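
For the "assert inside the validator" option, a dependency-free sketch of the check could look like the following. The function name `parse_max_token_context` is hypothetical, and the setting/env resolution step done by `_get_setting_or_env_or_value` in the real validator is deliberately left out.

```python
from typing import Any, Optional


def parse_max_token_context(value: Any) -> Optional[int]:
    """Parse an optional context size and reject zero or negative values."""
    if value is None:
        # The field is optional, so None passes through unchanged.
        return None
    parsed = int(value)  # accepts ints and numeric strings such as "4096"
    if parsed <= 0:
        raise ValueError(
            f"max_token_context must be a positive integer, got {parsed}"
        )
    return parsed
```

This fails at configuration load for `0` or `-1` instead of letting a nonsensical context window reach the budget computation.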

144-158: Field naming: consider max_tokens_context (plural) for consistency.

The PR description and the planned follow-up (#396) refer to max_tokens_context, and the existing setting in LLMSettings is the plural max_tokens. The new field uses the singular max_token_context, which will be baked into the JSON LLM config schema once released. Renaming later is a breaking change for any deployment that adopts this key. Consider aligning now:

♻️ Suggested rename

-    max_token_context: int | None = None
+    max_tokens_context: int | None = None
     system_prompt: SettingEnvValue
     tools: list[str]
     web_search: SettingEnvValue | None = None
     concatenate_instruction_messages: bool | None = None

-    @field_validator("max_token_context", mode="before")
+    @field_validator("max_tokens_context", mode="before")
     @classmethod
-    def validate_max_token_context(cls, value: Any) -> int | None:
-        """Parse max_token_context from literal, setting, or env value."""
+    def validate_max_tokens_context(cls, value: Any) -> int | None:
+        """Parse max_tokens_context from literal, setting, or env value."""

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/llm_configuration.py` around lines 144 - 158, Rename the
field max_token_context to max_tokens_context to match existing LLMSettings
naming and PR `#396`; update the dataclass attribute, the validator decorator and
its name (e.g., validate_max_token_context → validate_max_tokens_context) so the
`@field_validator` targets "max_tokens_context", and update all references/usages
of max_token_context across the module (including any schema generation or
LLMSettings mappings) to the new plural name to avoid introducing a breaking
JSON key.

src/backend/chat/clients/pydantic_ai.py (2)

1141-1167: Initial "already done" check is effectively dead; simplify the summary tool lifecycle.

Pydantic-AI's history_processors are invoked when the first model request is processed inside the async iterator, not at conversation_agent.iter(...) context entry. At line 1142 _history_summary_generated_for_run was just reset to False on line 1082 and _history_processor has not yet been called, so this branch never fires in practice — the in-loop check on lines 1152-1161 and the fallback on 1163-1167 cover all real cases.

Consider collapsing to a single emit point after the node loop finishes (or right after the first yielded event) to reduce three near-duplicate emit sites to one:

♻️ Suggested simplification

-                state = StreamingState()
-                summary_tool_result_sent = False
-                if (
-                    should_emit_summary_event
-                    and self._history_summary_generated_for_run
-                ):
-                    yield events_v4.ToolResultPart(
-                        tool_call_id=tool_call_id,
-                        result={"state": "done"},
-                    )
-                    summary_tool_result_sent = True
-                async for event in self._process_agent_nodes(run, state, langfuse):
-                    if (
-                        should_emit_summary_event
-                        and not summary_tool_result_sent
-                        and self._history_summary_generated_for_run
-                    ):
-                        yield events_v4.ToolResultPart(
-                            tool_call_id=tool_call_id,
-                            result={"state": "done"},
-                        )
-                        summary_tool_result_sent = True
-                    yield event
-                if should_emit_summary_event and not summary_tool_result_sent:
-                    yield events_v4.ToolResultPart(
-                        tool_call_id=tool_call_id,
-                        result={"state": "done"},
-                    )
+                state = StreamingState()
+                summary_tool_result_sent = False
+                async for event in self._process_agent_nodes(run, state, langfuse):
+                    if should_emit_summary_event and not summary_tool_result_sent:
+                        yield events_v4.ToolResultPart(
+                            tool_call_id=tool_call_id,
+                            result={"state": "done"},
+                        )
+                        summary_tool_result_sent = True
+                    yield event
+                if should_emit_summary_event and not summary_tool_result_sent:
+                    yield events_v4.ToolResultPart(
+                        tool_call_id=tool_call_id,
+                        result={"state": "done"},
+                    )

Note: the gating on _history_summary_generated_for_run can be dropped entirely because the ToolCallPart on line 1089 is only emitted when should_emit_summary_event is true; the UI just needs a matching ToolResultPart regardless of whether the backend actually produced a new summary (e.g., summarization agent error returns None). The existing fallback at 1163 already relies on this.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/clients/pydantic_ai.py` around lines 1141 - 1167, The
initial pre-loop check that emits a ToolResultPart is effectively dead because
_history_summary_generated_for_run is reset before history_processors run;
remove the early branch that yields events_v4.ToolResultPart before iterating
_process_agent_nodes and instead emit the ToolResultPart exactly once after
processing agent nodes (or immediately after the first yielded event) when
should_emit_summary_event is true; update the logic in the method that contains
the loop over self._process_agent_nodes to only use the in-loop guard (or a
single post-loop fallback) and drop the redundant check of
_history_summary_generated_for_run so ToolResultPart emissions remain correct
and deduplicated.

484-497: Nit: trailing blank lines.

Extra blank line at 497-498 after _history_processor. Minor formatting.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/clients/pydantic_ai.py` around lines 484 - 497, Remove the
extraneous trailing blank line after the _history_processor method to satisfy
formatting; open the pydantic_ai.py file, locate the async def
_history_processor(self, history: list[ModelMessage]) -> list[ModelMessage]:
method, and delete the extra empty line(s) following the return
cleanup_result.history so the function ends cleanly with a single newline/EOF.

src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx (1)

388-396: Flatten the nested ternary for readability.

SonarCloud flags the nested ternary in the loader label. Extracting the label into a small helper (or a useMemo) improves readability without changing behavior and keeps t() call sites discoverable for i18n tooling.

♻️ Proposed refactor

+              {(() => {
+                if (activeToolInvocation.toolName !== 'summarize') {
+                  return <Text $variation="600" $size="md">{t('Search...')}</Text>;
+                }
+                const scope = (
+                  activeToolInvocation.args as { summary_scope?: string } | undefined
+                )?.summary_scope;
+                return (
+                  <Text $variation="600" $size="md">
+                    {scope === 'conversation'
+                      ? t('Summarizing conversation...')
+                      : t('Summarizing...')}
+                  </Text>
+                );
+              })()}
-                  <Text $variation="600" $size="md">
-                    {activeToolInvocation.toolName === 'summarize'
-                      ? (activeToolInvocation.args as { summary_scope?: string })
-                          ?.summary_scope === 'conversation'
-                        ? t('Summarizing conversation...')
-                        : t('Summarizing...')
-                      : t('Search...')}
-                  </Text>

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx`
around lines 388 - 396, The nested ternary in MessageItem's loader label makes
the JSX hard to read; extract the logic into a small helper or a useMemo (e.g.,
const loaderLabel = useMemo(() => getLoaderLabel(activeToolInvocation, t),
[activeToolInvocation, t]) or a function getLoaderLabel) that returns the
correct translated string based on activeToolInvocation.toolName and
(activeToolInvocation.args as { summary_scope?: string })?.summary_scope, and
then render <Text>{loaderLabel}</Text>; keep each branch calling t(...) directly
so i18n tooling still detects the strings.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/chat/agents/history_processors.py`:
- Around line 77-81: Remove the commented-out argument and either delete the
line or replace it with a clarifying comment: in the call to
summarization_agent.run (the resp = await summarization_agent.run(...) call),
remove "#message_history=messages," and, if keeping a comment, explain briefly
that messages are included in the prompt intentionally to avoid recursively
summarizing a history that already contains the summary marker (instead of
passing them via the message_history parameter).
- Around line 55-87: The prompt currently injects messages as a raw repr
(f"{messages}") which exposes internal metadata and bloats tokens; replace this
by formatting the list[ModelMessage] into a plain role-prefixed transcript
before interpolation (implement a helper like
_format_messages_for_prompt(messages) that extracts only user/assistant/tool
text parts and returns a clean "\n"-joined string), use
f"{_format_messages_for_prompt(messages)}\n\n" in the prompt construction (refer
to symbols messages and prompt), and also add the missing newline after the
"Only answer..." instruction so it doesn't run into "## Updated Summary:"; keep
the rest of the summarization_agent.run call and error handling unchanged.
- Around line 219-254: The summary branch can drop newly-accumulated turns when
summarization fails and gives no signal to retry; modify the
HistoryCleanupResult returned from the summarization branch (the block using
conversation_summarization and returning HistoryCleanupResult(...)) to include a
retry flag (e.g., add a retry_summary: bool field to HistoryCleanupResult) that
is set to True when summary is None and previous_summary is None so callers can
retry/log; also change the summary_input selection logic around
_take_last_turns(cleaned_history, turns=summary_interval_turns) to use the
window "since last checkpoint" (e.g., compute turns_since_last_checkpoint rather
than always summary_interval_turns) or at minimum add a clear comment in the
conversation_summarization branch documenting the assumption and the potential
loss when summarization fails, referencing conversation_summarization,
_take_last_turns, summary_interval_turns, previous_summary and
HistoryCleanupResult so reviewers can locate the change.
- Around line 103-156: The cleanup is introducing UserPromptPart placeholders in
_clean_request_parts which makes previously tool-only ModelRequest messages
count as user turns; change the replacement to a non-user part (e.g., use
SystemPromptPart or a TextPart placed in a non-request response role) instead of
UserPromptPart so _count_turns (which looks for UserPromptPart inside
ModelRequest) remains stable; update _clean_request_parts to create that
non-counting placeholder for non-latest ToolReturnPart instances (referencing
_clean_request_parts, ToolReturnPart, UserPromptPart, SystemPromptPart/TextPart,
and _count_turns/ModelRequest).

In `@src/backend/chat/clients/pydantic_ai.py`:
- Around line 1080-1093: Wrap the call to
ModelMessagesTypeAdapter.validate_python(self.conversation.pydantic_messages) in
a try/except that catches pydantic ValidationError (and a broad Exception
fallback), log the exception (including context such as conversation id or
self.conversation) via the existing logger, and on error set raw_history = [] so
execution continues; keep the surrounding logic that computes
should_emit_summary_event and yields the ToolCallPart (tool_name="summarize")
unchanged so a single bad persisted message row won’t abort the stream.

In `@src/backend/conversations/settings.py`:
- Around line 685-699: The defaults CONVERSATION_SUMMARY_INTERVAL_TURNS and
CONVERSATION_SUMMARY_CONTEXT_TURNS are currently 25/25 which may exceed tight
model context budgets; change these defaults to a safer lower value (e.g.,
10/10) and validate them against your smallest target max_tokens_context or add
a note to revisit when the token-aware trigger is implemented; also fix the typo
in the comment from "docoument_context_ratio" to "document_context_ratio" so the
explanatory comment is correct.

---

Nitpick comments:
In `@src/backend/chat/agents/history_processors.py`:
- Around line 28-41: The helper _extract_latest_summary_from_instructions is
effectively dead (conversation_summarization only uses it when previous_summary
is None but history_cleanup and pydantic_ai always supply previous_summary), so
either remove the helper and its call sites to simplify the codebase (delete
_extract_latest_summary_from_instructions and any logic that invokes it), or add
a unit test that exercises _extract_latest_summary_from_instructions via an
external caller to prevent bit-rot; additionally, simplify the marker assignment
inside that function by replacing marker = f"{SUMMARY_SYSTEM_PREFIX}" with
marker = SUMMARY_SYSTEM_PREFIX to avoid unnecessary interpolation.
- Line 19: SUMMARY_SYSTEM_PREFIX is a long single-line string; split it into two
(or more) shorter adjacent string literals inside parentheses so Python
concatenates them (preserving the exact value) to improve readability while
keeping the literal searchable; update the declaration of SUMMARY_SYSTEM_PREFIX
to use this parenthesized multi-part string form so the prefix text remains
identical but is wrapped across lines.
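
The line-length nitpick above relies on Python joining adjacent string literals at compile time. A minimal sketch, using a placeholder value because the real text of SUMMARY_SYSTEM_PREFIX is not quoted in this review (both constant names below are hypothetical):

```python
# Placeholder text: the actual SUMMARY_SYSTEM_PREFIX value is not shown in the review.
SINGLE_LINE_PREFIX = "Summary of the earlier conversation (generated automatically): "

# Adjacent literals inside parentheses are concatenated by the compiler,
# so the resulting value is identical to the single-line form.
WRAPPED_PREFIX = (
    "Summary of the earlier conversation "
    "(generated automatically): "
)

assert WRAPPED_PREFIX == SINGLE_LINE_PREFIX
```

Because no `+` or `.join()` is involved, the value stays a plain constant. Each fragment is still searchable, though a search for text spanning the break would no longer match.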

In `@src/backend/chat/clients/pydantic_ai.py`:
- Around line 1141-1167: The initial pre-loop check that emits a ToolResultPart
is effectively dead because _history_summary_generated_for_run is reset before
history_processors run; remove the early branch that yields
events_v4.ToolResultPart before iterating _process_agent_nodes and instead emit
the ToolResultPart exactly once after processing agent nodes (or immediately
after the first yielded event) when should_emit_summary_event is true; update
the logic in the method that contains the loop over self._process_agent_nodes to
only use the in-loop guard (or a single post-loop fallback) and drop the
redundant check of _history_summary_generated_for_run so ToolResultPart
emissions remain correct and deduplicated.
- Around line 484-497: Remove the extraneous trailing blank line after the
_history_processor method to satisfy formatting; open the pydantic_ai.py file,
locate the async def _history_processor(self, history: list[ModelMessage]) ->
list[ModelMessage]: method, and delete the extra empty line(s) following the
return cleanup_result.history so the function ends cleanly with a single
newline/EOF.

In `@src/backend/chat/llm_configuration.py`:
- Around line 150-158: The validator validate_max_token_context for the
max_token_context field should reject zero or negative values; update it to
parse the value as now but then assert it's > 0 (raising a ValueError with a
clear message) before returning, or alternatively add a pydantic Field(gt=0)
constraint to the max_token_context attribute so pydantic enforces positivity;
ensure None is still allowed and that _get_setting_or_env_or_value(value) is
used before the positivity check.
- Around line 144-158: Rename the field max_token_context to max_tokens_context
to match existing LLMSettings naming and PR `#396`; update the dataclass
attribute, the validator decorator and its name (e.g.,
validate_max_token_context → validate_max_tokens_context) so the
`@field_validator` targets "max_tokens_context", and update all references/usages
of max_token_context across the module (including any schema generation or
LLMSettings mappings) to the new plural name to avoid introducing a breaking
JSON key.
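
The positivity check requested above could look roughly like the following. This is a dependency-free sketch of the validator body only: the function name follows the proposed rename, and the PR's _get_setting_or_env_or_value helper is not reproduced, so plain int() parsing stands in for it as an assumption.

```python
from typing import Optional, Union


def validate_max_tokens_context(value: Union[int, str, None]) -> Optional[int]:
    """Return a positive int, or None when the setting is unset."""
    if value is None:
        return None  # None stays allowed, as the review requests
    # Assumption: in the PR the raw value would first be resolved through
    # _get_setting_or_env_or_value; here it is parsed directly.
    parsed = int(value)
    if parsed <= 0:
        raise ValueError(f"max_tokens_context must be > 0, got {parsed}")
    return parsed
```

In a pydantic model, the same logic would sit inside the field validator, or could be replaced by a `gt=0` constraint on the field, as the comment suggests.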

In `@src/backend/chat/tests/agents/test_history_processors.py`:
- Around line 113-127: In the test
test_history_cleanup_turn_6_updates_summary_and_keeps_last_two_turns, define
messages before the fake_summary closure so the closure doesn't capture an
unbound outer variable; move the messages = _build_turns(6) line above the async
def fake_summary(...) that asserts and references messages, then keep the
monkeypatch.setattr(history_processors, "conversation_summarization",
fake_summary) and the call to history_processors.history_cleanup(...) as-is so
fake_summary uses the already-bound messages when invoked.
- Around line 58-158: Add a test that verifies behavior at turn 5 when
previous_summary is None: create a new async test (e.g.,
test_history_cleanup_turn_5_no_previous_summary_keeps_full_history) that builds
messages via _build_turns(5), monkeypatches
history_processors.conversation_summarization to raise if called (ensuring
summarization is not invoked), calls await
history_processors.history_cleanup(messages, previous_summary=None), and asserts
result.summary is None and result.history == messages (full history preserved).

In
`@src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx`:
- Around line 388-396: The nested ternary in MessageItem's loader label makes
the JSX hard to read; extract the logic into a small helper or a useMemo (e.g.,
const loaderLabel = useMemo(() => getLoaderLabel(activeToolInvocation, t),
[activeToolInvocation, t]) or a function getLoaderLabel) that returns the
correct translated string based on activeToolInvocation.toolName and
(activeToolInvocation.args as { summary_scope?: string })?.summary_scope, and
then render <Text>{loaderLabel}</Text>; keep each branch calling t(...) directly
so i18n tooling still detects the strings.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b8ce5eff-5202-4dc3-8d94-f0f127c1dd54

📥 Commits

Reviewing files that changed from the base of the PR and between b8b5630 and e0ec6ed.

📒 Files selected for processing (8)

src/backend/chat/agents/history_processors.py
src/backend/chat/clients/pydantic_ai.py
src/backend/chat/llm_configuration.py
src/backend/chat/migrations/0007_chatconversation_history_summary.py
src/backend/chat/models.py
src/backend/chat/tests/agents/test_history_processors.py
src/backend/conversations/settings.py
src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx

coderabbitai · 2026-04-23T13:39:02Z

+        # Conversation summary process
+        raw_history = ModelMessagesTypeAdapter.validate_python(self.conversation.pydantic_messages)
+        self._history_summary_generated_for_run = False
+        should_emit_summary_event = should_generate_conversation_summary(
+            raw_history,
+            summary_interval_turns=self._conversation_summary_interval_turns,
+        )
+        if should_emit_summary_event:
+            tool_call_id = str(uuid.uuid4())
+            yield events_v4.ToolCallPart(
+                tool_call_id=tool_call_id,
+                tool_name="summarize",
+                args={"state": "running", "summary_scope": "conversation"},
+            )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

pydantic-ai ModelMessagesTypeAdapter stability across versions schema migration

💡 Result:

ModelMessagesTypeAdapter from pydantic-ai provides stable (de)serialization of message histories (list[ModelRequest | ModelResponse]) to/from JSON for storage and reloading. Official docs recommend it for persisting chat history across agent runs.

Stability across versions: High confidence in post-v1.0 stability (released Sep 2025). Version policy guarantees no breaking changes to public interface including deserialization until v2 (earliest Apr 2026). Team treats deserialization breaks as bugs to fix ASAP. Adding new optional fields/message parts explicitly allowed and does not break existing data.

Known issues: Occasional bugs fixed quickly, e.g., v1.42→v1.52 deserialization incorrectly parsed dicts as ImageUrl objects (Issue #4190, fixed in PR #4191). Users have implemented manual migrations for field renames (e.g., vendor_details→provider_details, request_tokens→input_tokens) when needed pre-v1.

Schema migration: No automated system. Handle via manual transforms before validate_json/python (as in Issue #2789 example), or rely on defaults/backwards-compatible field handling (e.g., Usage.details=None→{}). Always code defensively when consuming deserialized messages.

Current version (2026-04-23): v1.85.0. Upgrade via changelog/upgrade guide for any manual steps.

Citations:

1: ModelMessagesTypeAdapter deserialization incorrectly constructs ImageUrl objects, causing LLM request failures pydantic/pydantic-ai#4190

2: https://pydantic.dev/docs/ai/api/pydantic-ai/messages/

3: https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/messages.py

4: https://pydantic.dev/docs/ai/core-concepts/message-history/

5: https://github.com/pydantic/pydantic-ai/releases

6: Backwards compatibility guarantees for storing and loading message history? pydantic/pydantic-ai#2789

7: https://pydantic.dev/docs/ai/project/changelog/

8: https://pypi.org/project/pydantic-ai/

9: https://ai.pydantic.dev/changelog/

10: https://ai.pydantic.dev/changelog

11: http://www.github.com/pydantic/pydantic-ai

12: https://ai.pydantic.dev/version-policy/

🏁 Script executed:

# Find the definition of pydantic_messages to understand its origin and type
fd -type f -name "*.py" | xargs rg "pydantic_messages" -B 2 -A 2 --max-count=20

Repository: suitenumerique/conversations

Length of output: 33634

🏁 Script executed:

# Search for the conversation class definition to understand the JSONField
rg "class.*[Cc]onversation" src/backend/chat -A 5

Repository: suitenumerique/conversations

Length of output: 7084

🏁 Script executed:

# Check for existing error handling around validate_python calls
rg "validate_python|ValidationError" src/backend/chat/clients/pydantic_ai.py -B 2 -A 2

Repository: suitenumerique/conversations

Length of output: 368

🏁 Script executed:

# Check version pinning for pydantic-ai
fd -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" | xargs rg "pydantic.?ai"

Repository: suitenumerique/conversations

Length of output: 13381

🏁 Script executed:

# Get the full method context around lines 1080-1093
sed -n '1050,1120p' src/backend/chat/clients/pydantic_ai.py | cat -n

Repository: suitenumerique/conversations

Length of output: 3153

🏁 Script executed:

# Check if there's a try/except wrapping this code section
sed -n '1000,1150p' src/backend/chat/clients/pydantic_ai.py | grep -A 50 -B 50 "ModelMessagesTypeAdapter" | head -120

Repository: suitenumerique/conversations

Length of output: 3889

🏁 Script executed:

# Check for try/except blocks in the file
rg "try:|except.*:" src/backend/chat/clients/pydantic_ai.py -B 1 -A 3 | head -80

Repository: suitenumerique/conversations

Length of output: 1117

Wrap ModelMessagesTypeAdapter.validate_python in defensive error handling.

The unguarded validate_python() call can raise ValidationError if self.conversation.pydantic_messages contains data with schema drift from prior pydantic-ai versions. Since pydantic_messages is a JSONField persisted across upgrades (currently pinned to v1.77.0) and pydantic-ai has known deserialization bugs (e.g., v1.42→v1.52), a single corrupt row will abort the entire stream before any event is yielded. Wrap the validation in try/except, log the error, and fall back to [] so that conversation history issues don't wedge the agent run. This aligns with pydantic-ai's official guidance to "always code defensively when consuming deserialized messages."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/backend/chat/clients/pydantic_ai.py` around lines 1080 - 1093, Wrap the call to ModelMessagesTypeAdapter.validate_python(self.conversation.pydantic_messages) in a try/except that catches pydantic ValidationError (and a broad Exception fallback), log the exception (including context such as conversation id or self.conversation) via the existing logger, and on error set raw_history = [] so execution continues; keep the surrounding logic that computes should_emit_summary_event and yields the ToolCallPart (tool_name="summarize") unchanged so a single bad persisted message row won’t abort the stream.
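
The defensive wrapper requested in this comment can be sketched as follows. This is an illustration, not the PR's code: load_history_safely is a hypothetical helper, and the validator is passed in as a parameter so the snippet runs without pydantic-ai installed. In the PR, that parameter would be ModelMessagesTypeAdapter.validate_python, with pydantic's ValidationError caught ahead of the broad fallback.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def load_history_safely(
    raw_messages: Any,
    validate: Callable[[Any], list],
    conversation_id: str,
) -> list:
    """Deserialize persisted messages, falling back to an empty history on failure."""
    try:
        return list(validate(raw_messages))
    except Exception:  # broad fallback; a narrower ValidationError clause could precede it
        # logger.exception records the traceback together with the conversation id.
        logger.exception(
            "Could not deserialize history for conversation %s", conversation_id
        )
        return []
```

Falling back to an empty list keeps the stream alive, at the cost of the model losing prior context for that turn, so the log line should be easy to alert on.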

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

src/backend/chat/clients/pydantic_ai.py (1)

1201-1201: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard persisted history deserialization.

A malformed pydantic_messages row will raise here and abort the entire stream before any event is emitted. Please catch deserialization errors, log the conversation id, and fall back to an empty history instead of wedging the run.

#!/bin/bash
set -euo pipefail

echo "Inspecting deserialization call site and nearby error handling..."
sed -n '1196,1210p' src/backend/chat/clients/pydantic_ai.py
echo
rg -nC2 'ModelMessagesTypeAdapter\.validate_python|ValidationError' src/backend/chat/clients/pydantic_ai.py

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/chat/clients/pydantic_ai.py` at line 1201, The call to
ModelMessagesTypeAdapter.validate_python(self.conversation.pydantic_messages)
can raise on malformed persisted rows and currently will abort the stream; wrap
that call in a try/except that catches pydantic.ValidationError (and a broad
Exception fallback), log an error including the conversation id (accessible via
self.conversation.id or self.conversation.conversation_id) and the exception
details using the existing logger, and on error assign raw_history = [] (empty
history) so the run continues and emits events instead of wedging; keep the
happy-path behavior unchanged when validation succeeds.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/chat/agents/history_processors.py`:
- Around line 175-192: The _clean_response_parts function currently leaves
non-latest ToolCallPart instances intact, retaining full old tool arguments;
update it to replace any ToolCallPart whose tool_call_id is not in
latest_tool_call_ids with a compact/redacted sentinel ToolCallPart containing
the same tool_call_id, a best-effort tool_name via getattr(part, "tool_name",
"unknown_tool"), and a minimal args payload like '{"redacted": true}', while
still appending unchanged parts for non-ToolCallPart instances and for
ToolCallPart entries that are in latest_tool_call_ids.
- Around line 280-289: The current code wrongly treats a reused previous_summary
as a newly generated summary by placing previous_summary into
HistoryCleanupResult.summary; update the branch where summary is falsy so that
HistoryCleanupResult.summary is not populated with previous_summary (set summary
to None or omit the summary field) and only return the trimmed history (via
_take_last_turns(cleaned_history, turns=context_turns)); if you need to surface
the prior summary for other logic, add or populate a distinct attribute (e.g.,
previous_summary) on HistoryCleanupResult instead of reusing summary.
- Around line 111-123: The current debug logs in history_processors.py output
raw conversation text (prompt, latest_summary, updated_summary) which can leak
PII; update the logging around summarization_agent.run (the block that calls
summarization_agent.run and uses variables prompt, latest_summary, resp/output
-> updated_summary) to avoid printing full text — log safe metadata instead
(e.g., lengths or token counts of prompt/latest_summary/updated_summary and a
masked or hashed identifier) and keep the exception logging for failures as-is;
ensure no raw prompt/summary strings are included in any logger.debug or
logger.info calls.
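
The ToolCallPart redaction described in the first item above could be sketched like this. The dataclasses are simplified stand-ins for pydantic-ai's message parts (assumptions, not the library's real classes), and clean_response_parts mirrors the private helper's name without claiming to match its actual signature.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class ToolCallPart:  # simplified stand-in for pydantic-ai's ToolCallPart
    tool_name: str
    args: str
    tool_call_id: str


@dataclass(frozen=True)
class TextPart:  # simplified stand-in
    content: str


Part = Union[ToolCallPart, TextPart]
REDACTED_ARGS = '{"redacted": true}'


def clean_response_parts(
    parts: list[Part], latest_tool_call_ids: set[str]
) -> list[Part]:
    """Keep the latest tool calls intact and redact the arguments of older ones."""
    cleaned: list[Part] = []
    for part in parts:
        if isinstance(part, ToolCallPart) and part.tool_call_id not in latest_tool_call_ids:
            # Same id and tool name, minimal args payload.
            cleaned.append(replace(part, args=REDACTED_ARGS))
        else:
            cleaned.append(part)
    return cleaned
```

Keeping tool_call_id unchanged matters because tool returns are matched to their calls by that id.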

In `@src/backend/chat/clients/pydantic_ai.py`:
- Around line 1200-1207: The summary trigger is being computed from persisted
history only, causing the UI event to fire one turn late; fix it by deriving
should_emit_summary_event from the exact same message set passed to the history
processor (i.e., include the current user turn when calling
ModelMessagesTypeAdapter.validate_python on self.conversation.pydantic_messages
before calling should_generate_conversation_summary) or alternatively suppress
emitting the UI event until _history_processor reports it actually generated a
new summary by checking the _history_summary_generated_for_run flag after
_history_processor runs; update the logic around
should_generate_conversation_summary, ModelMessagesTypeAdapter.validate_python,
_history_processor, and _history_summary_generated_for_run so the emitted event
reflects the processor's output for the same message set.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 339af8af-2fe1-42b7-aa8b-6dbd93d3ea37

📥 Commits

Reviewing files that changed from the base of the PR and between e0ec6ed and 8256568.

📒 Files selected for processing (8)

src/backend/chat/agents/history_processors.py
src/backend/chat/clients/pydantic_ai.py
src/backend/chat/llm_configuration.py
src/backend/chat/migrations/0007_chatconversation_history_summary.py
src/backend/chat/models.py
src/backend/chat/tests/agents/test_history_processors.py
src/backend/conversations/settings.py
src/frontend/apps/conversations/src/features/chat/components/MessageItem.tsx

🚧 Files skipped from review as they are similar to previous changes (1)

src/backend/chat/llm_configuration.py

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/backend/chat/tests/clients/pydantic_ai/test_document_context_window.py (1)

132-148: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make FIFO test data overflow the updated budget model.

At Line 140 the budget is documented as 1500, but with the current stub at Lines 132-135 each doc is 400 tokens (total 1200), so no eviction should happen. This no longer validates FIFO eviction behavior.

Proposed fix

-    monkeypatch.setattr(
-        "chat.document_context_builder.count_approx_tokens",
-        lambda _text: 400,
-    )
+    monkeypatch.setattr(
+        "chat.document_context_builder.count_approx_tokens",
+        lambda _text: 600,
+    )

-    # max_token_context=4000, buffer=1000, ratio=0.5 => budget=1500.
-    # With 3 docs at 400 tokens each, rolling outcome should inline doc-2 + doc-3.
+    # max_token_context=4000, buffer=1000, ratio=0.5 => budget=1500.
+    # With 3 docs at 600 tokens each, rolling outcome should inline doc-2 + doc-3.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/chat/tests/clients/pydantic_ai/test_document_context_window.py`
around lines 132 - 148, The test currently stubs
chat.document_context_builder.count_approx_tokens to return 400, so three docs
total 1200, which does not exceed the documented budget (1500) and never
triggers FIFO eviction; update that stub so each doc is larger (e.g., return
600, as in the proposed fix, or 700) so that three docs exceed 1500 while two
docs still fit (any value above 500 and at most 750), causing the rolling
eviction path in service._build_document_context_instruction to inline only
doc-2 and doc-3 and preserve the existing assertions; also update the inline
comment describing the budget if needed.
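
The arithmetic behind this finding can be checked with a small simulation. It is a sketch under two assumptions: the budget is (max_token_context - buffer) * ratio, which matches the 1500 quoted in the test comment, and eviction drops the oldest document first until the remainder fits. fifo_inline is a hypothetical helper, not the builder's real code.

```python
def fifo_inline(doc_tokens: list[int], budget: int) -> list[int]:
    """Return indices of documents kept after evicting oldest-first until the total fits."""
    kept = list(range(len(doc_tokens)))
    while kept and sum(doc_tokens[i] for i in kept) > budget:
        kept.pop(0)  # drop the oldest remaining document
    return kept


budget = int((4000 - 1000) * 0.5)  # 1500, as documented in the test comment

for per_doc in (400, 600, 700, 800):
    print(per_doc, fifo_inline([per_doc] * 3, budget))
# 400 [0, 1, 2]  -> no eviction, FIFO path never exercised
# 600 [1, 2]     -> doc-1 evicted, doc-2 + doc-3 inlined
# 700 [1, 2]     -> same outcome
# 800 [2]        -> two docs weigh 1600, so only doc-3 survives
```

Under these assumptions, a per-document weight must be above 500 and at most 750 for the test to end with exactly doc-2 and doc-3 inlined.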

🧹 Nitpick comments (1)

src/backend/chat/clients/pydantic_ai.py (1)

556-558: ⚡ Quick win

Remove the async keyword from _history_processor.

This method only calls the synchronous clean_tool_history function and doesn't use any await. While PydanticAI accepts both sync and async history processors, removing async here improves clarity by signaling that no asynchronous I/O occurs.

Proposed fix

-    async def _history_processor(self, history: list[ModelMessage]) -> list[ModelMessage]:
+    def _history_processor(self, history: list[ModelMessage]) -> list[ModelMessage]:
         """Native pydantic-ai history processor for internal tool-cycle cleanup."""
         return clean_tool_history(history)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/chat/clients/pydantic_ai.py` around lines 556 - 558, The
_history_processor function is marked async but only calls the synchronous
clean_tool_history and performs no awaits; change its declaration from async def
_history_processor(self, history: list[ModelMessage]) -> list[ModelMessage]: to
a plain synchronous def _history_processor(...) so it is a regular function
(leave the body returning clean_tool_history(history) and keep the same type
annotations and docstring). This makes the function signature accurately reflect
that it performs no async I/O and avoids unnecessary coroutine objects when
PydanticAI invokes it.
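
The difference the nitpick points at can be seen in plain Python. The snippet below uses string lists and a stand-in clean_tool_history. It only illustrates what calling each variant returns, and says nothing about how PydanticAI dispatches processors internally.

```python
import asyncio
import inspect


def clean_tool_history(history: list[str]) -> list[str]:
    """Stand-in for the PR's synchronous cleanup helper."""
    return [message for message in history if message != "old-tool-result"]


async def async_processor(history: list[str]) -> list[str]:
    return clean_tool_history(history)  # no await anywhere in the body


def sync_processor(history: list[str]) -> list[str]:
    return clean_tool_history(history)


history = ["user", "old-tool-result", "assistant"]

pending = async_processor(history)
assert inspect.iscoroutine(pending)  # calling it only creates a coroutine object
assert asyncio.run(pending) == ["user", "assistant"]

assert sync_processor(history) == ["user", "assistant"]  # direct result
```

The async variant adds a coroutine allocation and an event-loop hop for no benefit when the body never awaits.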


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e5e4ebef-5bd1-4de8-82f9-ee7f634b3351

📥 Commits

Reviewing files that changed from the base of the PR and between 8256568 and 9452fb5.

📒 Files selected for processing (9)

src/backend/chat/agents/history_processors.py
src/backend/chat/clients/pydantic_ai.py
src/backend/chat/document_context_builder.py
src/backend/chat/migrations/0008_chatconversation_history_summary_checkpoint.py
src/backend/chat/models.py
src/backend/chat/tests/agents/test_history_processors.py
src/backend/chat/tests/clients/pydantic_ai/test_document_context_window.py
src/backend/chat/tests/test_document_context_builder.py
src/backend/conversations/settings.py

💤 Files with no reviewable changes (1)

src/backend/conversations/settings.py

✅ Files skipped from review due to trivial changes (1)

src/backend/chat/migrations/0008_chatconversation_history_summary_checkpoint.py

Add a summarization of last messages every n turns to optimize context

sonarqubecloud · 2026-05-22T15:16:25Z

Quality Gate passed

Issues
8 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

camilleAND force-pushed the camand/clean_history branch from e0ec6ed to e0a4a2a Compare April 23, 2026 13:34

coderabbitai Bot reviewed Apr 23, 2026

camilleAND force-pushed the camand/clean_history branch from e0a4a2a to 5ce0a75 Compare April 23, 2026 14:23

camilleAND force-pushed the camand/clean_history branch from 5ce0a75 to 8256568 Compare May 6, 2026 09:08

coderabbitai Bot reviewed May 6, 2026

Comment thread src/backend/chat/agents/history_processors.py

Comment thread src/backend/chat/agents/history_processors.py Outdated

Comment thread src/backend/chat/agents/history_processors.py Outdated

Comment thread src/backend/chat/clients/pydantic_ai.py Outdated

eliott07 mentioned this pull request May 11, 2026

History management for long conversation #313

Open

coderabbitai Bot reviewed May 20, 2026

camilleAND force-pushed the camand/clean_history branch 10 times, most recently from dd993bd to 44d1569 Compare May 22, 2026 14:26

✨(conversation) summarize messages

89dde54

Add a summarization of last messages every n turns to optimize context

camilleAND force-pushed the camand/clean_history branch from 44d1569 to 89dde54 Compare May 22, 2026 15:15


✨(conversation) summarize messages #421
camilleAND wants to merge 1 commit into main from camand/clean_history
