Feature/agent compaction#68
Conversation
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
Adds deterministic, token-threshold-based message compaction to prevent oversized runs by trimming older tool-call/tool-result pairs once the message history exceeds ~180k estimated tokens.
Changes:
- Introduces
src/core/compaction.tswith token estimation and tool-group trimming logic plus a trace event helper. - Hooks compaction into
runInternalinsrc/core/engine.tsand emits amemory_operationtrace event for compaction. - Extends
TraceEvent['memory_operation']to supportoperation: 'compact'and optional metadata.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/core/types.ts | Extends trace event typing for compaction metadata. |
| src/core/engine.ts | Triggers compaction before agent execution and emits a compaction trace event. |
| src/core/compaction.ts | Implements token estimation + deterministic trimming of tool interaction groups. |
| package-lock.json | Updates lockfile dependency set (includes cfb). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| safeConsole.log(`[JAF:ENGINE] Compaction complete: removed ${result.removedCount} messages, reduced from ${result.originalTokens} to ${result.compactedTokens} tokens`); | ||
|
|
||
| // Emit compaction event | ||
| config.onEvent?.(createCompactionEvent(result, state.runId as string, state.traceId as string)); |
There was a problem hiding this comment.
shouldCompact can be true even when there are no removable tool groups (e.g., long user/assistant text history), in which case compactState returns removedCount === 0 and token counts don’t change. The current logs and createCompactionEvent emission still claim compaction completed; consider guarding the “complete” log + event behind result.removedCount > 0, and emitting a distinct “compaction skipped/no-op” message if nothing could be removed.
| safeConsole.log(`[JAF:ENGINE] Compaction complete: removed ${result.removedCount} messages, reduced from ${result.originalTokens} to ${result.compactedTokens} tokens`); | |
| // Emit compaction event | |
| config.onEvent?.(createCompactionEvent(result, state.runId as string, state.traceId as string)); | |
| if (result.removedCount > 0) { | |
| safeConsole.log(`[JAF:ENGINE] Compaction complete: removed ${result.removedCount} messages, reduced from ${result.originalTokens} to ${result.compactedTokens} tokens`); | |
| // Emit compaction event | |
| config.onEvent?.(createCompactionEvent(result, state.runId as string, state.traceId as string)); | |
| } else { | |
| safeConsole.log(`[JAF:ENGINE] Compaction skipped: no removable messages found; token count remains ${result.originalTokens}`); | |
| } |
| @@ -0,0 +1,243 @@ | |||
| import { Message, RunState, TraceEvent, getTextContent } from './types.js'; | |||
There was a problem hiding this comment.
getTextContent is imported but never used in this module. This will fail lint/typecheck in repos that enforce no-unused-vars; remove the import or use it as part of token estimation.
| import { Message, RunState, TraceEvent, getTextContent } from './types.js'; | |
| import { Message, RunState, TraceEvent } from './types.js'; |
| // Calculate tokens for the entire group | ||
| let groupTokens = 0; | ||
| const groupIndices: number[] = []; | ||
|
|
||
| for (let j = i; j < messages.length; j++) { | ||
| if (toolGroups.get(j) === groupId) { | ||
| groupTokens += estimateMessageTokens(messages[j]); | ||
| groupIndices.push(j); | ||
| } | ||
| } |
There was a problem hiding this comment.
trimToolMessages computes each tool-group’s token total by scanning from the current index to the end of the message list. In the compaction-triggered scenario (very large histories) this becomes O(n²) and can add significant latency. Consider precomputing a map of groupId -> {indices, tokens} in a single pass and then trimming groups from the front using that precomputed structure.
| traceId: string | ||
| ): TraceEvent { | ||
| return { | ||
| type: 'memory_operation', | ||
| data: { | ||
| operation: 'compact', | ||
| conversationId: runId, |
There was a problem hiding this comment.
createCompactionEvent sets conversationId to runId, but elsewhere the engine/memory integration treats conversationId as config.conversationId (the persisted conversation key). Using runId here makes this event ambiguous and can break consumers correlating memory operations by conversation. Consider passing the actual conversation id (e.g., config.conversationId when available) and, if absent, falling back to runId explicitly so semantics are clear.
| traceId: string | |
| ): TraceEvent { | |
| return { | |
| type: 'memory_operation', | |
| data: { | |
| operation: 'compact', | |
| conversationId: runId, | |
| traceId: string, | |
| conversationId?: string | |
| ): TraceEvent { | |
| return { | |
| type: 'memory_operation', | |
| data: { | |
| operation: 'compact', | |
| conversationId: conversationId ?? runId, |
| // Token-based compaction: check and trim if messages exceed limit | ||
| let compactionState = state; | ||
| if (shouldCompact(state.messages, defaultCompactionConfig)) { | ||
| const currentTokens = estimateTotalTokens(state.messages); | ||
| safeConsole.log(`[JAF:ENGINE] Compaction triggered: ${currentTokens} tokens exceed ${defaultCompactionConfig.maxTokenLimit} limit`); | ||
|
|
||
| const { state: compactedState, result } = compactState(state, defaultCompactionConfig); | ||
| compactionState = compactedState; | ||
|
|
||
| safeConsole.log(`[JAF:ENGINE] Compaction complete: removed ${result.removedCount} messages, reduced from ${result.originalTokens} to ${result.compactedTokens} tokens`); | ||
|
|
||
| // Emit compaction event | ||
| config.onEvent?.(createCompactionEvent(result, state.runId as string, state.traceId as string)); | ||
| } | ||
|
|
||
| const currentAgent = config.agentRegistry.get(compactionState.currentAgentName); | ||
| if (!currentAgent) { |
There was a problem hiding this comment.
Compaction updates are computed into compactionState, but the function continues to use the original state for subsequent processing (events, messageCount, and presumably the LLM prompt later). As a result, the compaction has no effect on token usage and the emitted agent_processing event still reports the pre-compaction message list. Consider reassigning state to the compacted state (or consistently use compactionState for all downstream reads) so the trimmed message history is actually used.
|
|
||
| // Token-based compaction: check and trim if messages exceed limit | ||
| let compactionState = state; | ||
| if (shouldCompact(state.messages, defaultCompactionConfig)) { |
There was a problem hiding this comment.
We are always checking here whether to compact or not based on default hardcoded config , but ideally we should provide the developer consuming this framework to pass the values for this configuration, whether to enable of disable compaction and what limits to set , and the hardcoded values can be used as fallback
deterministic compaction when tokens exceeds 180_000 by trimming old tool pairs