
fix(metrics): count MCP retrieval modes instead of bucketing them as "unknown"#151

Merged
CGFixIT merged 1 commit into main from claude/cyclaw-optimization-review-xc121y-metrics-mode
Jun 21, 2026


@CGFixIT CGFixIT commented Jun 21, 2026

Owner

The bug

metrics.py builds its "Retrieval modes" breakdown from a single field:

mode_counts = Counter(e.get("retrieval_mode") or "unknown" for e in rag_queries)

But the two producers of RAG audit events disagree on the key name:

| Producer | Event | Mode key |
| --- | --- | --- |
| graph.audit_logger_node | rag_query | retrieval_mode |
| mcp_hybrid_server._handle_search | mcp_rag_query | mode |

rag_queries includes both event types (event in ("rag_query", "mcp_rag_query")), so every MCP query, which carries mode rather than retrieval_mode, fell through to "unknown". As a result, the mode breakdown under-reported hybrid/semantic/keyword and inflated unknown for any deployment that uses the MCP server.
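A minimal reproduction of the miscount, for reference. The event dicts below are invented for illustration; only the event names and the retrieval_mode / mode keys come from the description above.

```python
from collections import Counter

# Illustrative audit events (made-up values, real key names from this PR).
rag_queries = [
    {"event": "rag_query", "retrieval_mode": "hybrid"},    # graph path
    {"event": "rag_query", "retrieval_mode": "semantic"},  # graph path
    {"event": "mcp_rag_query", "mode": "hybrid"},          # MCP path
    {"event": "mcp_rag_query", "mode": "keyword"},         # MCP path
]

# Old expression: reads only "retrieval_mode", so both MCP events
# have no match and are bucketed as "unknown".
old_counts = Counter(e.get("retrieval_mode") or "unknown" for e in rag_queries)
print(dict(old_counts))
# {'hybrid': 1, 'semantic': 1, 'unknown': 2}
```

Both MCP events land in unknown even though each one records a valid mode.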

The fix

Fall back to e.get("mode") before "unknown":

mode_counts = Counter(
    e.get("retrieval_mode") or e.get("mode") or "unknown" for e in rag_queries
)

Score stats were already correct (both event shapes share top_score), so only the mode tally needed fixing.
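A sketch of how the fallback chain behaves on a few edge cases. mode_of is a hypothetical helper introduced only for this example (metrics.py inlines the expression), and the sample events are invented.

```python
from collections import Counter


def mode_of(event):
    """Return the retrieval mode for one audit event (hypothetical helper)."""
    # `or` skips any falsy value, so a missing key, None, and "" all fall through.
    return event.get("retrieval_mode") or event.get("mode") or "unknown"


events = [
    {"event": "rag_query", "retrieval_mode": "hybrid"},                    # graph path
    {"event": "mcp_rag_query", "mode": "keyword"},                         # MCP path
    {"event": "mcp_rag_query", "retrieval_mode": "", "mode": "semantic"},  # empty string falls through
    {"event": "rag_query", "retrieval_mode": None},                        # no usable mode
]

mode_counts = Counter(mode_of(e) for e in events)
print(dict(mode_counts))
# {'hybrid': 1, 'keyword': 1, 'semantic': 1, 'unknown': 1}
```

If an event ever carried both keys, retrieval_mode would win because it is checked first.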

Tests

metrics.py had no test file. Adds tests/test_metrics.py:

  • load_events: missing file → [], and malformed JSON lines skipped
  • print_metrics: the no-events message
  • score aggregation across both event shapes (avg/min/max)
  • regression: a mixed rag_query + mcp_rag_query log asserts hybrid: 2, semantic: 1, keyword: 1, and "unknown" not in out — this fails on the old retrieval_mode-only code.

Registered in the CI pytest list with --cov=metrics.
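For readers without the repo checked out, the regression case might look roughly like the sketch below. It is self-contained rather than a copy of tests/test_metrics.py: load_events here is a stand-in written from the behaviour described above (its real signature is an assumption), tally_modes is a hypothetical helper, and the log lines are invented.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_events(path):
    """Stand-in for metrics.load_events (assumed behaviour, not the real code)."""
    p = Path(path)
    if not p.exists():
        return []  # missing file -> no events
    events = []
    for line in p.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # skip malformed JSON lines
    return events


def tally_modes(events):
    """Hypothetical helper applying the fixed mode expression."""
    rag_queries = [e for e in events if e.get("event") in ("rag_query", "mcp_rag_query")]
    return Counter(
        e.get("retrieval_mode") or e.get("mode") or "unknown" for e in rag_queries
    )


def test_mixed_log_counts_mcp_modes():
    lines = [
        json.dumps({"event": "rag_query", "retrieval_mode": "hybrid"}),
        json.dumps({"event": "rag_query", "retrieval_mode": "semantic"}),
        json.dumps({"event": "mcp_rag_query", "mode": "hybrid"}),
        json.dumps({"event": "mcp_rag_query", "mode": "keyword"}),
        "{not json",  # malformed line, should be skipped
    ]
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "audit.jsonl"
        log.write_text("\n".join(lines) + "\n", encoding="utf-8")
        counts = tally_modes(load_events(log))

    assert counts["hybrid"] == 2
    assert counts["semantic"] == 1
    assert counts["keyword"] == 1
    assert "unknown" not in counts
```

With the old retrieval_mode-only expression substituted into tally_modes, the same log yields unknown: 2 and the final assertion fails.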

Risk

Low. One-line logic fix in a read-only reporting script (no effect on the request path), plus a new test file and two additive CI lines. New tests pass locally (5 passed).

🤖 Generated with Claude Code

https://claude.ai/code/session_01NXYYNSfqvBrAgghyNmzbHs



…own'

print_metrics() read the retrieval mode only from the 'retrieval_mode'
field, which the graph audit path writes. The MCP server
(mcp_hybrid_server._handle_search) records the same datum under 'mode'.
As a result every mcp_rag_query was silently counted as 'unknown' in the
'Retrieval modes' breakdown — under-reporting hybrid/semantic/keyword
usage for all MCP traffic.

Fix: fall back to e.get('mode') before 'unknown'. Score stats already
covered both event types via the shared 'top_score' field, so only the
mode tally was affected.

Adds tests/test_metrics.py (metrics.py previously had no test): covers
load_events (missing file, malformed-line skip), the no-events message,
score-stat aggregation, and the mixed graph/MCP mode-tally regression
(fails on the old 'retrieval_mode'-only code). Registered in the CI
pytest list with --cov=metrics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NXYYNSfqvBrAgghyNmzbHs
@CGFixIT CGFixIT marked this pull request as ready for review June 21, 2026 02:49
@CGFixIT CGFixIT merged commit b686682 into main Jun 21, 2026
15 checks passed