feat(server): add memory health statistics API endpoints by mvanhorn · Pull Request #706 · volcengine/OpenViking

mvanhorn · 2026-03-17T14:16:21Z

Problem Statement

OpenViking has the infrastructure for memory observability (hotness_score, retrieval_stats, eval recorder) but no API to query aggregate memory health. Operators can't answer basic questions without digging into the database directly:

How many memories exist per category?
What's the hotness distribution? Are most memories going cold?
How much storage is consumed?
Which memories haven't been accessed in 30 days?

Proposed Solution

Add two API endpoints:

GET /stats/memories - Global memory statistics

{
  "total_memories": 1247,
  "by_category": {
    "profile": 3,
    "preferences": 42,
    "entities": 186,
    "events": 89,
    "cases": 412,
    "patterns": 67,
    "tools": 298,
    "skills": 150
  },
  "hotness_distribution": {
    "cold": 312,
    "warm": 687,
    "hot": 248
  },
  "staleness": {
    "not_accessed_7d": 89,
    "not_accessed_30d": 312,
    "oldest_memory_age_days": 45
  },
  "total_vectors": 8941
}

GET /sessions/{session_id}/stats - Per-session extraction stats

{
  "session_id": "abc123",
  "total_turns": 5,
  "memories_extracted": 3,
  "contexts_used": 2,
  "skills_used": 1
}

GET /stats/memories also supports a ?category=<name> query parameter (for example, ?category=cases) to filter by a single memory category.

Alternatives Considered

Extending the TUI only (#664) - but the TUI isn't programmatically accessible, and automated monitoring needs an API.

Implementation

openviking/storage/stats_aggregator.py - Core StatsAggregator class that queries VikingDB for category counts, hotness distribution (cold <0.2, warm 0.2-0.6, hot >0.6), and staleness metrics. Uses the existing hotness_score() function from memory_lifecycle.py.
openviking/server/routers/stats.py - FastAPI router with two endpoints, following the pattern in routers/sessions.py.
Router registered in app.py and routers/__init__.py following existing conventions.
No new dependencies required. No new storage introduced.
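The hotness thresholds listed above can be sketched as follows. This is a minimal illustration, not the PR's code: `bucket_for` and `hotness_distribution` are hypothetical names, and assigning the exact boundary values 0.2 and 0.6 to the warm bucket is an assumption read from the stated ranges (cold <0.2, warm 0.2-0.6, hot >0.6).

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_for(score: float) -> str:
    """Map a hotness score to a bucket label (hypothetical helper)."""
    if score < 0.2:
        return "cold"
    if score > 0.6:
        return "hot"
    # Assumption: 0.2 and 0.6 themselves count as warm.
    return "warm"


def hotness_distribution(scores: Iterable[float]) -> Dict[str, int]:
    """Count scores per bucket, always reporting all three buckets."""
    counts = Counter(bucket_for(s) for s in scores)
    return {name: counts.get(name, 0) for name in ("cold", "warm", "hot")}
```

Reporting all three keys even when a bucket is empty keeps the response shape stable for monitoring clients, which matches the example payload in the description.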

Evidence

Source	Evidence	Engagement
#640	Request-level trace metrics just merged - observability is active priority	Merged by zhoujh01
#529	Oversized prompts destabilize VLM calls - storage metrics would catch this	Closed (fixed)
#350	Ingestion/indexing decoupling - users need ingestion progress visibility	3 thumbs up
Reddit	"Why AI Coding Agents Waste Half Their Context Window" - demand for context visibility	56 upvotes, 40 comments
Discussion	Community requests for evaluation/observability module	Active discussion

Test Plan

Unit tests for StatsAggregator with mocked VikingDB (empty store, category counts, hotness buckets, staleness, error handling)
Unit tests for API router (response shape, invalid category validation, session stats, session not found)
_parse_datetime helper tested for None, datetime objects, ISO strings, invalid input
Integration test with live VikingDB (manual)
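For readers unfamiliar with the cases the _parse_datetime tests cover, here is a rough sketch of a helper with that behavior. The PR text does not show the real implementation, so the function below (`parse_datetime`, a stand-in name) is an assumption throughout: in particular, returning None for invalid input and treating naive timestamps as UTC are illustrative choices, not confirmed behavior.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def parse_datetime(value: Any) -> Optional[datetime]:
    """Best-effort conversion to a timezone-aware datetime; None if unparseable."""
    if value is None:
        return None
    if isinstance(value, datetime):
        dt = value
    elif isinstance(value, str):
        try:
            dt = datetime.fromisoformat(value)
        except ValueError:
            return None  # invalid ISO string
    else:
        return None  # unsupported type
    # Assumption: naive timestamps are interpreted as UTC.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt
```

Normalizing to timezone-aware values up front avoids mixing naive and aware datetimes when computing staleness ages later.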

Generated with Claude Code

Add two new API endpoints for querying aggregate memory health:
- GET /stats/memories - global memory stats (counts by category, hotness distribution, staleness metrics)
- GET /stats/sessions/{id} - per-session extraction statistics

The StatsAggregator reads from existing VikingDB indexes and the hotness_score function without introducing new storage. Includes unit tests with mocked VikingDB backend.

Replace the audio parser stub with a working implementation that:
- Extracts metadata (duration, sample rate, channels, bitrate) via mutagen
- Transcribes speech via Whisper API with timestamped segments
- Builds structured ResourceNode tree with L0/L1/L2 content tiers
- Falls back to metadata-only output when Whisper is unavailable
- Adds mutagen as optional dependency under [audio] extra
- Adds audio_summary prompt template for semantic indexing
- Includes unit tests with mocked Whisper API and mutagen

qin-ctx

[Bug] (blocking) This PR bundles two unrelated features — memory health stats API (commit 1) and a complete audio parser rewrite with Whisper transcription (commit 2). These have no code dependency and should be separate PRs. Mixing them makes review, rollback, and changelog tracking harder.

Additional findings:

[Design] (non-blocking) PR description says GET /sessions/{session_id}/stats but the actual route is GET /api/v1/stats/sessions/{session_id}. Please update the description to match.
[Design] (non-blocking) _asr_transcribe and _asr_transcribe_with_timestamps duplicate the OpenAI client creation code (get_openviking_config() + openai.AsyncOpenAI(...)). Extract to a shared helper. Also, config.llm.api_key may not be the correct credential for OpenAI Whisper if the project is configured for a different LLM provider.
[Suggestion] (non-blocking) audio_summary.yaml prompt template is added but never referenced in any code path — dead code.
[Suggestion] (non-blocking) _generate_semantic_info accepts a viking_fs parameter that is never used in the method body.
[Suggestion] (non-blocking) CI lint / lint check is failing.

qin-ctx · 2026-03-18T09:31:26Z

openviking/storage/stats_aggregator.py

+        total_vectors = 0
+
+        for cat in categories:
+            records = await self._query_memories_by_category(ctx, cat)


[Bug] (blocking) N+1 query: _query_memories_by_category executes the same Eq("context_type", "memory") query with limit=10000 for each of the 8 categories, then filters by URI prefix in Python. This means 8 identical DB round-trips, each returning up to 10,000 records.

Fetch once and group in memory instead:

all_records = await self._query_all_memories(ctx)
by_cat = defaultdict(list)
for r in all_records:
    uri = r.get("uri", "")
    for cat in categories:
        if f"/{cat}/" in uri:
            by_cat[cat].append(r)
            break
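A self-contained version of the same grouping idea, runnable without VikingDB, might look like this. The sample URIs are made up for illustration and `group_by_category` is a hypothetical name; only the category list and the "/{cat}/" substring match come from the thread above.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

# Category names taken from the example response in the PR description.
CATEGORIES = ("profile", "preferences", "entities", "events",
              "cases", "patterns", "tools", "skills")


def group_by_category(records: List[dict],
                      categories: Sequence[str] = CATEGORIES) -> Dict[str, List[dict]]:
    """Group already-fetched records by the first category found in their URI."""
    by_cat: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        uri = record.get("uri", "")
        for cat in categories:
            if f"/{cat}/" in uri:
                by_cat[cat].append(record)
                break  # a record belongs to at most one category
    return dict(by_cat)


# Hypothetical URIs; the real URI layout may differ.
sample = [
    {"uri": "viking://user/memories/cases/a1"},
    {"uri": "viking://user/memories/cases/a2"},
    {"uri": "viking://user/memories/tools/b1"},
    {"uri": "viking://user/other/c1"},  # matches no category, so it is skipped
    {},                                 # record without a uri field
]
grouped = group_by_category(sample)
```

Because the fetch happens once and grouping is a single pass over the records, the cost is one query plus a scan over the results, rather than eight identical queries.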

qin-ctx · 2026-03-18T09:31:26Z

openviking/storage/stats_aggregator.py

+            "by_category": by_category,
+            "hotness_distribution": hotness_dist,
+            "staleness": staleness,
+            "total_vectors": total_vectors,


[Bug] (blocking) total_vectors is always identical to total_memories — both are sum(by_category.values()). The PR description shows them as different numbers (1247 vs 8941), implying total_vectors should represent the actual vector embedding count (a memory can have multiple vectors). The current implementation is misleading.

Either compute the real vector count from VikingDB index stats, or remove this field until it can report a meaningful value.

qin-ctx · 2026-03-18T09:31:26Z

openviking/server/routers/stats.py

+    try:
+        result = await aggregator.get_session_extraction_stats(session_id, service, _ctx)
+        return Response(status="ok", result=result)
+    except Exception as e:


[Bug] (blocking) Catching bare Exception and returning NOT_FOUND swallows all error types — DB timeouts, permission errors, serialization failures, etc. are all misreported as "session not found".

Distinguish session-not-found from other failures:

try:
    result = await aggregator.get_session_extraction_stats(session_id, service, _ctx)
    return Response(status="ok", result=result)
except KeyError:
    return Response(
        status="error",
        error=ErrorInfo(code="NOT_FOUND", message=f"Session not found: {session_id}"),
    )
except Exception as e:
    logger.error("Failed to get session stats for %s: %s", session_id, e)
    return Response(
        status="error",
        error=ErrorInfo(code="INTERNAL", message="Internal error retrieving session stats"),
    )

(Adjust the specific exception type to match what session.load() actually raises for missing sessions.)
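The distinction can be exercised in isolation with a small stand-in. In this sketch, `load_stats`, `session_stats`, and the in-memory `_SESSIONS` dict are all hypothetical, plain dicts replace the project's Response and ErrorInfo types, and KeyError is assumed to be the missing-session signal, as in the suggestion above.

```python
import asyncio
import logging

logger = logging.getLogger("stats_sketch")

# Hypothetical in-memory store standing in for real session storage.
_SESSIONS = {"abc123": {"session_id": "abc123", "total_turns": 5}}


async def load_stats(session_id: str) -> dict:
    """Stand-in for the aggregator call; raises KeyError for unknown sessions."""
    if session_id == "boom":
        raise TimeoutError("simulated backend timeout")
    return _SESSIONS[session_id]


async def session_stats(session_id: str) -> dict:
    try:
        result = await load_stats(session_id)
        return {"status": "ok", "result": result}
    except KeyError:
        # Only a genuinely missing session maps to NOT_FOUND.
        return {"status": "error",
                "error": {"code": "NOT_FOUND",
                          "message": f"Session not found: {session_id}"}}
    except Exception:
        # Everything else is logged with its traceback and reported as internal.
        logger.exception("Failed to get session stats for %s", session_id)
        return {"status": "error",
                "error": {"code": "INTERNAL",
                          "message": "Internal error retrieving session stats"}}
```

Ordering matters here: the narrow `except KeyError` clause has to come before the broad `except Exception`, otherwise the broad handler would absorb the not-found case as well.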

The audio parser feature is unrelated to memory health stats and belongs in its own PR (volcengine#707). Reverts audio.py to pre-rewrite state, removes the unused audio_summary.yaml template and audio parser tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…or handling
- Replace per-category _query_memories_by_category with single _query_all_memories call, grouping by category in Python (1 DB round-trip instead of 8)
- Remove misleading total_vectors field (was identical to total_memories). Will add real vector count from VikingDB index stats in a follow-up
- Distinguish KeyError (session not found) from other failures in stats.py endpoint, returning INTERNAL_ERROR for unexpected exceptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mvanhorn · 2026-03-18T13:59:01Z

Addressed all feedback in 898cc8e and c0d13ad:

Reverted the audio parser commit from this branch (it belongs in #707, "feat(parse): implement audio resource parser with Whisper transcription")
Replaced per-category queries with a single _query_all_memories call that groups by category in Python (1 DB round-trip instead of 8)
Removed the misleading total_vectors field (was identical to total_memories)
Distinguished KeyError from other exceptions in the session stats endpoint, returning INTERNAL_ERROR for unexpected failures instead of NOT_FOUND

mvanhorn · 2026-03-18T22:06:04Z

Addressed all blocking feedback in 898cc8e and c0d13ad:

Split PR: Reverted audio parser changes from this branch. The audio parser feature lives in #707 ("feat(parse): implement audio resource parser with Whisper transcription") as a standalone PR.
N+1 query: Replaced per-category _query_memories_by_category (8 identical DB round-trips) with a single _query_all_memories call that fetches once and groups by URI prefix in Python.
total_vectors removed: Dropped the misleading field entirely. Will add real vector count from VikingDB index stats in a follow-up if useful.
Error handling: Replaced bare Exception catch with KeyError for session-not-found, returning INTERNAL_ERROR for unexpected failures.

mvanhorn added 2 commits March 17, 2026 07:15

github-project-automation bot added this to OpenViking project Mar 17, 2026

github-project-automation bot moved this to Backlog in OpenViking project Mar 17, 2026

style: format with ruff

9d63ee5

qin-ctx requested changes Mar 18, 2026

mvanhorn and others added 2 commits March 18, 2026 06:51

mvanhorn wants to merge 5 commits into volcengine:main from mvanhorn:feat/memory-health-stats-api
