feat(bot): single-pass LLM extraction via inline XML tags by Mathews-Tom · Pull Request #39 · Mathews-Tom/VaultMind

Mathews-Tom · 2026-03-25T11:19:57Z

Summary

Eliminate redundant LLM calls by embedding entity and relationship extraction directives directly into the thinking partner's system prompt. A single LLM call now produces both the user-facing response and optional structured XML tags (<vm:entity>, <vm:relationship>, <vm:episode>). Tags are parsed post-generation, validated, applied as knowledge graph updates, and stripped before the response reaches the user.

Inspired by SivaRamSV/paaw's structured output extraction pattern, adapted for VaultMind's existing graph and LLM abstractions.

How It Works

LLM response (with optional tags)
  → parse_extraction_tags()     # regex match + ET.fromstring validation
  → filter by confidence        # skip below threshold (default 0.7)
  → graph.add_entity()          # for each valid entity
  → graph.add_relationship()    # for each valid relationship
  → graph.save()                # persist if anything was extracted
  → return clean_response       # all vm: tags stripped
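
The flow above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `TAG_RE`, `InMemoryGraph`, and `apply_extraction` are hypothetical names, the graph method signatures are assumed, and episodes are stripped but not applied.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative pattern: matches paired or self-closing vm: tags.
TAG_RE = re.compile(
    r"<vm:(entity|relationship|episode)\b[^>]*?(?:/>|>.*?</vm:\1>)", re.DOTALL
)


class InMemoryGraph:
    """Hypothetical stand-in for the knowledge graph; real signatures may differ."""

    def __init__(self):
        self.entities = {}
        self.edges = []
        self.saved = False

    def add_entity(self, name, entity_type):
        self.entities[name] = entity_type

    def add_relationship(self, source, target, edge_type):
        self.edges.append((source, target, edge_type))

    def save(self):
        self.saved = True


def apply_extraction(raw, graph, threshold=0.7):
    """Parse tags, apply confident ones to the graph, return the tag-free text."""
    changed = False
    for match in TAG_RE.finditer(raw):
        # ElementTree rejects the undeclared vm: prefix, so drop it first.
        xml = match.group(0).replace("<vm:", "<").replace("</vm:", "</")
        try:
            elem = ET.fromstring(xml)
            confidence = float(elem.get("confidence", "1.0"))
        except (ET.ParseError, ValueError):
            continue  # malformed tag: skip it, never block the response
        if confidence < threshold:
            continue
        if elem.tag == "entity" and elem.get("name"):
            graph.add_entity(elem.get("name"), elem.get("type"))
            changed = True
        elif elem.tag == "relationship" and elem.get("from") and elem.get("to"):
            graph.add_relationship(elem.get("from"), elem.get("to"), elem.get("type"))
            changed = True
    if changed:
        graph.save()  # persist only if something was extracted
    return TAG_RE.sub("", raw).strip()


graph = InMemoryGraph()
raw_response = (
    "Obsidian fits this workflow well.\n"
    '<vm:entity name="Obsidian" type="tool" confidence="0.95">Note-taking app</vm:entity>\n'
    '<vm:entity name="Maybe" type="concept" confidence="0.4">Uncertain</vm:entity>\n'
    '<vm:relationship from="Obsidian" to="VaultMind" type="part_of" confidence="0.80" />'
)
clean_response = apply_extraction(raw_response, graph)
```

In this sketch the low-confidence entity is dropped by the threshold check, and the user-facing text carries no trace of the tags.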

The extraction is optional and secondary — the system prompt explicitly tells the LLM that tags are bonus output with no penalty for omission. Response quality remains the primary objective.

XML Tag Protocol

<!-- Entities: people, projects, tools, concepts, organizations, events, locations -->
<vm:entity name="Obsidian" type="tool" confidence="0.95">Note-taking app</vm:entity>

<!-- Relationships: directed edges between entities -->
<vm:relationship from="Obsidian" to="VaultMind" type="part_of" confidence="0.80" />

<!-- Episodes: decision-outcome pairs for episodic memory -->
<vm:episode decision="Adopt extraction" context="Reduce LLM calls" status="pending">
  <lesson>No quality degradation observed</lesson>
  <entity>VaultMind</entity>
</vm:episode>

Validation rules:

Entity type validated against NODE_TYPES, relationship type against EDGE_TYPES
Confidence parsed as float, clamped to [0.0, 1.0], defaults to 1.0 if omitted
Self-edges (from == to) silently skipped
Missing required attributes (name, from/to, decision) → tag skipped with debug log
Malformed XML → tag skipped gracefully; never blocks response delivery
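
The rules above might look something like the sketch below. The type sets are illustrative subsets rather than the real NODE_TYPES and EDGE_TYPES, the helper names are hypothetical, and two behaviours are assumptions the description does not confirm: an unparseable confidence causes the tag to be skipped, and a missing entity type is treated as invalid.

```python
from typing import Optional

# Illustrative subsets only; the real frozensets live in extraction_parser.py.
NODE_TYPES = frozenset({"person", "project", "tool", "concept"})
EDGE_TYPES = frozenset({"part_of", "related_to"})


def parse_confidence(raw: Optional[str]) -> Optional[float]:
    """Clamp to [0.0, 1.0]; default to 1.0 if omitted; None if unparseable (assumed)."""
    if raw is None:
        return 1.0
    try:
        value = float(raw)
    except ValueError:
        return None
    return max(0.0, min(1.0, value))


def is_valid_entity(attrs: dict) -> bool:
    """An entity needs a name and a type listed in NODE_TYPES."""
    return bool(attrs.get("name")) and attrs.get("type") in NODE_TYPES


def is_valid_relationship(attrs: dict) -> bool:
    """A relationship needs distinct endpoints and a type listed in EDGE_TYPES."""
    source, target = attrs.get("from"), attrs.get("to")
    if not source or not target:
        return False  # missing required attribute
    if source == target:
        return False  # self-edge, silently skipped
    return attrs.get("type") in EDGE_TYPES
```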

Changes

New Files

src/vaultmind/bot/extraction_parser.py (243 lines) — Stateless XML parser with 4 dataclasses (ExtractedEntity, ExtractedRelationship, ExtractedEpisode, ExtractionResult) and parse_extraction_tags() entry point. Uses regex for tag matching and xml.etree.ElementTree for attribute parsing. Includes _strip_namespace() helper to handle vm: prefix before ET parsing. Module-level NODE_TYPES, EDGE_TYPES, EPISODE_STATUSES frozensets avoid circular imports
tests/test_extraction_parser.py (431 lines) — 23 tests across 3 classes
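
One likely reason for stripping the prefix is that xml.etree.ElementTree refuses to parse a tag whose namespace prefix is never declared. The sketch below illustrates this. `strip_namespace` is a hypothetical analogue of the `_strip_namespace()` helper, whose actual implementation is not shown in this description.

```python
import re
import xml.etree.ElementTree as ET


def strip_namespace(tag_xml: str) -> str:
    """Drop the vm: prefix from opening and closing tag names."""
    return re.sub(r"<(/?)vm:", r"<\1", tag_xml)


raw_tag = '<vm:entity name="Obsidian" type="tool" confidence="0.95">Note-taking app</vm:entity>'

try:
    ET.fromstring(raw_tag)
    prefixed_parses = True
except ET.ParseError:
    # The vm: prefix is not bound to any namespace URI, so the parser rejects it.
    prefixed_parses = False

# With the prefix removed, attributes and text are available as usual.
elem = ET.fromstring(strip_namespace(raw_tag))
attributes = dict(elem.attrib)
```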

Modified Files

src/vaultmind/bot/thinking.py — Appended extraction instructions to THINKING_SYSTEM_PROMPT; added _apply_extraction() async method that parses tags, filters by confidence threshold, applies graph updates, and returns clean response; updated think() to route raw LLM response through _apply_extraction before storing in session history
src/vaultmind/config.py — Added single_pass_extraction_enabled: bool = True and extraction_confidence_threshold: float = 0.7 to LLMConfig
config/default.toml — Added matching entries in [llm] section
tests/test_thinking.py — Updated _FakeLLMConfig with new fields (extraction disabled in existing tests)
tests/test_session_summarization.py — Updated _FakeLLMConfig with new fields

Backward Compatibility

Extraction is enabled by default but fully optional — if the LLM emits no tags, the response passes through unchanged
_apply_extraction returns the raw response unchanged when single_pass_extraction_enabled=False
Existing tests disable extraction via _FakeLLMConfig to avoid interference
No changes to graph schema, session schema, or public API surfaces
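
The opt-out path can be pictured with a small sketch. The two field names and defaults come from the description above; `LLMConfigSketch`, `apply_or_passthrough`, and the `extract` callable are hypothetical and only stand in for the real LLMConfig and _apply_extraction.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LLMConfigSketch:
    """Hypothetical stand-in carrying only the two new LLMConfig fields."""

    single_pass_extraction_enabled: bool = True
    extraction_confidence_threshold: float = 0.7


def apply_or_passthrough(
    raw: str,
    config: LLMConfigSketch,
    extract: Callable[[str, float], str],
) -> str:
    """Return raw untouched when extraction is off; otherwise delegate to extract."""
    if not config.single_pass_extraction_enabled:
        return raw
    return extract(raw, config.extraction_confidence_threshold)


calls = []


def fake_extract(raw: str, threshold: float) -> str:
    """Test double that records its arguments and returns a marker string."""
    calls.append((raw, threshold))
    return "cleaned"


disabled = LLMConfigSketch(single_pass_extraction_enabled=False)
untouched = apply_or_passthrough("<vm:entity name='X'/> hi", disabled, fake_extract)
processed = apply_or_passthrough("hello", LLMConfigSketch(), fake_extract)
```

Existing tests that set the flag to False, as the fake configs do, would see the same strings they saw before this change.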

Self-Reinforcing Feedback Loop

This feature is intended to create a virtuous cycle with PR #37 (composite ranking):

Thinking sessions extract entities → graph becomes denser
Denser graph → higher connection density scores in ranking
Better ranking → more relevant context in future thinking sessions
More relevant context → richer responses with more entities to extract

Test plan

23 new tests in test_extraction_parser.py:
- Parser unit tests (16): valid/invalid entities, relationships, episodes, confidence clamping, self-edges, malformed XML, self-closing tags, tag stripping, mixed content, no-tags passthrough
- ExtractionResult defaults (2): empty initialization, newline collapsing
- Integration tests (5): entity/relationship graph addition, clean response, confidence filtering, disabled passthrough
All existing tests pass with updated fake configs (extraction disabled)
Full suite: 848/848 tests pass, 0 regressions
ruff check — clean
mypy --ignore-missing-imports — clean
Manual: verify extraction triggers on real thinking sessions with entity-rich topics
Manual: A/B compare response quality with/without extraction instructions

New module bot/extraction_parser.py parses inline XML tags (vm:entity, vm:relationship, vm:episode) from LLM responses. Validates entity types against NODE_TYPES, relationship types against EDGE_TYPES, clamps confidence to [0,1], skips self-edges and malformed tags gracefully. Add single_pass_extraction_enabled and extraction_confidence_threshold fields to LLMConfig.

Embed XML extraction directives in the thinking system prompt so a single LLM call produces both the user-facing response and structured entity/relationship tags. Tags are parsed post-generation via extraction_parser, filtered by confidence threshold, applied as graph updates, then stripped before the response is stored in session history. Extraction is optional in the prompt and non-blocking — if the LLM emits no tags or tags are malformed, the response is delivered normally.

23 tests across 3 classes: parser unit tests (16) covering valid/invalid entities, relationships, episodes, self-edges, malformed XML, and tag stripping; ExtractionResult defaults (2); integration tests (5) with real KnowledgeGraph verifying entity/relationship addition, confidence filtering, and disabled-extraction passthrough. Update fake configs in existing test files with new LLMConfig fields.

Mathews-Tom added 3 commits March 25, 2026 16:38

Mathews-Tom merged commit 5e43a04 into main Mar 25, 2026
3 checks passed

Mathews-Tom deleted the feat/single-pass-extraction branch March 25, 2026 11:44
