feat(bot): single-pass LLM extraction via inline XML tags #39
Merged
New module bot/extraction_parser.py parses inline XML tags (vm:entity, vm:relationship, vm:episode) from LLM responses. Validates entity types against NODE_TYPES, relationship types against EDGE_TYPES, clamps confidence to [0,1], skips self-edges and malformed tags gracefully. Add single_pass_extraction_enabled and extraction_confidence_threshold fields to LLMConfig.
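The validation rules in this commit message can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the members of NODE_TYPES and EDGE_TYPES, the helper names clamp_confidence and validate_relationship, the Relationship dataclass, and the fallback of 0.0 for an unparseable confidence are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Placeholder members: the real NODE_TYPES / EDGE_TYPES values are not listed in this PR text.
NODE_TYPES = frozenset({"person", "project", "concept"})
EDGE_TYPES = frozenset({"related_to", "works_on"})


def clamp_confidence(raw: str, default: float = 0.0) -> float:
    """Parse a confidence attribute and clamp it to [0, 1]."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default  # assumed fallback for unparseable values
    if math.isnan(value):
        return default
    return max(0.0, min(1.0, value))


@dataclass(frozen=True)
class Relationship:  # hypothetical stand-in for ExtractedRelationship
    source: str
    target: str
    type: str
    confidence: float


def validate_relationship(
    source: str, target: str, rel_type: str, confidence: str
) -> Optional[Relationship]:
    """Return a Relationship, or None when the tag should be skipped."""
    if rel_type not in EDGE_TYPES:
        return None  # unknown edge type
    if not source or not target:
        return None  # missing endpoint
    if source.strip() == target.strip():
        return None  # self-edge
    return Relationship(source, target, rel_type, clamp_confidence(confidence))
```

Returning None rather than raising keeps a bad tag from interrupting delivery of the response, which matches the "skips ... gracefully" wording above.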
Embed XML extraction directives in the thinking system prompt so a single LLM call produces both the user-facing response and structured entity/relationship tags. Tags are parsed post-generation via extraction_parser, filtered by confidence threshold, applied as graph updates, then stripped before the response is stored in session history. Extraction is optional in the prompt and non-blocking — if the LLM emits no tags or tags are malformed, the response is delivered normally.
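The confidence filter and the disabled-extraction passthrough described in this commit could look something like the sketch below. The two LLMConfig field names and defaults come from this PR; the Extracted dataclass, the select_updates helper, and the use of >= (rather than >) for the threshold comparison are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LLMConfig:
    # Field names and defaults as stated in this PR.
    single_pass_extraction_enabled: bool = True
    extraction_confidence_threshold: float = 0.7


@dataclass
class Extracted:  # hypothetical stand-in for an extracted entity or relationship
    label: str
    confidence: float


def select_updates(items: List[Extracted], config: LLMConfig) -> List[Extracted]:
    """Return the items that would be applied to the graph."""
    if not config.single_pass_extraction_enabled:
        return []  # extraction disabled: nothing is applied
    threshold = config.extraction_confidence_threshold
    # Assumption: items exactly at the threshold are kept.
    return [item for item in items if item.confidence >= threshold]


candidates = [Extracted("a", 0.9), Extracted("b", 0.7), Extracted("c", 0.4)]
kept = select_updates(candidates, LLMConfig())
print([item.label for item in kept])
```

An empty candidate list simply yields an empty result, so a response without tags takes the same code path as one whose tags were all rejected.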
23 tests across 3 classes: parser unit tests (16) covering valid/invalid entities, relationships, episodes, self-edges, malformed XML, and tag stripping; ExtractionResult defaults (2); integration tests (5) with real KnowledgeGraph verifying entity/relationship addition, confidence filtering, and disabled-extraction passthrough. Update fake configs in existing test files with new LLMConfig fields.
Summary
Eliminate redundant LLM calls by embedding entity and relationship extraction directives directly into the thinking partner's system prompt. A single LLM call now produces both the user-facing response and optional structured XML tags (<vm:entity>, <vm:relationship>, <vm:episode>). Tags are parsed post-generation, validated, applied as knowledge graph updates, and stripped before the response reaches the user.
Inspired by SivaRamSV/paaw's structured output extraction pattern, adapted for VaultMind's existing graph and LLM abstractions.
How It Works
The extraction is optional and secondary — the system prompt explicitly tells the LLM that tags are bonus output with no penalty for omission. Response quality remains the primary objective.
XML Tag Protocol
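As an illustration of the protocol, here is a minimal sketch that finds vm: tags with a regex, drops the prefix, and reads attributes with xml.etree.ElementTree, which is the approach the parser is described as using. The attribute names (name, type, confidence), the self-closing tag shape, and the helper names strip_namespace and parse_tags are assumptions; the exact tag schema is not shown in this text.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Matches self-closing or paired vm:entity / vm:relationship / vm:episode tags.
TAG_RE = re.compile(
    r"<vm:(entity|relationship|episode)\b[^>]*?(?:/>|>.*?</vm:\1>)",
    re.DOTALL,
)


def strip_namespace(tag_text: str) -> str:
    """Drop the vm: prefix; ElementTree rejects an unbound namespace prefix."""
    return re.sub(r"(</?)vm:", r"\1", tag_text)


def parse_tags(response: str) -> Tuple[List[Tuple[str, Dict[str, str]]], str]:
    """Return (parsed tags, response text with all tags removed)."""
    found = []
    for match in TAG_RE.finditer(response):
        try:
            element = ET.fromstring(strip_namespace(match.group(0)))
        except ET.ParseError:
            continue  # malformed tag: skip it, keep going
        found.append((element.tag, dict(element.attrib)))
    clean = TAG_RE.sub("", response).strip()
    return found, clean


reply = (
    "Ada leads the compiler work. "
    '<vm:entity name="Ada" type="person" confidence="0.9"/> '
    '<vm:entity name="Bad" type=person/>'  # unquoted attribute: malformed
)
tags, text = parse_tags(reply)
print(tags)
print(text)
```

In this sketch a malformed tag is dropped from the parsed results but still removed from the visible text, so the user never sees raw markup either way.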
Validation rules:
- Entity type validated against NODE_TYPES, relationship type against EDGE_TYPES
Changes
New Files
- src/vaultmind/bot/extraction_parser.py (243 lines): Stateless XML parser with 4 dataclasses (ExtractedEntity, ExtractedRelationship, ExtractedEpisode, ExtractionResult) and parse_extraction_tags() entry point. Uses regex for tag matching and xml.etree.ElementTree for attribute parsing. Includes _strip_namespace() helper to handle the vm: prefix before ET parsing. Module-level NODE_TYPES, EDGE_TYPES, EPISODE_STATUSES frozensets avoid circular imports.
- tests/test_extraction_parser.py (431 lines): 23 tests across 3 classes
Modified Files
- src/vaultmind/bot/thinking.py: Appended extraction instructions to THINKING_SYSTEM_PROMPT; added _apply_extraction() async method that parses tags, filters by confidence threshold, applies graph updates, and returns the clean response; updated think() to route the raw LLM response through _apply_extraction before storing it in session history
- src/vaultmind/config.py: Added single_pass_extraction_enabled: bool = True and extraction_confidence_threshold: float = 0.7 to LLMConfig
- config/default.toml: Added matching entries in the [llm] section
- tests/test_thinking.py: Updated _FakeLLMConfig with new fields (extraction disabled in existing tests)
- tests/test_session_summarization.py: Updated _FakeLLMConfig with new fields
Backward Compatibility
- _apply_extraction returns the raw response when single_pass_extraction_enabled=False
- Existing tests disable extraction in _FakeLLMConfig to avoid interference
Self-Reinforcing Feedback Loop
This feature creates a virtuous cycle with PR #37 (composite ranking):
Test plan
- test_extraction_parser.py: 23 tests across 3 classes
- ruff check: clean
- mypy --ignore-missing-imports: clean