feat(bot): single-pass LLM extraction via inline XML tags #39

Merged: Mathews-Tom merged 3 commits into main from feat/single-pass-extraction on Mar 25, 2026

Summary

Eliminate redundant LLM calls by embedding entity and relationship extraction directives directly into the thinking partner's system prompt. A single LLM call now produces both the user-facing response and optional structured XML tags (<vm:entity>, <vm:relationship>, <vm:episode>). Tags are parsed post-generation, validated, applied as knowledge graph updates, and stripped before the response reaches the user.

Inspired by SivaRamSV/paaw's structured output extraction pattern, adapted for VaultMind's existing graph and LLM abstractions.

How It Works

LLM response (with optional tags)
  → parse_extraction_tags()     # regex match + ET.fromstring validation
  → filter by confidence        # skip below threshold (default 0.7)
  → graph.add_entity()          # for each valid entity
  → graph.add_relationship()    # for each valid relationship
  → graph.save()                # persist if anything was extracted
  → return clean_response       # all vm: tags stripped

The extraction is optional and secondary — the system prompt explicitly tells the LLM that tags are bonus output with no penalty for omission. Response quality remains the primary objective.
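The post-parse half of the flow above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Entity`, `Relationship`, `ParsedResponse`, `InMemoryGraph`, and the synchronous `apply_extraction` function are hypothetical stand-ins (the real `_apply_extraction` is an async method on the thinking partner, and the real graph method signatures are not shown in this PR). It also assumes that `save()` is called only when at least one item passed the threshold.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:  # hypothetical stand-in for ExtractedEntity
    name: str
    entity_type: str
    confidence: float = 1.0


@dataclass
class Relationship:  # hypothetical stand-in for ExtractedRelationship
    source: str
    target: str
    rel_type: str
    confidence: float = 1.0


@dataclass
class ParsedResponse:  # hypothetical stand-in for ExtractionResult
    clean_text: str
    entities: list = field(default_factory=list)
    relationships: list = field(default_factory=list)


class InMemoryGraph:
    """Toy graph; the method signatures are assumptions for illustration."""

    def __init__(self):
        self.nodes = {}
        self.edges = []
        self.save_calls = 0

    def add_entity(self, name, entity_type):
        self.nodes[name] = entity_type

    def add_relationship(self, source, target, rel_type):
        self.edges.append((source, target, rel_type))

    def save(self):
        self.save_calls += 1


def apply_extraction(raw_response, parsed, graph, *, enabled=True, threshold=0.7):
    """Apply items at or above the threshold, then return the tag-free text."""
    if not enabled:
        return raw_response  # disabled: pass the raw response through untouched
    applied = 0
    for entity in parsed.entities:
        if entity.confidence >= threshold:
            graph.add_entity(entity.name, entity.entity_type)
            applied += 1
    for rel in parsed.relationships:
        if rel.confidence >= threshold:
            graph.add_relationship(rel.source, rel.target, rel.rel_type)
            applied += 1
    if applied:
        graph.save()  # assumption: persist only when something was applied
    return parsed.clean_text
```

Keeping the filter inside the apply step, rather than in the parser, means the parser can stay stateless and the threshold can be tuned from configuration without re-parsing.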

XML Tag Protocol

<!-- Entities: people, projects, tools, concepts, organizations, events, locations -->
<vm:entity name="Obsidian" type="tool" confidence="0.95">Note-taking app</vm:entity>

<!-- Relationships: directed edges between entities -->
<vm:relationship from="Obsidian" to="VaultMind" type="part_of" confidence="0.80" />

<!-- Episodes: decision-outcome pairs for episodic memory -->
<vm:episode decision="Adopt extraction" context="Reduce LLM calls" status="pending">
  <lesson>No quality degradation observed</lesson>
  <entity>VaultMind</entity>
</vm:episode>
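As an illustration of how an episode tag with child elements might be read, here is a minimal sketch using a regex match followed by `xml.etree.ElementTree`. The `Episode` dataclass, the `parse_episodes` name, the contents of `EPISODE_STATUSES` beyond `pending`, the default status when the attribute is omitted, and the choice to skip episodes with an unknown status are all assumptions made for this example; the PR's parser may behave differently.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Assumed values; only "pending" appears in the PR description.
EPISODE_STATUSES = frozenset({"pending", "success", "failure"})

# Matches both self-closing and paired vm:episode tags.
_EPISODE_RE = re.compile(r"<vm:episode\b[^>]*?(?:/>|>.*?</vm:episode>)", re.DOTALL)


@dataclass
class Episode:  # hypothetical stand-in for ExtractedEpisode
    decision: str
    context: str = ""
    status: str = "pending"
    lessons: list = field(default_factory=list)
    entities: list = field(default_factory=list)


def parse_episodes(text):
    """Return the valid episodes found in text, skipping anything malformed."""
    episodes = []
    for match in _EPISODE_RE.finditer(text):
        # ElementTree rejects an undeclared "vm:" prefix, so rewrite it first.
        xml = re.sub(r"(</?)vm:", r"\1vm_", match.group(0))
        try:
            elem = ET.fromstring(xml)
        except ET.ParseError:
            continue  # malformed XML: skip the tag
        decision = (elem.get("decision") or "").strip()
        status = (elem.get("status") or "pending").strip()  # assumed default
        if not decision or status not in EPISODE_STATUSES:
            continue  # missing required attribute or unknown status
        episodes.append(
            Episode(
                decision=decision,
                context=(elem.get("context") or "").strip(),
                status=status,
                lessons=[(c.text or "").strip() for c in elem.findall("lesson")],
                entities=[(c.text or "").strip() for c in elem.findall("entity")],
            )
        )
    return episodes
```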

Validation rules:

  • Entity type validated against NODE_TYPES, relationship type against EDGE_TYPES
  • Confidence parsed as float, clamped to [0.0, 1.0], defaults to 1.0 if omitted
  • Self-edges (from == to) silently skipped
  • Missing required attributes (name, from/to, decision) → tag skipped with debug log
  • Malformed XML → tag skipped gracefully; never blocks response delivery
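The validation rules above could be expressed along these lines. This is a sketch under stated assumptions, not the merged parser: `parse_tags` and the tuple return shape are hypothetical, the `NODE_TYPES` and `EDGE_TYPES` members are guesses based on the examples in this description, and treating an unparseable `confidence` value as a malformed tag is an assumption the PR does not confirm.

```python
import logging
import re
import xml.etree.ElementTree as ET

logger = logging.getLogger(__name__)

# Assumed members; the real frozensets live in extraction_parser.py.
NODE_TYPES = frozenset(
    {"person", "project", "tool", "concept", "organization", "event", "location"}
)
EDGE_TYPES = frozenset({"part_of", "related_to", "depends_on"})

# Matches self-closing or paired vm:entity / vm:relationship tags.
_TAG_RE = re.compile(
    r"<vm:(entity|relationship)\b[^>]*?(?:/>|>.*?</vm:\1>)", re.DOTALL
)


def _confidence(raw):
    """Default to 1.0 when omitted, otherwise clamp to [0.0, 1.0]."""
    if raw is None:
        return 1.0
    return max(0.0, min(1.0, float(raw)))


def parse_tags(text):
    """Return (entities, relationships, clean_text) for one LLM response."""
    entities, relationships = [], []
    for match in _TAG_RE.finditer(text):
        try:
            # Rewrite the undeclared "vm:" prefix so ElementTree accepts the tag.
            elem = ET.fromstring(re.sub(r"(</?)vm:", r"\1vm_", match.group(0)))
            confidence = _confidence(elem.get("confidence"))
        except (ET.ParseError, ValueError):
            logger.debug("Skipping malformed tag: %r", match.group(0))
            continue
        if match.group(1) == "entity":
            name, kind = elem.get("name"), elem.get("type")
            if not name or kind not in NODE_TYPES:
                logger.debug("Skipping invalid entity: %r", match.group(0))
                continue
            entities.append((name, kind, confidence))
        else:
            src, dst, kind = elem.get("from"), elem.get("to"), elem.get("type")
            if not src or not dst or src == dst or kind not in EDGE_TYPES:
                logger.debug("Skipping invalid relationship: %r", match.group(0))
                continue
            relationships.append((src, dst, kind, confidence))
    # Every matched tag is stripped, whether or not it validated.
    clean = _TAG_RE.sub("", text).strip()
    return entities, relationships, clean
```

Because every failure path ends in `continue`, a bad tag can only reduce what is extracted; it cannot raise out of the parser or delay the response.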

Changes

New Files

  • src/vaultmind/bot/extraction_parser.py (243 lines) — Stateless XML parser with 4 dataclasses (ExtractedEntity, ExtractedRelationship, ExtractedEpisode, ExtractionResult) and parse_extraction_tags() entry point. Uses regex for tag matching and xml.etree.ElementTree for attribute parsing. Includes _strip_namespace() helper to handle vm: prefix before ET parsing. Module-level NODE_TYPES, EDGE_TYPES, EPISODE_STATUSES frozensets avoid circular imports
  • tests/test_extraction_parser.py (431 lines) — 23 tests across 3 classes

Modified Files

  • src/vaultmind/bot/thinking.py — Appended extraction instructions to THINKING_SYSTEM_PROMPT; added _apply_extraction() async method that parses tags, filters by confidence threshold, applies graph updates, and returns clean response; updated think() to route raw LLM response through _apply_extraction before storing in session history
  • src/vaultmind/config.py — Added single_pass_extraction_enabled: bool = True and extraction_confidence_threshold: float = 0.7 to LLMConfig
  • config/default.toml — Added matching entries in [llm] section
  • tests/test_thinking.py — Updated _FakeLLMConfig with new fields (extraction disabled in existing tests)
  • tests/test_session_summarization.py — Updated _FakeLLMConfig with new fields

Backward Compatibility

  • Extraction is enabled by default but fully optional — if the LLM emits no tags, the response passes through unchanged
  • _apply_extraction returns raw response when single_pass_extraction_enabled=False
  • Existing tests disable extraction via _FakeLLMConfig to avoid interference
  • No changes to graph schema, session schema, or public API surfaces

Self-Reinforcing Feedback Loop

This feature creates a virtuous cycle with PR #37 (composite ranking):

  1. Thinking sessions extract entities → graph becomes denser
  2. Denser graph → higher connection density scores in ranking
  3. Better ranking → more relevant context in future thinking sessions
  4. More relevant context → richer responses with more entities to extract

Test plan

  • 23 new tests in test_extraction_parser.py:
    • Parser unit tests (16): valid/invalid entities, relationships, episodes, confidence clamping, self-edges, malformed XML, self-closing tags, tag stripping, mixed content, no-tags passthrough
    • ExtractionResult defaults (2): empty initialization, newline collapsing
    • Integration tests (5): entity/relationship graph addition, clean response, confidence filtering, disabled passthrough
  • All existing tests pass with updated fake configs (extraction disabled)
  • Full suite: 848/848 tests pass, 0 regressions
  • ruff check — clean
  • mypy --ignore-missing-imports — clean
  • Manual: verify extraction triggers on real thinking sessions with entity-rich topics
  • Manual: A/B compare response quality with/without extraction instructions

New module bot/extraction_parser.py parses inline XML tags (vm:entity,
vm:relationship, vm:episode) from LLM responses. Validates entity types
against NODE_TYPES, relationship types against EDGE_TYPES, clamps
confidence to [0,1], skips self-edges and malformed tags gracefully.

Add single_pass_extraction_enabled and extraction_confidence_threshold
fields to LLMConfig.
Embed XML extraction directives in the thinking system prompt so a
single LLM call produces both the user-facing response and structured
entity/relationship tags. Tags are parsed post-generation via
extraction_parser, filtered by confidence threshold, applied as graph
updates, then stripped before the response is stored in session history.

Extraction is optional in the prompt and non-blocking — if the LLM
emits no tags or tags are malformed, the response is delivered normally.
23 tests across 3 classes: parser unit tests (16) covering valid/invalid
entities, relationships, episodes, self-edges, malformed XML, and tag
stripping; ExtractionResult defaults (2); integration tests (5) with
real KnowledgeGraph verifying entity/relationship addition, confidence
filtering, and disabled-extraction passthrough.

Update fake configs in existing test files with new LLMConfig fields.

Mathews-Tom merged commit 5e43a04 into main on Mar 25, 2026
3 checks passed
Mathews-Tom deleted the feat/single-pass-extraction branch on March 25, 2026 11:44