Soulfield Codebase Analysis (Comprehensive)

1. Executive Summary

Soulfield is a multi-language cognitive architecture for knowledge management and self-improving, autonomous AI agent workflows. It is not a monolithic application but a hierarchical ecosystem of services with a clear separation of concerns, built on the Model-Context-Protocol (MCP), which lets it integrate a wide range of internal and external tools.

Its architecture can be summarized in six primary domains:

  1. Structured Agent & Workflow Definitions: A declarative framework using JSON configuration and Pydantic models (Agent, AgenticWorkflow, DSPyWorkflow) to define the properties, state, and reasoning structures of the entire agentic system.
  2. Strategic Orchestration (@governor agent): A top-level meta-cognitive agent that analyzes system-wide performance, directs learning, and initiates high-level workflows.
  3. Tactical Orchestration & Execution (WorkflowEngine, council.js): A control layer that executes predefined sequences of agents and manages the detailed lifecycle of each agent task.
  4. Autonomous Quality Control (RAGSwitch): A self-correction mechanism that uses graph-based analysis to critique agent outputs and queries external knowledge sources to fill logical gaps.
  5. Knowledge & Data Layer: Multiple graph backends (SQLite, Amazon Neptune, NetworkX) and a modular pipeline for data ingestion.
  6. Python AI Engine & MCP: A powerful toolchain for managing and executing AI models (including DSPy) and a decentralized protocol (MCP) for integrating a wide range of external and internal tools.

The system is built for resilience, introspection, and metaprogramming, allowing it to reason about and strategically improve its own components over time.

Table of Contents

  1. Executive Summary
  2. The Agentic Architecture: A 3-Layer Hierarchy
  3. Autonomous Quality Control & Learning
  4. The Knowledge & Data Layer
  5. The Model-Context-Protocol (MCP) & Python AI Engine
  6. Appendix A: Agent System Breakdown
  7. Appendix B: Data Flow Diagrams
  8. Appendix C: Key Module Reference
  9. Appendix D: Lens Framework Complete Reference
  10. Appendix E: Configuration & Environment Reference
  11. Appendix F: MCP Integration Complete Map
  12. Appendix G: Test Coverage Map
  13. Appendix H: Google Workspace Integration
  14. Appendix I: Training Data & DSPy System
  15. Appendix J: Gap Analysis & System Health
  16. Appendix K: Glossary
  17. Appendix L: Developer Quick Start Guide
  18. Appendix M: Known Technical Debt (Here Be Dragons)
  19. Appendix N: Agent Output Templates
  20. Appendix O: Workspace & Data Structure Map

2. The Agentic Architecture: A 3-Layer Hierarchy

Soulfield's core is its advanced agentic system, organized into a clear three-layer hierarchy of command and control. A detailed breakdown of each agent's capabilities can be found in Appendix A.

2.1. Layer 1: The @governor - Strategic Orchestrator

The @governor (backend/agents/handlers/governor.cjs) is the top-level meta-cognitive agent, responsible for observing and directing the entire ecosystem. It analyzes performance data from all other agents, identifies system-wide patterns using graph reasoning, and generates concrete proposals for improving other agents, which closes the system's main feedback loop.

2.2. Layer 2: The WorkflowEngine - Tactical Executor

This Node.js service (backend/orchestration/workflow-engine.cjs) executes the high-level AgenticWorkflow definitions, running agent compositions using formal patterns like runSequential (Prompt Chaining) and runParallel.
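
The two composition patterns can be sketched as follows. This is a minimal illustration rather than the actual WorkflowEngine API: the `agents` map and the signatures of `runSequential` and `runParallel` shown here are assumptions, and real agent calls would go through council.js instead of these stubs.

```javascript
// Hypothetical stand-ins for agent calls; each just wraps its input in a label.
const agents = {
  strategy: async (input) => `strategy(${input})`,
  marketing: async (input) => `marketing(${input})`,
  finance: async (input) => `finance(${input})`,
};

// Prompt chaining: each agent receives the previous agent's output.
async function runSequential(agentIds, initialInput) {
  let current = initialInput;
  for (const id of agentIds) {
    current = await agents[id](current);
  }
  return current;
}

// Parallel fan-out: every agent receives the same input; results are keyed by agent id.
async function runParallel(agentIds, input) {
  const outputs = await Promise.all(agentIds.map((id) => agents[id](input)));
  return Object.fromEntries(agentIds.map((id, i) => [id, outputs[i]]));
}
```

With these stubs, `runSequential(['strategy', 'marketing'], 'brief')` resolves to `'marketing(strategy(brief))'`, while `runParallel` returns one independent result per agent.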

2.3. Layer 3: The Council & Agent Handlers - Execution Lifecycle

The council.js module is the central controller that manages the detailed execution of any single agent task. A single step in an AgenticWorkflow is handled here, which may in turn trigger a low-level DSPyWorkflow. The council manages the full cognitive cycle: memory recall, context building, tool use, validation, and learning capture.

  • Anatomy of a Worker Agent (Example: @marketing): The handler for the @marketing agent (backend/agents/handlers/marketing.cjs) reveals the complete cognitive cycle of a specialized agent: memory recall, context building, reasoning, self-validation via the Lens Framework, action/tool use (AFS, Google Calendar), self-analysis via graph reasoning, and performance logging to the central learning loop.

3. Autonomous Quality Control & Learning

Soulfield's quality assurance framework moves beyond simple validation to active, automated correction.

3.1. The Lens Framework: AI Quality Assurance

A sophisticated validation layer (LensOrchestrator, LensMiddleware) inspects agent outputs using pipelines of "Lenses" (truth, causality, rights, etc.) to enforce quality, consistency, and safety with agent-specific enforcement modes.

3.2. The RAG Switch: Graph-Driven Self-Correction

The RAGSwitch (backend/services/rag-switch.js) acts as an autonomous "expert reviewer" of an agent's work. It uses graph-based analysis (StructureLensV2) to assess the logical structure of an agent's response. If it detects "structural gaps" or "echo chambers," it generates new, highly targeted queries for an external RAG provider (like Perplexity) to find the specific information needed to fix the flaws in its own reasoning.


4. The Knowledge & Data Layer

The knowledge layer combines several graph backends (SQLite, Amazon Neptune, NetworkX for queries, Graphviz for introspection) with a modular pipeline for ingesting diverse data types.

Key Architecture (Schema V2 - Nov 2025): The system recently migrated to a "Per-Document Mention" schema (V2), moving away from global singletons. This architecture tracks entity occurrences within specific documents (entity_mentions table) alongside unique concepts (global_entities table). This shift unlocked massive cross-document linking capabilities (195,651 relationships detected) and enables precise attribution of knowledge sources.
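
The per-document mention idea can be illustrated with a small in-memory model. Only the table names entity_mentions and global_entities come from the description above; the field names, the name normalization, and both helper functions are hypothetical and do not reflect the actual SQLite schema.

```javascript
// In-memory stand-ins for the two tables (field names are assumptions).
const globalEntities = new Map(); // normalized name -> { id, name }
const entityMentions = [];        // one row per (entity, document) occurrence

// Record that `name` occurs in document `docId`, creating the concept if needed.
function recordMention(name, docId) {
  const key = name.trim().toLowerCase();
  if (!globalEntities.has(key)) {
    globalEntities.set(key, { id: globalEntities.size + 1, name: name.trim() });
  }
  const entity = globalEntities.get(key);
  entityMentions.push({ entityId: entity.id, docId });
  return entity.id;
}

// Attribution query: which documents mention this concept?
function documentsMentioning(name) {
  const entity = globalEntities.get(name.trim().toLowerCase());
  if (!entity) return [];
  const docs = entityMentions
    .filter((m) => m.entityId === entity.id)
    .map((m) => m.docId);
  return [...new Set(docs)];
}

recordMention('Unit Economics', 'doc-finance-01');
recordMention('unit economics', 'doc-marketing-07');
recordMention('SWOT', 'doc-strategy-02');
```

Because mentions are stored per document, the two documents that share "Unit Economics" become linkable while each occurrence stays attributable to its source.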

System Performance (Verified Nov 2025)

  • Avg Query Time: 120ms
  • Cache Hit Rate: 85%
  • Memory Usage: ~32MB (Lazy-loaded embeddings)

KG Performance Metrics

Live statistics from mcp__soulfield_kg__getStats and mcp__soulfield_kg__getPerformance:

| Metric | Value | Notes |
| --- | --- | --- |
| Documents | 655 | Total indexed documents |
| Entities | 7,829 | Unique concepts extracted |
| Relationships | 195,651 | Cross-document connections |
| Communities | 212 | Detected topic clusters |
| Graph Density | 0.32% | Connection saturation |
| Summaries | 1,965 | Document-level summaries |
| Avg Query Time | 120ms | Typical search latency |
| Cache Hit Rate | 85% | Query cache efficiency |
| Memory Usage | 26MB | Runtime footprint |
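
The reported graph density is consistent with the directed-graph formula E / (N × (N − 1)) applied to the entity and relationship counts above. That reading is an inference from the numbers, not a statement about how getStats computes the value; an undirected formula would give roughly twice as much.

```javascript
// Figures taken from the statistics above.
const entities = 7829;
const relationships = 195651;

// Directed-graph density: edges divided by the number of ordered node pairs.
const possiblePairs = entities * (entities - 1); // 61,285,412
const densityPercent = (relationships / possiblePairs) * 100;

console.log(densityPercent.toFixed(2) + '%'); // prints "0.32%"
```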

Graph Connectivity Example:

  • @marketing agent has 17 shares_context_with relationships
  • Direct path exists from @marketing to @finance via shared business concepts
  • Cross-agent knowledge flow enables coordinated strategy execution

5. The Model-Context-Protocol (MCP) & Python AI Engine

  • MCP: A decentralized microservices layer (mcpClient.cjs) that provides a unified interface for the system to discover and invoke a wide range of external and internal tools (e.g., Perplexity, Google Workspace, Apify, Playwright, and Soulfield's own KG).
  • Python Engine: A toolchain for managing open-source models and converting them into the efficient GGUF format. It also includes an executor for DSPy (Declarative Self-improving Python), allowing the system to build, compile, and optimize complex reasoning workflows instead of relying on simple prompting.

Appendix A: Agent System Breakdown

This section provides a detailed, granular view of each agent within the Soulfield system.

| Agent ID | Handler Path | ~LOC | Key Exports | Lens Pipeline | System Prompt Summary | Tools & Data Sources | AFS Path |
| --- | --- | --- | --- | --- | --- | --- | --- |
| governor | backend/agents/handlers/governor.cjs | 275 | handleRequest, proposeAgentImprovements | strategy | Chief orchestrator & ethical gatekeeper. Coordinates workflows & validates decisions with a strict Rights→Causality→Truth pipeline. Analyzes system-wide learning data to propose improvements to other agents. | graph-reasoning (InfraNodus), reads all agent training data. | workspace/agent-workspace/agents/governor/ |
| marketing | backend/agents/handlers/marketing.cjs | 285 | handleRequest, generateCampaignPlan | full | B2B/SaaS marketing strategist. Creates data-driven campaigns, funnels, and content plans, validated by all 6 lenses. | Memory, AFS, Google Calendar, Output Cache, Self-analysis (Graph), Learning Loop, Perplexity. | workspace/agent-workspace/agents/marketing/ |
| finance | backend/agents/handlers/finance.cjs | 260 | handleRequest, generateFinancialModel | full | Financial analyst. Builds rigorous financial models (3-statement, unit economics, etc.) with a heavy focus on causality and data-backed assumptions. | Memory, AFS, Google Sheets. | workspace/agent-workspace/agents/finance/ |
| seo | backend/agents/handlers/seo.cjs | 250 | handleRequest, generateKeywordResearch | full | SEO strategist. Performs keyword research, content optimization, and technical audits, ensuring all recommendations are "white-hat" and data-driven. | Memory, AFS, Google Docs, Self-analysis (Graph), Learning Loop, Perplexity. | workspace/agent-workspace/agents/seo/ |
| builder | backend/agents/handlers/builder.cjs | 300 | execute, generateNotionTemplate, etc. | minimal | Product creation specialist. Generates tangible digital products like Notion templates, HTML landing pages, and email sequences. Uses CLI delegation for $0 cost code generation. | CLI Delegator (Codex/Claude), Output Cache. | workspace/agent-workspace/agents/builder/ |
| distributor | backend/agents/handlers/distributor.cjs | 350 | execute, distributeToReddit, etc. | minimal | Multi-platform distribution specialist. Adapts and distributes content across Reddit, Twitter, LinkedIn, etc., following platform-specific best practices. | CLI Delegator (Codex), AFS, Google Docs. | workspace/agent-workspace/agents/distributor/ |
| metrics | backend/agents/handlers/metrics.cjs | 460 | execute, analyzeLaunch, makeDecision | minimal | Performance tracking & analytics specialist. Analyzes business metrics (revenue, engagement) and makes data-driven "Kill/Keep/Scale" decisions. | CLI Delegator (Codex), Google Sheets (MCP). | workspace/agent-workspace/agents/metrics/ |
| content | backend/agents/handlers/content.cjs | 210 | handleRequest | full | Technical writer. Creates developer documentation, API guides, and tutorials with a focus on technical accuracy and clarity. | Memory, reftools (MCP). | workspace/agent-workspace/agents/content/ |
| legal | backend/agents/handlers/legal.cjs | 200 | handleRequest | full | Legal specialist. Analyzes contracts and policies for risk and compliance, always including appropriate legal disclaimers. | Memory. | workspace/agent-workspace/agents/legal/ |
| visionary | backend/agents/handlers/visionary.cjs | 90 | handleRequest, execute | planning | High-level business strategist. Synthesizes market trends to generate unconventional ideas and strategic roadmaps using a specific Truth→Causality→Extrapolation→Structure lens pipeline. | AFS. | workspace/agent-workspace/agents/visionary/ |
| strategy | backend/agents/handlers/strategy.cjs | 200 | handleRequest | planning | Business strategy analyst. Performs structured market analysis, competitive intelligence, and SWOT analysis to create actionable strategic plans. | Memory. | workspace/agent-workspace/agents/strategy/ |
| operations | backend/agents/handlers/operations.cjs | 215 | handleRequest | validation | Business operations analyst. Optimizes internal processes, identifies bottlenecks, and designs workflow automation plans to improve efficiency. | Memory. | workspace/agent-workspace/agents/operations/ |
| prompter | N/A | N/A | N/A | minimal | Agent creation specialist. It is a "meta-agent" that generates the system prompts and configurations for other agents, trained on successful patterns. | Uses default council LLM path. | workspace/agent-workspace/agents/prompter/ |
| scraper | N/A (MCP) | N/A | N/A | minimal | Web scraping specialist. Acts as a front-end to the Apify MCP server, allowing it to find and execute from over 7,000 pre-built scrapers and automation actors. | Apify (MCP). | workspace/agent-workspace/agents/scraper/ |
| jina | N/A | N/A | N/A | minimal | Semantic reranker. A specialized tool agent that re-ranks lists of text candidates based on semantic relevance to a query. Likely used by other agents for search result refinement. | Jina API (assumed). | workspace/agent-workspace/agents/jina/ |
| infranodus | N/A (Service) | N/A | N/A | N/A | Not a standalone agent. It is a graph reasoning service (backend/services/graph-reasoning.cjs) used by other agents, most notably @governor, to analyze text and find structural patterns/gaps. | N/A | N/A |

Appendix B: Data Flow Diagrams

This section provides high-level ASCII diagrams illustrating key data flows within the Soulfield system.

1. User Query Flow (POST /chat)

This diagram illustrates the end-to-end process from a user's HTTP request to receiving a validated agent response.

[User Client]
     |
     | 1. HTTP POST /chat (prompt, agentId)
     v
+-------------------------------------------------+
| backend/index.cjs                               |
| (HTTP Server)                                   |
+-------------------------------------------------+
     |
     | 2. route(req, res) -> runWithCouncil(prompt)
     v
+-------------------------------------------------+
| backend/council.js                              |
| (runWithCouncil)                                |
| - Parses agentId, builds context (memory, KG)   |
| - Pre-flight validation, tool routing (e.g. MCP)|
+-------------------------------------------------+
     |
     | 3. agentRouter.route(agentId, prompt)
     v
+-------------------------------------------------+
| backend/agents/handlers/{agent}.cjs             |
| (e.g., marketing.cjs)                           |
| - Executes agent-specific logic                 |
| - Calls LLM via tools/aiden.cjs (askAiden)      |
+-------------------------------------------------+
     |
     | 4. Raw LLM Output
     v
+-------------------------------------------------+
| backend/lenses/LensMiddleware.js                |
| - Wraps LensOrchestrator                        |
| - Applies agent-specific validation pipeline    |
| - Can auto-fix or throw LensCriticalError       |
+-------------------------------------------------+
     |
     | 5. Validated & Structured Output
     v
+-------------------------------------------------+
| backend/council.js                              |
| - Captures output to memory & learning loop     |
| - Assesses output with RAGSwitch                |
+-------------------------------------------------+
     |
     | 6. Final JSON Response
     v
[User Client]

2. Knowledge Graph Ingestion Flow

This diagram shows how a new document is processed and stored in the SQLite-based Knowledge Graph.

[New Document]
(e.g., from file, web scrape)
     |
     | 1. kg.addDocument(doc)
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/kg-sqlite.cjs      |
| (KnowledgeGraph.addDocument)                        |
| - Initiates the KnowledgeGraphPipeline              |
+-----------------------------------------------------+
     |
     | 2. Pipeline Step: IngestTask
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/pipeline/tasks.cjs |
| (IngestTask)                                        |
| - Creates entry in 'documents' table in SQLite DB   |
+-----------------------------------------------------+
     |
     | 3. Data: { docId, content }
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/pipeline/tasks.cjs |
| (ChunkTask)                                         |
| - Splits content into 500-char overlapping chunks   |
| - Inserts into 'chunks' table                       |
+-----------------------------------------------------+
     |
     | 4. Data: { docId, content }
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/pipeline/tasks.cjs |
| (EmbedTask)                                         |
| - Creates vector embedding for the *full* document  |
| - Stores in 'embeddings' table (if enabled)         |
+-----------------------------------------------------+
     |
     | 5. Data: { docId }
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/pipeline/tasks.cjs |
| (ExtractEntitiesTask & SummaryTask)                 |
| - Calls kg.extractEntities() &                      |
|   kg.generateSummaries()                            |
| - Uses LLM to find entities & create summaries      |
| - Stores in 'entities' & 'summaries' tables         |
+-----------------------------------------------------+
     |
     | 6. kg.findRelationships()
     v
+-----------------------------------------------------+
| backend/services/knowledge-graph/kg-sqlite.cjs      |
| - Maps relationships between extracted entities     |
| - Stores in 'edges' table                           |
+-----------------------------------------------------+
     |
     v
[Processing Complete in SQLite DB]
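
The ChunkTask step can be sketched as a sliding window over the document text. The 500-character chunk size comes from the diagram above; the 50-character overlap and the shape of the returned objects are assumptions, since the source does not state them.

```javascript
// Split text into fixed-size chunks where consecutive chunks share `overlap` characters.
// chunkSize = 500 follows the diagram; overlap = 50 is an assumed value.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (overlap >= chunkSize) {
    throw new RangeError('overlap must be smaller than chunkSize');
  }
  const step = chunkSize - overlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ start, text: text.slice(start, start + chunkSize) });
    if (start + chunkSize >= text.length) break; // this chunk already reaches the end
  }
  return chunks;
}

const chunks = chunkText('a'.repeat(1200));
console.log(chunks.map((c) => [c.start, c.text.length]));
```

For a 1,200-character input this yields three chunks starting at offsets 0, 450, and 900, the last one holding the remaining 300 characters.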

3. RAG Query Flow

This diagram shows how the system uses the Knowledge Graph to answer a query.

[User Query]
     |
     | 1. kg.ragCompletion(query) or kg.hybridSearch(query)
     v
+-------------------------------------------------+
| backend/services/knowledge-graph/kg-sqlite.cjs  |
| (hybridSearch method)                           |
+-------------------------------------------------+
     |
     | 2. Parallel Search Execution
     |
     +---->+------------------------------------------------+
     |     | backend/services/knowledge-graph/kg-sqlite.cjs |
     |     | - FTS5 Full-Text Search                        |
     |     +------------------------------------------------+
     |
     +---->+------------------------------------------------+
     |     | backend/services/embedding.cjs                 |
     |     | - Vector Similarity Search                     |
     |     +------------------------------------------------+
     |
     +---->+------------------------------------------------+
     |     | backend/services/knowledge-graph/kg-sqlite.cjs |
     |     | - Graph Relevance Scoring                      |
     |     +------------------------------------------------+
     |
     | 3. Score Fusion (FTS + Vector + Graph)
     v
+-------------------------------------------------+
| backend/services/knowledge-graph/kg-sqlite.cjs  |
| - Assembles weighted & ranked context blocks    |
+-------------------------------------------------+
     |
     | 4. Context sent to LLM for Synthesis
     v
+-------------------------------------------------+
| tools/aiden.cjs (askAiden)                      |
| - LLM generates a natural language answer       |
|   based *only* on the provided context          |
+-------------------------------------------------+
     |
     | 5. Final Synthesized Response
     v
[User Client]
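
The score fusion in step 3 can be sketched as a weighted sum over the three result lists. The weights, the assumption that every score is already normalized to [0, 1], and the `fuseScores` function itself are illustrative; the source does not say how kg-sqlite.cjs combines the scores.

```javascript
// Illustrative weights only; not values taken from kg-sqlite.cjs.
const WEIGHTS = { fts: 0.4, vector: 0.4, graph: 0.2 };

// Combine per-document scores from the three searches into one ranked list.
function fuseScores(ftsHits, vectorHits, graphHits, weights = WEIGHTS) {
  const combined = new Map(); // docId -> fused score
  const add = (hits, weight) => {
    for (const { docId, score } of hits) {
      combined.set(docId, (combined.get(docId) || 0) + weight * score);
    }
  };
  add(ftsHits, weights.fts);
  add(vectorHits, weights.vector);
  add(graphHits, weights.graph);
  return [...combined.entries()]
    .map(([docId, score]) => ({ docId, score }))
    .sort((a, b) => b.score - a.score); // highest fused score first
}

const ranked = fuseScores(
  [{ docId: 'a', score: 1.0 }, { docId: 'b', score: 0.5 }],
  [{ docId: 'b', score: 1.0 }, { docId: 'c', score: 0.5 }],
  [{ docId: 'b', score: 1.0 }]
);
console.log(ranked.map((r) => r.docId));
```

Document `b` ranks first here because it appears in all three result lists, which is the behavior a fusion step is meant to reward.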

4. Agent Learning Flow

This diagram shows how an agent's performance is captured and used for future system optimization.

+-------------------------------------------------+
| backend/agents/handlers/{agent}.cjs             |
| (e.g., marketing.cjs)                           |
| - Agent produces an output                      |
+-------------------------------------------------+
     |
     | 1. Raw Output
     v
+-------------------------------------------------+
| backend/lenses/LensMiddleware.js                |
| - Lens validation is applied to the output      |
+-------------------------------------------------+
     |
     | 2. Validation Result (Passed/Failed, Metrics, etc.)
     v
+-------------------------------------------------+
| backend/services/learning-loop.cjs              |
| (LearningLoop.captureAgentPerformance)          |
| - Receives output, prompt, lens results, etc.   |
| - Creates a structured JSON learning record     |
+-------------------------------------------------+
     |
     | 3. Writes JSON record
     v
+-------------------------------------------------+
| workspace/training-data/real-world/{agent}/     |
| (e.g., marketing/record-123.json)               |
| - Performance data is stored as a file          |
+-------------------------------------------------+
     |
     | 4. (Offline/Batch Process)
     |    `npm run ingest-learnings`
     v
+-------------------------------------------------+
| Python Environment                              |
| (run_dspy_workflow function)                    |
| - Reads training data JSON files                |
| - Uses data as a training set for DSPy          |
| - Compiles and optimizes the agent's underlying |
|   reasoning program                             |
+-------------------------------------------------+
     |
     | 5. Optimized Program (*.json)
     v
[Saved to disk for future agent execution]

5. MCP Tool Invocation

This diagram illustrates how an agent uses an external tool via the Model-Context-Protocol.

+--------------------------------+
| backend/council.js             |
| - Agent logic determines need  |
|   for a tool (e.g., Perplexity)|
+--------------------------------+
     |
     | 1. mcpClient.callTool('perplexity', 'perplexity_ask', {query})
     v
+--------------------------------+     +-----------------------------------------+
| backend/services/mcp/          |     | (Child Process)                         |
| mcpClient.cjs                  | 2.  | Spawns and manages connection to        |
| - Finds 'perplexity' server    |---->| MCP Server via JSON-RPC over stdio      |
|   in its registry              |     | e.g., npx @perplexity-ai/mcp-server     |
| - Sends JSON-RPC `tools/call`  |     +-----------------------------------------+
+--------------------------------+     |                                         |
     ^                                 | 3. Tool execution (e.g., calls external |
     |                                 |    Perplexity API)                      |
     | 5. Returns result to council    |                                         |
     |                                 v
+--------------------------------+     +-----------------------------------------+
| backend/services/mcp/          |     | (Child Process)                         |
| mcpClient.cjs                  | 4.  | MCP Server sends JSON-RPC response      |
| - Receives JSON-RPC response   |<----| with the result from the tool           |
| - Resolves the promise         |     |                                         |
+--------------------------------+     +-----------------------------------------+
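
The request/response correlation in steps 2 to 5 can be sketched without spawning a real server. MCP messages are JSON-RPC 2.0, and the method that invokes a tool is `tools/call`. Everything else here is an assumption for illustration: `createRpcClient`, its `send` callback, and the two-argument `callTool` are simplified stand-ins, not the mcpClient.cjs API.

```javascript
// Minimal JSON-RPC 2.0 client core. `send` is a hypothetical transport callback;
// over stdio, each message would be written as one line of JSON to the child's stdin.
function createRpcClient(send) {
  let nextId = 1;
  const pending = new Map(); // request id -> { resolve, reject }

  return {
    callTool(name, args) {
      const id = nextId++;
      const promise = new Promise((resolve, reject) => pending.set(id, { resolve, reject }));
      send(JSON.stringify({
        jsonrpc: '2.0',
        id,
        method: 'tools/call',
        params: { name, arguments: args },
      }));
      return promise;
    },
    // Feed each line read from the server's stdout into this handler.
    handleLine(line) {
      const msg = JSON.parse(line);
      const waiter = pending.get(msg.id);
      if (!waiter) return; // notification or unknown id
      pending.delete(msg.id);
      if (msg.error) waiter.reject(new Error(msg.error.message));
      else waiter.resolve(msg.result);
    },
  };
}

// Loopback demo: capture the outgoing request, then answer it as a fake server would.
const sent = [];
const client = createRpcClient((line) => sent.push(line));
const answer = client.callTool('perplexity_ask', { query: 'What is MCP?' });
const request = JSON.parse(sent[0]);
client.handleLine(JSON.stringify({
  jsonrpc: '2.0',
  id: request.id,
  result: { content: [{ type: 'text', text: `echo: ${request.params.arguments.query}` }] },
}));
answer.then((result) => console.log(result.content[0].text));
```

Matching responses to requests by `id` is what lets one child process serve several in-flight tool calls over a single stdio channel.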

Appendix C: Key Module Reference

This section provides a detailed reference for the most critical modules in the Soulfield backend.


1. backend/index.cjs

  • Line Count: 997
  • Purpose: Serves as the primary HTTP server entry point for the entire Soulfield application, responsible for routing API requests and managing the server lifecycle.
  • Key Exports: None (self-executing server script).
  • Dependencies: http, url, morgan, fs, path, raw-body, dotenv, ./config/env-check.js, ./council.js, ./dashboard-template.html.cjs, ./services/inbox-processor.cjs.
  • Used By: This is the top-level file executed by node, pm2, and referenced in package.json's start script. It is the root of the application process.
  • Key Configuration: PORT, CONTEXT_SPINE, AIDEN_MODEL, GEMINI_API_KEY.

2. backend/council.js

  • Line Count: 2,723
  • Purpose: Acts as the central nervous system and primary orchestrator for all AI agent activity, routing requests, managing memory, and enforcing quality control.
  • Key Exports: runWithCouncil, getToolStatus, getAgentStatus, getLensAuditData, callClaude.
  • Dependencies: axios, ./services/mcp/mcpClient.cjs, ./services/auto-obsidian-sync.cjs, ./services/agent-error-handler.cjs, ./services/telemetry-logger.cjs, ./agents/manager.cjs, all agent handlers, ./services/agent-router.cjs, ./lenses/LensOrchestrator.js, ./lenses/LensMiddleware.js, ./services/rag-switch.js, ./services/memory/index.cjs.
  • Used By: backend/index.cjs, backend/orchestration/workflow-engine.cjs, backend/services/agent-service-bridge.cjs, and nearly all agent handlers and test files. It is the most widely imported module.
  • Key Configuration: USE_DYNAMIC_PIPELINE, LENS_ENFORCEMENT_MODE, ENABLE_RAG_ESCALATION, PERPLEXITY_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, ZAI_API_KEY, USE_KNOWLEDGE_GRAPH.

3. backend/jobs.js

  • Line Count: N/A
  • Purpose: Legacy command dispatcher for ! prefixed commands (e.g., !capture).
  • Key Exports: handleJob.
  • Dependencies: N/A.
  • Used By: This file has been DEPRECATED AND REMOVED. Codebase-wide search results confirm it was part of a major refactoring effort where its functionality was replaced by the more robust @agent and MCP-based systems. council.js contains a placeholder variable (commandResults = null; // Placeholder for removed jobs.js integration) indicating its removal.
  • Key Configuration: N/A.

4. backend/services/knowledge-graph/kg-sqlite.cjs

  • Line Count: 2,581
  • Purpose: Implements the core local Knowledge Graph, managing data storage, retrieval, and analysis using a SQLite database.
  • Key Exports: SQLiteKnowledgeGraph (class).
  • Dependencies: better-sqlite3, path, fs, ./embedding-service.cjs, ./pipeline/core.cjs, ./pipeline/tasks.cjs.
  • Used By: backend/services/knowledge-graph/mcp-server.cjs, backend/services/knowledge-graph/context-retriever.cjs, backend/services/knowledge-graph/graph-search.cjs, numerous test and script files.
  • Key Configuration: USE_KG_PIPELINE, USE_KG_EMBEDDINGS, USE_KG_LLM_ENTITIES, USE_KG_LLM_RELATIONSHIPS, USE_KG_RAG, USE_KG_INSIGHTS.

5. backend/services/knowledge-graph/mcp-server.cjs

  • Line Count: 415
  • Purpose: Exposes the Knowledge Graph's capabilities as a standard, tool-accessible service via the Model Context Protocol (MCP).
  • Key Exports: None (self-executing MCP server script).
  • Dependencies: @modelcontextprotocol/sdk, zod, path, ./kg-sqlite.cjs, ./graph-search.cjs.
  • Used By: It is registered in backend/services/mcp/mcpClient.cjs and is spawned as a child process, allowing other agents to query the KG as a tool.
  • Key Configuration: None (inherits configuration from the KG service it wraps).

6. backend/services/memory/memory-sqlite.cjs

  • Line Count: 509
  • Purpose: Provides an SQLite-based vector memory and semantic search capability, acting as a local, drop-in replacement for cloud-based vector stores like Pinecone or Supabase.
  • Key Exports: ensureIndex, embed, upsertDocs, query, deleteDoc, embedAndUpsert, upsertRaw.
  • Dependencies: dotenv, path, fs, worker_threads, ../embedding.cjs, better-sqlite3.
  • Used By: backend/services/memory/index.cjs (the memory router).
  • Key Configuration: MEMORY_UPSERT_BATCH.

7. backend/services/afs.cjs

  • Line Count: 634
  • Purpose: Provides a unified Agent File System (AFS) for reading and writing files to an agent's local workspace, with optional, automatic synchronization to Google Drive.
  • Key Exports: AgentFileSystem (singleton instance).
  • Dependencies: fs, path, glob, ./google/drive.cjs, ./docling-service.cjs, ./memory/memory-supabase.cjs.
  • Used By: backend/services/agent-service-bridge.cjs, backend/services/agent-output.cjs, and various agent handlers (marketing, finance, visionary, etc.) that need to save their work.
  • Key Configuration: None directly; relies on the configuration of its dependent services (e.g., Google Drive credentials).

8. backend/services/gemini-file-search.cjs

  • Line Count: 456
  • Purpose: Acts as a service wrapper around the Google Gemini File Search API, managing document stores, uploads, and RAG queries against agent-generated business documents.
  • Key Exports: initialize, createStore, uploadBatch, search, deleteStore, etc.
  • Dependencies: @google/genai, fs, path.
  • Used By: backend/scripts/compare-rag-systems.cjs, backend/scripts/ingest-to-gemini.cjs. It represents a dedicated RAG store for agent outputs, distinct from the internal KG.
  • Key Configuration: GEMINI_API_KEY.

9. backend/lenses/LensOrchestrator.js

  • Line Count: 874
  • Purpose: Executes sequences of "Lenses" (validation modules) against agent outputs according to predefined or custom pipelines.
  • Key Exports: LensOrchestrator (class), sharedWorkerPool.
  • Dependencies: All individual Lens classes (e.g., TruthLens, CausalityLens, StructureLensV2), ./DynamicPipelineSelector, ../services/graph-worker-pool.cjs.
  • Used By: backend/lenses/LensMiddleware.js, backend/council.js, and all agent handlers. It is the core engine of the quality assurance framework.
  • Key Configuration: LENS_DEBUG.

10. backend/lenses/LensMiddleware.js

  • Line Count: 587
  • Purpose: Acts as an enforcement layer on top of the LensOrchestrator, deciding whether to halt execution, log warnings, or attempt to "auto-fix" agent outputs based on the severity of validation failures.
  • Key Exports: LensMiddleware (class).
  • Dependencies: ./LensOrchestrator, ./LensCriticalError, ./utils/timeout.
  • Used By: backend/council.js (which wraps all agent execution with this middleware).
  • Key Configuration: LENS_ENFORCEMENT_MODE, LENS_AUTO_FIX, LENS_VERBOSE_ERRORS.
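
The halt, warn, or auto-fix decision can be sketched as a single function. This is a guess at the control flow, not the LensMiddleware implementation: the mode names ('strict', 'warn'), the shape of the validation result, the `fixers` map, and the order in which the options are tried are all assumptions.

```javascript
// Hypothetical error type standing in for the LensCriticalError named above.
class LensCriticalError extends Error {}

// Decide what to do with one lens validation result.
function enforce(result, { mode = 'warn', autoFix = false, fixers = {} } = {}) {
  if (result.passed) return { action: 'pass', output: result.output };

  // Try an auto-fix first when enabled and a fixer exists for the failing lens.
  const fixer = fixers[result.lens];
  if (autoFix && fixer) {
    return { action: 'auto-fixed', output: fixer(result.output) };
  }
  // Halt on critical failures, or on any failure in strict mode.
  if (mode === 'strict' || result.severity === 'critical') {
    throw new LensCriticalError(`${result.lens} lens failed: ${result.reason}`);
  }
  // Otherwise log and let the output through unchanged.
  console.warn(`[lens:${result.lens}] ${result.reason}`);
  return { action: 'warned', output: result.output };
}

const fixed = enforce(
  { passed: false, lens: 'truth', severity: 'minor', reason: 'uncited claim', output: 'Sales doubled' },
  { autoFix: true, fixers: { truth: (text) => `${text} [UNKNOWN]` } }
);
console.log(fixed.action, '->', fixed.output);
```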

11. backend/orchestration/workflow-engine.cjs

  • Line Count: 385
  • Purpose: Executes high-level, multi-agent workflows by composing individual agent calls into sequential or parallel patterns.
  • Key Exports: WorkflowEngine (class).
  • Dependencies: ../council.js, ../lenses/LensOrchestrator.js.
  • Used By: Referenced in numerous examples and called by the @governor agent to execute strategic plans.
  • Key Configuration: None directly.

Appendix D: Lens Framework Complete Reference

This section details the architecture and components of the Soulfield Lens Framework, a sophisticated, pluggable quality assurance system designed to programmatically validate and enforce standards on AI agent outputs.


1. Base Lenses

The framework is built upon a set of individual, specialized lenses, each responsible for analyzing a specific aspect of the text.

TruthLens.js

  • Line Count: 321
  • Purpose: Enforces epistemic humility by detecting uncertain claims, missing citations, and unmarked speculation.
  • Key Methods: apply(), detectUnknowns(), checkCitationCoverage(), detectSpeculation()
  • Regex Patterns:
    • Hedging language: /\b(might|may|could|possibly|perhaps|likely|seems)\b/gi
    • Claims requiring citation: /\b(research shows|studies show|\d+\s*%)\b/gi
    • Speculation: /\b(will|shall|would|should|predict|hypothesis)\b/gi
  • Metrics Calculated: cc (Citation Coverage), ud (Unknown Discipline), icr (Internal Contradiction Rate).
  • Failure Conditions: Fails if hedging language is used without [UNKNOWN] markers. In strict mode, fails if citation coverage is below minCitationCoverage (default: 40%).
  • Auto-fix Capabilities: The LensMiddleware can auto-fix TruthLens failures by adding [UNKNOWN] markers to uncited claims.
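
The hedging check can be sketched with the pattern listed above. The sentence splitting and the `findUnmarkedHedges` helper are simplifications introduced for illustration, not the logic of detectUnknowns(); the regex is used without the `g` flag so that repeated `test()` calls carry no state.

```javascript
// Hedging words from the list above (non-global, so test() is stateless).
const HEDGING = /\b(might|may|could|possibly|perhaps|likely|seems)\b/i;

// Return sentences that hedge without carrying an [UNKNOWN] marker.
function findUnmarkedHedges(text) {
  return text
    .split(/(?<=[.!?])\s+/) // naive sentence split on terminal punctuation
    .filter((sentence) => HEDGING.test(sentence) && !sentence.includes('[UNKNOWN]'));
}

const sample = 'Revenue grew 12% last year. Churn might fall next quarter. ' +
  'Pricing could change [UNKNOWN].';
console.log(findUnmarkedHedges(sample));
```

Only the second sentence is reported: the first contains no hedging word and the third is already marked.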

CausalityLens.js

  • Line Count: 433
  • Purpose: Enforces rigorous causal reasoning by detecting claims of causation vs. simple correlation and ensuring that mechanisms are explained.
  • Key Methods: apply(), detectCorrelationVsCausation(), checkConditionals()
  • Regex Patterns:
    • Strong Causal: /\b(causes|because|therefore|results in)\b/gi
    • Weak Causal (Correlation): /\b(associated with|linked to|related to)\b/gi
    • Conditionals: /\b(if|when)\b.*?\bthen\b/gi
  • Metrics Calculated: causal_strength, mechanism_coverage, dependency_clarity.
  • Failure Conditions: Fails if conditional statements (IF/THEN) or causal claims (leads to) are made without corresponding mechanism markers (BECAUSE, via, etc.), or if causal strength is below a threshold.
  • Auto-fix Capabilities: None directly within the lens.

ContradictionLens.js

  • Line Count: 432
  • Purpose: Detects logical inconsistencies and conflicting statements within the text.
  • Key Methods: apply(), detectDirectContradictions(), detectSemanticConflicts()
  • Regex Patterns:
    • Direct: (\w+)\s+is\s+(\w+) vs. \1\s+is\s+not\s+\2
    • Semantic Pairs: always/never, increase/decrease, true/false.
    • Negation: can/cannot.
  • Metrics Calculated: contradiction_count, internal_consistency_ratio.
  • Failure Conditions: Fails if the number of detected contradictions exceeds maxContradictions (default: 0).
  • Auto-fix Capabilities: The LensMiddleware can auto-fix by adding a [WARNING: Potential contradictions detected] flag to the text.
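The direct pattern ("X is Y" versus "X is not Y") can be sketched in a few lines. The function name `findDirectContradictions` and the `subject|predicate` key format are illustrative assumptions; the real lens may match differently.

```javascript
// Hypothetical sketch of direct-contradiction detection:
// collect "X is Y" and "X is not Y" claims, then intersect them.
function findDirectContradictions(text) {
  const positive = new Set();
  const negative = new Set();
  const claim = /\b(\w+)\s+is\s+(not\s+)?(\w+)/gi;
  for (const [, subject, negation, predicate] of text.matchAll(claim)) {
    const key = `${subject.toLowerCase()}|${predicate.toLowerCase()}`;
    (negation ? negative : positive).add(key);
  }
  // A contradiction is any claim asserted both positively and negatively.
  return [...positive].filter((key) => negative.has(key));
}

console.log(findDirectContradictions('The API is stable. Later we note the API is not stable.'));
```

With maxContradictions at its default of 0, any non-empty result from a check like this would fail the lens.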

ExtrapolationLens.js

  • Line Count: 320
  • Purpose: Ensures that all predictions, projections, and forecasts are transparently marked and not presented with undue certainty.
  • Key Methods: apply(), detectPredictions(), detectOverconfidence()
  • Regex Patterns:
    • Predictions: /\b(will|forecast|predict|next year)\b/gi
    • Markers: /(\[PROJECTION\]|\s*\bPROJECTION\b\s*|\s*\bHYPOTHESIS\b\s*)/gi
    • Overconfidence: /\b(definitely|certainly|guaranteed|100% sure)\b/gi
  • Metrics Calculated: marked_ratio (% of predictions that are marked), confidence_level, prediction_count.
  • Failure Conditions: Fails if predictions are found without the required markers or if overconfident language is used without hedging.
  • Auto-fix Capabilities: The LensMiddleware can auto-fix by replacing overconfident phrases with more cautious ones (e.g., "will definitely" -> "likely will").
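The marked_ratio metric can be approximated as below. This is a sketch under stated assumptions: the function name `markedRatio`, sentence-level counting, and the convention of returning 1 when no predictions exist are all hypothetical choices, not confirmed behavior of ExtrapolationLens.js.

```javascript
// Hypothetical sketch: share of predictive sentences that carry a marker.
const PREDICTION = /\b(will|forecast|predict|next year)\b/i;
const MARKER = /\[PROJECTION\]|\bPROJECTION\b|\bHYPOTHESIS\b/;

function markedRatio(text) {
  const predictions = text
    .split(/(?<=[.!?])\s+/) // naive sentence split (assumption)
    .filter((sentence) => PREDICTION.test(sentence));
  if (predictions.length === 0) return 1; // nothing to mark (assumed convention)
  const marked = predictions.filter((sentence) => MARKER.test(sentence)).length;
  return marked / predictions.length;
}

const forecast = '[PROJECTION] Sales will grow 10%. Costs will fall. Headcount is flat.';
console.log(markedRatio(forecast));
```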

RightsLens.js

  • Line Count: 328
  • Purpose: A critical safety lens that detects potential privacy violations, missing data consent, and non-compliance with regulations like GDPR.
  • Key Methods: apply(), detectDataCollection(), checkConsent(), checkSensitiveData()
  • Regex Patterns:
    • Data Collection: /\b(collect|track|monitor)\s+(user|personal)\s*(data|information)\b/gi
    • Sensitive Data (PII): /\b(SSN|email|phone|credit card)\b/gi
    • Consent Markers: /(\[CONSENT_REQUIRED\]|\bwith consent\b|\bopt-in\b)/gi
  • Metrics Calculated: consent_coverage, privacy_compliance_score, gdpr_compliance.
  • Failure Conditions: Fails if data collection is detected without corresponding consent markers or if sensitive data is mentioned without security measures. This is a critical lens, and failures often halt execution.
  • Auto-fix Capabilities: The LensMiddleware can auto-fix by redacting detected PII (e.g., replacing an email address with [REDACTED]).
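A minimal sketch of the consent check and the email redaction described above follows. The function name `checkRights`, the result shape, and the simplified email pattern are assumptions for illustration; the data-collection and consent patterns are taken from the list above.

```javascript
// Hypothetical sketch: fail when data collection appears without a consent
// marker, and redact email addresses as the auto-fix would.
const DATA_COLLECTION = /\b(collect|track|monitor)\s+(user|personal)\s*(data|information)\b/i;
const CONSENT = /\[CONSENT_REQUIRED\]|\bwith consent\b|\bopt-in\b/i;
// Simplified email matcher for illustration only (not RFC 5322 complete).
const EMAIL = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

function checkRights(text) {
  const needsConsent = DATA_COLLECTION.test(text) && !CONSENT.test(text);
  return {
    passed: !needsConsent,
    redacted: text.replace(EMAIL, '[REDACTED]'),
  };
}

console.log(checkRights('We collect user data and email jane.doe@example.com weekly.'));
```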

StructureLens.js (V1)

  • Line Count: 366
  • Purpose: A legacy lens that performs basic operational rigor checks, ensuring that procedural instructions (e.g., "deploy the app") include necessary preconditions.
  • Key Methods: apply(), detectActions(), checkPreconditions()
  • Regex Patterns:
    • Actions: /\b(deploy|install|configure|delete|execute|run)\b/gi
    • Preconditions: /(\[PRECONDITION\]|\brequires?:|ensure that\b)/gi
  • Metrics Calculated: precondition_coverage, action_count, completeness.
  • Failure Conditions: Fails if actions are detected without any precondition markers being present in the text.
  • Auto-fix Capabilities: The LensMiddleware can add a generic [Assuming prerequisites are met] warning.

CausalQualityLens.js

  • Line Count: 619
  • Purpose: A graph-based lens that measures the quality of causal reasoning, going beyond simple pattern matching to analyze the structure of the argument itself.
  • Key Methods: apply(), buildCausalGraph(), analyzeGraphQuality()
  • Regex Patterns: Uses a rich set of patterns to extract causal statements like IF: X THEN: Y, A results in B, and C depends on D.
  • Metrics Calculated: causal_quality_score (0-100), causal_chain_length, bridge_variable_count, isolated_causes, circular_dependencies.
  • Failure Conditions: Fails if the overall causal_quality_score is below a threshold (minQualityScore, default: 70) or if circular dependencies are detected.
  • Auto-fix Capabilities: None. This lens is for analysis, not modification.
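To make the circular-dependency failure condition concrete, here is a small sketch that extracts "A results in B" edges and runs a depth-first cycle check. It handles only that one phrasing, and the names `buildCausalEdges` and `hasCircularDependency` are hypothetical; the actual buildCausalGraph() and analyzeGraphQuality() methods are not shown in this document.

```javascript
// Hypothetical sketch: build a cause -> effects map from "A results in B".
function buildCausalEdges(text) {
  const graph = new Map();
  for (const [, cause, effect] of text.matchAll(/(\w+) results in (\w+)/gi)) {
    const from = cause.toLowerCase();
    if (!graph.has(from)) graph.set(from, new Set());
    graph.get(from).add(effect.toLowerCase());
  }
  return graph;
}

// Depth-first search with a "visiting" state to detect back edges (cycles).
function hasCircularDependency(graph) {
  const state = new Map(); // node -> 'visiting' | 'done'
  const visit = (node) => {
    if (state.get(node) === 'visiting') return true;
    if (state.get(node) === 'done') return false;
    state.set(node, 'visiting');
    for (const next of graph.get(node) ?? []) {
      if (visit(next)) return true;
    }
    state.set(node, 'done');
    return false;
  };
  return [...graph.keys()].some((node) => visit(node));
}

const circular = buildCausalEdges('Debt results in stress. Stress results in debt.');
console.log(hasCircularDependency(circular));
```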

StructureLensV2.js

  • Line Count: 416
  • Purpose: The modern successor to StructureLens. It performs a much more advanced analysis by converting the entire text into a knowledge graph to measure structural coherence and find logical gaps.
  • Key Methods: apply(), analyzeGraphStructure() (via local-graph-analysis.cjs).
  • Regex Patterns: Uses V1 patterns for backward compatibility but primarily relies on the graph analysis service.
  • Metrics Calculated: modularity, avg_betweenness, gap_count, community_count, quality_score.
  • Failure Conditions: Fails if the number of structural gaps exceeds a threshold (maxGaps, default: 3) or if the overall structural quality score is too low.
  • Auto-fix Capabilities: None directly, but the gaps it identifies are used by the RAGSwitch to trigger self-correction.

2. Orchestration

The execution of lenses is managed by two key orchestration modules.

  • LensOrchestrator.js (874 lines): This is the pipeline executor. It takes a pipeline name (e.g., 'full') or a custom array of lenses, resolves it to an ordered sequence, and runs each lens one by one. It aggregates all results, metrics, and issues into a final report object.
  • LensMiddleware.js (587 lines): This is the enforcement layer that wraps the LensOrchestrator. It checks the final report for critical failures based on agent-specific rules, then acts according to its configured enforcementMode: strict halts execution by throwing a LensCriticalError, soft logs a warning and continues, and adaptive attempts to auto-fix the output and re-validate it.
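The enforcement behavior can be sketched as follows. This is an illustration rather than the LensMiddleware source: the function name `enforce`, the `validate` and `autoFix` callbacks, and the mapping of each mode to its behavior (strict throws, soft warns, adaptive fixes and re-validates) are assumptions consistent with the description above.

```javascript
class LensCriticalError extends Error {
  constructor(issues) {
    super(`Critical lens failure: ${issues.join('; ')}`);
    this.name = 'LensCriticalError';
    this.issues = issues;
  }
}

// validate(text) returns an array of critical issue strings (assumed shape).
function enforce(text, { mode, validate, autoFix }) {
  let issues = validate(text);
  if (issues.length === 0) return { text, issues, fixed: false };

  if (mode === 'soft') {
    console.warn('[lens] continuing despite issues:', issues);
    return { text, issues, fixed: false };
  }
  if (mode === 'adaptive' && autoFix) {
    const fixedText = autoFix(text, issues);
    issues = validate(fixedText); // re-validate after the fix attempt
    if (issues.length === 0) return { text: fixedText, issues, fixed: true };
  }
  // strict mode, or an adaptive fix that did not resolve the issues
  throw new LensCriticalError(issues);
}

const validate = (t) => (/definitely/.test(t) ? ['overconfident language'] : []);
const autoFix = (t) => t.replace('will definitely', 'likely will');
console.log(enforce('This will definitely work.', { mode: 'adaptive', validate, autoFix }));
```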

3. Pipeline Configurations

The LensOrchestrator defines several standard pipelines:

  • minimal: ['truth', 'graph'] - Basic factual and structural check.
  • strategy: ['rights', 'causality', 'causal_quality', 'truth'] - Used by @governor for ethical and logical decision-making.
  • full: ['rights', 'truth', 'causality', 'causal_quality', 'graph', 'contradiction', 'extrapolation', 'structure'] - The most comprehensive validation, used by default for most worker agents.
  • dynamic: The lens sequence is not predefined. Instead, the DynamicPipelineSelector.js service is called to analyze the text and intelligently select the most relevant subset of lenses to run, saving time and cost.
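Resolving a pipeline name to an ordered lens list might look like the sketch below. The lens sequences are copied from the list above; the function name `resolvePipeline` and the `selectDynamic` callback (standing in for DynamicPipelineSelector.js) are assumptions.

```javascript
// Pipeline definitions as documented above.
const PIPELINES = {
  minimal: ['truth', 'graph'],
  strategy: ['rights', 'causality', 'causal_quality', 'truth'],
  full: ['rights', 'truth', 'causality', 'causal_quality', 'graph',
    'contradiction', 'extrapolation', 'structure'],
};

// Hypothetical resolver: accepts a pipeline name or a custom lens array.
function resolvePipeline(pipeline, selectDynamic) {
  if (Array.isArray(pipeline)) return [...pipeline];
  if (pipeline === 'dynamic') return selectDynamic(); // delegate lens selection
  if (!Object.hasOwn(PIPELINES, pipeline)) {
    throw new Error(`Unknown pipeline: ${pipeline}`);
  }
  return [...PIPELINES[pipeline]];
}

console.log(resolvePipeline('strategy'));
```

Returning a copy of the array keeps callers from mutating the shared definitions.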

4. Integration Points

The Lens Framework is deeply integrated into the agent execution lifecycle within backend/council.js.

  • Invocation: After an agent's raw output is received from an LLM call, council.js immediately passes the output to the LensMiddleware.validateWithEnforcement() method before any further processing.
  • Failure Handling: If the middleware throws a LensCriticalError (due to a critical failure in strict mode, or a failed adaptive fix), the runWithCouncil function in council.js catches this error and halts execution for that agent, returning an error message to the user or the parent WorkflowEngine. This prevents low-quality or unsafe content from proceeding.

Appendix E: Configuration & Environment Reference

This section details the environment variables, feature flags, and core configuration logic that govern the Soulfield OS.


1. Core Configuration

  • Source of Truth: .env.example
  • Validation Logic: backend/config/env-check.js
  • Loader: dotenv (initialized in backend/index.cjs)

Essential Variables

These variables are required for the system to boot, unless DEV_NO_API=1 is set.

| Variable | Description | Default / Note |
| :--- | :--- | :--- |
| ANTHROPIC_API_KEY | Primary LLM access (Claude) | Checked in env-check.js |
| PORT | HTTP Server Port | 8790 (defined in index.cjs) |
| AIDEN_MODEL | Model ID for the primary agent | claude-sonnet-4-5-20250929 |
| DEV_NO_API | Developer mode (skip API calls) | 0 (set to 1 for offline dev) |
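A boot-time check consistent with the rules above could be sketched like this. It is not the content of backend/config/env-check.js: the function name `checkEnv` and the result shape are assumptions, and only ANTHROPIC_API_KEY is treated as required because it is the only variable the document says is checked.

```javascript
// Hypothetical sketch of an environment check with the DEV_NO_API bypass.
function checkEnv(env = process.env) {
  const offline = env.DEV_NO_API === '1';
  const required = ['ANTHROPIC_API_KEY'];
  const missing = offline ? [] : required.filter((name) => !env[name]);
  return {
    ok: missing.length === 0,
    offline,
    missing,
    port: Number(env.PORT || 8790), // documented default port
  };
}

console.log(checkEnv({ DEV_NO_API: '1' }));
```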


2. Feature Flags

The system uses several boolean flags (0/1) to toggle advanced capabilities.

RAG & Knowledge Graph

| Flag | Description | Default |
| :--- | :--- | :--- |
| ENABLE_RAG_ESCALATION | Activate RAGSwitch for quality control | 0 |
| USE_DYNAMIC_PIPELINE | Enable InfraNodus-guided lens selection | 0 |
| ENABLE_GRAPH_ANALYSIS | Perform graph analysis on agent outputs | 1 |
| USE_KNOWLEDGE_GRAPH | Inject KG context into agent prompts | 1 (implied in council.js) |
| USE_TRAINING_DATA | Recall training materials from memory | 1 |

Agent & Tool Capabilities

| Flag | Description | Default |
| :--- | :--- | :--- |
| ENABLE_JINA | Enable Jina AI tools | Auto-detected from API key |
| ENABLE_INFRANODUS | Enable InfraNodus graph reasoning | Auto-detected from API key |
| ENABLE_SCRAPER | Enable Bright Data scraping | Auto-detected from API key |
| ENABLE_GOOGLE_WORKSPACE | Enable Google MCP tools | Auto-detected from credentials |
| ALLOW_AUTO_COMMANDS | Security Risk: Auto-exec ! commands | 0 (Keep disabled) |


3. Service Integrations

External services are configured via specific API keys and endpoints.

LLM Providers

  • Anthropic: ANTHROPIC_API_KEY, AIDEN_MODEL
  • OpenAI (Backup): OPENAI_API_KEY
  • Perplexity (RAG): PERPLEXITY_API_KEY
  • Z.ai (Low-Cost): ZAI_API_KEY, ZAI_MODEL (glm-4-plus)

Memory & Database

  • Supabase (Vector): SUPABASE_URL, SUPABASE_ANON_KEY, SUPABASE_SERVICE_KEY
  • Gemini File Search: GEMINI_API_KEY (for RAG over files)

Tools (MCP & Native)

  • Bright Data: BRIGHTDATA_TOKEN, BD_ZONE_* (multiple zones)
  • Google Workspace: GCAL_CLIENT_ID, GCAL_CLIENT_SECRET, GCAL_REFRESH_TOKEN
  • Jina: JINA_API_KEY
  • InfraNodus: INFRANODUS_API_KEY
  • Ref.tools: REF_API_KEY
  • Apify: APIFY_TOKEN

4. Port Mapping

| Service | Port | Defined In |
| :--- | :--- | :--- |
| Soulfield Backend | 8790 | backend/index.cjs (via process.env.PORT) |
| Graph Viewer | 8791 | Referenced in frontend/graph-viewer.html |

Appendix F: MCP Integration Complete Map

This section provides a comprehensive map of the Model Context Protocol (MCP) integrations, which form the backbone of Soulfield's extensible tooling ecosystem.


1. MCP Client (backend/services/mcp/mcpClient.cjs)

The MCPClientService is a singleton service that acts as the central hub for managing all communication with internal and external MCP servers.

  • Tool Discovery & Connection:

    • The client maintains a static registry of known servers in the MCP_SERVERS constant.
    • When mcpClient.connect(serverId) is called, it looks up the server's configuration in the registry.
    • It spawns the MCP server as a child process using the specified command and args (e.g., npx @perplexity-ai/mcp-server).
    • It manages the connection to the child process via a StdioServerTransport, communicating over stdin/stdout using a raw JSON-RPC implementation (which was chosen to bypass bugs in the official SDK).
  • Tool Invocation:

    • An agent calls mcpClient.callTool(serverId, toolName, args).
    • The client looks up the active connection for the serverId.
    • It sends a JSON-RPC request to the child process with the specified toolName and args.
    • It awaits the JSON-RPC response from the child process's stdout, parses it, and returns the result to the calling agent.
  • Error Handling:

    • The connect method checks for required environment variables (requiresEnv) before attempting to spawn a server, throwing an error if they are missing.
    • The underlying RawMCPClient and the callTool method handle errors during tool invocation, which are then caught by the agent handlers and can be managed by the AgentErrorHandler circuit breaker in council.js.
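The request and response framing described above can be sketched without spawning a process. MCP's stdio transport exchanges newline-delimited JSON-RPC 2.0 messages, and tool invocations use the `tools/call` method with `name` and `arguments` parameters. The `createRpcCodec` helper below is a hypothetical illustration, not the RawMCPClient implementation.

```javascript
// Hypothetical sketch: encode tools/call requests and decode stdout chunks.
function createRpcCodec() {
  let nextId = 1;
  let buffer = '';
  return {
    // Build one newline-terminated JSON-RPC request for the child's stdin.
    encodeToolCall(name, args) {
      const message = {
        jsonrpc: '2.0',
        id: nextId++,
        method: 'tools/call',
        params: { name, arguments: args },
      };
      return `${JSON.stringify(message)}\n`;
    },
    // Feed raw stdout chunks; returns every complete message parsed so far.
    decode(chunk) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing partial line for the next chunk
      return lines.filter((line) => line.trim()).map((line) => JSON.parse(line));
    },
  };
}

const codec = createRpcCodec();
console.log(codec.encodeToolCall('search', { query: 'pricing' }));
```

Buffering partial lines matters because a child process may split a single response across several stdout chunks.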

2. Internal MCP Servers

Soulfield exposes its own core components as MCP servers, allowing them to be used as tools by other agents or systems.

Soulfield KG MCP (backend/services/knowledge-graph/mcp-server.cjs)

  • Purpose: Exposes the internal Knowledge Graph as a standardized MCP toolset.
  • Tools Exposed:
    • mcp__soulfield_kg__search: The main entry point for querying the KG.
      • query (string): The search query.
      • searchType (enum): "GRAPH_COMPLETION", "CHUNKS", "INSIGHTS", "CODE". Default: "GRAPH_COMPLETION".
      • options (object): A rich object with numerous optional filters and flags like limit, agent, includeGraph, filter, traverseDepth, etc.
    • mcp__soulfield_kg__getStats: Returns statistics about the knowledge graph, including node/edge counts, density, and isolated nodes.
      • Parameters: None.
    • mcp__soulfield_kg__getPerformance: Returns mock performance metrics like average query time and memory usage.
      • Parameters: None.
    • mcp__soulfield_kg__multiHopPath: Finds a path between two entities in the graph.
      • sourceEntity (string): Starting entity name.
      • targetEntity (string): Target entity name.
      • maxHops (number, optional): Maximum path length.
    • mcp__soulfield_kg__causalChain: Builds a logical IF/THEN chain from a source entity.
      • sourceEntity (string): The starting entity for the chain.
      • targetEntity (string, optional): The outcome entity to trace to.
      • causalRelTypes (array, optional): Relationship types to follow.
    • mcp__soulfield_kg__disambiguateEntity: Resolves ambiguity for an entity name by optionally filtering by type.
      • entityName (string): The entity name to disambiguate (e.g., "Apple").
      • entityType (string, optional): The required type (e.g., "company").
    • mcp__soulfield_kg__temporalConflict: Resolves conflicts for relationships that change over time (e.g., has_ceo).
      • sourceEntity (string): The entity to query.
      • relationshipType (string): The relationship to resolve.
      • resolutionMethod (enum): 'latest' or 'earliest'.
    • mcp__soulfield_kg__explicitRelationships: Performs a strict query for explicit, direct relationships only.
      • sourceEntity (string): The entity to query.
      • relationshipType (string): The exact relationship type to find.
      • filters (object, optional): Additional property filters.
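To show what a bounded path search such as mcp__soulfield_kg__multiHopPath might do internally, here is a breadth-first sketch over an in-memory edge list. The traversal strategy, the edge-list input, the default of three hops, and the `null` return for "no path" are all assumptions; the document does not describe the server's actual algorithm.

```javascript
// Hypothetical sketch: shortest directed path within maxHops edges.
function multiHopPath(edges, sourceEntity, targetEntity, maxHops = 3) {
  const adjacency = new Map();
  for (const [from, to] of edges) {
    if (!adjacency.has(from)) adjacency.set(from, []);
    adjacency.get(from).push(to);
  }
  const queue = [[sourceEntity]];
  const seen = new Set([sourceEntity]);
  while (queue.length > 0) {
    const path = queue.shift();
    const node = path[path.length - 1];
    if (node === targetEntity) return path;
    if (path.length - 1 >= maxHops) continue; // hop budget exhausted
    for (const next of adjacency.get(node) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null; // no path within maxHops
}

const kgEdges = [['A', 'B'], ['B', 'C'], ['C', 'D']];
console.log(multiHopPath(kgEdges, 'A', 'D'));
```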

3. External MCP Servers

The following external tool servers are defined in the mcpClient.cjs registry:

| Server ID | Name | Command | Description |
| :--- | :--- | :--- | :--- |
| sequentialThinking | Sequential Thinking | npx -y @modelcontextprotocol/server-sequential-thinking | Provides step-by-step reasoning with branching and revision. |
| reftools | Ref.tools | npx -y ref-tools-mcp@latest | Enables documentation search with section-level precision. |
| apify | Apify | npx -y @apify/actors-mcp-server | Provides access to over 7,000 web scrapers and automation tools. |
| supabase | Supabase | npx -y -p supabase-mcp supabase-mcp-claude | Manages Postgres + pgvector for agent memory and data storage. |
| perplexity | Perplexity | npx -y @perplexity-ai/mcp-server | Provides real-time web search with AI summaries and citations. |
| googleWorkspace | Google Workspace | npx -y @modelcontextprotocol/server-google-workspace | Integrates with Google Calendar, Docs, Sheets, Gmail, and Drive. |
| playwright | Playwright | npx -y @executeautomation/playwright-mcp-server | Enables browser automation for navigation, scraping, and interaction. |


4. MCP Configuration

  • Definition Location: The primary configuration is the static MCP_SERVERS object defined directly in backend/services/mcp/mcpClient.cjs. The codebase does not contain a .mcp.json file, indicating a preference for an in-code registry over external JSON files for the Node.js backend.
  • Connection Mechanism: When an agent needs a tool, mcpClient finds the server in its registry and uses its command and args to spawn it as a new child process. It then establishes a JSON-RPC communication channel over the child process's standard input/output.
  • Environment Variables: Each server has specific environment variable requirements, which are checked by the mcpClient before a connection is attempted. Key variables include:
    • reftools: REF_API_KEY
    • apify: APIFY_TOKEN
    • supabase: SUPABASE_URL, SUPABASE_ANON_KEY, SUPABASE_SERVICE_ROLE_KEY
    • perplexity: PERPLEXITY_API_KEY, PERPLEXITY_MODEL
    • googleWorkspace: GCAL_CLIENT_ID, GCAL_CLIENT_SECRET, GCAL_REFRESH_TOKEN
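The registry lookup and environment pre-check can be sketched as below. The entry shape (`command`, `args`, `requiresEnv`) follows the field names mentioned in this appendix, but the exact structure of MCP_SERVERS and the `resolveServer` helper are assumptions; the two entries shown are a subset chosen for illustration.

```javascript
// Hypothetical subset of the in-code registry.
const MCP_SERVERS = {
  perplexity: {
    command: 'npx',
    args: ['-y', '@perplexity-ai/mcp-server'],
    requiresEnv: ['PERPLEXITY_API_KEY', 'PERPLEXITY_MODEL'],
  },
  reftools: {
    command: 'npx',
    args: ['-y', 'ref-tools-mcp@latest'],
    requiresEnv: ['REF_API_KEY'],
  },
};

// Validate the server ID and its environment before any process is spawned.
function resolveServer(serverId, env = process.env) {
  if (!Object.hasOwn(MCP_SERVERS, serverId)) {
    throw new Error(`Unknown MCP server: ${serverId}`);
  }
  const { command, args, requiresEnv } = MCP_SERVERS[serverId];
  const missing = requiresEnv.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing env for ${serverId}: ${missing.join(', ')}`);
  }
  // A caller would pass these to child_process.spawn(command, args, { stdio: 'pipe' }).
  return { command, args };
}

console.log(resolveServer('reftools', { REF_API_KEY: 'example' }));
```

Failing before the spawn keeps a misconfigured server from leaving an orphaned child process behind.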

Appendix G: Test Coverage Map

This section provides an overview of the testing landscape within the Soulfield backend.

1. Overview

  • Test Directory: backend/tests/
  • Test Runner: Node.js native test runner (node --test)
  • Command: npm test (executes backend/tests/*.test.cjs)
  • Structure: A mix of unit, integration, performance, and chaos tests.

2. Key Test Categories & Files

Core Service Tests

  • knowledge-graph-sqlite.test.cjs: Validates the SQLite Knowledge Graph.
    • Tests: Schema initialization, document operations (add/chunk), entity extraction, relationship detection, graph traversal, hybrid search.
    • Key Scenarios: Verifies FTS5 index creation, checks performance thresholds (e.g., search < 20ms), ensures correct entity types.

Agent Integration Tests

  • agent-handlers-afs.test.cjs: Tests the integration of agent handlers with the Agent File System (AFS).
    • Tests: Verifies that @marketing, @finance, and @seo handlers correctly detect deliverable types from prompts and export the expected functions.
    • Key Scenarios: Checks if "Create a campaign" maps to the 'campaigns' deliverable type.

Lens Framework Tests

  • lens-orchestrator.test.cjs: Validates the quality assurance engine.
    • Tests: Pipeline execution, lens skipping, halt-on-failure logic, metrics aggregation, report generation.
    • Key Scenarios: Ensures strict mode halts on violations, checks that custom pipelines run the correct lens sequence, validates score calculation.

Workflow & Orchestration Tests

  • multi-agent-orchestration.test.cjs: Tests complex, multi-agent scenarios.
    • Tests: Sequential, parallel, conditional, and iterative orchestration patterns.
    • Key Scenarios: Simulates a full "Market Entry Strategy" workflow involving @visionary -> @finance -> @marketing, verifying that the correct agents are invoked in the correct order.

3. Testing Patterns

  • Mocking: The project uses manual mocking patterns (e.g., mock-ab-test.cjs) rather than heavy reliance on external mocking libraries. Dependencies are often injected or stubbed within the test files.
  • Performance Benchmarks: Critical services like the Knowledge Graph have built-in performance benchmarks within their test suites to prevent regression.
  • Chaos Testing: Specialized tests (chaos-*.test.cjs) exist to verify system resilience against configuration tampering and handler outages.

4. Coverage Gaps

  • MCP Tool Logic: While connection logic is likely tested, deep unit tests for the specific logic within external MCP tools (like the Google Workspace interactions) are less visible, likely due to the difficulty of mocking external APIs.
  • Newer Agents: Handlers for @content and @legal appear to rely more on broad integration tests than specific, granular unit tests compared to the core @marketing and @finance agents.

Appendix H: Google Workspace Integration

This section documents the system's deep integration with Google Workspace services, managed through a dedicated suite of service modules.


1. OAuth2 Authentication

Authentication is handled centrally by backend/services/google/auth.cjs.

  • Dependencies: googleapis, google-auth-library.
  • Setup: It uses a singleton OAuth2Client initialized with GCAL_CLIENT_ID, GCAL_CLIENT_SECRET, and GCAL_REDIRECT_URI.
  • Flow:
    • It requires a valid GCAL_REFRESH_TOKEN in the environment to function autonomously.
    • It automatically handles access token refresh events via the tokens event listener on the OAuth2 client.
    • It exports a getAuthClient() function used by all other Google service modules to obtain an authenticated API client.
  • Scopes: A comprehensive list of scopes covers Calendar, Drive (file access), Docs, Sheets, and Gmail (read/write/send).

2. Service Implementations

Each Google service has a dedicated module in backend/services/google/ that abstracts the Google API into clean, agent-friendly functions. All services export a handleRequest(payload) function, enabling uniform invocation.

calendar.cjs

  • Purpose: Manage calendar events.
  • Key Methods: listEvents, getEvent, createEvent, updateEvent, deleteEvent, listCalendars.
  • Agent Capabilities: Agents can schedule meetings, check availability, and manage their own timelines.

docs.cjs

  • Purpose: Create and manipulate Google Docs.
  • Key Methods: listDocs, getDoc, createDoc (supports initial content insertion), updateDoc (supports content replacement), deleteDoc, exportDoc (exports as PDF/buffer).
  • Agent Capabilities: Agents (like @content or @seo) can generate reports, write articles, and export them directly to Docs.

sheets.cjs

  • Purpose: Read and write to Google Sheets.
  • Key Methods: listSheets, getSheet, createSheet (supports multiple tabs), updateSheet, appendSheet (useful for logging/metrics), clearSheet.
  • Agent Capabilities: Agents (like @finance or @metrics) can build financial models, track KPIs, and log performance data.

gmail.cjs

  • Purpose: Send and receive emails.
  • Key Methods: listMessages, getMessage, sendMessage, createDraft, deleteDraft, trashMessage.
  • Agent Capabilities: Agents can draft emails, send notifications, and process incoming messages.

drive.cjs

  • Purpose: General file management and organization.
  • Key Methods: listFiles, getFile, downloadFile, uploadFile, updateFile, deleteFile, shareFile.
  • Helpers: ensureFolderPath and findFolderByName allow agents to create and navigate a structured directory hierarchy (e.g., "Soulfield Agents/Marketing/Campaigns").

3. AFS Integration (Agent File System)

The Agent File System (AFS) in backend/services/afs.cjs is the primary consumer of the Drive service.

  • Sync Mechanism: AFS implements a syncToDrive method that automatically mirrors local file operations to Google Drive.
  • Directory Mapping:
    • Shared Files: Local workspace/agent-workspace/shared/ maps to Soulfield Agents/Shared Documents/ in Drive.
    • Agent Files: Local workspace/agent-workspace/agents/{agentId}/ maps to Soulfield Agents/Agent Output/{agentId}/ in Drive.
  • Workflow: When an agent saves a file (e.g., afs.writeFile()), AFS writes it to the local disk first, then immediately uploads it to the corresponding folder in Google Drive, returning the web link.
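The directory mapping above reduces to a small path translation. The sketch below is illustrative only: `toDrivePath` is a hypothetical helper, and the real syncToDrive method presumably also resolves Drive folder IDs, which is omitted here.

```javascript
// Hypothetical sketch: translate a local AFS path to its Drive location.
const LOCAL_ROOT = 'workspace/agent-workspace/';

function toDrivePath(localPath) {
  const normalized = localPath.replace(/\\/g, '/'); // tolerate Windows separators
  if (!normalized.startsWith(LOCAL_ROOT)) return null;
  const [scope, ...rest] = normalized.slice(LOCAL_ROOT.length).split('/');
  if (scope === 'shared') {
    return ['Soulfield Agents', 'Shared Documents', ...rest].join('/');
  }
  if (scope === 'agents' && rest.length > 0) {
    // rest[0] is the agentId, followed by the file path inside its folder
    return ['Soulfield Agents', 'Agent Output', ...rest].join('/');
  }
  return null; // outside the mirrored directories
}

console.log(toDrivePath('workspace/agent-workspace/agents/marketing/campaign.md'));
```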

4. Agent Usage Pattern

Agents interact with these services primarily through the handleRequest interface, often via the mcpClient if accessing the googleWorkspace MCP server, or directly via the service bridge for internal logic.

Example: @marketing Agent Campaign Creation

  1. Generate: The agent generates a campaign plan text.
  2. Save: It calls afs.writeFile() to save the plan as a Markdown file. AFS syncs this to Soulfield Agents/Agent Output/marketing/campaign.md.
  3. Schedule: It parses the plan for dates and calls calendar.createEvent() to schedule the campaign launch.
  4. Report: It calls docs.createDoc() to create a formal "Campaign Brief" document with the generated content.

Appendix I: Training Data & DSPy System

This section documents the mechanism by which the system learns from its own operations, converting agent outputs into training data to optimize future performance.


1. Training Data Sources

The system maintains a structured repository of knowledge and performance data in the training-data/ directory.

  • Domain Knowledge:

    • training-data/business/: Books, guides, and papers on general business strategy.
    • training-data/finance/: Financial modeling resources (e.g., "Intermediate Accounting").
    • training-data/marketing/: Marketing frameworks and textbooks.
    • training-data/seo/: SEO guides and whitepapers.
    • Note: These directories contain PDF/text files that are ingested to provide agents with domain expertise.
  • Real-World Learning (The Feedback Loop):

    • training-data/real-world/: This is the active learning directory.
    • Structure: It is organized by agent name (e.g., marketing/, finance/, governor/).
    • Content: It contains JSON files, each representing a single execution cycle of an agent. These files capture the prompt, the output, validation results, and metadata, effectively creating a dataset of "what happened."

2. Docling Pipeline

The system uses Docling to process unstructured documents (PDFs, papers) into structured, chunked text for RAG.

  • Service: backend/services/docling/docling-service.cjs (inferred from afs.cjs imports) handles the conversion.
  • Workflow:
    1. A file is placed in workspace/agent-workspace/inbox/pending.
    2. The InboxProcessor (backend/services/inbox-processor.cjs) detects the file.
    3. It uses the Docling service to parse the document structure (headings, paragraphs).
    4. The content is chunked and stored in the vector memory system (Supabase or SQLite) via backend/services/memory/, making it available for agents to recall.

3. DSPy Integration

DSPy (Declarative Self-improving Python) is the engine used to optimize agent behavior programmatically, moving beyond static prompts.

  • Implementation:

    • Environment: DSPy runs in a dedicated Python virtual environment (.venv-dspy), ensuring dependency isolation.
    • Training Scripts: Python scripts like workspace/training-examples/retrain-pipeline.py and dspy-train-prompter.py are the workhorses. They import the dspy library.
    • Process: These scripts read the JSON files from training-data/real-world/ to create a training set. They then use DSPy optimizers (like BootstrapFewShot) to "compile" the agent's logic. This process automatically generates and refines the prompts (instructions and few-shot examples) to maximize the agent's performance metric (e.g., quality score).
  • Data Collection:

    • The PrompterLogger (backend/services/prompter-logger.cjs) is a specialized service for the @prompter agent. It logs every prompt generation request and result to workspace/training-examples/production-logs/prompter/. These logs serve as the specific training dataset for optimizing the @prompter agent itself.

4. The Learning Loop

The "Learning Loop" is the automated cycle that connects execution, validation, and optimization.

  1. Execution: An agent (e.g., @marketing) performs a task.
  2. Validation & Analysis:
    • The Lens Framework validates the output.
    • InfraNodus/Graph Analysis scores the content's structural quality.
  3. Capture:
    • backend/services/learning-loop.cjs aggregates all this data (prompt, output, scores, validation issues).
    • It generates a "Training Prompt"—a structured summary of "What Worked" and "What Failed."
    • This packet is saved as a JSON file in training-data/real-world/{agent}/.
  4. Optimization (Offline):
    • The Python DSPy scripts ingest these JSON files.
    • Successful examples (high quality score, passed lenses) are used to reinforce good behavior.
    • Failed examples can be used to generate "negative constraints" or to refine the instructions.
  5. Deployment: The optimized prompts are updated in the agent definitions (backend/data/agents.json), closing the loop and improving the system's baseline performance.
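The capture and selection steps of the loop can be sketched as follows. Every field name in the record, the `minScore` threshold of 70, and both function names are assumptions; the document does not specify the JSON schema written by learning-loop.cjs.

```javascript
// Hypothetical sketch: package one execution cycle as a training record.
function buildTrainingRecord({ agent, prompt, output, lensResult, qualityScore }) {
  return {
    agent,
    prompt,
    output,
    validation: { passed: lensResult.passed, issues: lensResult.issues },
    quality_score: qualityScore,
    captured_at: new Date().toISOString(),
  };
}

// Split records into examples to reinforce and examples to learn constraints from.
function splitExamples(records, minScore = 70) {
  const positive = records.filter(
    (record) => record.validation.passed && record.quality_score >= minScore
  );
  const negative = records.filter((record) => !positive.includes(record));
  return { positive, negative };
}

const records = [
  buildTrainingRecord({ agent: 'marketing', prompt: 'p1', output: 'o1',
    lensResult: { passed: true, issues: [] }, qualityScore: 85 }),
  buildTrainingRecord({ agent: 'marketing', prompt: 'p2', output: 'o2',
    lensResult: { passed: false, issues: ['uncited claim'] }, qualityScore: 90 }),
];
console.log(splitExamples(records).positive.length);
```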

Appendix J: Gap Analysis & System Health

This section provides an assessment of the system's current state based on a deep scan of the Knowledge Graph and codebase structure.


1. Knowledge Graph Health (Self-Diagnosis)

The system's internal Knowledge Graph currently reports a Quality Score of 44/100. This low score is driven by several key structural issues identified by the generate-insights.cjs tool:

  • Disconnected Clusters: The graph sees "production" systems (logging, telemetry) and "modular" components (agents, handlers) as separate, unconnected islands. This indicates a lack of documentation or explicit code links bridging the operational layer and the agentic logic layer.
  • Isolated Entities: 23 key entities are "stranded" in the graph, including major agents like @marketing, @finance, and @governor. While these agents are fully implemented in code, their relationships to the broader system architecture are not sufficiently indexed in the KG.
  • Missing Context: 50 documents in the knowledge base lack summaries, reducing the system's ability to retrieve them effectively during RAG operations.

2. Architecture Gaps (Codebase Analysis)

Comparing the codebase reality with the system's intended design reveals specific gaps:

  • Testing Disparity: While the core business agents (@marketing, @finance) have robust, deep integration tests, newer agents (@content, @legal) rely on "quick" tests that may not cover complex edge cases.
  • MCP Tool Opacity: The system treats external MCP tools (like Google Workspace) as black boxes. There is limited internal testing for the logic inside these external tools, relying instead on the assumption that the external API contracts will hold.
  • DSPy Automation: The "offline" nature of the DSPy optimization loop is a friction point. Currently, it requires manual intervention (running scripts) to process the learning data and update agent prompts. A fully automated pipeline that triggers optimization based on data volume thresholds would close this loop.

3. Recommendations for Improvement

  1. Bridge the "Production-Modular" Gap: Create explicit documentation or code comments that link the operational services (logging, telemetry) directly to the agent handlers they support. This will help the Knowledge Graph "see" how the system fits together.
  2. Re-index the Knowledge Graph: Run a full re-indexing job to ensure all agent handlers and their relationships are properly captured, addressing the "isolated entity" problem.
  3. Automate the Learning Loop: Implement a cron job or a trigger within the @governor agent to automatically run the DSPy optimization scripts when a sufficient volume of new learning data (e.g., 50 records) has accumulated.
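The trigger proposed in the third recommendation could be as simple as counting new record files. This sketch assumes one JSON file per execution cycle in a per-agent directory, as described in Appendix I; the function names, the `sinceMs` cutoff, and the use of file modification time are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Count JSON records in dir that were modified after sinceMs (epoch millis).
function countNewRecords(dir, sinceMs = 0) {
  if (!fs.existsSync(dir)) return 0;
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .filter((name) => fs.statSync(path.join(dir, name)).mtimeMs > sinceMs).length;
}

// Decide whether enough new learning data has accumulated to retrain.
function shouldOptimize(dir, { threshold = 50, sinceMs = 0 } = {}) {
  return countNewRecords(dir, sinceMs) >= threshold;
}

console.log(shouldOptimize('training-data/real-world/marketing'));
```

A cron job or a @governor hook could call `shouldOptimize` per agent and launch the DSPy scripts only when it returns true.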

Appendix K: Glossary

| Term | Definition |
| :--- | :--- |
| AFS (Agent File System) | A unified interface for agents to read/write files to their local workspace, with optional automatic synchronization to Google Drive. |
| Agent | An autonomous AI entity with a specific role (e.g., @marketing, @finance), a defined lens pipeline, and access to specific tools. |
| Council | The central orchestration service (council.js) that routes requests, manages agent execution, and enforces system-wide policies. |
| Docling | A document processing service used to ingest unstructured files (PDFs) into the system's knowledge base. |
| DSPy | A framework for programmatically optimizing language model prompts using training data. |
| Governor | The top-level meta-cognitive agent (@governor) responsible for strategic decision-making and system improvement. |
| GraphLens | A visualization and analysis tool that represents knowledge as a graph to find structural gaps and insights. |
| InfraNodus | An external graph analysis service used for advanced structural text analysis (now largely replaced by local equivalents). |
| Knowledge Graph (KG) | The system's long-term memory store, implemented in SQLite, representing concepts and their relationships. |
| Learning Loop | The automated cycle of capturing agent performance data, analyzing it, and using it to optimize future prompts. |
| Lens | A specific validation module (e.g., TruthLens, RightsLens) that checks agent output for a particular quality criterion. |
| Lens Framework | The overall system for chaining multiple lenses together into validation pipelines. |
| MCP (Model Context Protocol) | A standard protocol for connecting AI agents to external tools and data sources. |
| Orchestrator | The component (WorkflowEngine) responsible for executing multi-step, multi-agent workflows. |
| RAG (Retrieval-Augmented Generation) | A technique for enhancing LLM responses by retrieving relevant context from a knowledge base. |
| RAGSwitch | A mechanism that autonomously decides whether to escalate a query to an external RAG provider based on response quality. |
| Vector Memory | A database (Supabase or SQLite) that stores text embeddings for semantic search and retrieval. |

Appendix L: Developer Quick Start Guide

This guide provides practical, code-centric instructions for common development tasks within the Soulfield OS ecosystem.

1. Adding a New Agent

To add a new agent (e.g., @support), follow these steps:

  1. Update Registry: Add the agent definition to backend/data/agents.json.

    {
      "id": "support",
      "name": "Customer Support",
      "role": "support-specialist",
      "status": "active",
      "system": "You are @support...",
      "lensPipeline": "full"
    }
  2. Create Handler: Create backend/agents/handlers/support.cjs.

    // backend/agents/handlers/support.cjs
    const { askAiden } = require('../../../tools/aiden.cjs');
    const { LensOrchestrator } = require('../../lenses/LensOrchestrator.js');
    
    // Initialize dedicated lens orchestrator
    const lensOrchestrator = new LensOrchestrator({
      pipeline: 'full',
      agent: 'support'
    });
    
    async function handleRequest(prompt, context = {}) {
      // 1. Reasoning
      const response = await askAiden({ system: "You are @support...", messages: [{ role: "user", content: prompt }] });
    
      // 2. Validation
      const lensResult = await lensOrchestrator.applyAll(response);
    
      return {
        success: true,
        output: response,
        lens_validation: lensResult
      };
    }
    
    module.exports = { handleRequest };
  3. Register Router: Update backend/services/agent-router-adapters.cjs to include the new handler.

    // backend/services/agent-router-adapters.cjs
    exports.support = require('../agents/handlers/support.cjs');
  4. Register in Council: Update backend/council.js to register the route.

    // backend/council.js
    agentRouter.register('support', agentHandlers.support);
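
Before wiring up the handler, it can help to sanity-check the new registry entry. The sketch below is illustrative only: `validateAgentDefinition` and `REQUIRED_FIELDS` are hypothetical names, and the required keys are simply those shown in the example entry above, so the real schema for `backend/data/agents.json` may differ.

```javascript
// Hypothetical pre-flight check for an entry in backend/data/agents.json.
// The required keys mirror the example entry above; the real registry
// schema may include more (or different) fields.
const REQUIRED_FIELDS = ['id', 'name', 'role', 'status', 'system', 'lensPipeline'];

function validateAgentDefinition(agent) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof agent[field] !== 'string' || agent[field].trim() === '') {
      errors.push(`missing or empty field: ${field}`);
    }
  }
  // Assumption: ids are lowercase so they line up with the @mention
  // and the handler filename (support -> support.cjs).
  if (typeof agent.id === 'string' && !/^[a-z][a-z0-9-]*$/.test(agent.id)) {
    errors.push(`id must be lowercase alphanumeric: ${agent.id}`);
  }
  return { valid: errors.length === 0, errors };
}

const support = {
  id: 'support',
  name: 'Customer Support',
  role: 'support-specialist',
  status: 'active',
  system: 'You are @support...',
  lensPipeline: 'full',
};

console.log(validateAgentDefinition(support));
```

Running a check like this in a test or pre-commit step would catch a malformed entry before the router tries to load its handler.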

2. Debugging Lens Failures

If an agent's output is being rejected by the Lens Framework:

  1. Enable Debug Mode: Start the server with LENS_DEBUG=true.

    LENS_DEBUG=true npm start
  2. Inspect Logs: Watch the console or check workspace/data/logs/lens-debug-*.jsonl. You will see detailed output for each lens execution.

    [LensOrch] >> truth start
    [LensOrchestrator:DEBUG] Truth Lens:
      - Passed: false
      - Issues: 1 total
        1. Contains hedging language without [UNKNOWN] markers
    
  3. Common Fixes:

    • Truth Lens: Ensure the agent uses [UNKNOWN] markers for uncertain claims. Update the system prompt to enforce this.
    • Causality Lens: The agent must use "BECAUSE" clauses to explain "IF/THEN" logic.
    • Rights Lens: Ensure no PII is generated. Use [REDACTED] placeholders.
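
The common fixes above can be approximated with a cheap local pre-check before an output reaches the lens pipeline. This is a rough sketch, not the lenses' actual logic: `precheckOutput`, the hedging word list, and the email pattern are all assumptions chosen for illustration.

```javascript
// Hypothetical pre-check that mimics the three common lens failures.
// The real Truth, Causality, and Rights lenses likely apply richer rules.
const HEDGING = /\b(might|maybe|possibly|probably|perhaps)\b/i; // assumed word list
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.-]+/; // crude stand-in for PII detection

function countMatches(text, pattern) {
  return (text.match(pattern) || []).length;
}

function precheckOutput(text) {
  const issues = [];
  // Truth: hedging language should be accompanied by an [UNKNOWN] marker.
  if (HEDGING.test(text) && !text.includes('[UNKNOWN]')) {
    issues.push('truth: hedging language without [UNKNOWN] marker');
  }
  // Causality: every IF: clause should have a matching BECAUSE: clause.
  if (countMatches(text, /\bIF:/g) > countMatches(text, /\bBECAUSE:/g)) {
    issues.push('causality: IF/THEN logic without a BECAUSE clause');
  }
  // Rights: flag something that looks like an email address.
  if (EMAIL.test(text)) {
    issues.push('rights: possible PII; use a [REDACTED] placeholder');
  }
  return issues;
}

console.log(precheckOutput('**IF:** budget rises **THEN:** leads will probably grow'));
```

A pre-check of this kind only narrows down which lens is likely to complain; the debug log remains the authoritative source.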

3. Tracing a Request

To trace a request from the HTTP entry point through to the agent execution:

  1. Send Request:

    curl -X POST http://localhost:8790/chat \
      -H "Content-Type: application/json" \
      -d '{"prompt": "@marketing Create a campaign", "agent": "marketing"}'
  2. Follow the Log Trail:

    • Entry: [HTTP] POST /chat (in backend/index.cjs)
    • Routing: [council] Routing to @marketing handler (in backend/council.js)
    • Execution: [Marketing] Generating strategy for... (in marketing.cjs)
    • Validation: [LensOrch] ⚡ START pipeline=full (in LensOrchestrator.js)
    • Completion: [council:router] ✅ @marketing routed successfully
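
When a trace is long, the log trail above can be checked mechanically. The following sketch assumes each log line starts with its bracketed tag (no timestamp prefix) and that the execution tag is the capitalized agent name; `traceStages` and `missingStages` are hypothetical helpers, not part of the codebase.

```javascript
// Map the log prefixes listed above to request stages.
const STAGE_PREFIXES = [
  { stage: 'Entry', prefix: '[HTTP]' },
  { stage: 'Routing', prefix: '[council]' },
  { stage: 'Validation', prefix: '[LensOrch]' },
  { stage: 'Completion', prefix: '[council:router]' },
];
const EXPECTED_ORDER = ['Entry', 'Routing', 'Execution', 'Validation', 'Completion'];

// Tag each recognized log line with its stage; unrecognized lines are skipped.
function traceStages(logLines, agentTag) {
  const rules = [...STAGE_PREFIXES, { stage: 'Execution', prefix: `[${agentTag}]` }];
  const trace = [];
  for (const line of logLines) {
    const rule = rules.find((r) => line.startsWith(r.prefix));
    if (rule) trace.push({ stage: rule.stage, line });
  }
  return trace;
}

// List the stages that never appeared, in the expected order.
function missingStages(trace) {
  const seen = new Set(trace.map((t) => t.stage));
  return EXPECTED_ORDER.filter((stage) => !seen.has(stage));
}

const logLines = [
  '[HTTP] POST /chat',
  '[council] Routing to @marketing handler',
  '[Marketing] Generating strategy for...',
  '[LensOrch] ⚡ START pipeline=full',
  '[council:router] ✅ @marketing routed successfully',
];

console.log(missingStages(traceStages(logLines, 'Marketing')));
```

The first missing stage usually points at the component where the request stalled.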

4. Running Tests

  • Run All Tests:

    npm test
  • Run Specific Test File:

    node --test backend/tests/agent-handlers-afs.test.cjs
  • Run with Mocked LLM: To avoid spending API credits during testing, ensure your environment does not have ANTHROPIC_API_KEY set, or mock the askAiden tool in your test setup. Most integration tests skip automatically if keys are missing.
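
One way to make the skip-or-mock behavior explicit is sketched below. It relies on the `skip` option that `node:test` accepts in `test(name, options, fn)`; the helper names `liveTestOptions` and `fakeAskAiden` are hypothetical, and the assumption that `askAiden` resolves to a string is not confirmed by the source.

```javascript
// Build options for node:test so live LLM tests skip when no key is present.
function liveTestOptions(env = process.env) {
  return env.ANTHROPIC_API_KEY
    ? {}
    : { skip: 'ANTHROPIC_API_KEY not set; skipping live LLM test' };
}

// Hypothetical stand-in for askAiden: echoes the last user message instead of
// calling the API. Assumes the real tool takes { system, messages } and
// resolves to a string.
async function fakeAskAiden({ system, messages }) {
  return `[stub] ${messages[messages.length - 1].content}`;
}

// Usage inside a test file (not executed here):
//   const { test } = require('node:test');
//   test('support handler (live)', liveTestOptions(), async () => { /* ... */ });

console.log(liveTestOptions({}));
```

Injecting the stub into a handler requires the handler to accept its LLM client as a parameter, which the example handler in this guide does not currently do.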

Appendix M: Known Technical Debt (Here Be Dragons)

This section documents critical areas of technical debt, known issues, and potential failure points within the codebase.

1. Critical Dragons (Fix Before Production)

  • Location: backend/council.js (File size)

    • Problem: council.js has grown to 2,723 lines and handles routing, memory, validation, telemetry, and tool execution. It is a "God Object" with high coupling.
    • Risk: Extremely high blast radius for changes; difficult to test; single point of failure.
    • Remediation: Refactor into micro-services (OrchestratorService, MemoryRouter, TelemetryService) and reduce council.js to a thin coordination layer.
    • Effort: 40 hours
  • Location: backend/services/mcp/mcpClient.cjs (Contract Testing)

    • Problem: MCP tools are treated as black boxes. There are no contract tests to verify that external tools (Google, Perplexity) behave as expected when their APIs change.
    • Risk: Silent failures in production if external schemas change; agents may hallucinate tool usage.
    • Remediation: Implement contract tests that validate the input/output schema of every registered MCP tool against a mock or sandbox.
    • Effort: 20 hours
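
A contract test for an MCP tool can start as a simple shape check against a recorded or sandboxed response. The sketch below is a minimal illustration: `checkContract` and the `searchContract` field names are hypothetical and do not describe any real tool's schema.

```javascript
// Compare a tool response against an expected { field: type } contract.
// Types are typeof results, plus 'array' for arrays.
function checkContract(value, contract) {
  const violations = [];
  for (const [field, expectedType] of Object.entries(contract)) {
    const actual = Array.isArray(value[field]) ? 'array' : typeof value[field];
    if (actual !== expectedType) {
      violations.push(`${field}: expected ${expectedType}, got ${actual}`);
    }
  }
  return violations;
}

// Hypothetical contract for a search-style tool's output.
const searchContract = { results: 'array', query: 'string' };

// A response captured from a mock or sandbox run of the tool.
const sampleResponse = { results: [{ title: 'Example' }], query: 'soulfield' };

console.log(checkContract(sampleResponse, searchContract));
```

Running such checks on a schedule, rather than only in CI, is what would surface an upstream schema change before agents encounter it.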

2. High Priority (Fix Within 30 Days)

  • Location: backend/services/learning-loop.cjs (Automation)

    • Problem: The learning loop is currently "open." Agent performance data is captured to JSON files, but the re-training step (DSPy optimization) must be triggered manually.
    • Risk: Agents do not improve automatically; feedback is lost or stale.
    • Remediation: Implement a cron job or @governor trigger to automatically run the Python optimization scripts when a data threshold (e.g., 50 new records) is reached.
    • Effort: 10 hours
  • Location: backend/agents/handlers/ (Test Coverage)

    • Problem: Core agents like @marketing and @finance have integration tests, but newer agents (@content, @legal) rely on "quick" smoke tests.
    • Risk: Regression bugs in specific agent logic; lower reliability for newer capabilities.
    • Remediation: Create dedicated *.test.cjs suites for all 15 agents, mirroring the depth of agent-handlers-afs.test.cjs.
    • Effort: 25 hours
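
The threshold trigger proposed for the learning loop above could look roughly like this. It is a sketch under stated assumptions: performance records are `.json` files in a single directory, "new" means modified after the last optimization run, and `countNewRecords` and `shouldOptimize` are hypothetical names.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Count .json records in dir modified after sinceMs (epoch milliseconds).
function countNewRecords(dir, sinceMs) {
  return fs.readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .filter((name) => fs.statSync(path.join(dir, name)).mtimeMs > sinceMs)
    .length;
}

// Decide whether enough new data has accumulated to re-run optimization.
// The default of 50 follows the example threshold in the remediation note.
function shouldOptimize(newRecordCount, threshold = 50) {
  return newRecordCount >= threshold;
}

// Example wiring (the directory is an assumption; adjust to the real location):
//   const count = countNewRecords('training-data/real-world', lastRunMs);
//   if (shouldOptimize(count)) { /* launch the Python optimization script */ }
console.log(shouldOptimize(49), shouldOptimize(50));
```

A cron job or a @governor-initiated workflow would call this check and record the time of each run.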

3. Medium Priority (Technical Debt Backlog)

  • Location: backend/services/knowledge-graph/kg-sqlite.cjs (Quality)

    • Problem: The Knowledge Graph quality score is 44/100 due to disconnected clusters and isolated entities.
    • Risk: RAG retrieval may miss relevant context because concepts are not semantically linked.
    • Remediation: Run a "graph gardening" script to identify and bridge structural gaps (already partially implemented in RAGSwitch).
    • Effort: Ongoing
  • Location: backend/jobs.js (Deprecated Code)

    • Problem: The legacy jobs.js file is deprecated but references to it may still exist in comments or unused code paths.
    • Risk: Confusion for new developers; potential security risk if dead code is accidentally revived.
    • Remediation: Perform a final grep sweep and delete the file and all references.
    • Effort: 2 hours
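
For the Knowledge Graph item above, the first step of "graph gardening" is locating the disconnected clusters and isolated entities. The sketch below works on an in-memory node and edge list; `findComponents` and `isolatedEntities` are hypothetical helpers, and loading the graph from `knowledge-graph.db` is left out.

```javascript
// Group nodes into connected components using breadth-first search.
// Edges are treated as undirected [a, b] pairs.
function findComponents(nodes, edges) {
  const adjacency = new Map();
  const ensure = (n) => {
    if (!adjacency.has(n)) adjacency.set(n, new Set());
    return adjacency.get(n);
  };
  nodes.forEach(ensure);
  for (const [a, b] of edges) {
    ensure(a).add(b);
    ensure(b).add(a);
  }

  const seen = new Set();
  const components = [];
  for (const start of adjacency.keys()) {
    if (seen.has(start)) continue;
    const component = [];
    const queue = [start];
    seen.add(start);
    while (queue.length > 0) {
      const node = queue.shift();
      component.push(node);
      for (const next of adjacency.get(node)) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      }
    }
    components.push(component);
  }
  return components;
}

// Entities that form a component on their own have no relationships at all.
function isolatedEntities(components) {
  return components.filter((c) => c.length === 1).map((c) => c[0]);
}

const components = findComponents(
  ['seo', 'keywords', 'finance', 'roi', 'orphan'],
  [['seo', 'keywords'], ['finance', 'roi']],
);
console.log(components.length, isolatedEntities(components));
```

Bridging the gaps between components is the harder step, since it requires deciding which concepts genuinely relate.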

Appendix N: Agent Output Templates

This section defines the IDEAL output format for each strategic agent. These templates are designed to inherently pass all 6 lenses of the Soulfield Lens Framework (Truth, Causality, Rights, Extrapolation, Contradiction, Structure).

1. Marketing Agent Template (@marketing)

Goal: Strategic campaigns, funnel optimization, growth plans.

# [Campaign/Strategy Name]

## Executive Summary
[2-3 sentences summarizing the strategy and expected outcome. Use [UNKNOWN] for any uncertainties.]

## Market Analysis
**DATA:** [Observable facts with citations, e.g., "CTR is 2% (Source: GA4)"]
**INTERPRETATION:** [Logical inferences drawn from data]
**SPECULATION:** [Hypotheses marked as such, e.g., "[HYPOTHESIS] Competitor X is pivoting..."]

## Strategic Recommendations
**IF:** [Condition, e.g., "We increase budget by 20%"]
**THEN:** [Result, e.g., "Leads will increase by ~15%"]
**BECAUSE:** [Mechanism, e.g., "Market saturation has not been reached in this vertical"]

## Projections [PROJECTION]
- **Primary Metric:** [Value] (Confidence: [High/Med/Low])
- **Secondary Metric:** [Value] (Confidence: [High/Med/Low])
*Note: Projections assume current market conditions.*

## Compliance Check
- [x] **Rights:** No PII included; [REDACTED] used for specific user data.
- [x] **Truth:** All data sources cited; speculation marked.
- [x] **Causality:** Recommendations include causal mechanisms.

2. Finance Agent Template (@finance)

Goal: Financial models, risk assessment, ROI analysis.

# [Financial Model/Analysis Name]

## Executive Summary
[Concise overview of financial health/decision. No hedging without [UNKNOWN].]

## Key Assumptions [ASSUMPTION]
1. [Assumption 1] - [Basis/Source]
2. [Assumption 2] - [Basis/Source]

## Analysis & Calculations
**DATA:** [Raw numbers with source]
**CALCULATION:** [Show math, e.g., "Revenue = Units * Price"]
**RESULT:** [Final figure]

## Scenario Planning
**IF:** [Base Case condition]
**THEN:** [Outcome]
**BECAUSE:** [Financial logic]

**IF:** [Worst Case condition]
**THEN:** [Outcome]
**BECAUSE:** [Risk mechanism]

## Recommendation
[Clear path forward based on ROI/Risk balance]

## Compliance Check
- [x] **Truth:** Calculations explicitly shown.
- [x] **Extrapolation:** Assumptions and confidence levels stated.
- [x] **Contradiction:** Internal logic (Revenue - Cost = Profit) verified.

3. SEO Agent Template (@seo)

Goal: Keyword research, technical audits, content strategy.

# [SEO Report/Strategy Name]

## Executive Summary
[High-level SEO health or opportunity summary.]

## Data & Insights
**DATA:** [Keyword Volume/Difficulty with source, e.g., "Vol: 5.4k (SEMrush)"]
**INTERPRETATION:** [Ranking potential analysis]

## Strategy
**Action:** [Specific tactic, e.g., "Create skyscraper content for 'X'"]
**IF:** [We implement this]
**THEN:** [Ranking improvement expected]
**BECAUSE:** [SEO Mechanism, e.g., "Top ranking pages are thin on content"]

## Technical Compliance [GUIDELINE]
- **White Hat:** [Confirm strategy adheres to Google Search Essentials]
- **Ethics:** [Confirm no manipulative link schemes]

## Projections [PROJECTION]
- **Traffic Growth:** [Estimate] (Confidence: [Level])
- **Timeline:** [Estimate, e.g., "3-6 months"]

## Compliance Check
- [x] **Rights:** Competitor data is public/ethical.
- [x] **Truth:** Data sources cited (SEMrush, Ahrefs, GSC).
- [x] **Structure:** Actionable, prioritized list.

4. Builder Agent Template (@builder)

Goal: Tangible deliverables (code, templates, copy).

# [Product/Deliverable Name]

## Deliverable Description
[Brief description of what has been built.]

## Implementation/Code
```[language]
[Code or Template Content]
// [PRECONDITION]: Dependencies or setup required
// [VARIABLE]: Placeholders for user customization
```

## Usage Instructions
1. [Step 1]
2. [Step 2]
3. [Step 3]

## Privacy & Safety
- **Forms:** [Confirm data collection is minimized/secure]
- **PII:** [Confirm no hardcoded secrets or PII]

## Logic
**IF:** [User performs Action X]
**THEN:** [System does Y]
**BECAUSE:** [User Experience/Technical reason]

## Compliance Check
- [x] **Structure:** Complete, runnable code/template.
- [x] **Rights:** Privacy-first design (e.g., clear opt-ins).
- [x] **Truth:** No fake testimonials or misleading copy.

## Appendix O: Workspace & Data Structure Map

This section provides a definitive map of the file system structure for agent outputs, training data, and shared resources.

### CRITICAL FOR MIGRATION - DO NOT LOSE

#### Essential Databases & State
- **Knowledge Graph:** `workspace/data/knowledge-graph.db` (655 docs, 7,829 entities, 195,651 relationships)
- **Workflow Events:** `workspace/data/workflow-events.db` (if exists)
- **Agent Outputs:** `workspace/agent-workspace/agents/*/` (ALL agent-generated content)
- **Training Data:** `training-data/` (Books, PDFs - move to S3/GCS)
- **Real-World Learning:** `training-data/real-world/` (Performance capture JSON files)
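
A migration script can verify these locations before anything is moved. The sketch below is a minimal pre-flight check: `checkMigrationPaths` is a hypothetical helper, the paths are taken from the list above, and treating `workflow-events.db` as optional follows its "(if exists)" note.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Paths from the "Essential Databases & State" list, relative to the repo root.
const CRITICAL_PATHS = [
  { path: 'workspace/data/knowledge-graph.db', required: true },
  { path: 'workspace/data/workflow-events.db', required: false },
  { path: 'workspace/agent-workspace/agents', required: true },
  { path: 'training-data', required: true },
  { path: 'training-data/real-world', required: true },
];

// Report which critical paths are absent under rootDir.
function checkMigrationPaths(rootDir) {
  const missing = [];
  const optionalMissing = [];
  for (const entry of CRITICAL_PATHS) {
    if (fs.existsSync(path.join(rootDir, entry.path))) continue;
    (entry.required ? missing : optionalMissing).push(entry.path);
  }
  return { ok: missing.length === 0, missing, optionalMissing };
}

console.log(checkMigrationPaths(process.cwd()));
```

Existence is only the weakest check; a fuller script would also compare row counts or checksums before and after the move.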

### 1. Agent Workspace (`workspace/agent-workspace/`)
This is the **active** directory where agents read/write their daily work outputs.

    workspace/agent-workspace/
    ├── agents/                          # Per-agent outputs
    │   ├── finance/                     # Finance agent outputs
    │   │   ├── analysis/                # 7 analysis files
    │   │   ├── models/                  # 8 financial model files
    │   │   ├── budgets/
    │   │   ├── dashboards/
    │   │   ├── investor/
    │   │   ├── unit-economics/
    │   │   └── valuation/
    │   ├── seo/                         # SEO agent outputs
    │   │   ├── keywords/
    │   │   ├── competitors/
    │   │   ├── onpage/
    │   │   ├── backlinks/
    │   │   └── research/
    │   ├── visionary/                   # 15 vision documents
    │   ├── builder/                     # Builder outputs
    │   │   ├── landing-pages/
    │   │   └── products/
    │   ├── distributor/launches/
    │   ├── metrics/reports/
    │   └── governor/orchestrations/
    │
    ├── projects/                        # Multi-agent project folders
    │   ├── cashflow-micro-offer-*       # 9 projects
    │   ├── asset-compounder-*           # 2 projects
    │   └── test-list-*                  # 2 test projects
    │
    ├── shared/                          # Shared resources
    │   ├── templates/
    │   ├── documents/
    │   ├── data/
    │   └── cove-cycles/
    │
    └── inbox/                           # File processing
        ├── pending/
        └── processed/


### 2. Training Examples (`workspace/training-examples/`)
This directory contains the **training sets** used by DSPy to optimize agent prompts.

    workspace/training-examples/
    ├── marketing/                       # Marketing training data
    │   ├── training-data/               # 5 JSON examples
    │   ├── *-structure-analysis.md      # 14 analysis files
    │   └── marketing_schema*.json       # 2 schema versions
    │
    ├── finance/                         # Finance training data
    │   ├── training-data/               # 5 JSON examples
    │   ├── *-structure-analysis.md      # 2 analysis files
    │   └── financeschema.json           # 1 schema
    │
    ├── seo/                             # SEO training data
    │   ├── training-data/               # 5 JSON examples
    │   ├── *-structure-analysis.md      # 13 analysis files
    │   └── seoschema*.json              # 2 schema versions
    │
    ├── competitor/                      # Competitor analysis training
    │   ├── training-data/               # 5 JSON examples
    │   └── competitor*_schema.json      # 1 schema
    │
    └── [DSPy scripts, generators, validators]


### 3. Training Data Root (`training-data/`)
This directory contains the **source materials** (books, PDFs, papers) that are intended for ingestion into the Knowledge Graph.

*   **Status:** **NOT YET INGESTED** (Identified Gap)
*   **Structure:**
    ```
    training-data/
    ├── marketing/                       # Marketing books/papers
    ├── finance/                         # Finance books/papers
    ├── seo/                             # SEO books/papers
    ├── business/                        # General business
    └── real-world/                      # Real-world examples
    ```