MindForge is designed with a modular, extensible architecture that separates concerns between memory management, AI model interaction, and storage.
┌─────────────────────────────────────────────────┐
│                User Application                 │
└────────────────────────┬────────────────────────┘
                         │
                         ↓
┌─────────────────────────────────────────────────┐
│                  MemoryManager                  │
│  • Coordinates memory operations                │
│  • Manages memory retrieval and storage         │
│  • Integrates chat and embedding models         │
└──────┬──────────────┬──────────────┬────────────┘
       │              │              │
       ↓              ↓              ↓
  ┌──────────┐   ┌──────────┐   ┌──────────────┐
  │   Chat   │   │Embedding │   │   Storage    │
  │  Model   │   │  Model   │   │   Engine     │
  └──────────┘   └──────────┘   └──────┬───────┘
                                       │
                        ┌──────────────┼──────────────┐
                        ↓              ↓              ↓
                  ┌──────────┐   ┌──────────┐   ┌──────────┐
                  │  SQLite  │   │PostgreSQL│   │  Redis   │
                  │  Vector  │   │ pgvector │   │  Vector  │
                  └──────────┘   └──────────┘   └──────────┘
The MemoryManager is the central orchestrator that coordinates all memory operations.
Responsibilities:
- Process user inputs and generate responses
- Extract concepts from queries using the chat model
- Generate embeddings using the embedding model
- Retrieve relevant memories from storage
- Store new interactions
- Trigger clustering and graph updates
Key Methods:
- process_input(): Main entry point for processing queries
- _build_context(): Constructs context from retrieved memories
- _store_interaction(): Persists new interactions
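These responsibilities might fit together as sketched below. Only the method names come from the MindForge API; the class body and the fake collaborators are illustrative stand-ins, not the real implementation.

```python
class SketchMemoryManager:
    """Toy orchestrator mirroring MemoryManager's method names (illustrative)."""

    def __init__(self, chat_model, embedding_model, storage):
        self.chat_model = chat_model
        self.embedding_model = embedding_model
        self.storage = storage

    def process_input(self, query):
        # Retrieve relevant memories, build context, answer, then persist.
        embedding = self.embedding_model.get_embedding(query)
        memories = self.storage.retrieve_memories(embedding)
        context = self._build_context(memories)
        response = self.chat_model.generate_response(context, query)
        self._store_interaction(query, response)
        return response

    def _build_context(self, memories):
        # Concatenate retrieved prompt/response pairs into one context string.
        return "\n".join(f"{m['prompt']} -> {m['response']}" for m in memories)

    def _store_interaction(self, query, response):
        self.storage.store_memory({"prompt": query, "response": response})
```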
The model layer provides abstractions for different LLM providers.
BaseChatModel
- generate_response(context, query): Generate AI responses
- extract_concepts(text): Extract key concepts from text
BaseEmbeddingModel
- get_embedding(text): Generate vector embeddings
- dimension: Property returning embedding dimensions
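A pluggable embedding model only needs these two members. A toy, dependency-free implementation (hashing tokens into a fixed-size bag-of-words vector; not a real MindForge class) illustrates the contract:

```python
import hashlib

class ToyHashingEmbedding:
    """Toy embedder demonstrating the BaseEmbeddingModel contract.
    Hashes tokens into a fixed-size vector; illustrative only."""

    def __init__(self, dim=64):
        self._dim = dim

    @property
    def dimension(self):
        return self._dim

    def get_embedding(self, text):
        vec = [0.0] * self._dim
        for token in text.lower().split():
            # Stable hash so embeddings are deterministic across runs.
            idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % self._dim
            vec[idx] += 1.0
        return vec
```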
OpenAI Models
- Uses OpenAI's latest API (v1.0+)
- Supports GPT-3.5, GPT-4, and embedding models
Azure OpenAI Models
- Compatible with Azure's OpenAI service
- Supports deployment-based model access
Ollama Models
- Enables local model usage
- Great for privacy-sensitive applications
LiteLLM Models
- Universal adapter for any LLM provider
- Supports 100+ models from various providers
The storage layer handles persistence and retrieval of memories.
class BaseStorage(ABC):
    def store_memory(memory_data, memory_type, user_id, session_id)
    def retrieve_memories(query_embedding, concepts, memory_type, user_id, session_id, limit)
    def update_memory_level(memory_id, new_memory_level, user_id, session_id)

The default storage implementation uses SQLite with vector search.
Features:
- FAISS-based or sqlite-vec vector indexing
- Multi-level memory types
- Concept graph storage
- Session and user memory tracking
Schema:
memories (
id, prompt, response, timestamp,
access_count, last_access, memory_type, recency_boost
)
memory_vectors (
rowid, embedding
)
concepts (
id, memory_id, concept, weight
)
user_memories (
user_id, memory_id, preference, history
)
session_memories (
session_id, memory_id, recent_activity, context
)
agent_memories (
memory_id, knowledge, adaptability
)

SQLiteVecEngine
- Optimized for vector search performance
- Uses sqlite-vec extension
PostgresVectorEngine
- Production-ready for multi-process applications
- Uses pgvector extension
- Supports concurrent access
RedisVectorEngine
- In-memory storage for ultra-fast retrieval
- Great for high-throughput applications
- Supports distributed deployments
ChromaDBEngine
- Document-oriented vector storage
- Built-in embedding support
- Easy integration with ML workflows
MindForge implements a sophisticated multi-level memory hierarchy:

Short-Term Memory
- Purpose: Recent context in conversations
- Characteristics: Limited retention, fast decay
- Use cases: Chatbots, conversational AI

Long-Term Memory
- Purpose: Persistent knowledge and facts
- Characteristics: Permanent storage, no decay
- Use cases: Knowledge bases, fact retention

User Memory
- Purpose: Personalization per user
- Characteristics: Isolated per user_id
- Use cases: Multi-user applications, personalization

Session Memory
- Purpose: Context within a conversation
- Characteristics: Scoped to session_id
- Use cases: Conversation threads, task continuity

Agent Memory
- Purpose: Agent's self-knowledge
- Characteristics: Agent's capabilities and identity
- Use cases: Agent introspection, capability awareness
The concept graph tracks relationships between concepts for enhanced retrieval.
Features:
- Tracks co-occurrence of concepts
- Weighted edges based on frequency
- Spreading activation for semantic search
- Temporal decay of relationships
Benefits:
- Improves context relevance
- Discovers related memories
- Enhances semantic understanding
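The mechanics can be sketched with a small co-occurrence graph. The class below is illustrative (MindForge's internal structure may differ): edges accumulate weight each time two concepts co-occur, and spreading activation pushes decayed energy from seed concepts to their neighbors.

```python
from collections import defaultdict
from itertools import combinations

class ConceptGraphSketch:
    """Toy concept graph: weighted co-occurrence edges plus one-hop
    spreading activation. Illustrative only."""

    def __init__(self, decay=0.5):
        self.edges = defaultdict(float)  # (a, b) -> weight, with a < b
        self.decay = decay               # attenuation per hop

    def observe(self, concepts):
        # Each co-occurring pair strengthens its edge by 1.
        for a, b in combinations(sorted(set(concepts)), 2):
            self.edges[(a, b)] += 1.0

    def neighbors(self, concept):
        for (a, b), w in self.edges.items():
            if a == concept:
                yield b, w
            elif b == concept:
                yield a, w

    def activate(self, seeds):
        # Seeds start at 1.0; neighbors receive decayed, weight-scaled energy.
        activation = {s: 1.0 for s in seeds}
        for seed in seeds:
            for other, w in self.neighbors(seed):
                activation[other] = activation.get(other, 0.0) + self.decay * w
        return activation
```

Retrieval can then union vector-search hits with memories attached to highly activated concepts, which is how "discovers related memories" plays out in practice.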
Automatic clustering of memories based on semantic similarity.
Process:
- Periodically triggered after N interactions
- Groups memories by embedding similarity
- Identifies memory themes and topics
- Enables topic-based retrieval
Algorithms:
- K-means clustering
- Hierarchical clustering
- DBSCAN for outlier detection
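As a concrete sketch of the grouping step, here is a minimal k-means over embedding vectors. It is illustrative only; a real deployment would use scikit-learn, FAISS, or similar rather than hand-rolled clustering.

```python
import random

def kmeans_sketch(vectors, k, iters=10, seed=0):
    """Minimal k-means over embedding vectors (illustrative).
    Returns one cluster label per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels
```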
User Input
↓
Generate Embedding (Embedding Model)
↓
Extract Concepts (Chat Model)
↓
Create Memory Data
↓
Store in Database
↓
Update Concept Graph
↓
Check Clustering Threshold
↓
[Optional] Trigger Clustering
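The formation flow above can be walked through end to end. Only the step order comes from the flow; the function name, parameters, and the graph object's observe() hook are hypothetical:

```python
def form_memory(query, response, embedder, chat_model, storage, graph,
                interactions_seen, clustering_threshold=50):
    """Illustrative walk through the memory-formation flow (names are ours)."""
    embedding = embedder.get_embedding(query)        # 1. generate embedding
    concepts = chat_model.extract_concepts(query)    # 2. extract concepts
    memory = {"prompt": query, "response": response, # 3. create memory data
              "embedding": embedding, "concepts": concepts}
    storage.store_memory(memory)                     # 4. store in database
    graph.observe(concepts)                          # 5. update concept graph
    # 6-7. clustering fires only once enough interactions have accumulated
    should_cluster = interactions_seen + 1 >= clustering_threshold
    return memory, should_cluster
```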
Query Input
↓
Generate Query Embedding
↓
Extract Query Concepts
↓
Vector Similarity Search
↓
Concept Graph Expansion
↓
Apply Filters (type, user, session)
↓
Score and Rank Results
↓
Return Top K Memories
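The search, filter, and rank steps might look like the sketch below. Field names mirror the schema above; the cosine scoring and the exact filter semantics are assumptions, not MindForge's actual ranking logic.

```python
import math

def retrieve_sketch(query_embedding, memories, memory_type=None,
                    user_id=None, limit=3):
    """Illustrative retrieval: filter, score by cosine similarity, return top-K."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    # Apply type/user filters first, then rank the survivors.
    candidates = [
        m for m in memories
        if (memory_type is None or m["memory_type"] == memory_type)
        and (user_id is None or m["user_id"] == user_id)
    ]
    ranked = sorted(candidates,
                    key=lambda m: cosine(query_embedding, m["embedding"]),
                    reverse=True)
    return ranked[:limit]
```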
Query + Retrieved Memories
↓
Build Context Dictionary
↓
Format Context for LLM
↓
Generate Response (Chat Model)
↓
Return Response to User
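The context-building step can be sketched as a simple formatter. The prompt wording here is an assumption; MindForge's actual context format may differ.

```python
def build_llm_context(query, memories):
    """Illustrative context assembly: retrieved memories become a preamble
    the chat model sees before the current query."""
    lines = ["Relevant memories:"]
    for m in memories:
        lines.append(f"- User: {m['prompt']}\n  Assistant: {m['response']}")
    lines.append(f"\nCurrent query: {query}")
    return "\n".join(lines)
```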
MindForge uses a hierarchical configuration system:
AppConfig
├── MemoryConfig
│ ├── similarity_threshold
│ ├── short_term_limit
│ ├── long_term_limit
│ ├── decay_rate
│ └── clustering_trigger_threshold
├── VectorConfig
│ ├── embedding_dim
│ ├── index_type
│ └── max_neighbors
├── ModelConfig
│ ├── chat_model_name
│ ├── embedding_model_name
│ ├── use_model (provider)
│ └── provider-specific settings
└── StorageConfig
├── db_path
├── wal_mode
└── cache_size

from mindforge.core.base_model import BaseChatModel, BaseEmbeddingModel
class MyCustomChatModel(BaseChatModel):
    def generate_response(self, context, query):
        # Your implementation
        pass

    def extract_concepts(self, text):
        # Your implementation
        pass

from mindforge.storage.base_storage import BaseStorage
class MyCustomStorage(BaseStorage):
    def store_memory(self, memory_data, memory_type, user_id, session_id):
        # Your implementation
        pass

    def retrieve_memories(self, query_embedding, concepts, ...):
        # Your implementation
        pass

Memory types are extensible through the database schema. Add new types by:
- Updating the CHECK constraint in the memories table
- Creating corresponding junction tables
- Implementing storage/retrieval logic
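For example, a hypothetical "episodic" type could be added like this. Table and column names follow the schema shown earlier, but the exact DDL is illustrative and the new type is invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Recreate the memories table with the new type in its CHECK constraint.
conn.execute("""
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        prompt TEXT, response TEXT,
        memory_type TEXT CHECK (memory_type IN
            ('short_term', 'long_term', 'user', 'session', 'agent', 'episodic'))
    )
""")
# Corresponding junction table for the new type (column names hypothetical).
conn.execute("""
    CREATE TABLE episodic_memories (
        memory_id INTEGER REFERENCES memories(id),
        episode_label TEXT
    )
""")
conn.execute(
    "INSERT INTO memories (prompt, response, memory_type) VALUES (?, ?, ?)",
    ("hi", "hello", "episodic"),
)
```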
- Indexing: Use FAISS or specialized vector databases for large datasets
- Dimensionality Reduction: Consider PCA or quantization for reduced storage
- Caching: Implement embedding caching for frequently queried items
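Embedding caching is the cheapest of these wins. A minimal sketch using the standard library (the wrapper name and shape are ours, not MindForge's API):

```python
from functools import lru_cache

def cached_embedder(embed_fn, maxsize=1024):
    """Wrap an embedding function with an LRU cache so repeated queries
    skip the slow (and possibly paid) model call. Illustrative sketch."""
    @lru_cache(maxsize=maxsize)
    def _cached(text):
        return tuple(embed_fn(text))  # tuples are hashable and cacheable
    return _cached

calls = []
def slow_embed(text):
    calls.append(text)  # stands in for a real model/API call
    return [float(len(text))]

embed = cached_embedder(slow_embed)
```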
- Horizontal Scaling: Use Redis or PostgreSQL for distributed deployments
- Vertical Scaling: Increase vector index size and cache
- Sharding: Partition memories by user_id or time ranges
- Cleanup: Implement periodic cleanup of old short-term memories
- Archival: Move old memories to cold storage
- Compression: Use vector quantization for storage efficiency
- API Keys: Never hardcode; use environment variables
- User Isolation: Ensure user_id filtering is enforced
- SQL Injection: Use parameterized queries (already implemented)
- Rate Limiting: Implement at application level
- Data Privacy: Consider encryption at rest for sensitive memories
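The injection and isolation points go together: every read should be a parameterized query scoped to one user_id. A minimal sketch (table name from the schema above; the helper function is ours):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_memories (user_id TEXT, memory TEXT)")
conn.executemany("INSERT INTO user_memories VALUES (?, ?)",
                 [("alice", "likes tea"), ("bob", "likes coffee")])

def memories_for(user_id):
    # Parameterized query: user input is never spliced into the SQL string,
    # and every read is scoped to a single user_id.
    rows = conn.execute(
        "SELECT memory FROM user_memories WHERE user_id = ?", (user_id,)
    )
    return [r[0] for r in rows]
```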
MindForge includes comprehensive tests:
- Unit Tests: Individual component testing
- Integration Tests: End-to-end workflows
- Storage Tests: Database operations
- Model Tests: Mock LLM responses for deterministic testing
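A deterministic model test might stub the chat model as below. The class and test names are hypothetical, not MindForge's actual fixtures; only the method signatures mirror the model interface above.

```python
class MockChatModel:
    """Returns canned output so tests never hit a real LLM."""

    def __init__(self, canned_response, canned_concepts):
        self.canned_response = canned_response
        self.canned_concepts = canned_concepts

    def generate_response(self, context, query):
        return self.canned_response

    def extract_concepts(self, text):
        return self.canned_concepts

def test_concepts_are_extracted():
    model = MockChatModel("ok", ["tea", "beverage"])
    assert model.extract_concepts("I like tea") == ["tea", "beverage"]

def test_response_is_deterministic():
    model = MockChatModel("ok", [])
    assert model.generate_response("ctx", "q") == model.generate_response("ctx", "q")
```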
Run tests with:
pytest tests/

Potential areas for extension:
- Multi-modal Memory: Support for images, audio
- Federated Learning: Distributed memory updates
- Memory Compression: Summarization of old memories
- Active Learning: Smart memory prioritization
- Memory Visualization: Graph visualization tools
- Cross-lingual Memory: Multi-language support