Provides the foundational building blocks for the entire solution:
ServiceFactory<T> — Abstract generic factory base class implementing the strategy/factory pattern with hash-based singleton caching and transient scoping.
HashHelper — SHA-256 hashing with YAML serialization for producing deterministic cache keys from complex objects.
ConfigLoader — YAML/JSON configuration loading with environment variable substitution (${ENV_VAR}) and .env file support.
ServiceScope — Enum controlling factory instance lifetime: Transient (new each time) or Singleton (cached by hash key).
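To make the scoping behaviour concrete, here is a minimal sketch of a factory that caches Singleton instances under a hash of their settings. Every name ending in `Sketch`, the `Greeter` types, and the `Register`/`Create` signatures are hypothetical, and JSON with sorted keys stands in for the YAML serialization that HashHelper is described as using, so that the example needs no external package.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Usage: two Singleton requests with equal settings share one instance.
var factory = new GreeterFactory();
var settings = new Dictionary<string, object> { ["name"] = "world" };
var a = factory.Create("simple", settings, ServiceScope.Singleton);
var b = factory.Create("simple", settings, ServiceScope.Singleton);
var c = factory.Create("simple", settings, ServiceScope.Transient);
Console.WriteLine(ReferenceEquals(a, b)); // True: same cached instance
Console.WriteLine(ReferenceEquals(a, c)); // False: transient is always new

public enum ServiceScope { Transient, Singleton }

public static class HashSketch
{
    // Assumption: sorted-key JSON replaces YAML as the canonical form.
    public static string Key(string strategy, IDictionary<string, object> settings)
    {
        var sorted = new SortedDictionary<string, object>(settings, StringComparer.Ordinal);
        string payload = strategy + "\n" + JsonSerializer.Serialize(sorted);
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
        return Convert.ToHexString(digest);
    }
}

// Hypothetical stand-in for ServiceFactory<T>; the real signatures may differ.
public abstract class ServiceFactorySketch<T> where T : class
{
    private readonly Dictionary<string, Func<IDictionary<string, object>, T>> _builders = new();
    private readonly Dictionary<string, T> _singletons = new();

    public void Register(string strategy, Func<IDictionary<string, object>, T> builder)
        => _builders[strategy] = builder;

    public T Create(string strategy, IDictionary<string, object> settings, ServiceScope scope)
    {
        if (!_builders.TryGetValue(strategy, out var builder))
            throw new InvalidOperationException($"Unknown strategy '{strategy}'.");
        if (scope == ServiceScope.Transient) return builder(settings);

        // Singleton: reuse the instance previously built for the same hash key.
        string key = HashSketch.Key(strategy, settings);
        if (!_singletons.TryGetValue(key, out var instance))
        {
            instance = builder(settings);
            _singletons[key] = instance;
        }
        return instance;
    }
}

public sealed class Greeter
{
    public Greeter(string name) => Name = name;
    public string Name { get; }
}

public sealed class GreeterFactory : ServiceFactorySketch<Greeter>
{
    public GreeterFactory() => Register("simple", s => new Greeter((string)s["name"]));
}
```

Hashing the serialized settings rather than comparing object references means two separately constructed but equal configurations resolve to the same cached instance.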
Abstracts file/blob/memory storage behind a uniform interface
Purpose
Provides a pluggable storage abstraction for reading/writing data by key. The IStorage interface supports file finding via regex, async get/set/has/delete/clear operations, hierarchical child storage, and creation date retrieval. The Tables/ subsystem adds row-oriented access for Parquet and CSV formats.
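The shape of such an abstraction can be illustrated with a reduced, in-memory sketch. `IStorageSketch` and `MemoryStorageSketch` are hypothetical names, the method signatures are assumptions rather than the actual IStorage members, and clear and creation-date operations are omitted for brevity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

var root = new MemoryStorageSketch();
var input = root.Child("input");
await input.SetAsync("docs/a.csv", "id,text");
await input.SetAsync("docs/b.txt", "hello");

// Child keys live under the parent's prefix.
Console.WriteLine(await root.HasAsync("input/docs/a.csv"));
// Regex-based file finding, relative to the child storage.
Console.WriteLine(string.Join(",", input.Find(new Regex(@".*\.csv$"))));

public interface IStorageSketch
{
    IEnumerable<string> Find(Regex pattern);
    Task<string?> GetAsync(string key);
    Task SetAsync(string key, string value);
    Task<bool> HasAsync(string key);
    Task DeleteAsync(string key);
    IStorageSketch Child(string name);
}

public sealed class MemoryStorageSketch : IStorageSketch
{
    private readonly ConcurrentDictionary<string, string> _items;
    private readonly string _prefix;

    public MemoryStorageSketch() : this(new ConcurrentDictionary<string, string>(), "") { }

    private MemoryStorageSketch(ConcurrentDictionary<string, string> items, string prefix)
    {
        _items = items;
        _prefix = prefix;
    }

    private string Full(string key) => _prefix + key;

    public IEnumerable<string> Find(Regex pattern) =>
        _items.Keys
            .Where(k => k.StartsWith(_prefix, StringComparison.Ordinal))
            .Select(k => k.Substring(_prefix.Length))
            .Where(k => pattern.IsMatch(k))
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();

    public Task<string?> GetAsync(string key) =>
        Task.FromResult<string?>(_items.TryGetValue(Full(key), out var value) ? value : null);

    public Task SetAsync(string key, string value)
    {
        _items[Full(key)] = value;
        return Task.CompletedTask;
    }

    public Task<bool> HasAsync(string key) => Task.FromResult(_items.ContainsKey(Full(key)));

    public Task DeleteAsync(string key)
    {
        _items.TryRemove(Full(key), out _);
        return Task.CompletedTask;
    }

    // A child shares the backing store but scopes every key under "<name>/".
    public IStorageSketch Child(string name) =>
        new MemoryStorageSketch(_items, _prefix + name + "/");
}
```

Because callers depend only on the interface, a file- or blob-backed implementation could be swapped in without touching the code that reads and writes keys.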
Caches expensive LLM responses to avoid redundant API calls
Purpose
Provides a caching layer with three strategies: JSON-file persistence (backed by any IStorage), in-memory dict, and no-op passthrough. CacheKeyHelper produces deterministic hash keys from request arguments for cache lookup.
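As an illustration of how a deterministic key lets a second identical request skip the model call, consider the sketch below. The interface, the two cache classes, and `CacheKeySketch` are hypothetical; the real CacheKeyHelper may hash different fields or use a different serialization. The JSON-file strategy is left out because it would need a storage backend.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

int calls = 0;
// Stand-in for an expensive LLM request.
Task<string> FakeLlm(string prompt) { calls++; return Task.FromResult("echo: " + prompt); }

ICacheSketch cache = new MemoryCacheSketch();
var args = new Dictionary<string, string> { ["model"] = "demo-model", ["prompt"] = "hi" };

for (int i = 0; i < 2; i++)
{
    string key = CacheKeySketch.From(args);
    string? answer = await cache.GetAsync(key);
    if (answer is null)
    {
        answer = await FakeLlm(args["prompt"]);
        await cache.SetAsync(key, answer);
    }
}
Console.WriteLine(calls); // 1: the second iteration is served from the cache

public interface ICacheSketch
{
    Task<string?> GetAsync(string key);
    Task SetAsync(string key, string value);
}

public sealed class MemoryCacheSketch : ICacheSketch
{
    private readonly Dictionary<string, string> _entries = new();

    public Task<string?> GetAsync(string key) =>
        Task.FromResult<string?>(_entries.TryGetValue(key, out var value) ? value : null);

    public Task SetAsync(string key, string value)
    {
        _entries[key] = value;
        return Task.CompletedTask;
    }
}

// No-op passthrough: never stores anything, so every lookup misses.
public sealed class NoopCacheSketch : ICacheSketch
{
    public Task<string?> GetAsync(string key) => Task.FromResult<string?>(null);
    public Task SetAsync(string key, string value) => Task.CompletedTask;
}

public static class CacheKeySketch
{
    // Assumption: sorting the argument names makes the key order-independent.
    public static string From(IDictionary<string, string> args)
    {
        var sorted = new SortedDictionary<string, string>(args, StringComparer.Ordinal);
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(JsonSerializer.Serialize(sorted)));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Swapping `MemoryCacheSketch` for `NoopCacheSketch` would cause both iterations to call the model, which is the behaviour a passthrough cache is meant to provide.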
Splits long documents into overlapping chunks for LLM processing
Purpose
Implements text chunking strategies that break documents into token- or sentence-bounded chunks with configurable size and overlap. The MetadataTransformer prepends/appends key-value metadata to chunk text. Each chunk is represented as a TextChunk record carrying original text, transformed text, character offsets, and optional token count.
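The size-and-overlap mechanics can be sketched as follows. This is a simplified illustration, not the project's chunker: whitespace-separated words stand in for tokenizer tokens, and `TextChunkSketch` and `ChunkerSketch` are hypothetical names that keep only the original text, offsets, and token count.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string doc = "one two three four five six seven";
foreach (var chunk in ChunkerSketch.Split(doc, size: 3, overlap: 1))
    Console.WriteLine($"[{chunk.Start},{chunk.End}) {chunk.Text}");
// [0,13) one two three
// [8,23) three four five
// [19,33) five six seven

public sealed record TextChunkSketch(string Text, int Start, int End, int TokenCount);

public static class ChunkerSketch
{
    public static IEnumerable<TextChunkSketch> Split(string text, int size, int overlap)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (overlap < 0 || overlap >= size) throw new ArgumentOutOfRangeException(nameof(overlap));

        // Assumption: one "token" is one run of non-whitespace characters.
        MatchCollection tokens = Regex.Matches(text, @"\S+");
        int step = size - overlap; // how far the window advances each time

        for (int first = 0; first < tokens.Count; first += step)
        {
            int last = Math.Min(first + size, tokens.Count) - 1;
            int start = tokens[first].Index;
            int end = tokens[last].Index + tokens[last].Length;
            yield return new TextChunkSketch(
                text.Substring(start, end - start), start, end, last - first + 1);

            // Stop once the window has reached the final token.
            if (last == tokens.Count - 1) yield break;
        }
    }
}
```

Each chunk repeats the last `overlap` tokens of its predecessor, so context that straddles a boundary appears in both chunks.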
Reads input documents from storage in various formats
Purpose
Provides format-specific file readers that load documents from an IStorage backend and produce TextDocument records. Each reader handles pattern-matched files (e.g., .*\.csv$), extracts text/title/ID, and generates SHA-512 hash IDs when no ID column is specified.
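The two mechanical steps, pattern matching and fallback ID generation, can be sketched briefly. `DocumentIdSketch` is a hypothetical helper, and hashing only the document text is an assumption; the actual readers may hash a different combination of fields.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Select only the storage keys that a CSV reader would handle.
var pattern = new Regex(@".*\.csv$");
string[] keys = { "input/a.csv", "input/notes.txt", "input/b.csv" };
string[] matched = keys.Where(k => pattern.IsMatch(k)).ToArray();
Console.WriteLine(string.Join(", ", matched)); // input/a.csv, input/b.csv

// Derive an ID when the source provides none.
string id = DocumentIdSketch.FromText("Some document text");
Console.WriteLine(id.Length); // 128: SHA-512 is 64 bytes, two hex characters each

public static class DocumentIdSketch
{
    public static string FromText(string text)
    {
        byte[] digest = SHA512.HashData(Encoding.UTF8.GetBytes(text));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

A content-derived ID is stable across runs, so re-reading an unchanged file yields the same document ID.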
Full LLM orchestration layer with pluggable middleware
Purpose
The largest supporting package. Provides abstractions for LLM completions, embeddings, tokenization, rate limiting, retries, metrics, and templating. The middleware pipeline chains cross-cutting concerns (logging → metrics → rate limiting → retries → cache) around base LLM calls. Nine factory classes create all component types from configuration.
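The chaining idea can be shown with a minimal sketch in which each middleware wraps the next call. The `LlmCall` and `Middleware` delegates and `PipelineSketch.Compose` are hypothetical and reduced to string prompts; only logging and retries are shown, and the package's real abstractions are likely richer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();
int attempts = 0;

// Base call: fails once, then succeeds, so the retry layer has work to do.
LlmCall baseCall = prompt =>
{
    attempts++;
    if (attempts == 1) throw new InvalidOperationException("transient failure");
    return Task.FromResult("echo: " + prompt);
};

Middleware logging = next => async prompt =>
{
    log.Add("request");
    string result = await next(prompt);
    log.Add("response");
    return result;
};

Middleware retries = next => async prompt =>
{
    for (int attempt = 1; ; attempt++)
    {
        try { return await next(prompt); }
        // After the third failed attempt the exception propagates.
        catch (InvalidOperationException) when (attempt < 3) { log.Add("retry"); }
    }
};

LlmCall pipeline = PipelineSketch.Compose(baseCall, logging, retries);
Console.WriteLine(await pipeline("hi"));        // echo: hi
Console.WriteLine(string.Join(" > ", log));     // request > retry > response

public delegate Task<string> LlmCall(string prompt);
public delegate LlmCall Middleware(LlmCall next);

public static class PipelineSketch
{
    // The first middleware listed becomes the outermost wrapper.
    public static LlmCall Compose(LlmCall inner, params Middleware[] outerToInner)
    {
        LlmCall call = inner;
        for (int i = outerToInner.Length - 1; i >= 0; i--)
            call = outerToInner[i](call);
        return call;
    }
}
```

Ordering matters: with logging outermost, a retried request is logged once, whereas placing it inside the retry layer would log every attempt.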
Abstracts vector similarity search with a rich filtering system
Purpose
Provides IVectorStore for vector database operations (connect, create index, insert, similarity search by vector/text, search by ID, count, remove, update). Includes a composable filtering DSL with Condition, AndExpression, OrExpression, NotExpression, and the fluent FieldRef builder. TimestampHelper expands ISO 8601 timestamps into separate year/month/day/hour fields so that the vector store can index and filter on them.
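A small sketch may help show how such a filter composes and how exploded timestamp fields feed into it. The type names mirror those listed above, but their members (`Eq`, `Gte`, `Lt`, `Matches`, the string operator codes) and `TimestampSketch` are assumptions made for illustration, and filters are evaluated in memory here rather than translated to a vector store query.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

var row = TimestampSketch.Explode("2024-03-15T09:30:00Z");
row["source"] = "news";

FilterExpression filter = new FieldRef("year").Eq(2024)
    .And(new FieldRef("month").Gte(3))
    .And(new FieldRef("source").Eq("blog").Not());

Console.WriteLine(filter.Matches(row)); // True

public abstract record FilterExpression
{
    public abstract bool Matches(IReadOnlyDictionary<string, object> row);
    public FilterExpression And(FilterExpression other) => new AndExpression(this, other);
    public FilterExpression Or(FilterExpression other) => new OrExpression(this, other);
    public FilterExpression Not() => new NotExpression(this);
}

public sealed record Condition(string Field, string Op, IComparable Value) : FilterExpression
{
    public override bool Matches(IReadOnlyDictionary<string, object> row)
    {
        if (!row.TryGetValue(Field, out var actual)) return false;
        // Assumption: the stored value and the filter value have the same type.
        int cmp = ((IComparable)actual).CompareTo(Value);
        return Op switch
        {
            "eq" => cmp == 0,
            "gte" => cmp >= 0,
            "lt" => cmp < 0,
            _ => throw new NotSupportedException(Op),
        };
    }
}

public sealed record AndExpression(FilterExpression Left, FilterExpression Right) : FilterExpression
{
    public override bool Matches(IReadOnlyDictionary<string, object> row) =>
        Left.Matches(row) && Right.Matches(row);
}

public sealed record OrExpression(FilterExpression Left, FilterExpression Right) : FilterExpression
{
    public override bool Matches(IReadOnlyDictionary<string, object> row) =>
        Left.Matches(row) || Right.Matches(row);
}

public sealed record NotExpression(FilterExpression Inner) : FilterExpression
{
    public override bool Matches(IReadOnlyDictionary<string, object> row) => !Inner.Matches(row);
}

// Fluent entry point: new FieldRef("year").Eq(2024)
public sealed record FieldRef(string Name)
{
    public Condition Eq(IComparable value) => new(Name, "eq", value);
    public Condition Gte(IComparable value) => new(Name, "gte", value);
    public Condition Lt(IComparable value) => new(Name, "lt", value);
}

public static class TimestampSketch
{
    // Splits a timestamp into integer fields, normalized to UTC.
    public static Dictionary<string, object> Explode(string iso)
    {
        DateTimeOffset t = DateTimeOffset
            .Parse(iso, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal)
            .ToUniversalTime();
        return new Dictionary<string, object>
        {
            ["year"] = t.Year, ["month"] = t.Month, ["day"] = t.Day, ["hour"] = t.Hour,
        };
    }
}
```

Storing the date parts as separate integer fields lets a store that only supports equality and range comparisons on scalars answer questions such as "everything from March 2024 onward".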
Entry point: CLI, pipeline orchestration, query engine, data model, config
Purpose
The main application assembling all libraries into a working system. Contains the data model, configuration, indexing pipeline, query engine, prompt tuning, CLI commands, and embedded prompt templates.
Subsystems
DataModel/
Core graph entities as sealed records with Identified → Named inheritance hierarchy.
Full configuration system: 16 config models, 6 enum classes, GraphRagConfig master record, DefaultValues constants. Matches every default value from the Python original.
Index/ (Indexing Pipeline)
| Component | Purpose |
| --- | --- |
| Pipeline | Ordered list of (name, WorkflowFunction) tuples |
| PipelineRunner | Async executor yielding PipelineRunResult per workflow |
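The relationship between the two components can be sketched with an async stream. The delegate signature, the `PipelineRunResultSketch` fields, the shared context dictionary, and the stop-on-first-error policy are all assumptions made for this illustration; the real PipelineRunner may behave differently.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

WorkflowFunction load = ctx => { ctx["docs"] = 3; return Task.CompletedTask; };
WorkflowFunction fail = ctx => throw new InvalidOperationException("boom");
WorkflowFunction skipped = ctx => Task.CompletedTask;

// A pipeline is an ordered list of (name, workflow) tuples.
var pipeline = new List<(string Name, WorkflowFunction Run)>
{
    ("load_documents", load),
    ("fail_on_purpose", fail),
    ("never_reached", skipped),
};

await foreach (var result in PipelineRunnerSketch.RunAsync(pipeline))
    Console.WriteLine($"{result.Workflow}: {(result.Error is null ? "ok" : result.Error.Message)}");
// load_documents: ok
// fail_on_purpose: boom

public delegate Task WorkflowFunction(Dictionary<string, object> context);

public sealed record PipelineRunResultSketch(string Workflow, TimeSpan Elapsed, Exception? Error);

public static class PipelineRunnerSketch
{
    public static async IAsyncEnumerable<PipelineRunResultSketch> RunAsync(
        IEnumerable<(string Name, WorkflowFunction Run)> pipeline)
    {
        var context = new Dictionary<string, object>(); // shared across workflows
        foreach (var (name, run) in pipeline)
        {
            var timer = Stopwatch.StartNew();
            Exception? error = null;
            try { await run(context); }
            catch (Exception ex) { error = ex; }

            // One result is yielded per workflow, as soon as it finishes.
            yield return new PipelineRunResultSketch(name, timer.Elapsed, error);

            // Assumption: a failed workflow halts the remaining ones.
            if (error is not null) yield break;
        }
    }
}
```

Yielding results one at a time lets a caller such as a CLI report progress while later workflows are still running.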
System.CommandLine subcommands: init, index, query, prompt-tune via RootCommandBuilder.
Prompts/
10 embedded .txt templates (extraction, summarization, community reporting, search).
Test Projects
GraphRag.Tests.Unit
| Attribute | Value |
| --- | --- |
| Output | Test library |
| Tests | 181 |
| References | All 8 src projects |
| Purpose | Unit tests for every source project: isolated class tests with mocks |
GraphRag.Tests.Integration
| Attribute | Value |
| --- | --- |
| Output | Test library |
| Tests | 5 |
| References | All 8 src projects |
| Purpose | Cross-cutting tests: FileStorage round-trips, JsonCache+FileStorage persistence, Pipeline execution with stub workflows, TokenChunker on large documents |