Git-Inspired Intelligence for Testimony Reconstruction.
"Memory is fragmented, emotional, and non-linear. Narrative Merge Engine is the version control for human testimony."
The Narrative Merge Engine is an async-first backend that reconstructs complex event timelines from fragmented witness testimonies. It combines LLM orchestration with a layered, clean-architecture codebase and treats human memory like a distributed version control system: it identifies conflicts, resolves temporal ambiguities, and merges divergent accounts into a single, unified narrative.
- Decomposition: Breaking down non-linear, multi-lingual (Hindi/English/Hinglish) prose into atomic events.
- Conflict Detection: Automatically identifying contradictions in timing, location, or actor descriptions across different witnesses.
- Uncertainty Tracking: Explicitly modeling "maybe", "around", and "I think" to prevent false precision.
- Temporal Alignment: Resolving relative markers (e.g., "baad mein", "after the noise") into a global timeline.
- Investigation Assistance: Generating clarifying questions that help resolve detected contradictions and close narrative gaps.
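As an illustration of the temporal alignment idea listed above, relative markers can be viewed as "X happened before Y" constraints that are then ordered globally. The sketch below is a minimal example under that assumption; `order_events` and the event names are hypothetical and not part of the project's API.

```python
from graphlib import CycleError, TopologicalSorter


def order_events(before_pairs):
    """Return event names in an order consistent with (earlier, later) pairs.

    Raises ValueError when the constraints contradict each other (a cycle).
    """
    sorter = TopologicalSorter()
    for earlier, later in before_pairs:
        # `later` depends on `earlier`, so `earlier` is emitted first.
        sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"contradictory ordering: {exc.args[1]}") from exc


# Markers such as "baad mein" or "after the noise", already reduced
# to (earlier, later) pairs. Hypothetical data.
pairs = [("noise", "door_opens"), ("door_opens", "man_runs")]
timeline = order_events(pairs)
```

A cycle in the constraints (A before B and B before A) is itself useful information, since it points at a contradiction that a clarifying question could target.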
graph TD
subgraph INGESTION["1. Ingestion"]
T["Raw Testimony"] --> N["Normalisation"]
end
subgraph INTELLIGENCE["2. Extraction Intelligence"]
N --> LLM["Event Extraction v2 (LLM)"]
LLM --> P["4-Strategy Parser"]
P --> V["Pydantic Validation"]
end
subgraph RECONSTRUCTION["3. Timeline & Logic"]
V --> TR["Timeline Reconstruction"]
TR --> CD["Conflict Detection"]
CD --> QG["Question Generation"]
end
subgraph PERSISTENCE["4. Data Layer"]
TR --> DB["PostgreSQL (Supabase)"]
CD --> DB
end
classDef core fill:#f9f,stroke:#333,stroke-width:2px;
class INTELLIGENCE core;
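The diagram names a "4-Strategy Parser" without listing its strategies. Purely as an illustration of what a fallback chain for messy LLM output might look like, the sketch below tries four progressively more forgiving ways of recovering JSON. The four strategies chosen here and the `parse_llm_json` name are assumptions, not the project's actual parser.

```python
import json
import re

# Built indirectly so this example never contains a literal Markdown fence.
FENCE = "`" * 3


def _candidates(raw):
    """Yield progressively more forgiving candidate JSON strings."""
    text = raw.strip()
    yield text  # 1. the reply is already plain JSON
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        yield fenced.group(1).strip()  # 2. JSON wrapped in a Markdown fence
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    start = min(starts, default=-1)
    end = max(text.rfind("]"), text.rfind("}"))
    if start != -1 and end > start:
        sliced = text[start:end + 1]
        yield sliced  # 3. outermost bracketed span inside surrounding prose
        # 4. same span with trailing commas before a closing bracket removed
        yield re.sub(r",\s*([\]}])", r"\1", sliced)


def parse_llm_json(raw):
    """Return the first candidate that parses as JSON, or None."""
    for candidate in _candidates(raw):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None


events = parse_llm_json('Sure! [{"event": "door opens"},] Hope that helps.')
```

Whatever the parser returns would still go through schema validation afterwards, as the diagram shows; the trailing-comma repair in particular is a blunt heuristic that could alter string values containing `,]`.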
| Component | Technology | Rationale |
|---|---|---|
| Framework | FastAPI | Async-first, high performance, native OpenAPI. |
| Database | PostgreSQL + SQLAlchemy 2.0 | Robust relational model with full async support via asyncpg. |
| Orchestration | Tenacity | Exponential backoff and retry for resilient AI calls. |
| Intelligence | LLM Orchestrator | Provider-agnostic abstraction for OpenAI, Anthropic, or Gemini. |
| Validation | Pydantic v2 | Strict typing and data integrity for LLM outputs. |
| Logging | Structlog | Structured, searchable logs for production observability. |
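The retry behaviour attributed to Tenacity in the table can be pictured with a small standard-library stand-in. In Tenacity itself this is usually expressed with `retry`, `stop_after_attempt`, and `wait_exponential`; the function below is not Tenacity and not the project's configuration, only a sketch of exponential backoff with hypothetical names and default values.

```python
import asyncio
import random


async def call_with_backoff(func, *, attempts=4, base_delay=0.5,
                            max_delay=8.0, sleep=asyncio.sleep):
    """Await func(); on failure wait base_delay * 2**n (capped, jittered), then retry."""
    for attempt in range(attempts):
        try:
            return await func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little jitter avoids many clients retrying in lockstep.
            await sleep(delay + random.uniform(0, delay * 0.1))


async def _demo():
    calls = {"n": 0}

    async def flaky_llm_call():
        # Simulated provider call: fails twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TimeoutError("simulated provider timeout")
        return "ok"

    async def no_sleep(_delay):
        pass  # skip real waiting in the demo

    result = await call_with_backoff(flaky_llm_call, sleep=no_sleep)
    return result, calls["n"]


result, n_calls = asyncio.run(_demo())
```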
The EventExtractionService serves as the core intelligence layer, managing:
- Testimony Chunking: Overlapping windows for long-form testimonies to ensure data continuity.
- Multi-lingual Context: Prompts tailored to code-switched input (e.g., Hindi-English).
- Source Provenance: Verification of every extracted event against original text via fuzzy matching to mitigate hallucination risks.
- Resilient Processing: Graceful degradation logic that rescues valid events from partial LLM failure and triggers targeted retries.
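Two of the responsibilities above, overlapping chunking and provenance checks, can be sketched with the standard library alone. This is a minimal illustration, assuming character-based windows and `difflib` similarity; the function names, window sizes, and the 0.8 threshold are hypothetical and may differ from what `EventExtractionService` actually does.

```python
from difflib import SequenceMatcher


def chunk_text(text, size=400, overlap=80):
    """Split text into windows of `size` characters sharing `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def is_grounded(quote, source, threshold=0.8):
    """Return True if some window of `source` approximately matches `quote`."""
    quote, source = quote.lower(), source.lower()
    if not quote:
        return False
    if quote in source:
        return True  # exact match, no fuzzy comparison needed
    window = len(quote)
    best = 0.0
    for start in range(max(1, len(source) - window + 1)):
        candidate = source[start:start + window]
        best = max(best, SequenceMatcher(None, quote, candidate).ratio())
    return best >= threshold


testimony = "I think he left around 9, after the noise."
chunks = chunk_text("abcdefghij", size=4, overlap=2)
grounded = is_grounded("he left arround 9", testimony)      # typo in the quote
ungrounded = is_grounded("she drove a red car", testimony)  # not in the text
```

The sliding comparison is quadratic in the worst case, so a production version would likely compare against the originating chunk only rather than the full testimony.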
Temporal data is managed via a six-category uncertainty framework rather than fixed timestamps:
- hedged: Indicates witness uncertainty (e.g., "I think", "shayaad")
- approximate: Indicates estimated values (e.g., "around 9", "lagbhag")
- relative: Signals temporal dependency (e.g., "after that", "later")
- missing: Indicates absence of temporal data
- conflicting: Marks internal contradictions within a single account
- none: Reserved for clear, unambiguous statements
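The six categories map naturally onto an enumeration. The sketch below pairs such an enum with a deliberately naive keyword classifier; the cue lists, their priority order, and the `classify_time_phrase` name are assumptions for illustration. The `conflicting` category needs a comparison across statements, so this single-phrase classifier never returns it.

```python
import re
from enum import Enum
from typing import Optional


class Uncertainty(str, Enum):
    HEDGED = "hedged"
    APPROXIMATE = "approximate"
    RELATIVE = "relative"
    MISSING = "missing"
    CONFLICTING = "conflicting"
    NONE = "none"


# Hypothetical cue lists, checked in this order of priority.
_CUES = [
    (Uncertainty.HEDGED, ["i think", "maybe", "shayaad"]),
    (Uncertainty.APPROXIMATE, ["around", "about", "lagbhag"]),
    (Uncertainty.RELATIVE, ["after", "later", "before", "baad mein"]),
]


def classify_time_phrase(phrase: Optional[str]) -> Uncertainty:
    """Assign one uncertainty category to a single time expression."""
    if phrase is None or not phrase.strip():
        return Uncertainty.MISSING
    text = phrase.lower()
    for category, cues in _CUES:
        for cue in cues:
            # Word boundaries stop "after" from matching "afternoon".
            if re.search(r"\b" + re.escape(cue) + r"\b", text):
                return category
    return Uncertainty.NONE


label = classify_time_phrase("I think it was around 9")
```

Because hedging is checked first, "I think it was around 9" is labelled `hedged` rather than `approximate`; a richer model might record both signals.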
Extracted events are modelled as commits within a narrative branch. The system's Merge Engine identifies semantic conflicts where divergent testimonies provide mutually exclusive accounts of chronological events.
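To make the merge-conflict analogy concrete, the sketch below compares events that two witnesses attribute to the same occurrence and flags disagreements on location or time. It is a simplification under stated assumptions: the `Event` fields, the shared `key` that identifies "the same real-world event", and the 15-minute tolerance are all hypothetical, and real semantic conflict detection would need more than field equality.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    witness: str
    key: str       # hypothetical identifier for the same real-world event
    location: str
    minute: int    # minutes after midnight


def find_conflicts(events, tolerance_min=15):
    """Return (key, field, witness_a, witness_b) for each cross-witness disagreement."""
    conflicts = []
    for a, b in combinations(events, 2):
        if a.key != b.key or a.witness == b.witness:
            continue  # only compare different witnesses on the same event
        if a.location != b.location:
            conflicts.append((a.key, "location", a.witness, b.witness))
        if abs(a.minute - b.minute) > tolerance_min:
            conflicts.append((a.key, "time", a.witness, b.witness))
    return conflicts


events = [
    Event("A", "suspect_leaves", "front gate", 21 * 60),
    Event("B", "suspect_leaves", "back door", 21 * 60 + 40),
    Event("B", "noise_heard", "courtyard", 20 * 60 + 55),
]
conflicts = find_conflicts(events)
```

Each returned tuple is a natural seed for a clarifying question, for example asking both witnesses which exit the suspect used.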
app/
├── api/v1/ # Versioned endpoints (testimonies, events, timeline)
├── core/ # Configuration, Security, Logging, and AI Orchestration
├── db/ # Database connection and session management
├── models/ # Pydantic schemas and SQLAlchemy ORM models
├── repositories/ # Data access layer (async CRUD logic)
└── services/ # Core business logic (Extraction, Reconstruction, Conflicts)
tests/ # Comprehensive test suite (async HTTP clients + LLM mocks)
alembic/ # Database migration management
main.py # Application entry-point
- Python 3.11+
- Redis (optional, for caching)
- PostgreSQL (or Supabase)
# Clone the repository
git clone https://github.com/YourOrg/narrative-merge-engine.git
cd narrative-merge-engine
# Install dependencies via Poetry
poetry install
Generate a .env file from the provided template:
cp .env.example .env
# Configure DATABASE_URL and LLM_API_KEY in the .env file
# Apply database migrations
poetry run alembic upgrade head
# Start the development server
poetry run uvicorn app.main:app --reload
Interactive API documentation is available at http://localhost:8000/docs.
The repository includes a comprehensive asynchronous test suite.
# Execute all tests
poetry run pytest
# Generate coverage report
poetry run pytest --cov=app tests/
MIT License. Focused on forensic integrity and system reliability.