DocWeave is an enterprise-ready, agentic Retrieval-Augmented Generation (RAG) platform for document-heavy workflows, with a strong fit for regulated domains.
It helps teams ingest heterogeneous documents, retrieve grounded evidence, and deliver auditable answers through APIs and UI surfaces.
- End-to-end RAG workflow from ingestion to answer generation
- Multi-format document support: PDF, CSV/XLSX, JSON/JSONL, TXT/MD
- Hybrid retrieval (vector + BM25) with optional reranking
- Agentic orchestration for retrieval, grounding, and analysis
- Multi-tenant patterns for isolation and scale
- Production-oriented API, worker, observability, and runbook support
- Ingestion pipeline: parse -> normalize -> chunk -> embed -> index
- Vector backends:
pgvector,qdrant,chroma - LangGraph agent flow: route -> retrieve -> ground -> analyze -> answer
- FastAPI application layer for serving and integration
- Celery worker for asynchronous/background processing
- Security and operations guides for deployment hardening
- Quickstart: docs/QUICKSTART.md
- Architecture: docs/ARCHITECTURE.md
- API: docs/API.md
- Security: docs/SECURITY.md
- Runbooks: docs/RUNBOOKS.md
make dev
make runIn a separate terminal:
make workerOptional sample indexing:
make indexAvailable Make targets:
make dev- create virtual environment and install editable dependenciesmake run/make api- start the FastAPI app with reloadmake worker- start Celery worker queuesmake index- build sample index fromdata/samplesmake test- run test suitemake lint- run Ruff lint checksmake type- run MyPy type checksmake docker- run with Docker Compose
DocWeave includes deployment assets under deploy/ and docker/ for containerized and Kubernetes/Helm-based environments.
Before production rollout:
- Configure secrets and environment variables securely
- Validate tenant isolation and data boundaries
- Enable tracing/metrics and log redaction policies
- Review docs/SECURITY.md and docs/RUNBOOKS.md
This repository focuses on practical, production-minded RAG patterns with tutorial-level clarity for fast onboarding.
MIT - see LICENSE.