A production-grade multi-agent AI system for retail inventory management.
Built with LangGraph · CrewAI · MCP · RAG · FastAPI · Streamlit · Docker
During peak retail events (Ramadan, Dubai Shopping Festival, Eid), 200+ UAE retail stores face a critical operational bottleneck:
- Demand surges are unpredictable and event-driven
- Supplier lead times may not align with urgency
- Capital approval for large orders requires human sign-off
- Manual decisions are slow, inconsistent, and not auditable
Store managers spend hours on WhatsApp with suppliers, comparing spreadsheets, and escalating to finance, while stockouts happen and revenue is lost.
ORCA replaces this entire workflow with a 4-agent AI pipeline + one human decision.
Alert Triggered (stock critical/at-risk)
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│  Agent 1: Demand Intelligence (CrewAI crew)             │
│  • Event uplift analysis (Ramadan 2.8×, DSF 1.9×)       │
│  • Supplier constraint discovery                        │
│  • Confidence scoring + demand forecasting              │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│  Agent 2: Replenishment Options                         │
│  • Option A: Standard Replenishment                     │
│  • Option B: Profit Maximisation (Tier-1 stores)        │
│  • Option C: Expedite Air Freight                       │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│  Agent 3: Capital Allocation & Scoring                  │
│  budget_score       = (1 - cost/budget) × 40            │
│  availability_score = availability_pct × 0.40 × 100     │
│  margin_score       = (1/margin_rank) × 20              │
│  lead_time_penalty  = -20 if CRITICAL & lead > 30d      │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│  Route Decision Node                                    │
│  • AUTO_EXECUTE → cost < pool auto-approve limit        │
│  • ESCALATE → cost > limit → human required             │
│  • SUSPEND → pool pressure HIGH                         │
└────────────────────┬────────────────────────────────────┘
                     │
              ┌──────┴──────┐
              ▼             ▼
         Human HITL   Auto Execute
         (Approve /   (reorder_triggered
          Reject)      = Yes → DB)
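The Agent 3 scoring terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the diagram lists the four terms but not how they are combined, so the sketch simply sums them; `availability_pct` is assumed to be a fraction between 0 and 1; and the `ReplenishmentOption` class and `score_option` function are hypothetical names, not taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class ReplenishmentOption:
    """Hypothetical container for one replenishment option."""
    name: str
    cost: float              # total order cost
    availability_pct: float  # assumed to be a fraction in [0, 1]
    margin_rank: int         # 1 = best margin among the options
    lead_time_days: int


def score_option(option: ReplenishmentOption, budget: float, is_critical: bool) -> float:
    """Apply the four scoring terms and sum them (the sum is an assumption)."""
    budget_score = (1 - option.cost / budget) * 40
    availability_score = option.availability_pct * 0.40 * 100
    margin_score = (1 / option.margin_rank) * 20
    # The penalty applies only to CRITICAL alerts with a lead time above 30 days.
    lead_time_penalty = -20 if is_critical and option.lead_time_days > 30 else 0
    return budget_score + availability_score + margin_score + lead_time_penalty


# Made-up numbers, for illustration only.
example = ReplenishmentOption(
    name="Option A: Standard Replenishment",
    cost=50_000,
    availability_pct=0.9,
    margin_rank=2,
    lead_time_days=35,
)
example_score = score_option(example, budget=100_000, is_critical=True)
print(round(example_score, 2))
```

With these inputs the terms are 20 (budget), 36 (availability), 10 (margin) and -20 (lead-time penalty).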
┌─────────────────────────────────────────────────────────────┐
│                     Streamlit Dashboard                     │
│         (Command Centre / Pipeline Monitor / HITL)          │
└─────────────────────────┬───────────────────────────────────┘
                          │ HTTP (httpx)
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                        FastAPI Layer                        │
│  POST /pipeline/run           → 202 + background task       │
│  GET  /pipeline/{id}/state    → polling endpoint            │
│  POST /pipeline/{id}/approve  → HITL resume                 │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                     LangGraph Pipeline                      │
│  agent1_node → agent2_node → agent3_node → route_node       │
│       ↓             ↓             ↓             ↓           │
│  CrewAI crew   Options Gen   Scoring       ESCALATE/AUTO_   │
│  (3 AI agents) (3 options)   (formula)     EXECUTE/SUSPEND  │
└──────────┬─────────────────────┬────────────────────────────┘
           │                     │
           ▼                     ▼
┌─────────────────┐   ┌─────────────────────┐
│   MCP Server    │   │    RAG Pipeline     │
│ (tool discovery │   │   ChromaDB + BGE    │
│  via stdio)     │   │      Reranker       │
└─────────────────┘   └─────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────────┐
│                       SQLite Database                       │
│       skus · stores · stock_positions · capital_pools       │
│            pipeline_log · supplier_data · events            │
└─────────────────────────────────────────────────────────────┘
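The "202 + background task" flow in the FastAPI layer can be illustrated without any web framework. The sketch below uses only the standard library and an in-memory dictionary; the function names, the state shape and the stage names are assumptions for illustration, and the real service would use FastAPI and LangGraph rather than a bare thread.

```python
import threading
import uuid

# In-memory stand-in for wherever the real service keeps pipeline state.
_pipelines = {}
_lock = threading.Lock()

STAGES = ["agent1", "agent2", "agent3", "route"]  # assumed stage names


def _run_pipeline(pipeline_id):
    """Background work: mark each stage as completed, then finish."""
    for stage in STAGES:
        # A real node would call the agent here.
        with _lock:
            _pipelines[pipeline_id]["completed"].append(stage)
    with _lock:
        _pipelines[pipeline_id]["status"] = "done"


def submit_pipeline(sku_id):
    """Mimics POST /pipeline/run: register the run and return immediately."""
    pipeline_id = uuid.uuid4().hex
    with _lock:
        _pipelines[pipeline_id] = {"sku_id": sku_id, "status": "running", "completed": []}
    worker = threading.Thread(target=_run_pipeline, args=(pipeline_id,), daemon=True)
    worker.start()
    # The worker is returned only so the demo below can wait for it.
    return 202, {"pipeline_id": pipeline_id}, worker


def get_state(pipeline_id):
    """Mimics GET /pipeline/{id}/state: return a snapshot of progress."""
    with _lock:
        state = _pipelines[pipeline_id]
        return {"status": state["status"], "completed": list(state["completed"])}


status_code, body, worker = submit_pipeline("SKU-001")
worker.join()  # demo only; a real client would poll get_state instead
final_state = get_state(body["pipeline_id"])
print(status_code, final_state)
```

The point of the pattern is that the caller gets an identifier back straight away and reads progress separately, so a slow LLM call never blocks the HTTP request.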
| Feature | Description |
|---|---|
| Multi-Agent Pipeline | 4 specialised LangGraph agents, each with a single responsibility |
| CrewAI Integration | 3-agent crew (Data Analyst, Market Analyst, Forecast Strategist) inside Agent 1 |
| MCP Tool Discovery | Dynamic tool registration via Model Context Protocol, with no hardcoded calls |
| RAG Policy Retrieval | BGE reranker + ChromaDB for policy-grounded decisions |
| HITL Approval Workflow | LangGraph interrupt → human reviews briefing → approve/reject |
| Async FastAPI | 202 pattern: the pipeline runs as a background task and the client polls state |
| Industrial Dashboard | Dark-theme Streamlit UI with Command Centre / Pipeline Monitor / HITL tabs |
| Docker + Render | Fully containerised, deployed on Render free tier |
| Audit Trail | Every decision logged with reviewer, timestamp, and action taken |
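To make "dynamic tool registration, no hardcoded calls" concrete, here is a plain-Python sketch of the idea. It does not use the MCP SDK and is not the project's server; it only mirrors the list-then-call shape of tool discovery. The registry, the decorator and the `get_stock_position` tool (with its dummy return value) are all hypothetical.

```python
# Plain-Python illustration of list-then-call tool discovery (not the MCP SDK).
TOOL_REGISTRY = {}


def tool(description):
    """Register a function as a discoverable tool under its own name."""
    def decorator(func):
        TOOL_REGISTRY[func.__name__] = {"description": description, "handler": func}
        return func
    return decorator


@tool("Return the current stock position for a SKU (hypothetical tool)")
def get_stock_position(sku_id):
    # Dummy value for illustration; a real tool would query the database.
    return {"sku_id": sku_id, "on_hand": 120}


def list_tools():
    """What a client sees when it asks which tools are available."""
    return [
        {"name": name, "description": meta["description"]}
        for name, meta in sorted(TOOL_REGISTRY.items())
    ]


def call_tool(name, **arguments):
    """Invoke a tool by the name the client discovered at runtime."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return TOOL_REGISTRY[name]["handler"](**arguments)


available = list_tools()
result = call_tool(available[0]["name"], sku_id="SKU-001")
print(available, result)
```

The caller never imports `get_stock_position` directly; adding a new decorated function is enough for it to show up in `list_tools()`.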

| Layer | Technology |
|---|---|
| Orchestration | LangGraph 1.1.10 |
| Multi-Agent | CrewAI 1.14.4 |
| Tool Protocol | MCP (Model Context Protocol) |
| LLM | Groq / llama-3.1-8b-instant |
| Embeddings | nomic-ai/nomic-embed-text-v1.5 |
| Reranker | BAAI/bge-reranker-v2-m3 |
| Vector Store | ChromaDB 1.1.1 |
| API Framework | FastAPI 0.136 + Uvicorn |
| Dashboard | Streamlit 1.57 |
| HTTP Client | httpx 0.28 |
| Database | SQLite + SQLAlchemy |
| Containerisation | Docker + docker-compose |
| Deployment | Render.com (free tier) |
| Observability | LangSmith (integrated; all LLM calls traced) |
- Python 3.11+
- Docker Desktop
- Groq API key (free at console.groq.com)
# Clone
git clone https://github.com/ankitv42/orca-retail.git
cd orca-retail
# Create .env
cat > .env << EOF
GROQ_API_KEY=your_groq_key_here
GROQ_MODEL=llama-3.1-8b-instant
LLM_PROVIDER=groq
LANGCHAIN_TRACING_V2=false
EOF
# Build and run
docker-compose up --build

Open:
- Dashboard: http://localhost:8501
- API Docs: http://localhost:8080/docs
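The variables written to `.env` above have to be read somewhere at startup. The sketch below shows one way a settings loader could do that with the standard library only. The `LLMSettings` class, the `load_llm_settings` function, the default values and the validation rule are assumptions for illustration and may differ from what `agents/llm_factory.py` actually does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical settings object built from the .env variables."""
    provider: str
    model: str
    api_key: str
    tracing_enabled: bool


def load_llm_settings(env=None):
    """Read LLM settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "groq")
    api_key = env.get("GROQ_API_KEY", "")
    # Assumed rule: fail fast when the Groq provider is selected without a key.
    if provider == "groq" and not api_key:
        raise ValueError("GROQ_API_KEY must be set when LLM_PROVIDER=groq")
    return LLMSettings(
        provider=provider,
        model=env.get("GROQ_MODEL", "llama-3.1-8b-instant"),
        api_key=api_key,
        tracing_enabled=env.get("LANGCHAIN_TRACING_V2", "false").lower() == "true",
    )


# Demo with an explicit mapping so the example does not depend on your shell.
demo_settings = load_llm_settings({
    "GROQ_API_KEY": "your_groq_key_here",
    "GROQ_MODEL": "llama-3.1-8b-instant",
    "LLM_PROVIDER": "groq",
    "LANGCHAIN_TRACING_V2": "false",
})
print(demo_settings.provider, demo_settings.model, demo_settings.tracing_enabled)
```

Passing the environment in as a mapping keeps the loader easy to test and makes a missing key an immediate, readable error rather than a failure deep inside the first LLM call.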
# Clone and setup
git clone https://github.com/ankitv42/orca-retail.git
cd orca-retail
python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
# Create .env (same as above)
# Terminal 1: API
uvicorn api.main:app --port 8080 --reload
# Terminal 2: Dashboard
streamlit run dashboard/app.py

orca-retail/
├── agents/
│   ├── graph.py              # LangGraph pipeline: all 4 agents + route logic
│   ├── prompts.py            # Agent prompts + scoring formula
│   ├── crew.py               # CrewAI crew (3 agents)
│   ├── llm_factory.py        # LLM provider abstraction
│   └── tools.py              # MCP tool definitions
│
├── api/
│   ├── main.py               # FastAPI app: 7 endpoints
│   └── models.py             # Pydantic schemas
│
├── dashboard/
│   ├── app.py                # Streamlit UI: 3 tabs
│   └── api_client.py         # HTTP client wrapper
│
├── docs/
│   ├── rag/
│   │   ├── ingest.py         # PDF → chunks → ChromaDB
│   │   └── retriever.py      # BGE reranker retrieval
│   └── adr/                  # Architecture Decision Records (ADR-001 to ADR-005)
│
├── db/
│   ├── queries.py            # SQLite query layer
│   ├── pipeline_log.py       # Audit log
│   └── schema.sql            # Database schema
│
├── mcp_server/
│   └── server.py             # MCP stdio server
│
├── data/
│   └── scheduler.py          # Alert generation scheduler
│
├── Dockerfile.api            # API container
├── Dockerfile.dashboard      # Dashboard container
├── docker-compose.yml        # Orchestration
└── requirements.txt          # Dependencies
Base URL: https://orca-retail.onrender.com
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | System status: DB, RAG, LLM, MCP |
| GET | /api/v1/alerts | 102 critical/at-risk SKU alerts |
| POST | /api/v1/pipeline/run | Trigger pipeline; returns 202 + pipeline_id |
| GET | /api/v1/pipeline/{id}/state | Poll pipeline state (progressive) |
| GET | /api/v1/pipeline/{id}/briefing | HITL briefing text |
| POST | /api/v1/pipeline/{id}/approve | Approve or reject HITL decision |
| GET | /api/v1/pipelines | Session audit log |
Full interactive docs: https://orca-retail.onrender.com/docs
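Since the run endpoint returns 202 and the state endpoint is meant to be polled, a client needs a small polling loop. The sketch below keeps the HTTP call out of the loop by taking a `fetch_state` callable, so it runs with the standard library alone. The `status` field, the set of terminal values and the function name are assumptions; the real response schema is defined in the interactive docs.

```python
import time

# Assumed terminal statuses; check the API docs for the real values.
TERMINAL_STATUSES = {"AUTO_EXECUTE", "ESCALATE", "SUSPEND", "FAILED"}


def poll_pipeline_state(fetch_state, pipeline_id, interval_s=3.0, timeout_s=120.0,
                        sleep=time.sleep):
    """Call fetch_state(pipeline_id) until a terminal status or the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state(pipeline_id)
        if state.get("status") in TERMINAL_STATUSES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Pipeline {pipeline_id} did not finish in {timeout_s}s")
        sleep(interval_s)


# Demo with canned responses instead of real HTTP calls.
_fake_responses = iter([
    {"status": "RUNNING"},
    {"status": "RUNNING"},
    {"status": "ESCALATE"},
])
polled = poll_pipeline_state(
    lambda pipeline_id: next(_fake_responses),
    "demo-id",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(polled)
```

In real use, `fetch_state` would wrap a GET request to `/api/v1/pipeline/{id}/state` and return the decoded JSON body. The default 3-second interval matches the dashboard's auto-refresh rate.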
- Open https://orca-dashboard.onrender.com
- Command Centre tab → click Analyse on any SKU (try Ajwa Dates 1kg: Class A, Ramadan event)
- Pipeline Monitor tab → watch 4 agents complete progressively (auto-refreshes every 3s)
- If the pipeline is ESCALATED → go to the HITL Approval tab
- Enter your email → read the briefing → click APPROVE
- reorder_triggered = Yes is written to the database

⚠️ Free tier note: Render free instances spin down after inactivity. The first load may take 30–60 seconds to wake up.
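The last demo step, where an approval sets `reorder_triggered = Yes` and the decision is logged, can be sketched with the standard `sqlite3` module. The table names `stock_positions` and `pipeline_log` come from the architecture diagram, but the column names, the `record_hitl_decision` function and the action values are assumptions; the real schema lives in `db/schema.sql`.

```python
import sqlite3
from datetime import datetime, timezone


def record_hitl_decision(conn, pipeline_id, sku_id, reviewer, action):
    """Apply an approve/reject decision and write one audit row (hypothetical schema)."""
    if action not in ("APPROVE", "REJECT"):
        raise ValueError(f"Unsupported action: {action}")
    decided_at = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: both statements commit, or neither does
        if action == "APPROVE":
            conn.execute(
                "UPDATE stock_positions SET reorder_triggered = 'Yes' WHERE sku_id = ?",
                (sku_id,),
            )
        conn.execute(
            "INSERT INTO pipeline_log (pipeline_id, sku_id, reviewer, action, decided_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (pipeline_id, sku_id, reviewer, action, decided_at),
        )


# Demo against an in-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock_positions (sku_id TEXT PRIMARY KEY, reorder_triggered TEXT DEFAULT 'No');
CREATE TABLE pipeline_log (pipeline_id TEXT, sku_id TEXT, reviewer TEXT,
                           action TEXT, decided_at TEXT);
INSERT INTO stock_positions (sku_id) VALUES ('SKU-001');
""")
record_hitl_decision(conn, "demo-pipeline", "SKU-001", "manager@example.com", "APPROVE")
flag_row = conn.execute(
    "SELECT reorder_triggered FROM stock_positions WHERE sku_id = 'SKU-001'"
).fetchone()
log_rows = conn.execute("SELECT reviewer, action FROM pipeline_log").fetchall()
print(flag_row, log_rows)
```

Writing the flag and the audit row in the same transaction means the log can never claim an approval that did not reach the stock table.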
Full ADR documents are in docs/adr/.
| ADR | Decision | Choice |
|---|---|---|
| ADR-001 | Graph framework | LangGraph: stateful, interruptible, production-grade checkpointing |
| ADR-002 | Tool protocol | MCP: dynamic discovery instead of hardcoded tool calls |
| ADR-003 | Eval metrics | Native implementations of RAGAS metrics rather than the RAGAS library |
| ADR-004 | ChromaDB index | Committed to repo for reproducible CI |
| ADR-005 | HITL routing | Cost vs auto-approve limit: pure Python, no LLM |
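ADR-005 describes routing as a plain cost comparison with no LLM involved. A minimal sketch of such a rule is shown below. The three outcomes and their conditions come from the Route Decision Node in the pipeline diagram; the precedence (SUSPEND checked first) and the handling of a cost exactly equal to the limit (escalated here) are assumptions, since the source does not specify them, and the function and parameter names are hypothetical.

```python
from enum import Enum


class Route(str, Enum):
    AUTO_EXECUTE = "AUTO_EXECUTE"
    ESCALATE = "ESCALATE"
    SUSPEND = "SUSPEND"


def route_decision(cost, auto_approve_limit, pool_pressure):
    """Pick a route from cost and capital-pool pressure using plain comparisons."""
    # Assumption: high pool pressure suspends the order regardless of cost.
    if pool_pressure == "HIGH":
        return Route.SUSPEND
    if cost < auto_approve_limit:
        return Route.AUTO_EXECUTE
    # Assumption: a cost equal to the limit is treated like a cost above it.
    return Route.ESCALATE


print(route_decision(5_000, 10_000, "LOW").value)
print(route_decision(25_000, 10_000, "LOW").value)
print(route_decision(5_000, 10_000, "HIGH").value)
```

Keeping this step deterministic means the same cost and limit always produce the same route, which is what makes the approval path auditable.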
| Sprint | Focus | Status |
|---|---|---|
| Sprint 1 | Data Foundation (SQLite, 100 SKUs, 200 stores, scheduler) | Complete |
| Sprint 2 | LangGraph Pipeline + MCP Integration | Complete |
| Sprint 3 | RAG (ChromaDB + BGE) + CrewAI | Complete |
| Sprint 4 | FastAPI + Streamlit HITL Dashboard | Complete |
| Sprint 5 | Docker + Render Deployment | Complete |
| Sprint 6 | LangSmith Tracing + ADRs (complete) · Redis (in progress) | In Progress |
Built by Ankit Kumar Verma
Data Science Manager @ Accenture | Palantir Foundry | GCP Professional Data Engineer
This project is an open-source rebuild of the Retail Command Centre (RCC), a production HITL multi-agent inventory system deployed across 200+ UAE retail stores on Palantir Foundry.
ORCA is my bridge from proprietary enterprise AI to portable, open-source agentic systems.
MIT License. See LICENSE for details.