A production-ready Retrieval-Augmented Generation (RAG) system for querying Laravel documentation using local LLMs. Built with Ollama, ChromaDB, and LangChain for fast, version-aware documentation lookup.
- Local LLM Inference: Uses Ollama with Gemma 2B for privacy and speed
- Efficient Embeddings: nomic-embed-text (768 dimensions) optimized for retrieval
- Persistent Vector Store: ChromaDB with disk persistence
- Version-Aware: Support for multiple Laravel versions with metadata tracking
- Smart Chunking: Adaptive chunking with configurable size limits and overlap
- Multiple Interfaces: CLI, Interactive mode, and REST API
- Docker-Based: Fully containerized with M1 Mac support
- Production Ready: Comprehensive logging, error handling, and monitoring
- Quick Start Guide - Get up and running in 10 minutes
- API Reference - Complete REST API documentation
- Architecture - System design and components
- Deployment - Production deployment guide
- Troubleshooting - Common issues and solutions
All documentation is in the documentation/ directory.
- Docker and Docker Compose
- MacBook Pro M1 (or adjust platform in docker-compose.yml)
- 16GB RAM minimum
- 10GB free disk space
- Clone and setup:
  cd /path/to/laravel-rag
  make setup
  This will:
  - Start Docker services
  - Pull required Ollama models (gemma:2b and nomic-embed-text)
  - Initialize the system
- Extract Laravel documentation:
  make extract
  Downloads Laravel v12 docs from GitHub (default version).
- Index documentation:
  make index
  Generates embeddings and stores them in ChromaDB (~2-5 minutes for v12).
- Query documentation:
  make query Q="How do I create an Eloquent model?"
# Extract documentation
docker-compose exec rag-app python -m src.cli.main extract --version 12
# Index documentation
docker-compose exec rag-app python -m src.cli.main index
# Query (one-off)
docker-compose exec rag-app python -m src.cli.main query "What is middleware?" --show-sources
# Interactive mode
docker-compose exec rag-app python -m src.cli.main interactive
# Check system status
docker-compose exec rag-app python -m src.cli.main check
# View statistics
docker-compose exec rag-app python -m src.cli.main stats

# Setup and management
make setup # Initial setup
make start # Start services
make stop # Stop services
make restart # Restart services
make logs # View logs
make clean # Clean all data (WARNING: destructive)
# Documentation
make extract # Extract docs
make index # Index docs
make reindex # Force re-index
# Querying
make query Q="your question"
make interactive
make stats
make check
# API
make api-test # Test API endpoints

Start the API (runs automatically with docker-compose):
make start

API available at http://localhost:8000
API Documentation: http://localhost:8000/docs
Endpoints:
- POST /query - Query documentation
  curl -X POST http://localhost:8000/query \
    -H "Content-Type: application/json" \
    -d '{
      "question": "How do I use migrations?",
      "include_sources": true,
      "temperature": 0.7
    }'
- GET /search?q=query - Search without LLM generation
  curl "http://localhost:8000/search?q=eloquent&top_k=5"
- GET /stats - Vector store statistics
- GET /versions - Available Laravel versions
- GET /health - Health check
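The same query works from Python. Here is a minimal client sketch using the requests library; the payload mirrors the curl example above, and the exact response schema is documented at /docs:

```python
# Minimal sketch of a Python client for the /query endpoint.
# The payload fields mirror the curl example above.
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={
        "question": "How do I use migrations?",
        "include_sources": True,
        "temperature": 0.7,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # inspect http://localhost:8000/docs for the response schema
```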
make interactive

Provides a conversational interface:
Question: How do I create a model?
Answer: To create an Eloquent model in Laravel...
Question: exit
Create .env from .env.example:
# Ollama Configuration
OLLAMA_HOST=http://localhost:11434
LLM_MODEL=gemma:2b
EMBEDDING_MODEL=nomic-embed-text
# ChromaDB
CHROMA_PERSIST_DIR=./chromadb
CHROMA_COLLECTION_NAME=laravel_docs
# Laravel Documentation
LARAVEL_VERSION=12
DOCS_CACHE_DIR=./docs
# RAG Settings
TOP_K=5
RESPONSE_TIMEOUT=30
LOG_LEVEL=INFO

Edit config/system.yaml for advanced settings:
- Model parameters
- Chunking strategy
- Performance tuning
- Logging configuration
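For illustration, a minimal sketch of how src/config.py might merge the .env values above with system.yaml, assuming python-dotenv and PyYAML are available (the actual implementation and YAML field names may differ):

```python
# Illustrative configuration loading: .env values via python-dotenv,
# advanced settings via PyYAML. Field names are assumptions.
import os
import yaml
from dotenv import load_dotenv

load_dotenv()  # pulls OLLAMA_HOST, LLM_MODEL, TOP_K, etc. into the environment

OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://localhost:11434")
LLM_MODEL = os.getenv("LLM_MODEL", "gemma:2b")
TOP_K = int(os.getenv("TOP_K", "5"))

with open("config/system.yaml") as fh:
    system = yaml.safe_load(fh)  # model params, chunking, tuning, logging
```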
- Memory: ~3-4GB total
  - Ollama + Gemma 2B: 2-3GB
  - nomic-embed-text: ~500MB
  - ChromaDB: ~100-200MB
- Storage:
  - Laravel v12 raw docs: ~5-10MB
  - Embeddings: ~100-200MB
  - ChromaDB overhead: ~50MB
- Query Performance:
  - Response time: <3 seconds (target)
  - Embedding generation: ~100ms per query
  - Vector search: <50ms
  - LLM generation: 1-2 seconds
- Batch Size: Adjust --batch-size for indexing based on available memory
- Top-K: Lower TOP_K for faster responses, higher for more context
- Temperature: Lower (0.3-0.5) for factual answers, higher (0.7-1.0) for more creative answers
- Context Window: Configured in system.yaml
laravel-rag/
├── docker-compose.yml # Docker orchestration
├── Dockerfile # Python app container
├── requirements.txt # Python dependencies
├── Makefile # Convenience commands
├── setup.sh # Setup script
├── .env.example # Environment template
├── config/
│ └── system.yaml # System configuration
├── src/
│ ├── config.py # Configuration management
│ ├── extraction/ # Document extraction
│ │ ├── docs_fetcher.py # Git clone & fetch
│ │ └── markdown_parser.py # H2-based chunking
│ ├── indexing/ # Embedding & storage
│ │ ├── embeddings.py # Ollama embeddings
│ │ └── vector_store.py # ChromaDB integration
│ ├── retrieval/ # RAG chain
│ │ └── rag_chain.py # LangChain RAG
│ ├── api/ # REST API
│ │ └── main.py # FastAPI application
│ ├── cli/ # CLI interface
│ │ └── main.py # Click commands
│ └── utils/
│ └── logger.py # Logging setup
├── data/ # Application data
├── chromadb/ # Vector store persistence
├── sources/ # Laravel documentation sources cache
└── logs/ # Application logs
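To make the flow through these components concrete, here is a condensed, illustrative sketch of what happens at query time. It calls Ollama's HTTP API and ChromaDB directly rather than going through the repo's LangChain chain, so treat it as a conceptual outline, not the actual implementation:

```python
# Illustrative end-to-end query flow: embed the question with Ollama,
# search the persisted ChromaDB collection, then generate an answer.
# Assumes Ollama on localhost:11434 and the vector store in ./chromadb.
import requests
import chromadb

question = "How do I create an Eloquent model?"

# 1. Embed the query with nomic-embed-text via Ollama's HTTP API.
emb = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": question},
).json()["embedding"]

# 2. Retrieve the top-k chunks from the persisted collection.
client = chromadb.PersistentClient(path="./chromadb")
collection = client.get_collection("laravel_docs")
hits = collection.query(query_embeddings=[emb], n_results=5)
context = "\n\n".join(hits["documents"][0])

# 3. Ask the LLM to answer using only the retrieved context.
answer = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma:2b",
        "prompt": f"Answer using this Laravel documentation:\n{context}\n\nQuestion: {question}",
        "stream": False,
    },
).json()["response"]
print(answer)
```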
Index multiple Laravel versions:
# Extract and index v11
docker-compose exec rag-app python -m src.cli.main extract --version 11
docker-compose exec rag-app python -m src.cli.main index --version 11
# Query specific version
docker-compose exec rag-app python -m src.cli.main query \
"How do migrations work?" --version 11Modify src/extraction/markdown_parser.py to customize chunking strategy.
View logs:
make logs # All services
docker-compose logs -f rag-app # Application only
docker-compose logs -f ollama # Ollama only

Log files:
- Application: logs/laravel-rag.log
- Docker: docker-compose logs
Issue: Models not found
# Check Ollama
docker exec laravel-rag-ollama ollama list
# Pull manually
docker exec laravel-rag-ollama ollama pull gemma:2b
docker exec laravel-rag-ollama ollama pull nomic-embed-text

Issue: ChromaDB errors
# Clear and re-index
make clean
make setup
make extract
make index

Issue: Out of memory
- Reduce batch_size during indexing
- Check Docker resource limits
- Restart Docker
Note: Embedding warnings
You may see warnings like init: embeddings required but some input tokens were not marked as outputs -> overriding. These are harmless internal messages from the nomic-embed-text model and can be safely ignored. They do not affect embedding quality or system performance. See documentation/TROUBLESHOOTING.md for details.
docker-compose exec rag-app pytest tests/ -v

- Follow the modular structure in src/
- Use the logger: from src.utils.logger import app_logger as logger
- Update configuration in src/config.py
- Add CLI commands in src/cli/main.py (see the sketch below)
- Add API endpoints in src/api/main.py
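As an example of the Click pattern used in src/cli/main.py, here is a hypothetical command; the cli group name, the command, and its option are illustrative assumptions, not existing code:

```python
# Hypothetical Click command following the project's CLI conventions.
import click
from src.utils.logger import app_logger as logger

@click.group()
def cli():
    """Laravel RAG command-line interface."""

@cli.command()
@click.option("--version", default="12", help="Laravel version to summarize.")
def summarize(version: str):
    """Example command: print a one-line status for a docs version."""
    logger.info("Summarizing docs for Laravel v%s", version)
    click.echo(f"Laravel v{version} docs are indexed and ready to query.")

if __name__ == "__main__":
    cli()
```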
# Format code
docker-compose exec rag-app black src/
# Type checking
docker-compose exec rag-app mypy src/
# Linting
docker-compose exec rag-app flake8 src/

- API Access: Configure CORS in src/api/main.py
- Network: Adjust docker-compose.yml to use internal networks
- Authentication: Add API key middleware for production (see the sketch below)
- Rate Limiting: Enable in config/system.yaml
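One way to add that authentication layer is a FastAPI dependency. A minimal sketch follows; the X-API-Key header name and the RAG_API_KEY environment variable are assumptions, not part of the current codebase:

```python
# Minimal API-key guard sketch for a FastAPI app such as src/api/main.py.
# Header name and env variable are illustrative assumptions.
import os
from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import APIKeyHeader

app = FastAPI()
api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)

def require_api_key(api_key: str = Security(api_key_header)) -> str:
    # Reject requests whose key does not match the configured secret.
    if api_key != os.environ.get("RAG_API_KEY"):
        raise HTTPException(status_code=401, detail="Invalid or missing API key")
    return api_key

@app.get("/health", dependencies=[Depends(require_api_key)])
def health():
    return {"status": "ok"}
```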
- Horizontal: Run multiple API instances behind a load balancer
- Vertical: Increase Docker resource limits
- Distributed: Use a remote ChromaDB server (see the sketch below)
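Moving to a remote ChromaDB server is mostly a client-side change. An illustrative sketch, where the host and port are deployment-specific assumptions:

```python
# Connect to a remote ChromaDB server instead of local disk persistence.
import chromadb

client = chromadb.HttpClient(host="chroma.internal", port=8000)
collection = client.get_or_create_collection("laravel_docs")
print(collection.count())  # sanity check against the remote store
```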
# Backup vector store
tar -czf chromadb-backup.tar.gz chromadb/
# Restore
tar -xzf chromadb-backup.tar.gz

- Support for additional Laravel versions (11, 10)
- Integration with Livewire, Filament docs
- Query result caching
- Multi-modal support (images, diagrams)
- Conversation history
- Fine-tuned models for Laravel
- Team collaboration features
MIT
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
For issues and questions:
- GitHub Issues: [Create an issue]
- Laravel Framework Team for excellent documentation
- Ollama for local LLM inference
- ChromaDB for vector storage
- LangChain for RAG orchestration