This guide will help you set up the RAG system with real data using either local Qdrant (Docker) or Qdrant Cloud.
- Node.js 18+
- Docker (for local Qdrant setup)
- API Keys:
  - OpenAI API Key (required)
  - Cohere API Key (optional, for reranking)
# Interactive setup wizard
npm run qdrant:setup
# Or use the all-in-one script
./start-with-real-data.sh

# Copy the example environment file
cp .env.example .env
# Edit .env and add your API keys:
# - OPENAI_API_KEY=your_key_here
# - COHERE_API_KEY=your_key_here (optional)

Local Qdrant (Docker):
# Using Docker Compose (recommended)
npm run docker:up
# Or standalone Docker
docker run -d \
  --name rag-qdrant \
  -p 6333:6333 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant
# Verify Qdrant is running
npm run qdrant:check

Qdrant Cloud:
- Sign up at https://cloud.qdrant.io
- Create a cluster
- Update .env:
  QDRANT_URL=https://your-cluster.qdrant.io
  QDRANT_API_KEY=your_api_key
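
The choice between local Docker and Qdrant Cloud comes down to the two variables above. As a rough sketch of how an application might resolve them, the helper below falls back to the local instance on port 6333 when QDRANT_URL is unset. The function name, the returned shape, and the decision to reject a remote URL without an API key are all assumptions made for illustration; the project's actual config loader may behave differently.

```typescript
// Hypothetical helper, not taken from the project source.
interface QdrantConnection {
  url: string;
  apiKey: string | undefined;
  mode: "local" | "cloud";
}

// Assumed default: the local Docker container publishes port 6333.
const LOCAL_QDRANT_URL = "http://localhost:6333";

function resolveQdrantConnection(
  env: Record<string, string | undefined>
): QdrantConnection {
  // Empty or whitespace-only values are treated the same as unset ones.
  const url = (env["QDRANT_URL"] ?? "").trim() || LOCAL_QDRANT_URL;
  const apiKey = (env["QDRANT_API_KEY"] ?? "").trim() || undefined;

  // Anything pointing at localhost or 127.0.0.1 counts as the local setup.
  const isLocal = /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?(\/|$)/.test(url);

  // Assumption: a remote cluster without an API key is a misconfiguration.
  if (!isLocal && apiKey === undefined) {
    throw new Error(`QDRANT_API_KEY is required for remote cluster ${url}`);
  }
  return { url, apiKey, mode: isLocal ? "local" : "cloud" };
}
```

In a Node.js process this would typically be called as `resolveQdrantConnection(process.env)` after the .env file has been loaded.
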
# Create documents directory
mkdir -p documents
# Add your files (supports .txt, .md, .pdf, .json, .csv, .html)
cp your-files/* documents/
# Or use the sample document
# (Already included: documents/example-nodejs-guide.md)

# Ingest your documents
npm run ingest:files
# Or use sample documents
npm run ingest:sample

npm run dev
# Open http://localhost:3000

- npm run qdrant:setup - Interactive setup wizard
- npm run qdrant:check - Check Qdrant health and status
- npm run qdrant:init - Initialize Qdrant collections
- npm run qdrant:reset - Reset and recreate collections
- npm run docker:up - Start Qdrant with Docker Compose
- npm run docker:down - Stop Qdrant containers
- npm run docker:logs - View Qdrant logs

- npm run ingest:files - Ingest documents from ./documents/
- npm run ingest:sample - Ingest built-in sample documents
- npm run ingest - Default ingestion (sample docs)

- npm run dev - Start development server
- npm run build - Build for production
- npm start - Run production build
You can test the system without API keys using mock mode:
MOCK_MODE=true npm run dev

This simulates the full RAG pipeline with sample data.
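
To illustrate how a flag like MOCK_MODE can remove the need for API keys, here is a minimal sketch of one possible approach: swap the real embedding call for a deterministic local stand-in. Every name here (`Embedder`, `mockEmbed`, `isMockMode`, `selectEmbedder`) is hypothetical, and the embedder is kept synchronous for brevity, whereas a real API client would be asynchronous. The project's own mock implementation may work quite differently.

```typescript
// Hypothetical types and names, for illustration only.
type Embedder = (text: string) => number[];

// Deterministic stand-in for an embedding API: folds character codes into a
// small fixed-size vector and normalizes it, so equal text gives equal output.
function mockEmbed(text: string, dims: number = 8): number[] {
  const vec: number[] = new Array<number>(dims).fill(0);
  for (let i = 0; i < text.length; i++) {
    const slot = i % dims;
    vec[slot] = (vec[slot] ?? 0) + text.charCodeAt(i);
  }
  // Guard against dividing by zero for empty input.
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0)) || 1;
  return vec.map((v) => v / norm);
}

// Assumption: only the literal value "true" (any case) enables mock mode.
function isMockMode(env: Record<string, string | undefined>): boolean {
  return (env["MOCK_MODE"] ?? "").toLowerCase() === "true";
}

// Picks the mock embedder when MOCK_MODE is on, otherwise the real one.
function selectEmbedder(
  env: Record<string, string | undefined>,
  realEmbed: Embedder
): Embedder {
  return isMockMode(env) ? (text: string) => mockEmbed(text) : realEmbed;
}
```

Because the mock vectors are deterministic, queries in mock mode return repeatable results, which is convenient for demos and tests but says nothing about real retrieval quality.
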
- Text: .txt
- Markdown: .md, .markdown
- PDF: .pdf (requires pdf-parse)
- JSON: .json
- CSV: .csv
- HTML: .html, .htm
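
Whatever the input format, the extracted text is split into overlapping chunks before indexing, governed by the CHUNK_SIZE and CHUNK_OVERLAP settings in .env. The sketch below shows the general sliding-window idea under one loud assumption: sizes are counted in characters. The project may well count tokens or split on sentence boundaries instead, and `chunkText` is an illustrative name, not the project's function.

```typescript
// Assumption: chunkSize and overlap are measured in characters, not tokens.
function chunkText(
  text: string,
  chunkSize: number = 512,
  overlap: number = 128
): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  // Each window starts (chunkSize - overlap) characters after the previous one,
  // so consecutive chunks share exactly `overlap` characters.
  const step = chunkSize - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) {
      break;
    }
  }
  return chunks;
}
```

With the default values of 512 and 128, a 1000-character document yields three chunks of 512, 512, and 232 characters. A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more text.
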
Edit .env to customize:
# Chunking parameters
CHUNK_SIZE=512
CHUNK_OVERLAP=128
# Retrieval parameters
TOP_K=10
RERANK_TOP_K=3
# Hybrid search balance (0=keyword, 1=vector)
HYBRID_SEARCH_ALPHA=0.5
# Server
PORT=3000
NODE_ENV=development

# Check if Qdrant is running
docker ps | grep qdrant
# View Qdrant logs
npm run docker:logs
# Test connection
npm run qdrant:check

If port 6333 is in use:
# Find process using the port
lsof -i :6333
# Use alternative port in docker-compose.yml
# Update QDRANT_URL in .env accordingly

# Check Qdrant is initialized
npm run qdrant:init
# Reset and retry
npm run qdrant:reset
npm run ingest:files

- Ensure the .env file exists and contains valid keys
- OpenAI: Get key from https://platform.openai.com/api-keys
- Cohere: Get key from https://dashboard.cohere.ai/api-keys
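
A quick way to rule out key problems is to check the environment before starting the server. The following is a sketch of such a check, not part of the project: `checkApiKeys` and `KeyReport` are hypothetical names, and treating the `your_key_here` placeholder from .env.example as unset is an assumption. It reflects the requirements stated in this guide: the OpenAI key is required unless mock mode is on, and the Cohere key is optional.

```typescript
// Hypothetical pre-flight check; key names come from the .env instructions.
interface KeyReport {
  missing: string[];  // required keys that are absent
  warnings: string[]; // optional keys that are absent
}

function checkApiKeys(env: Record<string, string | undefined>): KeyReport {
  // A key counts as set only if it is non-empty and not the example placeholder.
  const isSet = (key: string): boolean => {
    const value = (env[key] ?? "").trim();
    return value !== "" && value !== "your_key_here";
  };
  const mock = (env["MOCK_MODE"] ?? "").toLowerCase() === "true";

  const missing: string[] = [];
  const warnings: string[] = [];
  // Mock mode runs without API keys, so the OpenAI key is only required otherwise.
  if (!mock && !isSet("OPENAI_API_KEY")) {
    missing.push("OPENAI_API_KEY");
  }
  if (!isSet("COHERE_API_KEY")) {
    warnings.push("COHERE_API_KEY not set (optional, used for reranking)");
  }
  return { missing, warnings };
}
```

Calling `checkApiKeys(process.env)` at startup and printing `missing` gives a clearer message than a failed API request later in the pipeline. Note that this only detects absent keys; an expired or mistyped key still has to be verified against the provider.
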
The docker-compose.yml includes:
- qdrant - Vector database with built-in dashboard
Access the Qdrant Dashboard at:
# After starting Qdrant
http://localhost:6333/dashboard

For production, consider:
- Use Qdrant Cloud for managed hosting
- Set NODE_ENV=production in environment
- Use proper API key management (e.g., AWS Secrets Manager)
- Implement rate limiting and authentication
- Set up monitoring and logging
- Use HTTPS for all endpoints
- ✅ Qdrant is running
- ✅ Documents are ingested
- ✅ Server is running
- 🎯 Visit http://localhost:3000
- 🚀 Try queries related to your documents!
- Check logs: npm run docker:logs
- Verify setup: npm run qdrant:check
- Reset everything: npm run qdrant:reset
- Run in mock mode: MOCK_MODE=true npm run dev