This project was implemented for LLM Zoomcamp - a free course about LLMs and RAG.
The Challenge: Godot is a powerful open-source game engine, but its extensive documentation can be overwhelming for newcomers. Developers, especially those new to game engines, often struggle to find relevant information quickly when they encounter specific problems or need to implement particular features. Traditional documentation search is often inadequate - users may not know the exact terminology to search for, or they might need information scattered across multiple sections.
The Solution: This RAG (Retrieval-Augmented Generation) system transforms how developers interact with Godot documentation. Instead of manually searching through hundreds of pages, users can ask natural language questions and receive contextual answers along with direct links to the most relevant documentation sections. The system combines semantic search with AI-powered response generation to provide accurate, helpful answers that accelerate learning and development workflow.
Key Benefits:
- Faster Problem Resolution: Get instant answers without browsing through extensive documentation
- Contextual Learning: Receive targeted information relevant to your specific use case
- Documentation Discovery: Find relevant sections you might not have discovered through traditional search
- Beginner-Friendly: Ask questions in plain English without needing to know exact technical terminology
The dataset used in this project contains comprehensive information from the official Godot Engine documentation, including:
Document Structure: Each chunk contains structured information from Godot's documentation hierarchy (an illustrative example follows this list):
- Title: The main title of the documentation page (e.g., "RigidBody2D", "Creating Your First Scene")
- File Path: The source documentation file location (e.g., "classes/class_rigidbody2d.rst.txt")
- Section: The major section within the documentation (e.g., "Methods", "Properties", "Tutorials")
- Subsection: Specific subsections for detailed organization (e.g., "Virtual Methods", "Constants")
- Content: The actual documentation text containing explanations, code examples, and instructions
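For illustration, a chunk following this structure might look like the example below; the keys and values are invented to match the fields above, not actual entries from data/chunked/chunks.json:

```python
# Hypothetical chunk matching the structure above (values are invented).
chunk = {
    "title": "RigidBody2D",
    "file_path": "classes/class_rigidbody2d.rst.txt",
    "section": "Methods",
    "subsection": "Virtual Methods",
    "content": "void _integrate_forces(state) virtual: Called during physics "
               "processing, allowing you to read and safely modify the "
               "simulation state for the object...",
}
```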
Content Categories: The documentation covers all aspects of Godot game development:
- Class References: Detailed API documentation for all Godot classes and methods
- Tutorials: Step-by-step guides for common game development tasks
- Manual Pages: Comprehensive explanations of Godot's features and concepts
- Code Examples: GDScript code snippets with explanations
- Best Practices: Recommended approaches and patterns for game development
Processing Pipeline:
- Download: Official Godot documentation is downloaded from the nightly builds
- Extraction: Documentation is extracted and converted from .rst.txt files to structured text
- Chunking: Large documents are intelligently split into smaller, semantically meaningful chunks
- Embedding: Each chunk is converted into a 384-dimensional vector using the all-MiniLM-L6-v2 model (see the sketch after this list)
- Storage: Vectors and metadata are stored in Qdrant vector database for fast semantic search
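To make the last two steps concrete, here is a minimal sketch using sentence-transformers and qdrant-client against a local Qdrant instance on its default port; the collection name godot_docs and the sample chunk are illustrative assumptions, not necessarily what the setup scripts use:

```python
# Sketch: embed documentation chunks and store them in Qdrant.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384D vectors
client = QdrantClient(url="http://localhost:6333")

# Cosine distance matches the similarity metric used at query time.
client.create_collection(
    collection_name="godot_docs",  # illustrative name
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# In practice the chunks come from data/chunked/chunks.json.
chunks = [{"title": "RigidBody2D", "content": "A 2D physics body that..."}]
vectors = embedder.encode([c["content"] for c in chunks])

client.upsert(
    collection_name="godot_docs",
    points=[
        PointStruct(id=i, vector=vec.tolist(), payload=chunk)
        for i, (vec, chunk) in enumerate(zip(vectors, chunks))
    ],
)
```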
The processed dataset contains thousands of documentation chunks optimized for retrieval-augmented generation. Each chunk maintains its connection to the original Godot documentation through metadata, enabling users to trace answers back to their official sources.
The processed data is in data/chunked/chunks.json; the raw documentation can be downloaded from https://nightly.link/godotengine/godot-docs/workflows/build_offline_docs/master/godot-docs-epub-stable.zip.
Demo video: https://www.youtube.com/embed/-F6iT-kqeKw?si=ReaCcVhH0xVhkLIo
Have the following packages and tools installed:
- Python 3.12 or later
- Docker
- Docker Compose
Optional but recommended:
- 32GB+ RAM
- NVIDIA GPU (for faster embedding computation and LLM inference; otherwise the CPU will be used)
Choose one of the setup options below to get started. Both scripts will automatically configure Docker containers for the LLM, Qdrant database, and monitoring stack, as well as populate the vector database with embedded chunked documentation data.
Prerequisites: Ensure Docker is running before executing these commands.
Uses a pre-processed dataset from 8/17/2025 for faster setup. Choose this option for quick deployment with stable documentation.

```bash
./setup_pre_chunked.sh
```

Downloads and processes the latest Godot documentation, then embeds it into the vector database. Only use this option if you specifically need the most current Godot documentation.

```bash
./setup_scratch.sh
```

The monitoring dashboard tracks:
- Query Rate (QPS) - Real-time query volume
- Average Response Time - Response latency gauge
- LLM Evaluation Scores - AI quality metrics (Relevance, Accuracy, Completeness, Clarity, Faithfulness)
- Vector Database Metrics - Qdrant performance and usage
- Top Query Categories - Most common query types
- Grafana Dashboard: http://localhost:3000 (admin/admin)
- Prometheus: http://localhost:9090
- Metrics: http://localhost:8000/metrics
View Complete Retrieval Evaluation Analysis
Summary: After comprehensive testing of multiple retrieval methods against our actual Qdrant database, Aggressive MMR emerged as the optimal approach, providing the best balance of relevance, diversity, and performance. The system has been updated with optimized MMR parameters for improved results.
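For context, MMR (maximal marginal relevance) re-ranks candidates by trading off relevance to the query against redundancy with results already selected; an "aggressive" setting weights diversity more heavily. The following is a generic sketch with an assumed lambda value, not the repository's exact implementation or tuned parameters:

```python
import numpy as np

def mmr(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 5, lam: float = 0.5):
    """Pick k docs maximizing lam * relevance - (1 - lam) * redundancy.

    Assumes L2-normalized vectors, so dot products equal cosine similarity.
    lam here is a placeholder; lower values are more 'aggressive' about diversity.
    """
    relevance = doc_vecs @ query_vec  # cosine similarity to the query
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        if selected:
            # Highest similarity of each candidate to anything already chosen.
            redundancy = (doc_vecs[candidates] @ doc_vecs[selected].T).max(axis=1)
        else:
            redundancy = np.zeros(len(candidates))
        scores = lam * relevance[candidates] - (1 - lam) * redundancy
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lam = 1.0 this reduces to plain top-k cosine ranking; lowering lam trades a little raw relevance for coverage of distinct documentation sections.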
- Embedding Model: all-MiniLM-L6-v2 (384-dimensional embeddings)
- Similarity Metric: Cosine distance in vector space

Process (a query-time sketch follows the score ranges below):
- Query Embedding: Your question → a 384D vector
- Document Embeddings: 384D vectors for all Godot docs
- Similarity Computation: Cosine similarity between the query and all documents
- Retrieval: Top 5 most similar documents returned

Similarity score ranges:
- 0.7-1.0: High relevance (excellent matches)
- 0.4-0.7: Medium relevance (related concepts)
- <0.4: Low relevance (weakly related)
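Here is the query-time side under the same assumptions (local Qdrant, illustrative godot_docs collection); each hit.score is the cosine similarity, interpreted with the ranges above:

```python
# Sketch: embed a question and retrieve the top 5 chunks by cosine similarity.
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient(url="http://localhost:6333")

query_vec = embedder.encode("How do I detect collisions on a RigidBody2D?")
hits = client.search(
    collection_name="godot_docs",  # illustrative name
    query_vector=query_vec.tolist(),
    limit=5,
)
for hit in hits:
    # e.g. "0.82  RigidBody2D" -- high relevance per the ranges above
    print(f"{hit.score:.2f}  {hit.payload.get('title')}")
```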
The LLM (Large Language Model) is used as a judge to evaluate the quality of the answers generated by the RAG system. This involves using the LLM to assess various aspects of the answers, such as relevance, accuracy, completeness, clarity, and faithfulness.
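A minimal sketch of this pattern, assuming the local llama3.2:1b model served through Ollama; the prompt wording is illustrative, and the five criteria mirror the list used by this project:

```python
# Sketch: LLM-as-a-Judge scoring an answer on five criteria.
# A robust version would validate the model's JSON output.
import json
from langchain_ollama import ChatOllama

judge = ChatOllama(model="llama3.2:1b", temperature=0)

def evaluate_answer(question: str, context: str, answer: str) -> dict:
    prompt = (
        "Rate the ANSWER to the QUESTION given the CONTEXT, each on a 1-5 "
        "scale: relevance, accuracy, completeness, clarity, faithfulness. "
        'Reply with JSON only, e.g. {"relevance": 4, ...}.\n\n'
        f"QUESTION: {question}\n\nCONTEXT: {context}\n\nANSWER: {answer}"
    )
    return json.loads(judge.invoke(prompt).content)
```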
- Docker & Docker Compose: Container orchestration for services
- Python 3.12+: Main programming language
- Bash Scripts: Automated setup and deployment scripts
Vector Search & Embeddings:
- qdrant-client: Vector database client for semantic search
- sentence-transformers: Pre-trained embedding models
- langchain_huggingface: HuggingFace integration for embeddings
- transformers: Deep learning models for NLP
- torch: PyTorch for GPU-accelerated embedding computation

RAG & LLM:
- langchain: RAG pipeline framework
- langchain-core: Core LangChain functionality
- langchain-community: Community integrations
- langchain_qdrant: Qdrant vector store integration
- langchain_ollama: Ollama LLM integration
- Ollama: Local LLM inference server (llama3.2:1b model)
- tiktoken: Token counting and text processing

Web Interface:
- streamlit: Interactive web application framework

Data Processing:
- requests: HTTP client for downloading documentation
- bs4 (BeautifulSoup): HTML parsing and web scraping
- tqdm: Progress bars for data processing

Monitoring & Metrics:
- prometheus_client: Python client for Prometheus metrics (see the sketch after this list)
- fastapi: High-performance API framework for the metrics endpoint
- uvicorn: ASGI server for FastAPI applications
- python-multipart: Multipart form data support
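A minimal sketch of such a metrics endpoint; the metric names and the /ask route are illustrative placeholders, not the project's actual API:

```python
# Sketch: expose Prometheus metrics from a FastAPI app.
from fastapi import FastAPI
from prometheus_client import Counter, Histogram, make_asgi_app

QUERIES = Counter("rag_queries_total", "Total RAG queries served")
LATENCY = Histogram("rag_response_seconds", "End-to-end response time")

app = FastAPI()
app.mount("/metrics", make_asgi_app())  # the path Prometheus scrapes

@app.get("/ask")
def ask(q: str) -> dict:
    QUERIES.inc()
    with LATENCY.time():  # records the elapsed time on exit
        answer = f"(answer to {q!r})"  # stand-in for the RAG pipeline
    return {"answer": answer}

# Run with: uvicorn module_name:app --port 8000
# Metrics then appear at http://localhost:8000/metrics
```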
- Prometheus: Time-series metrics collection and storage
- Grafana: Metrics visualization and dashboards
- Qdrant: High-performance vector database
- Shell Scripts:
  - setup_pre_chunked.sh: Quick setup with pre-processed data
  - setup_scratch.sh: Full pipeline from raw documentation
  - start_monitoring.sh: Launch monitoring stack
  - import_dashboard.sh: Configure Grafana dashboards
- Embedding Model: all-MiniLM-L6-v2 (384-dimensional vectors)
- LLM: llama3.2:1b (1 billion parameter model)
- Similarity Search: Cosine similarity in vector space
- Evaluation: LLM-as-a-Judge for answer quality assessment
If the dashboard doesn't appear automatically:
- Make sure you've logged into the Grafana web interface first (admin/admin)
- Run ./import_dashboard.sh manually
- Check that all services are running with docker compose ps
- Verify Grafana is accessible at http://localhost:3000
- Compare different LLM models (e.g. GPT-4, Llama 2, Falcon, Mistral)
- Compare different chunking methods (e.g. overlapping chunks, hierarchical chunking)
- Compare different databases (e.g. Pinecone, Weaviate, Milvus)
- Explore additional data sources for improving knowledge base (e.g. Wikipedia, GitHub, Stack Overflow)
- Use user feedback for continuous improvement and fine-tuning
- Deploy on cloud platforms for scalability and availability




