|
1 | | - |
2 | 1 | # Learning module with conversational educational system - Backend Module |
3 | 2 |
|
4 | | -This repository contains the Bachelor's Engineering Thesis project titled: **"Learning module with conversational educational system"**, developed at the **Faculty of Mathematics and Information Science (MiNI)** of **Warsaw University of Technology**. |
5 | | - |
6 | | -The system is a modular, RAG-based (Retrieval-Augmented Generation) assistant designed to enhance the digital learning experience within the Moodle LMS by bridging the gap between traditional Learning Management Systems and modern Generative AI. |
| 3 | +This repository contains the Backend Module developed as the primary functional layer for the Bachelor's Engineering Thesis: **"Learning module with conversational educational system"**. |
7 | 4 |
|
8 | 5 | --- |
9 | 6 |
|
10 | 7 | ## Academic Context |
11 | 8 | * **University:** Warsaw University of Technology (Politechnika Warszawska) |
12 | 9 | * **Faculty:** Faculty of Mathematics and Information Science (MiNI) |
13 | 10 | * **Supervisor:** dr inż. Anna Wróblewska |
14 | | -* **Thesis Type:** Bachelor's Engineering Thesis (Praca Inżynierska) |
15 | | - |
16 | | -## Authors |
17 | | -* Anna Ostrowska |
18 | | -* Gabriela Majstrak |
19 | | -* Jan Opala |
| 11 | +* **Authors:** Anna Ostrowska, Gabriela Majstrak, Jan Opala |
20 | 12 |
|
21 | | -## Project Structure |
22 | | - |
23 | | -``` |
24 | | -chatbot-backend/ |
25 | | -├── app/ # Main application |
26 | | -│ ├── __init__.py |
27 | | -│ ├── main.py # FastAPI app setup |
28 | | -│ ├── api/ |
29 | | -│ │ └── routes/ # API endpoints |
30 | | -│ │ ├── chat.py # Chat endpoints |
31 | | -│ │ ├── moodle.py # Moodle integration |
32 | | -│ │ ├── embeddings.py # Embedding generation |
33 | | -│ │ └── health.py # Health checks |
34 | | -│ ├── services/ # Business logic |
35 | | -│ │ ├── llm_service.py # LLM integration (Ollama) |
36 | | -│ │ ├── moodle_service.py |
37 | | -│ │ ├── embedding_service.py # Embedding generation |
38 | | -│ │ └── rag_service.py # RAG |
39 | | -│ ├── models/ # Data models |
40 | | -│ │ ├── schemas.py # Pydantic schemas |
41 | | -│ │ └── database.py # SQLAlchemy models |
42 | | -│ ├── core/ |
43 | | -│ │ └── database.py # DB connection |
44 | | -│ └── config/ # Configuration |
45 | | -│ └── settings.py |
46 | | -├── docker-compose.yaml |
47 | | -├── Dockerfile # Backend container |
48 | | -├── Dockerfile.ollama # Ollama container |
49 | | -├── requirements.txt |
50 | | -└── README.md |
51 | | -``` |
52 | | - |
53 | | -## Setup |
| 13 | +--- |
54 | 14 |
|
55 | | -### Prerequisites |
| 15 | +## Technical Stack & Architecture |
| 16 | +The backend is a REST API built with **FastAPI** around a Retrieval-Augmented Generation (RAG) pipeline. |
56 | 17 |
|
57 | | -- Docker Desktop |
58 | | -- Python 3.10+ (for local development) |
59 | | -- 8GB+ RAM recommended |
| 18 | +### Core Technologies: |
| 19 | +* **Vector Database:** ChromaDB for efficient semantic search and context retrieval. |
| 20 | +* **OCR & Parsing:** Adaptive document processing. |
| 21 | +* **LLM Orchestration:** Integration with **OpenAI (GPT-4o-mini)** and **Google Gemini** for generation, and **Voyage AI** for embeddings. |
| 22 | +* **Task Scheduling:** **APScheduler** automatically synchronizes course materials from Moodle. |
60 | 23 |
|
61 | | -### Environment Variables |
| 24 | +--- |
62 | 25 |
|
63 | | -Create a `.env` file: |
| 26 | +## Key Features |
64 | 27 |
|
65 | | -```env |
66 | | -# LLM Settings |
67 | | -LLM_BASE_URL=http://ollama:11434 |
68 | | -CHAT_MODEL= |
69 | | -EMBED_MODEL= |
70 | | -LLM_TIMEOUT= |
71 | | -MAX_TOKENS= |
72 | | -TEMPERATURE= |
| 28 | +### 1. Advanced Document Ingestion |
| 29 | +The system features a custom-built ingestion pipeline (`parser_utils.py`) that includes: |
| 30 | +* **Adaptive OCR** |
| 31 | +* **Context-Aware Chunking** |
| 32 | +* **Math Normalization** |
73 | 33 |
|
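The added README names "Context-Aware Chunking" without describing it, and the diff does not show `parser_utils.py`. As a rough illustration of what such a step could look like, the sketch below splits Markdown-like text into paragraph-aligned chunks and tags each chunk with its nearest preceding heading. The names `Chunk`, `chunk_with_context`, and `max_chars` are hypothetical and are not taken from the repository.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest preceding heading, kept as retrieval context
    text: str     # one or more whole paragraphs


def chunk_with_context(markdown: str, max_chars: int = 200) -> list[Chunk]:
    """Group paragraphs into chunks and tag each with its section heading.

    Hypothetical sketch: paragraphs are never split, so a single paragraph
    longer than max_chars still becomes one (oversized) chunk.
    """
    chunks: list[Chunk] = []
    heading = ""
    buffer = ""

    def flush() -> None:
        # Emit the chunk in progress under the heading that was active for it.
        nonlocal buffer
        if buffer:
            chunks.append(Chunk(heading, buffer))
            buffer = ""

    # Blank lines separate blocks (headings and paragraphs).
    for block in re.split(r"\n\s*\n", markdown.strip()):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            flush()  # a new heading closes the chunk in progress
            heading = block.lstrip("#").strip()
            continue
        candidate = f"{buffer}\n\n{block}" if buffer else block
        if len(candidate) > max_chars:
            flush()  # adding this paragraph would overflow, start a new chunk
            buffer = block
        else:
            buffer = candidate
    flush()
    return chunks
```

Keeping the heading alongside each chunk lets a retriever show where a passage came from even when the passage itself never repeats the section title.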
74 | | -# Moodle Settings |
75 | | -MOODLE_BASE_URL= |
| 34 | +### 2. Intelligent Services |
| 35 | +* **RAG Pipeline:** Uses vector search to provide factually grounded answers strictly based on course content. |
| 36 | +* **Quiz Generation:** Automated assessment creation based on Bloom's Taxonomy to evaluate student understanding. |
| 37 | +* **Smart Sync:** Periodic background jobs that check for new course materials without manual teacher intervention. |
76 | 38 |
|
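The "RAG Pipeline" item above describes vector search followed by an answer restricted to course content. The sketch below shows that retrieve-then-prompt shape using only the standard library. It is not the project's code: the real system is described as using ChromaDB for search and Voyage AI for embeddings, whereas here an in-memory list of toy vectors stands in for both, and `cosine`, `retrieve`, and `build_prompt` are hypothetical names.

```python
from __future__ import annotations

import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: list[float],
    index: list[tuple[str, list[float]]],
    k: int = 2,
) -> list[str]:
    """Return the k passages whose vectors are most similar to the query.

    `index` is a plain list of (passage, vector) pairs, a stand-in for a
    vector database collection.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved excerpts."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the course excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )
```

In a real deployment the query vector would come from the same embedding model that indexed the course materials, and the assembled prompt would be sent to the configured LLM.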
77 | | -# Database |
78 | | -DATABASE_URL= |
| 39 | +--- |
79 | 40 |
|
80 | | -# CORS |
81 | | -CORS_ORIGINS=["http://localhost:3000","https://moodle-edu-chatbot.vercel.app"] |
82 | | -``` |
| 41 | +## Repository Structure |
| 42 | +* **`app/api/routes/`**: API endpoints for chat, quiz, and dashboard features. |
| 43 | +* **`app/services/`**: Core business logic, including the RAG engine and embedding management. |
| 44 | +* **`app/core/parser_utils.py`**: Technical implementation of the parsing and OCR logic. |
| 45 | +* **`Dockerfile` & `docker-compose.yaml`**: Full containerization setup for easy deployment and scalability. |
83 | 46 |
|
84 | | -### Docker Deployment |
| 47 | +--- |
85 | 48 |
|
86 | | -```powershell |
87 | | -# Build and start all services |
| 49 | +## Deployment (Docker) |
| 50 | +1. Ensure your `.env` file is configured with the necessary API keys. |
| 51 | +2. Build and start the system: |
| 52 | +```bash |
88 | 53 | docker compose up -d --build |
89 | | -
|
90 | | -# Check logs |
91 | | -docker compose logs -f |
92 | | -
|
93 | | -# Stop services |
94 | | -docker compose down |
95 | 54 | ``` |
| 55 | +The API will be available at `http://localhost:8000`, with interactive Swagger documentation at `/docs`. |
96 | 56 |
|
97 | | -## API Endpoints |
98 | | - |
99 | | -### Chat |
100 | | - |
101 | | -**POST** `/chat` |
102 | | - |
103 | | -```json |
104 | | -{ |
105 | | - "message": "Hello!", |
106 | | - "history": [ |
107 | | - {"role": "user", "content": "Previous message"}, |
108 | | - {"role": "assistant", "content": "Previous response"} |
109 | | - ], |
110 | | - "stream": false |
111 | | -} |
112 | | -``` |
113 | | - |
114 | | -### Embeddings |
115 | | - |
116 | | -**POST** `/api/embeddings?text=your_text` |
117 | | - |
118 | | -Returns vector embedding with dimensions. |
119 | | - |
120 | | -### Moodle |
121 | | - |
122 | | -**POST** `/api/moodle/login` |
123 | | - |
124 | | -```json |
125 | | -{ |
126 | | - "username": "user", |
127 | | - "password": "pass" |
128 | | -} |
129 | | -``` |
130 | | - |
131 | | -**GET** `/api/moodle/courses?token=YOUR_TOKEN` |
132 | | - |
133 | | -**GET** `/api/moodle/assignments?token=YOUR_TOKEN&course_id=1` |
134 | | - |
135 | | -### Health |
136 | | - |
137 | | -**GET** `/health` - Returns `{"status": "ok"}` |
138 | | - |
139 | | - |
140 | | -## Configuration |
141 | | - |
142 | | -All configuration is in `app/config/settings.py`. Override via: |
| 57 | +--- |
143 | 58 |
|
144 | | -1. `.env` file |
145 | | -2. Environment variables |
146 | | -3. Defaults in Settings class |
| 59 | +*Developed as the primary technical part of the diploma process at Warsaw University of Technology.* |