FinQuery is a sophisticated Question-Answering system packaged as a Telegram bot. It leverages a fine-tuned sentence-transformer model to understand and accurately answer user questions about banking services, all built on a scalable, asynchronous architecture using Apache Kafka and Docker.
- Semantic Search: Utilizes a fine-tuned `bge-small-en-v1.5` model to find answers based on semantic meaning, not just keywords.
- Scalable Architecture: Employs Apache Kafka as a message broker to decouple the Telegram bot from the AI model, ensuring high performance and reliability.
- Dockerized: The entire application stack (bot, worker, Kafka) is containerized and can be launched with a single Docker Compose command.
- End-to-End MLOps: Covers the full lifecycle: fine-tuning the model, creating a search index, and deploying the inference service.
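The idea behind the semantic-search feature can be illustrated with plain cosine similarity over embedding vectors. This is only a toy sketch with made-up 4-dimensional vectors; in FinQuery the embeddings come from the fine-tuned `bge-small-en-v1.5` model and the search runs through a FAISS index.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings (in the real system these come from the fine-tuned model).
query = np.array([0.9, 0.1, 0.0, 0.2])
answers = {
    "How do I reset my card PIN?": np.array([0.85, 0.15, 0.05, 0.25]),
    "What are the branch opening hours?": np.array([0.0, 0.9, 0.4, 0.1]),
}

# Rank candidate answers by semantic closeness, not keyword overlap.
best = max(answers, key=lambda k: cosine_similarity(query, answers[k]))
print(best)
```

Because ranking happens in embedding space, a query phrased with entirely different words from the stored answer can still match, which is what distinguishes this approach from keyword search.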
The project follows a microservice architecture:

- Telegram Bot (`telegram_bot.py`): The user-facing service that receives queries from users. It acts as a Kafka producer.
- Apache Kafka: A message broker that buffers incoming requests.
- Model Worker (`model_worker.py`): A dedicated Kafka consumer that uses the AI model and a FAISS index to find the best answer and sends it back to the user.
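Because the bot and the worker communicate only through Kafka, they must agree on a serialization format for messages. The sketch below shows one plausible JSON envelope for a request; the field names (`chat_id`, `text`) are assumptions for illustration, not FinQuery's actual schema, and the real broker round trip is omitted.

```python
import json

def encode_request(chat_id: int, text: str) -> bytes:
    """Serialize a user query before the bot produces it to the request topic.
    Field names are hypothetical, not FinQuery's actual schema."""
    return json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")

def decode_request(payload: bytes) -> dict:
    """Deserialize a query after the model worker consumes it."""
    return json.loads(payload.decode("utf-8"))

# Round trip: what the producer sends is exactly what the consumer reads.
msg = encode_request(42, "What is my card limit?")
assert decode_request(msg) == {"chat_id": 42, "text": "What is my card limit?"}
```

Keeping the `chat_id` in the message is what lets the worker route its answer back to the right Telegram conversation after the asynchronous hop through the broker.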
You will need the following to run FinQuery:

- Docker & Docker Compose
- Git
- A Telegram Bot Token, API ID, and API Hash
Clone the repository:

```bash
git clone https://github.com/moslemamini/FinQuery.git
cd FinQuery
```

Create your own `.env` file by copying the example, then fill it with your actual credentials:

```bash
cp .env.example .env
```

Now open the `.env` file and edit the variables (`API_ID`, `API_HASH`, `BOT_TOKEN`).
Place your training data (e.g., `triplets.jsonl`) and your knowledge base data (`bank_dataset.jsonl`) inside the `data/` directory.
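The exact layout of `triplets.jsonl` is not documented here; a common convention for triplet fine-tuning of sentence transformers is one JSON object per line with anchor/positive/negative fields, which is assumed below purely for illustration.

```python
import json
from pathlib import Path

# Assumed layout (not documented in this README): one JSON object per line
# with anchor / positive / negative fields, the usual shape for triplet data.
sample = Path("triplets_sample.jsonl")
sample.write_text(
    '{"anchor": "How do I block my card?", '
    '"positive": "Call support to block a lost card.", '
    '"negative": "Branches open at 9 AM."}\n'
)

triplets = [json.loads(line) for line in sample.read_text().splitlines() if line.strip()]
print(len(triplets), triplets[0]["anchor"])

sample.unlink()  # clean up the demo file
```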
If you want to fine-tune the base model on your own data, run the training script. This step requires a machine with a GPU.
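The project's `train.py` is not reproduced here, but fine-tuning on triplets typically minimizes a margin loss that pulls the anchor toward the positive and pushes it away from the negative. A minimal numpy sketch of that objective (an illustration of the idea, not the actual training code):

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """max(0, d(a, p) - d(a, n) + margin) with Euclidean distance.
    Zero loss means the positive is already closer than the negative
    by at least the margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])   # semantically close: small distance to the anchor
n = np.array([0.0, 1.0])   # unrelated: large distance to the anchor
print(triplet_margin_loss(a, p, n))
```

Training repeatedly nudges the embedding model so this loss approaches zero across the dataset, which is what makes the fine-tuned model rank correct answers above distractors.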
```bash
# This command runs the training script inside a temporary Docker container
docker-compose run --rm model_worker python src/worker/train.py \
    --model_name_or_path BAAI/bge-small-en-v1.5 \
    --train_data /app/data/triplets.jsonl \
    --output_dir /app/output/finetuned-bge-model \
    --num_train_epochs 3 \
    --fp16
```

Next, build the search index. This is a required one-time setup step: the following command processes your knowledge base and creates a FAISS index for fast searching.
```bash
docker-compose run --rm model_worker python src/worker/create_index.py
```

This will create an `index/` directory on your local machine containing the search index.
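Conceptually, the index enables the worker to answer queries with a nearest-neighbor lookup. The numpy sketch below shows the exhaustive inner-product search that a flat FAISS index over normalized embeddings performs; `create_index.py` itself is not reproduced here, and the toy 3-dimensional vectors are made up.

```python
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    """Normalize rows so that inner product equals cosine similarity,
    mirroring a flat FAISS index built over normalized vectors."""
    return embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

def search(index: np.ndarray, query: np.ndarray, k: int = 1) -> np.ndarray:
    """Exhaustive inner-product search; returns indices of the top-k rows."""
    q = query / np.linalg.norm(query)
    scores = index @ q
    return np.argsort(scores)[::-1][:k]

# Toy knowledge-base embeddings (one row per answer) and a query vector.
kb = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.7, 0.7, 0.0]])
idx = build_index(kb)
print(search(idx, np.array([0.9, 0.1, 0.0])))
```

A real FAISS index does the same computation with optimized data structures, which is why building it once up front makes per-query lookups fast at serving time.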
Start the entire system with one command:

```bash
docker-compose up --build
```

Your FinQuery bot is now live and listening for messages on Telegram. To stop the application, press `Ctrl+C`.
This project is licensed under the MIT License. See the LICENSE file for details.