
DocQnA is a two-phase document question-answering system that combines advanced natural language processing techniques with a robust data ingestion and transformation pipeline. The first phase involves fine-tuning a language model, while the second phase focuses on deploying the system for real-time document question answering.
The system is deployed on Azure: [https://dockyqna.azurewebsites.net/]
A fine-tuned version of the Zephyr-7B model for document question answering is available on Hugging Face: [https://huggingface.co/Feluda/Zephyr-7b-QnA]
This phase involves fine-tuning the Zephyr-7B language model on a dataset of choice. The fine-tuned model is then saved for later use in the DocQnA system.
These are the training results of the fine-tuned model:
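The README does not spell out the fine-tuning recipe; a common way to adapt a 7B model such as Zephyr on modest hardware is parameter-efficient fine-tuning (e.g. LoRA), which trains a small low-rank update on top of frozen base weights. The core idea can be sketched in plain Python with toy dimensions (no real training loop or transformer weights):

```python
# Toy LoRA sketch: the frozen base weight W stays fixed; only the small
# low-rank factors A (d x r) and B (r x k) are trained. The effective
# weight at inference time is W + A @ B. Dimensions here are tiny and
# illustrative -- real Zephyr-7B layers are thousands of units wide.

def matmul(X, Y):
    """Plain-Python matrix multiply, kept dependency-free for the sketch."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, r, k = 4, 2, 4          # hidden size, LoRA rank, output size
W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(d)]  # frozen
A = [[0.1] * r for _ in range(d)]   # trainable down-projection (d x r)
B = [[0.1] * k for _ in range(r)]   # trainable up-projection (r x k)

delta = matmul(A, B)        # low-rank update: only d*r + r*k parameters
W_eff = add(W, delta)       # weight actually applied to activations

print(W_eff[0])  # -> [1.02, 0.02, 0.02, 0.02]
```

The point of the rank bottleneck is the parameter count: the update touches d*r + r*k numbers instead of the full d*k, which is what makes fine-tuning a 7B model feasible on a single GPU.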
In this phase, the fine-tuned language model is integrated with a document retrieval system to provide accurate and contextually relevant responses to user queries.
Clear separation between model training and deployment phases, keeping the serving path simple and maintainable.
Utilizes a fine-tuned language model for generating high-quality, human-like responses.
Implements a FAISS vector store for fast similarity search, ensuring that the most relevant passages are quickly accessible.
Employs a conversational memory mechanism to maintain continuity in the conversation, allowing the chatbot to consider past interactions and provide contextually relevant responses.
Offers a user-friendly interface for users to interact with the system, making it easy to ask questions and receive detailed answers.
Designed to scale with the size of the document collection, accommodating large volumes of data without compromising performance.
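The retrieval step behind the FAISS store can be illustrated with a toy version of the pipeline: embed document chunks, index them, and return the nearest chunk for a query. Here a bag-of-words vector stands in for the real transformer embeddings, and a brute-force cosine search stands in for FAISS; both are simplifications of what the app actually uses:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (the real app uses
    Hugging Face transformer embeddings instead)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "The MIT License permits reuse with attribution.",
    "FAISS provides fast approximate nearest-neighbour search.",
    "Flask serves the web interface on port 7000.",
]
index = [(c, embed(c)) for c in chunks]   # stand-in for the FAISS store

def retrieve(query):
    """Return the chunk most similar to the query (brute-force scan)."""
    return max(index, key=lambda item: cosine(embed(query), item[1]))[0]

print(retrieve("which port does the web server use"))
# -> "Flask serves the web interface on port 7000."
```

FAISS does the same nearest-neighbour lookup, but over dense vectors with index structures that stay fast as the document collection grows, which is what the scalability claim above rests on.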
Prerequisites
Python 3.x
Flask
langchain-community libraries
PyPDF2 (for handling PDF files)
Hugging Face Transformers (for embeddings)
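The prerequisites above roughly correspond to a requirements file along these lines (illustrative only; the exact package list and pins live in the repository's own requirements.txt):

```text
flask
langchain-community
PyPDF2
transformers
faiss-cpu
```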
git clone https://github.com/shrey2003/docq.git
cd docq
pip install -r requirements.txt
python app.py
Open a web browser and navigate to http://localhost:7000.
Upload a PDF document.
Ask a question related to the content of the document.
Receive an answer generated by the fine-tuned language model.
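Between uploading a PDF and asking a question, the server must turn the extracted document text into indexable pieces. A minimal chunking helper (pure Python; the hypothetical `chunk_text` name and parameters are illustrative, and the real app first extracts text with PyPDF2) might look like:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split extracted document text into overlapping word windows, so a
    sentence cut at a chunk boundary still appears whole in one chunk."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks

# A 500-word document yields overlapping 200-word windows.
doc = " ".join(f"word{i}" for i in range(500))
pieces = chunk_text(doc)
print(len(pieces))  # -> 3
```

Each chunk is then embedded and stored in the vector index, so the answer to a question can be grounded in the few chunks most relevant to it rather than the whole document.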
The application can be deployed using Docker. A Dockerfile is provided for building a container image.
docker build -t docqna .
docker run -p 7000:7000 docqna
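The repository ships its own Dockerfile; a container image for a Flask app like this is typically structured along the following lines (an illustrative sketch, not the repo's actual file):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7000
CMD ["python", "app.py"]
```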
docker build -t docqna.azurecr.io/docqna:latest .
docker login docqna.azurecr.io
docker push docqna.azurecr.io/docqna:latest
Distributed under the MIT License. See LICENSE for more information.