A beautiful web interface for the RAG (Retrieval-Augmented Generation) Question Answering system built with Streamlit.
- 🤖 Interactive Web Interface: Modern, responsive design
- 📊 Confidence Scores: Visual confidence indicators with color coding
- ⚙️ Configurable Parameters: Adjust model settings in real-time
- 📁 Data Upload: Upload custom JSON files with Q&A pairs
- 🎯 Real-time Processing: Instant answers with processing time display
- 📈 Statistics Dashboard: System metrics and recent questions
1. Install dependencies: `pip install -r requirements_streamlit.txt`
2. Run the application: `streamlit run streamlit_app.py`
3. Open your browser: the app will automatically open at `http://localhost:8501`
- The app loads with sample data by default
- Type your question in the text area
- Click "Get Answer" to receive a response with confidence score
- Adjust parameters in the sidebar as needed
1. Prepare a JSON file with Q&A pairs in this format:

   ```json
   [
     {
       "question": "Your question here?",
       "answer": "The corresponding answer here."
     }
   ]
   ```

2. Select "Upload Custom Data" in the sidebar
3. Upload your JSON file
4. Start asking questions!
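Before handing uploaded data to the QA system, it helps to check the file actually matches the expected shape. A minimal sketch, assuming a standalone helper (the name `validate_qa_pairs` is illustrative and not part of `streamlit_app.py`):

```python
import json

def validate_qa_pairs(raw_text):
    """Parse uploaded JSON and verify it is a list of Q&A dicts.

    Illustrative helper -- not part of streamlit_app.py.
    """
    data = json.loads(raw_text)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON list of Q&A objects")
    for i, item in enumerate(data):
        if not isinstance(item, dict) or "question" not in item or "answer" not in item:
            raise ValueError(f"Entry {i} must contain 'question' and 'answer' keys")
    return data
```

In the app, the raw text would come from the bytes returned by the sidebar file uploader; malformed files can then be reported to the user instead of failing later.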
Model Parameters:
- Max Answer Length: Control the length of generated answers (20-100 tokens)
- Number of Beams: Adjust beam search for better quality (1-8 beams)
- Confidence Threshold: Set minimum confidence for reliable answers (0-100%)
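These sidebar values typically map onto Hugging Face `generate()` keyword arguments. A minimal sketch of clamping them to the documented ranges and packaging them (the function name and exact kwargs chosen here are assumptions, not the app's actual code):

```python
def build_generation_kwargs(max_answer_length, num_beams):
    """Clamp sidebar values to their documented ranges and return
    generation keyword arguments (illustrative sketch)."""
    max_answer_length = max(20, min(100, int(max_answer_length)))  # 20-100 tokens
    num_beams = max(1, min(8, int(num_beams)))                     # 1-8 beams
    return {
        "max_new_tokens": max_answer_length,
        "num_beams": num_beams,
        "early_stopping": num_beams > 1,  # only meaningful with beam search
    }
```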
Device Selection:
- Auto-detect: Automatically choose GPU if available
- CPU: Force CPU usage
- GPU: Force GPU usage (if available)
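Auto-detection usually boils down to checking `torch.cuda.is_available()`. A pure sketch of the selection logic (the helper is illustrative; GPU availability is passed in as a flag so the logic runs without torch or a GPU):

```python
def pick_device(choice, gpu_available):
    """Resolve the sidebar device option to a torch device string.

    choice: "Auto-detect", "CPU", or "GPU" (the sidebar options);
    gpu_available: e.g. the result of torch.cuda.is_available().
    Illustrative sketch, not the app's actual code.
    """
    if choice == "CPU":
        return "cpu"
    if choice == "GPU":
        if not gpu_available:
            raise RuntimeError("GPU requested but CUDA is not available")
        return "cuda"
    # Auto-detect: prefer the GPU when one is present
    return "cuda" if gpu_available else "cpu"
```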
```
rag_project/
├── streamlit_app.py           # Main Streamlit application
├── requirements_streamlit.txt # Dependencies for Streamlit
├── sample_data.json           # Sample Q&A data
├── README_streamlit.md        # This file
└── rag_project/
    └── qa_system.py           # Core QA system
```
The system provides confidence scores based on semantic similarity:
- 🟢 High Confidence (70%+): Green indicator - reliable answer
- 🟡 Medium Confidence (40-69%): Yellow indicator - moderate reliability
- 🔴 Low Confidence (<40%): Red indicator - low reliability
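The bands above can be mapped to an indicator with a small helper; a sketch (the function name is illustrative, the thresholds match the table):

```python
def confidence_indicator(score):
    """Map a 0-100 confidence score to the colour bands above."""
    if score >= 70:
        return "🟢", "High Confidence"
    if score >= 40:
        return "🟡", "Medium Confidence"
    return "🔴", "Low Confidence"
```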
CUDA/GPU Issues:
- If you encounter GPU errors, try selecting "CPU" in device options
- Ensure you have compatible CUDA drivers installed

Memory Issues:
- Reduce the "Max Answer Length" parameter
- Use smaller datasets
- Close other applications using GPU memory

Model Loading Issues:
- Check your internet connection (models are downloaded on first run)
- Ensure sufficient disk space for model downloads
- GPU Usage: Enable GPU for faster processing
- Batch Processing: Process multiple questions in sequence
- Model Caching: The app caches the model to avoid reloading
Modify the CSS in the `st.markdown()` section to customize the appearance.
- Extend the `get_answer_with_confidence()` function for additional metrics
- Add more configuration options in the sidebar
- Implement session state for conversation history
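Conversation history can be kept as a plain list stored in Streamlit session state. A pure helper sketch (illustrative; the session-state wiring is only shown in the docstring):

```python
def append_to_history(history, question, answer, max_entries=20):
    """Append a Q&A pair and trim the oldest entries.

    In the app this list would live in Streamlit session state, e.g.
        history = st.session_state.setdefault("history", [])
    Illustrative sketch, not the app's actual code.
    """
    history.append({"question": question, "answer": answer})
    return history[-max_entries:]
```

Keeping the logic in a pure function makes it easy to test and reuse outside the Streamlit callback flow.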
- Streamlit: Web framework
- PyTorch: Deep learning framework
- Transformers: Hugging Face models
- Sentence-Transformers: Embedding models
- Scikit-learn: Machine learning utilities
- NumPy/Pandas: Data processing
This project is part of the RAG QA System. See the main project license for details.