RagBot is an AI-powered chatbot that integrates Retrieval-Augmented Generation (RAG) with Google's Gemini API. It retrieves relevant documents from a vector database and enhances responses using a generative AI model. This project is built using Streamlit for the frontend and ClickHouse for vector storage.
Live demo: https://ragbot-with-gemini-api.onrender.com/ (may not be available yet)
- Extracts and processes text from PDFs and images
- Converts text into vector embeddings for efficient search
- Retrieves relevant documents based on user queries
- Generates AI-powered responses using the Gemini API
- Built with Streamlit for an interactive UI
- Python (Streamlit, Pandas, PyPDF2)
- Google Gemini API for text generation
- ClickHouse for vector storage and retrieval
- Google Generative AI for content embedding
- PyPDF2 for PDF text extraction
- OpenCV & Tesseract OCR for image processing
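The retrieval step listed above (embed the text, then find the stored documents closest to the query) can be illustrated with a small in-memory sketch. This is not the project's code: the list `store` stands in for the ClickHouse table, the three-dimensional vectors are made-up stand-ins for real embeddings, and cosine similarity is an assumed ranking metric. The function names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (text, score) pairs from store most similar to query_vec.

    store is a list of (text, vector) tuples; in the real app this lookup
    would be a query against the vector database instead of a Python list.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: hypothetical 3-dimensional "embeddings" for three documents.
store = [
    ("Invoice totals for March", [0.9, 0.1, 0.0]),
    ("Team offsite agenda", [0.0, 0.2, 0.9]),
    ("Invoice payment terms", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.0, 0.0], store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

With real embeddings the vectors would have hundreds of dimensions, and the similarity search would normally be pushed down into the database rather than computed in Python.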
- Clone the repository:
git clone https://github.com/Vinodhariharan/RagBot-with-Gemini-API.git
- Navigate to the project directory:
cd RagBot-with-Gemini-API
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
GEMINI_API_KEY: Your API key for the Gemini API
CLICKHOUSE_HOST, CLICKHOUSE_USERNAME, CLICKHOUSE_PASSWORD: ClickHouse database credentials
- Run the Streamlit application:
streamlit run app.py
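The environment variables named in the setup steps above could be read and validated at startup along the following lines. This is a minimal sketch, not the repository's actual code: the `Settings` class and `load_settings` function are hypothetical names, and only the four variable names come from this README.

```python
import os
from dataclasses import dataclass

# Variable names taken from the setup steps in this README.
REQUIRED_VARS = (
    "GEMINI_API_KEY",
    "CLICKHOUSE_HOST",
    "CLICKHOUSE_USERNAME",
    "CLICKHOUSE_PASSWORD",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the app's configuration values."""
    gemini_api_key: str
    clickhouse_host: str
    clickhouse_username: str
    clickhouse_password: str


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ).

    Raises RuntimeError naming every missing or empty variable, so a
    misconfigured deployment fails early with a readable message.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        clickhouse_host=env["CLICKHOUSE_HOST"],
        clickhouse_username=env["CLICKHOUSE_USERNAME"],
        clickhouse_password=env["CLICKHOUSE_PASSWORD"],
    )
```

Accepting an optional mapping instead of reading `os.environ` directly keeps the function easy to exercise in tests without touching the real environment.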
- Upload a PDF or an image to extract and store content in the vector database.
- Enter a query in the chatbot interface to retrieve relevant documents and get AI-generated responses.
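To make the query step above concrete, here is one way retrieved documents might be combined with the user's question before the text is sent to the generative model. It is a sketch under assumptions: `build_prompt`, its `max_chars` budget, and the prompt wording are all hypothetical, and the actual call to the Gemini API is deliberately left out.

```python
def build_prompt(query, documents, max_chars=2000):
    """Assemble a single prompt string from a question and retrieved documents.

    Documents are added in order (most relevant first is assumed) until the
    combined snippet length would exceed max_chars; the rest are dropped.
    """
    context_parts = []
    used = 0
    for index, doc in enumerate(documents, start=1):
        snippet = doc.strip()
        if used + len(snippet) > max_chars:
            break
        context_parts.append(f"[{index}] {snippet}")
        used += len(snippet)

    context = "\n".join(context_parts) if context_parts else "(no documents retrieved)"
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
    )


prompt = build_prompt(
    "What are the payment terms?",
    ["Payment is due in 30 days.", "Late fees apply after the due date."],
)
print(prompt)
```

Numbering the snippets makes it possible to ask the model to cite which document it relied on, and the character budget is a crude guard against overly long prompts.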
RagBot-with-Gemini-API/
│── app.py # Main Streamlit application
│── requirements.txt # Python dependencies
│── utils/
│ │── vector_storage.py # Vector DB operations
│ │── pdf_processing.py # PDF text extraction
│ │── image_processing.py # Image text extraction
│── models/
│ │── gemini_integration.py # Gemini API integration
│ │── rag_retrieval.py # Retrieval logic
│── pages/
│ │── chat.py # Chat interface
│ │── upload.py # Upload interface
Contributions are welcome! Feel free to fork the repository and submit pull requests.
This project is licensed under the MIT License.