aryanguptacsvtu/DocuMind


📚 DocuMind-RAG -- Chat with Your Documents


🧠 About the Project

DocuMind-RAG is an intelligent AI-powered PDF chatbot that lets you upload multiple documents and chat directly with their content.
It uses Retrieval-Augmented Generation (RAG) with LangChain, FAISS, and Groq LLaMA 3.1 to deliver accurate, context-aware answers from your files.

Whether you're analyzing research papers, study material, or business reports, DocuMind turns your PDFs into interactive conversations.


🔄 How It Works

graph TB
    subgraph Setup["📄 Document Processing (One-Time Setup)"]
        A[Upload PDFs] --> B[PyPDF2: Extract Text]
        B --> C[TextSplitter: Split into Chunks]
        C --> D[HuggingFace: Generate Embeddings]
        D --> E[FAISS: Store Vector DB]
    end
    
    subgraph Chat["💬 Conversational Interface"]
        F[User Asks Question]
        F --> G{Has Chat History?}
    end
    
    subgraph RAG["🔍 RAG Workflow"]
        G -->|Yes| H[Contextualize Question with History]
        G -->|No| I[Use Question As-Is]
        H --> J[FAISS: Retrieve Relevant Chunks]
        I --> J
        J --> K[LLaMA 3.1 via Groq: Generate Answer]
        K --> L[Update Chat History]
    end
    
    L --> M[Display Response to User]
    M --> F
    E -.->|Vector Store Ready| F
    
    style A fill:#9D00FF,color:#fff
    style E fill:#4CAF50,color:#fff
    style F fill:#2196F3,color:#fff
    style K fill:#FF9800,color:#fff
    style M fill:#2196F3,color:#fff
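The RAG workflow in the diagram can be sketched as a single chat turn. This is a minimal illustration rather than the project's actual code: `answer_turn` and the three `fake_*` helpers are hypothetical names, and the helpers stand in for the history-aware question rewriter, the FAISS retriever, and the Groq-hosted model so the sketch runs without any external service.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (question, answer) pairs


def answer_turn(
    question: str,
    history: History,
    contextualize: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Run one chat turn, following the branches in the diagram."""
    # Rewrite the question only when there is earlier conversation to draw on.
    standalone = contextualize(question, history) if history else question
    chunks = retrieve(standalone)
    answer = generate(standalone, chunks)
    # Store the user's original wording, not the rewritten question.
    history.append((question, answer))
    return answer


# Hypothetical stand-ins for the real components (assumptions, not project code).
def fake_contextualize(question: str, history: History) -> str:
    last_question, _ = history[-1]
    return f"{question} (follow-up to: {last_question})"


def fake_retrieve(query: str) -> List[str]:
    return [f"chunk matching '{query}'"]


def fake_generate(query: str, chunks: List[str]) -> str:
    return f"Answer to '{query}' using {len(chunks)} chunk(s)"


history: History = []
first = answer_turn("What is FAISS?", history,
                    fake_contextualize, fake_retrieve, fake_generate)
second = answer_turn("Who maintains it?", history,
                     fake_contextualize, fake_retrieve, fake_generate)
```

The first call skips contextualization because the history is empty. The second call rewrites the follow-up into a standalone question before retrieval, which is what lets a vague follow-up such as "Who maintains it?" still retrieve relevant chunks.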

Core Features

✅ Upload and process multiple PDF files
✅ Extract, chunk, and embed text using HuggingFace transformers
✅ Store vector embeddings efficiently with FAISS
✅ Query your PDFs conversationally using LLaMA 3.1 (via the Groq API)
✅ Memory-aware responses with contextual follow-ups
✅ Clean Streamlit UI for real-time chatting
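
To make the chunk, embed, and retrieve steps concrete, here is a small standard-library sketch. It is an illustration under stated assumptions, not the project's implementation: word-count vectors stand in for HuggingFace embeddings, a sorted linear scan stands in for a FAISS index, and the function names and the `chunk_size` and `overlap` defaults are hypothetical.

```python
import math
from collections import Counter
from typing import List


def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


windows = split_into_chunks("abcdefghij", chunk_size=4, overlap=2)
docs = [
    "FAISS stores vectors for similarity search",
    "Streamlit renders the chat interface",
    "Groq serves the language model",
]
best = top_k("how does similarity search work", docs, k=1)
```

The overlap means text near a chunk boundary appears in both neighboring chunks, so a sentence that straddles a boundary is still retrievable as a whole from at least one of them.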


🛠️ Tech Stack

Python, Streamlit, LangChain, FAISS, HuggingFace embeddings, PyPDF2, and Groq (LLaMA 3.1).


📦 Setup & Installation

Follow these steps to run DocuMind on your local machine.

1. Clone the Repository

git clone https://github.com/aryanguptacsvtu/DocuMind.git
cd DocuMind

2. Create a Virtual Environment

It's highly recommended to use a virtual environment to manage dependencies.

# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

# For Windows
python -m venv venv
.\venv\Scripts\activate

3. Install Dependencies

You can install all Python packages using the provided requirements.txt file.

pip install -r requirements.txt

4. Set Up Environment Variables

The application uses an API key for the Groq LLM.

  1. Create a file named .env in the root of your project directory.

  2. Add your Groq API key to this file: GROQ_API_KEY="your-api-key-here"
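
As a rough sketch of how an app might pick up this key at startup, the snippet below parses a `.env` file with the standard library only. It is an assumption about the mechanism, not the project's code: many projects call `load_dotenv()` from the `python-dotenv` package instead, and `read_env_file` and `get_groq_api_key` are hypothetical helper names.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in GROQ_API_KEY="...".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_api_key(env_path: Path) -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("GROQ_API_KEY")
    if not key and env_path.exists():
        key = read_env_file(env_path).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or the environment")
    return key


# Demonstration with a throwaway .env file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".env"
    demo_path.write_text('GROQ_API_KEY="your-api-key-here"\n', encoding="utf-8")
    demo_values = read_env_file(demo_path)
```

Failing early with a clear error is friendlier than letting the first model call fail with an authentication error. Keep the `.env` file out of version control so the key is never committed.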

5. Run the Streamlit App

Once everything is installed, launch the app from your terminal:

streamlit run frontend.py

Your browser should automatically open to the application.


▶️ Usage

  1. Launch the application using the command above.

  2. Use the sidebar to upload one or more PDF files you wish to chat with.

  3. Click the "Process" button and wait for the "Documents processed!" success message.

  4. The chat input box at the bottom of the page will become active.

  5. Start asking your questions!


👨‍💻 Author

Aryan Gupta
📍 Bhilai, Chhattisgarh
🔗 GitHub Profile


⭐ Support

If you like this project, leave a ⭐ and share it with others!
