A powerful PDF chat application that leverages Retrieval-Augmented Generation (RAG) to enable intelligent conversations with your documents.
This application allows users to upload PDF documents and interact with them through natural language questions. It retrieves the most relevant passages from your documents and uses them as grounding context, so answers are accurate and context-aware.
🚀 Try it Live
- 📤 Easy PDF Upload: Drag and drop PDF documents through an intuitive interface
- 🤖 AI-Powered Conversations: Ask questions in natural language and get intelligent responses
- ⚡ Auto-Indexing: Documents are automatically processed and indexed upon upload
- 🔍 Semantic Search: Uses vector embeddings for accurate information retrieval
- 💬 Chat History: Maintains conversation context throughout your session
- 🎨 Clean Interface: Modern, user-friendly Streamlit interface
| Component | Technology |
|---|---|
| Frontend | Streamlit |
| LLM | Meta-Llama-3-8B-Instruct |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Vector Store | FAISS |
| Framework | LangChain 0.3+ (LCEL) |
| PDF Processing | PyPDFLoader |
| Text Splitting | RecursiveCharacterTextSplitter |
```
┌─────────────────────┐
│      PDF Upload     │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│     PyPDFLoader     │
│    Extract Text     │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│    Text Splitter    │
│  (1000 chars, 150   │
│      overlap)       │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│     HuggingFace     │
│     Embeddings      │
│ (all-MiniLM-L6-v2)  │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ FAISS Vector Store  │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│     User Query      │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Retriever (k=3)   │
│ Get relevant chunks │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  RAG Chain (LCEL)   │
│ Context + Question  │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│    Llama 3 Model    │
│   Generate Answer   │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Display Response   │
└─────────────────────┘
```
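As a rough illustration of the indexing half of this flow, the steps above might be wired together like this (a sketch assuming LangChain 0.3-style split packages; `example.pdf` stands in for the uploaded file):

```python
# Sketch of the ingestion pipeline: PDF -> chunks -> embeddings -> FAISS.
# Import paths assume LangChain 0.3+ with the community/huggingface packages.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

docs = PyPDFLoader("example.pdf").load()   # one Document per page
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)    # ~1000-char overlapping chunks

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vector_store = FAISS.from_documents(chunks, embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
```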
- Python 3.8+
- Hugging Face API Token
- Clone the repository

  ```bash
  git clone https://huggingface.co/spaces/Pats182/Document-Intelligence-Agent-RAG
  cd Document-Intelligence-Agent-RAG
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables

  ```bash
  export HF_Token="your_huggingface_token_here"
  ```

- Run the application

  ```bash
  streamlit run src/streamlit_app.py
  ```
- Upload a PDF: Click the "Upload PDF" button in the sidebar
- Wait for Processing: The document will be automatically indexed
- Ask Questions: Type your questions in the chat input
- Get Answers: Receive AI-generated responses based on your document
- Temperature: 0.3 (for more focused responses)
- Max Tokens: 512
- Retrieval Top-K: 3 chunks
- Chunk Size: 1000 characters
- Chunk Overlap: 150 characters
You can modify these parameters in `streamlit_app.py`:

```python
# Adjust LLM parameters
base_llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    temperature=0.3,     # Adjust for creativity
    max_new_tokens=512,  # Adjust response length
)

# Adjust text splitting
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,   # Adjust chunk size
    chunk_overlap=150  # Adjust overlap
)

# Adjust retrieval
retriever = vector_store.as_retriever(
    search_kwargs={"k": 3}  # Number of chunks to retrieve
)
```

The application uses PyPDFLoader to extract text from PDF documents, with proper error handling and temporary file management.
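A minimal sketch of that upload-to-temp-file pattern (the widget label and variable names are illustrative, not the app's exact code):

```python
import os
import tempfile

import streamlit as st
from langchain_community.document_loaders import PyPDFLoader

uploaded = st.file_uploader("Upload PDF", type="pdf")
if uploaded is not None:
    # PyPDFLoader needs a file path, but Streamlit gives an in-memory buffer,
    # so write the upload to a temporary file first.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(uploaded.getvalue())
        tmp_path = tmp.name
    try:
        pages = PyPDFLoader(tmp_path).load()
    finally:
        os.unlink(tmp_path)  # clean up the temporary file either way
```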
Documents are converted into vector embeddings using sentence-transformers/all-MiniLM-L6-v2, enabling semantic search capabilities.
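For example, once the FAISS index from the earlier sketch exists, a semantic query is a one-liner (the query string is made up):

```python
# Returns the 3 chunks whose embeddings are closest to the query's embedding
hits = vector_store.similarity_search("What are the key findings?", k=3)
for doc in hits:
    print(doc.metadata.get("page"), doc.page_content[:80])
```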
Built with LangChain's modern LCEL (LangChain Expression Language), the RAG chain:
- Retrieves relevant document chunks
- Formats them with the user's question
- Generates contextual responses using Llama 3
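A minimal sketch of what such a chain can look like, reusing `retriever` and `base_llm` from the configuration snippet above (the `format_docs` helper and the abbreviated prompt are illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Abbreviated here; the app's full prompt template is shown below
prompt = PromptTemplate.from_template(
    "Based on the following context, answer the question.\n\n"
    "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
)

def format_docs(docs):
    # Join the retrieved chunks into one context string
    return "\n\n".join(doc.page_content for doc in docs)

# LCEL: retrieve -> format -> fill prompt -> generate -> parse to plain text
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | base_llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is this document about?")
```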
The chain uses the following prompt template:

```
Based on the following context, answer the question accurately and concisely.
If the answer is not in the context, say "I don't have enough information to answer that."
Context: {context}
Question: {question}
Answer:
```

- Documents are processed immediately upon upload
- No manual indexing button required
- Visual feedback during processing
- Chat history persists during the session
- Current document tracking
- Clear chat option available
- Comprehensive try/except blocks
- User-friendly error messages
- Proper temporary file cleanup
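As an illustration of the session handling and error reporting described above, a Streamlit chat loop might look roughly like this (reusing `rag_chain` from the LCEL sketch; the exact widgets in the app may differ):

```python
import streamlit as st

# Chat history kept in st.session_state persists across Streamlit reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

for msg in st.session_state.messages:  # replay prior turns
    st.chat_message(msg["role"]).write(msg["content"])

if question := st.chat_input("Ask about your document"):
    st.session_state.messages.append({"role": "user", "content": question})
    try:
        answer = rag_chain.invoke(question)
    except Exception as exc:
        st.error(f"Sorry, something went wrong: {exc}")  # user-friendly error
    else:
        st.chat_message("assistant").write(answer)
        st.session_state.messages.append({"role": "assistant", "content": answer})
```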
If you need to reprocess a document:
- Click the "π Re-index" button in the sidebar
- Wait for processing to complete
To start a fresh conversation:
- Click the "ποΈ Clear Chat" button in the sidebar
- Fast Retrieval: FAISS enables efficient similarity search
- Optimized Chunks: 1000-character chunks with 150-character overlap preserve context across chunk boundaries
- Cached Models: Streamlit caching reduces model loading time
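For example, model loading can be cached with Streamlit's resource cache (a sketch; the function name is illustrative):

```python
import streamlit as st
from langchain_huggingface import HuggingFaceEmbeddings

@st.cache_resource  # runs once per process; reruns reuse the cached object
def get_embeddings():
    return HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
```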
- API tokens are securely managed through environment variables
- Temporary files are properly cleaned up after processing
- No data persistence beyond the session
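For instance, the token can be read from the environment and passed to the endpoint rather than hard-coded (a sketch; `HF_Token` matches the variable set during installation):

```python
import os

from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    huggingfacehub_api_token=os.environ.get("HF_Token"),  # never commit tokens
)
```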
Contributions are welcome! Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
This project is open source and available under standard terms.
- Built with LangChain
- Powered by Hugging Face
- UI by Streamlit
- Model: Meta-Llama-3-8B-Instruct
For questions or feedback, please visit the Hugging Face Space.
🚀 Launch Live Demo