This project implements a Retrieval-Augmented Generation (RAG) system that allows users to upload documents and then search through them using natural language queries. The system leverages OpenAI's powerful language models to understand queries, retrieve relevant document snippets, and generate concise answers based on the retrieved context.
- Document Upload: Asynchronously upload text documents to the backend.
- Vector Embeddings: Documents are processed to generate vector embeddings using OpenAI's `text-embedding-ada-002` model.
- Semantic Search (RAG):
- User queries are embedded.
- Cosine similarity is used to find the most relevant document snippets from the uploaded collection.
- The retrieved context and the user's query are sent to a Large Language Model (LLM) (e.g., GPT-3.5 Turbo) to generate an accurate and contextualized answer.
- FastAPI Backend: A robust and efficient Python backend built with FastAPI.
- React Frontend: A user-friendly React application for interacting with the backend.
- Detailed Logging & Error Handling: Enhanced logging to aid in debugging and clearer error messages for better user experience.
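The retrieval step described above can be sketched with NumPy. This is a minimal illustration, not the project's actual code: the 3-dimensional vectors are toy stand-ins for real `text-embedding-ada-002` embeddings (which are 1536-dimensional), and `top_k_snippets` is a hypothetical helper name.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_snippets(query_emb, snippet_embs, snippets, k=2):
    """Return the k snippets whose embeddings are most similar to the query."""
    scores = [cosine_similarity(query_emb, e) for e in snippet_embs]
    order = np.argsort(scores)[::-1][:k]  # indices of highest scores first
    return [snippets[i] for i in order]

# Toy 3-dimensional "embeddings" (real ada-002 vectors are 1536-dimensional).
snippets = ["cats purr", "dogs bark", "stocks fell"]
embs = [np.array([1.0, 0.1, 0.0]),
        np.array([0.9, 0.2, 0.1]),
        np.array([0.0, 0.1, 1.0])]
query = np.array([1.0, 0.0, 0.0])
results = top_k_snippets(query, embs, snippets, k=2)
```

In the real pipeline the retrieved snippets then become the context for the LLM prompt rather than being returned directly.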
Backend:
- Python 3.x
- FastAPI: Web framework for building APIs.
- Uvicorn: ASGI server to run FastAPI.
- OpenAI Python Library: For generating embeddings and LLM completions.
- NumPy: For numerical operations on embeddings.
- python-dotenv: For managing environment variables.
- httpx: Underlying HTTP client for OpenAI.
Frontend:
- React.js: JavaScript library for building user interfaces.
- Fetch API: For making HTTP requests to the backend.
- Navigate to the backend directory:

  ```bash
  cd backend
  ```

- Create a virtual environment (optional but recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up your environment variables: create a `.env` file in the `backend` directory with the following variables:

  ```
  CHATLLM_API_KEY=your_openai_api_key
  CHATLLM_API_URL=https://api.openai.com/v1
  CHATLLM_MODEL=gpt-3.5-turbo
  CHATLLM_TEMPERATURE=0.7
  CHATLLM_MAX_TOKENS=1000
  ```
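A sketch of how the backend might read these settings; the `load_config` helper is illustrative (the real `main.py` may organize this differently), and in the project `python-dotenv` would populate the environment from `.env` first:

```python
import os
# In the real backend, python-dotenv loads the .env file first:
#   from dotenv import load_dotenv; load_dotenv()

def load_config() -> dict:
    """Read the CHATLLM_* settings, falling back to the defaults above."""
    return {
        "api_key": os.getenv("CHATLLM_API_KEY", ""),
        "api_url": os.getenv("CHATLLM_API_URL", "https://api.openai.com/v1"),
        "model": os.getenv("CHATLLM_MODEL", "gpt-3.5-turbo"),
        "temperature": float(os.getenv("CHATLLM_TEMPERATURE", "0.7")),
        "max_tokens": int(os.getenv("CHATLLM_MAX_TOKENS", "1000")),
    }
```

Parsing `temperature` and `max_tokens` to numbers up front keeps a malformed `.env` from surfacing later as a confusing OpenAI API error.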
- Install the required dependencies:

  ```bash
  npm install
  ```
- Start the backend server:

  ```bash
  cd backend
  python main.py
  ```

  The backend will be available at http://localhost:8000.

- In a new terminal, start the frontend development server:

  ```bash
  npm start
  ```

  The frontend will be available at http://localhost:3000.
- `GET /` - Root endpoint with API information
- `POST /api/upload` - Upload a document for indexing
- `POST /api/search` - Perform a search query
- `GET /docs` - Interactive API documentation
- Upload a document:

  ```bash
  curl -X POST "http://localhost:8000/api/upload" \
    -H "Content-Type: multipart/form-data" \
    -F "file=@path/to/your/document.txt"
  ```

- Perform a search:

  ```bash
  curl -X POST "http://localhost:8000/api/search" \
    -H "Content-Type: application/json" \
    -d '{"query": "your search query"}'
  ```
Search-AI/
├── backend/
│ ├── main.py # FastAPI application entry point
│ ├── requirements.txt # Python dependencies
│ ├── .env # Environment variables (not in version control)
│ └── sample_document.txt # Example document for testing
├── src/
│ ├── components/
│ │ ├── FileUpload.jsx # Component for document upload
│ │ ├── Header.jsx # Application header
│ │ ├── SearchBar.jsx # Search input component
│ │ ├── SearchComponent.js # Main search functionality
│ │ └── ToggleAbstracts.jsx # Toggle for abstract/full text
│ ├── pages/
│ │ ├── AdvancedSearchPage.jsx # Advanced search interface
│ │ ├── SearchPage.js # Basic search page
│ │ └── SearchPage.jsx # Enhanced search page
│ ├── App.js # Main application component
│ ├── App.jsx # Alternative main application component
│ ├── index.js # React application entry point
│ └── theme.js # Application theme configuration
├── public/
│ └── index.html # Main HTML file
├── package.json # Frontend dependencies and scripts
└── README.md # This file
- Document Upload: Users upload text documents through the frontend interface.
- Embedding Generation: The backend processes uploaded documents to create vector embeddings using OpenAI's `text-embedding-ada-002` model.
- Storage: Document contents and their embeddings are stored in memory (in production, this would be persisted to a database).
- Query Processing: When a user submits a search query, it's converted to an embedding.
- Similarity Search: The system finds the most relevant document snippets using cosine similarity.
- Response Generation: The retrieved context and user query are sent to OpenAI's GPT model to generate a contextualized answer.
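The response-generation step combines the retrieved context with the user's query into a chat request. The sketch below is illustrative — the message layout and the `build_messages` helper are assumptions, not the project's exact prompt:

```python
def build_messages(query: str, snippets: list[str]) -> list[dict]:
    """Build the chat messages for the LLM: numbered context, then the question."""
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return [
        {"role": "system",
         "content": "Answer using only the provided context. "
                    "If the context is insufficient, say so."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ]

# The messages would then go to the chat completions endpoint, roughly:
#   client.chat.completions.create(model="gpt-3.5-turbo", messages=messages, ...)
messages = build_messages("Why do cats purr?", ["Cats purr when content."])
```

Grounding the system message in the retrieved context is what makes the answer "contextualized": the model is steered toward the uploaded documents rather than its general training data.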
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the powerful language models and embeddings API
- The FastAPI team for an excellent Python web framework
- The React team for the frontend library