AI Doctor 2.0 is an experimental AI-powered health assistant capable of listening, seeing, and responding intelligently through a unified multimodal interface. It integrates speech, image, and text modalities using advanced AI models and a user-friendly Gradio interface.
- 🎙️ Voice Input — Real-time speech-to-text using OpenAI Whisper
- 🖼️ Image Input — Vision context processed using LLaMA Instruct (Base64 encoded)
- 🧠 AI Response — Powered by Groq LLM for ultra-fast and accurate reasoning
- 🔊 Text-to-Speech Output — Converts AI responses to voice using gTTS
- 💻 Web UI — Seamless interaction via Gradio
| Component | Technology |
|---|---|
| Speech-to-Text | OpenAI Whisper |
| Image Processing | LLaMA Instruct |
| LLM Inference | Groq LLM |
| Text-to-Speech | gTTS |
| Web UI | Gradio |
| Backend | Python |
git clone https://github.com/Coder-010506/ai-doctor-2.0.git
cd ai-doctor-2.0