A simple web application that uses a multimodal large language model (LLM) to perform Optical Character Recognition (OCR) on images. This application utilizes the opengvlab/internvl3-14b:free model via OpenRouter API and provides a user-friendly interface with Gradio.
- Upload images containing text
- Extract text from images using advanced AI
- Support for various image formats
- Python 3.8+
- OpenRouter API Key (get it at OpenRouter)