Skip to content

Latest commit

 

History

History
14 lines (9 loc) · 531 Bytes

File metadata and controls

14 lines (9 loc) · 531 Bytes

Simple OCR App with LLM

A simple web application that uses a multimodal large language model (LLM) to perform Optical Character Recognition (OCR) on images. This application utilizes the opengvlab/internvl3-14b:free model via OpenRouter API and provides a user-friendly interface with Gradio.

Features

  • Upload images containing text
  • Extract text from images using advanced AI
  • Support for various image formats

Requirements

  • Python 3.8+
  • OpenRouter API Key (get it at OpenRouter)