Skip to content

Raafi-Code/Simple-OCR-App-with-LLM

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Simple OCR App with LLM

A simple web application that uses a multimodal large language model (LLM) to perform Optical Character Recognition (OCR) on images. This application utilizes the opengvlab/internvl3-14b:free model via OpenRouter API and provides a user-friendly interface with Gradio.

Features

  • Upload images containing text
  • Extract text from images using advanced AI
  • Support for various image formats

Requirements

  • Python 3.8+
  • OpenRouter API Key (get it at OpenRouter)

About

A simple web application that uses a multimodal large language model (LLM) to perform Optical Character Recognition (OCR) on images.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages