opencode-auto-ocr-image

Intelligent image/PDF handler for OpenCode. When you paste an image into chat, this plugin automatically:

✅ Multimodal models (GPT-5, Claude 4, Gemini 3, etc.) — keeps the image as a native FilePart so the model sees it directly
✅ Text-only models — runs built-in OCR via tesseract.js and replaces the image with extracted text
✅ PDFs — passes through to PDF-capable models or provides the file path as fallback

Zero system dependencies. tesseract.js runs in pure JS/WASM, so there is no Tesseract binary, no Python, and no Docker. The OCR language data (~5 MB) auto-downloads from a CDN on the first OCR call and is cached for subsequent use.

How it works

User pastes image
  ↓
chat.message hook fires
  ├─ Multimodal model? → Keep FilePart (model sees the image directly)
  └─ Text-only model?  → OCR via tesseract.js → replace with text
  ↓
messages.transform hook fires (before LLM send)
  ├─ Multimodal model? → Re-inject FilePart from saved temp file
  └─ Text-only model?  → Clean up any stray FileParts
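The routing decision in the diagram above can be sketched as a small pure function. This is a minimal illustration, not the plugin's actual code: the `Capabilities`, `Attachment`, and `Action` types and the `decideAction` name are hypothetical, and passing unknown file types through unchanged is an assumption.

```typescript
// Hypothetical types for illustration; the plugin's real types may differ.
type Capabilities = { image: boolean; pdf: boolean };
type Attachment = { mime: string; path: string };
type Action =
  | { kind: "keep" }                 // leave the FilePart so the model sees the file
  | { kind: "ocr" }                  // replace the image with OCR'd text
  | { kind: "path"; path: string };  // fall back to passing the file path as text

function decideAction(caps: Capabilities, file: Attachment): Action {
  if (file.mime === "application/pdf") {
    // PDFs are never OCR'd: pass through, or hand over the path as a fallback.
    return caps.pdf ? { kind: "keep" } : { kind: "path", path: file.path };
  }
  if (file.mime.startsWith("image/")) {
    return caps.image ? { kind: "keep" } : { kind: "ocr" };
  }
  // Assumption: anything that is neither an image nor a PDF is left untouched.
  return { kind: "keep" };
}
```

Keeping the decision separate from the hook code makes it easy to check each branch of the diagram in isolation.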

Installation

Add to your opencode.json:

{
  "plugin": ["opencode-auto-ocr-image"]
}

Restart OpenCode. The plugin activates automatically — no configuration needed.

First run notes

On the first image paste with a text-only model, tesseract.js will download the OCR language data (~5 MB for Chinese + English). This takes a few seconds and happens once.
Model capabilities are auto-detected from OpenCode's provider API. A built-in fallback list covers known multimodal models (Claude, GPT-5, Gemini, etc.).

Usage

Just paste images as you normally would. The plugin handles everything transparently:

You: [paste screenshot.png]
  → Multimodal model: image sent as FilePart (native vision)
  → Text-only model:  image OCR'd, text replaces the image

Diagnostic command

Type !test-ocr in chat to see current status:

=== [auto-ocr-image diagnostic] ===
Model: claude-sonnet-4-5
Capabilities: image=true, pdf=false
OCR engine: tesseract.js (chi_sim+eng)
OCR status: ready
Cached models: 47
Temp dir: /tmp/opencode/ocr-data
=== done ===

Configuration

Set the OCR_LANG environment variable to override OCR languages:

# English only
OCR_LANG=eng

# Japanese
OCR_LANG=jpn

# Multiple languages (improves accuracy for mixed content)
OCR_LANG=chi_sim+eng+jpn

Default: chi_sim+eng (Chinese Simplified + English)
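As a rough illustration of how such an override could be resolved, the sketch below splits the `+`-separated value and falls back to the documented default when the variable is unset or empty. The `resolveOcrLangs` helper is hypothetical and only mirrors the behavior described above; the plugin's own parsing may differ.

```typescript
// Documented default: Chinese Simplified + English.
const DEFAULT_LANGS: readonly string[] = ["chi_sim", "eng"];

// Hypothetical helper: takes an env-like object so it stays easy to test.
function resolveOcrLangs(env: Record<string, string | undefined>): string[] {
  const raw = env["OCR_LANG"]?.trim();
  if (!raw) return [...DEFAULT_LANGS];
  const langs = raw
    .split("+")
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
  // A value such as "+" yields no codes, so fall back to the default.
  return langs.length > 0 ? langs : [...DEFAULT_LANGS];
}
```

In a Node.js process this would be called as `resolveOcrLangs(process.env)`.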

Data files

| Path | Purpose |
| --- | --- |
| `$TMPDIR/opencode/ocr-data/` | Saved pasted images for re-injection |
| `$TMPDIR/opencode/auto-ocr-image.log` | Debug log (auto-trimmed at 50 KB) |

All files are stored in the OS temp directory, which most systems clear on reboot or on a schedule, so nothing is kept permanently.
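The README does not say how the debug log is trimmed. One plausible approach, shown here purely as an assumption, is to keep only the newest part of the log and drop the partial line at the cut. The `trimLog` name is hypothetical, and the sketch counts characters rather than bytes, which is only equivalent for ASCII logs.

```typescript
// Assumed limit taken from the "50 KB" figure above, measured in characters.
const MAX_LOG_CHARS = 50 * 1024;

// Hypothetical helper: returns the log unchanged if it fits, otherwise its tail.
function trimLog(content: string, maxChars: number = MAX_LOG_CHARS): string {
  if (content.length <= maxChars) return content;
  const tail = content.slice(content.length - maxChars);
  // Drop the first, probably truncated, line so the log starts on a line boundary.
  const firstNewline = tail.indexOf("\n");
  return firstNewline >= 0 ? tail.slice(firstNewline + 1) : tail;
}
```

Trimming from the front keeps the most recent entries, which are usually the ones needed when debugging.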

Limitations

PDFs are not OCR'd (only the file path is passed through for text-only models)
First OCR call is slower (~5-10s) because the OCR language data must be downloaded
OCR accuracy depends on image quality and tesseract.js capabilities

How model detection works

Provider API: queries OpenCode's client.config.providers() for per-model capability flags
Override list: known multimodal models are hardcoded as fallback when the API doesn't report capabilities
Graceful degradation: text-only models always get OCR'd text instead of raw images
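The three steps above could be combined roughly as follows. This is a sketch under stated assumptions: the `ModelCaps` type, the `detectCaps` function, and the patterns in `FALLBACK_MULTIMODAL` are hypothetical stand-ins, and the `reported` map represents whatever capability flags were read from `client.config.providers()`, whose exact response shape is not shown in this README.

```typescript
type ModelCaps = { image: boolean; pdf: boolean };

// Hypothetical fallback patterns; the plugin's real override list is not shown here.
const FALLBACK_MULTIMODAL: RegExp[] = [/claude/i, /gpt-5/i, /gemini/i];

function detectCaps(
  modelId: string,
  reported: Map<string, Partial<ModelCaps>>, // flags gathered from the provider API
): ModelCaps {
  // 1. Prefer what the provider API reports for this model.
  const fromApi = reported.get(modelId);
  if (fromApi && fromApi.image !== undefined) {
    return { image: fromApi.image, pdf: fromApi.pdf ?? false };
  }
  // 2. Otherwise consult the hardcoded list of known multimodal models.
  const known = FALLBACK_MULTIMODAL.some((pattern) => pattern.test(modelId));
  // 3. Unknown models are treated as text-only, so they receive OCR'd text.
  return { image: known, pdf: false };
}
```

Defaulting unknown models to text-only is the conservative choice: an unnecessary OCR pass loses some detail, while sending an image to a model that cannot read it loses everything.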

License

MIT
