How to add languages for OCR

After I got this up and running and indexed some TIF image files of a book in Kazakh, the resulting OCR was mostly gobbledygook. Clearly, to process Cyrillic or other scripts, it needs the matching language data. I looked through File-brain's directories and saw that it uses PIL, but didn't see anything about pytesseract.

Where would I need to add language files for file-brain to use? I already have Tesseract installed (Homebrew on macOS), with the language files located at /opt/homebrew/share/tessdata, which in turn points to /opt/homebrew/Cellar/tesseract-lang/4.1.0/share/tessdata, which has the files for the languages I need.

How do I get file-brain to use this? PIL, from what I read online, needs pytesseract, which doesn't seem to be in the installation.
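
For reference, here is a minimal sketch of how I can check my Tesseract setup on its own, outside of file-brain. It says nothing about how file-brain runs OCR internally. The helper names (`available_languages`, `tesseract_command`) are my own, the tessdata path is the one from above, and `kaz` is the Tesseract language code for Kazakh. It only uses the standard library and the documented `tesseract` CLI options `-l` and `--tessdata-dir`.

```python
from pathlib import Path

# Path from this post; an assumption for any other install.
TESSDATA_DIR = Path("/opt/homebrew/share/tessdata")


def available_languages(tessdata_dir):
    """Return sorted language codes that have a .traineddata file in tessdata_dir."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))


def tesseract_command(image_path, langs, tessdata_dir):
    """Build a tesseract CLI call that writes the OCR text to stdout."""
    return [
        "tesseract",
        str(image_path),
        "stdout",                      # output base; "stdout" prints the text
        "--tessdata-dir", str(tessdata_dir),
        "-l", "+".join(langs),         # e.g. "kaz" or "kaz+rus"
    ]


if __name__ == "__main__":
    # Shows which language codes Tesseract could load from this directory.
    print(available_languages(TESSDATA_DIR))
    # "page.tif" is a placeholder file name, not a real file from my book.
    print(" ".join(tesseract_command("page.tif", ["kaz"], TESSDATA_DIR)))
```

Running the printed command by hand (or via `subprocess.run`) on one of the TIF pages would at least confirm that the Kazakh data itself works, so the remaining question is only how to point file-brain at it.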

Thank you!

How to add languages for OCR #37

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

How to add languages for OCR #37

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions