After I got this up and running and indexed some TIF image files of a book in Kazakh, the resulting OCR was mostly gobbledygook. Clearly, to process Cyrillic or other scripts, it needs to be told which language to use. I looked through the file-brain directories and saw that it uses PIL, but didn't see anything about pytesseract.
Where would I need to add language files for file-brain to use? I already have Tesseract installed (via Homebrew on macOS), with the language files located at /opt/homebrew/share/tessdata, which in turn points to /opt/homebrew/Cellar/tesseract-lang/4.1.0/share/tessdata, which has the files for the languages I need.
How do I get file-brain to use this? From what I read online, PIL does not do OCR on its own and needs pytesseract, which doesn't seem to be part of the installation.
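For reference, this is roughly what I would expect to work outside of file-brain: a minimal sketch that calls the Tesseract command-line tool directly with the Kazakh language pack (`kaz`) and my Homebrew tessdata directory. The helper names `build_ocr_command` and `ocr_image` are made up for illustration, and I don't know whether file-brain invokes Tesseract this way at all, so please treat it as an assumption about the setup rather than a description of how file-brain works.

```python
import shutil
import subprocess

# Tessdata directory from my Homebrew install; adjust for other setups.
TESSDATA_DIR = "/opt/homebrew/share/tessdata"


def build_ocr_command(image_path, lang="kaz", tessdata_dir=TESSDATA_DIR):
    """Return the tesseract CLI arguments that OCR one image to stdout.

    `lang` is a Tesseract language code ("kaz" is Kazakh); several
    languages can be combined with "+", for example "kaz+rus".
    """
    return [
        "tesseract",
        image_path,
        "stdout",              # write recognized text to stdout, not a file
        "--tessdata-dir", tessdata_dir,
        "-l", lang,
    ]


def ocr_image(image_path, lang="kaz", tessdata_dir=TESSDATA_DIR):
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    result = subprocess.run(
        build_ocr_command(image_path, lang, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,            # raise if tesseract exits with an error
    )
    return result.stdout
```

Calling `ocr_image("page_001.tif")` (a placeholder file name) should return the text of that page if the `kaz` language file is found; running `tesseract --list-langs` in a terminal shows which languages Tesseract can actually see.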
Thank you!