Thank you for this good work. I have two questions about using this tool. First let me briefly explain my use case:
I am translating Buddhist texts from Thai to English for the Mahachulalangkornraachawitayaalay (MCU). The source material is images, so I first run OCR (with tesseract) and then edit the output into markdown format. After that I translate it to English using Google Translate. During OCR, some characters and annotations are missed or misinterpreted. I hope that deepcut can help me correct the words that OCR gets wrong. For example, the correct word is 'ประจําบท', but OCR misses the sara am and returns 'ประจาบท'.
- Can deepcut help in this case?
- If there are new or unseen words in the text, how can I add these words to deepcut for identification in the future?