Research-grade Turkish morphological segmentation system and dataset (roots, suffixes, POS tags) built from Kaikki, Zemberek, and Wikimedia data; optimized for linguistic analysis with finite-state transducers (FSTs).
nlp morphology wikimedia turkish dataset lexicon computational-linguistics finite-state-transducer pos-tagging wiktionary fst zemberek morphological-segmentation kaikki morphotactics
Updated Mar 16, 2026 · Python