The .lexc File Parser is a tool designed to analyze files in the .lexc format used in the Apertium project. It provides several useful functions, including:
- Listing parts of speech in the language.
- Parsing the file into a tree and visualizing it from any node, optionally limited to a specified depth.
- Counting the lemmas in a specific part of speech.
To install the .lexc File Parser, clone the GitHub repository with the following command:
    git clone https://github.com/elleish/sakha_language_tools
Once this Python package is installed, you can use it to analyze .lexc files for the Apertium project.
- Import the Python package:
    import lexc_parser as lp
- Load the .lexc file for Sakha, Kazakh, Kyrgyz, or another language:
    Sakha = lp.download('Sakha')
    Kazakh = lp.download('Kazakh')
    Kyrgyz = lp.download('Kyrgyz')
- Parse the loaded .lexc file (Sakha, Kazakh, or another language) into a tree:
    tree = lp.Tree(Sakha)
    tree = lp.Tree(Kazakh)
- Examine the parts of speech in the language:
    tree.tree['Root']
- Count the lemmas in a specific part of speech (Verbs, Nouns, etc.):
    len(tree.tree['Verbs'])
    len(tree.tree['Nouns'])
- List the tree nodes:
    tree.tree.keys()
- Visualize a tree from any node:
    tree.tree('N1')
- Visualize a tree from any node with a specified depth:
    tree.tree('Nouns', depth_restrict=4)
Additional examples can be found in the examples.py or examples.ipynb files.
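To make the idea of a dictionary of lexicons more concrete, here is a minimal, self-contained sketch of how .lexc text could be read into such a structure. It does not use or reproduce `lexc_parser`. The function name `parse_lexc`, the sample text, and the simplifications (every line ending in `;` inside a `LEXICON` block counts as one entry, and `!` always starts a comment) are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Tiny hand-written sample in lexc style (hypothetical, not from any Apertium repository).
SAMPLE = """\
LEXICON Root
Nouns ;
Verbs ;

LEXICON Nouns
at:at N1 ; ! sample noun entry
kinige:kinige N1 ; ! sample noun entry

LEXICON Verbs
bar:bar V1 ; ! sample verb entry
"""


def parse_lexc(text):
    """Return {lexicon name: [entry, ...]} for a simplified lexc text."""
    lexicons = defaultdict(list)
    current = None
    for raw in text.splitlines():
        # Simplification: treat everything after '!' as a comment (ignores escaped '%!').
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue
        header = re.match(r"LEXICON\s+(\S+)", line)
        if header:
            current = header.group(1)
            lexicons[current]  # register the lexicon even if it has no entries
        elif current is not None and line.endswith(";"):
            lexicons[current].append(line[:-1].strip())
    return dict(lexicons)


lexicons = parse_lexc(SAMPLE)
print(lexicons["Root"])        # sub-lexicons listed under Root
print(len(lexicons["Nouns"]))  # number of entries in the Nouns lexicon
print(list(lexicons.keys()))   # all lexicon names found in the text
```

In this sketch, indexing by `'Root'`, taking `len(...)` of a lexicon, and calling `.keys()` mirror the kinds of lookups shown in the usage steps, but the real package may store and expose its tree differently.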