The Xextan PEG parser is a work in progress surface syntax parser for the Xextan language, as well as related tools and interfaces.
The interfaces to the parser are:
parser.html: HTML interface with various parsing options and allowing selecting the desired parser.run_parser.js: Command line interface.
For generating a PEGJS grammar engine from its PEG grammar file, as well as for running the IRC bot interfaces, you need to have Node.js installed on your machine.
For generating a PEGJS engine, you need to have the Node.js module pegjs installed in the directory, which you can achieve with the command line npm install pegjs.
For building the parser (parser.js) from the parser.peg grammar file, you need to use the following command:
node build-parser parser.peg
Here's how to parse the Xextan text "pal nio e tik" with the grammar parser from command line:
node run_parser "pal nio e tik"
The standard grammar parser is used by default, but another grammar engine can be specified.
- The
-stdflag selects the standard grammar engine. -p PATHcan be used for selecting a parser by giving its file path as a command line argument.
Additionally, -m MODE can be used to specify output postprocessing options.
Here, MODE can be any letter string, each letter standing for a specific option.
Here is the list of possible letters and their associated meaning:
- M -> Keep morphology
- S -> Show spaces
- C -> Show word classes
- T -> Show terminators
- N -> Show main node labels
- R -> Raw output, do not prune the tree, except the morphology if 'M' not present.
- J -> JSON output
- G -> Replace words by glosses
- L -> Run the parser in a loop, consume every input line terminated by a newline and output parsed result
- L -> A second 'L' means that run_parser will expect every input line to begin with a mode string (possibly empty) followed by a space, after which the actual input follows.
Example:
node run_parser -m CTN "pal nio e tik"
This will show terminators, word classes and main node labels.