Lexical Analyzer is a C++ console application for Windows that writes a report on a script file, based on a given language definition file in XML format.
You can start the application either by opening the executable file (LexicalAnalyzer.exe) directly or by running it from a command prompt with the appropriate arguments.
There are three arguments you can pass to the application:
| Argument | Description |
|---|---|
| -l | The file named after this argument is the language XML file |
| -o | The file named after this argument is the desired output file |
| -i | The file named after this argument is the input script file |
Here are some examples of running the application through a command prompt with arguments:
.\LanguageAnalyzer.exe -o outputFilename.txt
.\LanguageAnalyzer.exe -i testcase.txt -o testResults.txt
.\LanguageAnalyzer.exe -l language.xml -i whatever.txt
The order in which you specify these arguments does not matter. If any argument is missing, the application prompts the user to enter the missing file.
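The argument handling described above could be sketched roughly as follows. This is an illustration only, not the application's actual code: the `Options` struct and the `parseArguments` and `missingFiles` functions are hypothetical names, and an empty string is assumed to mean that a file was not supplied.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the three file paths; an empty string is
// assumed to mean "not supplied on the command line".
struct Options {
    std::string language;  // value following -l
    std::string output;    // value following -o
    std::string input;     // value following -i
};

// Walks the arguments as flag/value pairs in whatever order they appear.
// Unknown flags and a trailing flag without a value are ignored in this sketch.
Options parseArguments(const std::vector<std::string>& args) {
    Options opts;
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "-l") opts.language = args[++i];
        else if (args[i] == "-o") opts.output = args[++i];
        else if (args[i] == "-i") opts.input = args[++i];
    }
    return opts;
}

// Lists the files the application would still have to prompt the user for.
std::vector<std::string> missingFiles(const Options& opts) {
    std::vector<std::string> missing;
    if (opts.language.empty()) missing.push_back("language");
    if (opts.input.empty()) missing.push_back("input");
    if (opts.output.empty()) missing.push_back("output");
    return missing;
}
```

Called with the arguments `-l language.xml -i whatever.txt`, `missingFiles` would report only the output file, which is the one the user would then be asked for.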
The language definition file is written in XML. There are six main elements:
- `<language>`: the root element; there may only be one.
- `<category>`: goes inside the `<language>` element. There can be several, but two categories must exist, and every category must have a type attribute.
  - The required category types (attributes) are: "separators" and "comments".
- `<whitelist>`: goes inside a `<category>` element and denotes that the elements inside it follow whitelist behaviour.
- `<blacklist>`: goes inside a `<category>` element and denotes that the elements inside it follow blacklist behaviour.
- `<string>`: goes inside either a `<whitelist>` or a `<blacklist>` element; the specified text is meant to be found exactly as is.
  - However, the given string still has to follow RegEx syntax rules.
- `<expression>`: goes inside either a `<whitelist>` or a `<blacklist>` element; the specified expression is used with the `regex_match()` method.
For RegEx syntax rules, please refer to RegEx syntax.
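To illustrate what following RegEx syntax rules means for `<string>` and `<expression>` values, here is a minimal sketch built on the standard `std::regex_match`. The helper names `matchesElement` and `isValidPattern` are hypothetical, and the sketch assumes the default ECMAScript grammar of `std::regex`; the application may configure its patterns differently.

```cpp
#include <regex>
#include <string>

// Returns true only when the whole token matches the pattern. This is how
// std::regex_match behaves; a partial match inside the token is not enough.
bool matchesElement(const std::string& token, const std::string& pattern) {
    return std::regex_match(token, std::regex(pattern));
}

// Reports whether a pattern can be compiled at all. std::regex throws
// std::regex_error for malformed patterns such as an unbalanced "(".
bool isValidPattern(const std::string& pattern) {
    try {
        std::regex compiled(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Under these assumptions, a literal such as `$`, `(` or `+` has to be written with a backslash (`\$`, `\(`, `\+`) in the XML file, because the unescaped character has a special meaning in a regular expression.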
A blacklist matches the given string to its category only if the string does not match any of the elements inside the blacklist.
A whitelist matches the given string to its category if the string matches any of the elements inside the whitelist.
It is possible to combine blacklists and whitelists in a single category. In that case, the whitelist elements are checked first, and then the blacklist elements.
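The whitelist and blacklist rules can be sketched as below. This is not the application's implementation: the `Category` struct and the `matchesAny` and `belongsTo` functions are hypothetical, and the combination rule is an assumption (the whitelist, when present, must accept the string, and afterwards the blacklist must not reject it).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical in-memory form of one <category>; names are illustrative.
struct Category {
    std::string type;                    // value of the type attribute
    std::vector<std::string> whitelist;  // patterns from <whitelist>
    std::vector<std::string> blacklist;  // patterns from <blacklist>
};

// True if the whole token matches at least one of the patterns.
bool matchesAny(const std::string& token,
                const std::vector<std::string>& patterns) {
    for (const std::string& pattern : patterns) {
        if (std::regex_match(token, std::regex(pattern))) return true;
    }
    return false;
}

// Assumed rule: check the whitelist first (if the category has one), then
// make sure no blacklist element matches. A category with neither list
// accepts everything in this sketch.
bool belongsTo(const std::string& token, const Category& category) {
    if (!category.whitelist.empty() && !matchesAny(token, category.whitelist)) {
        return false;
    }
    return !matchesAny(token, category.blacklist);
}
```

With a blacklist containing only `\(.*`, for example, `foo` would belong to the category while `(foo` would not.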
<?xml version="1.0" encoding="utf-8" standalone="no" ?>
<language>
  <category type="separators">
    <whitelist>
      <string>\s</string> <!--whitespace-->
    </whitelist>
  </category>
  <category type="comments">
    <whitelist>
      <string>\$</string>
      <expression>\$.*</expression>
    </whitelist>
  </category>
  <category type="identifiers">
    <blacklist>
      <expression>\(.*</expression>
    </blacklist>
  </category>
  <category type="operators">
    <whitelist>
      <string>\+</string>
    </whitelist>
  </category>
</language>

For an XML format cheatsheet, please refer to XML Syntax Rules.