- Document algorithm and syntax
- Try to keep large ASTs from causing stack overflow while disposing.
  - Generate a helper function to flatten an AST tree to a node list, disconnecting all fields.
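The flattening idea above could look roughly like the following. This is a minimal sketch, assuming C++17 and a hypothetical `Node` type that owns its children through `std::unique_ptr`; the generated AST classes and the real helper will differ.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node type, for illustration only.
struct Node
{
    std::vector<std::unique_ptr<Node>> children;
};

// Moves every node reachable from root into one flat list and disconnects
// all child fields, so destroying the list never recurses into a deep tree.
std::vector<std::unique_ptr<Node>> FlattenAst(std::unique_ptr<Node> root)
{
    std::vector<std::unique_ptr<Node>> nodes;
    if (root) nodes.push_back(std::move(root));

    for (std::size_t i = 0; i < nodes.size(); i++)
    {
        // Take the children out before pushing: push_back may reallocate
        // the list, but the Node objects themselves do not move.
        auto children = std::move(nodes[i]->children);
        nodes[i]->children.clear();
        for (auto& child : children)
        {
            if (child) nodes.push_back(std::move(child));
        }
    }
    return nodes;
}
```

Destroying the returned vector releases each node with an empty `children` field, so the destructor depth stays constant regardless of how deep the original tree was.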
- Extensible tokens, for example, recognize `R"[^\s(]\(` and invoke a callback function to determine the end of the string.
  - `RegexTokenizer`
  - New syntax for tokenizers for such extensible tokens.
  - We can try `/***/` with extensible tokens.
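One possible shape for the callback mentioned above is sketched here for C++ raw strings. Everything in it is an assumption: `TokenEndFinder`, `FindRawStringEnd` and `MatchRawString` are hypothetical names, the prefix is matched by hand instead of by a regex, and nothing here reflects the actual `RegexTokenizer` interface. It assumes C++17.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical callback: given the input and the matched prefix range,
// return the index one past the end of the token, or nullopt if unterminated.
using TokenEndFinder = std::function<std::optional<std::size_t>(
    std::string_view input, std::size_t prefixBegin, std::size_t prefixEnd)>;

// Callback for raw strings: the prefix is R"delim( and the token ends at )delim"
std::optional<std::size_t> FindRawStringEnd(
    std::string_view input, std::size_t prefixBegin, std::size_t prefixEnd)
{
    // The delimiter sits between R" (2 characters) and the opening parenthesis.
    std::string closing = ")";
    closing.append(input.data() + prefixBegin + 2, prefixEnd - prefixBegin - 3);
    closing += '"';

    auto pos = input.find(closing, prefixEnd);
    if (pos == std::string_view::npos) return std::nullopt;
    return pos + closing.size();
}

// Matches the prefix at `begin` by hand, then asks the callback where the token ends.
std::optional<std::size_t> MatchRawString(
    std::string_view input, std::size_t begin, const TokenEndFinder& findEnd)
{
    if (begin > input.size() || input.substr(begin, 2) != "R\"") return std::nullopt;
    std::size_t i = begin + 2;
    while (i < input.size() && input[i] != '(' && input[i] != ' ' &&
           input[i] != '\t' && input[i] != '\r' && input[i] != '\n')
    {
        i++;
    }
    if (i >= input.size() || input[i] != '(') return std::nullopt;
    return findEnd(input, begin, i + 1);
}
```

Keeping the prefix match and the end-finding callback separate means the tokenizer only needs to know where the token starts; the callback decides how much input the token consumes.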
- AST file groups.
  - An AST file only sees:
    - Types defined in this file.
    - `@public` types defined in the same file group.
    - `@extern` types defined in different depended file groups as field type only.
  - C++ codegen is created per group.
    - Only AST classes `#include` depended file groups, visitors do not.
    - When a visitor needs to call types in different file groups, leave it abstract.
- Render input text in Trace-3
- Try to visualize ambiguity using input text
- Built-in parsers: C++ Non-traced test cases save json files with extra field recording the input code at the beginning
- Refactor TraceTree logging and put together with WriteMonospacedEnglishTable
- Reorganize log utilities for better dependency
- Code Coverage
  - Collect uncovered code again by break points in executor (trace manager).
  - Reconsider in new implementation:
    - Test `SyntaxSymbolManager::PrefixMergeCrossReference_Solve` firmly.
    - Test `TraceManager::BuildStepListForAmbiguity` with nested ambiguity firmly.
    - Create ambiguity test case caused by only one clause with alternative syntax.
- Every time, `BuiltInTest_Compiler` seems to update `BuiltInTest_Cpp` for no reason.
- `::a::b::c::*`
  - Ambiguity
  - It should be invalid, because `::a::b::c` is always parsed as one QualifiedName, instead of being `::a(::b::c::*)` and `::a::b(::c::*)` and `(::a::b) ::c::*`
  - Refer to `Priority in left recursive transition(?)`
- Compiler crashes: `_DeclOrExpr ::= !_BExpr ::= {_DeclaratorKeyword:keywords} _TypeBeforeDeclarator:type _DeclaratorRequiredName:declarator as DeclaratorType ;`
  - `workingSwitchValues` is nullptr in `ExpandClauseVisitor::FixRuleName`
- Write a PowerShell script to generate all `.i` files from `.cpp` and `.h` in every `Source` folder, collect them in a central place to create test cases.
  - `Release\IncludeOnly` will be useful to resolve cross-repo dependencies.
- When `XToResolve` is in another `XToResolve`, flatten them.
- TODO in `CalculateRuleAndClauseTypes`.
- TODO in `ValidateDirectPrefixMergeRuleVisitor`.
- Optimize `CalculateFirstSet_IndirectStartRules` using partial ordering.
- TODO in `SyntaxSymbolManager::EliminateSingleRulePrefix`.
  - Deny `A ::= !B ::= B as Something ::= ...;`.
- TODO in `CalculateObjectLastInstruction`
- TODO in `CheckAmbiguityResolution`
- Print correct codeRange for:
  - `ParserErrorType::RuleIsIndirectlyLeftRecursive`
  - `ParserErrorType::LeftRecursiveClauseInsidePushCondition`
  - `ParserErrorType::LeftRecursiveClauseInsideTestCondition`
- `X ::= ([a] | [b]) c` fails because multiple optional syntaxes create multiple epsilon transitions between the same pair of states.
  - Possible solution: if multiple combinations of consecutive epsilon transitions make an epsilon transition between two states, treat them as one single epsilon transition.
    - Merge conditions in these epsilon transitions properly.
    - Or add one more phase before generating the automaton, for optimization, to merge states and edges.
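The "treat them as one single epsilon transition" idea above can be sketched as a grouping pass over epsilon edges. This is only an illustration under stated assumptions: `EpsilonEdge` is a hypothetical type, conditions are plain strings, and joining them with `||` is one guess at what merging conditions properly could mean. It assumes C++17.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical edge type: an empty condition means "unconditional".
struct EpsilonEdge
{
    int from;
    int to;
    std::string condition;
};

// Collapses all epsilon edges sharing the same (from, to) pair into one edge.
std::vector<EpsilonEdge> MergeParallelEpsilonEdges(const std::vector<EpsilonEdge>& edges)
{
    std::map<std::pair<int, int>, std::set<std::string>> grouped;
    for (const auto& e : edges)
    {
        grouped[{e.from, e.to}].insert(e.condition);
    }

    std::vector<EpsilonEdge> merged;
    for (const auto& [key, conditions] : grouped)
    {
        std::string combined;
        // If any parallel edge is unconditional, the merged edge is too.
        if (conditions.count("") == 0)
        {
            for (const auto& c : conditions)
            {
                if (!combined.empty()) combined += " || ";
                combined += c;
            }
        }
        merged.push_back({key.first, key.second, combined});
    }
    return merged;
}
```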
- Optimize `CrossReferencedNFA` to merge prefix (two states can be merged if their `InEdges` are identical, `FromState` in `InEdges` are replaced by merged states).
- `JsonEscapeString` `JsonUnescapeString` handle surrogate pairs correctly.
- Review all comments.
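For the surrogate-pair item above, the arithmetic involved is small enough to sketch. The functions below are hypothetical helpers, not the existing `JsonEscapeString` or `JsonUnescapeString`; they only show how a code point above U+FFFF maps to two `\uXXXX` escapes and back.

```cpp
#include <cstdio>
#include <string>

// Escapes one code point as JSON \uXXXX, using a surrogate pair above U+FFFF.
std::string EscapeCodePointForJson(char32_t cp)
{
    char buffer[16];
    if (cp <= 0xFFFF)
    {
        std::snprintf(buffer, sizeof(buffer), "\\u%04X", static_cast<unsigned>(cp));
    }
    else
    {
        // Subtract 0x10000, then split the remaining 20 bits into two 10-bit halves.
        char32_t v = cp - 0x10000;
        unsigned high = 0xD800 + static_cast<unsigned>(v >> 10);
        unsigned low = 0xDC00 + static_cast<unsigned>(v & 0x3FF);
        std::snprintf(buffer, sizeof(buffer), "\\u%04X\\u%04X", high, low);
    }
    return buffer;
}

// Inverse step for unescaping: rebuilds the code point from a surrogate pair.
char32_t CombineSurrogates(unsigned high, unsigned low)
{
    return 0x10000 + ((static_cast<char32_t>(high) - 0xD800) << 10) + (low - 0xDC00);
}
```

For example, U+1F600 escapes to `\uD83D\uDE00`, and combining `0xD83D` with `0xDE00` gives U+1F600 again.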
- Indirect and multiple left recursion.
- Twist slot number in alternative branches in a clause and see if it is possible to merge prefix
- Add union type and remove `TypeOrExprOrOthers` in C++.
  - Consider what `@ambiguous union` means.
- Try to see if it is possible to
  - Remove `PushReturnStack` last argument.
  - Remove `ReturnDesc::ruleType`.
  - Move `ReturnRuleType` from automaton to symbol.
  - Share traces in different branches.
    - From a given state and a few tokens, the trace graph could be copied directly if:
      - none of state.returnStacks is reduced
      - competitions created before the first token are not attended
      - competitions created after the first token are closed
    - Do not copy, share it.
- Serializing
- Escaping and Unescaping pairs (instead of only unescaping)
- Calculate ambiguous ToString cases
- Generate ToString algorithm
- Generate LL parser if possible (print error if failed but forced to do)
- Generate SLR parser if possible (print error if failed but forced to do)
- Document the algorithm in a markdown file
- Switching lexical analyzer during parsing.
  - Refactor some properties in `LexerSymbolManager` into `LexerFile` with a name.
- Printing AST classes that are created from a memory pool.
- All `token` property `X` becomes `X_`, paired with a string property `X` to access the text value in `X_`.
- New priority syntax
  - Priority in alternative syntax, but all branches must not consume empty input series (add compile error)
  - Priority in left recursive transition (which clause starts this competition?)
  - Priority in loop
- Custom error in syntax.
- Error recovering.
- Escaping and unescaping functions
- Offer two options: experiment
- Map positions between escaped and unescaped text
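Mapping positions between escaped and unescaped text could be done by recording, for every unescaped character, where it started in the escaped input. The sketch below assumes a simple backslash escape scheme (`\n`, `\t`, and `\x` meaning `x` otherwise) and C++17; `Unescaped` and `UnescapeWithMap` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

struct Unescaped
{
    std::string text;                     // the unescaped characters
    std::vector<std::size_t> sourceIndex; // sourceIndex[i]: where text[i] starts in the escaped input
};

Unescaped UnescapeWithMap(std::string_view escaped)
{
    Unescaped result;
    for (std::size_t i = 0; i < escaped.size(); i++)
    {
        // Record the start of this character before consuming any escape.
        result.sourceIndex.push_back(i);
        if (escaped[i] == '\\' && i + 1 < escaped.size())
        {
            char c = escaped[++i];
            switch (c)
            {
            case 'n': result.text += '\n'; break;
            case 't': result.text += '\t'; break;
            default: result.text += c; break;
            }
        }
        else
        {
            result.text += escaped[i];
        }
    }
    return result;
}
```

With such a table, an error position in the unescaped text can be translated back to a position in the source text by a single lookup.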
- Error if any condition is constantly evaluated to true or false