Skip to content

Latest commit

 

History

History
109 lines (97 loc) · 5.59 KB

File metadata and controls

109 lines (97 loc) · 5.59 KB

TODO

Next task

  • Document algorithm and syntax
  • Try to make large AST not causing stack overflow while disposing.
    • Generate a helper function to flatten an AST tree to a node list, disconnect all fields.
  • Extensible tokens, for example, recognize R"[^\s(]\( and invoke a callback function to determine the end of the string.
    • RegexTokenizer
    • New syntax for tokenizers for such extensible tokens.
    • We can try /***/ with extensible tokens.
  • AST file groups.
    • An AST file only sees:
      • Types defined in this file.
      • @public types defined in the same file group.
      • @extern types defined in different depended file groups as field type only.
    • C++ codegen are created per groups.
      • Only AST classes #include depended files groups, visitors do not.
      • When a visitor need to call types in different file groups, leave it abstract.
  • Render input text in Trace-3
    • Try to visualize ambiguity using input text
    • Built-in parsers: C++ Non-traced test cases save json files with extra field recording the input code at the beginning
  • Refactor TraceTree logging and put together with WriteMonospacedEnglishTable
    • Reorganize log utilities for better dependency

Test Cases

  • Code Coverage
    • Collect uncovered code again by break points in executator (trace manager).
  • Reconsider in new implementation:
    • Test SyntaxSymbolManager::PrefixMergeCrossReference_Solve firmly.
    • Test TraceManager::BuildStepListForAmbiguity with nested ambiguity firmly.
    • Create ambiguity test case caused by only one clause with alternative syntax.

Issues (BuiltIn-Cpp)

  • Everytime BuiltInTest_Compiler seem to update BuiltInTest_Cpp with no reason
  • ::a::b::c::*
    • Ambiguity
    • It should be invalid, because ::a::b::c are always parsed as one QualifiedName, instead of being ::a(::b::c::*) and ::a::b(::c::*) and (::a::b) ::c::*
    • Refer to Priority in left recursive transition (?)
  • Compiler crashes:
    • _DeclOrExpr ::= !_BExpr ::= {_DeclaratorKeyword:keywords} _TypeBeforeDeclarator:type _DeclaratorRequiredName:declarator as DeclaratorType ;
    • workingSwitchValues is nullptr in ExpandClauseVisitor::FixRuleName
  • Write a powershell script to generate all .i files from .cpp and .h in every Source folder, collect them in a central place to create test cases.
    • Release\IncludeOnly will be useful to resolve cross-repo dependencies.

Issues (Glr)

  • When XToResolve is in another XToResolve, flatten them.
  • TODO in CalculateRuleAndClauseTypes.
  • TODO in ValidateDirectPrefixMergeRuleVisitor.
  • Optimize CalculateFirstSet_IndirectStartRules using partial ordering.
  • TODO in SyntaxSymbolManager::EliminateSingleRulePrefix.
    • Deny A ::= !B ::= B as Something ::= ...;.
  • TODO in CalculateObjectLastInstruction
  • TODO in CheckAmbiguityResolution
  • Print correct codeRange for:
    • ParserErrorType::RuleIsIndirectlyLeftRecursive
    • ParserErrorType::LeftRecursiveClauseInsidePushCondition
    • ParserErrorType::LeftRecursiveClauseInsideTestCondition
  • X ::= ([a] | [b]) c fails because multiple optional syntax create multiple epsilon transition between the same pair of states.
    • Possible solution: if multiple combinations of consecutive epsilon transitions makes an epsilon transition between two states, treat them as one single epsilon transition.
      • Merge conditions in these epsilon transitions properly.
    • Or one more phase before generationg automaton for optimization to merge states and edges.
  • Optimize CrossReferencedNFA to merge prefix (two states can be merged if their InEdges are identical, FromState in InEdges are replaced by merged states).
  • JsonEscapeString JsonUnescapeString handle surrogate pairs correctly.
  • Review all comments.

Experiments

  • Indirect and multiple left recursion.
  • Twist slot number in alternative branches in a clause and see if it is possible to merge prefix
  • Add union type and remove TypeOrExprOrOthers in C++.
    • Consider what does @ambiguous union mean.
  • Try to see if it is possible to
    • Remove PushReturnStack last argument.
    • Remove ReturnDesc::ruleType.
    • Move ReturnRuleType from automaton to symbol.
    • Share traces in different branches.
      • From a given state and a few tokens, the trace graph could be copied directly if:
        • none of state.returnStacks is reduced
        • competitions created before the first token are not attended
        • completitions created after the first token are closed
      • Do not copy, share it.
  • Serializing
    • Escaping and Unescaping pairs (instead of only unescaping)
    • Calculate ambiguous ToString cases
    • Generate ToString algorithm
  • Generate LL parser if possible (print error if failed but forced to do)
  • Generate SLR parser if possible (print error if failed but forced to do)
  • Document the algorithm in a markdown file

Work Items (enhancement)

  • Switching lexical analyzer during parsing.
    • Refactor some properties in LexerSymbolManager into LexerFile with a name.
  • Printing AST classes that created from a memory pool.
  • All token property X becomes X_, paired with a string property X to access the text value in X_.
  • New priority syntax
    • Priority in alternative syntax, but all branches must not consume empty input series (add compile error)
    • Priority in left recursive transition (which clause starts this competition?)
    • Priority in loop
  • Custom error in syntax.
  • Error recovering.
  • Escaping and unescaping functions
    • Offer two options: experiment
    • Map positions between escaped and unescaped text
  • Error if any condition is constantly evaluated to true or false