Problem
When a rule defined with <- references rules defined with <, the < rules nullify the effect of the <-. For example, imagine you want a grammar like the following:
File < Statement_list eof
Statement_list <- Statement Inline_spacing (Separator Inline_spacing Statement)*
Separator <- ',' / eol+
Statement < '1' '+' '2'
Inline_spacing <- :(' ')*
Valid input that yields 3 statements might look like
The intention of this grammar is that only one statement can occur per line. Basically, statements can be separated by any number of newlines or by a single comma, but not both, while the contents of a statement can still be spread across multiple lines.
These rules are impossible with this more intuitive setup, because < captures leading and trailing whitespace. There is a way to pull this off, but it requires defining a custom spacing rule that excludes newlines and then manually specifying every place where a newline can occur in the middle of a statement. For instance, here's an excerpt from my own code:
br <- :(eol?)
Expression < Ex5
Ex5 < Type? br Ex4
Ex4 < Ex3 (br '+' br Ex3)*
Ex3 < Ex2 (br '*' br Ex2)*
Ex2 < ('-' / '/' br)? Ex1
Ex1 < '(' br Expression br ')' / Ex0
Ex0 < lit_number / name_value
Proposal
There is an elegant solution that should save some spacing checks and make the grammar presented at the top work as intended. Essentially, Spacing is inserted only between the elements of a rule, never at the start or end of the rule. There is one complication, which is the entry-point rule: under this scheme the input must have a non-space character at its first position. The solution is either to make an exception so that < on the entry point also captures leading whitespace, or to require the user to handle it explicitly with "" or by using Space directly.
This shouldn't break most code. Consider:
A < B C
B < '1' '2'
C < '3' '4'
Expands to:
A <- B Spacing C
B <- '1' Spacing '2'
C <- '3' Spacing '4'
The only case where spacing is no longer consumed is when A uses <- instead. Then 2 and 3 must be adjacent, or separated only by whatever the user explicitly allows.
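To make the difference concrete, here is a minimal Python sketch that models both expansions of < on the Statement_list grammar from the top of this post. It is not the library's implementation. The helper names (lit, rx, seq, choice, star, spaced_current, spaced_proposed, build) are hypothetical, Spacing is assumed to match blanks, tabs and newlines, eol is simplified to '\n', and the current behaviour is assumed to be "Spacing before, between and after the elements" as described above.

```python
import re

# Tiny PEG-style matchers: each takes (text, pos) and returns the new
# position on success or None on failure. A finished match is never
# re-entered, which mirrors PEG's lack of backtracking into repetitions.
def lit(s):
    return lambda text, i: i + len(s) if text.startswith(s, i) else None

def rx(pattern):
    compiled = re.compile(pattern)
    def match(text, i):
        m = compiled.match(text, i)
        return m.end() if m else None
    return match

def seq(*parts):
    def match(text, i):
        for part in parts:
            i = part(text, i)
            if i is None:
                return None
        return i
    return match

def choice(*alts):
    def match(text, i):
        for alt in alts:
            j = alt(text, i)
            if j is not None:
                return j
        return None
    return match

def star(inner):
    def match(text, i):
        while True:
            j = inner(text, i)
            if j is None or j == i:
                return i
            i = j
    return match

SPACING = rx(r"[ \t\r\n]*")  # assumption: default Spacing includes newlines
INLINE = rx(r" *")           # Inline_spacing from the grammar above
EOF = lambda text, i: i if i == len(text) else None

def spaced_current(*parts):
    # Assumed current `<`: Spacing before, between and after the elements.
    out = [SPACING]
    for part in parts:
        out += [part, SPACING]
    return seq(*out)

def spaced_proposed(*parts):
    # Proposed `<`: Spacing only between the elements.
    out = []
    for k, part in enumerate(parts):
        if k:
            out.append(SPACING)
        out.append(part)
    return seq(*out)

def build(spaced):
    statement = spaced(lit("1"), lit("+"), lit("2"))
    separator = choice(lit(","), rx(r"\n+"))  # eol simplified to '\n'
    statement_list = seq(statement, INLINE,
                         star(seq(separator, INLINE, statement)))
    file_rule = spaced(statement_list, EOF)
    return lambda text: file_rule(text, 0) is not None

current = build(spaced_current)
proposed = build(spaced_proposed)
```

Under these assumptions, the input "1 + 2, 1 +\n2\n\n1+2" (three statements, one of them spread over two lines) is rejected by current, because the trailing Spacing of the second Statement swallows the newlines that Separator needed, and accepted by proposed. Conversely, "1+2,\n1+2" (a comma and a newline together) is accepted by current and rejected by proposed. The entry-point caveat also shows up: proposed rejects " 1+2" because nothing consumes the leading blank.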