feat: Implement std.parseXmlJsonml #704
jsonml.go
	b.currDepth++
case xml.CharData:
	t := token.(xml.CharData)
	s := strings.TrimSpace(string(t))
There was a problem hiding this comment.
Is there a reason for trimming it? I think whitespace is actually meaningful in XML because of things like <b>foo</b>bar vs <b>foo</b> bar. If we're only supporting the (admittedly common) case of the XML file being whitespace agnostic (i.e. whoever consumes the XML doesn't care about the whitespace) then we can document it as not supporting that. However I think just not trimming the strings here will fix it, will it not?
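To make the concern concrete, here is a minimal sketch using only the standard encoding/xml decoder. collectText is a hypothetical helper written for this illustration, not code from the PR; it gathers character data with and without the strings.TrimSpace call under review.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// collectText is a hypothetical helper (not from the PR). It returns the
// character data chunks of doc in document order. When trim is true, each
// chunk goes through strings.TrimSpace first, like the line under review.
func collectText(doc string, trim bool) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		if cd, ok := tok.(xml.CharData); ok {
			// Converting to string copies the bytes, which matters because
			// the decoder may reuse the underlying buffer on the next call.
			s := string(cd)
			if trim {
				s = strings.TrimSpace(s)
			}
			out = append(out, s)
		}
	}
}

func main() {
	for _, trim := range []bool{true, false} {
		chunks, err := collectText("<p><b>foo</b> bar</p>", trim)
		if err != nil {
			panic(err)
		}
		fmt.Printf("trim=%v: %q\n", trim, chunks)
	}
}
```

With trimming, the second chunk comes back as "bar", so the space that separates it from foo is gone; without trimming it stays " bar".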
There was a problem hiding this comment.
The reason for trimming whitespace is to simplify the JsonML. Currently, with formatted XML, the JsonML becomes a bit complex to work with because of the empty content. We want to ignore empty lines, but I agree the current implementation would also trim spaces around the content. I can take care of that.
I am thinking of letting users control this behaviour. We can add an argument trimWhitespaces = true to the function, which would control whether formatting whitespace is removed.
WDYT?
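One way to ignore empty lines without touching spaces around real content could look like the sketch below. textNode is a hypothetical name chosen for this example and is not part of the PR; it only uses strings.TrimSpace to test for emptiness and returns the original text unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// textNode is a hypothetical helper (not from the PR). It reports whether a
// character data chunk should be kept. Whitespace-only chunks, which is what
// indentation between elements produces, are dropped; everything else is
// returned untouched, including leading and trailing spaces.
//
// Note: strings.TrimSpace follows Unicode's definition of whitespace, which
// is wider than XML's (space, tab, CR, LF). That difference is ignored here.
func textNode(raw string) (string, bool) {
	if strings.TrimSpace(raw) == "" {
		return "", false
	}
	return raw, true
}

func main() {
	for _, raw := range []string{"\n    ", " bar", "foo"} {
		s, keep := textNode(raw)
		fmt.Printf("%q -> %q keep=%v\n", raw, s, keep)
	}
}
```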
There was a problem hiding this comment.
I found this, which looks interesting: http://xml.coverpages.org/rfc-wshp19990416.html I wonder if the current behaviour of the manifester is wrong?
There was a problem hiding this comment.
What I understand from these links is that it is safe to remove boundary whitespace as the default behaviour. We would provide an argument preserveWhitespaces, defaulting to false, that anyone can set to true to preserve it.
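A rough sketch of how such a flag might behave, assuming it is simply threaded through to the CharData handling. textChunks is a hypothetical function for illustration only, and the trimming rule here is an assumption rather than what the PR finally implements.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// textChunks is a hypothetical function (not from the PR). It returns the
// text chunks of doc. With preserveWhitespaces=false, each chunk is trimmed
// and whitespace-only chunks are dropped; with true, chunks are kept as is.
func textChunks(doc string, preserveWhitespaces bool) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		cd, ok := tok.(xml.CharData)
		if !ok {
			continue
		}
		s := string(cd)
		if !preserveWhitespaces {
			s = strings.TrimSpace(s)
			if s == "" {
				continue // formatting whitespace between elements
			}
		}
		out = append(out, s)
	}
}

func main() {
	doc := "<a>\n  <b>x</b>\n</a>"
	for _, preserve := range []bool{false, true} {
		chunks, err := textChunks(doc, preserve)
		if err != nil {
			panic(err)
		}
		fmt.Printf("preserveWhitespaces=%v: %q\n", preserve, chunks)
	}
}
```

With the default, the formatted document yields only "x"; with preserveWhitespaces=true, the indentation chunks before and after the b element are kept as well.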
There was a problem hiding this comment.
There seems to be some diversity in what other parsers are doing so I think anything we do is fine as long as it's documented.
Hi @sparkprime
@sparkprime nudging this PR
Implement std.parseXmlJsonml as standard function
cpp-jsonnet PR: google/jsonnet#1092