A small, safe-by-default HTML AST and renderer for Standard ML. Text and
attribute values are escaped automatically; the only way to emit unescaped
markup is the explicit Html.raw node, so XSS holes are opt-in and greppable.
Output is assembled through sml-buffer
to avoid O(n^2) string concatenation.
Pure and deterministic; builds and tests identically under MLton and Poly/ML.
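To illustrate why buffered assembly matters, here is a minimal sketch using only the Basis Library. It does not use the sml-buffer API (not shown in this README); the names joinNaive, newBuf, add, and contents are hypothetical. The idea is to collect pieces and concatenate once, instead of growing a string with repeated ^.

```sml
(* Naive approach: each ^ copies the whole accumulated string,
   so joining n pieces costs O(n^2) character copies overall. *)
fun joinNaive (pieces : string list) : string =
  List.foldl (fn (p, acc) => acc ^ p) "" pieces

(* Buffered approach (hypothetical stand-in, not sml-buffer itself):
   push pieces onto a reversed list, then concatenate exactly once. *)
type buf = string list ref

fun newBuf () : buf = ref []
fun add (b : buf) (s : string) : unit = b := s :: !b
fun contents (b : buf) : string = String.concat (List.rev (!b))

val out =
  let
    val b = newBuf ()
    val () = List.app (add b) ["<ul>", "<li>one</li>", "<li>two</li>", "</ul>"]
  in
    contents b
  end
```

Both approaches produce the same string; only the amount of copying differs.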
Escape: context-aware escaping. text/attr escape & < > " ' (conservative OWASP rules, safe for both text and quoted attributes), plus an isSafeAttrName predicate.
Html: a node AST (Text, Raw, Element) with el/void/text/raw constructors, automatic attribute-value escaping, HTML5 void-element handling, and render/renderList/document.
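The five-character escaping described above can be sketched in a few lines. This is an illustration, not the library's source: escapeChar and escape are hypothetical names, and the entity spellings chosen for the two quote characters are an assumption.

```sml
(* Hypothetical stand-in for the escaping step; the entity spellings
   for the quote characters are assumptions, not taken from the library. *)
fun escapeChar #"&" = "&amp;"
  | escapeChar #"<" = "&lt;"
  | escapeChar #">" = "&gt;"
  | escapeChar #"\"" = "&quot;"
  | escapeChar #"'" = "&#39;"
  | escapeChar c = String.str c

(* String.translate maps every input character exactly once, so the
   ampersands introduced by a replacement are never escaped again. *)
val escape : string -> string = String.translate escapeChar

val demo = escape "two & three"
val inj = escape "\"><script>"
```

Because the same function covers both quote characters, its result is safe inside element text and inside a quoted attribute value.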
Unsafe attribute names (containing spaces, event-handler injection, etc.) are dropped rather than emitted.
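One way to picture the drop-instead-of-emit behaviour is a filter applied before attributes are written out. The sketch below is an assumption about the general shape only: isSafeName is a hypothetical character-whitelist check and may be stricter or looser than the library's isSafeAttrName, and renderAttrs takes the value-escaping function as a parameter so that the block stays self-contained.

```sml
(* Hypothetical predicate: the name must be non-empty and consist only of
   ASCII letters, digits, '-', '_' or ':'. The real isSafeAttrName may differ. *)
fun isSafeName (name : string) : bool =
  name <> "" andalso
  List.all
    (fn c => Char.isAlphaNum c orelse c = #"-" orelse c = #"_" orelse c = #":")
    (String.explode name)

(* Render attributes, silently dropping any pair whose name fails the check.
   escape is whatever value-escaping function the caller supplies. *)
fun renderAttrs (escape : string -> string) (attrs : (string * string) list) : string =
  String.concat
    (List.map
       (fn (k, v) => " " ^ k ^ "=\"" ^ escape v ^ "\"")
       (List.filter (fn (k, _) => isSafeName k) attrs))

(* The middle name contains a space and '=', so it is dropped. *)
val kept =
  renderAttrs (fn s => s)
    [("class", "list"), ("x onclick=alert(1)", "y"), ("id", "main")]
```

Dropping the pair outright, rather than trying to repair the name, means a malformed name can never smuggle in a second attribute.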
(* Escape *)
val text : string -> string
val attr : string -> string
val isSafeAttrName : string -> bool
(* Html *)
datatype node = Text of string | Raw of string
              | Element of { tag : string, attrs : (string * string) list, children : node list }
val text : string -> node
val raw : string -> node
val el : string -> (string * string) list -> node list -> node
val void : string -> (string * string) list -> node
val render : node -> string
val renderList : node list -> string
val document : node -> string

open Html
val page =
  el "ul" [("class", "list")]
    [ el "li" [] [text "one"]
    , el "li" [] [text "two & three"] ]  (* & is escaped *)
val () = print (render page)
(* <ul class="list"><li>one</li><li>two &amp; three</li></ul> *)
(* attribute injection is neutralized *)
val SAFE = render (void "input" [("value", "\"><script>")])
(* <input value="&quot;&gt;&lt;script&gt;"> *)

make test # MLton
make test-poly # Poly/ML
make all-tests # both
make example # build + run the demo

18 deterministic checks (including XSS escaping vectors), green under both compilers.
examples/demo.sml builds a small fixed element tree and
renders a full document, showing automatic escaping of text and attribute
values. Rendering is a pure function of the tree, so the output is identical on
every run and on both compilers. Run it with:
$ make example
<!DOCTYPE html>
<html lang="en"><head><title>Demo &amp; Friends</title></head><body><h1 class="main">Hello &amp; welcome</h1><p>1 &lt; 2 and &quot;quoted&quot; text is escaped</p><ul><li>one</li><li>two</li></ul><img src="/logo.png" alt="the logo"><a href="/next?x=1&amp;y=2">next page</a></body></html>
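To install with smlpkg, add the package to the require block of your package manifest:
require {
  github.com/sjqtentacles/sml-html
}
Then run smlpkg sync. Alternatively, vendor the library under lib/github.com/sjqtentacles/sml-html/ and
reference sml-html.mlb.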
lib/github.com/sjqtentacles/
  sml-html/
    escape.{sig,sml}        context-aware HTML escaping
    html.{sig,sml}          HTML AST + safe renderer
    sources.mlb  sml-html.mlb
  sml-buffer/               vendored dependency (committed)
test/                       Harness suite (18 checks)
MIT. See LICENSE.