Releases: contact-red/libxml2

1.2.0

22 May 10:52

Add namespace-aware accessors to Xml2Node

Adds four methods to Xml2Node for working with documents that use XML namespaces. name() continues to return the local name only.

  • namespaceUri(): the node's namespace URI, or "" if the node has no namespace.
  • namespacePrefix(): the namespace prefix as written in the source document (e.g. "glib" for <glib:signal>), or "" for the default namespace.
  • qname(): the qualified name as a (namespace_uri, local_name) tuple, for matching on both at once.
  • getPropNs(uri, local): retrieves a namespace-qualified attribute by URI and local name, useful when a document binds the namespace under a non-conventional prefix.
let GLIB_NS: String = "http://www.gtk.org/introspection/glib/1.0"
for child in class_node.getChildren().values() do
  match child.qname()
  | ("", "method")      => // <method>
  | (GLIB_NS, "signal") => // <glib:signal>
  end
end
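
The three read-only accessors can also be printed side by side while walking a tree, which helps when checking how a document binds its prefixes. The following is a minimal sketch rather than code from the library: the `use "libxml2"` package name and the assumption that getRootElement() is still partial in this release are guesses, and Xml2Parser.parseDoc is the entry point described later in these notes.

```pony
use "libxml2"  // assumed package name; adjust to match your project layout

actor Main
  new create(env: Env) =>
    let xml =
      "<class xmlns:glib=\"http://www.gtk.org/introspection/glib/1.0\">" +
      "<method/><glib:signal/></class>"

    match Xml2Parser.parseDoc(xml)
    | let doc: Xml2Doc =>
      try
        // Assumption: getRootElement() is partial, as in the 1.1.2 examples.
        let root = doc.getRootElement()?
        for child in root.getChildren().values() do
          // name() is the local name only; prefix and URI come from the
          // namespace-aware accessors added in this release.
          env.out.print(
            "prefix=" + child.namespacePrefix() +
            " uri=" + child.namespaceUri() +
            " local=" + child.name())
        end
      end
    | let err: Xml2Error =>
      env.err.print(err.string())
    end
```

Matching on namespaceUri() or qname() rather than namespacePrefix() keeps the code working when a document binds the same namespace under a different prefix.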

Add parser options and safer parsing defaults

Xml2Doc.parseDoc and Xml2Doc.parseFile now accept an optional Xml2ParserOptions argument and route through libxml2's xmlReadDoc / xmlReadFile internally. Without an explicit options argument they parse with safe-by-default settings: no network access (XML_PARSE_NONET), no entity substitution, no external DTD loading. This closes the XXE / SSRF attack surface for code parsing untrusted XML without forcing every caller to know to opt in.

Xml2ParserOptions exposes typed boolean fields mapping 1:1 to libxml2's XML_PARSE_* flags: error_recovery, substitute_entities, no_blanks, no_net, load_dtd, load_dtd_attrs, pedantic, huge. Construct it with Pony's where named-argument syntax to override only the flags you need.

// Default: safe (no_net = true, substitute_entities = false)
let doc = Xml2Doc.parseDoc(xml)?

// Lenient parse that recovers from malformed input
let opts = Xml2ParserOptions.create(
  where error_recovery' = true, no_blanks' = true)
let doc = Xml2Doc.parseDoc(xml, opts)?

Behaviour change: existing parseDoc / parseFile callers now parse with XML_PARSE_NONET enabled by default. Code that relied on libxml2's previous permissive defaults — in particular, on the parser fetching remote entities or DTDs — must explicitly opt in by constructing Xml2ParserOptions.create(where no_net' = false, substitute_entities' = true, load_dtd' = true).

Errors are now returned as data, not raised

Parsing and XPath entry points no longer use Pony's partial-function ? to signal failure. Instead they return a union of the success value and an Xml2Error value, so callers can inspect structured error information (domain, level, code, message, file, line, plus three context strings and two context integers).

match Xml2Parser.parseDoc(xml)
| let doc: Xml2Doc =>
    // use the document
| let err: Xml2Error =>
    env.err.print(err.string())
end

What changed

  • New Xml2Parser.parseDoc(xml, options) and Xml2Parser.parseFile(auth, path, options) entry points return (Xml2Doc | Xml2Error).
  • New xpathEvalNodes / xpathEvalString / xpathEvalF64 / xpathEvalBool convenience methods on Xml2Doc and Xml2Node return (T | Xml2Error) instead of raising.
  • Xml2XPathResult now includes Xml2Error as a variant; xpathEval populates it on evaluation failure. Empty nodesets now return an empty Array[Xml2Node] rather than None.
  • Xml2Error is now class val with let fields throughout; it is safe to share across actors.
  • Xml2Error.domain and Xml2Error.level are typed primitive unions (Xml2ErrorDomain and Xml2ErrorLevel) rather than free-form String values. Exhaustive match over these unions is supported.
  • Xml2Error.string() produces a human-readable rendering suitable for logging.
  • Xml2Error.from_last_error()? is the new (partial) constructor for reading libxml2's per-thread last-error directly; raises if there is no current error rather than silently fabricating one.
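
As an illustration of the union-returning convenience methods listed above, here is a minimal sketch that parses an inline document and evaluates a string and a boolean XPath expression. It assumes the package is imported as `use "libxml2"` and that the namespaces argument keeps its default from 1.1.2; neither is confirmed by these notes.

```pony
use "libxml2"  // assumed package name; adjust to match your project layout

actor Main
  new create(env: Env) =>
    let xml =
      "<catalog><book id=\"bk101\">" +
      "<title>XML Developer's Guide</title></book></catalog>"

    match Xml2Parser.parseDoc(xml)
    | let doc: Xml2Doc =>
      // Each convenience method returns (T | Xml2Error), so every call
      // site handles the failure case explicitly.
      match doc.xpathEvalString("string(//book/title)")
      | let title: String => env.out.print("title: " + title)
      | let str_err: Xml2Error => env.err.print(str_err.string())
      end

      match doc.xpathEvalBool("boolean(//book[@id='bk101'])")
      | let found: Bool => env.out.print("found: " + found.string())
      | let bool_err: Xml2Error => env.err.print(bool_err.string())
      end
    | let parse_err: Xml2Error =>
      env.err.print(parse_err.string())
    end
```

Because the error arrives as a value, it can be logged, stored, or sent to another actor without consulting libxml2's thread-local state.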

Breaking changes

Xml2Doc.parseDoc(...)? and Xml2Doc.parseFile(...)? constructors have been removed. Migrate to Xml2Parser.parseDoc(...) / Xml2Parser.parseFile(...) and pattern-match the returned union:

Before:

try
  let doc = Xml2Doc.parseDoc(xml)?
  // ...
else
  env.err.print("parse failed")
end

After:

match Xml2Parser.parseDoc(xml)
| let doc: Xml2Doc =>
    // ...
| let err: Xml2Error =>
    env.err.print(err.string())
end

The convenience XPath methods change shape similarly. Before:

try
  let nodes = doc.xpathEvalNodes("//foo")?
  // ...
end

After:

match doc.xpathEvalNodes("//foo")
| let nodes: Array[Xml2Node] => // ...
| let err: Xml2Error          => // ...
end

Callers that previously relied on Xml2Error.create() reading thread-local last-error after a raise should instead receive the Xml2Error directly from the union return — this is reliable across Pony actor migration boundaries, which the previous mechanism was not.

Fix memory leak on every node accessor

Every call to getProp, getPropNs, getNodePath, getContent, getLang, or xpathCastNodeToString on an Xml2Node leaked the underlying libxml2 string allocation. Tree-walks reading attributes or content from many nodes grew memory without bound. The C-side allocation is now freed after the value is cloned into a Pony string.

[1.2.0] - 2026-05-22

Fixed

  • Adds parser options and safer parsing defaults (PR #35)
  • Returns errors as data from parsing and XPath (PR #36)
  • Fixes memory leak on every node accessor (PR #38)

Added

  • Adds namespace-aware accessors to Xml2Node (PR #34)
  • Returns errors as data from parsing and XPath (PR #36)

1.1.2

22 May 00:26

Adds functions to manipulate attributes.

  • Added getProps(): Array[(String, String)]
  • Added setProp(name: String val, value: String val)
  • Added unsetProp(name: String val)
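
A minimal sketch of the three attribute functions working together, written against the 1.1.2 API (partial constructors with `?`). The `use "libxml2"` package name is an assumption, and the document is built in memory with createWithRoot (described further down in these notes) so that no input file is needed.

```pony
use "libxml2"  // assumed package name; adjust to match your project layout

actor Main
  new create(env: Env) =>
    try
      // Build a one-element document to hang attributes on.
      let doc = Xml2Doc.createWithRoot("book")?
      let root = doc.getRootElement()?

      // Set two attributes, then remove one of them again.
      root.setProp("id", "bk101")
      root.setProp("lang", "en")
      root.unsetProp("lang")

      // getProps() yields (name, value) pairs for the remaining attributes.
      for (name, value) in root.getProps().values() do
        env.out.print(name + "=" + value)
      end
    else
      env.err.print("document creation failed")
    end
```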

Adds convenience methods for xpathEval()

Adds:

  • xpathEvalNodes(xpath: String val, namespaces: Array[(String val, String val)] = []): Array[Xml2Node] ?
  • xpathEvalString(xpath: String val, namespaces: Array[(String val, String val)] = []): String val ?
  • xpathEvalF64(xpath: String val, namespaces: Array[(String val, String val)] = []): F64 ?
  • xpathEvalBool(xpath: String val, namespaces: Array[(String val, String val)] = []): Bool ?

These functions are clearer in use at the call-site:

Compare:

  match doc.xpathEval("//child")
  | let nodes: Array[Xml2Node] => // etc…
  else
    h.fail("Expected nodeset result for //child")
  end

or

  let nodes: Array[Xml2Node] = (doc.xpathEval("//child") as Array[Xml2Node])

To:

  let nodes: Array[Xml2Node] = doc.xpathEvalNodes("//child")?

Adds Xml2Doc.serialize() and Xml2Doc.saveToFile()

Adds serialize() and saveToFile() methods to Xml2Doc, enabling
XML documents to be saved to strings or files with optional formatting
and encoding control.

Implementation:

  • Added serialize() method using xmlDocDumpFormatMemoryEnc for in-memory
    serialization with format and encoding options
  • Added saveToFile() method using xmlSaveFormatFileEnc for direct file
    output
  • Both methods support pretty-printing and multiple character encodings
    (UTF-8, ISO-8859-1, UTF-16, etc.)

Testing:

  • Added 6 comprehensive test cases covering round-trip serialization,
    formatting options, file I/O, encoding support, modified document
    serialization, and error handling
  • All 33 tests pass successfully

This feature enables modifying existing XML documents and provides
the foundation for creating new XML documents from scratch.

Adds Document Creation API

Implements comprehensive document creation functionality enabling XML documents to be built programmatically from scratch. Previously, the library only supported parsing existing XML documents.

New Xml2Doc Methods

Core document creation:

  • create(version: String = "1.0"): Xml2Doc ? - Create empty XML documents
  • createElement(name: String, content: String = ""): Xml2Node ? - Create element nodes
  • setRootElement(root: Xml2Node): Xml2Node ? - Set document root element

Convenience constructors and factories:

  • createWithRoot(root_name: String, version: String = "1.0"): Xml2Doc ? - Create document with root in one step
  • createTextNode(content: String): Xml2Node ? - Create text nodes for mixed content
  • createComment(content: String): Xml2Node ? - Create comment nodes

New Xml2Node Methods

Tree manipulation:

  • appendChild(child: Xml2Node): Xml2Node ? - Add child nodes to elements
  • setContent(content: String): None - Set text content of nodes
  • addChild(child_name: String, content: String = ""): Xml2Node ? - Create and add child element in one step

Usage Examples

Creating a simple document:

let doc = Xml2Doc.create()?
let root = doc.createElement("catalog")?
doc.setRootElement(root)?

let book = doc.createElement("book")?
book.setProp("id", "bk101")
let title = doc.createElement("title", "XML Developer's Guide")?
book.appendChild(title)?
root.appendChild(book)?

// Serialize and save
let xml = doc.serialize()?
doc.saveToFile(auth, "catalog.xml")?

Using convenience methods:

// Create with root in one step
let doc = Xml2Doc.createWithRoot("html")?
let root = doc.getRootElement()?

// Add children using convenience method
let body = root.addChild("body")?
let h1 = body.addChild("h1", "Welcome")?

Creating mixed content (text + elements):

let doc = Xml2Doc.createWithRoot("para")?
let para = doc.getRootElement()?

para.appendChild(doc.createTextNode("This is ")?)?
let bold = doc.createElement("b", "bold")?
para.appendChild(bold)?
para.appendChild(doc.createTextNode(" text.")?)?

// Result: <para>This is <b>bold</b> text.</para>

Adding comments:

let doc = Xml2Doc.createWithRoot("root")?
let root = doc.getRootElement()?
root.appendChild(doc.createComment("Generated by Pony")?)?

Fix memory leak in XPath queries

Every call to xpathEval, xpathEvalNodes, xpathEvalString, xpathEvalF64, or xpathEvalBool leaked memory. Long-running processes performing repeated queries against a parsed document grew without bound. Memory is now reclaimed after each call.

Fix memory leak in nodeDump

Every call to nodeDump on an Xml2Node leaked memory. Long-running processes that repeatedly serialised node subtrees grew without bound. Memory is now reclaimed after each call.

Fix silent memory leak in serialize on allocator-lookup failure

If libxml2's internal allocator-function lookup ever failed during serialize, the call would silently leak the serialised buffer. The call now raises an error in that case rather than continuing on with a no-op free.

Harden XPath evaluation and createWithRoot against allocation failures

xpathEval on both Xml2Doc and Xml2Node previously dereferenced a null XPath context inside libxml2 if context allocation failed, contradicting the library's "safe from crashes" promise. The methods now return None cleanly in that case.

xpathEval on Xml2Node also silently fell back to the document root if setting the context node failed, so a query like ./child could return matches from across the whole document. It now returns None rather than producing wrong results.

Xml2Doc.createWithRoot previously leaked the document if root-element creation failed mid-constructor. The document is now freed before the error is raised.
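
For callers of xpathEval in this release, the practical consequence is that None has to be handled as a possible result. The sketch below shows one way to do that for a node-relative query; it assumes the package is imported as `use "libxml2"` and that xpathEval's namespaces argument is optional, as it is for the convenience methods.

```pony
use "libxml2"  // assumed package name; adjust to match your project layout

actor Main
  new create(env: Env) =>
    try
      let doc = Xml2Doc.createWithRoot("root")?
      let root = doc.getRootElement()?
      root.addChild("child", "one")?
      root.addChild("child", "two")?

      // A relative query evaluated against the node, not the document.
      match root.xpathEval("./child")
      | let nodes: Array[Xml2Node] =>
        env.out.print("matched " + nodes.size().string() + " node(s)")
      | None =>
        // Covers the hardened failure paths described above; treat it
        // as "no usable result" rather than assuming a specific cause.
        env.err.print("xpathEval returned None")
      else
        env.out.print("query produced a non-nodeset result")
      end
    else
      env.err.print("document construction failed")
    end
```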

[1.1.2] - 2026-05-22

Fixed

  • Fixes xmlXPathObject leak in Xml2XPathObject.apply (PR #28)
  • Fixes memory leak in nodeDump (PR #29)
  • Fixes silent memory leak in serialize on allocator-lookup failure (PR #30)
  • Hardens XPath evaluation and createWithRoot against allocation failures (PR #31)

Added

  • Added functions to manipulate XML Attributes. (PR #19)
  • Adds convenient methods for xpathEval() (PR #20)
  • Adds serialization API (PR #25)
  • Add document creation API (PR #27)
  • Adds crash-resistance fuzz tests for the Pony API (PR #32)

1.1.1

11 Jan 08:37

Adds Autogenerated Documentation and Tests

The documentation that was automatically written focuses mainly on
how the pony classes wrap the C API as opposed to how you functionally
use it. This will likely need a rewrite, but it is useful as-is.

The autogenerated tests exercise most of the included API.

Fixes #1 by ensuring that Xml2Doc is last to die

In our previous iteration of this API we had Pony Class instances freeing C structs which were still referenced by Xml2Doc.

The documentation states that for our current parsing use-cases it is unsafe to automatically _final() free any of the XmlNodes as they're a part of our XmlDoc tree.

As such, we have threaded a tag reference through every dependent Pony instance so that it will be "impossible"™ for Xml2Doc._final() to execute the frees until all other dependent references are no longer reachable.

Fixes: #4 Code Review Xml2Node

  • Removed allocated flag as unused.
  • getChildren wasn't actually an off-by-one error, but I prefer the suggested solution.
  • Removed the castNodeToString function.

Fixes: #5 Code Review Xml2XPathObject

  • Vestigial code removed.
  • Documentation already notes the unsupported data types.
  • There is no facility for logging until later versions of the C library become more common.

[1.1.1] - 2026-01-11

Fixed

  • Fixes #2 by ensuring that Xml2Doc is last to die. (PR #8)
  • Fixes: #4 Code Review Xml2Node (PR #10)
  • Fixes: #5 Code Review Xml2XPathObject (PR #11)

Added

  • Adds autogenerated documentation / tests (PR #1)

1.1.0

11 Jan 04:19

[1.1.0] - 2026-01-11