Recommended libxml2 API Functionality to Add Next
Executive Summary
Recommendation: Add Document Serialization APIs
Document serialization (saving XML to file/string) is the highest-value addition because:
- Foundation for your goals: Required for both modifying existing XML and creating new XML from scratch
- Immediate utility: Enables parse-modify-save workflows right away
- Low complexity: Straightforward implementation (~2.5 hours)
- Unblocks future work: Prerequisite for document creation APIs
Current API Coverage Analysis
What's Implemented:
- ✅ XML parsing (from file and string)
- ✅ XPath evaluation with namespaces
- ✅ Tree navigation (root, children, attributes)
- ✅ Attribute operations (get/set/unset)
- ✅ Content retrieval
- ✅ Error handling via Xml2Error
Critical Gaps Identified:
- ❌ Document serialization - Cannot save documents to file or string
- ❌ Document creation - No API to build XML from scratch
- ❌ Node creation/insertion - Cannot add new elements
- ❌ Node removal - Cannot delete elements
- ❌ HTML parsing - Not exposed at high level
- ❌ Schema validation (XSD/DTD)
Why Serialization First:
Without serialization, neither modifications to a parsed document nor a newly created document can be saved, so write operations have no practical use. Serialization is the logical first step because it enables every write-oriented workflow.
Implementation Plan: Document Serialization
API Design
Add two methods to the Xml2Doc class:

    fun serialize(
      format: Bool = true,
      encoding: String = "UTF-8")
      : String ?
      """
      Serialize document to String with optional formatting.
      Returns pretty-printed or compact XML as String val.
      """

    fun saveToFile(
      auth: FileAuth,
      filename: String,
      format: Bool = true,
      encoding: String = "UTF-8")
      : None ?
      """
      Save document to file with optional formatting and encoding.
      Requires FileAuth capability for safe file access.
      """
Parameters:
- format: Bool - Pretty-print with indentation (true) or compact (false)
- encoding: String - Character encoding (default "UTF-8", also supports "ISO-8859-1", "UTF-16", etc.)
Error Handling:
- Uses Pony's ? operator (consistent with existing API)
- Raises error on memory allocation failure or file write failure
Implementation Steps
Phase 1: Add xmlFree to Raw API (required for memory management)
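- File: libxml2/raw/uses.pony
- Add: use @xmlFree[None](ptr: Pointer[U8] tag)
- File: libxml2/raw/functions.pony
- Add wrapper function xmlFree(ptr: Pointer[U8] tag): None
- Caveat: libxml2's headers declare xmlFree as a global function-pointer variable (type xmlFreeFunc), not an ordinary function, so verify that a direct FFI call to the symbol behaves as intended; a small C shim may be needed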
Phase 2: Implement Serialization Methods
- File: libxml2/xml2doc.pony
- Add serialize() method using xmlDocDumpFormatMemoryEnc()
- Add saveToFile() method using xmlSaveFormatFileEnc()
Key Implementation Details:
- serialize() calls xmlDocDumpFormatMemoryEnc(), copies result to Pony String, then calls xmlFree() to release C memory
- saveToFile() calls xmlSaveFormatFileEnc(), checks return value for errors (negative = failure)
- Both methods convert Bool format parameter to I32 (1 or 0) for C API
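The saveToFile() details above can be sketched against the raw C API. This is a minimal sketch, not the library's implementation: XmlSketch and save_to_file are hypothetical names, the document handle is modelled as an opaque Pointer[None], the sample XML and the out.xml filename are made up, and it assumes a recent ponyc release in which env.root is an AmbientAuth and FilePath's constructor takes a FileAuth.

```pony
use "files"
use "lib:xml2"

// Raw libxml2 bindings; xmlDocPtr is treated as an opaque pointer.
use @xmlReadMemory[Pointer[None]](buffer: Pointer[U8] tag, size: I32,
  url: Pointer[U8] tag, encoding: Pointer[U8] tag, options: I32)
use @xmlSaveFormatFileEnc[I32](filename: Pointer[U8] tag,
  doc: Pointer[None] tag, encoding: Pointer[U8] tag, format: I32)
use @xmlFreeDoc[None](doc: Pointer[None] tag)

primitive XmlSketch
  """
  Hypothetical stand-in for the planned Xml2Doc.saveToFile().
  """
  fun save_to_file(doc: Pointer[None] tag, auth: FileAuth, filename: String,
    format: Bool = true, encoding: String = "UTF-8"): None ?
  =>
    // FilePath ties the target path to the FileAuth capability.
    let path = FilePath(auth, filename)
    // The C API takes an int flag, so map Bool to 1 or 0.
    let fmt: I32 = if format then I32(1) else I32(0) end
    let written = @xmlSaveFormatFileEnc(path.path.cstring(), doc,
      encoding.cstring(), fmt)
    // libxml2 returns the number of bytes written, or -1 on failure.
    if written < 0 then error end

actor Main
  new create(env: Env) =>
    let xml = "<root><item id=\"1\"/></root>"
    let doc = @xmlReadMemory(xml.cpointer(), xml.size().i32(),
      Pointer[U8], Pointer[U8], 0)
    if doc.is_null() then
      env.out.print("parse failed")
      return
    end
    try
      XmlSketch.save_to_file(doc, FileAuth(env.root), "out.xml")?
      env.out.print("saved out.xml")
    else
      env.out.print("save failed")
    end
    @xmlFreeDoc(doc)
```

Raising through ? keeps the caller-facing behaviour in line with the existing API, while the negative-return check stays next to the FFI call it guards.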
Phase 3: Add Comprehensive Tests
- File: libxml2/_tests/coverage_tests.pony
- Round-trip test: parse → serialize → parse → verify structure
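- Formatting test: verify compact output has no indentation or newlines between elements (libxml2 still emits a newline after the XML declaration and at the end of the output), and formatted output has indentation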
- File save/load test: save → load → verify content
- Encoding test: verify UTF-8 and ISO-8859-1 output
- Modified document test: setProp → serialize → verify changes persist
- Error handling test: invalid file paths raise errors correctly
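As an illustration of the file save/load case, here is a hedged sketch written directly against the raw C calls, since the final Xml2Doc method names are not fixed yet. It assumes the standard library test package is named pony_test (older ponyc releases call it ponytest) and that env.root is an AmbientAuth. The test name, the sample XML, and the roundtrip_test.xml filename are all hypothetical.

```pony
use "files"
use "pony_test"
use "lib:xml2"

// Raw libxml2 bindings; document and node handles are opaque pointers.
use @xmlReadMemory[Pointer[None]](buffer: Pointer[U8] tag, size: I32,
  url: Pointer[U8] tag, encoding: Pointer[U8] tag, options: I32)
use @xmlReadFile[Pointer[None]](url: Pointer[U8] tag,
  encoding: Pointer[U8] tag, options: I32)
use @xmlSaveFormatFileEnc[I32](filename: Pointer[U8] tag,
  doc: Pointer[None] tag, encoding: Pointer[U8] tag, format: I32)
use @xmlDocGetRootElement[Pointer[None]](doc: Pointer[None] tag)
use @xmlChildElementCount[ULong](parent: Pointer[None] tag)
use @xmlFreeDoc[None](doc: Pointer[None] tag)

actor Main is TestList
  new create(env: Env) => PonyTest(env, this)
  new make() => None

  fun tag tests(test: PonyTest) =>
    test(_TestSaveLoadRoundTrip)

class iso _TestSaveLoadRoundTrip is UnitTest
  fun name(): String => "xml2doc/save-load-round-trip"

  fun apply(h: TestHelper) =>
    let xml = "<root><a/><b/><c/></root>"
    let path = FilePath(FileAuth(h.env.root), "roundtrip_test.xml")

    // Parse from memory, then save with formatting enabled.
    let doc = @xmlReadMemory(xml.cpointer(), xml.size().i32(),
      Pointer[U8], Pointer[U8], 0)
    h.assert_false(doc.is_null(), "initial parse failed")
    if doc.is_null() then return end
    let written = @xmlSaveFormatFileEnc(path.path.cstring(), doc,
      "UTF-8".cstring(), 1)
    @xmlFreeDoc(doc)
    h.assert_true(written > 0, "save failed")

    // Load the file back and compare the element structure.
    let reloaded = @xmlReadFile(path.path.cstring(), Pointer[U8], 0)
    path.remove()
    h.assert_false(reloaded.is_null(), "reload failed")
    if reloaded.is_null() then return end
    let root = @xmlDocGetRootElement(reloaded)
    h.assert_eq[ULong](3, @xmlChildElementCount(root))
    @xmlFreeDoc(reloaded)
```

Counting child elements rather than comparing raw text keeps the check independent of the whitespace that pretty-printing adds.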
Phase 4: Documentation
- Update docstrings with examples
- Update CHANGELOG with new features
Critical Files
- libxml2/xml2doc.pony - Add serialize() and saveToFile() methods
- libxml2/raw/uses.pony - Add @xmlFree FFI declaration
- libxml2/raw/functions.pony - Add xmlFree() wrapper
- libxml2/_tests/coverage_tests.pony - Add 6 new test cases
- libxml2/xml2node.pony (reference) - Existing nodeDump() shows similar pattern
Memory Management Approach
// In serialize():
// 1. Call xmlDocDumpFormatMemoryEnc() - libxml2 allocates memory
// 2. Check for null pointer (allocation failure)
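// 3. Copy to Pony String using String.copy_cpointer() with the returned length
//    (String.from_cstring() wraps the pointer without copying, so it must not be used here)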
// 4. FREE with xmlFree() - critical to avoid leak
// 5. Return Pony-owned string
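The five steps can be sketched as follows. This is a minimal sketch under stated assumptions rather than the final method: XmlSketch and serialize are hypothetical names and the sample XML is made up. The buffer is released with the C runtime's free(), which is only valid while libxml2 uses its default allocator; the plan's xmlFree wrapper would take its place.

```pony
use "lib:xml2"

// Raw libxml2 bindings; xmlDocPtr is treated as an opaque pointer.
use @xmlReadMemory[Pointer[None]](buffer: Pointer[U8] tag, size: I32,
  url: Pointer[U8] tag, encoding: Pointer[U8] tag, options: I32)
use @xmlDocDumpFormatMemoryEnc[None](doc: Pointer[None] tag,
  mem: Pointer[Pointer[U8]], size: Pointer[I32],
  encoding: Pointer[U8] tag, format: I32)
use @xmlFreeDoc[None](doc: Pointer[None] tag)
// Assumption: default libxml2 allocator, so free() matches the allocation.
use @free[None](ptr: Pointer[U8] tag)

primitive XmlSketch
  """
  Hypothetical stand-in for the planned Xml2Doc.serialize().
  """
  fun serialize(doc: Pointer[None] tag, format: Bool = true,
    encoding: String = "UTF-8"): String ?
  =>
    recover val
      var buf = Pointer[U8]
      var len: I32 = 0
      // 1. libxml2 allocates the output buffer and reports its length.
      @xmlDocDumpFormatMemoryEnc(doc, addressof buf, addressof len,
        encoding.cstring(), if format then I32(1) else I32(0) end)
      // 2. A null buffer means the dump failed.
      if buf.is_null() or (len < 0) then error end
      // 3. Copy the bytes into Pony-owned memory.
      let out = String.copy_cpointer(buf, len.usize())
      // 4. Release the C buffer now that the copy exists.
      @free(buf)
      // 5. The recover block turns the fresh String into a String val.
      out
    end

actor Main
  new create(env: Env) =>
    let xml = "<root><item id=\"1\"/></root>"
    let doc = @xmlReadMemory(xml.cpointer(), xml.size().i32(),
      Pointer[U8], Pointer[U8], 0)
    if doc.is_null() then
      env.out.print("parse failed")
      return
    end
    try
      env.out.print(XmlSketch.serialize(doc)?)
    else
      env.out.print("serialize failed")
    end
    @xmlFreeDoc(doc)
```

Copying with the explicit length rather than scanning for a terminator also keeps encodings such as UTF-16, whose output contains zero bytes, intact.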
Verification Plan
Manual Testing:
# Run unit tests
make unit-tests
# Test with examples
cd examples
# Modify an example to parse, modify, serialize, and save
Test Coverage Required:
- Parse document → serialize → verify XML structure
- Parse → modify attributes → serialize → verify changes
- Serialize with format=true → verify indentation present
- Serialize with format=false → verify compact output (no indentation or newlines between elements)
- saveToFile → parseFile → verify round-trip works
- Test different encodings (UTF-8, ISO-8859-1)
- Error cases: invalid file paths, allocation failures
Success Criteria:
- All new tests pass
- No memory leaks (valgrind clean if needed)
- Round-trip preservation: parse → serialize → parse yields same structure
- Both UTF-8 and other encodings work correctly
Roadmap: Recommended Order for Future APIs
After document serialization is complete, implement in this order:
1. Document Creation (next logical step)
   - Xml2Doc.create(version, encoding) - Create empty document
   - Enables building XML from scratch
2. Node Creation and Insertion
   - Xml2Node.createElement(name, content?)
   - Xml2Node.appendChild(child)
   - Xml2Doc.setRootElement(node)
   - Enables dynamic XML construction
3. Node Removal
   - Xml2Node.remove() or Xml2Node.unlink()
   - Completes CRUD operations on trees
4. Text Node Access
   - Expose text nodes (currently skipped by getChildren())
   - Xml2Node.getTextContent(), Xml2Node.setTextContent()
5. HTML Parsing (separate use case)
   - Xml2Doc.parseHtmlFile(), Xml2Doc.parseHtml()
   - Leverage libxml2's HTML parser
6. Schema Validation (advanced feature)
   - XSD/DTD validation APIs
   - Lower priority, more complex
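For orientation, the first three roadmap items correspond to a handful of libxml2 tree calls that the proposed wrappers would presumably sit on top of. The sketch below calls those C functions directly; it does not show the proposed Xml2Doc or Xml2Node methods, whose design is still open, and the element names and text are made up. Note that xmlNewDoc takes only a version; in the raw API the encoding is supplied when the document is saved.

```pony
use "lib:xml2"

// Raw libxml2 tree bindings; document and node handles are opaque pointers.
use @xmlNewDoc[Pointer[None]](version: Pointer[U8] tag)
use @xmlNewNode[Pointer[None]](ns: Pointer[None] tag, name: Pointer[U8] tag)
use @xmlNewChild[Pointer[None]](parent: Pointer[None] tag,
  ns: Pointer[None] tag, name: Pointer[U8] tag, content: Pointer[U8] tag)
use @xmlDocSetRootElement[Pointer[None]](doc: Pointer[None] tag,
  root: Pointer[None] tag)
use @xmlAddChild[Pointer[None]](parent: Pointer[None] tag,
  child: Pointer[None] tag)
use @xmlUnlinkNode[None](node: Pointer[None] tag)
use @xmlFreeNode[None](node: Pointer[None] tag)
use @xmlChildElementCount[ULong](parent: Pointer[None] tag)
use @xmlFreeDoc[None](doc: Pointer[None] tag)

actor Main
  new create(env: Env) =>
    // 1. Document creation: an empty document with version "1.0".
    let doc = @xmlNewDoc("1.0".cstring())
    if doc.is_null() then
      env.out.print("could not create document")
      return
    end

    // 2. Node creation and insertion.
    let root = @xmlNewNode(Pointer[None], "catalog".cstring())
    @xmlDocSetRootElement(doc, root)
    // Create a child with text content in one call.
    @xmlNewChild(root, Pointer[None], "book".cstring(),
      "First title".cstring())
    // Create a detached node, then attach it.
    let extra = @xmlNewNode(Pointer[None], "book".cstring())
    @xmlAddChild(root, extra)
    env.out.print("children after insert: " +
      @xmlChildElementCount(root).string())

    // 3. Node removal: unlink detaches the node, xmlFreeNode releases it.
    @xmlUnlinkNode(extra)
    @xmlFreeNode(extra)
    env.out.print("children after removal: " +
      @xmlChildElementCount(root).string())

    // Freeing the document also frees the nodes still attached to it.
    @xmlFreeDoc(doc)
```

With libxml2 linked, this is expected to report two children after the insertions and one after the removal. The split between unlinking and freeing shows the ownership question that Xml2Node.remove() versus Xml2Node.unlink() would have to settle.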
Why Not Other Features First?
Node Creation Without Serialization: Useless - can't save results
HTML Parsing: Separate use case, doesn't support your stated goals (XML modification/creation)
Schema Validation: Advanced feature, less commonly needed
XSLT: Complex, lower demand than basic CRUD operations
Serialization is the linchpin - it unblocks everything else you want to do.
Estimated Effort
Document Serialization Implementation:
- Add xmlFree to raw API: 15 minutes
- Implement serialize(): 30 minutes
- Implement saveToFile(): 20 minutes
- Write 6 test cases: 45 minutes
- Debug and polish: 30 minutes
- Total: ~2.5 hours
Difficulty: Low-Medium
- Straightforward C API usage
- Clear memory management pattern
- Good test coverage possible
- Main risk: pointer handling (mitigated by following existing patterns)
Generated by Claude Code analysis