docx-md-comments converts DOCX to Markdown and Markdown back to DOCX, and focuses on one hard problem: preserving Microsoft Word comments through roundtrip edits.
It is built for workflows where you:
- start with a Word `.docx` containing comments
- edit content and comments in Markdown (manually or with an LLM)
- convert back to Word without losing comment structure
LLM-focused benefit:
- LLMs can draft or rewrite content
- LLMs can add new comments and threaded replies in Markdown
- those comments/replies are reconstructed as native Word comment threads on export
| Capability | Supported |
|---|---|
| DOCX -> Markdown conversion | Yes |
| Markdown -> DOCX conversion | Yes |
| Comment anchors | Preserved |
| Threaded replies | Preserved |
| Comment state (active/resolved) | Preserved |
| Native Word comment thread reconstruction | Yes |
| LLM-authored comments and replies roundtrip back to Word threads | Yes |
Most generic converters drop or flatten comment metadata. This project is focused on Word review workflows, so comments remain usable after conversion.
Common use cases:
- legal, scientific, or editorial review drafts with heavy comment threading
- LLM-assisted editing where the model also writes comments/replies for reviewers
- roundtrip pipelines where Word comments must survive intact
Requirements:
- Python 3.10+
- Pandoc installed and available on `PATH`
Install Pandoc:
- macOS: `brew install pandoc`
- Ubuntu/Debian: `sudo apt-get install pandoc`
- Windows (PowerShell): `choco install pandoc -y`
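The two requirements above can be checked from a script before any conversion is attempted. The following is a minimal sketch, not part of docx-md-comments; the helper name `check_requirements` is hypothetical.

```python
import shutil
import sys


def check_requirements(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("pandoc") is None:
        problems.append("pandoc not found on PATH")
    return problems
```

Calling `check_requirements()` at the start of a pipeline and stopping on a non-empty result gives a clearer message than a failed conversion later on.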
Recommended (pipx, isolated app install):

    pipx install docx-md-comments

Alternative (pip):

    python -m pip install docx-md-comments

Installed commands:
- `dmc`
- `docx-comments`
- `docx2md` / `d2m`
- `md2docx` / `m2d`
If installed with pipx:

    pipx upgrade docx-md-comments

If installed with pip:

    python -m pip install --upgrade docx-md-comments

Convert Word to Markdown:

    dmc draft.docx

Creates `draft.md`.

Convert Markdown back to Word:

    dmc draft.md

Creates `draft.docx`.
- Convert reviewed Word document to Markdown: `dmc reviewed.docx`
- Edit `reviewed.md` manually or with an LLM. The LLM can update prose and add root comments or threaded replies.
- Convert back to Word: `dmc reviewed.md`
- Open `reviewed.docx` in Word and continue normal comment-based review.
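The workflow above can be scripted. This is a minimal sketch that shells out to the `dmc` command. It assumes `dmc` writes its output beside the input with the extension swapped, as in the `draft.docx` / `draft.md` example. The function names and the `edit_markdown` callback are hypothetical.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(docx_path):
    """Return the two dmc invocations for a DOCX -> Markdown -> DOCX roundtrip."""
    docx = Path(docx_path)
    md = docx.with_suffix(".md")  # assumed output name, e.g. reviewed.md
    return [["dmc", str(docx)], ["dmc", str(md)]]


def run_roundtrip(docx_path, edit_markdown):
    """Run the roundtrip, calling edit_markdown(md_path) between the two steps."""
    to_md, to_docx = roundtrip_commands(docx_path)
    subprocess.run(to_md, check=True)    # DOCX -> Markdown
    edit_markdown(Path(to_docx[1]))      # manual edit, script, or LLM call
    subprocess.run(to_docx, check=True)  # Markdown -> DOCX
```

With `check=True`, a failed conversion raises `subprocess.CalledProcessError`, so the edit step never runs against a missing Markdown file.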
DOCX -> Markdown:

    docx-comments draft.docx
    docx2md draft.docx
    d2m draft.docx

Markdown -> DOCX:

    md2docx draft.md
    m2d draft.md

DOCX -> Markdown with an explicit output path (`-o`):

    docx2md draft.docx -o draft.md
    d2m draft.docx -o draft.md
    dmc docx2md draft.docx -o draft.md

Markdown -> DOCX with an explicit output path (`-o`):

    md2docx draft.md -o draft.docx
    m2d draft.md -o draft.docx
    dmc md2docx draft.md -o draft.docx

Use a reference Word document for styling:

    md2docx draft.md --ref original.docx -o final.docx
    m2d draft.md -r original.docx -o final.docx

`--ref` maps to Pandoc `--reference-doc`.
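When the conversion is driven from Python, the `md2docx` invocation shown above can be built as an argument list. This is a sketch only; the helper name `md2docx_command` is hypothetical, and it uses only the `--ref` and `-o` flags documented above.

```python
import subprocess


def md2docx_command(markdown, output, ref=None):
    """Build an md2docx argument list, adding --ref when a reference document is given."""
    cmd = ["md2docx", str(markdown)]
    if ref is not None:
        cmd += ["--ref", str(ref)]  # styling source, forwarded as Pandoc --reference-doc
    cmd += ["-o", str(output)]
    return cmd


def convert(markdown, output, ref=None):
    """Run md2docx and raise CalledProcessError if the conversion fails."""
    subprocess.run(md2docx_command(markdown, output, ref), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.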
Unknown flags are passed through to Pandoc:

    docx2md draft.docx -o draft.md --extract-media=media
    dmc md2docx draft.md --reference-doc=template.docx

- Tracked Changes: Word revisions are not preserved through roundtrip. Resolve them in Word first.
- Formatting: This project prioritizes comment fidelity. Very complex layouts may not roundtrip perfectly.
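Because tracked changes do not survive the roundtrip, it can help to detect them before converting. The sketch below is not part of docx-md-comments. It counts the WordprocessingML insertion (`w:ins`) and deletion (`w:del`) elements in `word/document.xml`. It does not look at moves, formatting revisions, headers, or footnotes, so a zero result is not proof that a document is revision-free. The function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(docx_path):
    """Return (insertions, deletions) found in the main document part."""
    # A .docx file is a ZIP archive; the body text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    inserted = sum(1 for _ in root.iter(f"{{{W_NS}}}ins"))
    deleted = sum(1 for _ in root.iter(f"{{{W_NS}}}del"))
    return inserted, deleted
```

If either count is non-zero, accept or reject the revisions in Word before running the conversion.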
    dmc --help
    docx2md --help
    md2docx --help

Run full suite:

    make test

Roundtrip-focused tests only:

    make test-roundtrip

`make test` also writes:
- `artifacts/out_test.md`
- `artifacts/out_test.docx`
Please open bugs/feature requests at:
https://github.com/Pascal-Kueng/docx-md-comments/issues
When reporting a conversion bug, include:
- input sample (or minimal repro)
- command used
- expected vs actual behavior (Word view)
- failing `failure_bundle` path if tests failed
- Marker style uses `///C<ID>.START///` and `///C<ID>.END///` (optional `==...==` wrapper).
- Reply relationships are reconstructed as native Word threads (`commentsExtended.xml` `paraIdParent` + story markers).
- Validation fails fast on malformed marker edits with line-specific diagnostics.
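To illustrate the kind of check that fail-fast marker validation implies, here is a small stand-alone sketch that pairs START and END markers and reports problems by line number. It is not the project's validator. The regular expression assumes that an ID contains no slashes, dots, or whitespace, which the notes above do not specify. The name `check_markers` is hypothetical.

```python
import re

# Assumed marker shape: ///C<ID>.START/// and ///C<ID>.END///
MARKER = re.compile(r"///C(?P<id>[^./\s]+)\.(?P<kind>START|END)///")


def check_markers(markdown):
    """Return a list of 'line N: message' strings for unbalanced markers."""
    errors = []
    open_at = {}  # comment ID -> line number of its unmatched START
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for m in MARKER.finditer(line):
            cid, kind = m["id"], m["kind"]
            if kind == "START":
                if cid in open_at:
                    errors.append(f"line {lineno}: duplicate START for comment {cid}")
                else:
                    open_at[cid] = lineno
            elif cid in open_at:
                del open_at[cid]  # END closes the matching START
            else:
                errors.append(f"line {lineno}: END without START for comment {cid}")
    # Anything still open never received its END marker.
    for cid, lineno in open_at.items():
        errors.append(f"line {lineno}: START without END for comment {cid}")
    return errors
```

Running a check like this on LLM-edited Markdown before converting back to Word catches accidentally deleted or duplicated markers early.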
For maintainer details, see AGENTS.md.