Test Python code examples in documentation #491

@sethfitz

Description

Problem

Python code blocks in READMEs and guides contain import paths, type references, and usage patterns that go stale when the codebase changes. Nothing catches this today. Errors survive until a human reads the rendered documentation and tries to follow along.

#489 is a recent example. The core/system package split left behind incorrect import paths (core.types instead of system.string, primitives instead of primitive, model_constraints instead of system.model_constraint) and stale file references (core/ref.py instead of system/ref/ref.py). These were found and fixed manually as part of a broader documentation alignment — not because a test caught them.

The repo has ~56 Python code blocks across 5 files (PYDANTIC_GUIDE.md alone has 47). As the schema evolves, these will drift again.

Scope

Files with Python code examples today:

| File | Code blocks | Notes |
|---|---|---|
| PYDANTIC_GUIDE.md | 47 | Contributor-facing guide; mix of self-contained and sequential examples |
| packages/overture-schema/README.md | 3 | Top-level package usage |
| packages/overture-schema-core/README.md | 3 | Primitive types, scoping |
| README.pydantic.md | 2 | Package overview |
| packages/overture-schema-codegen/README.md | 1 | Extraction API |

Most blocks are self-contained (each includes its own imports). Some build sequentially (defining a model in one block, using it in the next). Some reference undefined variables like feature_data that would need hidden setup to execute.

Options

Three pytest-based tools are worth considering. All detect ```python fences automatically and integrate with the existing test suite.
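
For orientation, the mechanism all three tools share can be sketched with the standard library alone. This is not how any of them is implemented; the `extract_python_blocks` and `check_blocks` helpers and the sample document are hypothetical, and the failing `json` import merely stands in for a stale import path like the ones fixed in #489.

```python
import re

# Built indirectly so this example can itself sit inside a code fence.
TICKS = "`" * 3

# A fence opened with the python info string, up to the next bare closing fence.
FENCE = re.compile(
    rf"^{TICKS}python[^\n]*\n(.*?)^{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_blocks(markdown):
    """Return the source text of every python-fenced block, in document order."""
    return [match.group(1) for match in FENCE.finditer(markdown)]


def check_blocks(markdown):
    """Run each block in a fresh namespace; return (block_index, error) pairs."""
    failures = []
    for index, source in enumerate(extract_python_blocks(markdown)):
        try:
            exec(compile(source, f"<block {index}>", "exec"), {})
        except Exception as exc:
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


# Hypothetical document: one healthy block, one with a stale import.
sample = "\n".join([
    "Intro prose.",
    TICKS + "python",
    "from json import dumps",
    "encoded = dumps({'ok': True})",
    TICKS,
    "More prose.",
    TICKS + "python",
    "from json import no_such_name  # stands in for a stale import path",
    TICKS,
])

failures = check_blocks(sample)
```

Here `failures` holds a single entry for the second block, with an `ImportError` message. The tools below add what this sketch lacks: pytest collection and reporting, skip and setup directives, and namespace control.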

1. Sybil (v10)

All code blocks in a file share a namespace by default — sequential examples just work without annotation. Hidden setup blocks (<!-- invisible-code-block: python -->) handle undefined variables like feature_data. Skip (<!-- skip: next -->), namespace reset (<!-- clear-namespace -->), and full pytest fixture support are available. Configuration is ~5 lines in conftest.py.

Trade-offs: The shared-namespace default means a failing block can cascade failures through the rest of the file. The HTML comment directives (skip, invisible blocks) are invisible when rendered but add noise to the raw markdown source. More configuration surface than the other options.
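
The cascade is easy to reproduce without Sybil. The sketch below is a stdlib illustration of the shared-namespace idea only, not Sybil's implementation; `run_blocks`, the `Building` class, and the block contents are hypothetical.

```python
def run_blocks(blocks, shared=True):
    """Execute code strings in order; return one outcome string per block."""
    namespace = {}
    outcomes = []
    for source in blocks:
        if not shared:
            namespace = {}  # isolate every block from the ones before it
        try:
            exec(source, namespace)
            outcomes.append("ok")
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


# Sequential example: define a class, instantiate it, then use the instance.
blocks_ok = [
    "class Building:\n    height = 10.0",
    "b = Building()",
    "assert b.height == 10.0",
]

# Same example, but the first block now starts with a stale import.
blocks_broken = [
    "from json import no_such_name\nclass Building:\n    height = 10.0",
    "b = Building()",
    "assert b.height == 10.0",
]

shared_ok = run_blocks(blocks_ok)                  # all three blocks pass
shared_broken = run_blocks(blocks_broken)          # one real error, two knock-on errors
isolated_ok = run_blocks(blocks_ok, shared=False)  # sequential blocks break when isolated
```

With a shared namespace the healthy sequence passes, while the broken one reports an `ImportError` followed by two `NameError`s that are only symptoms. Running the healthy sequence in isolation fails on the second and third blocks, which is why sequential examples need either a shared namespace or an explicit continuation marker.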

2. pytest-markdown-docs (v0.9)

Pytest plugin activated with --markdown-docs. Auto-detects Python code blocks. Sequential blocks require a continuation info string (```python continuation). Globals injection via a pytest_markdown_docs_globals hook in conftest.py. Skip via ```python notest.

Trade-offs: The continuation marker is visible in raw markdown source and in rendered output (as part of the code fence info string, though most renderers ignore it). No invisible setup blocks — globals injection is the only way to provide undefined variables, and it applies to all blocks globally rather than per-file. v0.9; less mature than Sybil. Known issues with traceback line numbers in continuation blocks.
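
A minimal sketch of the globals hook, assuming it lives in the repository's top-level conftest.py. The hook name is the one mentioned above; the contents of `feature_data` are purely hypothetical and would need to match whatever the guide's examples actually expect.

```python
# conftest.py (sketch)


def pytest_markdown_docs_globals():
    """Return names to inject as globals into every documented code block."""
    return {
        # Hypothetical stand-in for the feature_data that some guide
        # examples reference without defining it.
        "feature_data": {
            "id": "example-1",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
        },
    }
```

Because the hook applies to every block in every file, any name added here is also visible to blocks that were meant to be self-contained, which can mask a missing import or definition in those blocks.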

3. mktestdocs (v0.2)

Not a plugin — a helper function called from a test. Three lines of test code, zero configuration:

from pathlib import Path
import pytest
from mktestdocs import check_md_file

@pytest.mark.parametrize("fpath", Path(".").glob("**/*.md"), ids=str)
def test_docs(fpath):
    check_md_file(fpath=fpath, memory=True)

memory=True enables shared namespace across sequential blocks.

Trade-offs: No skip mechanism — every ```python block runs, with no way to exclude illustrative examples that reference undefined variables. No invisible setup blocks. No fixture support. Blocks that can't run as-is (e.g., those referencing feature_data) would require either restructuring the documentation or not using this tool for those files. Error reporting is basic (raw exception, no pytest assertion rewriting). Least mature of the three.

Considerations

  • Illustrative blocks are the main friction point. Several examples in PYDANTIC_GUIDE.md reference variables defined only in prose context. Sybil handles this with invisible setup blocks. pytest-markdown-docs handles it with a global hook. mktestdocs has no solution — those blocks would need notest-equivalent markup that the tool doesn't offer, or the documentation would need restructuring.
  • Rendered output matters. Annotations in info strings (like continuation or notest) are always visible in the raw source and can leak into rendered output in any renderer that displays the full info string. HTML comments (Sybil's approach) are invisible when rendered.
  • CI integration: all three run as part of a normal pytest invocation. No separate CI step needed.
  • Maintenance burden: Sybil is actively maintained (v10, 9 years of releases). pytest-markdown-docs is maintained by Modal Labs (v0.9). mktestdocs is a small library with infrequent releases (v0.2).
