understand-knowledge: Karpathy wikis using CommonMark [](page.md) links yield 0 deterministic edges #361

@s33dunda

Summary

/understand-knowledge's deterministic parser (parse-knowledge-base.py) extracts links only via [[wikilink]] syntax (WIKILINK_RE). A wiki that follows the Karpathy three-layer pattern in every other respect (it has an index.md, multiple cross-linked .md files, and a schema file) but uses CommonMark [label](page.md) links instead of [[ ]] is detected as karpathy and then produces zero deterministic edges.

This is common for wikis rendered on GitHub/GitLab, since those renderers do not support [[wikilinks]], so authors use standard markdown links.

Steps to reproduce

A minimal wiki:

index.md           # "## Topic" then  - [Alpha](pages/alpha.md)  - [Beta](pages/beta.md)
pages/alpha.md     # body: "relates to [Beta](beta.md)"
pages/beta.md

Run the parser:

python3 parse-knowledge-base.py <dir>

Expected: alpha → beta related edge, and alpha/beta categorized under "Topic".
Actual: ... 0 wikilinks (0 unresolved), edges: 0, and the index category has 0 articles.

The skill still "succeeds," so the degradation is silent — the LLM-analysis phase then has to invent the entire link structure from prose, producing a noisy, unreliable graph instead of the real one.
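For convenience, the minimal wiki above can be generated with a small script. This is only a sketch: the helper name make_minimal_wiki and the placeholder body of pages/beta.md are assumptions made for illustration, since the steps above leave that file's content unspecified.

```python
import tempfile
from pathlib import Path


def make_minimal_wiki(root: Path) -> Path:
    """Write the three-file wiki from the reproduction steps under `root`."""
    pages = root / "pages"
    pages.mkdir(parents=True, exist_ok=True)

    # index.md: one category heading followed by CommonMark links.
    (root / "index.md").write_text(
        "## Topic\n\n- [Alpha](pages/alpha.md)\n- [Beta](pages/beta.md)\n",
        encoding="utf-8",
    )
    # alpha.md links to its sibling with a path relative to pages/.
    (pages / "alpha.md").write_text("relates to [Beta](beta.md)\n", encoding="utf-8")
    # Placeholder body (assumption): the issue does not say what beta.md contains.
    (pages / "beta.md").write_text("# Beta\n", encoding="utf-8")
    return root


if __name__ == "__main__":
    wiki = make_minimal_wiki(Path(tempfile.mkdtemp(prefix="kb-repro-")))
    print(wiki)  # pass this directory to parse-knowledge-base.py
```

The printed directory is what goes in place of <dir> in the parser command.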

Root cause

  • extract_wikilinks / parse_index only scan WIKILINK_RE.
  • resolve_wikilink resolves by filename stem, not by relative path, so [label](path.md) targets are never matched even if extracted.
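Both points can be illustrated in isolation. The snippet below does not use the parser's real code: WIKILINK_LIKE_RE, stem_index, and resolve_by_stem are hypothetical stand-ins that only mimic the described behaviour (wikilink-only extraction, lookup by filename stem).

```python
import re
from pathlib import PurePosixPath

# Assumed stand-in for WIKILINK_RE; the real pattern may differ.
WIKILINK_LIKE_RE = re.compile(r"\[\[([^\]|#]+)")


def stem_index(paths):
    """Map lower-cased filename stems to paths, as stem-based resolution would."""
    return {PurePosixPath(p).stem.lower(): p for p in paths}


def resolve_by_stem(target, index):
    """Look a link target up by stem only; relative paths are not interpreted."""
    return index.get(target.strip().lower())


files = ["index.md", "pages/alpha.md", "pages/beta.md"]
index = stem_index(files)

# 1. A wikilink-only pattern finds nothing in a CommonMark link.
print(WIKILINK_LIKE_RE.findall("relates to [Beta](beta.md)"))  # []

# 2. Even if "beta.md" were extracted, a stem lookup would not match it.
print(resolve_by_stem("beta.md", index))          # None
print(resolve_by_stem("pages/alpha.md", index))   # None
print(resolve_by_stem("beta", index))             # pages/beta.md
```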

Relationship to existing issues

This case — a Karpathy-format wiki that uses CommonMark links — is covered by neither.

Proposed fix

Within the existing Karpathy code path, additionally extract [label](page.md) links and resolve them by normalised relative path, alongside the existing [[ ]] handling (fully backward-compatible). Same treatment for index.md category links. PR incoming.
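A rough sketch of what that could look like, not the contents of the PR. MD_LINK_RE, extract_md_links, resolve_relative, and index_categories are hypothetical names; the sketch assumes wiki paths are stored POSIX-style relative to the wiki root, and it ignores link titles, angle-bracket destinations, and reference-style links.

```python
import posixpath
import re
from urllib.parse import unquote

# Inline CommonMark link [label](target); the lookbehind skips images ![alt](src).
MD_LINK_RE = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]+)\)")
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def extract_md_links(text):
    """Return relative .md targets from inline links, without fragments."""
    targets = []
    for match in MD_LINK_RE.finditer(text):
        target = match.group(2)
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # external URL or same-page anchor
        path = unquote(target.split("#", 1)[0])
        if path.lower().endswith(".md"):
            targets.append(path)
    return targets


def resolve_relative(source, target, known):
    """Resolve `target` against the directory of `source`; None if unknown."""
    candidate = posixpath.normpath(posixpath.join(posixpath.dirname(source), target))
    return candidate if candidate in known else None


def index_categories(index_text, known, index_path="index.md"):
    """Group resolved index links under the nearest preceding heading."""
    categories, current = {}, None
    for line in index_text.splitlines():
        heading = re.match(r"^#{2,6}\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            categories.setdefault(current, [])
        elif current is not None:
            for target in extract_md_links(line):
                resolved = resolve_relative(index_path, target, known)
                if resolved:
                    categories[current].append(resolved)
    return categories


known = {"index.md", "pages/alpha.md", "pages/beta.md"}
print(resolve_relative("pages/alpha.md", "beta.md", known))  # pages/beta.md
print(index_categories("## Topic\n- [Alpha](pages/alpha.md)\n- [Beta](pages/beta.md)\n", known))
```

On the reproduction wiki this yields the alpha → beta edge and places both pages under "Topic"; existing [[ ]] handling would be left untouched.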

Environment

  • understand-anything 2.7.x, skills/understand-knowledge/parse-knowledge-base.py
  • Python 3.13
