Emit wiki/ on code-only rebuild so report wikilinks resolve by prkash1704 · Pull Request #382 · safishamsi/graphify

prkash1704 · 2026-04-15T18:18:19Z

Summary

watch._rebuild_code and the cluster-only CLI path regenerate GRAPH_REPORT.md + graph.json but never call to_wiki().
Consequence: the [[_COMMUNITY_<name>]] and [[<god-node>]] wikilinks the report emits dead-end until the user runs a separate wiki export. The AGENTS/CLAUDE.md guidance to "navigate wiki/index.md instead of raw files" resolves to nothing on incremental rebuilds — agents fall back to reading raw source, exactly what the graph is meant to avoid.
This PR wires to_wiki() into both rebuild paths so graphify-out/wiki/ stays in sync with the report on every tick.

to_wiki() already produces rich articles (key concepts with degree, cross-community relationships, source files, audit trail, god-node neighbors grouped by relation) — no changes needed there, just call it.
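The wiring described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `rebuild_outputs`, `write_report`, and `write_wiki` are hypothetical names, and the real `to_wiki()` signature is not shown here, so both writers are passed in as plain callables.

```python
from pathlib import Path
from typing import Callable

# Hypothetical stand-ins: the real report writer and to_wiki() signatures
# are not shown in this PR, so both are passed in as plain callables.
Writer = Callable[[Path], None]


def rebuild_outputs(out_dir: Path, write_report: Writer, write_wiki: Writer) -> None:
    """Regenerate the report and the wiki in the same rebuild step."""
    out_dir.mkdir(parents=True, exist_ok=True)
    write_report(out_dir / "GRAPH_REPORT.md")
    # Emitting the wiki right after the report keeps its wikilinks resolvable.
    write_wiki(out_dir / "wiki")
```

Keeping both writes in one function means a rebuild path cannot refresh the report without also refreshing the wiki.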

Test plan

- Ran _rebuild_code(Path('.')) against a real project (141 communities, 1791 nodes, 3321 edges): graphify-out/wiki/ populated with community and god-node articles plus index.md, all non-empty
- Verified GRAPH_REPORT.md wikilinks now resolve to populated files
- cluster-only path: not exercised end-to-end, but mirrors the same call shape
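The wikilink check in the test plan could be automated along these lines. This is a sketch under one assumption that the PR does not confirm: each `[[target]]` is taken to map to a file named `<target>.md` directly under the wiki directory. `unresolved_wikilinks` is a hypothetical helper, not part of graphify.

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def unresolved_wikilinks(report: Path, wiki_dir: Path) -> list[str]:
    """Return wikilink targets in `report` that lack a non-empty article."""
    text = report.read_text(encoding="utf-8")
    targets = sorted({m.group(1).strip() for m in WIKILINK.finditer(text)})
    missing = []
    for target in targets:
        # Assumption: one article per target, stored as <target>.md.
        article = wiki_dir / f"{target}.md"
        if not article.is_file() or article.stat().st_size == 0:
            missing.append(target)
    return missing
```

An empty return value corresponds to the "all wikilinks resolve to populated files" condition checked manually above.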

🤖 Generated with Claude Code

- Add GitHub Actions CI workflow (Python 3.10 and 3.12)
- Add CI badge to README
- Add ARCHITECTURE.md: pipeline overview, module table, schema, how to add a language extractor, security summary
- Move eval reports from tests/ to worked/httpx/ and worked/mixed-corpus/
- Fix README: test count 163→212, language table (13 languages via tree-sitter), extract.py description, worked examples links

benchmark: 8.8x token reduction on nanoGPT + minGPT + micrograd
- Run AST extraction on 29 Python files across 3 Karpathy repos
- 177 nodes, 246 edges, 17 communities (Leiden)
- 8.8x avg token reduction vs naive full-corpus context stuffing
- Notable: micrograd cleanly splits into engine/nn communities; nanoGPT model vs training loop correctly separated
- Honest: stdlib import noise flagged, config isolates documented

benchmark: 71.5x token reduction on mixed corpus (code+papers+images)
Full run: nanoGPT+minGPT+micrograd + 5 research papers + 4 images
285 nodes, 340 edges, 53 communities
Average BFS query: 1,726 tokens vs 123,488 naive (71.5x)
Code-only (AST) sub-benchmark: 8.8x on 13k-word corpus
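The 71.5x figure follows directly from the two token counts quoted in the commit message. A small sketch of the arithmetic, where `reduction_factor` is a hypothetical helper and not part of the benchmark code:

```python
def reduction_factor(naive_tokens: int, query_tokens: int) -> float:
    """Ratio of naive full-corpus context size to average query context size."""
    if query_tokens <= 0:
        raise ValueError("query_tokens must be positive")
    return naive_tokens / query_tokens


# Figures quoted in the commit message: 123,488 naive vs 1,726 per BFS query.
print(f"{reduction_factor(123_488, 1_726):.1f}x")  # prints 71.5x
```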

…stance, peripheral→hub

style: replace all em dashes with hyphens

fix: explain hidden .graphify/ folder in skill output and README

fix: rename .graphify/ to graphify-out/ so output is visible by default

- Replace pyvis with custom vis.js renderer: node size by degree, click-to-inspect panel with clickable neighbors, search box, community filter, physics clustering by community
- HTML graph generated by default on every run (no --html flag needed)
- Token reduction benchmark auto-runs after every /graphify on corpora >5k words
- Fix 292 edge warnings: silently skip stdlib/external edges in build.py
- Fix build() to merge extractions before building (cross-extraction edges were dropped)
- Add 5 HTML renderer tests (223 total)
- Remove unnecessary files: lib/, tests/eval_attention.py, misplaced eval reports
- Add graphify-out/ and .graphify_*.json to .gitignore
- Bump version to 0.1.4, remove pyvis dependency
- README: token reduction as top-level selling point, vis.js in tech stack, graph.html in output listing, correct test count and install command
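The two build.py fixes listed above (merge extractions before building, and skip edges to stdlib/external names) can be illustrated with a small sketch. The dict schema (`id`, `source`, `target`) and both function names are assumptions made for illustration. The actual build.py code is not shown in this commit message.

```python
def merge_extractions(extractions: list[dict]) -> dict:
    """Combine per-file extractions into one node/edge set (hypothetical schema)."""
    nodes: dict[str, dict] = {}
    edges: list[dict] = []
    for extraction in extractions:
        for node in extraction.get("nodes", []):
            nodes.setdefault(node["id"], node)  # first definition wins
        edges.extend(extraction.get("edges", []))
    return {"nodes": list(nodes.values()), "edges": edges}


def build(extractions: list[dict]) -> dict:
    """Merge first, then keep only edges whose endpoints are both known nodes."""
    merged = merge_extractions(extractions)
    known = {node["id"] for node in merged["nodes"]}
    # Edges to stdlib/external names have no node; drop them without warning.
    kept = [e for e in merged["edges"] if e["source"] in known and e["target"] in known]
    return {"nodes": merged["nodes"], "edges": kept}
```

Filtering only after the merge is the point of the fix: an edge from a node in one file to a node in another would look "external" if each extraction were filtered on its own.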

Covers detect → extract → build → cluster → analyze → report → export using existing fixtures. AST-only (no LLM calls), catches regressions in how modules connect, not just individual module behaviour.

- Semantic extraction chunks: 12-15 → 20-25 files (fewer subagent round trips)
- Code-only corpora skip semantic dispatch entirely (AST covers it)
- Print estimated time before extraction so the wait feels intentional

…hecks, no-viz clarity
- Add --graphml to Usage table (was implemented but undocumented there)
- Remove early manifest save from --update merge step (Step 9 owns it; saving early meant failed pipelines left manifest ahead of graph)
- query/path/explain now check graph.json exists before running, with clear "run /graphify first" message
- --no-viz: clarify it skips both Obsidian vault and HTML (was contradictory)

…image changes

…ify claude install

…HTML, report section

…laude Code hooks
- confidence_score required on every edge (INFERRED: 0.4-0.9, EXTRACTED: 1.0, AMBIGUOUS: 0.1-0.3)
- semantically_similar_to edges for non-obvious cross-file conceptual links
- hyperedges for 3+ node group relationships - fixed cache and merge pipeline that was silently dropping them
- check_semantic_cache returns 4-tuple including cached_hyperedges
- extract.py: mine the "why" - module/class/function docstrings and rationale comments (# NOTE: # IMPORTANT: # HACK: # WHY: # RATIONALE: # TODO: # FIXME:) as rationale_for nodes
- skill.md: rationale_for in relation schema, doc files extract design rationale
- obsidian output opt-in (--obsidian flag) - default output is graph.html + graph.json + GRAPH_REPORT.md only
- hooks.py: post-checkout hook added alongside post-commit - graph rebuilds on branch switch
- claude install: writes .claude/settings.json PreToolUse hook on Glob/Grep - Claude checks graph before searching raw files
- README updated with all v2 features
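The confidence_score ranges listed above lend themselves to a simple per-edge check. The sketch below is an assumption-laden illustration: the commit message names `confidence_score` and the three labels, but the field that carries the label (called `confidence` here) and the `validate_edge` helper are hypothetical.

```python
# Allowed confidence_score range per label, as quoted in the commit message.
SCORE_RANGES = {
    "EXTRACTED": (1.0, 1.0),
    "INFERRED": (0.4, 0.9),
    "AMBIGUOUS": (0.1, 0.3),
}


def validate_edge(edge: dict) -> list[str]:
    """Return a list of problems with an edge's confidence fields (empty if fine)."""
    label = edge.get("confidence")  # hypothetical field name for the label
    score = edge.get("confidence_score")
    if label not in SCORE_RANGES:
        return [f"unknown confidence label: {label!r}"]
    if not isinstance(score, (int, float)) or isinstance(score, bool):
        return ["confidence_score is required and must be a number"]
    low, high = SCORE_RANGES[label]
    if not low <= score <= high:
        return [f"{label} score {score} outside [{low}, {high}]"]
    return []
```

Returning a list of messages rather than raising lets a caller collect every bad edge in one pass before deciding whether to fail.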

…fy claude install section

…section

…, OpenClaw)

…ort.py, bound collision loop

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>