Emit wiki/ on code-only rebuild so report wikilinks resolve #382
Open
prkash1704 wants to merge 144 commits into safishamsi:main
- Add GitHub Actions CI workflow (Python 3.10 and 3.12)
- Add CI badge to README
- Add ARCHITECTURE.md: pipeline overview, module table, schema, how to add a language extractor, security summary
- Move eval reports from tests/ to worked/httpx/ and worked/mixed-corpus/
- Fix README: test count 163→212, language table (13 languages via tree-sitter), extract.py description, worked examples links

benchmark: 8.8x token reduction on nanoGPT + minGPT + micrograd
- Run AST extraction on 29 Python files across 3 Karpathy repos
- 177 nodes, 246 edges, 17 communities (Leiden)
- 8.8x avg token reduction vs naive full-corpus context stuffing
- Notable: micrograd cleanly splits into engine/nn communities; nanoGPT model vs training loop correctly separated
- Honest: stdlib import noise flagged, config isolates documented

benchmark: 71.5x token reduction on mixed corpus (code+papers+images)
Full run: nanoGPT+minGPT+micrograd + 5 research papers + 4 images
285 nodes, 340 edges, 53 communities
Average BFS query: 1,726 tokens vs 123,488 naive (71.5x)
Code-only (AST) sub-benchmark: 8.8x on 13k-word corpus
…stance, peripheral→hub
style: replace all em dashes with hyphens
fix: explain hidden .graphify/ folder in skill output and README
fix: rename .graphify/ to graphify-out/ so output is visible by default

- Replace pyvis with custom vis.js renderer: node size by degree, click-to-inspect panel with clickable neighbors, search box, community filter, physics clustering by community
- HTML graph generated by default on every run (no --html flag needed)
- Token reduction benchmark auto-runs after every /graphify on corpora >5k words
- Fix 292 edge warnings: silently skip stdlib/external edges in build.py
- Fix build() to merge extractions before building (cross-extraction edges were dropped)
- Add 5 HTML renderer tests (223 total)
- Remove unnecessary files: lib/, tests/eval_attention.py, misplaced eval reports
- Add graphify-out/ and .graphify_*.json to .gitignore
- Bump version to 0.1.4, remove pyvis dependency
- README: token reduction as top-level selling point, vis.js in tech stack, graph.html in output listing, correct test count and install command
Covers detect → extract → build → cluster → analyze → report → export using existing fixtures. AST-only (no LLM calls), catches regressions in how modules connect, not just individual module behaviour.
- Semantic extraction chunks: 12-15 → 20-25 files (fewer subagent round trips)
- Code-only corpora skip semantic dispatch entirely (AST covers it)
- Print estimated time before extraction so the wait feels intentional

…hecks, no-viz clarity
- Add --graphml to Usage table (was implemented but undocumented there)
- Remove early manifest save from --update merge step (Step 9 owns it; saving early meant failed pipelines left manifest ahead of graph)
- query/path/explain now check graph.json exists before running, with clear "run /graphify first" message
- --no-viz: clarify it skips both Obsidian vault and HTML (was contradictory)
…ify claude install
…HTML, report section
…laude Code hooks
- confidence_score required on every edge (INFERRED: 0.4-0.9, EXTRACTED: 1.0, AMBIGUOUS: 0.1-0.3)
- semantically_similar_to edges for non-obvious cross-file conceptual links
- hyperedges for 3+ node group relationships - fixed cache and merge pipeline that was silently dropping them
- check_semantic_cache returns 4-tuple including cached_hyperedges
- extract.py: mine the "why" - module/class/function docstrings and rationale comments (# NOTE: # IMPORTANT: # HACK: # WHY: # RATIONALE: # TODO: # FIXME:) as rationale_for nodes
- skill.md: rationale_for in relation schema, doc files extract design rationale
- obsidian output opt-in (--obsidian flag) - default output is graph.html + graph.json + GRAPH_REPORT.md only
- hooks.py: post-checkout hook added alongside post-commit - graph rebuilds on branch switch
- claude install: writes .claude/settings.json PreToolUse hook on Glob/Grep - Claude checks graph before searching raw files
- README updated with all v2 features
…fy claude install section
…ort.py, bound collision loop Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…aceholder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…afishamsi#195: skill.md requires general-purpose subagent type for extraction dispatch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… and bump to 0.4.2
- extract.py: use str(path) for node IDs to prevent same-basename collision (safishamsi#211)
- build.py: normalize from/to edge keys before KeyError (safishamsi#216)
- export.py: guard ZeroDivisionError when graph has no edges (safishamsi#217)
- hooks.py: remove stale CODE_EXTS filter, rebuild on any changed file (safishamsi#222)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…hamsi#221 into 0.4.2
- build/validate: accept NetworkX <=3.1 "links" key alongside "edges" (safishamsi#212)
- __main__: skip version check during install/uninstall, deduplicate paths (safishamsi#220)
- all file IO: explicit encoding="utf-8" to prevent crashes on Windows CJK locales (safishamsi#204)
- hooks: add newline="\n" on write to prevent CRLF shebang breakage on Windows (safishamsi#204)
- export: strip trailing .md from safe_name so "CLAUDE.md" doesn't become "CLAUDE.md.md" (safishamsi#221)
- report: add Community Hubs navigation block so Obsidian vault stays connected (safishamsi#221)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…, safishamsi#254 and bump to 0.4.3
- extract.py: resolve relative JS/TS imports to full-path IDs (fixes 0 import edges on TS codebases) (safishamsi#256)
- extract.py: resolve relative Python imports to full-path IDs (safishamsi#256)
- watch.py: merge fresh AST with existing semantic nodes instead of overwriting (safishamsi#253)
- hooks.py: add python fallback after python3 for Windows; exit 0 if neither found (safishamsi#244)
- analyze.py: guard stale _src/_tgt hints with node membership check (safishamsi#226)
- detect.py + extract.py: add .vue and .svelte to CODE_EXTENSIONS and _DISPATCH (safishamsi#254)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- watch.py: preserve INFERRED/AMBIGUOUS edges (code<->doc) across rebuilds (safishamsi#261)
- __main__.py: fix Codex hook - use additionalContext instead of permissionDecision:allow (safishamsi#249)
- detect.py: skip common lockfiles (package-lock.json, yarn.lock, Cargo.lock etc.) (safishamsi#266)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Some MCP clients send blank lines between JSON messages. The stdio transport tried to parse every line as JSONRPCMessage, crashing with a Pydantic ValidationError. _filter_blank_stdin() installs an OS-level pipe that relays stdin while silently dropping blank-only lines.
Closes safishamsi#201
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s, fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… path bug, .graphifyignore subfolder patterns; v0.4.10: Dart, Hermes, 6 CLI commands, PHP improvements Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…NTS.md python3 fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e plugin, cache root, PHP missing edges, Windows stability, cross-file calls
- safishamsi#352: add skill-kiro.md to pyproject.toml package-data
- safishamsi#341: guard edge_betweenness at >5000 nodes; use approximate k=100 for suggest_questions on large graphs
- safishamsi#354/safishamsi#229: add Step 6b in skill.md to call to_wiki() when --wiki given (before Step 9 cleanup)
- safishamsi#356: call _install_opencode_plugin() from install --platform opencode path
- safishamsi#350: add cache_root param to extract() so subdirectory runs keep cache at ./graphify-out/cache/
- safishamsi#230: PHP class_constant_access_expression emits references_constant edges
- safishamsi#232: PHP scoped_call_expression (static method calls) emits calls edges
- safishamsi#287: os.replace fallback for Windows WinError 5; graphify update exits 1 on failure; templates use graphify update . instead of python3 -c
- safishamsi#348: cross-file call resolution for all languages via raw_calls + global label map pass in extract()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The watch._rebuild_code and `cluster-only` CLI paths regenerate GRAPH_REPORT.md + graph.json but never call to_wiki(). Result: the [[_COMMUNITY_...]] and [[<god-node>]] wikilinks the report emits dead-end unless the user runs a separate wiki export, and the AGENTS/CLAUDE.md guidance to "navigate wiki/index.md instead of raw files" resolves to nothing on incremental rebuilds. Wire to_wiki() into both rebuild paths so graphify-out/wiki/ stays in sync with the report on every tick. Passes through community_labels, cohesion, and a god-node list derived from analyze.god_nodes().
The god-node article emits [[<neighbor_label>]] for every neighbor, but to_wiki() only writes articles for community hubs and god nodes. Non-god neighbors have no landing page, so ~40% of god-node wikilinks dead-end. Track which labels will have articles and route the rest through the community page containing that neighbor: **<label>** (in [[<community>]]). Readers still get to a real page; agents don't chase ghosts. Verified on a 1791-node / 141-community project: broken-link ratio dropped from 43.8% (272 / 621) to 0% (0 / 484).
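The routing rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `render_neighbor`, `article_labels`, and `community_of` are hypothetical names, and the plain-bold fallback for a neighbor with no known community is an assumption added to keep the sketch total.

```python
def render_neighbor(label: str, article_labels: set[str], community_of: dict[str, str]) -> str:
    """Render one neighbor entry for a god-node article.

    article_labels: labels that will get their own wiki page (hypothetical input).
    community_of:   maps a node label to its community page name (hypothetical input).
    """
    if label in article_labels:
        # The neighbor has a landing page, so a direct wikilink is safe.
        return f"[[{label}]]"
    community = community_of.get(label)
    if community is None:
        # Assumption: with no known community, emit plain bold text and no link.
        return f"**{label}**"
    # Route through the community page that contains the neighbor.
    return f"**{label}** (in [[{community}]])"


if __name__ == "__main__":
    articles = {"Graph", "_COMMUNITY_Core"}
    communities = {"helper": "_COMMUNITY_Core"}
    print(render_neighbor("Graph", articles, communities))   # [[Graph]]
    print(render_neighbor("helper", articles, communities))  # **helper** (in [[_COMMUNITY_Core]])
```

Deciding per neighbor at render time keeps the set of emitted wikilinks a subset of the pages that are actually written, which is what drives the broken-link count to zero.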
Author
Follow-up commit on this branch: fixed a related dead-link issue in god-node articles. The god-node connection list now wikilinks only neighbors that have their own article; every other neighbor is rendered as **<label>** (in [[<community>]]), which routes through the community page containing it.

Verified on a 1791-node / 141-community project: broken-link ratio dropped from 43.8% (272/621) to 0% (0/484).

Happy to split into a separate PR if you'd prefer to review them independently.
Summary
- `watch._rebuild_code` and the `cluster-only` CLI path regenerate `GRAPH_REPORT.md` + `graph.json` but never call `to_wiki()`.
- `[[_COMMUNITY_<name>]]` and `[[<god-node>]]` wikilinks the report emits dead-end until the user runs a separate wiki export. The AGENTS/CLAUDE.md guidance to "navigate wiki/index.md instead of raw files" resolves to nothing on incremental rebuilds, so agents fall back to reading raw source, exactly what the graph is meant to avoid.
- Wire `to_wiki()` into both rebuild paths so `graphify-out/wiki/` stays in sync with the report on every tick.
- `to_wiki()` already produces rich articles (key concepts with degree, cross-community relationships, source files, audit trail, god-node neighbors grouped by relation); no changes needed there, just call it.

Test plan
- `_rebuild_code(Path('.'))` against a real project (141 communities, 1791 nodes, 3321 edges): `graphify-out/wiki/` populated with community + god-node articles + index.md, all non-empty
- `GRAPH_REPORT.md` wikilinks now resolve to populated files
- `cluster-only` path (not exercised end-to-end, but mirrors the same call shape)

🤖 Generated with Claude Code
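A check like the one in the test plan could be scripted along these lines. This is a hedged sketch rather than anything in the PR: `count_broken_wikilinks` is a hypothetical helper, and it assumes every `[[target]]` maps to `<wiki_dir>/<target>.md`, which may not match the file naming `to_wiki()` actually uses.

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def count_broken_wikilinks(markdown: str, wiki_dir: Path) -> tuple[int, int]:
    """Return (broken, total) over every wikilink occurrence in `markdown`.

    Assumption: a link resolves when <wiki_dir>/<target>.md exists and is non-empty.
    """
    targets = [m.group(1).strip() for m in WIKILINK.finditer(markdown)]

    def resolves(target: str) -> bool:
        page = wiki_dir / f"{target}.md"
        return page.is_file() and page.stat().st_size > 0

    broken = sum(1 for t in targets if not resolves(t))
    return broken, len(targets)


if __name__ == "__main__":
    report = Path("graphify-out/GRAPH_REPORT.md")
    if report.is_file():
        broken, total = count_broken_wikilinks(
            report.read_text(encoding="utf-8"), Path("graphify-out/wiki")
        )
        print(f"{broken} / {total} wikilinks do not resolve")
```

Counting per occurrence rather than per unique target matches how a ratio such as 272 / 621 would be read; switching to unique targets only requires wrapping `targets` in a `set`.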