Epic: Assimilate Graphify into the AgentOS ecosystem as a knowledge-graph capability #37

@shaun0927

Goal

Assimilate safishamsi/graphify into the AgentOS ecosystem as a first-class knowledge-graph capability, delivered through the ouroboros-plugins repository and the plugin contract described in #27.

The important direction of assimilation is:

external capability repo → AgentOS-native plugin capability

ouroboros-plugins is the packaging, adapter, trust, audit, and distribution surface for that assimilation. It is not the strategic destination by itself. The strategic goal is to make the target repo's capability usable by AgentOS/Ouroboros workflows with native permission, provenance, state, handoff, and audit semantics.

The target user experience should preserve Graphify's existing assistant-oriented workflow while exposing it through the AgentOS command surface:

ooo graphify .
ooo graphify ./docs --update
ooo graphify query "what connects auth to the database?"
ooo graphify path "UserService" "DatabasePool"
ooo graphify explain "RateLimiter"
ooo graphify add https://arxiv.org/abs/1706.03762

The strategic objective is larger than wrapping a CLI: Graphify should become an AgentOS-native plugin capability for turning codebases, documents, SQL schemas, research corpora, media, and generated artifacts into queryable knowledge graphs that AgentOS/Ouroboros can reuse across planning, analysis, review, QA, handoff, and autonomous execution.

This is a concrete implementation candidate for the #27 thesis:

The plugin layer exists to keep core small while allowing the outside world to become Ouroboros-native.

In this issue, "Ouroboros-native" means "integrated into the AgentOS ecosystem through the plugin contract". If successful, Graphify becomes part of the AgentOS capability layer: the OS provides Seed/Ledger/State/Provenance/Permission/Audit/Handoff primitives, while Graphify contributes reusable knowledge-graph memory and navigation without bloating core.


Source capability summary

Graphify is a Python package distributed on PyPI as graphifyy, with the CLI command graphify.

Observed upstream surfaces from safishamsi/graphify:

  • Package entrypoint: graphify = graphify.__main__:main
  • Core skill file: graphify/skill.md
  • Platform skill variants:
    • skill-codex.md
    • skill-aider.md
    • skill-claw.md
    • skill-copilot.md
    • skill-droid.md
    • skill-kiro.md
    • skill-opencode.md
    • skill-pi.md
    • skill-trae.md
    • skill-vscode.md
    • skill-windows.md
  • Assistant registration/install targets include Claude Code, Codex, OpenCode, Cursor, Gemini CLI, GitHub Copilot CLI, VS Code Copilot Chat, Aider, OpenClaw, Factory Droid, Trae, Hermes, Kimi Code, Kiro, Pi, and Google Antigravity.
  • Optional extras include pdf, office, google, video, mcp, neo4j, svg, leiden, ollama, openai, gemini, bedrock, sql, and all.
  • MCP server tools exposed by upstream documentation: query_graph, get_node, get_neighbors, shortest_path, list_prs, get_pr_impact, triage_prs.

Upstream README describes the primary assistant UX as typing /graphify to produce:

graphify-out/
  graph.html
  GRAPH_REPORT.md
  graph.json

For Codex, upstream notes that users invoke $graphify rather than /graphify. In AgentOS/Ouroboros, the native command surface should be ooo graphify ...


Why this belongs in ouroboros-plugins

This is not about making ouroboros-plugins itself the destination of the capability. This repository is the assimilation layer that turns an external repo into an AgentOS-native plugin capability.

Graphify is a reference-quality capability-assimilation case because it exercises several important parts of the plugin contract:

  1. Filesystem read/write boundaries

    • Reads source code, docs, SQL, images, PDFs, office documents, media, and generated graph state.
    • Writes graphify-out/, reports, graph JSON, optional wiki/Obsidian/HTML/SVG/GraphML/Neo4j artifacts, cache files, manifest files, and possibly hook configuration.
  2. Runtime and model execution

    • Local AST extraction can be read-only/local.
    • Semantic extraction may call external LLM APIs or assistant-provided model backends.
    • Optional local Ollama/Claude CLI/Bedrock/OpenAI/Gemini/Kimi backends make model authority explicit.
  3. Audit/provenance requirements

    • Graph outputs become evidence artifacts for future AgentOS/Ouroboros planning, analysis, and handoffs.
    • The plugin must record input roots, file counts, skipped sensitive files, backend/model selection, output paths, graph stats, and generated artifacts.
  4. Handoff value

    • GRAPH_REPORT.md, graph.json, wiki, and query outputs can become durable handoff artifacts for ooo auto, code review, planning, debugging, and research workflows.
  5. Permission/risk granularity

    • Most graph build operations are bounded write operations, and query operations are read_only.
    • Hook installation, Neo4j push, network ingestion, and PR triage require separate trust scopes.
  6. Command-rich external tool assimilation

    • Graphify has a rich existing skill UX. Preserving that experience while adding AgentOS audit/trust/handoff semantics is exactly the difference between a trivial wrapper and an AgentOS-native plugin.

Non-goals

  • Do not vendor or fork the Graphify codebase into Ouroboros core.
  • Do not treat Q00/ouroboros-plugins as the strategic destination; it is the adapter/distribution layer for AgentOS assimilation.
  • Do not make this merely a marketplace listing for Graphify.
  • Do not bypass the plugin firewall by teaching ooo auto Graphify-specific branches.
  • Do not expose destructive or external-write behavior under a broad catch-all permission.
  • Do not silently install global hooks, write to external databases, call paid model APIs, or fetch network content without explicit permission/trust.
  • Do not require every optional Graphify extra for the baseline plugin.

Proposed plugin identity

plugins/graphify/
  ouroboros.plugin.json
  README.md
  graphify_plugin/
    __main__.py

Suggested manifest identity:

{
  "schema_version": "0.1",
  "name": "graphify",
  "version": "0.1.0",
  "description": "Assimilate Graphify knowledge-graph capabilities into AgentOS through auditable Ouroboros plugin commands.",
  "entrypoint": {
    "type": "command",
    "command": "python -m graphify_plugin"
  }
}

The plugin entrypoint should act as a contract-aware adapter around the upstream graphify CLI. It should not reimplement Graphify internals unless a contract gap makes a thin adapter impossible.


Command surface to preserve

Expose the existing Graphify assistant skill UX as ooo graphify ... commands.

Baseline graph build commands

  • ooo graphify / ooo graphify .
    • Build a graph for the current directory.
    • Produces graphify-out/graph.html, graphify-out/GRAPH_REPORT.md, and graphify-out/graph.json.
  • ooo graphify <path>
    • Build a graph for a specific local path.
  • ooo graphify <github-url>
    • Clone a GitHub repository and build a graph for it.
  • ooo graphify <github-url> --branch <branch>
    • Clone a specific branch before graphing.
  • ooo graphify <url1> <url2> ...
    • Build multiple repository graphs and merge them into one cross-repo graph.

Build modifiers

  • --mode deep
    • Request deeper extraction and richer inferred edges.
  • --update
    • Incrementally re-extract only new/changed files.
  • --directed
    • Preserve edge direction.
  • --whisper-model <model>
    • Use a larger Whisper model for media transcription.
  • --cluster-only
    • Re-run clustering on an existing graph.
  • --no-viz
    • Skip HTML visualization.
  • --html
    • Preserve compatibility, even if upstream treats it as a no-op/default.
  • --svg
    • Export graph.svg.
  • --graphml
    • Export graph.graphml.
  • --neo4j
    • Generate Neo4j Cypher output.
  • --neo4j-push <bolt-url>
    • Push graph data to Neo4j.
  • --mcp
    • Start or expose the Graphify MCP stdio server when permitted.
  • --watch
    • Watch and rebuild on code changes.
  • --wiki
    • Build an agent-crawlable wiki.
  • --obsidian --obsidian-dir <path>
    • Write Obsidian vault output.

Corpus ingestion commands

  • ooo graphify add <url>
    • Fetch URL content into ./raw and prepare/update graph state.
  • ooo graphify add <url> --author <name>
    • Record original author metadata.
  • ooo graphify add <url> --contributor <name>
    • Record contributor metadata.

Supported upstream ingestion categories include webpages, arXiv, PDFs, images, YouTube/video/audio, and Twitter/X oEmbed. Each category must be reflected in permissions and audit evidence.

Query/navigation commands

  • ooo graphify query "<question>"
    • Query the generated graph with BFS-style broad context.
  • ooo graphify query "<question>" --dfs
    • Query with DFS-style path tracing.
  • ooo graphify query "<question>" --budget <tokens>
    • Bound answer size.
  • ooo graphify query "<question>" --graph <path>
    • Query a specific graph file.
  • ooo graphify path "<source>" "<target>"
    • Return shortest path between two concepts/nodes.
  • ooo graphify explain "<node>"
    • Explain a matching graph node and its most important connections.

Operational/helper commands to evaluate

These upstream CLI commands exist and should be classified before exposure:

  • install, uninstall
  • platform install helpers: claude, gemini, cursor, vscode, copilot, kiro, pi, aider, codex, opencode, claw, droid, trae, trae-cn, hermes, antigravity
  • prs
  • hook install|uninstall|status
  • watch
  • update
  • hook-check
  • check-update
  • tree
  • merge-driver
  • merge-graphs
  • clone
  • export
  • benchmark
  • global
  • extract
  • cache-check
  • merge-chunks
  • merge-semantic
  • save-result

For the first AgentOS plugin version, prefer exposing only commands that preserve Graphify's main skill experience. Internal maintenance helpers may remain hidden or adapter-only unless they have a clear user-facing contract.
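
One way to keep the first version's surface narrow is an explicit allow-list in the adapter. The sketch below is an assumption about how that could look, not part of the plugin contract: EXPOSED_SUBCOMMANDS, HIDDEN_HELPERS, and normalize are hypothetical names, and which helpers stay hidden is illustrative only (the platform install helpers are omitted for brevity).

```python
# Hypothetical allow-list for a v0.1 adapter; the split is illustrative only.
EXPOSED_SUBCOMMANDS = {"query", "path", "explain", "add"}
HIDDEN_HELPERS = {
    "install", "uninstall", "prs", "hook", "watch", "update", "hook-check",
    "check-update", "tree", "merge-driver", "merge-graphs", "clone", "export",
    "benchmark", "global", "extract", "cache-check", "merge-chunks",
    "merge-semantic", "save-result",
}


def normalize(argv):
    """Map the arguments after `ooo graphify` to a (kind, args) pair.

    kind is "build" for a bare path/URL/flag invocation, the subcommand name
    for exposed subcommands, or "blocked" for helpers that are not exposed.
    """
    if not argv:
        return "build", ["."]  # bare `ooo graphify` graphs the current directory
    head = argv[0]
    if head in EXPOSED_SUBCOMMANDS:
        return head, list(argv)
    if head in HIDDEN_HELPERS:
        return "blocked", list(argv)
    # Anything else is treated as a build target or build flag.
    return "build", list(argv)
```

A directory that happens to share a helper's name (for example `export`) would need to be passed as `./export` under this scheme.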


Proposed command/risk classification

| Command family | Risk | Confirmation | Rationale |
|---|---|---|---|
| query, path, explain | read_only | no | Reads existing graph.json; emits answer/evidence only. |
| Build graph for local path | write | no, if path is repo-bounded | Reads local files and writes graphify-out/. |
| --update, --cluster-only | write | no, if path is repo-bounded | Mutates existing local graph artifacts. |
| --wiki, --obsidian, --svg, --graphml, --neo4j export file generation | write | no, if output is bounded | Writes derived local artifacts. |
| add <url> | write + network:read | maybe | Fetches network content and writes to raw/. |
| GitHub URL clone | write + network:read + shell:execute | yes unless already trusted | Clones external repo into a local cache/workdir. |
| Semantic extraction with cloud backend | write + network:write | yes | Sends document/image/PDF content to configured model backend. |
| --mcp | write/execute | yes | Starts a long-running tool surface. |
| --watch | write/execute | yes | Starts a watcher process that continues after invocation. |
| hook install, merge-driver | write | yes | Mutates git hooks/config. |
| --neo4j-push | write / potentially destructive | yes | Writes to external database. |
| prs --triage / PR dashboard features | read_only or write depending backend | yes for API/model use | Reads GitHub/gh state and may call model backend. |
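
The classification above could be encoded as a small lookup in the adapter. This is a minimal sketch under stated assumptions: Risk, FLAG_RISKS, and classify are hypothetical names, "maybe" and "yes unless already trusted" are both treated conservatively as requiring confirmation, and the repo-bounded path check for local builds is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    level: str     # risk label copied from the proposed classification
    confirm: bool  # True when confirmation should be requested


READ_ONLY = {"query", "path", "explain"}

# Flags that raise the risk of a build above a plain local write.
# Checked in insertion order, most severe first.
FLAG_RISKS = {
    "--neo4j-push": Risk("write / potentially destructive", True),
    "--mcp": Risk("write/execute", True),
    "--watch": Risk("write/execute", True),
}


def classify(argv):
    """Classify the arguments that follow `ooo graphify`."""
    if argv and argv[0] in READ_ONLY:
        return Risk("read_only", False)
    if argv and argv[0] == "add":
        return Risk("write + network:read", True)  # "maybe" in the proposal
    if argv[:2] == ["hook", "install"] or (argv and argv[0] == "merge-driver"):
        return Risk("write", True)
    if any(a.startswith(("http://", "https://")) for a in argv):
        # A URL outside `add` is treated as a repository clone.
        return Risk("write + network:read + shell:execute", True)
    for flag, risk in FLAG_RISKS.items():
        if flag in argv:
            return risk
    return Risk("write", False)  # local build, assuming a repo-bounded path
```

Cloud-backend semantic extraction and PR triage are left out because detecting them depends on backend configuration that this issue does not specify.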

Proposed capabilities

The v0.1 manifest should likely declare:

  • ledger:write
    • Record invocation intent, selected command, graph stats, and result summary.
  • state:write
    • Persist graph build/query state and resumable metadata.
  • provenance:write
    • Record input roots, generated artifacts, backend/model, and source evidence.
  • handoff:attach
    • Attach GRAPH_REPORT.md, graph.json, wiki links, query outputs, and graph summaries to downstream AgentOS/Ouroboros runs.
  • progress:write
    • Surface extraction/chunking/build progress.
  • runtime:execute
    • Launch the upstream graphify CLI in a bounded subprocess.
  • mcp:call or equivalent future capability
    • Only if the plugin exposes or consumes Graphify's MCP server.

Note: the current schema enum has mcp with access values read|write|execute|attach; docs/contract.md mentions mcp:call. This plugin may help expose whether the schema needs a future call access value or whether mcp:execute is sufficient.


Proposed permissions

Baseline required permissions:

  • filesystem:read
    • Read target project/corpus files.
  • filesystem:write
    • Write graphify-out/, cache files, report files, graph files, wiki/Obsidian/export outputs.
  • shell:execute
    • Execute upstream graphify command or python -m graphify.

Optional permissions gated by command/flag:

  • network:read
    • Fetch URLs, clone GitHub repositories, resolve remote resources.
  • network:write
    • Call LLM/model APIs, push to Neo4j, call GitHub APIs for PR triage if needed.
  • github:read
    • PR dashboard / PR impact analysis.
  • mcp:execute or future equivalent
    • Start/use Graphify MCP server.
  • git:write or filesystem:write with explicit reason
    • Install git hooks or merge driver.
  • database:write or network:write
    • Push to Neo4j. If no database permission class exists yet, document this as a schema pressure point.
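
To make the "gated by command/flag" idea concrete, the adapter could derive the permission set for each invocation and compare it with what has been granted. The sketch below is an assumption, not the contract's API: required_permissions and missing_permissions are hypothetical helpers, and the permission strings are the ones proposed in this issue, some of which may not exist in the current schema.

```python
BASELINE = frozenset({"filesystem:read", "filesystem:write", "shell:execute"})
READ_ONLY = {"query", "path", "explain"}


def _is_url(arg):
    return arg.startswith(("http://", "https://"))


def required_permissions(argv):
    """Return the permissions one invocation would need under this sketch."""
    perms = set(BASELINE)
    if argv and argv[0] in READ_ONLY:
        return perms  # a question may mention a URL without fetching it
    if any(_is_url(a) for a in argv):
        perms.add("network:read")  # `add <url>` and GitHub URL clone
    if "--neo4j-push" in argv:
        perms.add("network:write")  # or database:write, if the schema gains it
    if "--mcp" in argv:
        perms.add("mcp:execute")  # or a future mcp:call
    if argv and argv[0] == "prs":
        perms.add("github:read")
    return perms


def missing_permissions(argv, granted):
    """Permissions the invocation needs that have not been granted."""
    return required_permissions(argv) - set(granted)
```

A non-empty result from missing_permissions would map naturally to a blocked status rather than a failure.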

Adapter behavior requirements

The adapter should:

  1. Resolve the Graphify executable safely:
    • prefer graphify on PATH;
    • fall back to python -m graphify if importable;
    • emit an actionable blocked result if missing;
    • recommend uv tool install graphifyy or pipx install graphifyy without auto-installing unless explicitly requested.
  2. Normalize command syntax:
    • map ooo graphify ... to upstream graphify ...;
    • preserve Graphify's assistant skill semantics where possible;
    • keep aliases compatible with /graphify examples in upstream docs.
  3. Bound paths:
    • default target path should be the current project root;
    • writes should default to project-local graphify-out/;
    • cross-repo clone/cache paths must be explicit and auditable.
  4. Emit structured result metadata:
    • status: completed, failed, blocked, or cancelled;
    • generated artifacts;
    • graph node/edge/community counts where available;
    • input file counts and skipped-sensitive counts where available;
    • backend/model and optional extras used;
    • next suggested commands.
  5. Attach handoff artifacts:
    • graphify-out/GRAPH_REPORT.md;
    • graphify-out/graph.json;
    • graphify-out/graph.html if produced;
    • wiki/Obsidian/export paths if requested;
    • query/path/explain outputs as textual evidence.
  6. Respect confirmation gates:
    • network calls;
    • model API calls on user content;
    • hook installation;
    • long-running watch/MCP server modes;
    • Neo4j push;
    • any non-repo-bounded writes.
  7. Preserve upstream UX:
    • a user familiar with /graphify should be able to run ooo graphify with minimal relearning.
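
Requirements 1 and 3 can be sketched with the standard library alone. The function names below (resolve_graphify, blocked_message, is_repo_bounded) are hypothetical; the PATH lookup, the `python -m graphify` fallback, and the install hints follow the list above.

```python
import importlib.util
import shutil
import sys
from pathlib import Path


def resolve_graphify():
    """Return an argv prefix for upstream Graphify, or None if unavailable."""
    exe = shutil.which("graphify")
    if exe:
        return [exe]  # prefer the CLI on PATH
    if importlib.util.find_spec("graphify") is not None:
        return [sys.executable, "-m", "graphify"]  # importable fallback
    return None


def blocked_message():
    """Actionable hint for a blocked result; nothing is auto-installed."""
    return (
        "graphify not found. Install it with `uv tool install graphifyy` "
        "or `pipx install graphifyy`, then retry."
    )


def is_repo_bounded(target, project_root):
    """True when target resolves to project_root or a path inside it."""
    root = Path(project_root).resolve()
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents
```

Resolving both paths before comparing means `..` segments and symlinks cannot be used to step outside the project root unnoticed.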

Implementation phases

Phase 1: Contract design and manifest

  • Add plugins/graphify/ouroboros.plugin.json.
  • Add plugins/graphify/README.md explaining install, trust, permissions, examples, and risk classes.
  • Decide the initial exposed command set.
  • Decide whether operational helpers are exposed or internal-only.
  • Validate the manifest with python3 scripts/validate_contract.py.

Phase 2: Thin adapter MVP

  • Add plugins/graphify/graphify_plugin/__main__.py.
  • Implement command dispatch for:
    • build current path;
    • build explicit local path;
    • --update;
    • query;
    • path;
    • explain.
  • Implement executable detection and missing-dependency blocked output.
  • Capture stdout/stderr/exit code.
  • Parse and summarize known output artifacts.
  • Return clear completion/failure semantics.
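
A minimal sketch of the capture-and-summarize step, assuming the adapter is handed a fully resolved command list. run_adapter and the shape of the returned dictionary are hypothetical; the artifact names are the ones upstream is described as writing to graphify-out/.

```python
import subprocess
from pathlib import Path

EXPECTED_ARTIFACTS = ("graph.html", "GRAPH_REPORT.md", "graph.json")


def run_adapter(cmd, cwd=".", out_dir="graphify-out"):
    """Run the upstream command and return a structured result dictionary."""
    try:
        proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    except FileNotFoundError:
        # Missing executable (or missing cwd): report blocked, not failed.
        return {
            "status": "blocked",
            "reason": f"could not start: {cmd[0]}",
            "artifacts": [],
        }
    out = Path(cwd) / out_dir
    artifacts = [str(out / name) for name in EXPECTED_ARTIFACTS if (out / name).is_file()]
    return {
        "status": "completed" if proc.returncode == 0 else "failed",
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "artifacts": artifacts,
    }
```

Passing the command as a list avoids shell interpretation of user-supplied questions and paths.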

Phase 3: Handoff and provenance enrichment

  • Attach/report generated artifact paths.
  • Record graph stats from graph.json.
  • Record input target path, output directory, Graphify version, and backend/model hints.
  • Emit audit-friendly result JSON in addition to human output.
  • Make query/path/explain outputs reusable as downstream evidence.
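
Recording graph stats might look like the sketch below. The layout of graph.json is an assumption here: a node-link style document with a "nodes" list and a "links" (or "edges") list, with an optional per-node "community" value. None of these keys are confirmed by this issue, so the real upstream format should be checked before relying on it.

```python
import json
from pathlib import Path


def graph_stats(graph_path):
    """Summarize a graph file; the key names are assumed, not confirmed."""
    data = json.loads(Path(graph_path).read_text(encoding="utf-8"))
    nodes = data.get("nodes", [])
    edges = data.get("links", data.get("edges", []))
    communities = {
        node["community"]
        for node in nodes
        if isinstance(node, dict) and "community" in node
    }
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        # None signals "not available" rather than "zero communities".
        "communities": len(communities) if communities else None,
    }
```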

Phase 4: Full skill UX preservation

  • Add support for GitHub URL clone mode.
  • Add support for multiple URL/repo graph merge.
  • Add support for add <url>.
  • Add support for --wiki, --obsidian, --svg, --graphml, --neo4j file exports.
  • Add support for --mcp and --watch with explicit long-running-process behavior.
  • Add support for --neo4j-push behind explicit trust.
  • Add support for PR dashboard/impact/triage if it fits the plugin boundary.

Phase 5: AgentOS integration

  • Document how ooo auto should consume Graphify handoff artifacts without embedding Graphify-specific logic in core.
  • Define a recommended pre-analysis flow:
    • ooo graphify .
    • ooo graphify query "what parts of this codebase are relevant to <goal>?"
    • attach result to Seed/planning/autopilot handoff.
  • Define a recommended review flow:
    • build/update graph;
    • inspect surprising connections/god nodes;
    • attach report to review evidence.
  • Define a recommended research/corpus flow using add, query, and graph handoff artifacts.

Acceptance criteria

This epic is complete when:

  • Graphify has been assimilated into the AgentOS ecosystem through a graphify plugin package under plugins/graphify/.
  • The plugin manifest validates against the current schema.
  • ooo graphify . maps to the Graphify build flow and preserves the expected output artifacts.
  • ooo graphify query, ooo graphify path, and ooo graphify explain work against a generated graph.
  • The plugin records or emits enough evidence for audit/provenance:
    • plugin version;
    • Graphify version;
    • command invoked;
    • target path or graph path;
    • generated artifacts;
    • graph stats;
    • backend/model hints when applicable;
    • permission-sensitive operations used.
  • Network/model/hook/MCP/watch/Neo4j operations are separated behind explicit permissions and confirmation behavior.
  • The README explains how Graphify's /graphify experience maps to ooo graphify.
  • The plugin does not require changes to Ouroboros core beyond the documented plugin contract unless a schema/contract gap is explicitly identified.
  • Any required schema pressure points are filed as follow-up issues instead of silently extending the manifest.
  • The implementation meets the standard for AgentOS assimilation set by #27 (SSOT: UserLevel plugin authoring and capability assimilation): permissioned, auditable, resumable, handoff-capable, and more than a raw command wrapper.
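
The audit/provenance evidence listed in the criteria could be carried in one record. This is a sketch only: AuditRecord and its field names are hypothetical and do not come from the plugin contract. The version lookup uses the PyPI distribution name graphifyy noted earlier in this issue.

```python
import json
from dataclasses import asdict, dataclass, field
from importlib import metadata
from typing import Optional


def graphify_version():
    """Installed Graphify version, or None when the distribution is absent."""
    try:
        return metadata.version("graphifyy")
    except metadata.PackageNotFoundError:
        return None


@dataclass
class AuditRecord:
    plugin_version: str
    command: list
    target: str                              # target path or graph path
    graphify_version: Optional[str] = None
    artifacts: list = field(default_factory=list)
    graph_stats: dict = field(default_factory=dict)
    backend: Optional[str] = None            # backend/model hint, if any
    sensitive_operations: list = field(default_factory=list)

    def to_json(self):
        """Serialize with sorted keys so records diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)
```

For example, `AuditRecord(plugin_version="0.1.0", command=["query", "..."], target="graphify-out/graph.json", graphify_version=graphify_version()).to_json()` yields a single JSON line suitable for a ledger entry.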

Open design questions

  1. Should graphify install / platform skill installation commands be exposed through AgentOS/Ouroboros, or should the AgentOS plugin replace that installation path for AgentOS users?
  2. Should Graphify's MCP server be a plugin subcommand, a declared plugin capability, or both?
  3. Should graphify-out/ become a standard handoff artifact type in AgentOS/Ouroboros, or remain plugin-specific metadata?
  4. Should model/API backend selection be represented as permissions only, or does the plugin contract need a richer runtime/model declaration?
  5. Should Neo4j push require a new database:write permission scope, or is network:write sufficient for v0.1?
  6. How should long-running watch and MCP server processes appear in state/progress/audit events?
  7. Should PR dashboard features live in this plugin or remain separate from github-pr-ops to avoid namespace overlap?
