## Epic: Assimilate Graphify into the AgentOS ecosystem as a knowledge-graph capability

### Goal

Assimilate [`safishamsi/graphify`](https://github.com/safishamsi/graphify) into the AgentOS ecosystem as a first-class knowledge-graph capability, delivered through the `ouroboros-plugins` repository and the plugin contract described in #27.

The direction of assimilation matters:

```text
external capability repo → AgentOS-native plugin capability
```

`ouroboros-plugins` is the packaging, adapter, trust, audit, and distribution surface for that assimilation. It is not the strategic destination by itself. The strategic goal is to make the target repo's capability usable by AgentOS/Ouroboros workflows with native permission, provenance, state, handoff, and audit semantics.

The target user experience should preserve Graphify's existing assistant-oriented workflow while exposing it through the AgentOS command surface:

```bash
ooo graphify .
ooo graphify ./docs --update
ooo graphify query "what connects auth to the database?"
ooo graphify path "UserService" "DatabasePool"
ooo graphify explain "RateLimiter"
ooo graphify add https://arxiv.org/abs/1706.03762
```

The strategic objective is larger than wrapping a CLI: Graphify should become an AgentOS-native plugin capability for turning codebases, documents, SQL schemas, research corpora, media, and generated artifacts into queryable knowledge graphs that AgentOS/Ouroboros can reuse across planning, analysis, review, QA, handoff, and autonomous execution.

This is a concrete implementation candidate for the #27 thesis:

> The plugin layer exists to keep core small while allowing the outside world to become Ouroboros-native.

In this issue, "Ouroboros-native" means "integrated into the AgentOS ecosystem through the plugin contract". If successful, Graphify becomes part of the AgentOS capability layer: the OS provides Seed/Ledger/State/Provenance/Permission/Audit/Handoff primitives, while Graphify contributes reusable knowledge-graph memory and navigation without bloating core.

---

## Source capability summary

Graphify is a Python package distributed on PyPI as `graphifyy`, with the CLI command `graphify`.

Observed upstream surfaces from `safishamsi/graphify`:

- Package entrypoint: `graphify = graphify.__main__:main`
- Core skill file: `graphify/skill.md`
- Platform skill variants:
  - `skill-codex.md`
  - `skill-aider.md`
  - `skill-claw.md`
  - `skill-copilot.md`
  - `skill-droid.md`
  - `skill-kiro.md`
  - `skill-opencode.md`
  - `skill-pi.md`
  - `skill-trae.md`
  - `skill-vscode.md`
  - `skill-windows.md`
- Assistant registration/install targets include Claude Code, Codex, OpenCode, Cursor, Gemini CLI, GitHub Copilot CLI, VS Code Copilot Chat, Aider, OpenClaw, Factory Droid, Trae, Hermes, Kimi Code, Kiro, Pi, and Google Antigravity.
- Optional extras include `pdf`, `office`, `google`, `video`, `mcp`, `neo4j`, `svg`, `leiden`, `ollama`, `openai`, `gemini`, `bedrock`, `sql`, and `all`.
- MCP server tools listed in upstream documentation: `query_graph`, `get_node`, `get_neighbors`, `shortest_path`, `list_prs`, `get_pr_impact`, `triage_prs`.

The upstream README describes the primary assistant UX as typing `/graphify`, which produces:

```text
graphify-out/
  graph.html
  GRAPH_REPORT.md
  graph.json
```

For Codex, upstream notes that users invoke `$graphify` rather than `/graphify`. In AgentOS/Ouroboros, the desired native command surface should be `ooo graphify ...`.

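The mapping between the invocation forms above can be sketched in a few lines. This is a minimal illustration, not the adapter's actual design: `to_upstream_argv` and `ASSISTANT_PREFIXES` are hypothetical names, and whether the upstream CLI accepts every resulting argument list is not verified here.

```python
import shlex
from typing import List

# Invocation prefixes mentioned in this issue: "/graphify" (assistant skill)
# and "$graphify" (Codex). Treating them as plain aliases is an assumption.
ASSISTANT_PREFIXES = ("/graphify", "$graphify")


def to_upstream_argv(command_line: str) -> List[str]:
    """Translate an `ooo graphify ...` or assistant-style line into upstream argv."""
    tokens = shlex.split(command_line)
    if tokens[:2] == ["ooo", "graphify"]:
        rest = tokens[2:]
    elif tokens and tokens[0] in ASSISTANT_PREFIXES:
        rest = tokens[1:]
    else:
        raise ValueError(f"not a graphify invocation: {command_line!r}")
    # A bare invocation is treated as "build the current directory".
    return ["graphify", *(rest or ["."])]
```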
---

## Why this belongs in `ouroboros-plugins`

This is not about making `ouroboros-plugins` itself the destination of the capability. This repository is the assimilation layer that turns an external repo into an AgentOS-native plugin capability.

Graphify is a reference-quality capability-assimilation case because it exercises several important parts of the plugin contract:

1. **Filesystem read/write boundaries**
   - Reads source code, docs, SQL, images, PDFs, office documents, media, and generated graph state.
   - Writes `graphify-out/`, reports, graph JSON, optional wiki/Obsidian/HTML/SVG/GraphML/Neo4j artifacts, cache files, manifest files, and possibly hook configuration.

2. **Runtime and model execution**
   - AST extraction can run entirely locally with read-only access to the source files.
   - Semantic extraction may call external LLM APIs or assistant-provided model backends.
   - Optional backends (local Ollama, Claude CLI, Bedrock, OpenAI, Gemini, Kimi) mean the plugin must make model authority explicit.

3. **Audit/provenance requirements**
   - Graph outputs become evidence artifacts for future AgentOS/Ouroboros planning, analysis, and handoffs.
   - The plugin must record input roots, file counts, skipped sensitive files, backend/model selection, output paths, graph stats, and generated artifacts.

4. **Handoff value**
   - `GRAPH_REPORT.md`, `graph.json`, wiki, and query outputs can become durable handoff artifacts for `ooo auto`, code review, planning, debugging, and research workflows.

5. **Permission/risk granularity**
   - Most graph build/query operations are bounded `write` or `read_only` operations.
   - Hook installation, Neo4j push, network ingestion, and PR triage require separate trust scopes.

6. **Command-rich external tool assimilation**
   - Graphify has a rich existing skill UX. Preserving that experience while adding AgentOS audit/trust/handoff semantics is exactly the difference between a trivial wrapper and an AgentOS-native plugin.

---

## Non-goals

- Do not vendor or fork the Graphify codebase into Ouroboros core.
- Do not treat `Q00/ouroboros-plugins` as the strategic destination; it is the adapter/distribution layer for AgentOS assimilation.
- Do not make this merely a marketplace listing for Graphify.
- Do not bypass the plugin firewall by teaching `ooo auto` Graphify-specific branches.
- Do not expose destructive or external-write behavior under a broad catch-all permission.
- Do not silently install global hooks, write to external databases, call paid model APIs, or fetch network content without explicit permission/trust.
- Do not require every optional Graphify extra for the baseline plugin.

---

## Proposed plugin identity

```text
plugins/graphify/
  ouroboros.plugin.json
  README.md
  graphify_plugin/
    __main__.py
```

Suggested manifest identity:

```json
{
  "schema_version": "0.1",
  "name": "graphify",
  "version": "0.1.0",
  "description": "Assimilate Graphify knowledge-graph capabilities into AgentOS through auditable Ouroboros plugin commands.",
  "entrypoint": {
    "type": "command",
    "command": "python -m graphify_plugin"
  }
}
```

The plugin entrypoint should act as a contract-aware adapter around the upstream `graphify` CLI. It should not reimplement Graphify internals unless a contract gap makes a thin adapter impossible.

---

## Command surface to preserve

Expose the existing Graphify assistant skill UX as `ooo graphify ...` commands.

### Baseline graph build commands

- `ooo graphify` / `ooo graphify .`
  - Build a graph for the current directory.
  - Produces `graphify-out/graph.html`, `graphify-out/GRAPH_REPORT.md`, and `graphify-out/graph.json`.
- `ooo graphify <path>`
  - Build a graph for a specific local path.
- `ooo graphify <github-url>`
  - Clone a GitHub repository and build a graph for it.
- `ooo graphify <github-url> --branch <branch>`
  - Clone a specific branch before graphing.
- `ooo graphify <url1> <url2> ...`
  - Build multiple repository graphs and merge them into one cross-repo graph.

### Build modifiers

- `--mode deep`
  - Request deeper extraction and richer inferred edges.
- `--update`
  - Incrementally re-extract only new/changed files.
- `--directed`
  - Preserve edge direction.
- `--whisper-model <model>`
  - Select the Whisper model used for media transcription, for example a larger one.
- `--cluster-only`
  - Re-run clustering on an existing graph.
- `--no-viz`
  - Skip HTML visualization.
- `--html`
  - Accept the flag for compatibility, even if upstream treats it as a no-op because HTML output is the default.
- `--svg`
  - Export `graph.svg`.
- `--graphml`
  - Export `graph.graphml`.
- `--neo4j`
  - Generate Neo4j Cypher output.
- `--neo4j-push <bolt-url>`
  - Push graph data to Neo4j.
- `--mcp`
  - Start or expose the Graphify MCP stdio server when permitted.
- `--watch`
  - Watch and rebuild on code changes.
- `--wiki`
  - Build an agent-crawlable wiki.
- `--obsidian --obsidian-dir <path>`
  - Write Obsidian vault output.

### Corpus ingestion commands

- `ooo graphify add <url>`
  - Fetch URL content into `./raw` and prepare/update graph state.
- `ooo graphify add <url> --author <name>`
  - Record original author metadata.
- `ooo graphify add <url> --contributor <name>`
  - Record contributor metadata.

Supported upstream ingestion categories include webpages, arXiv, PDFs, images, YouTube/video/audio, and Twitter/X oEmbed. Each category must be reflected in permissions and audit evidence.

### Query/navigation commands

- `ooo graphify query "<question>"`
  - Query the generated graph with BFS-style broad context.
- `ooo graphify query "<question>" --dfs`
  - Query with DFS-style path tracing.
- `ooo graphify query "<question>" --budget <tokens>`
  - Bound answer size.
- `ooo graphify query "<question>" --graph <path>`
  - Query a specific graph file.
- `ooo graphify path "<source>" "<target>"`
  - Return shortest path between two concepts/nodes.
- `ooo graphify explain "<node>"`
  - Explain a matching graph node and its most important connections.

### Operational/helper commands to evaluate

These upstream CLI commands exist and should be classified before exposure:

- `install`, `uninstall`
- platform install helpers: `claude`, `gemini`, `cursor`, `vscode`, `copilot`, `kiro`, `pi`, `aider`, `codex`, `opencode`, `claw`, `droid`, `trae`, `trae-cn`, `hermes`, `antigravity`
- `prs`
- `hook install|uninstall|status`
- `watch`
- `update`
- `hook-check`
- `check-update`
- `tree`
- `merge-driver`
- `merge-graphs`
- `clone`
- `export`
- `benchmark`
- `global`
- `extract`
- `cache-check`
- `merge-chunks`
- `merge-semantic`
- `save-result`

For the first AgentOS plugin version, prefer exposing only commands that preserve Graphify's main skill experience. Internal maintenance helpers may remain hidden or adapter-only unless they have a clear user-facing contract.

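One way to keep helpers hidden is an explicit allowlist in the adapter's dispatcher. The sketch below is illustrative only: `route`, `EXPOSED`, and `ADAPTER_ONLY` are hypothetical names, and the membership of each set is a placeholder, since the exposed command set is still an open Phase 1 decision. A local directory named like a subcommand (for example `query`) would be ambiguous under this scheme.

```python
from typing import List, Tuple

# Placeholder sets: which commands end up exposed is not decided in this issue.
EXPOSED = frozenset({"query", "path", "explain", "add"})
ADAPTER_ONLY = frozenset({
    "install", "uninstall", "hook", "hook-check", "check-update",
    "merge-driver", "merge-graphs", "cache-check", "merge-chunks",
    "merge-semantic", "save-result",
})


def route(args: List[str]) -> Tuple[str, List[str]]:
    """Return (decision, args) where decision is 'build', 'exposed', or 'blocked'."""
    if not args:
        return ("build", ["."])  # bare `ooo graphify` builds the current directory
    head = args[0]
    if head in EXPOSED:
        return ("exposed", args)
    if head in ADAPTER_ONLY:
        return ("blocked", args)  # helper exists upstream but is not user-facing
    return ("build", args)  # anything else is treated as a path or URL target
```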
---

## Proposed command/risk classification

| Command family | Risk | Confirmation | Rationale |
|---|---:|---:|---|
| `query`, `path`, `explain` | `read_only` | no | Reads existing `graph.json`; emits answer/evidence only. |
| Build graph for local path | `write` | no, if path is repo-bounded | Reads local files and writes `graphify-out/`. |
| `--update`, `--cluster-only` | `write` | no, if path is repo-bounded | Mutates existing local graph artifacts. |
| `--wiki`, `--obsidian`, `--svg`, `--graphml`, `--neo4j` export file generation | `write` | no, if output is bounded | Writes derived local artifacts. |
| `add <url>` | `write` + `network:read` | maybe | Fetches network content and writes to `raw/`. |
| GitHub URL clone | `write` + `network:read` + `shell:execute` | yes unless already trusted | Clones external repo into a local cache/workdir. |
| Semantic extraction with cloud backend | `write` + `network:write` | yes | Sends document/image/PDF content to configured model backend. |
| `--mcp` | `write`/`execute` | yes | Starts a long-running tool surface. |
| `--watch` | `write`/`execute` | yes | Starts a watcher process that continues after invocation. |
| `hook install`, `merge-driver` | `write` | yes | Mutates git hooks/config. |
| `--neo4j-push` | `write` / potentially `destructive` | yes | Writes to external database. |
| `prs --triage` / PR dashboard features | `read_only` or `write` depending on backend | yes for API/model use | Reads GitHub/gh state and may call model backend. |

---
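A few rows of the table can be expressed as data so that the confirmation decision becomes a lookup plus a path check. This is a sketch under assumptions: `RiskRule`, `RULES`, `is_repo_bounded`, and `needs_confirmation` are hypothetical names, only some rows are encoded, and the table's "maybe" for `add <url>` is treated conservatively as "yes".

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Tuple


@dataclass(frozen=True)
class RiskRule:
    risk: Tuple[str, ...]
    confirm: str  # "no", "yes", or "if_unbounded"


# Subset of the table above; the family keys are illustrative.
RULES: Dict[str, RiskRule] = {
    "query": RiskRule(("read_only",), "no"),
    "path": RiskRule(("read_only",), "no"),
    "explain": RiskRule(("read_only",), "no"),
    "build_local": RiskRule(("write",), "if_unbounded"),
    "add_url": RiskRule(("write", "network:read"), "yes"),  # table says "maybe"
    "neo4j_push": RiskRule(("write", "destructive"), "yes"),
}


def is_repo_bounded(target: Path, repo_root: Path) -> bool:
    """True if target, resolved against repo_root, stays inside repo_root."""
    root = repo_root.resolve()
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents


def needs_confirmation(family: str, target: Path, repo_root: Path) -> bool:
    """Apply the rule for a command family to a concrete target path."""
    rule = RULES[family]
    if rule.confirm == "if_unbounded":
        return not is_repo_bounded(target, repo_root)
    return rule.confirm == "yes"
```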

## Proposed capabilities

The v0.1 manifest should likely declare:

- `ledger:write`
  - Record invocation intent, selected command, graph stats, and result summary.
- `state:write`
  - Persist graph build/query state and resumable metadata.
- `provenance:write`
  - Record input roots, generated artifacts, backend/model, and source evidence.
- `handoff:attach`
  - Attach `GRAPH_REPORT.md`, `graph.json`, wiki links, query outputs, and graph summaries to downstream AgentOS/Ouroboros runs.
- `progress:write`
  - Surface extraction/chunking/build progress.
- `runtime:execute`
  - Launch the upstream `graphify` CLI in a bounded subprocess.
- `mcp:call` or equivalent future capability
  - Only if the plugin exposes or consumes Graphify's MCP server.

Note: the current schema enum has `mcp` with access values `read|write|execute|attach`; `docs/contract.md` mentions `mcp:call`. This plugin may help expose whether the schema needs a future `call` access value or whether `mcp:execute` is sufficient.

---

## Proposed permissions

Baseline required permissions:

- `filesystem:read`
  - Read target project/corpus files.
- `filesystem:write`
  - Write `graphify-out/`, cache files, report files, graph files, wiki/Obsidian/export outputs.
- `shell:execute`
  - Execute upstream `graphify` command or `python -m graphify`.

Optional permissions gated by command/flag:

- `network:read`
  - Fetch URLs, clone GitHub repositories, resolve remote resources.
- `network:write`
  - Call LLM/model APIs, push to Neo4j, call GitHub APIs for PR triage if needed.
- `github:read`
  - PR dashboard / PR impact analysis.
- `mcp:execute` or future equivalent
  - Start/use Graphify MCP server.
- `git:write` or `filesystem:write` with explicit reason
  - Install git hooks or merge driver.
- `database:write` or `network:write`
  - Push to Neo4j. If no database permission class exists yet, document this as a schema pressure point.

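The flag-gated permissions above could be derived from the argument list before anything runs. The following is a rough sketch, assuming the permission names proposed in this section; `required_permissions` is a hypothetical helper, and cloud model backends are not detectable from flags alone, so they are not covered.

```python
from typing import List

BASELINE = ("filesystem:read", "filesystem:write", "shell:execute")


def required_permissions(args: List[str]) -> List[str]:
    """Return baseline permissions plus those implied by the given arguments."""
    perms = list(BASELINE)

    def add(permission: str) -> None:
        if permission not in perms:
            perms.append(permission)

    # URL targets (clone or `add <url>`) need network reads.
    if any(a.startswith(("http://", "https://")) for a in args):
        add("network:read")
    # Pushing to Neo4j writes to an external system; `database:write` may replace this.
    if "--neo4j-push" in args:
        add("network:write")
    if "--mcp" in args:
        add("mcp:execute")
    # Hook and merge-driver installation mutate git configuration.
    if args[:2] == ["hook", "install"] or args[:1] == ["merge-driver"]:
        add("git:write")
    if args[:1] == ["prs"]:
        add("github:read")
    return perms
```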
---

## Adapter behavior requirements

The adapter should:

1. Resolve the Graphify executable safely:
   - prefer `graphify` on PATH;
   - fall back to `python -m graphify` if importable;
   - emit an actionable blocked result if missing;
   - recommend `uv tool install graphifyy` or `pipx install graphifyy` without auto-installing unless explicitly requested.
2. Normalize command syntax:
   - map `ooo graphify ...` to upstream `graphify ...`;
   - preserve Graphify's assistant skill semantics where possible;
   - keep aliases compatible with `/graphify` examples in upstream docs.
3. Bound paths:
   - default target path should be the current project root;
   - writes should default to project-local `graphify-out/`;
   - cross-repo clone/cache paths must be explicit and auditable.
4. Emit structured result metadata:
   - status: `completed`, `failed`, `blocked`, or `cancelled`;
   - generated artifacts;
   - graph node/edge/community counts where available;
   - input file counts and skipped-sensitive counts where available;
   - backend/model and optional extras used;
   - next suggested commands.
5. Attach handoff artifacts:
   - `graphify-out/GRAPH_REPORT.md`;
   - `graphify-out/graph.json`;
   - `graphify-out/graph.html` if produced;
   - wiki/Obsidian/export paths if requested;
   - query/path/explain outputs as textual evidence.
6. Respect confirmation gates:
   - network calls;
   - model API calls on user content;
   - hook installation;
   - long-running watch/MCP server modes;
   - Neo4j push;
   - any non-repo-bounded writes.
7. Preserve upstream UX:
   - a user familiar with `/graphify` should be able to run `ooo graphify` with minimal relearning.

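Requirement 1 can be sketched with the standard library alone. `resolve_graphify` and `blocked_result` are hypothetical names, and the shape of the blocked result is an assumption rather than the contract's actual schema; only the install hints come from this issue.

```python
import importlib.util
import shutil
import sys
from typing import List, Optional


def resolve_graphify() -> Optional[List[str]]:
    """Return the argv prefix for upstream Graphify, or None if unavailable."""
    exe = shutil.which("graphify")  # prefer the CLI on PATH
    if exe:
        return [exe]
    # Fall back to `python -m graphify` if the package is importable.
    if importlib.util.find_spec("graphify") is not None:
        return [sys.executable, "-m", "graphify"]
    return None


def blocked_result() -> dict:
    """Actionable result for a missing dependency; nothing is auto-installed."""
    return {
        "status": "blocked",
        "reason": "graphify executable not found",
        "suggestions": [
            "uv tool install graphifyy",
            "pipx install graphifyy",
        ],
    }
```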
---

## Implementation phases

### Phase 1: Contract design and manifest

- [ ] Add `plugins/graphify/ouroboros.plugin.json`.
- [ ] Add `plugins/graphify/README.md` explaining install, trust, permissions, examples, and risk classes.
- [ ] Decide the initial exposed command set.
- [ ] Decide whether operational helpers are exposed or internal-only.
- [ ] Validate the manifest with `python3 scripts/validate_contract.py`.

### Phase 2: Thin adapter MVP

- [ ] Add `plugins/graphify/graphify_plugin/__main__.py`.
- [ ] Implement command dispatch for:
  - [ ] build current path;
  - [ ] build explicit local path;
  - [ ] `--update`;
  - [ ] `query`;
  - [ ] `path`;
  - [ ] `explain`.
- [ ] Implement executable detection and missing-dependency blocked output.
- [ ] Capture stdout/stderr/exit code.
- [ ] Parse and summarize known output artifacts.
- [ ] Return clear completion/failure semantics.

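The capture and completion/failure tasks above might look like the following. This is a sketch, not the contract's result format: `run_bounded` is a hypothetical helper, and mapping a timeout to `failed` is an assumption.

```python
import subprocess
import sys
from typing import List, Optional


def run_bounded(argv: List[str], cwd: Optional[str] = None,
                timeout: Optional[float] = None) -> dict:
    """Run a subprocess, capture its output, and map the outcome to a status."""
    try:
        proc = subprocess.run(
            argv, cwd=cwd, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        # The executable is missing: report a blocked result instead of crashing.
        return {"status": "blocked", "exit_code": None, "stdout": "",
                "stderr": f"executable not found: {argv[0]}"}
    except subprocess.TimeoutExpired:
        return {"status": "failed", "exit_code": None, "stdout": "",
                "stderr": f"timed out after {timeout} seconds"}
    status = "completed" if proc.returncode == 0 else "failed"
    return {"status": status, "exit_code": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}


# Demonstration with the current interpreter standing in for the upstream CLI.
demo = run_bounded([sys.executable, "-c", "print('ok')"])
```

In the adapter, `argv` would be the resolved Graphify command followed by the normalized arguments.
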
### Phase 3: Handoff and provenance enrichment

- [ ] Attach/report generated artifact paths.
- [ ] Record graph stats from `graph.json`.
- [ ] Record input target path, output directory, Graphify version, and backend/model hints.
- [ ] Emit audit-friendly result JSON in addition to human output.
- [ ] Make query/path/explain outputs reusable as downstream evidence.

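Recording graph stats could be as small as the sketch below. It assumes `graph.json` uses a node-link layout with top-level `nodes` and `links` (or `edges`) arrays and an optional per-node `community` field; none of these keys is confirmed by this issue, so the real file should be checked first. The function names are hypothetical.

```python
import json
from pathlib import Path


def graph_stats_from_dict(data: dict) -> dict:
    """Count nodes, edges, and distinct communities in a node-link style dict."""
    nodes = data.get("nodes", [])
    # Assumed key names: "links" first, then "edges" as a fallback.
    edges = data.get("links", data.get("edges", []))
    communities = {
        node["community"]
        for node in nodes
        if isinstance(node, dict) and node.get("community") is not None
    }
    return {"nodes": len(nodes), "edges": len(edges),
            "communities": len(communities)}


def graph_stats(graph_path: Path) -> dict:
    """Load a graph file from disk and summarize it."""
    data = json.loads(graph_path.read_text(encoding="utf-8"))
    return graph_stats_from_dict(data)
```

The resulting dictionary can be embedded directly in the audit-friendly result JSON.
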
### Phase 4: Full skill UX preservation

- [ ] Add support for GitHub URL clone mode.
- [ ] Add support for multiple URL/repo graph merge.
- [ ] Add support for `add <url>`.
- [ ] Add support for `--wiki`, `--obsidian`, `--svg`, `--graphml`, `--neo4j` file exports.
- [ ] Add support for `--mcp` and `--watch` with explicit long-running-process behavior.
- [ ] Add support for `--neo4j-push` behind explicit trust.
- [ ] Add support for PR dashboard/impact/triage if it fits the plugin boundary.

### Phase 5: AgentOS integration

- [ ] Document how `ooo auto` should consume Graphify handoff artifacts without embedding Graphify-specific logic in core.
- [ ] Define a recommended pre-analysis flow:
  - `ooo graphify .`
  - `ooo graphify query "what parts of this codebase are relevant to <goal>?"`
  - attach result to Seed/planning/autopilot handoff.
- [ ] Define a recommended review flow:
  - build/update graph;
  - inspect surprising connections/god nodes;
  - attach report to review evidence.
- [ ] Define a recommended research/corpus flow using `add`, `query`, and graph handoff artifacts.

---

## Acceptance criteria

This epic is complete when:

- [ ] Graphify has been assimilated into the AgentOS ecosystem through a `graphify` plugin package under `plugins/graphify/`.
- [ ] The plugin manifest validates against the current schema.
- [ ] `ooo graphify .` maps to the Graphify build flow and preserves the expected output artifacts.
- [ ] `ooo graphify query`, `ooo graphify path`, and `ooo graphify explain` work against a generated graph.
- [ ] The plugin records or emits enough evidence for audit/provenance:
  - [ ] plugin version;
  - [ ] Graphify version;
  - [ ] command invoked;
  - [ ] target path or graph path;
  - [ ] generated artifacts;
  - [ ] graph stats;
  - [ ] backend/model hints when applicable;
  - [ ] permission-sensitive operations used.
- [ ] Network/model/hook/MCP/watch/Neo4j operations are separated behind explicit permissions and confirmation behavior.
- [ ] The README explains how Graphify's `/graphify` experience maps to `ooo graphify`.
- [ ] The plugin does not require changes to Ouroboros core beyond the documented plugin contract unless a schema/contract gap is explicitly identified.
- [ ] Any required schema pressure points are filed as follow-up issues instead of silently extending the manifest.
- [ ] The implementation demonstrates #27's standard for AgentOS assimilation: permissioned, auditable, resumable, handoff-capable, and more than a raw command wrapper.

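The evidence fields listed in the criteria above map naturally onto a small record type. This is one possible shape, with hypothetical field names; the real audit schema is whatever `docs/audit.md` defines.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuditEvidence:
    plugin_version: str
    graphify_version: Optional[str]  # None when upstream version is unknown
    command: List[str]               # command invoked, as argv
    target: str                      # target path or graph path
    artifacts: List[str] = field(default_factory=list)
    graph_stats: Dict[str, int] = field(default_factory=dict)
    backend_hint: Optional[str] = None
    sensitive_operations: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize with sorted keys so records diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)


example = AuditEvidence(
    plugin_version="0.1.0",
    graphify_version=None,
    command=["graphify", "."],
    target=".",
    artifacts=["graphify-out/GRAPH_REPORT.md", "graphify-out/graph.json"],
)
```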
---

## Open design questions

1. Should `graphify install` / platform skill installation commands be exposed through AgentOS/Ouroboros, or should the AgentOS plugin replace that installation path for AgentOS users?
2. Should Graphify's MCP server be a plugin subcommand, a declared plugin capability, or both?
3. Should `graphify-out/` become a standard handoff artifact type in AgentOS/Ouroboros, or remain plugin-specific metadata?
4. Should model/API backend selection be represented as permissions only, or does the plugin contract need a richer `runtime`/`model` declaration?
5. Should Neo4j push require a new `database:write` permission scope, or is `network:write` sufficient for v0.1?
6. How should long-running `watch` and MCP server processes appear in state/progress/audit events?
7. Should PR dashboard features live in this plugin or remain separate from `github-pr-ops` to avoid namespace overlap?

---

## References

- Upstream Graphify repository: https://github.com/safishamsi/graphify
- Upstream README command surface: https://github.com/safishamsi/graphify#common-commands
- Upstream package note: PyPI package is `graphifyy`; CLI command is `graphify`.
- Plugin authoring/capability assimilation RFC: #27
- Local contract docs: `docs/contract.md`, `docs/lifecycle.md`, `docs/permissions.md`, `docs/audit.md`
- Reference plugin shape: `plugins/github-pr-ops/`

