Summary
Assimilate semgrep/semgrep into the Ouroboros plugin ecosystem as a first-class AgentOS capability while preserving the Semgrep user experience.
This epic is not about replacing Semgrep, reimplementing Semgrep, or hiding Semgrep behind a generic wrapper. The goal is to let users keep the Semgrep mental model they already know — local-first scans, familiar configs, familiar flags, JSON/SARIF output, rule testing, CI-style behavior, and optional autofix flows — while Ouroboros adds the missing AgentOS layer around it:
- explicit permissions,
- capability declarations,
- risk classification,
- audit events,
- provenance,
- normalized evidence artifacts,
- Seed / Ledger / State / Handoff compatibility,
- and resumable agent workflows.
In short:
Preserve the Semgrep experience, but make every Semgrep capability executable, inspectable, permissioned, and handoff-capable inside Ouroboros.
This directly exercises the thesis from #27:
Ouroboros plugins are not merely command wrappers. They are the capability assimilation layer that turns external tools, open-source libraries, and domain workflows into structured, auditable, permissioned, Seed-compatible Ouroboros capabilities.
Semgrep is explicitly in scope for that RFC class: a static-analysis engine that should remain outside core while becoming usable as an Ouroboros-native AgentOS capability.
Source capability
Repository: https://github.com/semgrep/semgrep
Semgrep is a mature static-analysis engine that provides:
- local code scanning,
- rule-based code search,
- security and quality guardrails,
- Semgrep YAML rules,
- JSON and SARIF outputs,
- CI-oriented scan modes,
- rule testing,
- optional registry / remote configuration,
- optional metrics,
- optional autofix / dry-run flows,
- MCP / AI-assistant integration surfaces,
- and broad language support.
Important upstream properties to preserve:
- Semgrep users expect to run scans against local repositories.
- Local scanning should remain local-first.
- Familiar Semgrep concepts such as --config, local rule files, registry configs, --json, --sarif, --metrics=off, --autofix, --dryrun, .semgrepignore, and exit-code behavior should remain recognizable.
- Semgrep output should remain available in raw form when requested.
- Ouroboros should add structured artifacts rather than erase Semgrep's native output model.
Product goal
Make Semgrep feel like a native AgentOS capability without breaking Semgrep muscle memory.
A user should be able to say, in effect:
ooo semgrep scan . --config rules/ci.yml
and get the same kind of Semgrep experience they expect, plus Ouroboros-native execution products:
- Semgrep scan ran locally
- Raw Semgrep JSON/SARIF was preserved
- Findings were normalized
- Ledger/provenance/audit records were emitted
- A handoff artifact was attached
- A downstream agent can now triage, fix, suppress, or gate on the results
Non-goals
This epic must not turn ouroboros-plugins into a marketplace or a dumping ground for scanner wrappers.
Out of scope for the first implementation:
- Reimplementing Semgrep's parser, rule engine, or CLI.
- Vendoring Semgrep source into this repository.
- Making Semgrep part of Ouroboros core.
- Requiring Semgrep AppSec Platform or proprietary Semgrep services.
- Uploading source code by default.
- Enabling registry/network behavior by default without explicit permission modeling.
- Enabling autofix in the v0 read-only reference path.
- Treating raw semgrep ... execution as sufficient plugin compliance.
- Adding schema fields speculatively before proving that the existing contract cannot represent the capability.
Desired plugin shape
Initial plugin candidate:
plugins/semgrep-static-analysis/
  ouroboros.plugin.json
  README.md
  semgrep_static_analysis/
    __init__.py
    __main__.py
    cli.py
    runner.py
    normalize.py
    artifacts.py
    audit.py
  tests/
    fixtures/
      semgrep-output-empty.json
      semgrep-output-findings.json
    test_normalize.py
    test_manifest.py
The name is intentionally capability-oriented rather than marketplace-oriented. Alternative names are acceptable if they preserve the same boundary:
- semgrep-static-analysis
- semgrep-code-scan
- semgrep-agentos
UX preservation contract
The plugin must preserve Semgrep UX as a hard constraint.
Preserve familiar Semgrep inputs
The plugin should accept a bounded subset of familiar Semgrep concepts first:
- target path / scanning root,
- --config with local rule files or directories,
- optional registry config only when network permission is modeled,
- include/exclude behavior where feasible,
- JSON output preservation,
- SARIF output preservation,
- metrics control,
- dry-run behavior for future autofix paths,
- Semgrep exit-code semantics where they matter for CI.
Preserve familiar Semgrep outputs
The plugin should not replace Semgrep output with only an Ouroboros summary.
It should produce:
- raw Semgrep JSON artifact,
- optional raw Semgrep SARIF artifact,
- normalized Ouroboros findings artifact,
- human-readable Markdown summary,
- audit/provenance records that point to artifact paths and hashes.
Preserve local-first behavior
The default invocation should be local and read-only:
semgrep scan --json --metrics=off --disable-version-check --config <local-config> <target>
Exact flags may vary by installed Semgrep version, but the intent must hold:
- no source upload by default,
- no registry fetch by default,
- metrics disabled by default,
- bounded repo-relative target paths,
- bounded artifact writes only into an Ouroboros-controlled output directory.
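A minimal sketch of how the runner might assemble this invocation, assuming Python 3.9+. The helper names resolve_bounded and build_scan_command are hypothetical and not part of any existing Ouroboros or Semgrep API; the flags are the ones listed above and may need adjusting for the installed Semgrep version.

```python
from pathlib import Path


def resolve_bounded(repo_root: Path, candidate: str) -> Path:
    """Resolve `candidate` under `repo_root`, rejecting paths that escape it."""
    root = repo_root.resolve()
    resolved = (root / candidate).resolve()
    # Absolute inputs and ".." segments both end up outside `root` and are refused.
    if resolved != root and root not in resolved.parents:
        raise ValueError(f"path escapes repository root: {candidate}")
    return resolved


def build_scan_command(repo_root: Path, target: str, config: str) -> list[str]:
    """Build an argv list (never a shell string) with local-first defaults."""
    target_path = resolve_bounded(repo_root, target)
    config_path = resolve_bounded(repo_root, config)
    return [
        "semgrep", "scan",
        "--json",                   # raw JSON is preserved as an artifact
        "--metrics=off",            # metrics disabled by default
        "--disable-version-check",  # no version-check network call
        "--config", str(config_path),
        str(target_path),
    ]
```

Returning an argv list rather than a single string keeps user-supplied paths out of shell interpretation, which is one way to honor the "bounded arguments" requirement.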
AgentOS-native translation
The plugin must translate Semgrep into Ouroboros primitives rather than merely execute it.
Core capabilities
Expected manifest capabilities for the read-only scan path:
- ledger:write: record invocation, policy inputs, scan summary, and decision-relevant facts.
- provenance:write: record Semgrep version, config source, target paths, command shape, output artifact hashes, and environment facts.
- handoff:attach: attach normalized findings and summary for downstream agents.
- progress:write: report scan progress and completion.
Optional future capabilities:
- state:write: persist scan state across resumable long-running scans or multi-stage triage.
- seed:write: generate remediation Seeds from findings when this becomes a deliberate workflow.
- runtime:execute: only if delegated agent execution becomes part of the plugin itself.
External permissions
Expected baseline permissions:
- filesystem:read / read_only / required: read target source files and local rule files.
- shell:execute / read_only / required: invoke the installed Semgrep CLI with bounded arguments.
Expected optional permissions:
- network:read / read_only / optional: only for registry configs, remote configs, version checks, or other remote Semgrep flows.
- filesystem:write / write / optional: only for future autofix or explicit artifact output outside the controlled handoff directory.
The plugin must keep capabilities and permissions distinct per #27 and docs/contract.md.
Command plan
v0 command: read-only scan
ooo semgrep scan <target-path> --config <local-config>
Responsibilities:
- validate target path is repo-relative / bounded,
- validate local config path is bounded,
- invoke Semgrep with local-first safe defaults,
- preserve raw JSON output,
- optionally preserve SARIF output if requested,
- normalize findings,
- emit audit/provenance data,
- attach handoff artifacts,
- return a clear status code / summary.
Risk: read_only
Required external permissions:
- filesystem:read
- shell:execute
Required core capabilities:
- ledger:write
- provenance:write
- handoff:attach
- progress:write
v0.1 or v1 command: CI-style scan
ooo semgrep ci-scan <target-path> --config <config> --baseline <ref>
Responsibilities:
- preserve Semgrep CI mental model,
- optionally honor baseline / changed-findings semantics,
- emit gate result suitable for ooo auto, PR review, or policy workflows,
- keep raw Semgrep output available.
Risk: usually read_only; network config must be separately declared if used.
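One possible shape for the gate result, sketched in Python under stated assumptions: the function name evaluate_gate, the returned fields, and the INFO < WARNING < ERROR ordering are illustrative choices, not an existing Ouroboros contract. Findings are assumed to be in the normalized shape proposed in this epic.

```python
# Hypothetical severity ordering; newer Semgrep rule packs may use other labels.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def evaluate_gate(findings: list[dict], block_at: str = "ERROR") -> dict:
    """Return a pass/fail gate decision from normalized findings."""
    threshold = SEVERITY_RANK[block_at]
    blocking = [
        finding
        for finding in findings
        # Unknown severities fail closed: they are treated as blocking.
        if SEVERITY_RANK.get(finding.get("severity", ""), threshold) >= threshold
    ]
    return {
        "status": "fail" if blocking else "pass",
        "block_at": block_at,
        "total_findings": len(findings),
        "blocking_fingerprints": [f.get("fingerprint", "") for f in blocking],
    }
```

Treating unrecognized severities as blocking is a conservative default; a real policy workflow might prefer to surface them as a separate status instead.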
Future command: rule test
ooo semgrep rule-test <rules-path>
Responsibilities:
- run Semgrep rule tests,
- normalize pass/fail results,
- attach rule-test evidence for plugin / policy development.
Risk: read_only
Future command: autofix dry run
ooo semgrep autofix-dryrun <target-path> --config <config>
Responsibilities:
- run Semgrep autofix in dry-run / preview mode,
- produce patch preview artifact,
- do not modify files.
Risk: read_only
Future command: autofix apply
ooo semgrep autofix <target-path> --config <config>
Responsibilities:
- apply deterministic Semgrep rule-defined fixes,
- emit patch provenance,
- attach before/after evidence,
- require explicit confirmation and filesystem:write.
Risk: write
This should not ship until the read-only scan path proves the boundary.
Manifest draft
The v0 read-only manifest should fit the current 0.1 schema without expanding the manifest contract:
{
  "schema_version": "0.1",
  "name": "semgrep-static-analysis",
  "version": "0.1.0",
  "description": "Assimilates Semgrep local static-analysis scans into Ouroboros audit, provenance, and handoff artifacts while preserving Semgrep CLI UX.",
  "source": {
    "type": "local_path",
    "path": "plugins/semgrep-static-analysis"
  },
  "commands": [
    {
      "namespace": "semgrep",
      "name": "scan",
      "summary": "Run a bounded read-only Semgrep scan and attach normalized findings as Ouroboros evidence.",
      "usage": "ooo semgrep scan <target-path> --config <local-rule-or-pack>",
      "risk": "read_only",
      "requires_confirmation": false,
      "arguments": [
        {
          "name": "target_path",
          "type": "path",
          "required": true,
          "description": "Repo-relative file or directory to scan."
        },
        {
          "name": "config",
          "type": "string",
          "required": true,
          "description": "Local Semgrep config path or explicitly approved registry config."
        }
      ]
    }
  ],
  "capabilities": [
    {
      "name": "ledger",
      "access": "write",
      "reason": "Record scan invocation, policy inputs, and summary verdict."
    },
    {
      "name": "provenance",
      "access": "write",
      "reason": "Record Semgrep version, config source, target paths, and output hashes."
    },
    {
      "name": "handoff",
      "access": "attach",
      "reason": "Attach normalized findings for downstream review or automated remediation."
    },
    {
      "name": "progress",
      "access": "write",
      "reason": "Report scan progress and completion status."
    }
  ],
  "permissions": [
    {
      "scope": "filesystem:read",
      "risk": "read_only",
      "required": true,
      "reason": "Read target source files and local Semgrep rule files."
    },
    {
      "scope": "shell:execute",
      "risk": "read_only",
      "required": true,
      "reason": "Invoke the installed Semgrep CLI with bounded arguments."
    },
    {
      "scope": "network:read",
      "risk": "read_only",
      "required": false,
      "reason": "Only needed when using Semgrep registry or remote configs."
    }
  ],
  "entrypoint": {
    "type": "command",
    "command": "python -m semgrep_static_analysis"
  },
  "audit": {
    "events": [
      "plugin.invoked",
      "plugin.permission_used",
      "plugin.completed",
      "plugin.failed"
    ]
  }
}
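The read-only invariants of this draft could be checked in something like tests/test_manifest.py. The following is a sketch only: check_read_only_manifest is a hypothetical helper, and the rules it enforces are inferred from this epic rather than taken from the actual 0.1 schema or scripts/validate_contract.py.

```python
def check_read_only_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable violations; empty means the draft holds."""
    problems = []
    # Every v0 command must be declared read_only.
    for command in manifest.get("commands", []):
        if command.get("risk") != "read_only":
            problems.append(f"command {command.get('name')!r} is not read_only")
    for permission in manifest.get("permissions", []):
        scope = permission.get("scope")
        # No write-risk permission belongs in the v0 read-only path.
        if permission.get("risk") != "read_only":
            problems.append(f"permission {scope!r} has non-read_only risk")
        # Network access stays optional so the default path remains local.
        if scope == "network:read" and permission.get("required"):
            problems.append("network:read must be optional")
    return problems
```

A test would load ouroboros.plugin.json with json.load and assert that the returned list is empty.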
Artifact contract
Each successful scan should attach a handoff bundle similar to:
.omx/artifacts/semgrep/<run-id>/
  semgrep.raw.json
  semgrep.raw.sarif        # optional
  semgrep.findings.json    # normalized Ouroboros finding model
  semgrep.summary.md       # human-readable summary
  semgrep.provenance.json  # bounded provenance fields and hashes
Suggested normalized finding shape:
{
  "schema_version": "0.1",
  "tool": "semgrep",
  "tool_version": "1.x",
  "rule_id": "python.lang.security.audit...",
  "severity": "ERROR",
  "message": "...",
  "path": "src/example.py",
  "start": { "line": 10, "col": 5 },
  "end": { "line": 10, "col": 25 },
  "metadata": {
    "cwe": "...",
    "owasp": "..."
  },
  "fix_available": false,
  "fingerprint": "stable-or-derived-id",
  "raw_result_ref": "semgrep.raw.json#/results/0"
}
The normalized model should preserve enough information for downstream agents to:
- summarize risk,
- decide whether a finding blocks a workflow,
- generate remediation Seeds,
- open follow-up tasks,
- compare scan runs,
- and attach evidence to PR / review workflows.
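A sketch of what normalize.py might do to map raw Semgrep JSON results into the suggested shape. It assumes the commonly seen Semgrep result layout (check_id, path, start, end, and an extra object carrying message, severity, metadata, and optionally fix and fingerprint); those field names should be verified against the installed Semgrep version. The fallback fingerprint scheme and the function names are hypothetical.

```python
import hashlib


def derive_fingerprint(rule_id: str, path: str, start_line: int, end_line: int) -> str:
    """Fallback ID when the raw result carries no fingerprint (hypothetical scheme)."""
    key = f"{rule_id}|{path}|{start_line}|{end_line}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]


def normalize_result(result: dict, index: int, tool_version: str) -> dict:
    """Map one raw result to the normalized finding shape."""
    extra = result.get("extra", {})
    metadata = extra.get("metadata", {})
    start = result.get("start", {})
    end = result.get("end", {})
    rule_id = result.get("check_id", "")
    path = result.get("path", "")
    fingerprint = extra.get("fingerprint") or derive_fingerprint(
        rule_id, path, start.get("line", 0), end.get("line", 0)
    )
    return {
        "schema_version": "0.1",
        "tool": "semgrep",
        "tool_version": tool_version,
        "rule_id": rule_id,
        "severity": extra.get("severity", "UNKNOWN"),
        "message": extra.get("message", ""),
        "path": path,
        "start": {"line": start.get("line"), "col": start.get("col")},
        "end": {"line": end.get("line"), "col": end.get("col")},
        # Only a bounded subset of metadata is carried over.
        "metadata": {k: metadata[k] for k in ("cwe", "owasp") if k in metadata},
        "fix_available": "fix" in extra,
        "fingerprint": fingerprint,
        # JSON Pointer-style reference back into the preserved raw artifact.
        "raw_result_ref": f"semgrep.raw.json#/results/{index}",
    }


def normalize_output(raw: dict, tool_version: str) -> list[dict]:
    """Normalize every entry under the raw output's `results` array."""
    return [
        normalize_result(result, index, tool_version)
        for index, result in enumerate(raw.get("results", []))
    ]
```

Keeping raw_result_ref as an index into the preserved raw file lets downstream agents recover any field the normalized model drops, without copying unbounded output into the finding.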
Audit and provenance requirements
The plugin should emit / prepare audit-compatible data for:
- plugin.invoked
- plugin.permission_used
- plugin.completed
- plugin.failed
Provenance should include bounded, redacted facts only:
- Semgrep version,
- plugin version,
- command namespace/name,
- target path(s),
- config path or config identifier,
- whether config was local or remote,
- metrics mode,
- network mode,
- output artifact paths,
- artifact hashes,
- result counts by severity,
- exit code,
- run duration.
Provenance must not include:
- raw source code,
- access tokens,
- unbounded Semgrep output blobs,
- raw user prompts,
- secret values found by scans.
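The include and exclude lists above suggest an allowlist approach. A minimal sketch, assuming the caller passes already-collected facts and an artifact directory; the key names in ALLOWED_FACTS and the function names are hypothetical, not a defined provenance schema.

```python
import hashlib
from pathlib import Path

# Hypothetical allowlist mirroring the "should include" list; anything else is dropped.
ALLOWED_FACTS = (
    "semgrep_version", "plugin_version", "command", "target_paths",
    "config", "config_is_remote", "metrics_mode", "network_mode",
    "severity_counts", "exit_code", "duration_seconds",
)


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large raw outputs are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_provenance(run_dir: Path, facts: dict) -> dict:
    """Keep only allowlisted facts and reference artifacts by name and hash."""
    record = {key: facts[key] for key in ALLOWED_FACTS if key in facts}
    record["artifacts"] = [
        {"path": artifact.name, "sha256": sha256_file(artifact)}
        for artifact in sorted(run_dir.iterdir())
        if artifact.is_file()
    ]
    return record
```

An allowlist fails safe: a new field added upstream, such as a source snippet or token, is excluded until someone deliberately adds it.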
Privacy and network behavior
The default path must be privacy-preserving:
- prefer local config,
- set metrics off by default,
- disable version checks where feasible,
- do not fetch registry rules unless the user explicitly chooses a registry / remote config path,
- require network:read for registry or remote configuration.
If the user requests a Semgrep Registry config such as p/ci, auto, or a URL config, the plugin must surface that this is no longer a purely local invocation and requires the optional network permission path.
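One way the plugin might detect that a config is no longer local, sketched under assumptions: classify_config and require_network_if_remote are hypothetical names, and the remote patterns checked here (auto, the p/ and r/ prefixes, and http/https URLs) are a conservative guess that may not cover every remote form Semgrep accepts.

```python
from urllib.parse import urlparse


def classify_config(config: str) -> str:
    """Return "remote" for registry-style or URL configs, otherwise "local"."""
    if config == "auto" or config.startswith(("p/", "r/")):
        return "remote"
    if urlparse(config).scheme in ("http", "https"):
        return "remote"
    return "local"


def require_network_if_remote(config: str, granted: set[str]) -> None:
    """Refuse remote configs unless the optional network permission was granted."""
    if classify_config(config) == "remote" and "network:read" not in granted:
        raise PermissionError(
            f"config {config!r} is not local; network:read permission is required"
        )
```

This check errs toward "remote": a local directory literally named p/ would be flagged, which costs an extra permission prompt but never a silent network fetch.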
Dependency and license policy
Semgrep is LGPL-2.1. This repository should not vendor Semgrep source as part of the plugin.
Preferred dependency model:
- Require an installed semgrep executable and inspect it with semgrep --version.
- Document installation options but do not silently install Semgrep in v0.
- Optionally support a future setup helper that installs Semgrep only after explicit user action.
- Preserve Semgrep license notices in plugin README / docs.
This avoids turning the Ouroboros plugin into a Semgrep fork or derivative distribution problem.
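A sketch of the executable check, assuming the version is printed on standard output by semgrep --version. The function name detect_semgrep and the choice to return None rather than raise are illustrative.

```python
import shutil
import subprocess
from typing import Optional


def detect_semgrep(executable: str = "semgrep") -> Optional[str]:
    """Return the installed Semgrep version string, or None if unavailable."""
    resolved = shutil.which(executable)
    if resolved is None:
        return None  # not installed; the plugin should report this, not install it
    try:
        completed = subprocess.run(
            [resolved, "--version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if completed.returncode != 0:
        return None
    return completed.stdout.strip() or None
```

A None result would map to a blocked or failed plugin status with installation guidance, in keeping with the rule that v0 never installs Semgrep silently.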
Implementation phases
Phase 0 — Contract analysis and RFC alignment
Phase 1 — Read-only reference plugin skeleton
Phase 2 — Safe Semgrep runner
Phase 3 — Output normalization and artifacts
Phase 4 — Audit, provenance, and handoff
Phase 5 — Tests and validation
Phase 6 — Future UX parity expansion
Acceptance criteria
This epic is complete when:
Why this matters for AgentOS
Ouroboros becomes a true AgentOS when external tools do not merely run beside it, but become structured capabilities inside it.
Semgrep is an ideal proof case because it already has a strong developer experience and a strong CLI identity. The challenge is therefore not to redesign Semgrep. The challenge is to preserve Semgrep's UX while adding the operating-system layer that Semgrep alone does not own:
- explicit authority,
- durable state,
- auditability,
- provenance,
- normalized artifacts,
- policy-aware risk handling,
- and downstream agent handoff.
If this succeeds, the same assimilation pattern can guide future static-analysis engines, test tools, security scanners, CI gates, and remediation loops.
References