Epic: Assimilate SWE-agent into the AgentOS ecosystem as a seamless issue-to-patch execution harness
Goal
Assimilate SWE-agent/SWE-agent into the AgentOS/Ouroboros ecosystem as a first-class, contract-aware, permissioned, auditable software-engineering execution harness delivered through the ouroboros-plugins repository and the plugin contract described in #27.
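The product goal is not merely to wrap the sweagent binary. The goal is to make SWE-agent run smoothly from Ouroboros while preserving the upstream SWE-agent user experience as much as possible: an upstream sweagent run ... invocation should have an AgentOS-native equivalent such as ooo swe-agent run ..., while adding Ouroboros-native semantics.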
The strategic objective is to make AgentOS feel like the operating system for external software-engineering agents: users should be able to run SWE-agent from Ouroboros without losing the SWE-agent mental model, CLI shape, config-driven workflow, trajectories, patches, replay/inspection tools, or research ergonomics.
This is a concrete implementation candidate for the #27 thesis:
The plugin layer exists to keep core small while allowing the outside world to become Ouroboros-native.
In this issue, "Ouroboros-native" means SWE-agent remains recognizably SWE-agent, but its authority, artifacts, lifecycle, audit trail, and handoffs are governed by the AgentOS plugin contract instead of escaping into an unbounded command wrapper.
Source capability summary
SWE-agent is an open-source autonomous software-engineering harness that takes a GitHub issue or custom problem statement and attempts to produce a patch using a language model and a tool-enabled execution environment.
Current upstream facts observed while drafting this issue:
Branch observed: main
Release observed: v1.1.0 (2025-05-22)
Recent repository activity observed: pushed on 2026-05-18
Upstream README notes that much current development effort is on SWE-agent/mini-swe-agent, which upstream describes as simpler and generally recommended going forward.
SWE-agent still remains a large, documented, config-rich harness with trajectories, run replay, batch mode, inspector tooling, SWE-ReX/Docker runtime support, GitHub issue ingestion, patch generation, and optional PR-opening hooks.
Primary upstream command families:
sweagent run # run on a single issue/problem
sweagent run-batch # batch/SWE-bench style execution
sweagent run-replay # replay a trajectory/demo
sweagent inspect # terminal trajectory inspector
sweagent inspector # web trajectory inspector
sweagent quick-stats # summarize trajectory directories
sweagent merge-preds # merge prediction files
sweagent traj-to-demo # convert trajectory to demo
sweagent remove-unfinished
sweagent shell
Important upstream execution surfaces:
RunSingleConfig: combines environment config, agent/model config, problem statement config, output directory, env var path, and action options.
EnvironmentConfig: controls deployment, repository source, startup commands, and shell environment.
SWEEnv: wraps SWE-ReX deployment/runtime, starts a shell session, copies/resets repositories, executes commands, reads/writes files, and closes the runtime.
ProblemStatementConfig: supports GitHub issues, text, files, and SWE-bench/multimodal problem statements.
SaveApplyPatchHook: saves patches and can optionally apply them to a local repository.
OpenPRHook: can push a branch and create a draft PR when enabled.
Trajectory and prediction artifacts: .traj, .pred, .patch, logs, replay config, model stats, exit status, and edited-file context.
Why this belongs in ouroboros-plugins
This is not about moving SWE-agent into Ouroboros core. This repository should remain the contract/reference/plugin layer, not a marketplace and not a dumping ground for arbitrary wrappers.
SWE-agent is a reference-quality assimilation case because it exercises exactly the boundaries #27 is trying to make explicit:
External autonomous execution harness
SWE-agent is not just a library call; it is an agent runtime that can clone/copy repos, execute shell commands, edit code, produce patches, and optionally create PRs.
This makes it an ideal stress test for the difference between a trivial command wrapper and an Ouroboros-native capability.
User-experience preservation requirement
The upstream value is tied to the sweagent CLI, YAML config model, trajectory format, replay tooling, inspectors, and research workflow.
The AgentOS plugin must preserve that experience rather than forcing users into a completely different abstraction prematurely.
Permission and risk boundary
SWE-agent can read repos, write patches, run arbitrary shell commands in a sandbox, call LLM APIs, use Docker/SWE-ReX, read GitHub issues, and optionally push branches/open PRs.
These authorities must be declared and audited through plugin capabilities/permissions.
Artifact and handoff richness
SWE-agent produces exactly the kind of artifacts AgentOS should understand: patches, trajectories, predictions, logs, replay configs, model stats, and edited-file context.
The plugin should translate these into ledger/provenance/handoff artifacts for downstream review, verification, or ooo auto continuation.
AgentOS as the execution substrate
The long-term vision is that AgentOS can run external agents as supervised capabilities, not that core knows every external agent’s internal branches.
SWE-agent should become a canonical example of “external agent harness → AgentOS-native supervised capability.”
Product principle: preserve SWE-agent UX first
The first plugin version should deliberately preserve the SWE-agent user experience:
Keep upstream sweagent subcommands recognizable.
Preserve upstream config files and dotted CLI override style.
Preserve upstream output artifacts and directory conventions where practical.
Preserve .traj, .pred, .patch, logs, replay config, and inspector compatibility.
Preserve existing SWE-agent docs/tutorial compatibility by making command translation obvious.
Add AgentOS semantics around the run rather than replacing the run with a new workflow vocabulary.
A user who already knows SWE-agent should be able to predict the AgentOS command surface.
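Preferred mapping: each upstream sweagent <subcommand> maps to the matching ooo swe-agent <subcommand>.
The plugin adapter may add AgentOS-specific flags, but should avoid breaking upstream CLI expectations. If additional flags are needed, prefer namespaced flags such as --agentos-* or a separate wrapper mode rather than silently changing upstream SWE-agent semantics.
Proposed plugin identity
Suggested plugin name: swe-agent-harness (the directory name used in the acceptance criteria)
Rationale: the plugin assimilates an external harness. It does not reimplement SWE-agent and should not imply that SWE-agent itself has been absorbed into core.
Entrypoint pattern:
{ "entrypoint": { "type": "command", "command": "python -m swe_agent_harness" } }
The adapter should invoke upstream SWE-agent as an external dependency or local executable when available. It should not vendor the entire SWE-agent repository into Ouroboros core.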
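A minimal sketch of how an adapter might locate upstream SWE-agent as an external executable instead of vendoring it. The function name `resolve_sweagent_command` and the `configured_path` parameter are hypothetical; only the `sweagent` executable name comes from this issue.

```python
import shutil
from typing import List, Optional


def resolve_sweagent_command(configured_path: Optional[str] = None) -> List[str]:
    """Return an argv prefix for upstream SWE-agent, or raise if it is unavailable.

    `configured_path` is a hypothetical plugin setting that points at a local
    executable; when it is absent, the sketch falls back to a PATH lookup.
    """
    if configured_path:
        return [configured_path]
    found = shutil.which("sweagent")
    if found:
        return [found]
    raise FileNotFoundError(
        "sweagent executable not found on PATH; install upstream SWE-agent "
        "or configure an explicit path"
    )
```

Failing early when the executable is missing keeps the adapter from silently substituting its own behavior for upstream SWE-agent.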
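Proposed command surface
Baseline commands
ooo swe-agent run ...
Run SWE-agent on a single problem while preserving upstream CLI compatibility.
Must support upstream-style inputs.
AgentOS additions:
[...] handoff.md and handoff.json
[...] ooo auto, or open PR if separately trusted
ooo swe-agent run-batch ...
Preserve upstream batch mode, but classify it as a higher-risk / higher-cost command.
The MVP can defer full batch support if single-run artifacts and trust semantics are not complete.
ooo swe-agent run-replay ...
Replay an existing trajectory/demo and attach the replay result as a new AgentOS artifact.
This should be one of the safest and most valuable early commands because it helps audit and reproduce prior runs.
ooo swe-agent inspect ...
Open or summarize a trajectory using the upstream-compatible inspector path.
Risk should be read_only when it only reads existing artifacts.
ooo swe-agent quick-stats ...
Read trajectory directories and summarize exit statuses/model stats.
Risk should be read_only.
ooo swe-agent merge-preds ...
Merge prediction files into a derived artifact.
Risk should be write because it writes a new local file, but it should not need external authority.
ooo swe-agent traj-to-demo ...
Convert trajectory files to editable demos.
Risk should be write.
AgentOS-native helper commands
These may be plugin-specific additions that do not exist upstream:
ooo swe-agent prepare ...
Create a bounded run spec / handoff without executing SWE-agent.
ooo swe-agent collect-artifacts <output-dir>
Read an existing SWE-agent run output and attach it to the AgentOS ledger/provenance/handoff system.
This is useful for retroactive assimilation of runs performed outside Ouroboros.
ooo swe-agent handoff <run-id-or-output-dir>
Generate or regenerate handoff.md / handoff.json from SWE-agent artifacts.
ooo swe-agent verify-artifacts <run-id-or-output-dir>
Validate that expected artifacts exist and are internally consistent.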
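As an illustration of what an artifact consistency check could look like, the sketch below inspects one instance directory. It assumes the `<instance-id>/<instance-id>.traj` layout from the artifact bundle in this issue and assumes that `.traj` files are JSON documents; the function name and the returned dictionary shape are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def verify_instance_artifacts(instance_dir: Path) -> Dict[str, List[str]]:
    """Report which expected artifacts exist and which problems were found."""
    instance_id = instance_dir.name
    found: List[str] = []
    problems: List[str] = []

    traj = instance_dir / f"{instance_id}.traj"
    if traj.is_file():
        found.append(traj.name)
        try:
            # Assumption: trajectories are stored as JSON.
            json.loads(traj.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"{traj.name} is not valid JSON")
    else:
        problems.append(f"missing {traj.name}")

    # A run may finish without a patch or prediction, so these are optional.
    for suffix in (".pred", ".patch"):
        candidate = instance_dir / f"{instance_id}{suffix}"
        if candidate.is_file():
            found.append(candidate.name)

    return {"found": found, "problems": problems}
```

Treating a missing patch as normal rather than as an error matches the distinction this issue draws between a completed run and a submitted patch.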
Proposed capabilities
The plugin should declare only the Ouroboros substrate capabilities it actually needs.
Likely baseline capabilities:
seed:read/write # consume or generate problem/run specs when used as Seed handoff
ledger:write # record evidence, run status, patch artifacts, decisions
state:write # persist run/progress/resume state
provenance:write # record repo, issue, config, model, artifact, and source metadata
runtime:execute # invoke SWE-agent and sandbox runtime
handoff:attach # attach patch/trajectory/prediction/handoff artifacts
progress:write # stream run status and summaries
mcp:call # optional/future only, if delegated through MCP surfaces
Do not request mcp:call unless the implementation actually uses it.
Proposed permissions and risk taxonomy
Authority | Scope | Risk | Required? | Notes
Read local repo/problem/config/artifacts | filesystem:read | read_only | yes | Needed for almost all commands.
Write output artifacts/patch/handoff | filesystem:write | write | yes for run/prepare | Must be output-dir bounded.
Execute SWE-agent and sandbox commands | shell:execute | write | yes for run | Should be sandboxed and audited.
Start runtime / container / SWE-ReX deployment | shell:execute + runtime:execute | write or policy-high | yes for sandboxed run | Docker/socket access must be treated carefully.
Read GitHub issue/repo metadata | github:read, network:read | read_only | optional | Needed for problem_statement.github_url or env.repo.github_url.
Call LLM provider APIs | network:write | write | optional | Cost/data egress must be visible.
Push branch / open PR | github:pull_request:write, network:write | destructive or high write | no / deferred | Should not be enabled by default.
Apply patch to host local repo | filesystem:write | write | optional | Requires explicit command/confirmation.
Offensive cybersecurity/CTF modes | separate explicit scope | high risk | no / deferred | Must not be silently bundled with normal issue-fixing UX.
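Initial risk recommendation
inspect, quick-stats: read_only
prepare, handoff, collect-artifacts, traj-to-demo, merge-preds: write
run, run-replay: write with confirmation depending on sandbox authority
apply-patch: write with confirmation
open-pr: defer, or classify as destructive/high-write with explicit trust
any security/offensive mode: defer until a separate policy issue exists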
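The per-command risk recommendation can be sketched as a lookup table. The command names and risk labels come from this issue; the `Risk` enum, the `classify` function, and the choice to reject unknown commands are illustrative assumptions rather than part of the plugin contract.

```python
from enum import Enum
from typing import Tuple


class Risk(str, Enum):
    READ_ONLY = "read_only"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


# Command -> (risk, needs explicit confirmation), following the recommendation above.
COMMAND_RISK = {
    "inspect": (Risk.READ_ONLY, False),
    "quick-stats": (Risk.READ_ONLY, False),
    "prepare": (Risk.WRITE, False),
    "handoff": (Risk.WRITE, False),
    "collect-artifacts": (Risk.WRITE, False),
    "traj-to-demo": (Risk.WRITE, False),
    "merge-preds": (Risk.WRITE, False),
    "run": (Risk.WRITE, True),
    "run-replay": (Risk.WRITE, True),
    "apply-patch": (Risk.WRITE, True),
    "open-pr": (Risk.DESTRUCTIVE, True),
}


def classify(command: str) -> Tuple[Risk, bool]:
    """Return (risk, needs_confirmation); unknown commands are rejected."""
    try:
        return COMMAND_RISK[command]
    except KeyError:
        raise PermissionError(f"no risk classification for command: {command}") from None
```

Rejecting unclassified commands is a deliberate default in this sketch: a new subcommand gains no authority until someone assigns it a risk level.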
Artifact contract
Every AgentOS-managed SWE-agent run should produce an artifact bundle. Preserve upstream artifacts and add AgentOS metadata rather than replacing the upstream layout.
Suggested bundle:
.agentos/swe-agent/<run-id>/
  run-spec.json            # normalized AgentOS + SWE-agent invocation spec
  problem.md               # resolved problem statement when available
  upstream-command.txt     # exact sweagent command equivalent
  stdout.log
  stderr.log
  swe-agent-output/        # upstream output directory, preserved
    <instance-id>/
      <instance-id>.traj
      <instance-id>.pred
      <instance-id>.patch
      *.trace.log
      *.debug.log
      *.info.log
      config.yaml
  patch.diff               # normalized pointer/copy of selected patch, if any
  prediction.pred          # normalized pointer/copy of selected prediction, if any
  trajectory.traj          # normalized pointer/copy of selected trajectory, if any
  audit-summary.json
  provenance.json
  handoff.json
  handoff.md
handoff.md should answer
What problem was attempted?
Which repo/base commit/branch was used?
Which SWE-agent config/model was used?
Which command was run?
Which permissions were exercised?
Which artifacts were produced?
Did the run complete, fail, block, submit a patch, or exit early?
What files were edited according to the patch/trajectory metadata?
What should happen next?
inspect patch
run tests
apply patch
hand off to ooo auto
open PR only if explicitly trusted
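provenance.json should record bounded metadata only
Allowed examples:
{
  "source_repo": "https://github.com/org/repo",
  "base_commit": "abc123",
  "problem_statement_source": "github_issue",
  "problem_statement_url": "https://github.com/org/repo/issues/123",
  "swe_agent_repo": "https://github.com/SWE-agent/SWE-agent",
  "swe_agent_version": "v1.1.0",
  "config_files": ["config/default.yaml"],
  "model_name": "gpt-4o",
  "output_dir": ".agentos/swe-agent/<run-id>/swe-agent-output",
  "artifact_paths": ["trajectory.traj", "patch.diff", "prediction.pred"]
}
Forbidden examples:
full private issue bodies unless explicitly allowed
arbitrary shell history outside the run
unbounded model prompts if they contain secrets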
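One way to keep provenance bounded is an allowlist applied before anything is written to provenance.json. In the sketch below the field names follow the allowed provenance example in this issue; the helper `bound_provenance` and its return shape are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Field names taken from the allowed provenance example in this issue.
ALLOWED_PROVENANCE_KEYS = frozenset({
    "source_repo",
    "base_commit",
    "problem_statement_source",
    "problem_statement_url",
    "swe_agent_repo",
    "swe_agent_version",
    "config_files",
    "model_name",
    "output_dir",
    "artifact_paths",
})


def bound_provenance(raw: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Keep only allowlisted keys; return (bounded record, sorted dropped key names)."""
    bounded = {k: v for k, v in raw.items() if k in ALLOWED_PROVENANCE_KEYS}
    dropped = sorted(k for k in raw if k not in ALLOWED_PROVENANCE_KEYS)
    return bounded, dropped
```

Returning the dropped key names, without their values, lets an audit summary note that something was withheld without leaking it.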
Execution semantics
Preserve upstream execution path
The adapter should be able to run the upstream CLI directly:
sweagent run <original args>
or a configured local checkout/module when needed.
The plugin should not fork upstream behavior unless required for AgentOS safety. Prefer these layers:
preflight validation and permission checks
command construction / pass-through compatibility
sandbox/runtime guardrails
artifact collection
provenance/audit/handoff conversion
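The command construction / pass-through layer might look like the sketch below: namespaced `--agentos-*` flags are separated out, and everything else is forwarded to upstream unchanged. The helper names and the `--agentos-run-id` flag are hypothetical, and the sketch assumes AgentOS flags use the `--flag=value` form so that no value is split from its flag.

```python
import shlex
from typing import List, Sequence, Tuple


def split_agentos_flags(argv: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Separate --agentos-* flags from arguments that pass through to upstream."""
    agentos: List[str] = []
    upstream: List[str] = []
    for arg in argv:
        (agentos if arg.startswith("--agentos-") else upstream).append(arg)
    return agentos, upstream


def build_upstream_command(subcommand: str, upstream_args: Sequence[str],
                           executable: str = "sweagent") -> List[str]:
    """Build the argv that would be handed to the upstream CLI."""
    return [executable, subcommand, *upstream_args]


def render_upstream_command(command: Sequence[str]) -> str:
    """Render the command as shell-quoted text, e.g. for upstream-command.txt."""
    return shlex.join(command)
```

For example, `split_agentos_flags(["--agentos-run-id=r1", "--problem_statement.github_url=<url>"])` keeps the first argument for the adapter and forwards the second untouched, so upstream dotted overrides keep their meaning.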
Sandboxing requirements
The plugin should make sandbox policy explicit:
local repo path must be bounded and resolved
output dir must be controlled by the plugin or explicitly supplied
host patch application must be a separate action from sandboxed patch generation
Docker/SWE-ReX access must be documented as runtime authority
startup commands must be captured in run spec
network/API use must be visible before invocation
timeout/budget/cost-limit values must be captured when provided
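The requirement that paths be bounded and resolved can be illustrated with a small containment check. This is a sketch under the assumption that the plugin has a single allowed root directory; `resolve_within` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_within(root: Path, candidate: str) -> Path:
    """Resolve `candidate` relative to `root` and refuse paths that escape it."""
    root_resolved = root.resolve()
    # Joining an absolute candidate discards the root, so the check below
    # also catches absolute paths that point elsewhere.
    resolved = (root_resolved / candidate).resolve()
    if resolved != root_resolved and root_resolved not in resolved.parents:
        raise PermissionError(f"path escapes allowed root: {candidate}")
    return resolved
```

Resolving before comparing matters because `..` segments and symlinks could otherwise make an out-of-bounds path look like it sits inside the output directory.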
Failure semantics
The plugin should distinguish:
blocked: permission/trust/sandbox policy prevented run
failed: adapter or SWE-agent execution failed
completed: run completed, no patch necessarily produced
submitted: SWE-agent produced a patch/submission
partial: artifacts exist but run ended early or with uncertain status
cancelled: user/runtime cancelled run
Map these to the standard plugin audit vocabulary where possible:
Use plugin.failed with status=blocked for firewall/trust denials, consistent with the existing plugin contract.
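The status vocabulary above could be derived from a few observations about a run, as in the following sketch. The precedence order, the `classify_run` inputs, and the `event`/`status` field names are assumptions for illustration; only the status names and the `plugin.failed` with `status=blocked` mapping come from this issue.

```python
from enum import Enum
from typing import Dict, Optional


class RunStatus(str, Enum):
    BLOCKED = "blocked"
    FAILED = "failed"
    COMPLETED = "completed"
    SUBMITTED = "submitted"
    PARTIAL = "partial"
    CANCELLED = "cancelled"


def classify_run(*, policy_denied: bool, cancelled: bool, exit_code: Optional[int],
                 has_patch: bool, has_artifacts: bool) -> RunStatus:
    """Map observations about a run to one status (the precedence is an assumption)."""
    if policy_denied:
        return RunStatus.BLOCKED
    if cancelled:
        return RunStatus.CANCELLED
    if exit_code != 0:  # also covers exit_code=None, i.e. an unknown exit status
        return RunStatus.PARTIAL if has_artifacts else RunStatus.FAILED
    return RunStatus.SUBMITTED if has_patch else RunStatus.COMPLETED


def audit_fields(status: RunStatus) -> Dict[str, str]:
    """Return audit fields; only the blocked mapping is specified by this issue."""
    if status is RunStatus.BLOCKED:
        return {"event": "plugin.failed", "status": "blocked"}
    return {"status": status.value}
```

Keeping `completed` and `submitted` separate lets downstream review tell a clean run without a patch from one that actually produced a submission.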
Manifest / schema pressure discovered by this epic
This plugin should start within the existing v0.1 manifest contract. However, SWE-agent is likely to expose future schema needs. Do not expand the schema speculatively; document pressure and open follow-up issues only when implementation proves the need.
Likely future pressure points:
Command-level permissions
inspect needs only read permissions.
run needs shell/runtime/filesystem/network.
open-pr needs GitHub write/destructive authority.
Current v0.1 permissions are plugin-level, so command-level mapping may become necessary.
Artifact declarations
The plugin produces patch, trajectory, prediction, logs, replay config, and handoff bundles.
A future schema could declare artifact types and paths.
Secret/environment declarations
SWE-agent often relies on OPENAI_API_KEY, ANTHROPIC_API_KEY, GITHUB_TOKEN, or provider-specific variables.
A future schema may need bounded secret requirements without storing secret values.
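A bounded secret requirement can be checked by name only, without reading or storing values. The sketch below is illustrative; `missing_secret_names` is a hypothetical helper, and the variable names are the ones listed above.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_secret_names(required: Iterable[str],
                         env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty.

    Only names are returned; secret values are never copied or logged.
    """
    source = os.environ if env is None else env
    return [name for name in required if not source.get(name)]
```

A preflight step could call `missing_secret_names(["OPENAI_API_KEY", "GITHUB_TOKEN"])` and report the missing names before any runtime is started.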
Network endpoint declarations
GitHub, LLM providers, Modal/AWS, or other deployment targets may be involved.
A future schema may need endpoint categories or allowlists.
Long-running progress/resume metadata
SWE-agent runs can be long-running and expensive.
Better progress, cancellation, and resume semantics may be needed.
Non-goals
Do not vendor the full SWE-agent repository into Ouroboros core.
Do not teach ooo auto SWE-agent-specific branches.
Do not turn Q00/ouroboros-plugins into a marketplace listing for SWE-agent.
Do not hide SWE-agent’s CLI/config model behind an incompatible abstraction.
Do not silently apply patches to the host repository after a run.
Do not silently push branches or open PRs.
Do not grant network:write, shell:execute, or GitHub write permissions implicitly.
Do not store raw secrets or unbounded private prompts in provenance.
Do not enable offensive cybersecurity workflows under the same default trust path as ordinary issue fixing.
Do not expand the plugin manifest schema until this reference plugin proves a real contract need.
Suggested implementation phases
Phase 1 — RFC/design and UX parity spec
Deliverables:
Add a design note or RFC section documenting SWE-agent as an external agent harness assimilation case.
Define exact command parity goals.
Define the adapter-vs-vendoring boundary.
Define the artifact bundle contract.
Define permission/risk classification for each command family.
Decide whether baseline targets upstream SWE-agent, mini-SWE-agent, or both.
Recommendation: start with SWE-agent parity because this epic is scoped to SWE-agent/SWE-agent, but explicitly leave room for a future mini-swe-agent adapter or compatibility mode.
Phase 2 — Plugin skeleton
Commands to declare first:
run
run-replay
inspect
quick-stats
collect-artifacts
handoff
Defer:
run-batch
apply-patch
open-pr
Phase 3 — Pass-through runner with artifact collection
Deliverables:
[...] .traj, .pred, .patch, logs, config
[...] handoff.md / handoff.json
[...] provenance.json
[...] audit-summary.json
Phase 4 — Permission and trust integration
Deliverables:
[...] run if local write/shell/runtime authority is present
Phase 5 — Replay/inspect/read-only tooling
Deliverables:
inspect over existing trajectories
quick-stats over trajectory directories
run-replay with artifact attachment
collect-artifacts for SWE-agent runs performed outside AgentOS
Phase 6 — Controlled mutation commands
Only after the baseline is safe:
apply-patch with explicit confirmation
open-pr with explicit GitHub write/destructive trust
run-batch with budget/cost guardrails
Acceptance criteria
This epic is complete when:
plugins/swe-agent-harness/.ouroboros.plugin.json validates against the current schema.
[...] sweagent.
ooo swe-agent run ... can pass through a normal upstream-style sweagent run ... invocation in a bounded way.
[...] run-spec.json, provenance.json, audit-summary.json, handoff.json, and handoff.md.
inspect and/or quick-stats can operate as read-only commands over existing SWE-agent artifacts.
References
docs/contract.md
docs/lifecycle.md
docs/permissions.md
docs/audit.md
plugins/github-pr-ops/