MVP: dogfood goose autonomous tasks as actionable IDP agent work insights #370

@ourchitectureio

Observation

The current autonomous-task skill already gives us a strong local operating model for AI-assisted development work:

  • Plan -> review -> implement -> validate -> ship flow
  • Isolated git worktrees per task
  • Human or AI review gates
  • Lock and heartbeat files for local session safety
  • local_only=true mode for fully local runs
  • OpenCode and Pi comparison hooks

This issue adds a goose dogfooding path and an IDP MVP that turns those task signals into actionable insight.

The goal is not to build another agent.

The goal is to prove the IDP can observe agent work, connect it to delivery outcomes, and answer: what should someone do differently because of this?

Why it matters

AI coding tools create local productivity signals, but those signals are usually disconnected from delivery outcomes.

A developer, architect, or leader should be able to see:

  • Which agent task is active
  • Who or what initiated it
  • What repo, issue, service, or system it affects
  • Whether it is planning, implementing, validating, blocked, or complete
  • What human review is needed
  • What validation failed
  • What cost, model, or token signals are available
  • Whether the work created value, risk, or rework

This aligns with the newer AI measurement lens: utilization, impact, and cost. It also aligns with the DORA warning that AI amplifies the surrounding engineering system. More agent activity is not the outcome. The useful signal is whether agent activity improves delivery without increasing instability or downstream chaos.

MVP scope

Dogfood goose as one supported agent runtime for the existing autonomous task workflow.

Build a small IDP surface that shows actionable agent work insight from local task execution.

This MVP should work locally first. GitHub issues and PRs can be read or linked where available, but the first version should not require hosted infrastructure.

Proposed user story

As a developer using the IDP repo,
I want to run an autonomous task through goose and see the task state in an IDP view,
so I can understand what the agent is doing, where it is blocked, what needs human review, and whether the work is ready to validate or ship.

Proposed dogfood workflow

  1. Install and configure goose locally.
  2. Run goose against a small issue or free-form task using the existing autonomous-task skill behavior as the operating contract.
  3. Keep the existing worktree, lock, heartbeat, and local-only safety model.
  4. Capture a local task event stream or snapshot from the worktree.
  5. Render the task as an actionable IDP insight.

Example task states:

  • worktree-claimed
  • planning
  • needs-review
  • implementing
  • impl-validation-failed
  • impl-validated
  • ship
  • complete-local
  • failed
  • blocked
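
A minimal sketch of how the example states could be pinned down in code so that readers and writers of task data agree on spelling. The enum values come from the list above; `TaskStatus`, `parse_status`, `is_terminal`, and the choice of which states count as terminal are assumptions for illustration, not part of the existing skill.

```python
from enum import Enum
from typing import Optional


class TaskStatus(str, Enum):
    """Task states listed in this issue; values match the snapshot strings."""

    WORKTREE_CLAIMED = "worktree-claimed"
    PLANNING = "planning"
    NEEDS_REVIEW = "needs-review"
    IMPLEMENTING = "implementing"
    IMPL_VALIDATION_FAILED = "impl-validation-failed"
    IMPL_VALIDATED = "impl-validated"
    SHIP = "ship"
    COMPLETE_LOCAL = "complete-local"
    FAILED = "failed"
    BLOCKED = "blocked"


# Assumption: these states end a run. The issue does not define terminal states.
TERMINAL_STATES = frozenset({TaskStatus.COMPLETE_LOCAL, TaskStatus.FAILED})


def parse_status(raw: str) -> Optional[TaskStatus]:
    """Return the matching TaskStatus, or None for an unknown string."""
    try:
        return TaskStatus(raw)
    except ValueError:
        return None


def is_terminal(status: TaskStatus) -> bool:
    """True when the status is one of the assumed terminal states."""
    return status in TERMINAL_STATES
```

Returning `None` for unknown strings, rather than raising, lets a reader show an "unknown state" card instead of failing when a newer agent writes a state this list does not yet include.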

Proposed implementation shape

1. Add a goose launcher path

Add a small local script, make target, or package command that documents and runs the goose dogfood path.

Possible shape:

make agent-goose-task ISSUE_NUMBER=123 LOCAL_ONLY=true
make agent-goose-task TASK_DESCRIPTION="small docs update" LOCAL_ONLY=true

The launcher should:

  • Create or reuse the task worktree
  • Set the expected worktree path
  • Preserve the .agent-lock and .agent-heartbeat convention
  • Start goose in a mode suitable for local dogfooding
  • Prefer approval or smart approval mode for write actions during early dogfooding
  • Avoid requiring remote push or PR creation for the MVP
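
The worktree and session-file parts of the launcher could look roughly like the sketch below. It only plans the `git worktree add` command and writes the lock and heartbeat files; it does not start goose, because the issue does not specify the invocation. The slug format, the `task/` branch prefix, and the file contents are assumptions modeled on the example snapshot values in this issue, and `task_slug`, `plan_worktree`, and `write_session_files` are hypothetical names.

```python
import os
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Tuple


@dataclass(frozen=True)
class WorktreePlan:
    slug: str
    path: Path
    branch: str
    git_command: Tuple[str, ...]


def task_slug(description: str, issue_number: Optional[int] = None) -> str:
    """Build a filesystem-safe slug such as 'issue-123-example-task'."""
    words = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    prefix = f"issue-{issue_number}" if issue_number is not None else "task"
    return "-".join(part for part in (prefix, words) if part)


def plan_worktree(repo_root: Path, slug: str) -> WorktreePlan:
    """Describe the worktree to create; the caller decides whether to run git."""
    path = repo_root / ".agents" / "worktrees" / slug
    branch = f"task/{slug}"
    command = ("git", "worktree", "add", "-b", branch, str(path))
    return WorktreePlan(slug, path, branch, command)


def write_session_files(worktree: Path) -> None:
    """Claim the worktree with a lock file and record a first heartbeat."""
    worktree.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError if another session already holds the lock.
    with open(worktree / ".agent-lock", "x", encoding="utf-8") as lock:
        lock.write(f"pid={os.getpid()}\n")
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    (worktree / ".agent-heartbeat").write_text(stamp + "\n", encoding="utf-8")
```

Reusing an existing worktree would need a separate branch of logic, since `git worktree add` fails when the path is already registered. This sketch covers only the creation case.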

2. Add a normalized agent task snapshot

Create a small JSON snapshot file that can be read by the IDP.

Possible path:

.agents/worktrees/<task-slug>/.agent-task.json

Possible schema:

{
  "taskId": "issue-123-example-task",
  "tool": "goose",
  "runtime": "local",
  "issueNumber": 123,
  "taskDescription": "...",
  "worktreePath": ".agents/worktrees/issue-123-example-task",
  "branchName": "task/issue-123-example-task",
  "status": "impl-validation-failed",
  "phase": "validate",
  "humanActionRequired": true,
  "recommendedAction": "Review the validation failure and decide whether to let the agent retry.",
  "lastHeartbeatAt": "2026-05-16T00:00:00Z",
  "lastErrorSummary": "make check failed in contract tests",
  "lastCommitSha": null,
  "validationCommand": "make check",
  "modelProvider": null,
  "modelName": null,
  "tokenUsage": null,
  "estimatedCostUsd": null,
  "links": {
    "issueUrl": null,
    "prUrl": null
  }
}

This file is intentionally small. It should support the IDP view before deeper telemetry exists.
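
A reader for this snapshot could be sketched as below, assuming the field names from the possible schema above. Which fields are treated as required is an assumption, and `AgentTaskSnapshot` and `parse_snapshot` are hypothetical names. Only a subset of the fields is modeled to keep the example short.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumption: the minimum a card needs. The issue does not mark required fields.
REQUIRED_FIELDS = ("taskId", "tool", "status", "humanActionRequired")


@dataclass
class AgentTaskSnapshot:
    task_id: str
    tool: str
    status: str
    human_action_required: bool
    phase: Optional[str] = None
    recommended_action: Optional[str] = None
    last_heartbeat_at: Optional[str] = None
    last_error_summary: Optional[str] = None
    estimated_cost_usd: Optional[float] = None


def parse_snapshot(text: str) -> AgentTaskSnapshot:
    """Parse .agent-task.json content, rejecting files that lack core fields."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("snapshot must be a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError("snapshot is missing: " + ", ".join(missing))
    return AgentTaskSnapshot(
        task_id=str(data["taskId"]),
        tool=str(data["tool"]),
        status=str(data["status"]),
        human_action_required=bool(data["humanActionRequired"]),
        phase=data.get("phase"),
        recommended_action=data.get("recommendedAction"),
        last_heartbeat_at=data.get("lastHeartbeatAt"),
        last_error_summary=data.get("lastErrorSummary"),
        # None means "unknown", which is different from a cost of zero.
        estimated_cost_usd=data.get("estimatedCostUsd"),
    )
```

Optional fields default to `None` so the view can show that cost, token, or model data is unavailable instead of implying a zero.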

3. Add a local BFF endpoint or static fixture reader

Expose active local agent tasks through the existing local dev app path.

Possible endpoints:

GET /api/agent-work/tasks
GET /api/agent-work/tasks/:taskId

The endpoints should read .agent-task.json and .agent-heartbeat files from known local worktree paths.

For the MVP, file polling is acceptable. Do not require a database.
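
The file-polling reader behind those routes could be sketched as follows, with no web framework and no database. It assumes worktrees live under `.agents/worktrees/` as in the possible snapshot path, and it assumes heartbeat freshness can be taken from the modification time of `.agent-heartbeat`, because the issue does not describe that file's contents. `list_tasks`, `get_task`, and the added `heartbeatAgeSeconds` key are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Any, Dict, List, Optional


def list_tasks(repo_root: Path, now: Optional[float] = None) -> List[Dict[str, Any]]:
    """Load every readable snapshot under .agents/worktrees (one poll)."""
    now = time.time() if now is None else now
    base = repo_root / ".agents" / "worktrees"
    if not base.is_dir():
        return []
    tasks = []
    for snapshot_path in sorted(base.glob("*/.agent-task.json")):
        try:
            data = json.loads(snapshot_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # A half-written or unreadable file is skipped this poll.
        if not isinstance(data, dict):
            continue
        heartbeat = snapshot_path.parent / ".agent-heartbeat"
        # Assumption: the heartbeat file's mtime marks the last sign of life.
        age = max(0.0, now - heartbeat.stat().st_mtime) if heartbeat.exists() else None
        data["heartbeatAgeSeconds"] = age
        tasks.append(data)
    return tasks


def get_task(repo_root: Path, task_id: str) -> Optional[Dict[str, Any]]:
    """Return the snapshot whose taskId matches, or None if there is none."""
    for task in list_tasks(repo_root):
        if task.get("taskId") == task_id:
            return task
    return None
```

A route handler for `GET /api/agent-work/tasks` would return `list_tasks(repo_root)`, and the `:taskId` route would return `get_task(repo_root, task_id)` or a 404 when it is `None`.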

4. Add an actionable IDP UI surface

Create a small Agent Work view.

The UI should not be a dashboard dump. It should show what needs attention.

Suggested cards:

  • Needs review
  • Validation failed
  • Stale or possibly hung
  • Ready for local review
  • Completed recently
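
One way to decide which card a task lands on is sketched below. The mapping from task states to cards, the five-minute staleness threshold, and the rule that only in-progress states can be stale are all assumptions, since the issue lists cards and states without linking them. `card_for` is a hypothetical name.

```python
from typing import Optional

# Assumption: states where the agent is waiting or done map directly to a card.
STATE_CARDS = {
    "needs-review": "Needs review",
    "impl-validation-failed": "Validation failed",
    "impl-validated": "Ready for local review",
    "complete-local": "Completed recently",
}

# Assumption: only states where the agent should be working can look hung.
IN_PROGRESS_STATES = frozenset({"worktree-claimed", "planning", "implementing", "ship"})


def card_for(
    status: str,
    heartbeat_age_seconds: Optional[float],
    stale_after_seconds: float = 300.0,
) -> Optional[str]:
    """Pick the card for a task, or None when nothing needs attention."""
    if status in STATE_CARDS:
        return STATE_CARDS[status]
    if (
        status in IN_PROGRESS_STATES
        and heartbeat_age_seconds is not None
        and heartbeat_age_seconds > stale_after_seconds
    ):
        return "Stale or possibly hung"
    return None
```

Returning `None` for a healthy in-progress task keeps the view focused on what needs attention, in line with the "not a dashboard dump" goal.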

Each card should use the IDP insight shape:

Observation
The goose task for issue 123 failed validation.

Why it matters
The implementation is not ready to ship and may need test or contract updates.

What to do
Review the failure summary, then retry implementation or mark the task blocked.
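
A minimal sketch of turning a failed-validation snapshot into that insight shape. The wording for the failed-validation case is taken from the example above; the fallback wording, the dictionary keys, and the names `insight_for` and `render_card` are assumptions.

```python
from typing import Any, Dict


def insight_for(snapshot: Dict[str, Any]) -> Dict[str, str]:
    """Map a task snapshot to Observation / Why it matters / What to do."""
    tool = snapshot.get("tool") or "agent"
    issue = snapshot.get("issueNumber")
    if issue is not None:
        subject = f"issue {issue}"
    else:
        subject = f"task {snapshot.get('taskId', 'unknown')}"

    if snapshot.get("status") == "impl-validation-failed":
        return {
            "observation": f"The {tool} task for {subject} failed validation.",
            "whyItMatters": "The implementation is not ready to ship and may "
            "need test or contract updates.",
            # Prefer the agent's own recommendation when the snapshot has one.
            "whatToDo": snapshot.get("recommendedAction")
            or "Review the failure summary, then retry implementation or "
            "mark the task blocked.",
        }

    # Assumed fallback for states that have no dedicated insight yet.
    return {
        "observation": f"The {tool} task for {subject} is in state "
        f"{snapshot.get('status', 'unknown')}.",
        "whyItMatters": "No specific insight is defined for this state yet.",
        "whatToDo": snapshot.get("recommendedAction") or "No action is recommended.",
    }


def render_card(insight: Dict[str, str]) -> str:
    """Render the insight as the three-part plain-text card."""
    return "\n".join(
        [
            "Observation",
            insight["observation"],
            "",
            "Why it matters",
            insight["whyItMatters"],
            "",
            "What to do",
            insight["whatToDo"],
        ]
    )
```

A fixture with `status` set to `impl-validation-failed` passed through `insight_for` is also a natural shape for the test that shows a failed validation becoming a recommended human action.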

5. Add minimal outcome fields

The first version should capture enough fields to support later measurement:

Utilization:

  • Tool used
  • Task count
  • Phase reached
  • Human review required
  • Retry count if available

Impact:

  • Validation result
  • PR URL if available
  • Commit SHA if available
  • Files changed count if available
  • Linked issue number

Cost:

  • Model provider if available
  • Model name if available
  • Token usage if available
  • Estimated cost if available

If goose OpenTelemetry or MLflow tracing is enabled locally, link to it or note its presence. Do not make it mandatory for the MVP.
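
To illustrate how these fields could support later measurement, the sketch below rolls a set of snapshots up into utilization, impact, and cost figures. It assumes the snapshot keys from the possible schema in this issue; the output keys and the `summarize` name are hypothetical. Cost is reported only for tasks where it is known, with the number of tasks lacking cost data shown alongside.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize(snapshots: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate per-task snapshots without scoring any individual."""
    snaps = list(snapshots)
    costs = [
        s["estimatedCostUsd"] for s in snaps if s.get("estimatedCostUsd") is not None
    ]
    return {
        # Utilization
        "taskCount": len(snaps),
        "byTool": dict(Counter(s.get("tool", "unknown") for s in snaps)),
        "humanReviewRequired": sum(1 for s in snaps if s.get("humanActionRequired")),
        # Impact
        "validationFailed": sum(
            1 for s in snaps if s.get("status") == "impl-validation-failed"
        ),
        "withCommit": sum(1 for s in snaps if s.get("lastCommitSha")),
        # Cost: None when nothing is known, never a misleading zero.
        "knownCostUsd": sum(costs) if costs else None,
        "tasksWithoutCost": len(snaps) - len(costs),
    }
```

Grouping by tool rather than by person keeps the summary within the "no individual productivity scoring" guardrail.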

Acceptance criteria

  • A documented goose dogfood command or guide exists for running an autonomous task locally.
  • The goose workflow preserves the existing worktree, lock, heartbeat, and local_only safety model.
  • At least one local task writes or updates a normalized .agent-task.json snapshot.
  • A local API or fixture reader exposes active agent task snapshots.
  • A UI surface shows active and recent agent work as actionable insights, not raw telemetry.
  • The UI highlights at least these states: needs review, validation failed, stale heartbeat, ready for local review, complete-local.
  • The implementation includes at least one test or fixture proving that a failed validation snapshot becomes a recommended human action.
  • Documentation explains how this MVP supports later utilization, impact, and cost measurement.

Out of scope for this MVP

  • Full cost attribution across all model providers
  • Enterprise policy engine
  • Remote hosted agent execution
  • Automatic PR merge
  • Long-term database persistence
  • Multi-user identity reconciliation
  • Full OpenTelemetry ingestion pipeline

Design guardrails

  • Local first.
  • Human approval by default for write-sensitive actions.
  • No individual productivity scoring.
  • No dashboard dumping.
  • Prefer actionable insight over raw events.
  • Treat agents as extensions of the developer or team directing the work.
  • Show uncertainty when cost, token, or model data is unavailable.

Suggested first dogfood task

Use goose to implement the smallest slice of this issue:

  1. Add a fixture .agent-task.json for a failed validation state.
  2. Add a local reader that loads the fixture.
  3. Add one UI card that renders Observation, Why it matters, and What to do.

That proves the IDP behavior before investing in deeper agent runtime integration.

Metadata

Labels

  • agent-eligible: Suitable for autonomous AI agent processing
  • ai: AI domain
  • enhancement: New feature or request
  • ready: Triaged and ready for work