MVP: dogfood goose autonomous tasks as actionable IDP agent work insights #370

## Observation

The current `autonomous-task` skill already gives us a strong local operating model for AI-assisted development work:

- Plan -> review -> implement -> validate -> ship flow
- Isolated git worktrees per task
- Human or AI review gates
- Lock and heartbeat files for local session safety
- `local_only=true` mode for fully local runs
- OpenCode and Pi comparison hooks

This issue adds a goose dogfooding path and an IDP MVP that turns those task signals into actionable insight.

The goal is not to build another agent.

The goal is to prove the IDP can observe agent work, connect it to delivery outcomes, and answer: what should someone do differently because of this?

## Why it matters

AI coding tools create local productivity signals, but those signals are usually disconnected from delivery outcomes.

A developer, architect, or leader should be able to see:

- Which agent task is active
- Who or what initiated it
- What repo, issue, service, or system it affects
- Whether it is planning, implementing, validating, blocked, or complete
- What human review is needed
- What validation failed
- What cost, model, or token signals are available
- Whether the work created value, risk, or rework

This aligns with the newer AI measurement lens: utilization, impact, and cost. It also aligns with the DORA warning that AI amplifies the surrounding engineering system. More agent activity is not the outcome. The useful signal is whether agent activity improves delivery without increasing instability or downstream chaos.

## MVP scope

Dogfood goose as one supported agent runtime for the existing autonomous task workflow.

Build a small IDP surface that shows actionable agent work insight from local task execution.

This MVP should work locally first. GitHub issue and PR data can be read or linked where available, but the first version must not require hosted infrastructure.

## Proposed user story

As a developer using the IDP repo,
I want to run an autonomous task through goose and see the task state in an IDP view,
so I can understand what the agent is doing, where it is blocked, what needs human review, and whether the work is ready to validate or ship.

## Proposed dogfood workflow

1. Install and configure goose locally.
2. Run goose against a small issue or free-form task using the existing `autonomous-task` skill behavior as the operating contract.
3. Keep the existing worktree, lock, heartbeat, and local-only safety model.
4. Capture a local task event stream or snapshot from the worktree.
5. Render the task as an actionable IDP insight.

Example task states:

- `worktree-claimed`
- `planning`
- `needs-review`
- `implementing`
- `impl-validation-failed`
- `impl-validated`
- `ship`
- `complete-local`
- `failed`
- `blocked`
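
To keep these state strings consistent between the launcher, the snapshot file, and the UI, they could live in one enumeration. The following is a minimal sketch, not the skill's actual implementation: the names `TaskStatus` and `parse_status` are hypothetical, and only the string values come from the list above.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Example task states; the values are the literal strings listed in this issue."""

    WORKTREE_CLAIMED = "worktree-claimed"
    PLANNING = "planning"
    NEEDS_REVIEW = "needs-review"
    IMPLEMENTING = "implementing"
    IMPL_VALIDATION_FAILED = "impl-validation-failed"
    IMPL_VALIDATED = "impl-validated"
    SHIP = "ship"
    COMPLETE_LOCAL = "complete-local"
    FAILED = "failed"
    BLOCKED = "blocked"


def parse_status(raw: str) -> TaskStatus:
    """Convert a raw status string, raising ValueError that lists the allowed values."""
    try:
        return TaskStatus(raw)
    except ValueError:
        allowed = ", ".join(status.value for status in TaskStatus)
        raise ValueError(
            f"unknown task status {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown strings early means a typo in a snapshot surfaces as a clear error instead of a card that silently never renders.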

## Proposed implementation shape

### 1. Add a goose launcher path

Add a small local script, make target, or package command that documents and runs the goose dogfood path.

Possible shape:

```bash
make agent-goose-task ISSUE_NUMBER=123 LOCAL_ONLY=true
make agent-goose-task TASK_DESCRIPTION="small docs update" LOCAL_ONLY=true
```

The launcher should:

- Create or reuse the task worktree
- Set the expected worktree path
- Preserve the `.agent-lock` and `.agent-heartbeat` convention
- Start goose in a mode suitable for local dogfooding
- Prefer approval or smart approval mode for write actions during early dogfooding
- Avoid requiring remote push or PR creation for the MVP
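
The worktree and lock steps above can be sketched as follows. This is an illustration under stated assumptions, not the existing skill's behavior: the helper names (`slugify`, `task_id_for`, `claim_worktree`), the `task-` prefix for free-form tasks, and the contents written to `.agent-lock` and `.agent-heartbeat` are all hypothetical. The sketch only creates a directory; a real launcher would create the git worktree and start goose, both of which are left out here.

```python
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def slugify(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")[:40].strip("-")


def task_id_for(issue_number: Optional[int], description: str) -> str:
    """Build an id shaped like the 'issue-123-example-task' example in this issue."""
    slug = slugify(description)
    if issue_number is not None:
        return f"issue-{issue_number}-{slug}"
    return f"task-{slug}"  # assumption: prefix for free-form tasks


def claim_worktree(repo_root: Path, task_id: str, tool: str = "goose") -> Path:
    """Create or reuse the worktree directory and take an exclusive lock on it."""
    worktree = repo_root / ".agents" / "worktrees" / task_id
    worktree.mkdir(parents=True, exist_ok=True)
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    try:
        # Mode "x" fails if the file exists, so two sessions cannot both claim it.
        with (worktree / ".agent-lock").open("x", encoding="utf-8") as lock:
            lock.write(f"{tool} {now}\n")  # assumption: lock file contents
    except FileExistsError:
        raise RuntimeError(f"{task_id} is already locked by another session") from None
    (worktree / ".agent-heartbeat").write_text(now + "\n", encoding="utf-8")
    return worktree
```

Exclusive file creation is used here because it gives a simple local mutual-exclusion check without extra dependencies; the real convention may differ.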

### 2. Add a normalized agent task snapshot

Create a small JSON snapshot file that can be read by the IDP.

Possible path:

```text
.agents/worktrees/<task-slug>/.agent-task.json
```

Possible schema:

```json
{
  "taskId": "issue-123-example-task",
  "tool": "goose",
  "runtime": "local",
  "issueNumber": 123,
  "taskDescription": "...",
  "worktreePath": ".agents/worktrees/issue-123-example-task",
  "branchName": "task/issue-123-example-task",
  "status": "impl-validation-failed",
  "phase": "validate",
  "humanActionRequired": true,
  "recommendedAction": "Review the validation failure and decide whether to let the agent retry.",
  "lastHeartbeatAt": "2026-05-16T00:00:00Z",
  "lastErrorSummary": "make check failed in contract tests",
  "lastCommitSha": null,
  "validationCommand": "make check",
  "modelProvider": null,
  "modelName": null,
  "tokenUsage": null,
  "estimatedCostUsd": null,
  "links": {
    "issueUrl": null,
    "prUrl": null
  }
}
```

This file is intentionally small. It should support the IDP view before deeper telemetry exists.
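
A reader for this snapshot might validate a few core keys and default everything else to `null`. The sketch below is one possible shape, assuming Python for illustration. Which keys count as required is an assumption, and `parse_snapshot`, `REQUIRED_KEYS`, and `NULLABLE_KEYS` are hypothetical names.

```python
import json
from typing import Any

# Assumption: these keys must always be present; the issue does not say which are required.
REQUIRED_KEYS = ("taskId", "tool", "runtime", "status", "phase", "humanActionRequired")

# Keys from the possible schema that may legitimately be null or missing.
NULLABLE_KEYS = (
    "issueNumber", "taskDescription", "worktreePath", "branchName",
    "recommendedAction", "lastHeartbeatAt", "lastErrorSummary", "lastCommitSha",
    "validationCommand", "modelProvider", "modelName", "tokenUsage", "estimatedCostUsd",
)


def parse_snapshot(raw: str) -> dict[str, Any]:
    """Parse .agent-task.json text and normalize missing optional fields to None."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("snapshot must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if data.get(key) is None]
    if missing:
        raise ValueError(f"snapshot is missing required keys: {', '.join(missing)}")
    if not isinstance(data["humanActionRequired"], bool):
        raise ValueError("humanActionRequired must be true or false")
    for key in NULLABLE_KEYS:
        data.setdefault(key, None)
    links = data.get("links") or {}
    data["links"] = {"issueUrl": links.get("issueUrl"), "prUrl": links.get("prUrl")}
    return data
```

Normalizing absent fields to `None` lets the UI treat "not reported" the same way everywhere instead of checking for missing keys.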

### 3. Add a local BFF endpoint or static fixture reader

Expose active local agent tasks through the existing local dev app path.

Possible endpoints:

```text
GET /api/agent-work/tasks
GET /api/agent-work/tasks/:taskId
```

These endpoints should read `.agent-task.json` and `.agent-heartbeat` files from known local worktree paths.

For the MVP, file polling is acceptable. Do not require a database.
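
The file-polling read behind both endpoints could look roughly like this. It is a sketch only: `list_task_snapshots`, `get_task_snapshot`, and the `heartbeatFileMtime` field are hypothetical, and using the heartbeat file's modification time is an assumption because the heartbeat file format is not specified here.

```python
import json
from pathlib import Path
from typing import Any, Optional


def list_task_snapshots(repo_root: Path) -> list[dict[str, Any]]:
    """Read every .agent-task.json under .agents/worktrees/<task-slug>/."""
    tasks: list[dict[str, Any]] = []
    worktrees = repo_root / ".agents" / "worktrees"
    for snapshot_path in sorted(worktrees.glob("*/.agent-task.json")):
        try:
            data = json.loads(snapshot_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # A half-written file is skipped and picked up on the next poll.
            continue
        if not isinstance(data, dict) or "taskId" not in data:
            continue
        heartbeat = snapshot_path.with_name(".agent-heartbeat")
        # Hypothetical field: seconds-since-epoch mtime of the heartbeat file, if any.
        data["heartbeatFileMtime"] = heartbeat.stat().st_mtime if heartbeat.exists() else None
        tasks.append(data)
    return tasks


def get_task_snapshot(repo_root: Path, task_id: str) -> Optional[dict[str, Any]]:
    """Return one task by id, or None when no snapshot matches."""
    return next(
        (task for task in list_task_snapshots(repo_root) if task["taskId"] == task_id),
        None,
    )
```

Rescanning the directory on every request keeps the reader stateless, which fits the no-database constraint for a handful of local worktrees.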

### 4. Add an actionable IDP UI surface

Create a small Agent Work view.

The UI should not be a dashboard dump. It should show what needs attention.

Suggested cards:

- Needs review
- Validation failed
- Stale or possibly hung
- Ready for local review
- Completed recently
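
One way to decide which card a task belongs to is a small mapping from status plus heartbeat age. Everything in this sketch is an assumption for discussion: the status-to-card mapping, the set of states treated as active, the 10-minute staleness threshold, and the names `card_for`, `CARD_BY_STATUS`, `ACTIVE_STATUSES`, and `STALE_AFTER`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(minutes=10)  # assumption: threshold is not defined in this issue

# Assumption: states in which the agent is expected to keep writing heartbeats.
ACTIVE_STATUSES = {"worktree-claimed", "planning", "implementing"}

# Assumption: one plausible mapping from task status to the suggested cards.
CARD_BY_STATUS = {
    "needs-review": "Needs review",
    "impl-validation-failed": "Validation failed",
    "impl-validated": "Ready for local review",
    "complete-local": "Completed recently",
}


def card_for(status: str, last_heartbeat_at: Optional[str], now: datetime) -> Optional[str]:
    """Return the card title for a task, or None if it needs no attention."""
    if status in ACTIVE_STATUSES and last_heartbeat_at:
        # Timestamps in the snapshot example end in 'Z'; make that parseable everywhere.
        beat = datetime.fromisoformat(last_heartbeat_at.replace("Z", "+00:00"))
        if now - beat > STALE_AFTER:
            return "Stale or possibly hung"
    return CARD_BY_STATUS.get(status)
```

Returning `None` for healthy in-progress tasks keeps the view focused on what needs attention rather than listing every task.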

Each card should use the IDP insight shape:

```text
Observation
The goose task for issue 123 failed validation.

Why it matters
The implementation is not ready to ship and may need test or contract updates.

What to do
Review the failure summary, then retry implementation or mark the task blocked.
```
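
Turning a failed-validation snapshot into that shape could be as small as the sketch below. The function names `insight_for` and `render` and the output keys are hypothetical; the wording reuses the example card above and the snapshot fields from the possible schema.

```python
from typing import Any, Optional

DEFAULT_ACTION = "Review the failure summary, then retry implementation or mark the task blocked."


def insight_for(snapshot: dict[str, Any]) -> Optional[dict[str, str]]:
    """Build an Observation / Why it matters / What to do card for a failed validation."""
    if snapshot.get("status") != "impl-validation-failed":
        return None  # other states would get their own wording
    issue = snapshot.get("issueNumber")
    subject = f"issue {issue}" if issue is not None else snapshot["taskId"]
    observation = f"The {snapshot['tool']} task for {subject} failed validation."
    if snapshot.get("lastErrorSummary"):
        observation += f" Last error: {snapshot['lastErrorSummary']}."
    return {
        "observation": observation,
        "whyItMatters": "The implementation is not ready to ship and may need test or contract updates.",
        # Prefer the action recorded in the snapshot; fall back to a generic one.
        "whatToDo": snapshot.get("recommendedAction") or DEFAULT_ACTION,
    }


def render(insight: dict[str, str]) -> str:
    """Format the card as plain text in the three-part insight shape."""
    return (
        f"Observation\n{insight['observation']}\n\n"
        f"Why it matters\n{insight['whyItMatters']}\n\n"
        f"What to do\n{insight['whatToDo']}"
    )
```

A function like this is also a natural target for the test named in the acceptance criteria, since it maps a fixture directly to a recommended human action.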

### 5. Add minimal outcome fields

The first version should capture enough fields to support later measurement:

Utilization:

- Tool used
- Task count
- Phase reached
- Human review required
- Retry count if available

Impact:

- Validation result
- PR URL if available
- Commit SHA if available
- Files changed count if available
- Linked issue number

Cost:

- Model provider if available
- Model name if available
- Token usage if available
- Estimated cost if available

If goose OpenTelemetry or MLflow tracing is enabled locally, link to it or note its presence. Do not make it mandatory for the MVP.
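
The grouping above can be sketched as a small projection over the snapshot that marks absent values explicitly instead of hiding them. The function name `outcome_summary` and the `"unavailable"` marker are hypothetical, and only fields that already appear in the possible schema are used, so task count, retry count, and files changed are omitted.

```python
from typing import Any

UNAVAILABLE = "unavailable"  # assumption: marker shown when a signal was not captured


def outcome_summary(snapshot: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Group snapshot fields into utilization, impact, and cost buckets."""

    def known(key: str, source: dict[str, Any] = snapshot) -> Any:
        value = source.get(key)
        # Only None counts as missing, so False and 0 are kept as real values.
        return UNAVAILABLE if value is None else value

    links = snapshot.get("links") or {}
    return {
        "utilization": {
            "tool": known("tool"),
            "phase": known("phase"),
            "humanActionRequired": known("humanActionRequired"),
        },
        "impact": {
            "status": known("status"),
            "issueNumber": known("issueNumber"),
            "lastCommitSha": known("lastCommitSha"),
            "prUrl": known("prUrl", links),
        },
        "cost": {
            "modelProvider": known("modelProvider"),
            "modelName": known("modelName"),
            "tokenUsage": known("tokenUsage"),
            "estimatedCostUsd": known("estimatedCostUsd"),
        },
    }
```

Keeping an explicit marker for missing cost data avoids implying that a task was free or used no tokens when the runtime simply did not report them.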

## Acceptance criteria

- [ ] A documented goose dogfood command or guide exists for running an autonomous task locally.
- [ ] The goose workflow preserves the existing worktree, lock, heartbeat, and `local_only` safety model.
- [ ] At least one local task writes or updates a normalized `.agent-task.json` snapshot.
- [ ] A local API or fixture reader exposes active agent task snapshots.
- [ ] A UI surface shows active and recent agent work as actionable insights, not raw telemetry.
- [ ] The UI highlights at least these states: needs review, validation failed, stale heartbeat, ready for local review, complete-local.
- [ ] The implementation includes at least one test or fixture proving that a failed validation snapshot becomes a recommended human action.
- [ ] Documentation explains how this MVP supports later utilization, impact, and cost measurement.

## Out of scope for this MVP

- Full cost attribution across all model providers
- Enterprise policy engine
- Remote hosted agent execution
- Automatic PR merge
- Long-term database persistence
- Multi-user identity reconciliation
- Full OpenTelemetry ingestion pipeline

## Design guardrails

- Local first.
- Human approval by default for write-sensitive actions.
- No individual productivity scoring.
- No dashboard dumping.
- Prefer actionable insight over raw events.
- Treat agents as extensions of the developer or team directing the work.
- Show uncertainty when cost, token, or model data is unavailable.

## References

- Existing skill: `.agents/skills/autonomous-task/SKILL.md`
- goose docs: https://goose-docs.ai/docs/quickstart/
- goose permissions: https://goose-docs.ai/docs/guides/goose-permissions/
- goose observability with MLflow and OTLP: https://goose-docs.ai/docs/tutorials/mlflow/

## Suggested first dogfood task

Use goose to implement the smallest slice of this issue:

1. Add a fixture `.agent-task.json` for a failed validation state.
2. Add a local reader that loads the fixture.
3. Add one UI card that renders Observation, Why it matters, and What to do.

That proves the IDP behavior before investing in deeper agent runtime integration.
