Skip to content

feat: add lightweight task tracking for multi-step execution#28

Open
Davidwadesmith wants to merge 2 commits into
LiuMengxuan04:mainfrom
Davidwadesmith:feature/task-tracking
Open

feat: add lightweight task tracking for multi-step execution#28
Davidwadesmith wants to merge 2 commits into
LiuMengxuan04:mainfrom
Davidwadesmith:feature/task-tracking

Conversation

@Davidwadesmith

Copy link
Copy Markdown
Contributor

Summary

Adds a task_tracker tool that lets the model self-manage a lightweight task list during long multi-step work. Addresses ROADMAP P1 #7.

What changed:

  • src/task-state.ts (new): TaskStatus/Task/TaskState types, state CRUD, snapshot serialization, formatTaskList
  • src/tools/task-tracker.ts (new): single tool with 4 actions — create, update_status, complete, list. Zod validation + JSON Schema for API
  • test/task-tracker.test.ts (new): 15 unit tests covering state transitions, snapshot round-trip, formatting
  • src/session.ts: task_snapshot JSONL event type, appendTaskSnapshot, loadTaskState (follows loadContextCollapseState pattern)
  • src/tty-app.ts: TaskState in TtyAppArgs, /tasks handler, post-task_tracker side effects (transcript entry + snapshot persist), resume/fork/new integration
  • src/tui/types.ts + src/tui/transcript.ts: task_update transcript entry kind with created/updated/completed icons
  • src/tui/chrome.ts: [tasks] N/M badge in header banner
  • src/cli-commands.ts: /tasks slash command
  • src/prompt.ts: task tracking instructions in system prompt
  • src/agent-loop.ts: pass toolInput to onToolResult callback (backward-compatible)
  • src/tools/index.ts + src/index.ts: wire TaskState through tool registry and TTY app

Why it stays lightweight:

  • ~170 lines of new production code across 2 new files
  • No new dependencies (uses existing zod)
  • Mutable-state-in-a-box pattern matches ContextCollapseState
  • Flat task list, no subtasks, no priority ordering
  • Session-scoped, no cross-session state

How it aligns with Claude Code:

  • Similar to Claude Code's TodoWrite tool concept
  • Same agent-self-management pattern (model creates/completes, user observes)
  • Follows existing MiniCode tool/session/TUI conventions exactly

Verification

  • npm run check passes (tsc --noEmit)
  • 167/167 tests pass (including 15 new task-tracker tests)
  • Smoke tested: model creates tasks, header badge updates, /tasks displays list, session resume restores tasks

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 9, 2026 07:47

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds session-scoped task tracking to MiniCode so the agent can create/update/complete/list a lightweight task list, surface it in the TUI (header badge + transcript entries), and persist/restore it via session JSONL snapshots.

Changes:

  • Introduces TaskState/snapshot utilities and a new task_tracker tool for managing tasks.
  • Persists task snapshots into session JSONL and restores task state on resume; adds /tasks command and TUI rendering/badge.
  • Extends agent loop callback plumbing to pass toolInput into onToolResult for richer UI side effects.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
test/task-tracker.test.ts Adds unit tests for task state transitions, snapshot round-trip, and formatting.
src/task-state.ts Implements TaskState CRUD, allowed transitions, snapshot serialization, and task list formatting.
src/tools/task-tracker.ts Adds the task_tracker tool (create/update_status/complete/list) with Zod validation + JSON schema.
src/tools/index.ts Wires task_tracker into the default tool registry when a TaskState is provided.
src/session.ts Adds task_snapshot session event, append + load helpers, and transcript representation on replay.
src/tty-app.ts Threads TaskState through TTY app, adds /tasks, renders task updates, and persists snapshots after tool use.
src/tui/types.ts Adds task_update transcript entry kind.
src/tui/transcript.ts Renders task_update entries with icons/colors.
src/tui/chrome.ts Adds [tasks] N/M header badge support.
src/cli-commands.ts Adds /tasks slash command metadata.
src/prompt.ts Updates system prompt with task tracking instructions for the agent.
src/agent-loop.ts Passes toolInput into onToolResult callback (backward compatible).
src/index.ts Creates initial TaskState and passes it into tool registry + TTY app.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/tty-app.ts
createContextCollapseState()
const loadedTaskState = await loadTaskState(args.cwd, sessionId)
if (loadedTaskState) {
args.taskState = loadedTaskState
Comment thread src/tty-app.ts
args.sessionId = crypto.randomUUID().slice(0, 8)
args.alreadySavedCount = 0
args.contextCollapseState = createContextCollapseState()
args.taskState = createTaskState()
Comment thread src/tty-app.ts
Comment on lines +1346 to +1348
state.transcriptScrollOffset = 0
void appendTaskSnapshot(args.cwd, args.sessionId, toSnapshot(args.taskState))
}
Comment thread src/tui/transcript.ts
: entry.action === 'completed' ? '✓'
: '→'
const color = entry.action === 'completed' ? GREEN : YELLOW
return `${DIM}task${RESET} ${color}${icon}${RESET} ${entry.taskSummary}`
Comment thread src/session.ts
import path from 'node:path'
import { MINI_CODE_PROJECTS_DIR } from './config.js'
import {
createTaskState,
@GateJustice

Copy link
Copy Markdown
Collaborator

Looks good overall. The design feels appropriately lightweight and fits the existing MiniCode session/TUI patterns.

Before merging, I’d like to confirm a few edge cases around state consistency:

  • Does loadTaskState always restore the latest valid task_snapshot, and tolerate older sessions with no snapshot or malformed snapshots?
  • Are task ids stable across createupdate_status / complete → snapshot → resume?
  • What is the intended behavior for invalid updates, e.g. unknown task id, completing an already completed task, or reopening a completed task?
  • Are the transcript entry and snapshot persistence ordered so that TUI-visible state cannot diverge from resume state?
  • Is the task_tracker tool covered in non-TTY/headless mode, not just through /tasks and TUI badge behavior?
  • Do the prompt instructions avoid overusing task tracking for trivial one-step requests?

If these are already covered, it would be helpful to mention them in the PR description. Otherwise I’d suggest adding targeted tests for latest-snapshot restore, invalid task id handling, completed-task behavior, and non-TTY task tracker usage.

@Davidwadesmith

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review. Answers inline:

1. loadTaskState robustness — Yes. It scans backwards to find the last compact_boundary, then forwards from there to collect the last task_snapshot (session.ts:482-499). Malformed lines are handled by parseEvent returning null which gets filtered out; the entire function is wrapped in try/catch returning null on unrecoverable errors. Old sessions without snapshots simply return null and the app creates a fresh empty TaskState.

2. Task ID stability — Yes. fromSnapshot restores both the tasks array and nextId counter from the snapshot (task-state.ts:78-83). Since addTask always consumes nextId and increments it before toSnapshot is called (the snapshot happens after the tool returns), IDs are monotonically increasing and guaranteed not to collide across resume. Covered by the existing toSnapshot/fromSnapshot round-trips state test.

3. Invalid transitionscompleted is a terminal state with no outgoing transitions (task-state.ts:26). All three cases return { ok: false, output: "..." }:

  • Unknown id → "Task #N not found"
  • Already completed → "Task #N is completed and cannot transition to completed"
  • Reopen completed → "Task #N is completed and cannot transition to pending"

Failed calls skip both transcript entry and snapshot persistence (!isError guard at tty-app.ts:1335). Covered by transitionTask tests for all three rejection cases.

4. Transcript/snapshot ordering — Transcript entry is pushed synchronously first, then appendTaskSnapshot is called fire-and-forget with .catch(). TUI state is always up-to-date in-memory. The snapshot write is best-effort — if it fails, the in-memory state is correct for the current session but would be lost on resume. Added .catch() in the follow-up commit to prevent unhandled promise rejections.

5. Non-TTY modetaskState is created before the TTY/non-TTY branch (index.ts:69), so the tool is registered and functional in both modes. However, the non-TTY path has no session persistence at all — no sessionId, no saveSession. Task snapshot persistence is therefore a TTY-only concern, consistent with the existing architecture where session management is part of the TTY app lifecycle. This is not a regression from the task tracking feature.

6. Prompt instructions — Added a guard in the follow-up commit: "Only use task_tracker when the work involves 3 or more distinct steps. Skip it for simple single-step requests." Combined with the existing "Keep tasks at a high level (3-10 items)" guidance, this should prevent overuse.

Follow-up commit: 6726a06
🤖 Generated with Claude Code

Davidwadesmith and others added 2 commits May 11, 2026 19:46
Add a task_tracker tool that lets the model self-manage a task list
during long multi-step work. Tasks persist with the session and are
visible via a header banner badge, transcript entries, and /tasks.

- New task-state.ts: TaskStatus/Task/TaskState types, CRUD + snapshot
- New task-tracker.ts tool: create/update_status/complete/list actions
- Session persistence: task_snapshot JSONL events, resume/fork support
- TUI: [tasks] N/M banner badge, task_update transcript entries
- /tasks slash command for full task list view
- System prompt instructions guiding model usage
- agent-loop: pass toolInput to onToolResult callback

ROADMAP P1 LiuMengxuan04#7. 167/167 tests pass, tsc --noEmit clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add .catch() to appendTaskSnapshot fire-and-forget to prevent
  unhandled promise rejection
- Add prompt guard: skip task_tracker for simple single-step requests
  (only use when work involves 3+ distinct steps)

Note: non-TTY headless mode has no session persistence at all (no
sessionId, no saveSession), so task snapshot persistence is a TTY-only
concern. This matches the existing architecture where session management
is part of the TTY app lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Davidwadesmith Davidwadesmith force-pushed the feature/task-tracking branch from 6726a06 to d930c51 Compare May 11, 2026 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants