diff --git a/AGENTS.md b/AGENTS.md index d87217453..fcb8e96f8 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -44,6 +44,6 @@ bun run skill:check # health dashboard for all skills ## Key conventions - SKILL.md files are **generated** from `.tmpl` templates. Edit the template, not the output. -- Run `bun run gen:skill-docs --host codex` to regenerate Codex-specific output. +- Run `bun run gen:skill-docs --host codex` to regenerate Codex output (Copilot derives from this at setup time). - The browse binary provides headless browser access. Use `$B ` in skills. - Safety skills (careful, freeze, guard) use inline advisory prose — always confirm before destructive operations. diff --git a/CHANGELOG.md b/CHANGELOG.md index 0ed769e54..dbd496392 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,21 @@ # Changelog +## [0.11.10.0] - 2026-03-23 — Copilot CLI Support + +### Added + +- **gstack now works with GitHub Copilot CLI.** Run `./setup --host copilot` and all 28 skills install to `~/.copilot/skills/`. Works the same as Codex — skills live in `.agents/skills/` and are discovered automatically. +- **Auto-detection finds Copilot CLI.** `./setup --host auto` now detects `copilot` alongside Claude Code, Codex, and Kiro. Install once, works everywhere. +- **Session discovery includes Copilot.** The global discover tool (`bin/gstack-global-discover.ts`) scans `~/.copilot/session-state/` so `/retro` and cross-project dashboards count Copilot sessions. +- **Health checks cover Copilot.** `bun run skill:check` and CI now verify Copilot skill freshness alongside Claude and Codex. + +### For contributors + +- Added `'copilot'` host type to `gen-skill-docs.ts`, `setup`, `gstack-global-discover.ts`, and `skill-check.ts`. +- New E2E test infrastructure: `copilot-e2e.test.ts` and `copilot-session-runner.ts` paralleling Codex equivalents. +- Updated `CONTRIBUTING.md` "Dual-host" → "Multi-host" with Copilot generation commands and testing guidance. 
+- Targets the standalone GA Copilot CLI (`copilot` binary via `npm install -g @github/copilot`), not the legacy `gh copilot` extension. + ## [0.11.9.0] - 2026-03-23 — Codex Skill Loading Fix ### Fixed diff --git a/CLAUDE.md b/CLAUDE.md index 5c0389c1f..bc603356a 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -22,6 +22,7 @@ bun run eval:summary # aggregate stats across all eval runs `test:evals` requires `ANTHROPIC_API_KEY`. Codex E2E tests (`test/codex-e2e.test.ts`) use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var needed. +Copilot E2E tests (`test/copilot-e2e.test.ts`) require the standalone Copilot CLI (`npm install -g @github/copilot`). E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json --verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison against the previous run. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index c4c315716..4f43b88c2 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -213,11 +213,11 @@ SKILL.md files are **generated** from `.tmpl` templates. Don't edit the `.md` di # 1. Edit the template vim SKILL.md.tmpl # or browse/SKILL.md.tmpl -# 2. Regenerate for both hosts +# 2. Regenerate for all hosts bun run gen:skill-docs bun run gen:skill-docs --host codex -# 3. Check health (reports both Claude and Codex) +# 3. Check health (reports Claude and Codex; Copilot derives from Codex at setup time) bun run skill:check # Or use watch mode — auto-regenerates on save @@ -228,17 +228,17 @@ For template authoring best practices (natural language over bash-isms, dynamic To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild. -## Dual-host development (Claude + Codex) +## Multi-host development (Claude + Codex + Copilot) -gstack generates SKILL.md files for two hosts: **Claude** (`.claude/skills/`) and **Codex** (`.agents/skills/`). 
Every template change needs to be generated for both. +gstack generates SKILL.md files for multiple hosts: **Claude** (`.claude/skills/`), **Codex** (`.agents/skills/`), and **Copilot** (`.agents/skills/`). Codex and Copilot share the same `.agents/skills/` output directory — only `--host codex` needs to be run during generation. Copilot-specific paths are rewritten at `setup --host copilot` time via sed. Every template change needs to be generated for Claude and Codex. -### Generating for both hosts +### Generating for all hosts ```bash # Generate Claude output (default) bun run gen:skill-docs -# Generate Codex output +# Generate Codex output (Copilot shares this — paths are rewritten at setup time) bun run gen:skill-docs --host codex # --host agents is an alias for --host codex @@ -248,37 +248,37 @@ bun run build ### What changes between hosts -| Aspect | Claude | Codex | -|--------|--------|-------| -| Output directory | `{skill}/SKILL.md` | `.agents/skills/gstack-{skill}/SKILL.md` (generated at setup, gitignored) | -| Frontmatter | Full (name, description, allowed-tools, hooks, version) | Minimal (name + description only) | -| Paths | `~/.claude/skills/gstack` | `$GSTACK_ROOT` (`.agents/skills/gstack` in a repo, otherwise `~/.codex/skills/gstack`) | -| Hook skills | `hooks:` frontmatter (enforced by Claude) | Inline safety advisory prose (advisory only) | -| `/codex` skill | Included (Claude wraps codex exec) | Excluded (self-referential) | +| Aspect | Claude | Codex | Copilot | +|--------|--------|-------|---------| +| Output directory | `{skill}/SKILL.md` | `.agents/skills/gstack-{skill}/SKILL.md` (generated at setup, gitignored) | `.agents/skills/gstack-{skill}/SKILL.md` (generated at setup, gitignored) | +| Frontmatter | Full (name, description, allowed-tools, hooks, version) | Minimal (name + description only) | Minimal (name + description only) | +| Paths | `~/.claude/skills/gstack` | `$GSTACK_ROOT` (`.agents/skills/gstack` in a repo, otherwise 
`~/.codex/skills/gstack`) | `$GSTACK_ROOT` (`.agents/skills/gstack` in a repo, otherwise `~/.copilot/skills/gstack`) | +| Hook skills | `hooks:` frontmatter (enforced by Claude) | Inline safety advisory prose (advisory only) | Inline safety advisory prose (advisory only) | +| `/codex` skill | Included (Claude wraps codex exec) | Excluded (self-referential) | Excluded (shares Codex output) | -### Testing Codex output +### Testing Codex and Copilot output ```bash -# Run all static tests (includes Codex validation) +# Run all static tests (includes Codex and Copilot validation) bun test -# Check freshness for both hosts +# Check freshness for all hosts bun run gen:skill-docs --dry-run bun run gen:skill-docs --host codex --dry-run -# Health dashboard covers both hosts +# Health dashboard covers all hosts bun run skill:check ``` ### Dev setup for .agents/ -When you run `bin/dev-setup`, it creates symlinks in both `.claude/skills/` and `.agents/skills/` (if applicable), so Codex-compatible agents can discover your dev skills too. The `.agents/` directory is generated at setup time from `.tmpl` templates — it is gitignored and not committed. +When you run `bin/dev-setup`, it creates symlinks in both `.claude/skills/` and `.agents/skills/` (if applicable), so Codex, Copilot, and other compatible agents can discover your dev skills too. The `.agents/` directory is generated at setup time from `.tmpl` templates — it is gitignored and not committed. ### Adding a new skill -When you add a new skill template, both hosts get it automatically: +When you add a new skill template, all hosts get it automatically: 1. Create `{skill}/SKILL.md.tmpl` -2. Run `bun run gen:skill-docs` (Claude output) and `bun run gen:skill-docs --host codex` (Codex output) +2. Run `bun run gen:skill-docs` (Claude output) and `bun run gen:skill-docs --host codex` (Codex output — Copilot derives from this at setup time) 3. The dynamic template discovery picks it up — no static list to update 4. 
Commit `{skill}/SKILL.md` — `.agents/` is generated at setup time and gitignored diff --git a/README.md b/README.md index 253d54252..6bad1e848 100644 --- a/README.md +++ b/README.md @@ -54,7 +54,7 @@ Open Claude Code and paste this. Claude does the rest. Real files get committed to your repo (not a submodule), so `git clone` just works. Everything lives inside `.claude/`. Nothing touches your PATH or runs in the background. -### Codex, Gemini CLI, or Cursor +### Codex, Gemini CLI, Copilot CLI, or Cursor gstack works on any agent that supports the [SKILL.md standard](https://github.com/anthropics/claude-code). Skills live in `.agents/skills/` and are discovered automatically. @@ -62,20 +62,21 @@ Install to one repo: ```bash git clone https://github.com/garrytan/gstack.git .agents/skills/gstack -cd .agents/skills/gstack && ./setup --host codex +cd .agents/skills/gstack && ./setup --host codex # or --host copilot ``` -When setup runs from `.agents/skills/gstack`, it installs the generated Codex skills next to it in the same repo and does not write to `~/.codex/skills`. +When setup runs from `.agents/skills/gstack`, it installs the generated Codex and Copilot skills next to it in the same repo and does not write to `~/.codex/skills` or `~/.copilot/skills`. Install once for your user account: ```bash git clone https://github.com/garrytan/gstack.git ~/gstack -cd ~/gstack && ./setup --host codex +cd ~/gstack && ./setup --host codex # or --host copilot ``` `setup --host codex` creates the runtime root at `~/.codex/skills/gstack` and -links the generated Codex skills at the top level. This avoids duplicate skill +links the generated Codex skills at the top level. `setup --host copilot` does +the same at `~/.copilot/skills/gstack`. This avoids duplicate skill discovery from the source repo checkout. 
Or let setup auto-detect which agents you have installed: @@ -85,7 +86,7 @@ git clone https://github.com/garrytan/gstack.git ~/gstack cd ~/gstack && ./setup --host auto ``` -For Codex-compatible hosts, setup now supports both repo-local installs from `.agents/skills/gstack` and user-global installs from `~/.codex/skills/gstack`. All 28 skills work across all supported agents. Hook-based safety skills (careful, freeze, guard) use inline safety advisory prose on non-Claude hosts. +For Codex, Copilot, and other compatible hosts, setup supports both repo-local installs from `.agents/skills/gstack` and user-global installs from `~/.codex/skills/gstack` or `~/.copilot/skills/gstack`. All 28 skills work across all supported agents. Hook-based safety skills (careful, freeze, guard) use inline safety advisory prose on non-Claude hosts. ## See it work @@ -156,7 +157,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan- | `/canary` | **SRE** | Post-deploy monitoring loop. Watches for console errors, performance regressions, and page failures. | | `/benchmark` | **Performance Engineer** | Baseline page load times, Core Web Vitals, and resource sizes. Compare before/after on every PR. | | `/document-release` | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. | -| `/retro` | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. `/retro global` runs across all your projects and AI tools (Claude Code, Codex, Gemini). | +| `/retro` | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. `/retro global` runs across all your projects and AI tools (Claude Code, Codex, Gemini, Copilot). | | `/browse` | **QA Engineer** | Real Chromium browser, real clicks, real screenshots. ~100ms per command. 
| | `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. | | `/autoplan` | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. | diff --git a/VERSION b/VERSION index b1d9a7913..6bfbae754 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.11.9.0 +0.11.10.0 diff --git a/bin/gstack-global-discover.ts b/bin/gstack-global-discover.ts index e6c64f561..2e1670041 100644 --- a/bin/gstack-global-discover.ts +++ b/bin/gstack-global-discover.ts @@ -1,6 +1,6 @@ #!/usr/bin/env bun /** - * gstack-global-discover — Discover AI coding sessions across Claude Code, Codex CLI, and Gemini CLI. + * gstack-global-discover — Discover AI coding sessions across Claude Code, Codex CLI, Gemini CLI, and Copilot CLI. * Resolves each session's working directory to a git repo, deduplicates by normalized remote URL, * and outputs structured JSON to stdout. 
* @@ -17,7 +17,7 @@ import { homedir } from "os"; // ── Types ────────────────────────────────────────────────────────────────── interface Session { - tool: "claude_code" | "codex" | "gemini"; + tool: "claude_code" | "codex" | "gemini" | "copilot"; cwd: string; } @@ -25,7 +25,7 @@ interface Repo { name: string; remote: string; paths: string[]; - sessions: { claude_code: number; codex: number; gemini: number }; + sessions: { claude_code: number; codex: number; gemini: number; copilot: number }; } interface DiscoveryResult { @@ -36,6 +36,7 @@ interface DiscoveryResult { claude_code: { total_sessions: number; repos: number }; codex: { total_sessions: number; repos: number }; gemini: { total_sessions: number; repos: number }; + copilot: { total_sessions: number; repos: number }; }; total_sessions: number; total_repos: number; @@ -440,7 +441,69 @@ function scanGemini(since: Date): Session[] { return sessions; } -// ── Deduplication ────────────────────────────────────────────────────────── +function scanCopilot(since: Date): Session[] { + // GitHub Copilot CLI (standalone) stores session data in ~/.copilot/session-state/ + // Each session is a subdirectory: ~/.copilot/session-state/{session-id}/ + // containing events.jsonl, workspace.yaml, and other session files. 
+ const sessionStateDir = join(homedir(), ".copilot", "session-state"); + if (!existsSync(sessionStateDir)) return []; + + const sessions: Session[] = []; + + try { + const sessionDirs = readdirSync(sessionStateDir); + for (const sessionId of sessionDirs) { + const sessionDir = join(sessionStateDir, sessionId); + try { + const dirStat = statSync(sessionDir); + if (!dirStat.isDirectory() || dirStat.mtime < since) continue; + + // Look for session files within the subdirectory + const sessionFiles = readdirSync(sessionDir); + let found = false; + + for (const file of sessionFiles) { + if (found) break; + const filePath = join(sessionDir, file); + try { + const fileStat = statSync(filePath); + if (!fileStat.isFile()) continue; + if (!file.endsWith(".json") && !file.endsWith(".jsonl") && !file.endsWith(".yaml")) continue; + + const fd = openSync(filePath, "r"); + const buf = Buffer.alloc(4096); + const bytesRead = readSync(fd, buf, 0, 4096, 0); + closeSync(fd); + const text = buf.toString("utf-8", 0, bytesRead); + + // Look for cwd in JSON/JSONL metadata + for (const line of text.split("\n").slice(0, 15)) { + if (!line.trim()) continue; + try { + const obj = JSON.parse(line); + if (obj.cwd && existsSync(obj.cwd)) { + sessions.push({ tool: "copilot", cwd: obj.cwd }); + found = true; + break; + } + } catch { + continue; + } + } + } catch { + continue; + } + } + } catch { + continue; + } + } + } catch { + // Directory read error + } + + return sessions; +} async function resolveAndDeduplicate(sessions: Session[]): Promise { // Group sessions by cwd @@ -496,7 +559,7 @@ async function resolveAndDeduplicate(sessions: Session[]): Promise { } } - const sessionCounts = { claude_code: 0, codex: 0, gemini: 0 }; + const sessionCounts = { claude_code: 0, codex: 0, gemini: 0, copilot: 0 }; for (const s of data.sessions) { sessionCounts[s.tool]++; } @@ -512,8 +575,8 @@ async function resolveAndDeduplicate(sessions: Session[]): Promise { // Sort by total sessions descending 
repos.sort( (a, b) => - b.sessions.claude_code + b.sessions.codex + b.sessions.gemini - - (a.sessions.claude_code + a.sessions.codex + a.sessions.gemini) + b.sessions.claude_code + b.sessions.codex + b.sessions.gemini + b.sessions.copilot - + (a.sessions.claude_code + a.sessions.codex + a.sessions.gemini + a.sessions.copilot) ); return repos; @@ -530,12 +593,13 @@ async function main() { const ccSessions = scanClaudeCode(sinceDate); const codexSessions = scanCodex(sinceDate); const geminiSessions = scanGemini(sinceDate); + const copilotSessions = scanCopilot(sinceDate); - const allSessions = [...ccSessions, ...codexSessions, ...geminiSessions]; + const allSessions = [...ccSessions, ...codexSessions, ...geminiSessions, ...copilotSessions]; // Summary to stderr console.error( - `Discovered: ${ccSessions.length} CC sessions, ${codexSessions.length} Codex sessions, ${geminiSessions.length} Gemini sessions` + `Discovered: ${ccSessions.length} CC sessions, ${codexSessions.length} Codex sessions, ${geminiSessions.length} Gemini sessions, ${copilotSessions.length} Copilot sessions` ); // Deduplicate @@ -547,6 +611,7 @@ async function main() { const ccRepos = new Set(repos.filter((r) => r.sessions.claude_code > 0).map((r) => r.remote)).size; const codexRepos = new Set(repos.filter((r) => r.sessions.codex > 0).map((r) => r.remote)).size; const geminiRepos = new Set(repos.filter((r) => r.sessions.gemini > 0).map((r) => r.remote)).size; + const copilotRepos = new Set(repos.filter((r) => r.sessions.copilot > 0).map((r) => r.remote)).size; const result: DiscoveryResult = { window: since, @@ -556,6 +621,7 @@ async function main() { claude_code: { total_sessions: ccSessions.length, repos: ccRepos }, codex: { total_sessions: codexSessions.length, repos: codexRepos }, gemini: { total_sessions: geminiSessions.length, repos: geminiRepos }, + copilot: { total_sessions: copilotSessions.length, repos: copilotRepos }, }, total_sessions: allSessions.length, total_repos: repos.length, @@ 
-566,15 +632,16 @@ async function main() { } else { // Summary format console.log(`Window: ${since} (since ${startDate})`); - console.log(`Sessions: ${allSessions.length} total (CC: ${ccSessions.length}, Codex: ${codexSessions.length}, Gemini: ${geminiSessions.length})`); + console.log(`Sessions: ${allSessions.length} total (CC: ${ccSessions.length}, Codex: ${codexSessions.length}, Gemini: ${geminiSessions.length}, Copilot: ${copilotSessions.length})`); console.log(`Repos: ${repos.length} unique`); console.log(""); for (const repo of repos) { - const total = repo.sessions.claude_code + repo.sessions.codex + repo.sessions.gemini; + const total = repo.sessions.claude_code + repo.sessions.codex + repo.sessions.gemini + repo.sessions.copilot; const tools = []; if (repo.sessions.claude_code > 0) tools.push(`CC:${repo.sessions.claude_code}`); if (repo.sessions.codex > 0) tools.push(`Codex:${repo.sessions.codex}`); if (repo.sessions.gemini > 0) tools.push(`Gemini:${repo.sessions.gemini}`); + if (repo.sessions.copilot > 0) tools.push(`Copilot:${repo.sessions.copilot}`); console.log(` ${repo.name} (${total} sessions) — ${tools.join(", ")}`); console.log(` Remote: ${repo.remote}`); console.log(` Paths: ${repo.paths.join(", ")}`); diff --git a/package.json b/package.json index b24b52535..a7096f4f1 100644 --- a/package.json +++ b/package.json @@ -1,7 +1,7 @@ { "name": "gstack", "version": "0.11.9.0", - "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", + "description": "Garry's Stack — Claude Code / Codex / Copilot CLI skills + fast headless browser. 
One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", "bin": { @@ -12,16 +12,18 @@ "gen:skill-docs": "bun run scripts/gen-skill-docs.ts", "dev": "bun run browse/src/cli.ts", "server": "bun run browse/src/server.ts", - "test": "bun test browse/test/ test/ --ignore 'test/skill-e2e-*.test.ts' --ignore test/skill-llm-eval.test.ts --ignore test/skill-routing-e2e.test.ts --ignore test/codex-e2e.test.ts --ignore test/gemini-e2e.test.ts", - "test:evals": "EVALS=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-llm-eval.test.ts test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts", - "test:evals:all": "EVALS=1 EVALS_ALL=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-llm-eval.test.ts test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts", - "test:e2e": "EVALS=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts", - "test:e2e:all": "EVALS=1 EVALS_ALL=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts", + "test": "bun test browse/test/ test/ --ignore 'test/skill-e2e-*.test.ts' --ignore test/skill-llm-eval.test.ts --ignore test/skill-routing-e2e.test.ts --ignore test/codex-e2e.test.ts --ignore test/gemini-e2e.test.ts --ignore test/copilot-e2e.test.ts", + "test:evals": "EVALS=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-llm-eval.test.ts test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts test/copilot-e2e.test.ts", + "test:evals:all": "EVALS=1 EVALS_ALL=1 bun test --retry 2 --concurrent --max-concurrency 
${EVALS_CONCURRENCY:-15} test/skill-llm-eval.test.ts test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts test/copilot-e2e.test.ts", + "test:e2e": "EVALS=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts test/copilot-e2e.test.ts", + "test:e2e:all": "EVALS=1 EVALS_ALL=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts test/codex-e2e.test.ts test/gemini-e2e.test.ts test/copilot-e2e.test.ts", "test:e2e:fast": "EVALS=1 EVALS_FAST=1 bun test --retry 2 --concurrent --max-concurrency ${EVALS_CONCURRENCY:-15} test/skill-e2e-*.test.ts test/skill-routing-e2e.test.ts", "test:codex": "EVALS=1 bun test test/codex-e2e.test.ts", "test:codex:all": "EVALS=1 EVALS_ALL=1 bun test test/codex-e2e.test.ts", "test:gemini": "EVALS=1 bun test test/gemini-e2e.test.ts", "test:gemini:all": "EVALS=1 EVALS_ALL=1 bun test test/gemini-e2e.test.ts", + "test:copilot": "EVALS=1 bun test test/copilot-e2e.test.ts", + "test:copilot:all": "EVALS=1 EVALS_ALL=1 bun test test/copilot-e2e.test.ts", "skill:check": "bun run scripts/skill-check.ts", "dev:skill": "bun run scripts/dev-skill.ts", "start": "bun run browse/src/server.ts", diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 340dbb3ca..2ad2ac890 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -19,7 +19,7 @@ const DRY_RUN = process.argv.includes('--dry-run'); // ─── Template Context ─────────────────────────────────────── -type Host = 'claude' | 'codex'; +type Host = 'claude' | 'codex' | 'copilot'; const OPENAI_SHORT_DESCRIPTION_LIMIT = 120; const HOST_ARG = process.argv.find(a => a.startsWith('--host')); @@ -27,8 +27,9 @@ const HOST: Host = (() => { if (!HOST_ARG) return 'claude'; const val = HOST_ARG.includes('=') ? 
HOST_ARG.split('=')[1] : process.argv[process.argv.indexOf(HOST_ARG) + 1]; if (val === 'codex' || val === 'agents') return 'codex'; + if (val === 'copilot') return 'copilot'; if (val === 'claude') return 'claude'; - throw new Error(`Unknown host: ${val}. Use claude, codex, or agents.`); + throw new Error(`Unknown host: ${val}. Use claude, codex, copilot, or agents.`); })(); interface HostPaths { @@ -51,6 +52,12 @@ const HOST_PATHS: Record = { binDir: '$GSTACK_BIN', browseDir: '$GSTACK_BROWSE', }, + copilot: { + skillRoot: '$GSTACK_ROOT', + localSkillRoot: '.agents/skills/gstack', + binDir: '$GSTACK_BIN', + browseDir: '$GSTACK_BROWSE', + }, }; interface TemplateContext { @@ -183,6 +190,13 @@ GSTACK_ROOT="$HOME/.codex/skills/gstack" [ -n "$_ROOT" ] && [ -d "$_ROOT/.agents/skills/gstack" ] && GSTACK_ROOT="$_ROOT/.agents/skills/gstack" GSTACK_BIN="$GSTACK_ROOT/bin" GSTACK_BROWSE="$GSTACK_ROOT/browse/dist" +` + : ctx.host === 'copilot' + ? `_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) +GSTACK_ROOT="$HOME/.copilot/skills/gstack" +[ -n "$_ROOT" ] && [ -d "$_ROOT/.agents/skills/gstack" ] && GSTACK_ROOT="$_ROOT/.agents/skills/gstack" +GSTACK_BIN="$GSTACK_ROOT/bin" +GSTACK_BROWSE="$GSTACK_ROOT/browse/dist" ` : ''; @@ -2969,8 +2983,8 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: // Determine skill directory relative to ROOT const skillDir = path.relative(ROOT, path.dirname(tmplPath)); - // For codex host, route output to .agents/skills/{codexSkillName}/SKILL.md - if (host === 'codex') { + // For non-Claude hosts (codex, copilot), route output to .agents/skills/{codexSkillName}/SKILL.md + if (host === 'codex' || host === 'copilot') { const codexName = codexSkillName(skillDir === '.' ? 
'' : skillDir); outputDir = path.join(ROOT, '.agents', 'skills', codexName); fs.mkdirSync(outputDir, { recursive: true }); @@ -3002,8 +3016,8 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: throw new Error(`Unresolved placeholders in ${relTmplPath}: ${remaining.join(', ')}`); } - // For codex host: transform frontmatter and replace Claude-specific paths - if (host === 'codex') { + // For non-Claude hosts (codex, copilot): transform frontmatter and replace Claude-specific paths + if (host === 'codex' || host === 'copilot') { // Extract hook safety prose BEFORE transforming frontmatter (which strips hooks) const safetyProse = extractHookSafetyProse(tmplContent); @@ -3063,8 +3077,8 @@ function findTemplates(): string[] { let hasChanges = false; for (const tmplPath of findTemplates()) { - // Skip /codex skill for codex host (self-referential — it's a Claude wrapper around codex exec) - if (HOST === 'codex') { + // Skip /codex skill for non-Claude hosts (self-referential — it's a Claude wrapper around codex exec) + if (HOST === 'codex' || HOST === 'copilot') { const dir = path.basename(path.dirname(tmplPath)); if (dir === 'codex') continue; } diff --git a/setup b/setup index 4d7d29c01..d23296d8c 100755 --- a/setup +++ b/setup @@ -1,5 +1,5 @@ #!/usr/bin/env bash -# gstack setup — build browser binary + register skills with Claude Code / Codex +# gstack setup — build browser binary + register skills with Claude Code / Codex / Copilot set -e if ! 
command -v bun >/dev/null 2>&1; then @@ -14,6 +14,8 @@ INSTALL_SKILLS_DIR="$(dirname "$INSTALL_GSTACK_DIR")" BROWSE_BIN="$SOURCE_GSTACK_DIR/browse/dist/browse" CODEX_SKILLS="$HOME/.codex/skills" CODEX_GSTACK="$CODEX_SKILLS/gstack" +COPILOT_SKILLS="$HOME/.copilot/skills" +COPILOT_GSTACK="$COPILOT_SKILLS/gstack" IS_WINDOWS=0 case "$(uname -s)" in @@ -24,33 +26,37 @@ esac HOST="claude" while [ $# -gt 0 ]; do case "$1" in - --host) [ -z "$2" ] && echo "Missing value for --host (expected claude, codex, kiro, or auto)" >&2 && exit 1; HOST="$2"; shift 2 ;; + --host) [ -z "$2" ] && echo "Missing value for --host (expected claude, codex, copilot, kiro, or auto)" >&2 && exit 1; HOST="$2"; shift 2 ;; --host=*) HOST="${1#--host=}"; shift ;; *) shift ;; esac done case "$HOST" in - claude|codex|kiro|auto) ;; - *) echo "Unknown --host value: $HOST (expected claude, codex, kiro, or auto)" >&2; exit 1 ;; + claude|codex|copilot|kiro|auto) ;; + *) echo "Unknown --host value: $HOST (expected claude, codex, copilot, kiro, or auto)" >&2; exit 1 ;; esac # For auto: detect which agents are installed INSTALL_CLAUDE=0 INSTALL_CODEX=0 +INSTALL_COPILOT=0 INSTALL_KIRO=0 if [ "$HOST" = "auto" ]; then command -v claude >/dev/null 2>&1 && INSTALL_CLAUDE=1 command -v codex >/dev/null 2>&1 && INSTALL_CODEX=1 + command -v copilot >/dev/null 2>&1 && INSTALL_COPILOT=1 command -v kiro-cli >/dev/null 2>&1 && INSTALL_KIRO=1 # If none found, default to claude - if [ "$INSTALL_CLAUDE" -eq 0 ] && [ "$INSTALL_CODEX" -eq 0 ] && [ "$INSTALL_KIRO" -eq 0 ]; then + if [ "$INSTALL_CLAUDE" -eq 0 ] && [ "$INSTALL_CODEX" -eq 0 ] && [ "$INSTALL_COPILOT" -eq 0 ] && [ "$INSTALL_KIRO" -eq 0 ]; then INSTALL_CLAUDE=1 fi elif [ "$HOST" = "claude" ]; then INSTALL_CLAUDE=1 elif [ "$HOST" = "codex" ]; then INSTALL_CODEX=1 +elif [ "$HOST" = "copilot" ]; then + INSTALL_COPILOT=1 elif [ "$HOST" = "kiro" ]; then INSTALL_KIRO=1 fi @@ -128,11 +134,12 @@ if [ ! -x "$BROWSE_BIN" ]; then exit 1 fi -# 1b. 
Generate .agents/ Codex skill docs — always regenerate to prevent stale descriptions. +# 1b. Generate .agents/ skill docs — always regenerate to prevent stale descriptions. # .agents/ is no longer committed — generated at setup time from .tmpl templates. # bun run build already does this, but we need it when NEEDS_BUILD=0 (binary is fresh). # Always regenerate: generation is fast (<2s) and mtime-based staleness checks are fragile # (miss stale files when timestamps match after clone/checkout/upgrade). +# Codex, Copilot, and other .agents/ hosts share .agents/skills/ — generate once with --host codex. AGENTS_DIR="$SOURCE_GSTACK_DIR/.agents/skills" NEEDS_AGENTS_GEN=1 @@ -421,14 +428,68 @@ if [ "$INSTALL_KIRO" -eq 1 ]; then fi fi -# 7. Create .agents/ sidecar symlinks for the real Codex skill target. -# The root Codex skill ends up pointing at $SOURCE_GSTACK_DIR/.agents/skills/gstack, +# 7. Install for Copilot CLI (copy from .agents/skills, rewrite paths) +if [ "$INSTALL_COPILOT" -eq 1 ]; then + AGENTS_DIR="$SOURCE_GSTACK_DIR/.agents/skills" + mkdir -p "$COPILOT_SKILLS" + + # Create gstack dir with symlinks for runtime assets, copy+sed for SKILL.md + # Remove old whole-dir symlink from previous installs + [ -L "$COPILOT_GSTACK" ] && rm -f "$COPILOT_GSTACK" + mkdir -p "$COPILOT_GSTACK" "$COPILOT_GSTACK/browse" "$COPILOT_GSTACK/gstack-upgrade" "$COPILOT_GSTACK/review" + ln -snf "$SOURCE_GSTACK_DIR/bin" "$COPILOT_GSTACK/bin" + ln -snf "$SOURCE_GSTACK_DIR/browse/dist" "$COPILOT_GSTACK/browse/dist" + ln -snf "$SOURCE_GSTACK_DIR/browse/bin" "$COPILOT_GSTACK/browse/bin" + # ETHOS.md — referenced by "Search Before Building" in all skill preambles + if [ -f "$SOURCE_GSTACK_DIR/ETHOS.md" ]; then + ln -snf "$SOURCE_GSTACK_DIR/ETHOS.md" "$COPILOT_GSTACK/ETHOS.md" + fi + # gstack-upgrade skill + if [ -f "$AGENTS_DIR/gstack-upgrade/SKILL.md" ]; then + ln -snf "$AGENTS_DIR/gstack-upgrade/SKILL.md" "$COPILOT_GSTACK/gstack-upgrade/SKILL.md" + fi + # Review runtime assets (individual 
files, not whole dir) + for f in checklist.md design-checklist.md greptile-triage.md TODOS-format.md; do + if [ -f "$SOURCE_GSTACK_DIR/review/$f" ]; then + ln -snf "$SOURCE_GSTACK_DIR/review/$f" "$COPILOT_GSTACK/review/$f" + fi + done + + # Rewrite root SKILL.md paths for Copilot + sed -e "s|~/.claude/skills/gstack|~/.copilot/skills/gstack|g" \ + -e "s|\.claude/skills/gstack|.copilot/skills/gstack|g" \ + -e "s|\.claude/skills|.copilot/skills|g" \ + "$SOURCE_GSTACK_DIR/SKILL.md" > "$COPILOT_GSTACK/SKILL.md" + + if [ ! -d "$AGENTS_DIR" ]; then + echo " warning: no .agents/skills/ directory found — run 'bun run build' first" >&2 + else + for skill_dir in "$AGENTS_DIR"/gstack*/; do + [ -f "$skill_dir/SKILL.md" ] || continue + skill_name="$(basename "$skill_dir")" + target_dir="$COPILOT_SKILLS/$skill_name" + mkdir -p "$target_dir" + # Generated Codex skills use $HOME/.codex (not ~/), plus $GSTACK_ROOT variables. + # Rewrite the default GSTACK_ROOT value and any remaining literal paths. + sed -e 's|\$HOME/.codex/skills/gstack|$HOME/.copilot/skills/gstack|g' \ + -e "s|~/.codex/skills/gstack|~/.copilot/skills/gstack|g" \ + -e "s|~/.claude/skills/gstack|~/.copilot/skills/gstack|g" \ + "$skill_dir/SKILL.md" > "$target_dir/SKILL.md" + done + echo "gstack ready (copilot)." + echo " browse: $BROWSE_BIN" + echo " copilot skills: $COPILOT_SKILLS" + fi +fi + +# 8. Create .agents/ sidecar symlinks for Codex and Copilot skill targets. +# The root skill ends up pointing at $SOURCE_GSTACK_DIR/.agents/skills/gstack, # so the runtime assets must live there for both global and repo-local installs. -if [ "$INSTALL_CODEX" -eq 1 ]; then +if [ "$INSTALL_CODEX" -eq 1 ] || [ "$INSTALL_COPILOT" -eq 1 ]; then create_agents_sidecar "$SOURCE_GSTACK_DIR" fi -# 8. First-time welcome + legacy cleanup +# 9. First-time welcome + legacy cleanup if [ ! -d "$HOME/.gstack" ]; then mkdir -p "$HOME/.gstack" echo " Welcome! Run /gstack-upgrade anytime to stay current." 
diff --git a/test/copilot-e2e.test.ts b/test/copilot-e2e.test.ts new file mode 100644 index 000000000..bac72ef7b --- /dev/null +++ b/test/copilot-e2e.test.ts @@ -0,0 +1,118 @@ +/** + * Copilot CLI E2E tests — verify skills work when invoked by GitHub Copilot CLI. + * + * Spawns `copilot -p` with skills installed in a temp HOME, captures + * output, and validates results. Follows the same pattern as codex-e2e.test.ts + * but adapted for the standalone GitHub Copilot CLI. + * + * Prerequisites: + * - Copilot CLI installed (`npm install -g @github/copilot`) + * - Copilot authenticated (`copilot` → `/login`) + * - EVALS=1 env var set (same gate as Claude/Codex E2E tests) + * + * Skips gracefully when prerequisites are not met. + */ + +import { describe, test, expect, afterAll } from 'bun:test'; +import { runCopilotSkill, installSkillToTempHome } from './helpers/copilot-session-runner'; +import type { CopilotResult } from './helpers/copilot-session-runner'; +import { EvalCollector } from './helpers/eval-store'; +import type { EvalTestEntry } from './helpers/eval-store'; +import { selectTests, detectBaseBranch, getChangedFiles, E2E_TOUCHFILES, GLOBAL_TOUCHFILES } from './helpers/touchfiles'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; + +const ROOT = path.resolve(import.meta.dir, '..'); + +// --- Prerequisites check --- + +const COPILOT_AVAILABLE = (() => { + try { + const result = Bun.spawnSync(['copilot', '--version']); + return result.exitCode === 0; + } catch { return false; } +})(); + +const evalsEnabled = !!process.env.EVALS; + +// Skip all tests if copilot CLI is not available or EVALS is not set. +const SKIP = !COPILOT_AVAILABLE || !evalsEnabled; + +const describeCopilot = SKIP ? 
describe.skip : describe; + +// Log why we're skipping (helpful for debugging CI) +if (!evalsEnabled) { + // Silent — same as Claude/Codex E2E tests, EVALS=1 required +} else if (!COPILOT_AVAILABLE) { + process.stderr.write('\nCopilot E2E: SKIPPED — copilot CLI not found (install: npm install -g @github/copilot)\n'); +} + +// --- Diff-based test selection --- + +// Copilot E2E touchfiles — keyed by test name, same pattern as Codex E2E_TOUCHFILES +const COPILOT_E2E_TOUCHFILES: Record<string, string[]> = { + 'copilot-discover-skill': ['codex/**', '.agents/skills/**', 'test/helpers/copilot-session-runner.ts'], +}; + +let selectedTests: string[] | null = null; // null = run all + +// --- Eval collector --- + +const evalCollector = new EvalCollector('copilot-e2e'); + +afterAll(async () => { + if (!SKIP) { + await evalCollector.flush(); + } +}); + +// --- Tests --- + +describeCopilot('Copilot CLI E2E', () => { + test('copilot-discover-skill: copilot CLI can run with gstack skill context', async () => { + // Diff-based test selection + if (selectedTests !== null && !selectedTests.includes('copilot-discover-skill')) { + return; // skip — not affected by current diff + } + + const agentsDir = path.join(ROOT, '.agents', 'skills'); + if (!fs.existsSync(agentsDir)) { + process.stderr.write(' Copilot E2E: .agents/skills/ not found — skipping\n'); + return; + } + + // Use the generated root gstack skill + const skillDir = path.join(agentsDir, 'gstack'); + if (!fs.existsSync(path.join(skillDir, 'SKILL.md'))) { + process.stderr.write(' Copilot E2E: gstack SKILL.md not found in .agents/skills/ — skipping\n'); + return; + } + + const result = await runCopilotSkill({ + skillDir, + prompt: 'list available gstack skills', + timeoutMs: 120_000, + cwd: ROOT, + skillName: 'gstack', + }); + + // Record eval + const entry: EvalTestEntry = { + test_name: 'copilot-discover-skill', + skill_name: 'gstack', + host: 'copilot', + prompt: 'list available gstack skills', + expected_behavior: 'Copilot acknowledges
gstack skill context', + actual_output: result.output.slice(0, 1000), + exit_code: result.exitCode, + duration_ms: result.durationMs, + pass: result.exitCode === 0 && result.output.length > 0, + }; + evalCollector.add(entry); + + // Basic validation — copilot ran and produced output + expect(result.exitCode).toBe(0); + expect(result.output.length).toBeGreaterThan(0); + }, 180_000); +}); diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index 32e77a368..a992c5a21 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -1308,14 +1308,15 @@ describe('setup script validation', () => { expect(fnBody).toContain('ln -snf "gstack/$skill_name"'); }); - test('setup supports --host auto|claude|codex|kiro', () => { + test('setup supports --host auto|claude|codex|copilot|kiro', () => { expect(setupContent).toContain('--host'); - expect(setupContent).toContain('claude|codex|kiro|auto'); + expect(setupContent).toContain('claude|codex|copilot|kiro|auto'); }); - test('auto mode detects claude, codex, and kiro binaries', () => { + test('auto mode detects claude, codex, copilot, and kiro binaries', () => { expect(setupContent).toContain('command -v claude'); expect(setupContent).toContain('command -v codex'); + expect(setupContent).toContain('command -v copilot'); expect(setupContent).toContain('command -v kiro-cli'); }); diff --git a/test/global-discover.test.ts b/test/global-discover.test.ts index c8d489f4a..aecd2c8ff 100644 --- a/test/global-discover.test.ts +++ b/test/global-discover.test.ts @@ -151,6 +151,7 @@ describe("gstack-global-discover", () => { expect(repo.sessions).toHaveProperty("claude_code"); expect(repo.sessions).toHaveProperty("codex"); expect(repo.sessions).toHaveProperty("gemini"); + expect(repo.sessions).toHaveProperty("copilot"); } }); @@ -166,7 +167,8 @@ describe("gstack-global-discover", () => { const toolTotal = json.tools.claude_code.total_sessions + json.tools.codex.total_sessions + - 
json.tools.gemini.total_sessions; + json.tools.gemini.total_sessions + + json.tools.copilot.total_sessions; expect(json.total_sessions).toBe(toolTotal); }); diff --git a/test/helpers/copilot-session-runner.ts b/test/helpers/copilot-session-runner.ts new file mode 100644 index 000000000..c5013eaba --- /dev/null +++ b/test/helpers/copilot-session-runner.ts @@ -0,0 +1,150 @@ +/** + * Copilot CLI subprocess runner for skill E2E testing. + * + * Spawns `copilot -p` as a completely independent process and returns + * structured results. Follows the same pattern as codex-session-runner.ts but + * adapted for the standalone GitHub Copilot CLI. + * + * Key differences from Codex session-runner: + * - Uses `copilot -p` instead of `codex exec` + * - Copilot CLI is a standalone binary (not a gh extension) + * - Needs temp HOME with skill installed at ~/.copilot/skills/{skillName}/SKILL.md + */ + +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; + +// --- Interfaces --- + +export interface CopilotResult { + output: string; // Full agent message text + exitCode: number; // Process exit code + durationMs: number; // Wall clock time + rawOutput: string; // Raw stdout for debugging +} + +// --- Skill installation helper --- + +/** + * Install a SKILL.md into a temp HOME directory for Copilot to discover. + * Creates ~/.copilot/skills/{skillName}/SKILL.md in the temp HOME. + * + * Returns the temp HOME path. Caller is responsible for cleanup. 
+ */ +export function installSkillToTempHome( + skillDir: string, + skillName: string, + tempHome?: string, +): string { + const home = tempHome || fs.mkdtempSync(path.join(os.tmpdir(), 'copilot-e2e-')); + const destDir = path.join(home, '.copilot', 'skills', skillName); + fs.mkdirSync(destDir, { recursive: true }); + + const srcSkill = path.join(skillDir, 'SKILL.md'); + if (fs.existsSync(srcSkill)) { + fs.copyFileSync(srcSkill, path.join(destDir, 'SKILL.md')); + } + + return home; +} + +// --- Main runner --- + +/** + * Run a Copilot skill via `copilot -p` and return structured results. + * + * Spawns copilot in a temp HOME with the skill installed, captures output, + * and returns a CopilotResult. Skips gracefully if copilot is not found. + */ +export async function runCopilotSkill(opts: { + skillDir: string; // Path to skill directory containing SKILL.md + prompt: string; // What to ask Copilot to suggest + timeoutMs?: number; // Default 300000 (5 min) + cwd?: string; // Working directory + skillName?: string; // Skill name for installation (default: dirname) +}): Promise<CopilotResult> { + const { + skillDir, + prompt, + timeoutMs = 300_000, + cwd, + skillName, + } = opts; + + const startTime = Date.now(); + const name = skillName || path.basename(skillDir) || 'gstack'; + + // Check if copilot CLI is available + const whichResult = Bun.spawnSync(['copilot', '--version']); + if (whichResult.exitCode !== 0) { + return { + output: 'SKIP: copilot CLI not found', + exitCode: -1, + durationMs: Date.now() - startTime, + rawOutput: '', + }; + } + + // Set up temp HOME with skill installed + const tempHome = fs.mkdtempSync(path.join(os.tmpdir(), 'copilot-e2e-')); + + try { + installSkillToTempHome(skillDir, name, tempHome); + + // Copy auth config from real ~/.copilot/ so the spawned process can authenticate. + // Copilot CLI stores login state in ~/.copilot/config.json (or $COPILOT_HOME).
+ const realCopilotDir = process.env.COPILOT_HOME || path.join(os.homedir(), '.copilot'); + const tempCopilotDir = path.join(tempHome, '.copilot'); + for (const authFile of ['config.json', 'hosts.json']) { + const src = path.join(realCopilotDir, authFile); + if (fs.existsSync(src)) { + fs.mkdirSync(tempCopilotDir, { recursive: true }); + fs.copyFileSync(src, path.join(tempCopilotDir, authFile)); + } + } + + // Build copilot command + const args = ['-p', prompt]; + + // Spawn copilot with temp HOME + const proc = Bun.spawn(['copilot', ...args], { + cwd: cwd || skillDir, + stdout: 'pipe', + stderr: 'pipe', + env: { + ...process.env, + HOME: tempHome, + }, + }); + + // Race against timeout + let timedOut = false; + const timeoutId = setTimeout(() => { + timedOut = true; + proc.kill(); + }, timeoutMs); + + const stdoutText = await new Response(proc.stdout).text(); + const stderrText = await new Response(proc.stderr).text(); + const exitCode = await proc.exited; + clearTimeout(timeoutId); + + const durationMs = Date.now() - startTime; + + // Log stderr if non-empty + if (stderrText.trim()) { + process.stderr.write(` [copilot stderr] ${stderrText.trim().slice(0, 200)}\n`); + } + + return { + output: stdoutText.trim(), + exitCode: timedOut ? 124 : exitCode, + durationMs, + rawOutput: stdoutText, + }; + } finally { + // Clean up temp HOME + try { fs.rmSync(tempHome, { recursive: true, force: true }); } catch { /* non-fatal */ } + } +}
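Reviewer note: the per-skill path rewrite that `setup` performs with `sed` in step 7 (Codex and Claude install paths mapped to `~/.copilot/skills/gstack`) may be easier to reason about as a pure function. The following is a minimal sketch, not part of this change; `COPILOT_REWRITES` and `rewriteSkillPathsForCopilot` are hypothetical names introduced only for illustration. It matches the paths literally, whereas the unescaped `.` in the `sed` patterns matches any character, so the two are equivalent only for the literal paths that generated skills are expected to contain.

```typescript
// Hypothetical helper mirroring the three per-skill sed substitutions in setup step 7.
// Each pair is [literal text to find, replacement text], applied in this order.
const COPILOT_REWRITES: ReadonlyArray<[string, string]> = [
  ['$HOME/.codex/skills/gstack', '$HOME/.copilot/skills/gstack'],
  ['~/.codex/skills/gstack', '~/.copilot/skills/gstack'],
  ['~/.claude/skills/gstack', '~/.copilot/skills/gstack'],
];

// Rewrites every occurrence of each known install path in a SKILL.md body.
// split/join is used so that "$" in the replacement is never treated as a
// special replacement pattern.
function rewriteSkillPathsForCopilot(skillMd: string): string {
  return COPILOT_REWRITES.reduce(
    (text, [from, to]) => text.split(from).join(to),
    skillMd,
  );
}
```

A function of this shape could be unit-tested without spawning the Copilot CLI, which the E2E test above cannot avoid.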