diff --git a/.opencode/skills/babysit-pr/SKILL.md b/.opencode/skills/babysit-pr/SKILL.md
new file mode 100644
index 000000000000..d472fad66a74
--- /dev/null
+++ b/.opencode/skills/babysit-pr/SKILL.md
@@ -0,0 +1,187 @@
+---
+name: babysit-pr
+description: Babysit a GitHub pull request after creation by continuously polling review comments, CI checks/workflow runs, and mergeability state until the PR is merged/closed or user help is required. Diagnose failures, retry likely flaky failures up to 3 times, auto-fix/push branch-related issues when appropriate, and keep watching open PRs so fresh review feedback is surfaced promptly. Use when the user asks Codex to monitor a PR, watch CI, handle review comments, or keep an eye on failures and feedback on an open PR.
+---
+
+# PR Babysitter
+
+## Objective
+Babysit a PR persistently until one of these terminal outcomes occurs:
+
+- The PR is merged or closed.
+- A situation requires user help (for example CI infrastructure issues, repeated flaky failures after the retry budget is exhausted, permission problems, or ambiguity that cannot be resolved safely).
+
+One additional milestone is worth reporting but is not a terminal outcome: the PR becomes green + mergeable + review-clean. Treat this as a progress state, not a watcher stop, so late-arriving review comments are still surfaced promptly while the PR remains open.
+
+Do not stop merely because a single snapshot returns `idle` while checks are still pending.
+
+## Inputs
+Accept any of the following:
+
+- No PR argument: infer the PR from the current branch (`--pr auto`)
+- PR number
+- PR URL
+
+## Core Workflow
+
+1. When the user asks to "monitor"/"watch"/"babysit" a PR, start with the watcher's continuous mode (`--watch`) unless you are intentionally doing a one-shot diagnostic snapshot.
+2. Run the watcher script to snapshot PR/review/CI state (or consume each streamed snapshot from `--watch`).
+3. Inspect the `actions` list in the JSON response.
+4. If `diagnose_ci_failure` is present, inspect failed run logs and classify the failure.
+5. If the failure is likely caused by the current branch, patch code locally, commit, and push.
+6. If `process_review_comment` is present, inspect surfaced review items and decide whether to address them.
+7. If a review item is actionable and correct, patch code locally, commit, push, and then mark the associated review thread/comment as resolved once the fix is on GitHub.
+8. If a review item from another author is non-actionable, already addressed, or not valid, post one reply on the comment/thread explaining that decision (for example answering the question or explaining why no change is needed). Prefix the GitHub reply body with `[codex]` so it is clear the response is automated. If the watcher later surfaces your own reply, treat that self-authored item as already handled and do not reply again.
+9. If the failure is likely flaky/unrelated and `retry_failed_checks` is present, rerun failed jobs with `--retry-failed-now`.
+10. If both actionable review feedback and `retry_failed_checks` are present, prioritize the review feedback; a new commit will retrigger CI, so avoid rerunning flaky checks on the old SHA unless you intentionally defer the review change.
+11. On every loop, look for newly surfaced review feedback before acting on CI failures or mergeability state, then verify mergeability / merge-conflict status (for example via `gh pr view`) alongside CI.
+12. After any push or rerun action, immediately return to step 1 and continue polling on the updated SHA/state.
+13. If you had been using `--watch` before pausing to patch/commit/push, relaunch `--watch` yourself in the same turn immediately after the push (do not wait for the user to re-invoke the skill).
+14. Repeat polling until `stop_pr_closed` appears or a user-help-required blocker is reached. A green + review-clean + mergeable PR is a progress milestone, not a reason to stop the watcher while the PR is still open.
+15. Maintain terminal/session ownership: while babysitting is active, keep consuming watcher output in the same turn; do not leave a detached `--watch` process running and then end the turn as if monitoring were complete.
+
+## Commands
+
+### One-shot snapshot
+
+```bash
+python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --once
+```
+
+### Continuous watch (JSONL)
+
+```bash
+python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --watch
+```
+
+### Trigger flaky retry cycle (only when watcher indicates)
+
+```bash
+python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --retry-failed-now
+```
+
+### Explicit PR target
+
+```bash
+python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr <PR_NUMBER_OR_URL> --once
+```
+
+## CI Failure Classification
+Use `gh` commands to inspect failed runs before deciding to rerun.
+
+- `gh run view <RUN_ID> --json jobs,name,workflowName,conclusion,status,url,headSha`
+- `gh run view <RUN_ID> --log-failed`
+
+Prefer treating failures as branch-related when logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).
+
+Prefer treating failures as flaky/unrelated when logs show transient infra/external issues (timeouts, runner provisioning failures, registry/network outages, GitHub Actions infra errors).
+
+If classification is ambiguous, perform one manual diagnosis attempt before choosing to rerun.
+
+Read `.codex/skills/babysit-pr/references/heuristics.md` for a concise checklist.
+
+## Review Comment Handling
+The watcher surfaces review items from:
+
+- PR issue comments
+- Inline review comments
+- Review submissions (COMMENT / APPROVED / CHANGES_REQUESTED)
+
+It intentionally surfaces Codex reviewer bot feedback (for example comments/reviews from `chatgpt-codex-connector[bot]`) in addition to human reviewer feedback. Most unrelated bot noise should still be ignored.
+For safety, the watcher only auto-surfaces trusted human review authors (for example repo OWNER/MEMBER/COLLABORATOR, plus the authenticated operator) and approved review bots such as Codex.
+On a fresh watcher state file, existing pending review feedback may be surfaced immediately (not only comments that arrive after monitoring starts). This is intentional so already-open review comments are not missed.
+
+When you agree with a comment and it is actionable:
+
+1. Patch code locally.
+2. Commit with `codex: address PR review feedback (#<PR_NUMBER>)`.
+3. Push to the PR head branch.
+4. After the push succeeds, mark the associated GitHub review thread/comment as resolved.
+5. Resume watching on the new SHA immediately (do not stop after reporting the push).
+6. If monitoring was running in `--watch` mode, restart `--watch` immediately after the push in the same turn; do not wait for the user to ask again.
+
+If you disagree or the comment is non-actionable/already addressed, reply once directly on the GitHub comment/thread so the reviewer gets an explicit answer, then continue the watcher loop. Prefix any GitHub reply to a code review comment/thread with `[codex]` so it is clear the response is automated and not from the human user.
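+
+For example, a minimal reply to a comment that is already addressed might look like this (hypothetical wording and commit reference):
+
+```text
+[codex] This was already addressed in commit abc123: the retry budget is now
+tracked per head SHA, so the double-counting described here can no longer
+occur. No further change planned.
+```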
+If the watcher later surfaces your own reply because the authenticated operator is treated as a trusted review author, treat that self-authored item as already handled and do not reply again.
+If a code review comment/thread is already marked as resolved in GitHub, treat it as non-actionable and safely ignore it unless new unresolved follow-up feedback appears.
+
+## Git Safety Rules
+
+- Work only on the PR head branch.
+- Avoid destructive git commands.
+- Do not switch branches unless necessary to recover context.
+- Before editing, check for unrelated uncommitted changes. If present, stop and ask the user.
+- After each successful fix, commit and `git push`, then re-run the watcher.
+- If you interrupted a live `--watch` session to make the fix, restart `--watch` immediately after the push in the same turn.
+- Do not run multiple concurrent `--watch` processes for the same PR/state file; keep one watcher session active and reuse it until it stops or you intentionally restart it.
+- A push is not a terminal outcome; continue the monitoring loop unless a strict stop condition is met.
+
+Commit message defaults:
+
+- `codex: fix CI failure on PR #<PR_NUMBER>`
+- `codex: address PR review feedback (#<PR_NUMBER>)`
+
+## Monitoring Loop Pattern
+Use this loop in a live Codex session:
+
+1. Run `--once` (or read the next streamed `--watch` snapshot).
+2. Read `actions` (a sample snapshot event is shown at the end of this section).
+3. First check whether the PR is now merged or otherwise closed; if so, report that terminal state and stop polling immediately.
+4. Check the CI summary, new review items, and mergeability/conflict status.
+5. Diagnose CI failures and classify them as branch-related vs flaky/unrelated.
+6. For each surfaced review item from another author: reply once with an explanation if it is non-actionable, or patch/commit/push and then resolve it if it is actionable. If a later snapshot surfaces your own reply, treat it as informational and continue without responding again.
+7. Process actionable review comments before flaky reruns when both are present; if a review fix requires a commit, push it and skip rerunning failed checks on the old SHA.
+8. Retry failed checks only when `retry_failed_checks` is present and you are not about to replace the current SHA with a review/CI fix commit.
+9. If you pushed a commit, resolved a review thread, replied to a review comment, or triggered a rerun, report the action briefly and continue polling (do not stop).
+10. After a review-fix push, proactively restart continuous monitoring (`--watch`) in the same turn unless a strict stop condition has already been reached.
+11. If everything is passing, mergeable, not blocked on required review approval, and there are no unaddressed review items, report that the PR is currently ready to merge but keep the watcher running so new review comments are surfaced quickly while the PR remains open.
+12. If blocked on a user-help-required issue (infra outage, exhausted flaky retries, unclear reviewer request, permissions), report the blocker and stop.
+13. Otherwise sleep according to the polling cadence below and repeat.
+
+When the user explicitly asks to monitor/watch/babysit a PR, prefer `--watch` so polling continues autonomously in one command. Use repeated `--once` snapshots only for debugging, local testing, or when the user explicitly asks for a one-shot check.
+Do not stop to ask the user whether to continue polling; continue autonomously until a strict stop condition is met or the user explicitly interrupts.
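+
+As an illustration, each `--watch` snapshot event has this shape (pretty-printed here; the watcher emits one JSON line per event, some `pr` fields are trimmed, and the values below are invented for the example):
+
+```json
+{"event": "snapshot",
+ "payload": {"snapshot": {"pr": {"number": 123, "repo": "openai/codex", "head_sha": "abc123", "state": "OPEN", "mergeable": "MERGEABLE"},
+                          "checks": {"pending_count": 0, "failed_count": 1, "passed_count": 12, "all_terminal": true},
+                          "failed_runs": [{"run_id": 99, "workflow_name": "ci", "status": "completed", "conclusion": "failure", "html_url": "..."}],
+                          "new_review_items": [],
+                          "actions": ["diagnose_ci_failure", "retry_failed_checks"],
+                          "retry_state": {"current_sha_retries_used": 0, "max_flaky_retries": 3}},
+             "state_file": "/tmp/codex-babysit-pr-openai-codex-pr123.json",
+             "next_poll_seconds": 30}}
+```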
+Do not hand control back to the user after a review-fix push just because a new SHA was created; restarting the watcher and re-entering the poll loop is part of the same babysitting task.
+If a `--watch` process is still running and no strict stop condition has been reached, the babysitting task is still in progress; keep streaming/consuming watcher output instead of ending the turn.
+
+## Polling Cadence
+Keep review polling aggressive and continue monitoring even after CI turns green:
+
+- While CI is not green (pending/running/queued or failing): poll every 1 minute.
+- After CI turns green: keep polling at the base cadence while the PR remains open so newly posted review comments are surfaced promptly instead of waiting on a long green-state backoff.
+- Reset the cadence immediately whenever anything changes (new commit/SHA, check status changes, new review comments, mergeability changes, review decision changes).
+- If CI stops being green again (new commit, rerun, or regression): stay on the base polling cadence.
+- If any poll shows the PR is merged or otherwise closed: stop polling immediately and report the terminal state.
+
+## Stop Conditions (Strict)
+Stop only when one of the following is true:
+
+- PR merged or closed (stop as soon as a poll/snapshot confirms this).
+- User intervention is required and Codex cannot safely proceed alone.
+
+Keep polling when:
+
+- `actions` contains only `idle` but checks are still pending.
+- CI is still running/queued.
+- Review state is quiet but CI is not terminal.
+- CI is green but mergeability is unknown/pending.
+- CI is green and mergeable, but the PR is still open and you are waiting for possible new review comments or merge-conflict changes.
+- The PR is green but blocked on review approval (`REVIEW_REQUIRED` / similar); continue polling at the base cadence and surface any new review comments without asking for confirmation to keep watching.
+
+## Output Expectations
+Provide concise progress updates while monitoring, following these guidelines:
+
+- During long unchanged monitoring periods, avoid emitting a full update on every poll; summarize only status changes plus occasional heartbeat updates.
+- Treat push confirmations, intermediate CI snapshots, ready-to-merge snapshots, and review-action updates as progress updates only; do not emit the final summary or end the babysitting session unless a strict stop condition is met.
+- A user request to "monitor" is not satisfied by a couple of sample polls; remain in the loop until a strict stop condition is reached or the user explicitly interrupts.
+- A review-fix commit + push is not a completion event; immediately resume live monitoring (`--watch`) in the same turn and continue reporting progress updates.
+- When CI first transitions to all green for the current SHA, emit a one-time celebratory progress update (do not repeat it on every green poll). Preferred style: `πŸš€ CI is all green! 33/33 passed. Still on watch for review approval.`
+- Do not send the final summary while a watcher terminal is still running unless the watcher has emitted/confirmed a strict stop condition; otherwise continue with progress updates.
+
+The final summary must include:
+ +- Final PR SHA +- CI status summary +- Mergeability / conflict status +- Fixes pushed +- Flaky retry cycles used +- Remaining unresolved failures or review comments + +## References + +- Heuristics and decision tree: `.codex/skills/babysit-pr/references/heuristics.md` +- GitHub CLI/API details used by the watcher: `.codex/skills/babysit-pr/references/github-api-notes.md` diff --git a/.opencode/skills/babysit-pr/agents/openai.yaml b/.opencode/skills/babysit-pr/agents/openai.yaml new file mode 100644 index 000000000000..b68a7287a244 --- /dev/null +++ b/.opencode/skills/babysit-pr/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "PR Babysitter" + short_description: "Watch PR review comments, CI, and merge conflicts" + default_prompt: "Babysit the current PR: monitor reviewer comments, CI, and merge-conflict status (prefer the watcher’s --watch mode for live monitoring); surface new review feedback before acting on CI or mergeability work, fix valid issues, push updates, and rerun flaky failures up to 3 times. Keep exactly one watcher session active for the PR (do not leave duplicate --watch terminals running). If you pause monitoring to patch review/CI feedback, restart --watch yourself immediately after the push in the same turn. If a watcher is still running and no strict stop condition has been reached, the task is still in progress: keep consuming watcher output and sending progress updates instead of ending the turn. Do not treat a green + mergeable PR as a terminal stop while it is still open; continue polling autonomously after any push/rerun so newly posted review comments are surfaced until a strict terminal stop condition is reached or the user interrupts." diff --git a/.opencode/skills/babysit-pr/scripts/gh_pr_watch.py b/.opencode/skills/babysit-pr/scripts/gh_pr_watch.py new file mode 100644 index 000000000000..2650770b2a97 --- /dev/null +++ b/.opencode/skills/babysit-pr/scripts/gh_pr_watch.py @@ -0,0 +1,806 @@ +#!/usr/bin/env python3 +"""Watch GitHub PR CI and review activity for Codex PR babysitting workflows.""" + +import argparse +import json +import os +import re +import subprocess +import sys +import tempfile +import time +from pathlib import Path +from urllib.parse import urlparse + +FAILED_RUN_CONCLUSIONS = { + "failure", + "timed_out", + "cancelled", + "action_required", + "startup_failure", + "stale", +} +PENDING_CHECK_STATES = { + "QUEUED", + "IN_PROGRESS", + "PENDING", + "WAITING", + "REQUESTED", +} +REVIEW_BOT_LOGIN_KEYWORDS = { + "codex", +} +TRUSTED_AUTHOR_ASSOCIATIONS = { + "OWNER", + "MEMBER", + "COLLABORATOR", +} +MERGE_BLOCKING_REVIEW_DECISIONS = { + "REVIEW_REQUIRED", + "CHANGES_REQUESTED", +} +MERGE_CONFLICT_OR_BLOCKING_STATES = { + "BLOCKED", + "DIRTY", + "DRAFT", + "UNKNOWN", +} + + +class GhCommandError(RuntimeError): + pass + + +def parse_args(): + parser = argparse.ArgumentParser( + description=( + "Normalize PR/CI/review state for Codex PR babysitting and optionally " + "trigger flaky reruns." 
+ ) + ) + parser.add_argument("--pr", default="auto", help="auto, PR number, or PR URL") + parser.add_argument("--repo", help="Optional OWNER/REPO override") + parser.add_argument("--poll-seconds", type=int, default=30, help="Watch poll interval") + parser.add_argument( + "--max-flaky-retries", + type=int, + default=3, + help="Max rerun cycles per head SHA before stop recommendation", + ) + parser.add_argument("--state-file", help="Path to state JSON file") + parser.add_argument("--once", action="store_true", help="Emit one snapshot and exit") + parser.add_argument("--watch", action="store_true", help="Continuously emit JSONL snapshots") + parser.add_argument( + "--retry-failed-now", + action="store_true", + help="Rerun failed jobs for current failed workflow runs when policy allows", + ) + parser.add_argument( + "--json", + action="store_true", + help="Emit machine-readable output (default behavior for --once and --retry-failed-now)", + ) + args = parser.parse_args() + + if args.poll_seconds <= 0: + parser.error("--poll-seconds must be > 0") + if args.max_flaky_retries < 0: + parser.error("--max-flaky-retries must be >= 0") + if args.watch and args.retry_failed_now: + parser.error("--watch cannot be combined with --retry-failed-now") + if not args.once and not args.watch and not args.retry_failed_now: + args.once = True + return args + + +def _format_gh_error(cmd, err): + stdout = (err.stdout or "").strip() + stderr = (err.stderr or "").strip() + parts = [f"GitHub CLI command failed: {' '.join(cmd)}"] + if stdout: + parts.append(f"stdout: {stdout}") + if stderr: + parts.append(f"stderr: {stderr}") + return "\n".join(parts) + + +def gh_text(args, repo=None): + cmd = ["gh"] + # `gh api` does not accept `-R/--repo` on all gh versions. The watcher's + # API calls use explicit endpoints (e.g. repos/{owner}/{repo}/...), so the + # repo flag is unnecessary there. 
+ if repo and (not args or args[0] != "api"): + cmd.extend(["-R", repo]) + cmd.extend(args) + try: + proc = subprocess.run(cmd, check=True, capture_output=True, text=True) + except FileNotFoundError as err: + raise GhCommandError("`gh` command not found") from err + except subprocess.CalledProcessError as err: + raise GhCommandError(_format_gh_error(cmd, err)) from err + return proc.stdout + + +def gh_json(args, repo=None): + raw = gh_text(args, repo=repo).strip() + if not raw: + return None + try: + return json.loads(raw) + except json.JSONDecodeError as err: + raise GhCommandError(f"Failed to parse JSON from gh output for {' '.join(args)}") from err + + +def parse_pr_spec(pr_spec): + if pr_spec == "auto": + return {"mode": "auto", "value": None} + if re.fullmatch(r"\d+", pr_spec): + return {"mode": "number", "value": pr_spec} + parsed = urlparse(pr_spec) + if parsed.scheme and parsed.netloc and "/pull/" in parsed.path: + return {"mode": "url", "value": pr_spec} + raise ValueError("--pr must be 'auto', a PR number, or a PR URL") + + +def pr_view_fields(): + return ( + "number,url,state,mergedAt,closedAt,headRefName,headRefOid," + "headRepository,headRepositoryOwner,mergeable,mergeStateStatus,reviewDecision" + ) + + +def checks_fields(): + return "name,state,bucket,link,workflow,event,startedAt,completedAt" + + +def resolve_pr(pr_spec, repo_override=None): + parsed = parse_pr_spec(pr_spec) + cmd = ["pr", "view"] + if parsed["value"] is not None: + cmd.append(parsed["value"]) + cmd.extend(["--json", pr_view_fields()]) + data = gh_json(cmd, repo=repo_override) + if not isinstance(data, dict): + raise GhCommandError("Unexpected PR payload from `gh pr view`") + + pr_url = str(data.get("url") or "") + repo = ( + repo_override + or extract_repo_from_pr_url(pr_url) + or extract_repo_from_pr_view(data) + ) + if not repo: + raise GhCommandError("Unable to determine OWNER/REPO for the PR") + + state = str(data.get("state") or "") + merged = bool(data.get("mergedAt")) + closed = bool(data.get("closedAt")) or state.upper() == "CLOSED" + + return { + "number": int(data["number"]), + "url": pr_url, + "repo": repo, + "head_sha": str(data.get("headRefOid") or ""), + "head_branch": str(data.get("headRefName") or ""), + "state": state, + "merged": merged, + "closed": closed, + "mergeable": str(data.get("mergeable") or ""), + "merge_state_status": str(data.get("mergeStateStatus") or ""), + "review_decision": str(data.get("reviewDecision") or ""), + } + + +def extract_repo_from_pr_view(data): + head_repo = data.get("headRepository") + head_owner = data.get("headRepositoryOwner") + owner = None + name = None + if isinstance(head_owner, dict): + owner = head_owner.get("login") or head_owner.get("name") + elif isinstance(head_owner, str): + owner = head_owner + if isinstance(head_repo, dict): + name = head_repo.get("name") + repo_owner = head_repo.get("owner") + if not owner and isinstance(repo_owner, dict): + owner = repo_owner.get("login") or repo_owner.get("name") + elif isinstance(head_repo, str): + name = head_repo + if owner and name: + return f"{owner}/{name}" + return None +def extract_repo_from_pr_url(pr_url): + parsed = urlparse(pr_url) + parts = [p for p in parsed.path.split("/") if p] + if len(parts) >= 4 and parts[2] == "pull": + return f"{parts[0]}/{parts[1]}" + return None + + +def load_state(path): + if path.exists(): + try: + data = json.loads(path.read_text()) + except json.JSONDecodeError as err: + raise RuntimeError(f"State file is not valid JSON: {path}") from err + if not isinstance(data, 
dict): + raise RuntimeError(f"State file must contain an object: {path}") + return data, False + return { + "pr": {}, + "started_at": None, + "last_seen_head_sha": None, + "retries_by_sha": {}, + "seen_issue_comment_ids": [], + "seen_review_comment_ids": [], + "seen_review_ids": [], + "last_snapshot_at": None, + }, True + + +def save_state(path, state): + path.parent.mkdir(parents=True, exist_ok=True) + payload = json.dumps(state, indent=2, sort_keys=True) + "\n" + fd, tmp_name = tempfile.mkstemp(prefix=f"{path.name}.", suffix=".tmp", dir=path.parent) + tmp_path = Path(tmp_name) + try: + with os.fdopen(fd, "w", encoding="utf-8") as tmp_file: + tmp_file.write(payload) + os.replace(tmp_path, path) + except Exception: + try: + tmp_path.unlink(missing_ok=True) + except OSError: + pass + raise + + +def default_state_file_for(pr): + repo_slug = pr["repo"].replace("/", "-") + return Path(f"/tmp/codex-babysit-pr-{repo_slug}-pr{pr['number']}.json") + + +def get_pr_checks(pr_spec, repo): + parsed = parse_pr_spec(pr_spec) + cmd = ["pr", "checks"] + if parsed["value"] is not None: + cmd.append(parsed["value"]) + cmd.extend(["--json", checks_fields()]) + data = gh_json(cmd, repo=repo) + if data is None: + return [] + if not isinstance(data, list): + raise GhCommandError("Unexpected payload from `gh pr checks`") + return data + + +def is_pending_check(check): + bucket = str(check.get("bucket") or "").lower() + state = str(check.get("state") or "").upper() + return bucket == "pending" or state in PENDING_CHECK_STATES + + +def summarize_checks(checks): + pending_count = 0 + failed_count = 0 + passed_count = 0 + for check in checks: + bucket = str(check.get("bucket") or "").lower() + if is_pending_check(check): + pending_count += 1 + if bucket == "fail": + failed_count += 1 + if bucket == "pass": + passed_count += 1 + return { + "pending_count": pending_count, + "failed_count": failed_count, + "passed_count": passed_count, + "all_terminal": pending_count == 0, + } + + +def get_workflow_runs_for_sha(repo, head_sha): + endpoint = f"repos/{repo}/actions/runs" + data = gh_json( + ["api", endpoint, "-X", "GET", "-f", f"head_sha={head_sha}", "-f", "per_page=100"], + repo=repo, + ) + if not isinstance(data, dict): + raise GhCommandError("Unexpected payload from actions runs API") + runs = data.get("workflow_runs") or [] + if not isinstance(runs, list): + raise GhCommandError("Expected `workflow_runs` to be a list") + return runs + + +def failed_runs_from_workflow_runs(runs, head_sha): + failed_runs = [] + for run in runs: + if not isinstance(run, dict): + continue + if str(run.get("head_sha") or "") != head_sha: + continue + conclusion = str(run.get("conclusion") or "") + if conclusion not in FAILED_RUN_CONCLUSIONS: + continue + failed_runs.append( + { + "run_id": run.get("id"), + "workflow_name": run.get("name") or run.get("display_title") or "", + "status": str(run.get("status") or ""), + "conclusion": conclusion, + "html_url": str(run.get("html_url") or ""), + } + ) + failed_runs.sort(key=lambda item: (str(item.get("workflow_name") or ""), str(item.get("run_id") or ""))) + return failed_runs + + +def get_authenticated_login(): + data = gh_json(["api", "user"]) + if not isinstance(data, dict) or not data.get("login"): + raise GhCommandError("Unable to determine authenticated GitHub login from `gh api user`") + return str(data["login"]) + + +def comment_endpoints(repo, pr_number): + return { + "issue_comment": f"repos/{repo}/issues/{pr_number}/comments", + "review_comment": 
f"repos/{repo}/pulls/{pr_number}/comments", + "review": f"repos/{repo}/pulls/{pr_number}/reviews", + } + + +def gh_api_list_paginated(endpoint, repo=None, per_page=100): + items = [] + page = 1 + while True: + sep = "&" if "?" in endpoint else "?" + page_endpoint = f"{endpoint}{sep}per_page={per_page}&page={page}" + payload = gh_json(["api", page_endpoint], repo=repo) + if payload is None: + break + if not isinstance(payload, list): + raise GhCommandError(f"Unexpected paginated payload from gh api {endpoint}") + items.extend(payload) + if len(payload) < per_page: + break + page += 1 + return items + + +def normalize_issue_comments(items): + out = [] + for item in items: + if not isinstance(item, dict): + continue + out.append( + { + "kind": "issue_comment", + "id": str(item.get("id") or ""), + "author": extract_login(item.get("user")), + "author_association": str(item.get("author_association") or ""), + "created_at": str(item.get("created_at") or ""), + "body": str(item.get("body") or ""), + "path": None, + "line": None, + "url": str(item.get("html_url") or ""), + } + ) + return out + + +def normalize_review_comments(items): + out = [] + for item in items: + if not isinstance(item, dict): + continue + line = item.get("line") + if line is None: + line = item.get("original_line") + out.append( + { + "kind": "review_comment", + "id": str(item.get("id") or ""), + "author": extract_login(item.get("user")), + "author_association": str(item.get("author_association") or ""), + "created_at": str(item.get("created_at") or ""), + "body": str(item.get("body") or ""), + "path": item.get("path"), + "line": line, + "url": str(item.get("html_url") or ""), + } + ) + return out + + +def normalize_reviews(items): + out = [] + for item in items: + if not isinstance(item, dict): + continue + out.append( + { + "kind": "review", + "id": str(item.get("id") or ""), + "author": extract_login(item.get("user")), + "author_association": str(item.get("author_association") or ""), + "created_at": str(item.get("submitted_at") or item.get("created_at") or ""), + "body": str(item.get("body") or ""), + "path": None, + "line": None, + "url": str(item.get("html_url") or ""), + } + ) + return out + + +def extract_login(user_obj): + if isinstance(user_obj, dict): + return str(user_obj.get("login") or "") + return "" + + +def is_bot_login(login): + return bool(login) and login.endswith("[bot]") + + +def is_actionable_review_bot_login(login): + if not is_bot_login(login): + return False + lower_login = login.lower() + return any(keyword in lower_login for keyword in REVIEW_BOT_LOGIN_KEYWORDS) + + +def is_trusted_human_review_author(item, authenticated_login): + author = str(item.get("author") or "") + if not author: + return False + if authenticated_login and author == authenticated_login: + return True + association = str(item.get("author_association") or "").upper() + return association in TRUSTED_AUTHOR_ASSOCIATIONS + + +def fetch_new_review_items(pr, state, fresh_state, authenticated_login=None): + repo = pr["repo"] + pr_number = pr["number"] + endpoints = comment_endpoints(repo, pr_number) + + issue_payload = gh_api_list_paginated(endpoints["issue_comment"], repo=repo) + review_comment_payload = gh_api_list_paginated(endpoints["review_comment"], repo=repo) + review_payload = gh_api_list_paginated(endpoints["review"], repo=repo) + + issue_items = normalize_issue_comments(issue_payload) + review_comment_items = normalize_review_comments(review_comment_payload) + review_items = normalize_reviews(review_payload) + all_items = 
issue_items + review_comment_items + review_items + + seen_issue = {str(x) for x in state.get("seen_issue_comment_ids") or []} + seen_review_comment = {str(x) for x in state.get("seen_review_comment_ids") or []} + seen_review = {str(x) for x in state.get("seen_review_ids") or []} + + # On a brand-new state file, surface existing review activity instead of + # silently treating it as seen. This avoids missing already-pending review + # feedback when monitoring starts after comments were posted. + + new_items = [] + for item in all_items: + item_id = item.get("id") + if not item_id: + continue + author = item.get("author") or "" + if not author: + continue + if is_bot_login(author): + if not is_actionable_review_bot_login(author): + continue + elif not is_trusted_human_review_author(item, authenticated_login): + continue + + kind = item["kind"] + if kind == "issue_comment" and item_id in seen_issue: + continue + if kind == "review_comment" and item_id in seen_review_comment: + continue + if kind == "review" and item_id in seen_review: + continue + + new_items.append(item) + if kind == "issue_comment": + seen_issue.add(item_id) + elif kind == "review_comment": + seen_review_comment.add(item_id) + elif kind == "review": + seen_review.add(item_id) + + new_items.sort(key=lambda item: (item.get("created_at") or "", item.get("kind") or "", item.get("id") or "")) + state["seen_issue_comment_ids"] = sorted(seen_issue) + state["seen_review_comment_ids"] = sorted(seen_review_comment) + state["seen_review_ids"] = sorted(seen_review) + return new_items + + +def current_retry_count(state, head_sha): + retries = state.get("retries_by_sha") or {} + value = retries.get(head_sha, 0) + try: + return int(value) + except (TypeError, ValueError): + return 0 + + +def set_retry_count(state, head_sha, count): + retries = state.get("retries_by_sha") + if not isinstance(retries, dict): + retries = {} + retries[head_sha] = int(count) + state["retries_by_sha"] = retries + + +def unique_actions(actions): + out = [] + seen = set() + for action in actions: + if action not in seen: + out.append(action) + seen.add(action) + return out + + +def is_pr_ready_to_merge(pr, checks_summary, new_review_items): + if pr["closed"] or pr["merged"]: + return False + if not checks_summary["all_terminal"]: + return False + if checks_summary["failed_count"] > 0 or checks_summary["pending_count"] > 0: + return False + if new_review_items: + return False + if str(pr.get("mergeable") or "") != "MERGEABLE": + return False + if str(pr.get("merge_state_status") or "") in MERGE_CONFLICT_OR_BLOCKING_STATES: + return False + if str(pr.get("review_decision") or "") in MERGE_BLOCKING_REVIEW_DECISIONS: + return False + return True + + +def recommend_actions(pr, checks_summary, failed_runs, new_review_items, retries_used, max_retries): + actions = [] + if pr["closed"] or pr["merged"]: + if new_review_items: + actions.append("process_review_comment") + actions.append("stop_pr_closed") + return unique_actions(actions) + + if is_pr_ready_to_merge(pr, checks_summary, new_review_items): + actions.append("ready_to_merge") + return unique_actions(actions) + + if new_review_items: + actions.append("process_review_comment") + + has_failed_pr_checks = checks_summary["failed_count"] > 0 + if has_failed_pr_checks: + if checks_summary["all_terminal"] and retries_used >= max_retries: + actions.append("stop_exhausted_retries") + else: + actions.append("diagnose_ci_failure") + if checks_summary["all_terminal"] and failed_runs and retries_used < max_retries: + 
actions.append("retry_failed_checks") + + if not actions: + actions.append("idle") + return unique_actions(actions) + + +def collect_snapshot(args): + pr = resolve_pr(args.pr, repo_override=args.repo) + state_path = Path(args.state_file) if args.state_file else default_state_file_for(pr) + state, fresh_state = load_state(state_path) + + if not state.get("started_at"): + state["started_at"] = int(time.time()) + + authenticated_login = get_authenticated_login() + new_review_items = fetch_new_review_items( + pr, + state, + fresh_state=fresh_state, + authenticated_login=authenticated_login, + ) + # Surface review feedback before drilling into CI and mergeability details. + # That keeps the babysitter responsive to new comments even when other + # actions are also available. + # `gh pr checks -R ` requires an explicit PR/branch/url argument. + # After resolving `--pr auto`, reuse the concrete PR number. + checks = get_pr_checks(str(pr["number"]), repo=pr["repo"]) + checks_summary = summarize_checks(checks) + workflow_runs = get_workflow_runs_for_sha(pr["repo"], pr["head_sha"]) + failed_runs = failed_runs_from_workflow_runs(workflow_runs, pr["head_sha"]) + + retries_used = current_retry_count(state, pr["head_sha"]) + actions = recommend_actions( + pr, + checks_summary, + failed_runs, + new_review_items, + retries_used, + args.max_flaky_retries, + ) + + state["pr"] = {"repo": pr["repo"], "number": pr["number"]} + state["last_seen_head_sha"] = pr["head_sha"] + state["last_snapshot_at"] = int(time.time()) + save_state(state_path, state) + + snapshot = { + "pr": pr, + "checks": checks_summary, + "failed_runs": failed_runs, + "new_review_items": new_review_items, + "actions": actions, + "retry_state": { + "current_sha_retries_used": retries_used, + "max_flaky_retries": args.max_flaky_retries, + }, + } + return snapshot, state_path + + +def retry_failed_now(args): + snapshot, state_path = collect_snapshot(args) + pr = snapshot["pr"] + checks_summary = snapshot["checks"] + failed_runs = snapshot["failed_runs"] + retries_used = snapshot["retry_state"]["current_sha_retries_used"] + max_retries = snapshot["retry_state"]["max_flaky_retries"] + + result = { + "snapshot": snapshot, + "state_file": str(state_path), + "rerun_attempted": False, + "rerun_count": 0, + "rerun_run_ids": [], + "reason": None, + } + + if pr["closed"] or pr["merged"]: + result["reason"] = "pr_closed" + return result + if checks_summary["failed_count"] <= 0: + result["reason"] = "no_failed_pr_checks" + return result + if not failed_runs: + result["reason"] = "no_failed_runs" + return result + if not checks_summary["all_terminal"]: + result["reason"] = "checks_still_pending" + return result + if retries_used >= max_retries: + result["reason"] = "retry_budget_exhausted" + return result + + for run in failed_runs: + run_id = run.get("run_id") + if run_id in (None, ""): + continue + gh_text(["run", "rerun", str(run_id), "--failed"], repo=pr["repo"]) + result["rerun_run_ids"].append(run_id) + + if result["rerun_run_ids"]: + state, _ = load_state(state_path) + new_count = current_retry_count(state, pr["head_sha"]) + 1 + set_retry_count(state, pr["head_sha"], new_count) + state["last_snapshot_at"] = int(time.time()) + save_state(state_path, state) + result["rerun_attempted"] = True + result["rerun_count"] = len(result["rerun_run_ids"]) + result["reason"] = "rerun_triggered" + else: + result["reason"] = "failed_runs_missing_ids" + + return result + + +def print_json(obj): + sys.stdout.write(json.dumps(obj, sort_keys=True) + "\n") + 
sys.stdout.flush() + + +def print_event(event, payload): + print_json({"event": event, "payload": payload}) + + +def is_ci_green(snapshot): + checks = snapshot.get("checks") or {} + return ( + bool(checks.get("all_terminal")) + and int(checks.get("failed_count") or 0) == 0 + and int(checks.get("pending_count") or 0) == 0 + ) + + +def snapshot_change_key(snapshot): + pr = snapshot.get("pr") or {} + checks = snapshot.get("checks") or {} + review_items = snapshot.get("new_review_items") or [] + return ( + str(pr.get("head_sha") or ""), + str(pr.get("state") or ""), + str(pr.get("mergeable") or ""), + str(pr.get("merge_state_status") or ""), + str(pr.get("review_decision") or ""), + int(checks.get("passed_count") or 0), + int(checks.get("failed_count") or 0), + int(checks.get("pending_count") or 0), + tuple( + (str(item.get("kind") or ""), str(item.get("id") or "")) + for item in review_items + if isinstance(item, dict) + ), + tuple(snapshot.get("actions") or []), + ) + + +def run_watch(args): + poll_seconds = args.poll_seconds + last_change_key = None + while True: + snapshot, state_path = collect_snapshot(args) + print_event( + "snapshot", + { + "snapshot": snapshot, + "state_file": str(state_path), + "next_poll_seconds": poll_seconds, + }, + ) + actions = set(snapshot.get("actions") or []) + if ( + "stop_pr_closed" in actions + or "stop_exhausted_retries" in actions + ): + print_event("stop", {"actions": snapshot.get("actions"), "pr": snapshot.get("pr")}) + return 0 + + current_change_key = snapshot_change_key(snapshot) + changed = current_change_key != last_change_key + green = is_ci_green(snapshot) + pr = snapshot.get("pr") or {} + pr_open = not bool(pr.get("closed")) and not bool(pr.get("merged")) + + if not green or pr_open: + poll_seconds = args.poll_seconds + elif changed or last_change_key is None: + poll_seconds = args.poll_seconds + + last_change_key = current_change_key + time.sleep(poll_seconds) + + +def main(): + args = parse_args() + try: + if args.retry_failed_now: + print_json(retry_failed_now(args)) + return 0 + if args.watch: + return run_watch(args) + snapshot, state_path = collect_snapshot(args) + snapshot["state_file"] = str(state_path) + print_json(snapshot) + return 0 + except (GhCommandError, RuntimeError, ValueError) as err: + sys.stderr.write(f"gh_pr_watch.py error: {err}\n") + return 1 + except KeyboardInterrupt: + sys.stderr.write("gh_pr_watch.py interrupted\n") + return 130 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/.opencode/skills/babysit-pr/scripts/test_gh_pr_watch.py b/.opencode/skills/babysit-pr/scripts/test_gh_pr_watch.py new file mode 100644 index 000000000000..c6a5d2568243 --- /dev/null +++ b/.opencode/skills/babysit-pr/scripts/test_gh_pr_watch.py @@ -0,0 +1,155 @@ +import argparse +import importlib.util +from pathlib import Path + +import pytest + + +MODULE_PATH = Path(__file__).with_name("gh_pr_watch.py") +MODULE_SPEC = importlib.util.spec_from_file_location("gh_pr_watch", MODULE_PATH) +gh_pr_watch = importlib.util.module_from_spec(MODULE_SPEC) +assert MODULE_SPEC.loader is not None +MODULE_SPEC.loader.exec_module(gh_pr_watch) + + +def sample_pr(): + return { + "number": 123, + "url": "https://github.com/openai/codex/pull/123", + "repo": "openai/codex", + "head_sha": "abc123", + "head_branch": "feature", + "state": "OPEN", + "merged": False, + "closed": False, + "mergeable": "MERGEABLE", + "merge_state_status": "CLEAN", + "review_decision": "", + } + + +def sample_checks(**overrides): + checks = { + "pending_count": 0, + 
"failed_count": 0, + "passed_count": 12, + "all_terminal": True, + } + checks.update(overrides) + return checks + + +def test_collect_snapshot_fetches_review_items_before_ci(monkeypatch, tmp_path): + call_order = [] + pr = sample_pr() + + monkeypatch.setattr(gh_pr_watch, "resolve_pr", lambda *args, **kwargs: pr) + monkeypatch.setattr(gh_pr_watch, "load_state", lambda path: ({}, True)) + monkeypatch.setattr( + gh_pr_watch, + "get_authenticated_login", + lambda: call_order.append("auth") or "octocat", + ) + monkeypatch.setattr( + gh_pr_watch, + "fetch_new_review_items", + lambda *args, **kwargs: call_order.append("review") or [], + ) + monkeypatch.setattr( + gh_pr_watch, + "get_pr_checks", + lambda *args, **kwargs: call_order.append("checks") or [], + ) + monkeypatch.setattr( + gh_pr_watch, + "summarize_checks", + lambda checks: call_order.append("summarize") or sample_checks(), + ) + monkeypatch.setattr( + gh_pr_watch, + "get_workflow_runs_for_sha", + lambda *args, **kwargs: call_order.append("workflow") or [], + ) + monkeypatch.setattr( + gh_pr_watch, + "failed_runs_from_workflow_runs", + lambda *args, **kwargs: call_order.append("failed_runs") or [], + ) + monkeypatch.setattr( + gh_pr_watch, + "recommend_actions", + lambda *args, **kwargs: call_order.append("recommend") or ["idle"], + ) + monkeypatch.setattr(gh_pr_watch, "save_state", lambda *args, **kwargs: None) + + args = argparse.Namespace( + pr="123", + repo=None, + state_file=str(tmp_path / "watcher-state.json"), + max_flaky_retries=3, + ) + + gh_pr_watch.collect_snapshot(args) + + assert call_order.index("review") < call_order.index("checks") + assert call_order.index("review") < call_order.index("workflow") + + +def test_recommend_actions_prioritizes_review_comments(): + actions = gh_pr_watch.recommend_actions( + sample_pr(), + sample_checks(failed_count=1), + [{"run_id": 99}], + [{"kind": "review_comment", "id": "1"}], + 0, + 3, + ) + + assert actions == [ + "process_review_comment", + "diagnose_ci_failure", + "retry_failed_checks", + ] + + +def test_run_watch_keeps_polling_open_ready_to_merge_pr(monkeypatch): + sleeps = [] + events = [] + snapshot = { + "pr": sample_pr(), + "checks": sample_checks(), + "failed_runs": [], + "new_review_items": [], + "actions": ["ready_to_merge"], + "retry_state": { + "current_sha_retries_used": 0, + "max_flaky_retries": 3, + }, + } + + monkeypatch.setattr( + gh_pr_watch, + "collect_snapshot", + lambda args: (snapshot, Path("/tmp/codex-babysit-pr-state.json")), + ) + monkeypatch.setattr( + gh_pr_watch, + "print_event", + lambda event, payload: events.append((event, payload)), + ) + + class StopWatch(Exception): + pass + + def fake_sleep(seconds): + sleeps.append(seconds) + if len(sleeps) >= 2: + raise StopWatch + + monkeypatch.setattr(gh_pr_watch.time, "sleep", fake_sleep) + + with pytest.raises(StopWatch): + gh_pr_watch.run_watch(argparse.Namespace(poll_seconds=30)) + + assert sleeps == [30, 30] + assert [event for event, _ in events] == ["snapshot", "snapshot"] diff --git a/.opencode/skills/code-review-breaking-changes/SKILL.md b/.opencode/skills/code-review-breaking-changes/SKILL.md new file mode 100644 index 000000000000..d0bddf281e8b --- /dev/null +++ b/.opencode/skills/code-review-breaking-changes/SKILL.md @@ -0,0 +1,12 @@ +--- +name: code-breaking-changes +description: Breaking changes +--- + +Search for breaking changes in external integration surfaces: +- app-server APIs +- CLI parameters +- configuration loading +- resuming sessions from existing rollouts + +Do not stop after finding one 
issue; analyze all possible ways breaking changes can happen.
diff --git a/.opencode/skills/code-review-change-size/SKILL.md b/.opencode/skills/code-review-change-size/SKILL.md
new file mode 100644
index 000000000000..4e8048dcd4b4
--- /dev/null
+++ b/.opencode/skills/code-review-change-size/SKILL.md
@@ -0,0 +1,11 @@
+---
+name: code-review-change-size
+description: Change size guidance (800 lines)
+---
+
+Unless the change is mechanical, the total number of changed lines should not exceed 800.
+For complex logic changes, the size should stay under 500 lines.
+
+If the change is larger, explain whether it can be split into reviewable stages and identify the smallest coherent stage to land first.
+Base the staging suggestion on the actual diff, dependencies, and affected call sites.
+
diff --git a/.opencode/skills/code-review-context/SKILL.md b/.opencode/skills/code-review-context/SKILL.md
new file mode 100644
index 000000000000..7faf3d7cd25b
--- /dev/null
+++ b/.opencode/skills/code-review-context/SKILL.md
@@ -0,0 +1,13 @@
+---
+name: code-review-context
+description: Model visible context
+---
+
+Codex maintains a context (a history of messages) that is sent to the model in inference requests.
+
+1. No history rewrites - the context must be built up incrementally.
+2. Avoid frequent changes to the context that cause cache misses.
+3. No unbounded items - everything injected into the model context must have a bounded size and a hard cap.
+4. No items larger than 10K tokens.
+5. Highlight new individual items that can cross 1K tokens as P0. These need an additional manual review.
+6. All injected fragments must be defined as structs in `core/context` and implement the `ContextualUserFragment` trait.
\ No newline at end of file
diff --git a/.opencode/skills/code-review-testing/SKILL.md b/.opencode/skills/code-review-testing/SKILL.md
new file mode 100644
index 000000000000..c8d99e13cdeb
--- /dev/null
+++ b/.opencode/skills/code-review-testing/SKILL.md
@@ -0,0 +1,14 @@
+---
+name: code-review-testing
+description: Test authoring guidance
+---
+
+For agent changes, prefer integration tests over unit tests. Integration tests are under `core/suite` and use `test_codex` to set up a test instance of Codex.
+
+Features that change the agent logic MUST add an integration test:
+- Provide a list of major logic changes and user-facing behaviors that need to be tested.
+
+If unit tests are needed, put them in a dedicated test file (`*_tests.rs`).
+Avoid test-only functions in the main implementation.
+
+Check whether there are existing helpers to make tests more streamlined and readable.
diff --git a/.opencode/skills/code-review/SKILL.md b/.opencode/skills/code-review/SKILL.md
new file mode 100644
index 000000000000..eec0787c2093
--- /dev/null
+++ b/.opencode/skills/code-review/SKILL.md
@@ -0,0 +1,14 @@
+---
+name: code-review
+description: Run a final code review on a pull request
+---
+
+Use subagents to review code using all code-review-* skills in this repository other than this orchestrator. Run one subagent per skill, pass the full skill path to each subagent, and use xhigh reasoning.
+
+You must return every single issue from every subagent. You can return an unlimited number of findings.
+Use raw Markdown to report findings.
+Number findings for ease of reference.
+Each finding must include a specific file path and line number.
+
+If the GitHub user running the review is the owner of the pull request, add a `code-reviewed` label.
+Do not leave GitHub comments unless explicitly asked.
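+
+For reference, a findings list following the rules above might look like this (the paths and issues below are placeholders, not real findings):
+
+```markdown
+1. `core/context/fragment.rs:42` - Injected fragment has no hard size cap, violating the bounded-context rule.
+2. `cli/args.rs:118` - Renaming the `--profile` flag breaks existing CLI invocations.
+```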
diff --git a/.opencode/skills/codex-bug/SKILL.md b/.opencode/skills/codex-bug/SKILL.md
new file mode 100644
index 000000000000..c7a688e64f61
--- /dev/null
+++ b/.opencode/skills/codex-bug/SKILL.md
@@ -0,0 +1,48 @@
+---
+name: codex-bug
+description: Diagnose GitHub bug reports in openai/codex. Use when given a GitHub issue URL from openai/codex and asked to decide next steps such as verifying against the repo, requesting more info, or explaining why it is not a bug; follow any additional user-provided instructions.
+---
+
+# Codex Bug
+
+## Overview
+
+Diagnose a Codex GitHub bug report and decide the next action: verify against sources, request more info, or explain why it is not a bug.
+
+## Workflow
+
+1. Confirm the input
+
+- Require a GitHub issue URL that points to `github.com/openai/codex/issues/…`.
+- If the URL is missing or not in the right repo, ask the user for the correct link.
+
+2. Network access
+
+- Always access the issue over the network immediately, even if you think access is blocked or unavailable.
+- Prefer the GitHub API over HTML pages because the HTML is noisy:
+  - Issue: `https://api.github.com/repos/openai/codex/issues/<ISSUE_NUMBER>`
+  - Comments: `https://api.github.com/repos/openai/codex/issues/<ISSUE_NUMBER>/comments`
+- If the environment requires explicit approval, request it on demand via the tool and continue without additional user prompting.
+- Only if the network attempt fails after requesting approval should you explain what you can do offline (e.g., draft a response template) and ask how to proceed.
+
+3. Read the issue
+
+- Use the GitHub API responses (issue + comments) as the source of truth rather than scraping the HTML issue page.
+- Extract: title, body, repro steps, expected vs actual, environment, logs, and any attachments.
+- Note whether the report already includes logs or session details.
+- If the report includes a thread ID, mention it in the summary and use it to look up the logs and session details if you have access to them.
+
+4. Summarize the bug before investigating
+
+- Before inspecting code, docs, or logs in depth, write a short summary of the report in your own words.
+- Include the reported behavior, expected behavior, repro steps, environment, and what evidence is already attached or missing.
+
+5. Decide the course of action
+
+- **Verify with sources** when the report is specific and likely reproducible. Inspect relevant Codex files (or mention the files to inspect if access is unavailable).
+- **Request more information** when the report is vague, is missing repro steps, or lacks logs/environment details.
+- **Explain not a bug** when the report contradicts current behavior or documented constraints (cite the evidence from the issue and any local sources you checked).
+
+6. Respond
+
+- Provide a concise report of your findings and next steps.
diff --git a/.opencode/skills/codex-issue-digest/SKILL.md b/.opencode/skills/codex-issue-digest/SKILL.md
new file mode 100644
index 000000000000..b531748f8c47
--- /dev/null
+++ b/.opencode/skills/codex-issue-digest/SKILL.md
@@ -0,0 +1,102 @@
+---
+name: codex-issue-digest
+description: Run a GitHub issue digest for openai/codex by feature-area labels, all areas, and configurable time windows. Use when asked to summarize recent Codex bug reports or enhancement requests, especially for owner-specific labels such as tui, exec, app, or similar areas.
+---
+
+# Codex Issue Digest
+
+## Objective
+
+Produce a concise, insight-oriented digest of `openai/codex` issues for the requested feature-area labels over the previous 24 hours by default. Honor a different duration when the user asks for one, for example "past week" or "48 hours".
+
+Include only issues that currently have `bug` or `enhancement` plus at least one requested owner label. If the user asks for all areas or all labels, collect `bug`/`enhancement` issues across all labels.
+
+## Inputs
+
+- Feature-area labels, for example `tui exec`
+- `all areas` / `all labels` to scan all current feature labels
+- Optional repo override, default `openai/codex`
+- Optional time window, default previous 24 hours; examples: `48h`, `7d`, `1w`, `past week`
+
+## Workflow
+
+1. Run the collector from a current Codex repo checkout:
+
+```bash
+python3 .codex/skills/codex-issue-digest/scripts/collect_issue_digest.py --labels tui exec --window-hours 24
+```
+
+Use `--window "past week"` or `--window-hours 168` when the user asks for a non-default duration. Use `--all-labels` when the user says all areas or all labels.
+
+2. Use the JSON as the source of truth. It includes new issues, new issue comments, new reactions/upvotes, current labels, current reaction counts, model-ready `summary_inputs`, and detailed `digest_rows`.
+3. Start the report with `## Summary`, then `## Details`.
+4. In `## Summary`, write skim-first headlines:
+   - Lead with the most important fact or judgment. Do not start with aggregate counts unless the aggregate itself is the story.
+   - Make the first 1-3 bullets answer "what should owners pay attention to right now?"
+   - Bold only the critical insight phrase in each high-priority bullet, for example `**GPT-5.5 context is the dominant pressure point**`.
+   - Keep summary bullets short enough to scan in about 20 seconds.
+   - Put broad stats near the end of the summary, after the owner-relevant takeaways.
+   - Say clearly when there is nothing significant to act on.
+   - Call out any areas or themes receiving lots of user attention.
+   - Cluster and name themes yourself from `summary_inputs`; the collector intentionally does not hard-code issue categories.
+   - Use a cluster only when the issues genuinely share the same product problem. If several issues merely share a broad platform or label, describe them individually.
+   - Do not omit a repeated theme just because its individual issues fall below the details table cutoff. Several similar reports should be called out as a repeated customer concern.
+   - For single-issue rows, summarize the concern directly instead of calling it a cluster.
+   - Use inline numbered issue links from each relevant row's `ref_markdown`.
+5. In `## Details`, include a compact table only when useful:
+   - Prefer rows from `digest_rows`; include a `Refs` column using each row's `ref_markdown`.
+   - Keep the table short; omit low-signal rows when the summary already covers them.
+   - Use compact columns such as marker, area, type, description, interactions, and refs.
+   - The `Description` cell should be a short owner-readable phrase. Use row `description`, title, body excerpts, and recent comments, but do not mechanically copy the raw GitHub issue title when it contains incidental details.
+   - When there is no meaningful signal, replace the table with a clear quiet/no-concern sentence.
+6. Use the JSON `attention_marker` exactly. It is empty for normal rows, `πŸ”₯` for elevated rows, and `πŸ”₯πŸ”₯` for very high-attention rows. The actual cutoffs are in `attention_thresholds`.
+7. 
Use inline numbered references where a row or bullet points to issues, for example `Compaction bugs [1](https://github.com/openai/codex/issues/123), [2](https://github.com/openai/codex/issues/456)`. Do not add a separate footnotes section. +8. Label `interactions` as `Interactions`; it counts posts/comments/reactions during the requested window, not unique people. +9. Mention the collector `script_version`, repo checkout `git_head`, and time window in the digest footer or final line. + +## Reaction Handling + +The collector uses GitHub reactions endpoints, which include `created_at`, to count reactions created during the digest window for hydrated issues. It reports both in-window reaction counts and current reaction totals. Treat current reaction totals as standing engagement, and treat `new_reactions` / `new_upvotes` as windowed activity. + +By default, the collector fetches issue comments with `since=` and caps the number of comment pages per issue. This keeps very long historical threads from dominating a digest run and focuses the report on recent posts. Use `--fetch-all-comments` only when exhaustive comment history is more important than runtime. + +GitHub issue search is still seeded by issue `updated_at`, so a purely reaction-only issue may be missed if reactions do not bump `updated_at`. Covering every reaction-only case would require either a persisted snapshot store or a broader scan of labeled issues. + +## Attention Markers + +The collector scales attention markers by the requested time window. The baseline is 10 human user interactions for `πŸ”₯` and 20 for `πŸ”₯πŸ”₯` over 24 hours; longer or shorter windows scale those cutoffs linearly and round up. For example, a one-week report uses 70 and 140 interactions. Human user interactions are human-authored new issue posts, human-authored new comments, and human reactions created during the window, including upvotes. Bot posts and bot reactions are excluded. In prose, explain this as high user interaction rather than naming the emoji. + +## Freshness + +The automation should run from a repo checkout that contains this skill. For shared daily use, prefer one of these patterns: + +- Run the automation in a checkout that is refreshed before the automation starts, for example with `git pull --ff-only`. +- If the automation cannot safely mutate the checkout, have it report the current `git_head` from the collector output so readers know which skill/script version produced the digest. + +## Sample Owner Prompt + +```text +Use $codex-issue-digest to run the Codex issue digest for labels tui and exec over the previous 24 hours. +``` + +```text +Use $codex-issue-digest to run the Codex issue digest for all areas over the past week. 
+``` + +## Validation + +Dry run the collector against recent issues: + +```bash +python3 .codex/skills/codex-issue-digest/scripts/collect_issue_digest.py --labels tui exec --window-hours 24 +``` + +```bash +python3 .codex/skills/codex-issue-digest/scripts/collect_issue_digest.py --all-labels --window "past week" --limit-issues 10 +``` + +Run the focused script tests: + +```bash +pytest .codex/skills/codex-issue-digest/scripts/test_collect_issue_digest.py +``` diff --git a/.opencode/skills/codex-issue-digest/agents/openai.yaml b/.opencode/skills/codex-issue-digest/agents/openai.yaml new file mode 100644 index 000000000000..706ce5e11b3e --- /dev/null +++ b/.opencode/skills/codex-issue-digest/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Codex Issue Digest" + short_description: "Summarize Codex issues by labels or all areas" + default_prompt: "Use $codex-issue-digest to run the Codex issue digest for labels tui and exec over the previous 24 hours." diff --git a/.opencode/skills/codex-issue-digest/scripts/collect_issue_digest.py b/.opencode/skills/codex-issue-digest/scripts/collect_issue_digest.py new file mode 100644 index 000000000000..e211af08f8b9 --- /dev/null +++ b/.opencode/skills/codex-issue-digest/scripts/collect_issue_digest.py @@ -0,0 +1,988 @@ +#!/usr/bin/env python3 +"""Collect recent openai/codex issue activity for owner-focused digests.""" + +import argparse +import json +import math +import re +import subprocess +import sys +from datetime import datetime, timedelta, timezone +from pathlib import Path +from urllib.parse import quote + +SCRIPT_VERSION = 2 +QUALIFYING_KIND_LABELS = ("bug", "enhancement") +REACTION_KEYS = ("+1", "-1", "laugh", "hooray", "confused", "heart", "rocket", "eyes") +BASE_ATTENTION_WINDOW_HOURS = 24.0 +ONE_ATTENTION_INTERACTION_THRESHOLD = 10 +TWO_ATTENTION_INTERACTION_THRESHOLD = 20 +ALL_LABEL_PHRASES = {"all", "all areas", "all labels", "all-areas", "all-labels", "*"} + + +class GhCommandError(RuntimeError): + pass + + +def parse_args(): + parser = argparse.ArgumentParser( + description="Collect recent GitHub issue activity for a Codex owner digest." + ) + parser.add_argument( + "--repo", default="openai/codex", help="OWNER/REPO, default openai/codex" + ) + parser.add_argument( + "--labels", + nargs="+", + default=[], + help="Feature-area labels owned by the digest recipient, for example: tui exec", + ) + parser.add_argument( + "--all-labels", + action="store_true", + help="Collect bug/enhancement issues across all feature-area labels", + ) + parser.add_argument( + "--window", + help='Lookback duration such as "24h", "7d", "1w", or "past week"', + ) + parser.add_argument( + "--window-hours", type=float, default=24.0, help="Lookback window" + ) + parser.add_argument( + "--since", help="UTC ISO timestamp override for the window start" + ) + parser.add_argument("--until", help="UTC ISO timestamp override for the window end") + parser.add_argument( + "--limit-issues", + type=int, + default=200, + help="Maximum candidate issues to hydrate after search", + ) + parser.add_argument( + "--body-chars", type=int, default=1200, help="Issue body excerpt length" + ) + parser.add_argument( + "--comment-chars", type=int, default=900, help="Comment excerpt length" + ) + parser.add_argument( + "--max-comment-pages", + type=int, + default=3, + help=( + "Maximum pages of issue comments to hydrate per issue after applying the " + "window filter. Use 0 with --fetch-all-comments for no page cap." 
+ ), + ) + parser.add_argument( + "--fetch-all-comments", + action="store_true", + help="Hydrate complete issue comment histories instead of only window-updated comments.", + ) + return parser.parse_args() + + +def parse_timestamp(value, arg_name): + if value is None: + return None + normalized = value.strip() + if not normalized: + return None + if normalized.endswith("Z"): + normalized = f"{normalized[:-1]}+00:00" + try: + parsed = datetime.fromisoformat(normalized) + except ValueError as err: + raise ValueError(f"{arg_name} must be an ISO timestamp") from err + if parsed.tzinfo is None: + parsed = parsed.replace(tzinfo=timezone.utc) + return parsed.astimezone(timezone.utc) + + +def format_timestamp(value): + return ( + value.astimezone(timezone.utc) + .replace(microsecond=0) + .isoformat() + .replace("+00:00", "Z") + ) + + +def resolve_window(args): + until = parse_timestamp(args.until, "--until") or datetime.now(timezone.utc) + since = parse_timestamp(args.since, "--since") + if since is None: + hours = parse_duration_hours(getattr(args, "window", None)) + if hours is None: + hours = getattr(args, "window_hours", 24.0) + if hours <= 0: + raise ValueError("window duration must be > 0") + since = until - timedelta(hours=hours) + if since >= until: + raise ValueError("--since must be before --until") + return since, until + + +def parse_duration_hours(value): + if value is None: + return None + text = value.strip().casefold().replace("_", " ") + if not text: + return None + text = re.sub(r"^(past|last)\s+", "", text) + aliases = { + "day": 24.0, + "24h": 24.0, + "week": 168.0, + "7d": 168.0, + } + if text in aliases: + return aliases[text] + match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(h|hr|hrs|hour|hours)", text) + if match: + return float(match.group(1)) + match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(d|day|days)", text) + if match: + return float(match.group(1)) * 24.0 + match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(w|week|weeks)", text) + if match: + return float(match.group(1)) * 168.0 + raise ValueError(f"Unsupported duration: {value}") + + +def normalize_requested_labels(labels, all_labels=False): + out = [] + seen = set() + for raw in labels: + for piece in raw.split(","): + label = piece.strip() + if not label: + continue + key = label.casefold() + if key not in seen: + out.append(label) + seen.add(key) + phrase = " ".join(label.casefold() for label in out) + if all_labels or phrase in ALL_LABEL_PHRASES: + return [], True + if not out: + raise ValueError( + "At least one feature-area label is required, or use --all-labels" + ) + return out, False + + +def quote_label(label): + if re.fullmatch(r"[A-Za-z0-9_.:-]+", label): + return f"label:{label}" + escaped = label.replace('"', '\\"') + return f'label:"{escaped}"' + + +def build_search_queries( + repo, owner_labels, since, kind_labels=QUALIFYING_KIND_LABELS, all_labels=False +): + since_date = since.date().isoformat() + queries = [] + if all_labels: + for kind_label in kind_labels: + queries.append( + " ".join( + [ + f"repo:{repo}", + "is:issue", + f"updated:>={since_date}", + quote_label(kind_label), + ] + ) + ) + return queries + for owner_label in owner_labels: + for kind_label in kind_labels: + queries.append( + " ".join( + [ + f"repo:{repo}", + "is:issue", + f"updated:>={since_date}", + quote_label(owner_label), + quote_label(kind_label), + ] + ) + ) + return queries + + +def _format_gh_error(cmd, err): + stdout = (err.stdout or "").strip() + stderr = (err.stderr or "").strip() + parts = [f"GitHub CLI command failed: {' '.join(cmd)}"] + 
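# Attach captured stdout/stderr so the raised error message is self-contained.
+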
if stdout: + parts.append(f"stdout: {stdout}") + if stderr: + parts.append(f"stderr: {stderr}") + return "\n".join(parts) + + +def gh_json(args): + cmd = ["gh", *args] + try: + proc = subprocess.run(cmd, check=True, capture_output=True, text=True) + except FileNotFoundError as err: + raise GhCommandError("`gh` command not found") from err + except subprocess.CalledProcessError as err: + raise GhCommandError(_format_gh_error(cmd, err)) from err + raw = proc.stdout.strip() + if not raw: + return None + try: + return json.loads(raw) + except json.JSONDecodeError as err: + raise GhCommandError( + f"Failed to parse JSON from gh output for {' '.join(args)}" + ) from err + + +def gh_text(args): + cmd = ["gh", *args] + try: + proc = subprocess.run(cmd, check=True, capture_output=True, text=True) + except (FileNotFoundError, subprocess.CalledProcessError): + return "" + return proc.stdout.strip() + + +def git_head(): + try: + proc = subprocess.run( + ["git", "rev-parse", "--short=12", "HEAD"], + check=True, + capture_output=True, + text=True, + ) + except (FileNotFoundError, subprocess.CalledProcessError): + return None + return proc.stdout.strip() or None + + +def skill_relative_path(): + try: + return str(Path(__file__).resolve().relative_to(Path.cwd().resolve())) + except ValueError: + return str(Path(__file__).resolve()) + + +def gh_api_list_paginated(endpoint, per_page=100, max_pages=None, with_metadata=False): + items = [] + page = 1 + truncated = False + while True: + sep = "&" if "?" in endpoint else "?" + page_endpoint = f"{endpoint}{sep}per_page={per_page}&page={page}" + payload = gh_json(["api", page_endpoint]) + if payload is None: + break + if not isinstance(payload, list): + raise GhCommandError(f"Unexpected paginated payload from gh api {endpoint}") + items.extend(payload) + if len(payload) < per_page: + break + if max_pages is not None and page >= max_pages: + truncated = True + break + page += 1 + if with_metadata: + return { + "items": items, + "truncated": truncated, + "pages": page, + "max_pages": max_pages, + } + return items + + +def search_issue_numbers(queries, limit): + numbers = {} + for query in queries: + page = 1 + while True: + payload = gh_json( + [ + "api", + "search/issues", + "-X", + "GET", + "-f", + f"q={query}", + "-f", + "per_page=100", + "-f", + f"page={page}", + ] + ) + if not isinstance(payload, dict): + raise GhCommandError("Unexpected payload from GitHub issue search") + items = payload.get("items") or [] + if not isinstance(items, list): + raise GhCommandError("Expected search `items` to be a list") + for item in items: + if not isinstance(item, dict): + continue + number = item.get("number") + if isinstance(number, int): + numbers[number] = str(item.get("updated_at") or "") + if len(items) < 100 or len(numbers) >= limit: + break + page += 1 + ordered = sorted( + numbers, key=lambda number: (numbers[number], number), reverse=True + ) + return ordered[:limit] + + +def fetch_issue(repo, number): + payload = gh_json(["api", f"repos/{repo}/issues/{number}"]) + if not isinstance(payload, dict): + raise GhCommandError(f"Unexpected issue payload for #{number}") + return payload + + +def fetch_comments(repo, number, since=None, max_pages=None): + endpoint = f"repos/{repo}/issues/{number}/comments" + if since is not None: + endpoint = f"{endpoint}?since={quote(format_timestamp(since), safe='')}" + return gh_api_list_paginated( + endpoint, + max_pages=max_pages, + with_metadata=True, + ) + + +def fetch_reactions_for_item(endpoint, item): + if 
reaction_summary(item)["total"] <= 0: + return [] + return gh_api_list_paginated(endpoint) + + +def fetch_comment_reactions(repo, comments): + reactions_by_comment_id = {} + for comment in comments: + comment_id = comment.get("id") + if comment_id in (None, ""): + continue + endpoint = f"repos/{repo}/issues/comments/{comment_id}/reactions" + reactions_by_comment_id[comment_id] = fetch_reactions_for_item( + endpoint, comment + ) + return reactions_by_comment_id + + +def extract_login(user_obj): + if isinstance(user_obj, dict): + return str(user_obj.get("login") or "") + return "" + + +def is_bot_login(login): + return bool(login) and login.lower().endswith("[bot]") + + +def is_human_user(user_obj): + login = extract_login(user_obj) + return bool(login) and not is_bot_login(login) + + +def label_names(issue): + labels = [] + for label in issue.get("labels") or []: + if isinstance(label, dict) and label.get("name"): + labels.append(str(label["name"])) + return sorted(labels, key=str.casefold) + + +def matching_labels(labels, requested): + labels_by_key = {label.casefold(): label for label in labels} + return [label for label in requested if label.casefold() in labels_by_key] + + +def area_labels(labels): + kind_keys = {label.casefold() for label in QUALIFYING_KIND_LABELS} + return [label for label in labels if label.casefold() not in kind_keys] + + +def attention_thresholds_for_window(window_hours): + if window_hours <= 0: + raise ValueError("window_hours must be > 0") + window_hours = round(window_hours, 6) + scale = window_hours / BASE_ATTENTION_WINDOW_HOURS + elevated = max(1, math.ceil(ONE_ATTENTION_INTERACTION_THRESHOLD * scale)) + very_high = max( + elevated + 1, math.ceil(TWO_ATTENTION_INTERACTION_THRESHOLD * scale) + ) + return { + "base_window_hours": BASE_ATTENTION_WINDOW_HOURS, + "window_hours": round(window_hours, 3), + "scale": round(scale, 3), + "elevated": elevated, + "very_high": very_high, + } + + +def attention_level_for(user_interactions, attention_thresholds=None): + thresholds = attention_thresholds or attention_thresholds_for_window( + BASE_ATTENTION_WINDOW_HOURS + ) + if user_interactions >= thresholds["very_high"]: + return 2 + if user_interactions >= thresholds["elevated"]: + return 1 + return 0 + + +def attention_marker_for(user_interactions, attention_thresholds=None): + return "πŸ”₯" * attention_level_for(user_interactions, attention_thresholds) + + +def reaction_summary(item): + reactions = item.get("reactions") + if not isinstance(reactions, dict): + return {"total": 0, "counts": {}} + counts = {} + for key in REACTION_KEYS: + value = reactions.get(key, 0) + if isinstance(value, int) and value: + counts[key] = value + total = reactions.get("total_count") + if not isinstance(total, int): + total = sum(counts.values()) + return {"total": total, "counts": counts} + + +def reaction_event_summary(reactions, since, until): + counts = {} + total = 0 + for reaction in reactions or []: + if not isinstance(reaction, dict): + continue + if not is_in_window(str(reaction.get("created_at") or ""), since, until): + continue + if not is_human_user(reaction.get("user")): + continue + content = str(reaction.get("content") or "") + if not content: + continue + counts[content] = counts.get(content, 0) + 1 + total += 1 + return { + "total": total, + "counts": counts, + "upvotes": counts.get("+1", 0), + } + + +def compact_text(value, limit): + text = re.sub(r"\s+", " ", str(value or "")).strip() + if limit <= 0: + return "" + if len(text) <= limit: + return text + return f"{text[: 
max(limit - 1, 0)].rstrip()}..." + + +def clean_title_for_description(title): + cleaned = re.sub(r"\s+", " ", str(title or "")).strip() + cleaned = re.sub( + r"^(codex(?: desktop| app|\.app| cli)?|desktop|windows codex app)\s*[:,-]\s*", + "", + cleaned, + flags=re.IGNORECASE, + ) + cleaned = re.sub(r"^on windows,\s*", "Windows: ", cleaned, flags=re.IGNORECASE) + cleaned = cleaned.strip(" -:;") + return compact_text(cleaned, 80) or "Issue needs owner review" + + +def issue_description(issue): + return clean_title_for_description(issue.get("title")) + + +def is_in_window(timestamp, since, until): + parsed = parse_timestamp(timestamp, "timestamp") + if parsed is None: + return False + return since <= parsed < until + + +def summarize_comment( + comment, comment_chars, reaction_events=None, since=None, until=None +): + reactions = reaction_summary(comment) + new_reactions = ( + reaction_event_summary(reaction_events, since, until) + if since is not None and until is not None + else {"total": 0, "counts": {}, "upvotes": 0} + ) + human_user_interaction = is_human_user(comment.get("user")) + return { + "id": comment.get("id"), + "author": extract_login(comment.get("user")), + "author_association": str(comment.get("author_association") or ""), + "created_at": str(comment.get("created_at") or ""), + "updated_at": str(comment.get("updated_at") or ""), + "url": str(comment.get("html_url") or ""), + "human_user_interaction": human_user_interaction, + "reactions": reactions["counts"], + "reaction_total": reactions["total"], + "new_reactions": new_reactions["total"], + "new_upvotes": new_reactions["upvotes"], + "new_reaction_counts": new_reactions["counts"], + "body_excerpt": compact_text(comment.get("body"), comment_chars), + } + + +def summarize_issue( + issue, + comments, + requested_labels, + since, + until, + body_chars, + comment_chars, + issue_reaction_events=None, + comment_reactions_by_id=None, + all_labels=False, + comments_hydration=None, + attention_thresholds=None, +): + labels = label_names(issue) + labels_by_key = {label.casefold() for label in labels} + kind_labels = [ + label for label in QUALIFYING_KIND_LABELS if label.casefold() in labels_by_key + ] + if all_labels: + owner_labels = area_labels(labels) or ["unlabeled"] + else: + owner_labels = matching_labels(labels, requested_labels) + if not kind_labels or not owner_labels: + return None + + updated_at = str(issue.get("updated_at") or "") + if not is_in_window(updated_at, since, until): + return None + + new_issue = is_in_window(str(issue.get("created_at") or ""), since, until) + comment_reactions_by_id = comment_reactions_by_id or {} + new_comments = [ + summarize_comment( + comment, + comment_chars, + reaction_events=comment_reactions_by_id.get(comment.get("id")), + since=since, + until=until, + ) + for comment in comments + if is_in_window(str(comment.get("created_at") or ""), since, until) + ] + new_comments.sort(key=lambda item: (item["created_at"], str(item["id"]))) + + issue_reactions = reaction_summary(issue) + issue_reaction_events_summary = reaction_event_summary( + issue_reaction_events, since, until + ) + comment_reaction_events_summary = reaction_event_summary( + [ + reaction + for reactions in comment_reactions_by_id.values() + for reaction in reactions + ], + since, + until, + ) + new_reactions = ( + issue_reaction_events_summary["total"] + + comment_reaction_events_summary["total"] + ) + new_upvotes = ( + issue_reaction_events_summary["upvotes"] + + comment_reaction_events_summary["upvotes"] + ) + 
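# Only human, in-window signals count toward attention; standing reaction totals below are reported separately.
+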
all_comment_reaction_total = sum( + reaction_summary(comment)["total"] for comment in comments + ) + new_comment_reaction_total = sum( + comment["reaction_total"] for comment in new_comments + ) + new_issue_user_interaction = new_issue and is_human_user(issue.get("user")) + new_comment_user_interactions = sum( + 1 for comment in new_comments if comment["human_user_interaction"] + ) + user_interactions = ( + int(new_issue_user_interaction) + new_comment_user_interactions + new_reactions + ) + attention_level = attention_level_for(user_interactions, attention_thresholds) + attention_marker = attention_marker_for(user_interactions, attention_thresholds) + updated_without_visible_new_post = ( + not new_issue and not new_comments and new_reactions == 0 + ) + + engagement_score = ( + len(new_comments) * 3 + + new_reactions + + issue_reactions["total"] + + new_comment_reaction_total + + min(int(issue.get("comments") or len(comments) or 0), 10) + ) + + return { + "number": issue.get("number"), + "title": str(issue.get("title") or ""), + "description": issue_description(issue), + "url": str(issue.get("html_url") or ""), + "state": str(issue.get("state") or ""), + "author": extract_login(issue.get("user")), + "author_association": str(issue.get("author_association") or ""), + "created_at": str(issue.get("created_at") or ""), + "updated_at": updated_at, + "labels": labels, + "kind_labels": kind_labels, + "owner_labels": owner_labels, + "comments_total": int(issue.get("comments") or len(comments) or 0), + "comments_hydration": comments_hydration + or { + "fetched": len(comments), + "since": None, + "truncated": False, + "max_pages": None, + }, + "issue_reactions": issue_reactions["counts"], + "issue_reaction_total": issue_reactions["total"], + "comment_reaction_total": all_comment_reaction_total, + "new_comment_reaction_total": new_comment_reaction_total, + "new_issue_reactions": issue_reaction_events_summary["total"], + "new_issue_upvotes": issue_reaction_events_summary["upvotes"], + "new_comment_reactions": comment_reaction_events_summary["total"], + "new_comment_upvotes": comment_reaction_events_summary["upvotes"], + "new_reactions": new_reactions, + "new_upvotes": new_upvotes, + "user_interactions": user_interactions, + "attention": attention_level > 0, + "attention_level": attention_level, + "attention_marker": attention_marker, + "engagement_score": engagement_score, + "activity": { + "new_issue": new_issue, + "new_comments": len(new_comments), + "new_human_comments": new_comment_user_interactions, + "new_reactions": new_reactions, + "new_upvotes": new_upvotes, + "updated_without_visible_new_post": updated_without_visible_new_post, + }, + "body_excerpt": compact_text(issue.get("body"), body_chars), + "new_comments": new_comments, + } + + +def count_by_label(issues, labels): + out = {} + for label in labels: + matching = [issue for issue in issues if label in issue["owner_labels"]] + out[label] = { + "issues": len(matching), + "new_issues": sum( + 1 for issue in matching if issue["activity"]["new_issue"] + ), + "new_comments": sum( + issue["activity"]["new_comments"] for issue in matching + ), + } + return out + + +def count_by_kind(issues): + out = {} + for kind in QUALIFYING_KIND_LABELS: + matching = [issue for issue in issues if kind in issue["kind_labels"]] + out[kind] = { + "issues": len(matching), + "new_issues": sum( + 1 for issue in matching if issue["activity"]["new_issue"] + ), + "new_comments": sum( + issue["activity"]["new_comments"] for issue in matching + ), + } + return out + + 
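+# Ranking helpers: attention flags sort first, then windowed human interactions and engagement.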
+def hot_items(issues, limit=8): + ranked = sorted( + issues, + key=lambda issue: ( + issue["attention"], + issue["attention_level"], + issue["user_interactions"], + issue["engagement_score"], + issue["activity"]["new_comments"], + issue["issue_reaction_total"] + issue["comment_reaction_total"], + issue["updated_at"], + ), + reverse=True, + ) + return [ + { + "number": issue["number"], + "title": issue["title"], + "url": issue["url"], + "owner_labels": issue["owner_labels"], + "kind_labels": issue["kind_labels"], + "attention": issue["attention"], + "attention_level": issue["attention_level"], + "attention_marker": issue["attention_marker"], + "user_interactions": issue["user_interactions"], + "new_reactions": issue["new_reactions"], + "new_upvotes": issue["new_upvotes"], + "engagement_score": issue["engagement_score"], + "new_comments": issue["activity"]["new_comments"], + "reaction_total": issue["issue_reaction_total"] + + issue["comment_reaction_total"], + } + for issue in ranked[:limit] + if issue["engagement_score"] > 0 + ] + + +def ranked_digest_issues(issues): + return sorted( + issues, + key=lambda issue: ( + issue["attention"], + issue["attention_level"], + issue["user_interactions"], + issue["engagement_score"], + issue["activity"]["new_comments"], + issue["updated_at"], + ), + reverse=True, + ) + + +def digest_rows(issues, limit=10, ref_map=None): + ranked = ranked_digest_issues(issues) + if ref_map is None: + ref_map = {issue["number"]: ref for ref, issue in enumerate(ranked, start=1)} + rows = [] + for issue in ranked[:limit]: + ref = ref_map[issue["number"]] + reaction_total = issue["issue_reaction_total"] + issue["comment_reaction_total"] + rows.append( + { + "ref": ref, + "ref_markdown": f"[{ref}]({issue['url']})", + "marker": issue["attention_marker"], + "attention_marker": issue["attention_marker"], + "number": issue["number"], + "description": issue["description"], + "title": issue["title"], + "url": issue["url"], + "area": ", ".join(issue["owner_labels"]), + "kind": ", ".join(issue["kind_labels"]), + "state": issue["state"], + "interactions": issue["user_interactions"], + "user_interactions": issue["user_interactions"], + "new_reactions": issue["new_reactions"], + "new_upvotes": issue["new_upvotes"], + "current_reactions": reaction_total, + } + ) + return rows + + +def issue_ref_markdown(issue, ref_map): + ref = ref_map[issue["number"]] + return f"[{ref}]({issue['url']})" + + +def summary_inputs(issues, limit=80, ref_map=None): + ranked = ranked_digest_issues(issues) + if ref_map is None: + ref_map = {issue["number"]: ref for ref, issue in enumerate(ranked, start=1)} + rows = [] + for issue in ranked[:limit]: + rows.append( + { + "ref": ref_map[issue["number"]], + "ref_markdown": issue_ref_markdown(issue, ref_map), + "number": issue["number"], + "title": issue["title"], + "description": issue["description"], + "url": issue["url"], + "labels": issue["labels"], + "owner_labels": issue["owner_labels"], + "kind_labels": issue["kind_labels"], + "state": issue.get("state", ""), + "attention_marker": issue.get("attention_marker", ""), + "interactions": issue["user_interactions"], + "new_comments": issue["activity"].get("new_comments", 0), + "new_reactions": issue.get("new_reactions", 0), + "new_upvotes": issue.get("new_upvotes", 0), + "current_reactions": issue.get("issue_reaction_total", 0) + + issue.get("comment_reaction_total", 0), + } + ) + return rows + + +def collect_digest(args): + since, until = resolve_window(args) + window_hours = (until - since).total_seconds() / 
3600 + attention_thresholds = attention_thresholds_for_window(window_hours) + requested_labels, all_labels = normalize_requested_labels( + args.labels, all_labels=args.all_labels + ) + queries = build_search_queries( + args.repo, requested_labels, since, all_labels=all_labels + ) + numbers = search_issue_numbers(queries, args.limit_issues) + gh_version_output = gh_text(["--version"]) + + issues = [] + max_comment_pages = None if args.max_comment_pages <= 0 else args.max_comment_pages + for number in numbers: + issue = fetch_issue(args.repo, number) + comments_since = None if args.fetch_all_comments else since + comments_payload = fetch_comments( + args.repo, + number, + since=comments_since, + max_pages=max_comment_pages, + ) + comments = comments_payload["items"] + issue_reaction_events = fetch_reactions_for_item( + f"repos/{args.repo}/issues/{number}/reactions", issue + ) + comment_reactions_by_id = fetch_comment_reactions(args.repo, comments) + comments_hydration = { + "fetched": len(comments), + "total": int(issue.get("comments") or len(comments) or 0), + "since": format_timestamp(comments_since) if comments_since else None, + "truncated": comments_payload["truncated"], + "max_pages": comments_payload["max_pages"], + "fetch_all_comments": args.fetch_all_comments, + } + summary = summarize_issue( + issue, + comments, + requested_labels, + since, + until, + args.body_chars, + args.comment_chars, + issue_reaction_events=issue_reaction_events, + comment_reactions_by_id=comment_reactions_by_id, + all_labels=all_labels, + comments_hydration=comments_hydration, + attention_thresholds=attention_thresholds, + ) + if summary is not None: + issues.append(summary) + + issues.sort( + key=lambda issue: (issue["updated_at"], int(issue["number"] or 0)), reverse=True + ) + totals = { + "candidate_issues": len(numbers), + "included_issues": len(issues), + "new_issues": sum(1 for issue in issues if issue["activity"]["new_issue"]), + "issues_with_new_comments": sum( + 1 for issue in issues if issue["activity"]["new_comments"] > 0 + ), + "new_comments": sum(issue["activity"]["new_comments"] for issue in issues), + "comments_fetched": sum( + issue["comments_hydration"]["fetched"] for issue in issues + ), + "issues_with_truncated_comment_hydration": sum( + 1 for issue in issues if issue["comments_hydration"]["truncated"] + ), + "updated_without_visible_new_post": sum( + 1 + for issue in issues + if issue["activity"]["updated_without_visible_new_post"] + ), + "issue_reactions_current_total": sum( + issue["issue_reaction_total"] for issue in issues + ), + "comment_reactions_current_total": sum( + issue["comment_reaction_total"] for issue in issues + ), + "new_reactions": sum(issue["new_reactions"] for issue in issues), + "new_upvotes": sum(issue["new_upvotes"] for issue in issues), + "user_interactions": sum(issue["user_interactions"] for issue in issues), + } + ranked = ranked_digest_issues(issues) + ref_map = {issue["number"]: ref for ref, issue in enumerate(ranked, start=1)} + filter_label = "all" if all_labels else requested_labels + + return { + "generated_at": format_timestamp(datetime.now(timezone.utc)), + "source": { + "repo": args.repo, + "skill": "codex-issue-digest", + "collector": skill_relative_path(), + "script_version": SCRIPT_VERSION, + "git_head": git_head(), + "gh_version": gh_version_output.splitlines()[0] + if gh_version_output + else None, + }, + "window": { + "since": format_timestamp(since), + "until": format_timestamp(until), + "hours": round(window_hours, 3), + }, + 
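# Thresholds below are scaled to the requested window by attention_thresholds_for_window.
+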
"attention_thresholds": attention_thresholds, + "filters": { + "owner_labels": filter_label, + "all_labels": all_labels, + "kind_labels": list(QUALIFYING_KIND_LABELS), + }, + "collection_notes": [ + "Issues are selected when they currently have bug or enhancement plus at least one requested owner label and were updated during the window.", + "By default, issue comments are fetched with since=window_start and a max page cap to avoid long historical threads; use --fetch-all-comments when exhaustive comment history is needed.", + "New issue comments are filtered by comment creation time within the window from the fetched comment set.", + "Reaction events are counted by GitHub reaction created_at timestamps for hydrated issues and fetched comments.", + "Current reaction totals are standing engagement signals; new_reactions and new_upvotes are windowed activity.", + "The collector does not assign semantic clusters; use summary_inputs as model-ready evidence for report-time clustering.", + "Pure reaction-only issues may be missed if GitHub issue search does not surface them via updated_at.", + "Issues updated during the window without a new issue body or new comment are retained because label/status edits can still be useful owner signals.", + ], + "totals": totals, + "by_owner_label": count_by_label( + issues, + sorted( + {area for issue in issues for area in issue["owner_labels"]}, + key=str.casefold, + ) + if all_labels + else requested_labels, + ), + "by_kind_label": count_by_kind(issues), + "hot_items": hot_items(issues), + "summary_inputs": summary_inputs(issues, ref_map=ref_map), + "digest_rows": digest_rows(issues, ref_map=ref_map), + "issues": issues, + } + + +def main(): + args = parse_args() + try: + digest = collect_digest(args) + except (GhCommandError, RuntimeError, ValueError) as err: + sys.stderr.write(f"collect_issue_digest.py error: {err}\n") + return 1 + sys.stdout.write(json.dumps(digest, indent=2, sort_keys=True) + "\n") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/.opencode/skills/codex-issue-digest/scripts/test_collect_issue_digest.py b/.opencode/skills/codex-issue-digest/scripts/test_collect_issue_digest.py new file mode 100644 index 000000000000..1c283ea2f694 --- /dev/null +++ b/.opencode/skills/codex-issue-digest/scripts/test_collect_issue_digest.py @@ -0,0 +1,614 @@ +import importlib.util +from datetime import timezone +from pathlib import Path + + +MODULE_PATH = Path(__file__).with_name("collect_issue_digest.py") +MODULE_SPEC = importlib.util.spec_from_file_location( + "collect_issue_digest", MODULE_PATH +) +collect_issue_digest = importlib.util.module_from_spec(MODULE_SPEC) +assert MODULE_SPEC.loader is not None +MODULE_SPEC.loader.exec_module(collect_issue_digest) + + +def test_build_search_queries_uses_each_owner_and_kind_label(): + since = collect_issue_digest.parse_timestamp("2026-04-25T12:34:56Z", "--since") + + queries = collect_issue_digest.build_search_queries( + "openai/codex", ["tui", "exec"], since + ) + + assert queries == [ + "repo:openai/codex is:issue updated:>=2026-04-25 label:tui label:bug", + "repo:openai/codex is:issue updated:>=2026-04-25 label:tui label:enhancement", + "repo:openai/codex is:issue updated:>=2026-04-25 label:exec label:bug", + "repo:openai/codex is:issue updated:>=2026-04-25 label:exec label:enhancement", + ] + + +def test_build_search_queries_can_scan_all_labels(): + since = collect_issue_digest.parse_timestamp("2026-04-25T12:34:56Z", "--since") + + queries = 
collect_issue_digest.build_search_queries( + "openai/codex", [], since, all_labels=True + ) + + assert queries == [ + "repo:openai/codex is:issue updated:>=2026-04-25 label:bug", + "repo:openai/codex is:issue updated:>=2026-04-25 label:enhancement", + ] + + +def test_normalize_requested_labels_accepts_all_area_phrases(): + assert collect_issue_digest.normalize_requested_labels(["all", "areas"]) == ( + [], + True, + ) + assert collect_issue_digest.normalize_requested_labels(["all-labels"]) == ( + [], + True, + ) + + +def test_summarize_issue_keeps_new_comments_and_reaction_signals(): + since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since") + until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until") + issue = { + "number": 123, + "title": "TUI does not redraw", + "html_url": "https://github.com/openai/codex/issues/123", + "state": "open", + "created_at": "2026-04-24T20:00:00Z", + "updated_at": "2026-04-25T10:00:00Z", + "user": {"login": "alice"}, + "author_association": "NONE", + "comments": 2, + "body": "The terminal freezes after resize.", + "labels": [{"name": "bug"}, {"name": "tui"}], + "reactions": {"total_count": 3, "+1": 2, "rocket": 1}, + } + comments = [ + { + "id": 1, + "created_at": "2026-04-25T11:00:00Z", + "updated_at": "2026-04-25T11:00:00Z", + "html_url": "https://github.com/openai/codex/issues/123#issuecomment-1", + "user": {"login": "bob"}, + "author_association": "MEMBER", + "body": "I can reproduce this on main.", + "reactions": {"total_count": 4, "heart": 1, "+1": 3}, + }, + { + "id": 2, + "created_at": "2026-04-24T11:00:00Z", + "updated_at": "2026-04-24T11:00:00Z", + "html_url": "https://github.com/openai/codex/issues/123#issuecomment-2", + "user": {"login": "carol"}, + "author_association": "NONE", + "body": "Older comment.", + "reactions": {"total_count": 1, "eyes": 1}, + }, + ] + + summary = collect_issue_digest.summarize_issue( + issue, + comments, + ["tui", "exec"], + since, + until, + body_chars=200, + comment_chars=200, + ) + + assert summary == { + "number": 123, + "title": "TUI does not redraw", + "description": "TUI does not redraw", + "url": "https://github.com/openai/codex/issues/123", + "state": "open", + "author": "alice", + "author_association": "NONE", + "created_at": "2026-04-24T20:00:00Z", + "updated_at": "2026-04-25T10:00:00Z", + "labels": ["bug", "tui"], + "kind_labels": ["bug"], + "owner_labels": ["tui"], + "comments_total": 2, + "comments_hydration": { + "fetched": 2, + "since": None, + "truncated": False, + "max_pages": None, + }, + "issue_reactions": {"+1": 2, "rocket": 1}, + "issue_reaction_total": 3, + "comment_reaction_total": 5, + "new_comment_reaction_total": 4, + "new_issue_reactions": 0, + "new_issue_upvotes": 0, + "new_comment_reactions": 0, + "new_comment_upvotes": 0, + "new_reactions": 0, + "new_upvotes": 0, + "user_interactions": 1, + "attention": False, + "attention_level": 0, + "attention_marker": "", + "engagement_score": 12, + "activity": { + "new_issue": False, + "new_comments": 1, + "new_human_comments": 1, + "new_reactions": 0, + "new_upvotes": 0, + "updated_without_visible_new_post": False, + }, + "body_excerpt": "The terminal freezes after resize.", + "new_comments": [ + { + "id": 1, + "author": "bob", + "author_association": "MEMBER", + "created_at": "2026-04-25T11:00:00Z", + "updated_at": "2026-04-25T11:00:00Z", + "url": "https://github.com/openai/codex/issues/123#issuecomment-1", + "human_user_interaction": True, + "reactions": {"+1": 3, "heart": 1}, + "reaction_total": 4, + 
"new_reactions": 0, + "new_upvotes": 0, + "new_reaction_counts": {}, + "body_excerpt": "I can reproduce this on main.", + } + ], + } + + +def test_summarize_issue_filters_non_owner_or_non_kind_labels(): + since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since") + until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until") + base_issue = { + "number": 1, + "title": "Question", + "created_at": "2026-04-25T01:00:00Z", + "updated_at": "2026-04-25T01:00:00Z", + "labels": [{"name": "question"}, {"name": "tui"}], + } + + assert ( + collect_issue_digest.summarize_issue( + base_issue, + [], + ["tui"], + since, + until, + body_chars=100, + comment_chars=100, + ) + is None + ) + + issue_without_owner = dict(base_issue) + issue_without_owner["labels"] = [{"name": "bug"}, {"name": "app"}] + + assert ( + collect_issue_digest.summarize_issue( + issue_without_owner, + [], + ["tui"], + since, + until, + body_chars=100, + comment_chars=100, + ) + is None + ) + + +def test_resolve_window_defaults_to_previous_hours(): + class Args: + since = None + until = "2026-04-26T12:00:00Z" + window_hours = 24 + + since, until = collect_issue_digest.resolve_window(Args()) + + assert since.isoformat() == "2026-04-25T12:00:00+00:00" + assert until.tzinfo == timezone.utc + + +def test_parse_duration_hours_accepts_common_phrases(): + assert collect_issue_digest.parse_duration_hours("past week") == 168 + assert collect_issue_digest.parse_duration_hours("48h") == 48 + assert collect_issue_digest.parse_duration_hours("2 days") == 48 + assert collect_issue_digest.parse_duration_hours("1w") == 168 + + +def test_attention_thresholds_scale_by_window_length(): + one_day = collect_issue_digest.attention_thresholds_for_window(24) + assert one_day["elevated"] == 10 + assert one_day["very_high"] == 20 + + half_day = collect_issue_digest.attention_thresholds_for_window(12) + assert half_day["elevated"] == 5 + assert half_day["very_high"] == 10 + + week = collect_issue_digest.attention_thresholds_for_window(168) + assert week["elevated"] == 70 + assert week["very_high"] == 140 + assert collect_issue_digest.attention_marker_for(69, week) == "" + assert collect_issue_digest.attention_marker_for(107, week) == "πŸ”₯" + assert collect_issue_digest.attention_marker_for(140, week) == "πŸ”₯πŸ”₯" + + +def test_fetch_comments_uses_since_filter_and_page_cap(monkeypatch): + calls = [] + + def fake_gh_json(args): + calls.append(args) + return [{"id": idx} for idx in range(100)] + + monkeypatch.setattr(collect_issue_digest, "gh_json", fake_gh_json) + since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since") + + payload = collect_issue_digest.fetch_comments( + "openai/codex", 123, since=since, max_pages=1 + ) + + assert len(payload["items"]) == 100 + assert payload["truncated"] is True + assert payload["max_pages"] == 1 + assert calls == [ + [ + "api", + "repos/openai/codex/issues/123/comments?since=2026-04-25T00%3A00%3A00Z&per_page=100&page=1", + ] + ] + + +def test_issue_description_prefers_title_over_body_noise(): + issue = { + "title": "Codex.app GUI: MCP child processes not reaped after task completion", + "body": "A later crash mention should not override the title-level symptom.", + "labels": [{"name": "app"}, {"name": "bug"}], + } + + description = collect_issue_digest.issue_description(issue) + assert "MCP child processes" in description + assert "crash" not in description.casefold() + + +def test_attention_markers_count_human_user_interactions(): + since = 
collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since") + until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until") + issue = { + "number": 456, + "title": "Agent context is exploding", + "html_url": "https://github.com/openai/codex/issues/456", + "state": "open", + "created_at": "2026-04-25T01:00:00Z", + "updated_at": "2026-04-25T12:00:00Z", + "user": {"login": "alice"}, + "labels": [{"name": "bug"}, {"name": "agent"}], + } + comments = [ + { + "id": idx, + "created_at": "2026-04-25T02:00:00Z", + "updated_at": "2026-04-25T02:00:00Z", + "user": {"login": f"user-{idx}"}, + "body": "same here", + } + for idx in range(9) + ] + comments.append( + { + "id": 99, + "created_at": "2026-04-25T02:00:00Z", + "updated_at": "2026-04-25T02:00:00Z", + "user": {"login": "github-actions[bot]"}, + "body": "duplicate bot note", + } + ) + + summary = collect_issue_digest.summarize_issue( + issue, + comments, + ["agent"], + since, + until, + body_chars=100, + comment_chars=100, + ) + + assert summary["user_interactions"] == 10 + assert summary["activity"]["new_human_comments"] == 9 + assert summary["attention"] is True + assert summary["attention_level"] == 1 + assert summary["attention_marker"] == "πŸ”₯" + + issue["created_at"] = "2026-04-24T01:00:00Z" + comments.extend( + { + "id": idx, + "created_at": "2026-04-25T03:00:00Z", + "updated_at": "2026-04-25T03:00:00Z", + "user": {"login": f"extra-user-{idx}"}, + "body": "also seeing this", + } + for idx in range(11) + ) + + summary = collect_issue_digest.summarize_issue( + issue, + comments, + ["agent"], + since, + until, + body_chars=100, + comment_chars=100, + ) + + assert summary["user_interactions"] == 20 + assert summary["attention_level"] == 2 + assert summary["attention_marker"] == "πŸ”₯πŸ”₯" + + +def test_reactions_count_toward_attention_markers(): + since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since") + until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until") + issue = { + "number": 789, + "title": "Support 1M token context", + "html_url": "https://github.com/openai/codex/issues/789", + "state": "open", + "created_at": "2026-04-24T01:00:00Z", + "updated_at": "2026-04-25T12:00:00Z", + "user": {"login": "alice"}, + "labels": [{"name": "enhancement"}, {"name": "context"}], + "reactions": {"total_count": 20, "+1": 20}, + } + comments = [ + { + "id": 1, + "created_at": "2026-04-25T02:00:00Z", + "updated_at": "2026-04-25T02:00:00Z", + "user": {"login": "commenter"}, + "body": "please", + "reactions": {"total_count": 2, "+1": 2}, + } + ] + issue_reactions = [ + { + "content": "+1", + "created_at": "2026-04-25T03:00:00Z", + "user": {"login": f"reactor-{idx}"}, + } + for idx in range(18) + ] + comment_reactions_by_id = { + 1: [ + { + "content": "heart", + "created_at": "2026-04-25T04:00:00Z", + "user": {"login": "human-reactor"}, + }, + { + "content": "+1", + "created_at": "2026-04-25T04:00:00Z", + "user": {"login": "github-actions[bot]"}, + }, + ] + } + + summary = collect_issue_digest.summarize_issue( + issue, + comments, + ["context"], + since, + until, + body_chars=100, + comment_chars=100, + issue_reaction_events=issue_reactions, + comment_reactions_by_id=comment_reactions_by_id, + ) + + assert summary["new_reactions"] == 19 + assert summary["new_upvotes"] == 18 + assert summary["user_interactions"] == 20 + assert summary["attention_level"] == 2 + assert summary["attention_marker"] == "πŸ”₯πŸ”₯" + assert summary["new_comments"][0]["new_reactions"] == 1 + assert 
summary["new_comments"][0]["new_upvotes"] == 0 + + +def test_digest_rows_are_table_ready_with_concise_descriptions(): + rows = collect_issue_digest.digest_rows( + [ + { + "number": 1, + "title": "Quiet bug", + "description": "Quiet bug", + "url": "https://github.com/openai/codex/issues/1", + "owner_labels": ["context"], + "kind_labels": ["bug"], + "state": "open", + "attention": False, + "attention_level": 0, + "attention_marker": "", + "user_interactions": 1, + "new_reactions": 0, + "new_upvotes": 0, + "engagement_score": 3, + "issue_reaction_total": 0, + "comment_reaction_total": 0, + "updated_at": "2026-04-25T01:00:00Z", + "activity": { + "new_issue": True, + "new_comments": 0, + "new_reactions": 0, + "updated_without_visible_new_post": False, + }, + }, + { + "number": 2, + "title": "Busy bug", + "description": "High-volume bug report", + "url": "https://github.com/openai/codex/issues/2", + "owner_labels": ["agent"], + "kind_labels": ["bug"], + "state": "open", + "attention": True, + "attention_level": 1, + "attention_marker": "πŸ”₯", + "user_interactions": 17, + "new_reactions": 3, + "new_upvotes": 2, + "engagement_score": 20, + "issue_reaction_total": 5, + "comment_reaction_total": 2, + "updated_at": "2026-04-25T02:00:00Z", + "activity": { + "new_issue": False, + "new_comments": 16, + "new_reactions": 3, + "updated_without_visible_new_post": False, + }, + }, + ] + ) + + assert rows[0] == { + "ref": 1, + "ref_markdown": "[1](https://github.com/openai/codex/issues/2)", + "marker": "πŸ”₯", + "attention_marker": "πŸ”₯", + "number": 2, + "description": "High-volume bug report", + "title": "Busy bug", + "url": "https://github.com/openai/codex/issues/2", + "area": "agent", + "kind": "bug", + "state": "open", + "interactions": 17, + "user_interactions": 17, + "new_reactions": 3, + "new_upvotes": 2, + "current_reactions": 7, + } + + +def test_summary_inputs_are_model_ready_without_preclustering(): + issues = [ + { + "number": 20, + "title": "Windows app Browser Use external navigation fails", + "description": "Browser Use navigation or app-server failure", + "url": "https://github.com/openai/codex/issues/20", + "labels": ["app", "bug"], + "owner_labels": ["app"], + "kind_labels": ["bug"], + "attention": False, + "attention_level": 0, + "attention_marker": "", + "user_interactions": 3, + "new_reactions": 1, + "engagement_score": 8, + "updated_at": "2026-04-25T04:00:00Z", + "activity": {"new_comments": 2}, + }, + { + "number": 21, + "title": "On Windows, cmake output waits until timeout", + "description": "Windows command timeout/capture problem", + "url": "https://github.com/openai/codex/issues/21", + "labels": ["app", "bug"], + "owner_labels": ["app"], + "kind_labels": ["bug"], + "attention": False, + "attention_level": 0, + "attention_marker": "", + "user_interactions": 3, + "new_reactions": 0, + "engagement_score": 7, + "updated_at": "2026-04-25T03:00:00Z", + "activity": {"new_comments": 3}, + }, + { + "number": 22, + "title": "Windows computer use tool fails to click buttons", + "description": "Computer-use workflow failure", + "url": "https://github.com/openai/codex/issues/22", + "labels": ["app", "bug"], + "owner_labels": ["app"], + "kind_labels": ["bug"], + "attention": False, + "attention_level": 0, + "attention_marker": "", + "user_interactions": 3, + "new_reactions": 0, + "engagement_score": 6, + "updated_at": "2026-04-25T02:00:00Z", + "activity": {"new_comments": 3}, + }, + ] + + rows = collect_issue_digest.summary_inputs(issues, ref_map={20: 1, 21: 2, 22: 3}) + + assert rows == [ + 
        {
+            "ref": 1,
+            "ref_markdown": "[1](https://github.com/openai/codex/issues/20)",
+            "number": 20,
+            "title": "Windows app Browser Use external navigation fails",
+            "description": "Browser Use navigation or app-server failure",
+            "url": "https://github.com/openai/codex/issues/20",
+            "labels": ["app", "bug"],
+            "owner_labels": ["app"],
+            "kind_labels": ["bug"],
+            "state": "",
+            "attention_marker": "",
+            "interactions": 3,
+            "new_comments": 2,
+            "new_reactions": 1,
+            "new_upvotes": 0,
+            "current_reactions": 0,
+        },
+        {
+            "ref": 2,
+            "ref_markdown": "[2](https://github.com/openai/codex/issues/21)",
+            "number": 21,
+            "title": "On Windows, cmake output waits until timeout",
+            "description": "Windows command timeout/capture problem",
+            "url": "https://github.com/openai/codex/issues/21",
+            "labels": ["app", "bug"],
+            "owner_labels": ["app"],
+            "kind_labels": ["bug"],
+            "state": "",
+            "attention_marker": "",
+            "interactions": 3,
+            "new_comments": 3,
+            "new_reactions": 0,
+            "new_upvotes": 0,
+            "current_reactions": 0,
+        },
+        {
+            "ref": 3,
+            "ref_markdown": "[3](https://github.com/openai/codex/issues/22)",
+            "number": 22,
+            "title": "Windows computer use tool fails to click buttons",
+            "description": "Computer-use workflow failure",
+            "url": "https://github.com/openai/codex/issues/22",
+            "labels": ["app", "bug"],
+            "owner_labels": ["app"],
+            "kind_labels": ["bug"],
+            "state": "",
+            "attention_marker": "",
+            "interactions": 3,
+            "new_comments": 3,
+            "new_reactions": 0,
+            "new_upvotes": 0,
+            "current_reactions": 0,
+        },
+    ]
diff --git a/.opencode/skills/codex-pr-body/SKILL.md b/.opencode/skills/codex-pr-body/SKILL.md
new file mode 100644
index 000000000000..76b37b875076
--- /dev/null
+++ b/.opencode/skills/codex-pr-body/SKILL.md
@@ -0,0 +1,59 @@
+---
+name: codex-pr-body
+description: Update the title and body of one or more pull requests.
+---
+
+## Determining the PR(s)
+
+When this skill is invoked, the PR(s) to update may be specified explicitly, but in the common case they will be inferred from the branch / commit that the user is currently working on. For ordinary Git usage (i.e., not Sapling as discussed below), you may have to use a combination of `git branch` and `gh pr view --repo openai/codex --json number --jq '.number'` to determine the PR associated with the current branch / commit.
+
+## PR Body Contents
+
+When invoked, use `gh` to edit the pull request body and title to reflect the contents of the specified PR. Make sure to check the existing pull request body to see if there is key information that should be preserved. For example, NEVER remove an image in the existing pull request body, as the author may have no way to recover it if you remove it.
+
+It is critically important to explain _why_ the change is being made. If the current conversation in which this skill is invoked has discussed the motivation, be sure to capture this in the pull request body.
+
+The body should also explain _what_ changed, but this should appear after the _why_.
+
+Limit discussion to the _net change_ of the commit. It is generally frowned upon to discuss changes that were attempted but later undone in the course of developing the pull request. When rewriting the pull request body, you may need to eliminate such details when they are no longer appropriate / of interest to future readers.
+
+Avoid references to absolute paths on the local disk. When talking about a path that is within the repository, simply use the repo-relative path.
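+
+As a rough sketch (the section headings here are illustrative, not prescribed), a body following these guidelines might be shaped like:
+
+```markdown
+## Why
+Explain the motivation for the change, drawing on the conversation that led to it.
+
+## What changed
+Summarize the net change, using repo-relative paths and `inline code` for identifiers.
+```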
+
+It is generally helpful to discuss how the change was verified. That said, it is unnecessary to mention things that CI checks automatically, e.g., do not include "ran `just fmt`" as part of the test plan. Identifying the new tests that were purposely introduced to verify the pull request's new behavior, however, is often appropriate.
+
+Make use of Markdown to format the pull request professionally. Ensure "code things" appear in single backticks when referenced inline. Fenced code blocks are useful when referencing code or showing a shell transcript. Also, make use of GitHub permalinks when citing existing pieces of code that are relevant to the change.
+
+Make sure to reference any relevant pull requests or issues, though there should be no need to reference the pull request in its own PR body.
+
+If there is documentation that should be updated on https://developers.openai.com/codex as a result of this change, please note that in a separate section near the end of the pull request. Omit this section if there is no documentation that needs to be updated.
+
+## Working with Stacks
+
+Sometimes a pull request is composed of a stack of commits that build on one another. In these cases, the PR body should reflect the _net_ change introduced by the stack as a whole, rather than the individual commits that make up the stack.
+
+Similarly, a user may be using a tool like Sapling to leverage _stacked pull requests_, in which case the `base` of the PR may be a branch that is the `head` of another PR in the stack rather than `main`. In this case, be sure to discuss only the net change between the `base` and `head` of the PR that is opened against that stacked base, rather than the changes relative to `main`.
+
+## Sapling
+
+If `.git/sl/store` is present, then this Git repository is governed by Sapling SCM (https://sapling-scm.com).
+
+In Sapling, run the following to see if there is a GitHub pull request associated with the current revision:
+
+```shell
+sl log --template '{github_pull_request_url}' -r .
+```
+
+Alternatively, you can run `sl sl` to see the current development branch and whether there is a GitHub pull request associated with the current commit. For example, if the output were:
+
+```
+ @ cb032b31cf 72 minutes ago mbolin #11412
+╭─╯ tui: show non-file layer content in /debug-config
+β”‚
+o fdd0cd1de9 Today at 20:09 origin/main
+β”‚
+~
+```
+
+- `@` indicates the current commit is `cb032b31cf`
+- it is a development branch containing a single commit branched off of `origin/main`
+- it is associated with GitHub pull request #11412
diff --git a/.opencode/skills/remote-tests/SKILL.md b/.opencode/skills/remote-tests/SKILL.md
new file mode 100644
index 000000000000..ee35fc2b2180
--- /dev/null
+++ b/.opencode/skills/remote-tests/SKILL.md
@@ -0,0 +1,16 @@
+---
+name: remote-tests
+description: How to run tests using a remote executor.
+---
+
+Some codex integration tests support running against a remote executor.
+This means that when the `CODEX_TEST_REMOTE_ENV` environment variable is set, they will attempt to start an executor process in the Docker container that `CODEX_TEST_REMOTE_ENV` points to and use it in the tests.
+
+The Docker container is built and initialized via `./scripts/test-remote-env.sh`.
+
+Running remote tests is currently only supported on Linux, so you need to use a devbox to run them.
+
+You can list devboxes via `applied_devbox ls`; pick the one with `codex` in the name.
+Connect to the devbox via `ssh `.
+Reuse the same checkout of codex in `~/code/codex`. Reset files if needed. Multiple checkouts take longer to build and take up more space.
+Check whether the SHA and modified files are in sync between remote and local.
diff --git a/.opencode/skills/test-tui/SKILL.md b/.opencode/skills/test-tui/SKILL.md
new file mode 100644
index 000000000000..e58e67730efe
--- /dev/null
+++ b/.opencode/skills/test-tui/SKILL.md
@@ -0,0 +1,14 @@
+---
+name: test-tui
+description: Guide for testing the Codex TUI interactively
+---
+
+You can start and use the Codex TUI to verify changes.
+
+Important notes:
+
+- Start it interactively.
+- Always set `RUST_LOG="trace"` when starting the process.
+- Pass the `-c log_dir=` argument to have logs written to a specific directory to help with debugging.
+- When sending a test message programmatically, send the text first, then send Enter in a separate write (do not send text + Enter in one burst).
+- Use the `just codex` target to run it, e.g. `just codex -c ...`
diff --git a/packages/opencode/src/cli/cmd/tui/context/theme.tsx b/packages/opencode/src/cli/cmd/tui/context/theme.tsx
index ce9ade6a1a96..fb80bfed4f20 100644
--- a/packages/opencode/src/cli/cmd/tui/context/theme.tsx
+++ b/packages/opencode/src/cli/cmd/tui/context/theme.tsx
@@ -243,6 +243,16 @@ export function resolveTheme(theme: ThemeJson, mode: "dark" | "light") {
     resolved.backgroundMenu = resolved.backgroundElement
   }
 
+  // Keep prose-like markdown readable across themes by using the base text color.
+  resolved.markdownText = resolved.text
+  resolved.markdownHeading = resolved.text
+  resolved.markdownBlockQuote = resolved.text
+  resolved.markdownEmph = resolved.text
+  resolved.markdownStrong = resolved.text
+  resolved.markdownListItem = resolved.text
+  resolved.markdownListEnumeration = resolved.text
+  resolved.markdownImageText = resolved.text
+
   // Handle thinkingOpacity - optional with default of 0.6
   const thinkingOpacity = theme.theme.thinkingOpacity ?? 0.6
diff --git a/packages/opencode/src/cli/cmd/tui/feature-plugins/home/footer.tsx b/packages/opencode/src/cli/cmd/tui/feature-plugins/home/footer.tsx
index 7f2ef55e9b0e..a04c8e49499a 100644
--- a/packages/opencode/src/cli/cmd/tui/feature-plugins/home/footer.tsx
+++ b/packages/opencode/src/cli/cmd/tui/feature-plugins/home/footer.tsx
@@ -1,5 +1,5 @@
 import type { TuiPlugin, TuiPluginApi, TuiPluginModule } from "@opencode-ai/plugin/tui"
-import { createMemo, Match, Show, Switch } from "solid-js"
+import { createMemo, Show } from "solid-js"
 import { Global } from "@opencode-ai/core/global"
 
 const id = "internal:home-footer"
@@ -21,23 +21,12 @@ function Mcp(props: { api: TuiPluginApi }) {
   const theme = () => props.api.theme.current
   const list = createMemo(() => props.api.state.mcp())
   const has = createMemo(() => list().length > 0)
-  const err = createMemo(() => list().some((item) => item.status === "failed"))
   const count = createMemo(() => list().filter((item) => item.status === "connected").length)
   return (
- - - βŠ™ - - - 0 ? 
theme().success : theme().textMuted }}>βŠ™ - - - {count()} MCP - + {count()} MCP /status diff --git a/packages/opencode/src/cli/cmd/tui/routes/session/footer.tsx b/packages/opencode/src/cli/cmd/tui/routes/session/footer.tsx index c3a96254e98b..515d56233bcf 100644 --- a/packages/opencode/src/cli/cmd/tui/routes/session/footer.tsx +++ b/packages/opencode/src/cli/cmd/tui/routes/session/footer.tsx @@ -1,4 +1,4 @@ -import { createMemo, Match, onCleanup, onMount, Show, Switch } from "solid-js" +import { createMemo, onCleanup, onMount, Show, Switch } from "solid-js" import { useTheme } from "../../context/theme" import { useSync } from "../../context/sync" import { useDirectory } from "../../context/directory" @@ -70,17 +70,7 @@ export function Footer() { 0 ? theme.success : theme.textMuted }}>β€’ {lsp().length} LSP - - - - βŠ™ - - - βŠ™ - - - {mcp()} MCP - + {mcp()} MCP /status diff --git a/packages/opencode/test/cli/tui/theme-store.test.ts b/packages/opencode/test/cli/tui/theme-store.test.ts index 9ebfc4320ed5..c716fe8fe3c4 100644 --- a/packages/opencode/test/cli/tui/theme-store.test.ts +++ b/packages/opencode/test/cli/tui/theme-store.test.ts @@ -49,3 +49,18 @@ test("resolveTheme rejects circular color refs", () => { expect(() => resolveTheme(item, "dark")).toThrow("Circular color reference") }) + +test("resolveTheme keeps markdown prose on base text color", () => { + const resolved = resolveTheme(DEFAULT_THEMES.matrix, "dark") + + expect(resolved.markdownText).toBe(resolved.text) + expect(resolved.markdownHeading).toBe(resolved.text) + expect(resolved.markdownBlockQuote).toBe(resolved.text) + expect(resolved.markdownEmph).toBe(resolved.text) + expect(resolved.markdownStrong).toBe(resolved.text) + expect(resolved.markdownListItem).toBe(resolved.text) + expect(resolved.markdownListEnumeration).toBe(resolved.text) + expect(resolved.markdownImageText).toBe(resolved.text) + expect(resolved.markdownCode).not.toBe(resolved.text) + expect(resolved.markdownLink).not.toBe(resolved.text) +}) diff --git a/packages/ui/src/context/marked.tsx b/packages/ui/src/context/marked.tsx index 46f4993babde..c5ef800a1305 100644 --- a/packages/ui/src/context/marked.tsx +++ b/packages/ui/src/context/marked.tsx @@ -239,20 +239,20 @@ registerCustomTheme("OpenCode", () => { { scope: "punctuation.definition.list.begin.markdown", settings: { - foreground: "var(--syntax-punctuation)", + foreground: "var(--text-strong)", }, }, { scope: ["markup.heading", "markup.heading entity.name"], settings: { fontStyle: "bold", - foreground: "var(--syntax-info)", + foreground: "var(--text-strong)", }, }, { scope: "markup.quote", settings: { - foreground: "var(--syntax-info)", + foreground: "var(--text-strong)", }, }, {