diff --git a/CHANGELOG.md b/CHANGELOG.md index b6ccdf2..e2327fa 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,11 @@ # Changelog +## 0.70.0 - 2026-06-01 + +- Add Claude Code settings, agentic workflow, and GitHub Actions hardening checks to `contextforge doctor`. +- Expand proof-pack and scorecard evidence commands so first-readiness reports point directly at `claude-audit`, `workflow-audit`, and `actions-audit` SARIF/Markdown proof. +- Refresh README, doctor/proof-pack/scorecard docs, LLM discovery files, and research notes around one-command hardening evidence for Codex and Claude handoffs. + ## 0.69.0 - 2026-06-01 - Add `contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif` for GitHub Actions hardening proof. diff --git a/README.md b/README.md index 8517a90..bb60106 100644 --- a/README.md +++ b/README.md @@ -22,17 +22,23 @@ safely?** ## 30-second proof ```bash +contextforge doctor --summary contextforge-doctor.md +contextforge scorecard --output contextforge-scorecard.md contextforge audit --min-context-score 70 --min-cache-score 70 --min-security-score 80 contextforge surface-map --output contextforge-agent-surface-map.md contextforge surface-inventory --output contextforge-agent-surface-inventory.md contextforge surface-diff --base main --output contextforge-agent-surface-diff.md ``` -The audit gates context health, cache stability, and prompt/context poisoning. -The surface map shows exactly which agent-facing files are covered before a -maintainer has to read every doc. The inventory shows the agent-readable files -that are actually present in the current repository. The diff shows which -agent-readable files changed in a PR before reviewers trust the new context. 
+The doctor report now gives one first-readiness answer across context health, +security benchmark fixtures, MCP exposure, Claude Code settings, agentic +workflows, GitHub Actions hardening, public proof, launch assets, and community +health. The scorecard is the one-screen README/PR view. The audit gates context +health, cache stability, and prompt/context poisoning. The surface map shows +exactly which agent-facing files are covered before a maintainer has to read +every doc. The inventory shows the agent-readable files that are actually +present in the current repository. The diff shows which agent-readable files +changed in a PR before reviewers trust the new context. | Agent stack | Surfaces ContextForge checks | | --- | --- | @@ -353,7 +359,7 @@ contextforge pack --task "review auth regression" --budget 20000 --sessions --ou Or use the GitHub Action before npm publishing is complete: ```yaml -- uses: grnbtqdbyx-create/contextforge@v0.69.0 +- uses: grnbtqdbyx-create/contextforge@v0.70.0 with: min-context-score: 60 min-cache-score: 60 @@ -492,7 +498,7 @@ contextforge cost-estimate [--demo] [--json] [--summary contextforge-cost-estima contextforge review-kit [--demo] [--base main] [--output contextforge-review-kit.md] contextforge artifact-map [--output docs/artifacts.md] contextforge publish-readiness [--json] [--summary contextforge-publish-readiness.md] -contextforge init [--all] [--github-action] [--pr-comment-workflow] [--agents-md] [--claude-md] [--copilot-instructions] [--project-name "My App"] [--action-ref grnbtqdbyx-create/contextforge@v0.69.0] [--force] +contextforge init [--all] [--github-action] [--pr-comment-workflow] [--agents-md] [--claude-md] [--copilot-instructions] [--project-name "My App"] [--action-ref grnbtqdbyx-create/contextforge@v0.70.0] [--force] ``` Local session scans are bounded by default. Use `--max-session-files` and @@ -577,12 +583,13 @@ See [docs/research/adjacent-tools.md](docs/research/adjacent-tools.md). 
## Current Status -ContextForge v0.69.0 is a public MVP CLI with: +ContextForge v0.70.0 is a public MVP CLI with: - Claude Code and Codex JSONL fixture scanners - bounded local session scanning fallbacks - first-run `contextforge doctor` readiness report with JSON output - shareable `contextforge doctor --summary` Markdown reports +- doctor, proof-pack, and scorecard hardening checks for Claude settings, agentic workflows, and GitHub Actions release safety - shareable `contextforge proof-pack` readiness packets for launch, PR, and OSS evidence - generated `contextforge adoption-brief` evaluator pages for first-time maintainers - one-screen `contextforge scorecard` readiness snapshots for README, PR, and CI artifact readers @@ -720,6 +727,7 @@ ContextForge v0.69.0 is a public MVP CLI with: - **v0.67.0:** agentic workflow audits catch untrusted GitHub event text flowing into privileged AI workflows. - **v0.68.0:** workflow audits expand attacker-controlled coverage to titles and branch/ref text. - **v0.69.0:** GitHub Actions audits catch mutable action refs, pwn-request checkout, missing permissions, and direct script interpolation. +- **v0.70.0:** doctor, proof-pack, and scorecard reports surface Claude settings, agentic workflow, and GitHub Actions hardening evidence in one readiness path. - **Next:** first approved npm publish and external launch outreach. Release preparation lives in [docs/release-checklist.md](docs/release-checklist.md). 
diff --git a/contextforge-publish-readiness.md b/contextforge-publish-readiness.md index a32e4a3..9a8f7d4 100644 --- a/contextforge-publish-readiness.md +++ b/contextforge-publish-readiness.md @@ -2,11 +2,11 @@ Status: **warn** -Package: `contextforge@0.69.0` +Package: `contextforge@0.70.0` | Check | Status | Detail | | --- | --- | --- | -| Package metadata | pass | contextforge@0.69.0 is public-package ready with bin dist/cli.js | +| Package metadata | pass | contextforge@0.70.0 is public-package ready with bin dist/cli.js | | Package provenance metadata | pass | repository, homepage, and issue tracker point at grnbtqdbyx-create/contextforge for npm provenance readers | | Trusted publishing workflow | pass | npm Trusted Publishing uses GitHub OIDC, manual dispatch, dry-run default, and environment approval | | Release artifact attestation | pass | GitHub artifact attestation covers the packed npm tarball before the same tarball is published | diff --git a/contextforge-scorecard.md b/contextforge-scorecard.md index f853565..1aabb54 100644 --- a/contextforge-scorecard.md +++ b/contextforge-scorecard.md @@ -19,6 +19,9 @@ A one-screen snapshot for maintainers, reviewers, and coding agents deciding whe | Cache stability | pass | 100/100 with no local sessions scanned | | Context security | pass | 100/100 from repo instruction files | | Security benchmark | pass | 4/4 benchmark cases passing | +| Claude Code settings | pass | 100/100 with no Claude Code settings found | +| Agentic workflows | pass | 100/100 across .github/workflows/ci.yml, .github/workflows/contextforge-audit.yml, .github/workflows/npm-publish.yml | +| GitHub Actions hardening | pass | 100/100 across .github/workflows/ci.yml, .github/workflows/contextforge-audit.yml, .github/workflows/npm-publish.yml | | GitHub workflows | pass | ci.yml, contextforge-audit.yml present | | Public proof surfaces | pass | README.md, LICENSE, CONTRIBUTING.md, CHANGELOG.md, llms.txt, llms-full.txt, examples/demo-output.md, 
examples/pr-comment.md, examples/review-kit.md present | | Launch profile surfaces | pass | demo-terminal.svg, contextforge-report.png, docs/launch-post.md, docs/comparison.md, docs/artifacts.md present | @@ -42,6 +45,9 @@ contextforge scorecard --output contextforge-scorecard.md contextforge proof-pack --output contextforge-proof-pack.md contextforge review-kit --base main --output contextforge-review-kit.md contextforge mcp-audit --summary contextforge-mcp-audit.md +contextforge claude-audit --summary contextforge-claude-audit.md --sarif contextforge-claude.sarif +contextforge workflow-audit --summary contextforge-workflow-audit.md --sarif contextforge-workflow.sarif +contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif contextforge artifact-map --output contextforge-artifact-map.md ``` diff --git a/docs/architecture.md b/docs/architecture.md index 42bcbff..16af1b9 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -7,7 +7,7 @@ ContextForge is split into small modules: - `pack`: create task-specific context packs under a token budget. - `improve`: turn audit findings into repo-rule suggestions. - `report`: write local HTML reports. -- `doctor`: compose first-run readiness checks across audits, benchmark fixtures, and GitHub workflow presence. +- `doctor`: compose first-run readiness checks across audits, benchmark fixtures, MCP exposure, Claude settings, agentic workflows, GitHub Actions hardening, and GitHub workflow presence. - `security`: ignore risky paths and redact common secrets. The CLI composes these modules without network calls by default. 
diff --git a/docs/doctor.md b/docs/doctor.md index 3690cbb..bb5f757 100644 --- a/docs/doctor.md +++ b/docs/doctor.md @@ -17,6 +17,12 @@ The report checks: - context security score - public security benchmark status - MCP exposure status for committed MCP server configs +- Claude Code settings status for shared permissions, hooks, HTTP allowlists, + and sensitive-file deny rules +- agentic workflow status for untrusted GitHub event text reaching model-backed + jobs +- GitHub Actions hardening status for SHA pins, token permissions, + `pull_request_target`, pwn-request checkout, and direct script interpolation - GitHub workflow presence for CI and ContextForge audit artifacts - public proof surfaces: README, license, contribution guide, changelog, demo output, PR comment preview, review-kit preview, and LLM discovery files @@ -60,6 +66,12 @@ shell installers, unpinned package launches, auto-approval, broad tool permissions, and symlinked config files so maintainers can review tool access as part of the same first-run readiness report. +For Codex/Claude handoffs, the Claude settings, agentic workflow, and GitHub +Actions hardening checks keep the first-run answer honest: a repo can have +great README proof and still be unsafe if shared agent settings weaken +permissions or CI lets untrusted PR text reach privileged model or release +steps. + The Markdown summary keeps the first-run proof portable. It uses the same doctor result as terminal and JSON output, so maintainers can publish a report without hand-copying or reinterpreting the readiness checks. 
diff --git a/docs/github-action.md b/docs/github-action.md index 154f822..68a1462 100644 --- a/docs/github-action.md +++ b/docs/github-action.md @@ -36,7 +36,7 @@ refuses to overwrite existing files by default: ```bash contextforge init --github-action --force -contextforge init --github-action --action-ref grnbtqdbyx-create/contextforge@v0.69.0 +contextforge init --github-action --action-ref grnbtqdbyx-create/contextforge@v0.70.0 ``` `contextforge init --pr-comment-workflow` writes a separate @@ -71,7 +71,7 @@ jobs: - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5 with: fetch-depth: 0 - - uses: grnbtqdbyx-create/contextforge@v0.69.0 + - uses: grnbtqdbyx-create/contextforge@v0.70.0 with: min-context-score: 60 min-cache-score: 60 diff --git a/docs/proof-pack.md b/docs/proof-pack.md index d55cc94..5b7f4ec 100644 --- a/docs/proof-pack.md +++ b/docs/proof-pack.md @@ -35,5 +35,7 @@ The proof pack includes: - doctor status and every doctor check - audit status, context health, cache stability, context security, and cache hit ratio - top next actions from doctor and audit evidence -- commands to rerun doctor, audit, security benchmark, and context pack creation +- commands to rerun doctor, audit, security benchmark, Claude settings audit, + agentic workflow audit, GitHub Actions hardening audit, and context pack + creation - a short Codex/Claude handoff paragraph for the next agent session diff --git a/docs/research/adjacent-tools.md b/docs/research/adjacent-tools.md index 0bb87f0..3f0f78d 100644 --- a/docs/research/adjacent-tools.md +++ b/docs/research/adjacent-tools.md @@ -570,3 +570,9 @@ permissions, `pull_request_target`, pwn-request checkout, and direct shell interpolation of untrusted GitHub contexts. ContextForge dogfoods the feature by pinning its own workflows to full action SHAs and uploading the new Actions SARIF beside MCP, Claude settings, and agentic workflow alerts. 
+ContextForge v0.70.0 folds Claude settings, agentic workflow, and GitHub +Actions hardening checks into `contextforge doctor`, then points proof-pack and +scorecard readers at the matching Markdown/SARIF rerun commands. The product +reason is simple: a first-time maintainer, Codex session, or Claude session +should not need to remember every specialized audit command before it can tell +whether the repository is safe enough for agent-assisted work. diff --git a/docs/scorecard.md b/docs/scorecard.md index a13e993..5167c47 100644 --- a/docs/scorecard.md +++ b/docs/scorecard.md @@ -18,15 +18,20 @@ reader needs a fast answer to one question: It combines: - agent readiness score from context health, cache stability, and context security -- doctor checks for public proof, launch profile, community health, MCP exposure, and workflows +- doctor checks for public proof, launch profile, community health, MCP + exposure, Claude settings, agentic workflows, GitHub Actions hardening, and + workflow presence - next best actions -- links to the deeper proof pack, review kit, surface diff, artifact map, and action plan +- links to the deeper proof pack, review kit, hardening audits, surface diff, + artifact map, and action plan ## CI Artifact The reusable GitHub Action, generated audit workflow, and ContextForge dogfood -workflow upload `contextforge-scorecard.md` next to the MCP audit, proof pack, -review kit, surface diff, artifact map, SARIF, summary, badge, and JSON report. +workflow upload `contextforge-scorecard.md` next to the MCP audit, Claude +settings audit, agentic workflow audit, GitHub Actions hardening audit, proof +pack, review kit, surface diff, artifact map, SARIF, summary, badge, and JSON +report. Use `contextforge-scorecard.md` as the first artifact to open. 
If it is clean, open `contextforge-proof-pack.md` for evidence and `contextforge-review-kit.md` diff --git a/llms-full.txt b/llms-full.txt index 0c5e69a..859cd1a 100644 --- a/llms-full.txt +++ b/llms-full.txt @@ -134,9 +134,9 @@ cases so maintainers can see what the scanner is expected to catch. - `docs/use-cases.md`: maintainer scenarios with commands, artifacts, and success signals - `contextforge-suggestions.json`: machine-readable improvement suggestions for Codex, Claude, bots, and CI scripts - `contextforge-badge.svg`: compact audit status badge generated from context, cache, and security scores -- `contextforge-doctor.md`: shareable first-run readiness checklist from `doctor --summary` -- `contextforge-proof-pack.md`: shareable doctor, audit, command, and handoff evidence from `proof-pack` -- `contextforge-scorecard.md`: one-screen Codex/Claude readiness snapshot from `scorecard` +- `contextforge-doctor.md`: shareable first-run readiness checklist from `doctor --summary`, including MCP exposure, Claude settings, agentic workflow, and GitHub Actions hardening status +- `contextforge-proof-pack.md`: shareable doctor, audit, hardening command, and handoff evidence from `proof-pack` +- `contextforge-scorecard.md`: one-screen Codex/Claude readiness snapshot from `scorecard`, including hardening checks - `contextforge-agent-surface-map.md`: cross-agent support matrix from `surface-map` - `contextforge-agent-surface-inventory.md`: repo-specific list of actual agent-readable files from `surface-inventory` - `contextforge-agent-surface-diff.md`: PR-specific list of changed agent-readable files from `surface-diff` diff --git a/llms.txt b/llms.txt index 5bf7d2a..8fcfb65 100644 --- a/llms.txt +++ b/llms.txt @@ -5,17 +5,18 @@ ContextForge is a local-first TypeScript CLI for maintainers who use AI coding agents. 
It measures token usage, audits repo context files, checks prompt-cache stability, scans for malicious repo instructions, audits committed MCP configs, -reviews Claude Code subagents and custom slash commands, checks Codex/Claude -trace efficiency, builds task-specific context packs, and produces CI artifacts -that Codex and Claude can act on. +reviews Claude Code settings, subagents, and custom slash commands, checks +agentic GitHub workflows and GitHub Actions hardening, checks Codex/Claude trace +efficiency, builds task-specific context packs, and produces CI artifacts that +Codex and Claude can act on. ## Quick Start - `contextforge init --all --project-name "My Repo"`: scaffold the recommended repo setup, including `AGENTS.md`, `CLAUDE.md`, and `.github/copilot-instructions.md`. - `contextforge doctor --demo --summary contextforge-doctor.md`: run first-readiness checks and write a shareable Markdown report. -- `contextforge doctor`: verifies public proof files, launch profile assets, community health files, and next actions. -- `contextforge proof-pack --output contextforge-proof-pack.md`: combine doctor, audit, evidence commands, and Codex/Claude handoff guidance into one Markdown proof packet. -- `contextforge scorecard --output contextforge-scorecard.md`: write a one-screen Codex/Claude readiness snapshot for README, PR, and CI artifact readers. +- `contextforge doctor`: verifies public proof files, launch profile assets, community health files, MCP exposure, Claude settings, agentic workflow, GitHub Actions hardening, and next actions. +- `contextforge proof-pack --output contextforge-proof-pack.md`: combine doctor, audit, hardening evidence commands, and Codex/Claude handoff guidance into one Markdown proof packet. +- `contextforge scorecard --output contextforge-scorecard.md`: write a one-screen Codex/Claude readiness snapshot with hardening checks for README, PR, and CI artifact readers. 
- `contextforge surface-map --output contextforge-agent-surface-map.md`: write a support matrix for audited Codex, Claude Code, GitHub Copilot, MCP, Cursor, Cline, Gemini CLI, and Windsurf repo surfaces. - `contextforge surface-inventory --output contextforge-agent-surface-inventory.md`: write the actual agent-readable files present in this repo and the commands that audit them. - `contextforge surface-diff --base main --output contextforge-agent-surface-diff.md`: write the changed agent-readable files in a PR, affected ecosystems, and follow-up checks. diff --git a/package.json b/package.json index d16613b..908808a 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "contextforge", - "version": "0.69.0", + "version": "0.70.0", "description": "Agent context gate for Codex, Claude Code, GitHub Copilot, MCP, Cursor, Cline, Gemini, and Windsurf repos.", "type": "module", "packageManager": "pnpm@11.2.2", diff --git a/src/cli.ts b/src/cli.ts index 527bb41..a06b196 100644 --- a/src/cli.ts +++ b/src/cli.ts @@ -920,7 +920,7 @@ Usage: contextforge surface-inventory [--json] [--output contextforge-agent-surface-inventory.md] contextforge surface-diff [--base main] [--json] [--output contextforge-agent-surface-diff.md] contextforge publish-readiness [--json] [--summary contextforge-publish-readiness.md] - contextforge init [--all] [--github-action] [--pr-comment-workflow] [--agents-md] [--claude-md] [--copilot-instructions] [--project-name "My App"] [--action-ref grnbtqdbyx-create/contextforge@v0.69.0] [--force] + contextforge init [--all] [--github-action] [--pr-comment-workflow] [--agents-md] [--claude-md] [--copilot-instructions] [--project-name "My App"] [--action-ref grnbtqdbyx-create/contextforge@v0.70.0] [--force] Session scan safety: --max-session-files 50 newest JSONL files to scan per provider diff --git a/src/doctor/doctor.ts b/src/doctor/doctor.ts index dbf9693..195f385 100644 --- a/src/doctor/doctor.ts +++ b/src/doctor/doctor.ts @@ -1,5 +1,8 @@ 
import { access } from 'node:fs/promises'; import path from 'node:path'; +import { auditAgenticWorkflows } from '../analyzers/agenticWorkflow.js'; +import { auditClaudeSettings } from '../analyzers/claudeSettings.js'; +import { auditGithubActions } from '../analyzers/githubActions.js'; import { auditMcpExposure } from '../analyzers/mcpExposure.js'; import { buildAudit } from '../audit/buildAudit.js'; import { runSecurityBenchmark } from '../benchmark/securityBenchmark.js'; @@ -37,6 +40,9 @@ export async function runDoctor(options: DoctorOptions): Promise { minSecurityScore: options.minSecurityScore ?? 60 }); const benchmark = await runSecurityBenchmark({ benchmarkDir: options.benchmarkDir }); + const claudeSettings = await auditClaudeSettings({ rootDir: options.rootDir }); + const agenticWorkflows = await auditAgenticWorkflows({ rootDir: options.rootDir }); + const githubActions = await auditGithubActions({ rootDir: options.rootDir }); const mcpExposure = await auditMcpExposure({ rootDir: options.rootDir }); const workflows = await workflowChecks(options.rootDir); const publicProof = await publicProofChecks(options.rootDir); @@ -67,6 +73,30 @@ export async function runDoctor(options: DoctorOptions): Promise { status: benchmark.passed ? 'pass' : 'fail', detail: `${benchmark.totalCases - benchmark.failedCases}/${benchmark.totalCases} benchmark cases passing` }, + { + name: 'Claude Code settings', + status: claudeSettings.status, + detail: + claudeSettings.files.length > 0 + ? `${claudeSettings.score}/100 across ${claudeSettings.files.join(', ')}${claudeSettings.findings.length > 0 ? `; ${claudeSettings.findings.length} findings` : ''}` + : `${claudeSettings.score}/100 with no Claude Code settings found` + }, + { + name: 'Agentic workflows', + status: agenticWorkflows.status, + detail: + agenticWorkflows.files.length > 0 + ? `${agenticWorkflows.score}/100 across ${agenticWorkflows.files.join(', ')}${agenticWorkflows.findings.length > 0 ? 
`; ${agenticWorkflows.findings.length} findings` : ''}` + : `${agenticWorkflows.score}/100 with no GitHub workflows found` + }, + { + name: 'GitHub Actions hardening', + status: githubActions.status, + detail: + githubActions.files.length > 0 + ? `${githubActions.score}/100 across ${githubActions.files.join(', ')}${githubActions.findings.length > 0 ? `; ${githubActions.findings.length} findings` : ''}` + : `${githubActions.score}/100 with no GitHub workflows found` + }, { name: 'GitHub workflows', status: workflows.missing.length === 0 ? 'pass' : 'warn', @@ -249,6 +279,15 @@ function nextActions(checks: DoctorCheck[], auditActions: string[]): string[] { if (checks.some((check) => check.name === 'GitHub workflows' && check.status === 'warn')) { actions.push('Add the ContextForge GitHub Action so every PR uploads JSON and HTML audit artifacts.'); } + if (checks.some((check) => check.name === 'Claude Code settings' && check.status !== 'pass')) { + actions.push('Fix Claude Code settings findings before sharing project-level permissions, hooks, or sensitive-file rules.'); + } + if (checks.some((check) => check.name === 'Agentic workflows' && check.status !== 'pass')) { + actions.push('Fix agentic workflow findings before letting untrusted GitHub event text reach Codex, Claude, Copilot, or other agents.'); + } + if (checks.some((check) => check.name === 'GitHub Actions hardening' && check.status !== 'pass')) { + actions.push('Fix GitHub Actions hardening findings before trusting release or agent-review workflows.'); + } if (checks.some((check) => check.name === 'MCP exposure' && check.status !== 'pass')) { actions.push('Review MCP configs for hardcoded secrets, unsafe shell installers, unpinned packages, auto-approval, broad tool permissions, and symlinked config files before enabling agents.'); } diff --git a/src/init/githubAction.ts b/src/init/githubAction.ts index 9035307..78594b8 100644 --- a/src/init/githubAction.ts +++ b/src/init/githubAction.ts @@ -1,7 +1,7 @@ 
import { access, mkdir, writeFile } from 'node:fs/promises'; import path from 'node:path'; -export const DEFAULT_GITHUB_ACTION_REF = 'grnbtqdbyx-create/contextforge@v0.69.0'; +export const DEFAULT_GITHUB_ACTION_REF = 'grnbtqdbyx-create/contextforge@v0.70.0'; export interface GithubActionScaffoldOptions { rootDir: string; diff --git a/src/report/proofPack.ts b/src/report/proofPack.ts index c8f0ada..7b48f38 100644 --- a/src/report/proofPack.ts +++ b/src/report/proofPack.ts @@ -32,6 +32,9 @@ export function createProofPack(options: { doctor: DoctorResult; audit: AuditRes 'contextforge doctor --summary contextforge-doctor.md', 'contextforge audit --summary contextforge-summary.md --plan contextforge-agent-plan.md --comment contextforge-pr-comment.md --suggestions contextforge-suggestions.json --badge contextforge-badge.svg --base main', 'contextforge security-benchmark', + 'contextforge claude-audit --summary contextforge-claude-audit.md --sarif contextforge-claude.sarif', + 'contextforge workflow-audit --summary contextforge-workflow-audit.md --sarif contextforge-workflow.sarif', + 'contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif', 'contextforge pack --task "next change" --budget 20000 --sessions --output contextforge-pack.md', '```', '', diff --git a/src/report/scorecard.ts b/src/report/scorecard.ts index d5ad617..bdc289d 100644 --- a/src/report/scorecard.ts +++ b/src/report/scorecard.ts @@ -35,6 +35,9 @@ export function createAgentReadinessScorecardData(options: { doctor: DoctorResul 'contextforge-proof-pack.md', 'contextforge-review-kit.md', 'contextforge-mcp-audit.md', + 'contextforge-claude-audit.md', + 'contextforge-workflow-audit.md', + 'contextforge-actions-audit.md', 'contextforge-artifact-map.md', 'contextforge-agent-plan.md' ] @@ -79,6 +82,9 @@ export function createAgentReadinessScorecard(data: AgentReadinessScorecardData) 'contextforge proof-pack --output contextforge-proof-pack.md', 'contextforge 
review-kit --base main --output contextforge-review-kit.md', 'contextforge mcp-audit --summary contextforge-mcp-audit.md', + 'contextforge claude-audit --summary contextforge-claude-audit.md --sarif contextforge-claude.sarif', + 'contextforge workflow-audit --summary contextforge-workflow-audit.md --sarif contextforge-workflow.sarif', + 'contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif', 'contextforge artifact-map --output contextforge-artifact-map.md', '```', '' diff --git a/tests/cli.test.ts b/tests/cli.test.ts index 7cf3d96..9f133f7 100644 --- a/tests/cli.test.ts +++ b/tests/cli.test.ts @@ -62,7 +62,7 @@ describe('CLI help command', () => { it('prints the current default GitHub Action ref in init examples', async () => { const { stdout } = await execFileAsync('pnpm', ['contextforge', 'help']); - expect(stdout).toContain('--action-ref grnbtqdbyx-create/contextforge@v0.69.0'); + expect(stdout).toContain('--action-ref grnbtqdbyx-create/contextforge@v0.70.0'); }); }); diff --git a/tests/doctor.test.ts b/tests/doctor.test.ts index 78ce726..7b4666c 100644 --- a/tests/doctor.test.ts +++ b/tests/doctor.test.ts @@ -33,8 +33,8 @@ describe('doctor readiness report', () => { await writeFile(path.join(rootDir, '.github/ISSUE_TEMPLATE/bug_report.md'), '---\nname: Bug report\nabout: Report something broken\n---\n'); await writeFile(path.join(rootDir, '.github/ISSUE_TEMPLATE/feature_request.md'), '---\nname: Feature request\nabout: Suggest an improvement\n---\n'); await writeFile(path.join(rootDir, '.github/PULL_REQUEST_TEMPLATE.md'), '## What changed\n'); - await writeFile(path.join(rootDir, '.github/workflows/ci.yml'), 'name: CI\n'); - await writeFile(path.join(rootDir, '.github/workflows/contextforge-audit.yml'), 'name: ContextForge Audit\n'); + await writeFile(path.join(rootDir, '.github/workflows/ci.yml'), 'name: CI\npermissions:\n contents: read\n'); + await writeFile(path.join(rootDir, 
'.github/workflows/contextforge-audit.yml'), 'name: ContextForge Audit\npermissions:\n contents: read\n'); const result = await runDoctor({ rootDir, @@ -50,6 +50,9 @@ describe('doctor readiness report', () => { 'Cache stability', 'Context security', 'Security benchmark', + 'Claude Code settings', + 'Agentic workflows', + 'GitHub Actions hardening', 'GitHub workflows', 'Public proof surfaces', 'Launch profile surfaces', @@ -59,11 +62,45 @@ describe('doctor readiness report', () => { expect(result.nextActions.length).toBeGreaterThan(0); expect(result.checks.find((check) => check.name === 'Public proof surfaces')?.detail).toContain('examples/review-kit.md present'); expect(result.checks.find((check) => check.name === 'Launch profile surfaces')?.detail).toContain('docs/artifacts.md present'); + expect(result.checks.find((check) => check.name === 'GitHub Actions hardening')?.detail).toContain('100/100'); expect(formatDoctor(result)).toContain('ContextForge doctor: pass'); expect(createDoctorSummary(result)).toContain('# ContextForge Doctor'); expect(createDoctorSummary(result)).toContain('| Check | Status | Detail |'); }); + it('warns when GitHub Actions hardening findings are present', async () => { + const rootDir = await mkdtemp(path.join(os.tmpdir(), 'contextforge-doctor-actions-hardening-')); + await mkdir(path.join(rootDir, '.github/workflows'), { recursive: true }); + await writeFile( + path.join(rootDir, '.github/workflows/release.yml'), + [ + 'name: Release', + 'on: pull_request_target', + 'jobs:', + ' release:', + ' runs-on: ubuntu-latest', + ' steps:', + ' - uses: actions/checkout@v5', + ' with:', + ' ref: ${{ github.event.pull_request.head.sha }}' + ].join('\n') + ); + + const result = await runDoctor({ + rootDir, + records: [], + minContextScore: 60, + minCacheScore: 60, + minSecurityScore: 60 + }); + + const hardening = result.checks.find((check) => check.name === 'GitHub Actions hardening'); + + expect(hardening?.status).toBe('warn'); + 
expect(hardening?.detail).toContain('findings'); + expect(result.nextActions).toContain('Fix GitHub Actions hardening findings before trusting release or agent-review workflows.'); + }); + it('warns when public proof surfaces are missing', async () => { const rootDir = await mkdtemp(path.join(os.tmpdir(), 'contextforge-public-proof-')); await writeFile(path.join(rootDir, 'AGENTS.md'), 'Keep instructions short, explicit, and reviewed.\n'); diff --git a/tests/init.test.ts b/tests/init.test.ts index 7c94944..c05f228 100644 --- a/tests/init.test.ts +++ b/tests/init.test.ts @@ -88,7 +88,7 @@ describe('GitHub Action init scaffold', () => { const rootDir = await mkdtemp(path.join(os.tmpdir(), 'contextforge-init-default-ref-')); const result = await scaffoldGithubActionWorkflow({ rootDir }); - expect(await readFile(result.path, 'utf8')).toContain('uses: grnbtqdbyx-create/contextforge@v0.69.0'); + expect(await readFile(result.path, 'utf8')).toContain('uses: grnbtqdbyx-create/contextforge@v0.70.0'); }); it('is available through the init CLI command', async () => { diff --git a/tests/proofPack.test.ts b/tests/proofPack.test.ts index 64f32c3..4f57949 100644 --- a/tests/proofPack.test.ts +++ b/tests/proofPack.test.ts @@ -31,6 +31,9 @@ describe('proof pack report', () => { expect(markdown).toContain('## Evidence Commands'); expect(markdown).toContain('contextforge doctor --summary contextforge-doctor.md'); expect(markdown).toContain('contextforge audit --summary contextforge-summary.md'); + expect(markdown).toContain('contextforge claude-audit --summary contextforge-claude-audit.md --sarif contextforge-claude.sarif'); + expect(markdown).toContain('contextforge workflow-audit --summary contextforge-workflow-audit.md --sarif contextforge-workflow.sarif'); + expect(markdown).toContain('contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif'); expect(markdown).toContain('## Codex / Claude Handoff'); }); }); diff --git 
a/tests/scorecard.test.ts b/tests/scorecard.test.ts index 259c1a2..5b7dc1d 100644 --- a/tests/scorecard.test.ts +++ b/tests/scorecard.test.ts @@ -33,5 +33,8 @@ describe('agent readiness scorecard', () => { expect(markdown).toContain('## Why Codex And Claude Should Care'); expect(markdown).toContain('contextforge proof-pack --output contextforge-proof-pack.md'); expect(markdown).toContain('contextforge review-kit --base main --output contextforge-review-kit.md'); + expect(markdown).toContain('contextforge claude-audit --summary contextforge-claude-audit.md --sarif contextforge-claude.sarif'); + expect(markdown).toContain('contextforge workflow-audit --summary contextforge-workflow-audit.md --sarif contextforge-workflow.sarif'); + expect(markdown).toContain('contextforge actions-audit --summary contextforge-actions-audit.md --sarif contextforge-actions.sarif'); }); });
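
The three doctor checks added in `src/doctor/doctor.ts` (Claude Code settings, agentic workflows, GitHub Actions hardening) each build their `detail` string inline with the same template. A minimal sketch of that shared shape follows, assuming only the `score`, `files`, and `findings` fields the diff reads. The `HardeningAuditLike` interface and `formatHardeningDetail` helper are hypothetical names used for illustration and are not part of this change; the score of 70 in the second call is an invented sample value.

```typescript
// Hypothetical shape: only the fields the doctor detail strings in this diff read.
interface HardeningAuditLike {
  score: number;
  files: string[];
  findings: unknown[];
}

// Hypothetical helper (not in this diff): mirrors the inline template used by the
// three new doctor checks. `emptyLabel` is the noun used when no files were audited,
// e.g. 'Claude Code settings' or 'GitHub workflows'.
function formatHardeningDetail(audit: HardeningAuditLike, emptyLabel: string): string {
  if (audit.files.length === 0) {
    return `${audit.score}/100 with no ${emptyLabel} found`;
  }
  // The findings count is appended only when at least one finding exists.
  const findingsSuffix =
    audit.findings.length > 0 ? `; ${audit.findings.length} findings` : '';
  return `${audit.score}/100 across ${audit.files.join(', ')}${findingsSuffix}`;
}

// No settings files present: falls through to the "with no ... found" branch.
const clean = formatHardeningDetail(
  { score: 100, files: [], findings: [] },
  'Claude Code settings'
);

// One workflow file with two findings (sample values for illustration only).
const flagged = formatHardeningDetail(
  { score: 70, files: ['.github/workflows/release.yml'], findings: [{}, {}] },
  'GitHub workflows'
);

console.log(clean);
console.log(flagged);
```

The new doctor test asserts on the substring `findings` in the hardening detail, so any refactor along these lines would need to keep that suffix wording to stay compatible with the test as written.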