Commit f03fe27

SmartBrandStrategies, kovermier, and claude authored
feat: v0.8.0 — classifier QA fix, routing trace, cold-start UX (#39-#46) (#49)
* fix(security): remove shell injection surface and block directory traversal

  Closes #43: Remove shell: true from runGit() in git-helpers.ts. Node.js resolves the git binary via PATH directly without a shell on WSL, Linux, macOS, and Windows. shell: true is unnecessary and allows shell metacharacters in args to be interpreted as shell syntax.

  Closes #42: Validate module paths in adf create before path.join. Paths containing ".." or absolute paths are rejected with a clear error. A secondary resolved-path check confirms the final path stays within the .ai/ directory, guarding against platform-specific bypass patterns.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(adf): populate command, unknown-op error clarity, CONTEXT scaffold section

  charter adf populate [--dry-run] [--force] [--ai-dir <dir>]

  Reads package.json, README.md, and stack detection signals to auto-fill ADF files with project-specific content after charter adf init. Populates CONTEXT in core/backend/frontend.adf and STATE in state.adf. Idempotent: skips files with non-scaffold content unless --force.

  patcher: unknown ops now produce a clear error listing valid op names instead of the cryptic "handlers[op.op] is not a function" TypeError.

  CORE_SCAFFOLD: add a CONTEXT section placeholder so ADD_BULLET section:CONTEXT works immediately after adf init without requiring ADD_SECTION first.

  adf patch help: list all valid ops with concrete usage examples.

  bootstrap/init next steps: point to charter adf populate as step 1 instead of generic "edit core.adf manually" guidance.

  harness: extend SDLC corpus and runner with mixed QA/backend signal scenarios that exposed the classifier routing issues filed in #44/#45.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(classifier): QA phrase override, routing trace, tidy --verbose (#44 #45 #46)

  - QA compound phrases (smoke test, contract test, schema compat, approval gate, verified against, test fixtures) now fire a phrase-level override BEFORE keyword scoring so they cannot be outvoted by raw infra/backend keyword count (#44, #45)
  - contentToModule() returns { module, phraseOverride?, scores } so all per-module candidate scores are visible to callers (#46)
  - ClassificationResult gains optional routingTrace?: RoutingTrace with headingModule, phraseOverride, and candidateScores for debugging
  - adf tidy --verbose prints per-item routing rationale (module, section, trigger scores or phrase override) (#46)
  - 9 new tests covering phrase override routing, guard against absent qa.adf in triggerMap, and routingTrace shape (#44 #45 #46)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(cold-start): migrate --keep-summary, migrate --audit, doctor sparse-pointer warn (#39 #40 #41)

  - adf migrate --keep-summary: injects auto-generated "## Architecture Summary" block listing migrated module names and section counts so CLAUDE.md thin pointer gives agents architectural orientation without duplicating rules (#39)
  - adf migrate --audit: prints per-module breakdown (constraints/context/advisory counts) and flags potential misroutes via routingTrace: items routed to core.adf that scored non-zero for a specialized module (#40)
  - doctor: adds 'INFO' status tier (soft, does not fail overall check); warns [info] when thin-pointer CLAUDE.md has <15 lines and no stack/framework keywords (agents have zero orientation); suggests charter adf populate (#41)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump to v0.8.0

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Kevin Overmier <kovermier@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
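The two-stage path check described under "Closes #42" could look roughly like the sketch below. This is not the code from the commit: the helper name resolveModulePath, the segment-based ".." test, and the error wording are all assumptions made for illustration. Only the overall shape (reject ".." and absolute paths first, then confirm the resolved path stays inside the .ai/ directory) comes from the commit message.

```typescript
import * as path from 'node:path';

// Hypothetical helper: name, signature, and messages are illustrative only.
export function resolveModulePath(aiDir: string, modulePath: string): string {
  // Stage 1: reject absolute paths and any ".." path segment up front.
  const segments = modulePath.split(/[\\/]+/);
  if (path.isAbsolute(modulePath) || segments.includes('..')) {
    throw new Error(`Invalid module path "${modulePath}": must be relative and must not contain ".."`);
  }

  // Stage 2: resolve and confirm the final path is still inside aiDir.
  const root = path.resolve(aiDir);
  const resolved = path.resolve(root, modulePath);
  const rel = path.relative(root, resolved);
  if (rel === '' || rel.startsWith('..') || path.isAbsolute(rel)) {
    throw new Error(`Invalid module path "${modulePath}": resolves outside ${root}`);
  }
  return resolved;
}
```

Doing the containment check on the resolved path, rather than on the raw string alone, is what covers inputs that look harmless textually but still escape the directory on a particular platform.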
1 parent 8a45830 commit f03fe27

25 files changed, with 1150 additions and 49 deletions

harness/adf-inspector.ts

Lines changed: 5 additions & 2 deletions
@@ -120,6 +120,9 @@ export function printSnapshot(snapshot: AdfSnapshot, previous?: AdfSnapshot): vo
 export function detectAccumulationIssues(snapshots: AdfSnapshot[]): string[] {
   const issues: string[] = [];
   if (snapshots.length < 2) return issues;
+  const MIN_ABSOLUTE_GROWTH = 10;
+  const MIN_BASELINE_ITEMS = 3;
+  const MAX_SECTION_ITEMS = 20;

   const first = snapshots[0];
   const last = snapshots[snapshots.length - 1];
@@ -131,13 +134,13 @@ export function detectAccumulationIssues(snapshots: AdfSnapshot[]): string[] {
     const growth = mod.totalItems - start.totalItems;
     const growthRate = start.totalItems > 0 ? growth / start.totalItems : growth;

-    if (growthRate > 2) {
+    if (growth >= MIN_ABSOLUTE_GROWTH && start.totalItems >= MIN_BASELINE_ITEMS && growthRate > 2) {
       issues.push(`${mod.module}: grew ${growth} items (${(growthRate * 100).toFixed(0)}% increase) — possible accumulation`);
     }

     // Check any single section that got very large
     for (const sec of mod.sections) {
-      if (sec.itemCount > 15) {
+      if (sec.itemCount > MAX_SECTION_ITEMS) {
         issues.push(`${mod.module} > ${sec.key}: ${sec.itemCount} items — section may need pruning`);
       }
     }
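To see what the added guards change, the growth condition can be pulled out into a small standalone predicate. The sketch below is only an illustration: the function name isAccumulating is hypothetical, while the two constants and the growth-rate formula are copied from the hunk above.

```typescript
// Constants copied from the hunk above.
const MIN_ABSOLUTE_GROWTH = 10;
const MIN_BASELINE_ITEMS = 3;

// Hypothetical standalone predicate mirroring the new condition.
function isAccumulating(startItems: number, endItems: number): boolean {
  const growth = endItems - startItems;
  const growthRate = startItems > 0 ? growth / startItems : growth;
  return growth >= MIN_ABSOLUTE_GROWTH && startItems >= MIN_BASELINE_ITEMS && growthRate > 2;
}

// A module growing from 1 to 4 items has a 300% growth rate, which the old
// check (growthRate > 2) flagged. The new check ignores it because the
// baseline and the absolute growth are both too small.
console.log(isAccumulating(1, 4));   // false
console.log(isAccumulating(3, 13));  // true
console.log(isAccumulating(10, 25)); // false
```

Under the old condition, small modules tripped the warning after a handful of additions; the absolute-growth and baseline floors appear intended to reserve the warning for modules that were already populated and then grew substantially.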

harness/corpus/sdlc.ts

Lines changed: 138 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,138 @@
1+
/**
2+
* SDLC-focused scenarios — validate that ADF modules stay updated and portable
3+
* as project guidance evolves from requirements through release.
4+
*/
5+
6+
import type { Scenario } from '../types';
7+
8+
export const sdlcScenarios: Scenario[] = [
9+
{
10+
id: 'fullstack-sdlc-handoff-portability',
11+
archetype: 'fullstack',
12+
description: 'Rules evolve across SDLC phases while remaining portable through ADF modules',
13+
manifest: {
14+
onDemand: [
15+
{ path: 'frontend.adf', triggers: ['react', 'component', 'ui', 'css', 'vite', 'tsx'] },
16+
{ path: 'backend.adf', triggers: ['api', 'endpoint', 'route', 'handler', 'database', 'auth', 'zod', 'request', 'response'] },
17+
{ path: 'infra.adf', triggers: ['deploy', 'release', 'rollback', 'ci', 'pipeline', 'docker', 'env', 'artifact'] },
18+
{ path: 'qa.adf', triggers: ['test', 'testing', 'playwright', 'contract', 'smoke', 'verification', 'evidence', 'auditability'] },
19+
],
20+
},
21+
sessions: [
22+
{
23+
label: 'session-1: requirements',
24+
inject: `
25+
## API Requirements
26+
27+
- Every API endpoint must publish request and response schemas
28+
- Auth is required for all write endpoints
29+
- Route handlers must return structured error codes
30+
- Database migrations must be reviewed before merge
31+
`,
32+
expected: { 'backend.adf': 4 },
33+
},
34+
{
35+
label: 'session-2: design',
36+
inject: `
37+
## System Design
38+
39+
- React UI components must map one-to-one to approved design tokens
40+
- API handlers must validate all payloads with Zod
41+
- Route naming must stay stable across versions
42+
- Frontend component props must be typed in TSX files
43+
`,
44+
expected: { 'frontend.adf': 2, 'backend.adf': 2 },
45+
},
46+
{
47+
label: 'session-3: implementation',
48+
inject: `
49+
## Implementation Rules
50+
51+
- API route files live under \`app/api/\` and use one handler per endpoint
52+
- Database writes must run inside transactions
53+
- Auth checks execute before any handler business logic
54+
- Build artifacts are generated only in CI pipeline jobs
55+
`,
56+
expected: { 'backend.adf': 3, 'infra.adf': 1 },
57+
},
58+
{
59+
label: 'session-4: verification',
60+
inject: `
61+
## Verification
62+
63+
- CI pipeline must run unit, integration, and Playwright suites on every PR
64+
- API contract tests validate request and response schema compatibility
65+
- Deploy preview environments must run smoke checks before approval
66+
- Test artifacts are uploaded from CI for auditability
67+
`,
68+
expected: { 'qa.adf': 4 },
69+
},
70+
{
71+
label: 'session-5: release and portability handoff',
72+
inject: `
73+
## Release Handoff
74+
75+
- Deploy jobs must consume versioned artifacts from the pipeline only
76+
- Rollback instructions must be validated in staging before production release
77+
- Environment configuration uses env keys defined in the deployment checklist
78+
- Release evidence includes CI run ID, artifact hash, and deployment timestamp
79+
`,
80+
expected: { 'infra.adf': 4 },
81+
},
82+
],
83+
},
84+
{
85+
id: 'fullstack-sdlc-generic-checklist-routing',
86+
archetype: 'fullstack',
87+
description: 'Generic SDLC handoff headings still separate verification evidence from release operations',
88+
manifest: {
89+
onDemand: [
90+
{ path: 'frontend.adf', triggers: ['react', 'component', 'ui', 'css', 'vite', 'tsx'] },
91+
{ path: 'backend.adf', triggers: ['api', 'endpoint', 'route', 'handler', 'database', 'auth', 'zod', 'request', 'response'] },
92+
{ path: 'infra.adf', triggers: ['deploy', 'release', 'rollback', 'ci', 'pipeline', 'docker', 'env', 'artifact'] },
93+
{ path: 'qa.adf', triggers: ['test', 'testing', 'playwright', 'contract', 'smoke', 'verification', 'evidence', 'auditability'] },
94+
],
95+
},
96+
sessions: [
97+
{
98+
label: 'session-1: generic checklist handoff',
99+
inject: `
100+
## Checklist
101+
102+
- Playwright smoke tests must pass before release approval
103+
- Contract test evidence is attached to the deployment record for auditability
104+
- Release artifact hashes are recorded before deploy starts
105+
- Rollback drills must use the staged deploy artifact from the pipeline
106+
`,
107+
expected: { 'qa.adf': 2, 'infra.adf': 2 },
108+
},
109+
],
110+
},
111+
{
112+
id: 'fullstack-sdlc-mixed-qa-backend-signals',
113+
archetype: 'fullstack',
114+
description: 'Mixed backend and QA wording in a generic checklist should still route by dominant verification vs API intent',
115+
manifest: {
116+
onDemand: [
117+
{ path: 'frontend.adf', triggers: ['react', 'component', 'ui', 'css', 'vite', 'tsx'] },
118+
{ path: 'backend.adf', triggers: ['api', 'endpoint', 'route', 'handler', 'database', 'auth', 'zod', 'request', 'response'] },
119+
{ path: 'infra.adf', triggers: ['deploy', 'release', 'rollback', 'ci', 'pipeline', 'docker', 'env', 'artifact'] },
120+
{ path: 'qa.adf', triggers: ['test', 'testing', 'playwright', 'contract', 'smoke', 'verification', 'evidence', 'auditability'] },
121+
],
122+
},
123+
sessions: [
124+
{
125+
label: 'session-1: mixed checklist bullets',
126+
inject: `
127+
## Checklist
128+
129+
- API contract test evidence must be attached to the release review for auditability
130+
- Request and response schema contract tests must pass before merging backend changes
131+
- Endpoint smoke tests run in CI before deploy approval
132+
- API handler error responses are verified against contract fixtures
133+
`,
134+
expected: { 'qa.adf': 3, 'backend.adf': 1 },
135+
},
136+
],
137+
},
138+
];
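The mixed-signal scenarios above exercise the behavior the commit message describes: a QA compound phrase overrides raw keyword counts. The sketch below is a minimal illustration of that idea and is not the contentToModule() implementation from packages/adf. The names routeBullet, RouteResult, and QA_PHRASES are hypothetical, the regexes are assumptions derived from the phrase list in the commit message, and plain word-boundary matching stands in for whatever scoring the real classifier uses.

```typescript
type TriggerMap = Record<string, string[]>;

interface RouteResult {
  module: string;
  phraseOverride?: string;
  scores: Record<string, number>;
}

// Phrase list taken from the commit message; the regex forms are assumptions.
const QA_PHRASES: RegExp[] = [
  /\bsmoke tests?\b/i,
  /\bcontract tests?\b/i,
  /\bschema compat/i,
  /\bapproval gate\b/i,
  /\bverified against\b/i,
  /\btest fixtures?\b/i,
];

function escapeRegex(str: string): string {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical router: phrase override first, keyword scoring second.
function routeBullet(text: string, triggerMap: TriggerMap): RouteResult {
  const lower = text.toLowerCase();

  // Score every module so callers can inspect all candidates.
  const scores: Record<string, number> = {};
  for (const [module, triggers] of Object.entries(triggerMap)) {
    scores[module] = triggers.filter(t => new RegExp(`\\b${escapeRegex(t)}\\b`).test(lower)).length;
  }

  // Phrase override applies only when qa.adf is a known module.
  if ('qa.adf' in triggerMap) {
    for (const phrase of QA_PHRASES) {
      const match = phrase.exec(text);
      if (match) return { module: 'qa.adf', phraseOverride: match[0].toLowerCase(), scores };
    }
  }

  // Fallback: highest keyword score wins; core.adf when nothing matches.
  let best = 'core.adf';
  let bestScore = 0;
  for (const [module, score] of Object.entries(scores)) {
    if (score > bestScore) {
      best = module;
      bestScore = score;
    }
  }
  return { module: best, scores };
}
```

With the corpus trigger lists, the bullet "Endpoint smoke tests run in CI before deploy approval" scores 2 for infra.adf (deploy, ci) and only 1 for qa.adf under this simplified scoring, so keyword counting alone would send it to infra.adf; the phrase check routes it to qa.adf instead.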

harness/runner.ts

Lines changed: 115 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,8 @@ import * as os from 'node:os';
2121
import * as path from 'node:path';
2222
import { execFileSync } from 'node:child_process';
2323

24-
import type { Scenario, TidyOutput, ScenarioResult, HarnessReport } from './types';
24+
import { buildMigrationPlan, parseMarkdownSections, type TriggerMap } from '../packages/adf/src';
25+
import type { Scenario, TidyOutput, ScenarioResult, HarnessReport, StaticSessionAudit, StaticItemRoute } from './types';
2526
import { evaluateSession, printSessionResult } from './evaluator';
2627
import { generateScenarios, getArchetypeManifest } from './ollama';
2728
import { REAL_REPOS } from './corpus/real-repos';
@@ -31,6 +32,7 @@ import { workerScenarios } from './corpus/worker';
3132
import { backendScenarios } from './corpus/backend';
3233
import { fullstackScenarios } from './corpus/fullstack';
3334
import { edgeCaseScenarios } from './corpus/edge-cases';
35+
import { sdlcScenarios } from './corpus/sdlc';
3436

3537
// ============================================================================
3638
// Config
@@ -44,6 +46,7 @@ const ALL_STATIC: Scenario[] = [
4446
...backendScenarios,
4547
...fullstackScenarios,
4648
...edgeCaseScenarios,
49+
...sdlcScenarios,
4750
];
4851

4952
const OLLAMA_ARCHETYPES = ['worker', 'backend', 'fullstack'];
@@ -158,7 +161,12 @@ function runTidy(repoDir: string, dryRun = true): TidyOutput {
158161
function runStaticScenario(scenario: Scenario): ScenarioResult {
159162
const tmp = makeTempRepo(scenario);
160163
const sessionResults = [];
164+
const sessionAudits: StaticSessionAudit[] = [];
165+
const snapshots: AdfSnapshot[] = [];
166+
let prevSnapshot: AdfSnapshot | undefined;
161167
let scenarioPass = true;
168+
const baseClaude = THIN_POINTER.trim();
169+
const aiDir = path.join(tmp, '.ai');
162170

163171
for (const session of scenario.sessions) {
164172
// Each session: inject onto thin pointer, dry-run to evaluate, then apply
@@ -173,18 +181,123 @@ function runStaticScenario(scenario: Scenario): ScenarioResult {
173181

174182
// Apply tidy (non-dry-run) to route content into ADF modules, restoring
175183
// CLAUDE.md to thin pointer so the next session sees a clean baseline.
176-
runTidy(tmp, false);
184+
const applyOutput = runTidy(tmp, false);
185+
186+
const postClaude = fs.readFileSync(path.join(tmp, 'CLAUDE.md'), 'utf-8').trim();
187+
const claudeRestored = postClaude === baseClaude;
188+
if (!claudeRestored) {
189+
scenarioPass = false;
190+
console.log(' portability warning: CLAUDE.md was not restored to thin pointer state');
191+
}
192+
193+
const snapshot = inspectAdfModules(aiDir, session.label, prevSnapshot);
194+
snapshots.push(snapshot);
195+
prevSnapshot = snapshot;
196+
const itemRoutes = previewItemRoutes(session.inject, scenario);
197+
198+
sessionAudits.push({
199+
sessionLabel: session.label,
200+
dryRunExtracted: tidyOutput.totalExtracted,
201+
appliedModulesModified: applyOutput.modulesModified,
202+
claudeRestored,
203+
adfTotalItems: snapshot.totalItemsAcrossAllModules,
204+
modulesGrew: snapshot.grew,
205+
itemRoutes,
206+
});
207+
208+
if (!sessionResult.pass) {
209+
console.log(' item routing preview:');
210+
for (const item of itemRoutes) {
211+
const matches = item.matchedTriggers.length > 0 ? ` | matches=${item.matchedTriggers.join(', ')} score=${item.matchScore}` : '';
212+
console.log(` [${item.heading || 'preamble'} -> ${item.headingModule}] ${item.targetModule} (${item.targetSection}) :: ${item.content}${matches}`);
213+
}
214+
}
215+
}
216+
217+
const accumulationIssues = detectAccumulationIssues(snapshots);
218+
if (accumulationIssues.length > 0) {
219+
console.log(' accumulation warnings:');
220+
for (const issue of accumulationIssues) console.log(` - ${issue}`);
177221
}
178222

179223
return {
180224
scenarioId: scenario.id,
181225
archetype: scenario.archetype,
182226
description: scenario.description,
183227
sessions: sessionResults,
228+
staticAudit: {
229+
sessions: sessionAudits,
230+
accumulationIssues,
231+
},
184232
pass: scenarioPass,
185233
};
186234
}
187235

236+
function previewItemRoutes(inject: string, scenario: Scenario): StaticItemRoute[] {
237+
const triggerMap: TriggerMap = {};
238+
for (const entry of scenario.manifest.onDemand) {
239+
if (entry.triggers.length > 0) {
240+
triggerMap[entry.path] = entry.triggers.map(trigger => trigger.toLowerCase());
241+
}
242+
}
243+
244+
const sections = parseMarkdownSections(inject);
245+
const plan = buildMigrationPlan(sections, undefined, triggerMap);
246+
247+
return plan.items.map(item => ({
248+
heading: item.sourceHeading,
249+
content: item.element.content,
250+
headingModule: previewHeadingModule(item.sourceHeading),
251+
targetModule: item.classification.targetModule,
252+
targetSection: item.classification.targetSection,
253+
decision: item.classification.decision,
254+
reason: item.classification.reason,
255+
...scoreItemAgainstTriggers(item.element.content, triggerMap),
256+
}));
257+
}
258+
259+
function previewHeadingModule(heading: string): string {
260+
const lower = heading.toLowerCase();
261+
if (/\b(design.system|ui|frontend|css|component|react|vue|svelte|next|nextjs|tailwind|shadcn|radix|storybook|vite|vitest|playwright|remix|nuxt|astro)\b/.test(lower)) {
262+
return 'frontend.adf';
263+
}
264+
if (/\b(qa|quality|test|testing|verification|validate|validation|contract|smoke|evidence|audit)\b/.test(lower)) {
265+
return 'qa.adf';
266+
}
267+
if (/\b(auth|authentication|authorization|security|secret|token|permission|cors|rate.limit|jwt|oauth|clerk|nextauth|lucia|session|cookie|csrf|xss|password|bcrypt)\b/.test(lower)) {
268+
return 'security.adf';
269+
}
270+
if (/\b(deploy|deployment|infrastructure|infra|ci|cd|pipeline|config|configuration|environment|env|docker|wrangler|cloudflare|vercel|netlify|railway|fly|render|github.actions|kv|d1|r2|queue|durable.object)\b/.test(lower)) {
271+
return 'infra.adf';
272+
}
273+
if (/\b(api|backend|server|database|db|endpoint|query|migration|handler|prisma|drizzle|mongoose|postgres|postgresql|mysql|sqlite|express|fastify|hono|trpc|zod|graphql)\b/.test(lower)) {
274+
return 'backend.adf';
275+
}
276+
return 'core.adf';
277+
}
278+
279+
function scoreItemAgainstTriggers(text: string, triggerMap: TriggerMap): Pick<StaticItemRoute, 'matchedTriggers' | 'matchScore'> {
280+
const lower = text.toLowerCase();
281+
let matchedTriggers: string[] = [];
282+
let matchScore = 0;
283+
284+
for (const triggers of Object.values(triggerMap)) {
285+
const currentMatches = triggers.filter(trigger =>
286+
new RegExp(`\\b${escapeRegex(trigger)}(?:s|ed|ing|ment|tion|ity|ication)?\\b`, 'i').test(lower),
287+
);
288+
if (currentMatches.length > matchScore) {
289+
matchedTriggers = currentMatches;
290+
matchScore = currentMatches.length;
291+
}
292+
}
293+
294+
return { matchedTriggers, matchScore };
295+
}
296+
297+
function escapeRegex(str: string): string {
298+
return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
299+
}
300+
188301
// ============================================================================
189302
// Ollama Scenario Runner (exploratory — no expected routing)
190303
// ============================================================================
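The suffix-tolerant trigger pattern built in scoreItemAgainstTriggers is easier to reason about when applied to one trigger at a time. The sketch below reuses the regex from the hunk verbatim; the wrapper name matchesTrigger is hypothetical and exists only to make the matching behavior visible.

```typescript
// Same escaping as escapeRegex in the hunk above.
function escapeRegex(str: string): string {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical single-trigger wrapper around the pattern used by
// scoreItemAgainstTriggers: the trigger must start at a word boundary and may
// be followed by one of a fixed set of suffixes before the closing boundary.
function matchesTrigger(text: string, trigger: string): boolean {
  const pattern = new RegExp(`\\b${escapeRegex(trigger)}(?:s|ed|ing|ment|tion|ity|ication)?\\b`, 'i');
  return pattern.test(text);
}

console.log(matchesTrigger('Test artifacts are uploaded from CI', 'artifact')); // true  ("artifacts")
console.log(matchesTrigger('defined in the deployment checklist', 'deploy'));   // true  ("deployment")
console.log(matchesTrigger('use the latest build', 'test'));                    // false (no leading boundary)
console.log(matchesTrigger('Environment configuration', 'env'));                // false ("ironment" is not a listed suffix)
```

The last case is worth noting when reading routing previews: a short trigger such as env only matches the standalone word or one of the listed suffixed forms, so the corpus bullet "Environment configuration uses env keys ..." matches through the literal "env", not through "Environment".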

harness/types.ts

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,9 +87,37 @@ export interface ScenarioResult {
8787
archetype: string;
8888
description: string;
8989
sessions: SessionResult[];
90+
staticAudit?: StaticScenarioAudit;
9091
pass: boolean;
9192
}
9293

94+
export interface StaticSessionAudit {
95+
sessionLabel: string;
96+
dryRunExtracted: number;
97+
appliedModulesModified: string[];
98+
claudeRestored: boolean;
99+
adfTotalItems: number;
100+
modulesGrew: string[];
101+
itemRoutes: StaticItemRoute[];
102+
}
103+
104+
export interface StaticScenarioAudit {
105+
sessions: StaticSessionAudit[];
106+
accumulationIssues: string[];
107+
}
108+
109+
export interface StaticItemRoute {
110+
heading: string;
111+
content: string;
112+
headingModule: string;
113+
targetModule: string;
114+
targetSection: string;
115+
decision: 'STAY' | 'MIGRATE';
116+
reason: string;
117+
matchedTriggers: string[];
118+
matchScore: number;
119+
}
120+
93121
// ============================================================================
94122
// Run Report
95123
// ============================================================================

package.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -42,5 +42,6 @@
4242
"typescript": "~5.8.2",
4343
"vitest": "^4.0.18",
4444
"zod": "^3.24.1"
45-
}
45+
},
46+
"version": "0.8.0"
4647
}

packages/adf/package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{
22
"name": "@stackbilt/adf",
33
"sideEffects": false,
4-
"version": "0.7.0",
4+
"version": "0.8.0",
55
"description": "ADF (Attention-Directed Format) — AST-backed context format for AI agents",
66
"main": "./dist/index.js",
77
"types": "./dist/index.d.ts",
