Add viral/differentiation enhancements across all 9 plan items by shahcolate · Pull Request #4 · shahcolate/Product-Kit

shahcolate · 2026-03-15T12:40:09Z

Summary

- 4 example teardowns (Notion, Linear, Figma, ChatGPT) in `examples/teardowns/`: "show, don't tell" for GitHub browsers
- `--vs` comparison mode in `teardown.py`: head-to-head product analysis across 7 dimensions (e.g., `python scripts/teardown.py "Notion" --vs "Coda"`)
- `--baseline` eval mode in `run_evals.py`: measures the skill's behavioral lift over vanilla Claude (e.g., `python scripts/run_evals.py --plugin strategic-pm --baseline`)
- `--output social` format in `teardown.py`: thread-ready summary with a hook and 5 bullet verdicts
- Async parallel API calls in `teardown.py`: ~4x faster teardowns via `asyncio.gather()`
- Expanded eval coverage: 30 cases across 3 plugins (was 14 across 2), with +5 for strategic-pm, +5 for product-writing-studio, and 6 for the new pm-interview-prep plugin
- PM Interview Prep plugin: 321-line `SKILL.md`, 6 eval cases, mock interview mode, STAR coaching, company-type calibration
- Teardown-on-issue GitHub Action: add the `teardown` label to an issue and get the results as a comment (rate-limited to 5 per day)
- README updated with all new features, an example teardowns section, updated badges (30 cases), and the repo structure
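
For reviewers unfamiliar with the pattern, the `asyncio.gather()` change can be sketched roughly as follows. This is a minimal illustration, not the code in `teardown.py`: the dimension names and the `analyze_dimension` helper are hypothetical, and the API call is replaced by a short sleep.

```python
import asyncio
import time

# Hypothetical dimension names; the real teardown.py may use different ones.
DIMENSIONS = ["positioning", "pricing", "onboarding", "retention"]


async def analyze_dimension(product: str, dimension: str) -> str:
    """Stand-in for one API call; sleeps instead of hitting a real endpoint."""
    await asyncio.sleep(0.1)
    return f"{product}/{dimension}: ok"


async def run_teardown(product: str) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(
        *(analyze_dimension(product, d) for d in DIMENSIONS)
    )


def timed_teardown(product: str) -> tuple[list[str], float]:
    """Run the teardown and report wall-clock time in seconds."""
    start = time.perf_counter()
    results = asyncio.run(run_teardown(product))
    return results, time.perf_counter() - start
```

With four simulated calls, the wall-clock time should be close to one call's latency rather than the sum of all four, which is the kind of saving the "~4x" figure refers to.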

Test plan

- `python scripts/teardown.py "Notion"`: verify async parallel execution completes
- `python scripts/teardown.py "Notion" --vs "Coda"`: verify comparison output with 7 dimensions
- `python scripts/teardown.py "Notion" --output social`: verify thread-ready summary format
- `python scripts/run_evals.py --plugin strategic-pm --baseline`: verify skill vs vanilla scores appear
- `python scripts/run_evals.py`: verify all 30 cases across 3 plugins load and run
- Verify all JSON files parse: `cases.json` (×3), `marketplace.json`, `plugin.json`
- Open a test issue with the `teardown` label to verify the GitHub Action triggers
- Review example teardowns for quality and format consistency
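
The JSON parse step in the plan above could be scripted along these lines. This is a sketch only: the `find_invalid_json` helper is hypothetical and simply walks a directory tree, so it is not necessarily how the repo's own validation is implemented.

```python
import json
from pathlib import Path


def find_invalid_json(root: Path) -> dict[str, str]:
    """Return {relative path: error message} for each *.json under root that fails to parse."""
    errors: dict[str, str] = {}
    for path in sorted(root.rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            # Record the failure and keep going so one bad file does not hide others.
            errors[path.relative_to(root).as_posix()] = str(exc)
    return errors
```

Calling `find_invalid_json(Path("."))` from the repository root returns an empty dict when every file parses.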

🤖 Generated with Claude Code

…ls, interview prep plugin, teardown examples

Phase 1 - Quick Wins:
- Add 4 example teardowns (Notion, Linear, Figma, ChatGPT) to examples/teardowns/
- Add --vs comparison mode to teardown.py (head-to-head across 7 dimensions)
- Add --baseline flag to run_evals.py (skill vs vanilla Claude behavioral lift)

Phase 2 - Amplifiers:
- Add --output social format to teardown.py (thread-ready summary)
- Convert teardown.py to async parallel API calls (~4x faster)
- Expand eval coverage: 30 cases across 3 plugins (was 14 across 2)

Phase 3 - New Plugin + Community Loop:
- Add PM Interview Prep plugin (321-line SKILL.md, 6 eval cases)
- Add teardown-on-issue GitHub Action with 5/day rate limit
- Update README with all new features, sections, and badge counts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

shahcolate · 2026-03-15T12:43:28Z

Test Results

All 10 validation checks passing.

| # | Test | Result | Details |
|---|------|--------|---------|
| 1 | Lint JSON files | ✅ Pass | 10/10 JSON files valid |
| 2 | Validate plugin manifests | ✅ Pass | 3/3 plugins have name, version, skills |
| 3 | Validate SKILL.md files | ✅ Pass | 3/3 SKILL.md files exist at declared paths |
| 4 | Python syntax check | ✅ Pass | `teardown.py` and `run_evals.py` parse clean |
| 5 | Eval case structure | ✅ Pass | 30 cases, 100 criteria; no dupes, no missing fields |
| 6 | CLI smoke tests | ✅ Pass | Both scripts load and respond to `--help` |
| 7 | Marketplace validation | ✅ Pass | 3/3 plugin source paths resolve |
| 8 | Example teardown format | ✅ Pass | 4/4 teardowns with all 6 dimensions + footer |
| 9 | GitHub Action YAML | ✅ Pass | 4/4 workflow files valid YAML |
| 10 | CLI flag validation | ✅ Pass | `--vs`, `--output social`, `--baseline` registered; `--vs` + `--output social` correctly rejected |
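
The flag rejection checked in test 10 could be implemented roughly like this with `argparse`. This is a sketch under assumptions: only `--vs` and `--output social` come from this PR, while the `markdown` default, the `COMPETITOR` metavar, and the error message are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="teardown.py")
    parser.add_argument("product")
    parser.add_argument("--vs", metavar="COMPETITOR", default=None)
    # Only "social" is named in the PR; "markdown" is a hypothetical default.
    parser.add_argument("--output", choices=["markdown", "social"], default="markdown")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.vs is not None and args.output == "social":
        # parser.error() prints usage to stderr and exits with status 2.
        parser.error("--vs cannot be combined with --output social")
    return args
```

Checking the combination after parsing keeps each flag usable on its own while still failing fast, with a usage message, when both are supplied.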

Eval coverage

| Plugin | Cases | Criteria |
|--------|-------|----------|
| strategic-pm | 12 | 39 |
| product-writing-studio | 12 | 39 |
| pm-interview-prep | 6 | 22 |
| Total | 30 | 100 |
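
A structure check that produces counts like those in the table above might look like the following sketch. The field names `id`, `prompt`, and `criteria` are a hypothetical schema for illustration; the actual layout of `cases.json` is not shown in this PR.

```python
from collections import Counter

# Hypothetical required fields; the real cases.json schema may differ.
REQUIRED_FIELDS = ("id", "prompt", "criteria")


def check_cases(cases: list[dict]) -> dict:
    """Tally cases and criteria, and report duplicate ids and missing fields."""
    id_counts = Counter(case["id"] for case in cases if "id" in case)
    return {
        "cases": len(cases),
        "criteria": sum(len(case.get("criteria", [])) for case in cases),
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "missing_fields": [
            (case.get("id"), field)
            for case in cases
            for field in REQUIRED_FIELDS
            if field not in case
        ],
    }
```

Running it once per plugin and summing the `cases` and `criteria` values would reproduce the Total row.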

Not run

- Behavioral evals (`run_evals.py`): requires `ANTHROPIC_API_KEY` for live API calls. Run manually or via Actions → Behavioral Eval workflow.
- Teardown-on-issue Action: requires a labeled issue on the remote repo to trigger.

…lugin/

The validation script was resolving skill paths relative to the directory containing plugin.json (.claude-plugin/), but skill paths in plugin.json are relative to the plugin root directory (one level up).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
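
The path fix described in this commit can be illustrated with `pathlib`. This is a sketch, not the validation script itself: the function name `resolve_skill_path` and the example paths are hypothetical.

```python
from pathlib import Path


def resolve_skill_path(manifest_path: Path, skill_rel: str) -> Path:
    """Resolve a skill path against the plugin root, the parent of .claude-plugin/."""
    # manifest_path is <plugin root>/.claude-plugin/plugin.json,
    # so the plugin root is two parents up from the manifest file.
    plugin_root = manifest_path.parent.parent
    return plugin_root / skill_rel


# Hypothetical layout for illustration only.
manifest = Path("plugins/strategic-pm/.claude-plugin/plugin.json")
fixed = resolve_skill_path(manifest, "skills/SKILL.md")
buggy = manifest.parent / "skills/SKILL.md"  # resolves inside .claude-plugin/
```

Here `fixed` points at `plugins/strategic-pm/skills/SKILL.md`, while `buggy` points inside `.claude-plugin/`, which is the mismatch the commit message describes.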

shahcolate merged commit 16d2aeb into main Mar 15, 2026
3 checks passed

This was referenced Mar 15, 2026

Add eval cases for product-writing-studio pws-008: tone mismatch detection #1

Closed

Add --verbose flag to run_evals.py for full subject response output #2

Closed

Document cost breakdown per model in evals/README.md #3

Closed

shahcolate merged 2 commits into main from viral-differentiation-enhancements
