feat(skills): add vibe-aesthetic — design evaluation capture + LLM prompt#160

Open
HermeticOrmus wants to merge 1 commit into VibiumDev:main from HermeticOrmus:feature/vibe-aesthetic


@HermeticOrmus


Why

New skill in the vibe-* family for design evaluation. Where vibe-explore maps what a page lets you do and vibe-recon maps the auth wall, vibe-aesthetic asks: what does this page feel like, and where is it weak?

The skill captures evidence (section screenshots at desktop + mobile, plus a design-token probe), then renders an LLM-agnostic evaluator prompt. Any host LLM can read that prompt to score the page across nine dimensions and produce an AESTHETIC.md deliverable.

Use cases the skill is built for:

  • Pre-launch design review on a deployed site
  • Post-rebrand verification — did the new identity actually land?
  • Variant comparison — score two candidates on the same rubric
  • Brand-family coherence — score sibling sites and see who's lifting / dragging

What changed

New skill bundle at skills/vibe-aesthetic/:

  • SKILL.md — entry brief, pipeline, daemon-hygiene notes
  • aesthetic.sh — runner (prep + walk + prompt render)
  • walk.sh — section-aware walker with auto-anchor discovery, viewport-stepped fallback when no <section id> elements exist
  • probe.js — design-token probe (palette frequencies, typography, surface tokens, composition counts, meta). Returns JSON.stringify(...) so it serializes cleanly through vibium eval --stdin
  • prep.sh — daemon hygiene enforcing --headless to prevent silent 0-byte captures on SSH/CI sessions with empty DISPLAY
  • PROMPT.md — LLM-agnostic evaluator template with rubric, score anchors, output spec, and discipline notes
  • examples/reference-run.md — anonymized 8-iteration trajectory (7.4 → 9.5) documenting realistic score movement, the code ceiling at ~8.7, and the photography-as-ceiling pattern
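
The viewport-stepped fallback described for walk.sh can be pictured roughly as below. This is a minimal sketch, not the code in walk.sh: the function name `viewport_steps`, its arguments, and the choice to step by exactly one viewport height are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from walk.sh.
# Prints one scroll offset per line: 0, vh, 2*vh, ... while offset < page height.
# Usage: viewport_steps PAGE_HEIGHT_PX VIEWPORT_HEIGHT_PX
viewport_steps() {
  local page_h="$1" view_h="$2" y=0

  # Guard against a zero or negative step, which would loop forever.
  [ "$view_h" -gt 0 ] || { echo "viewport height must be > 0" >&2; return 1; }

  while :; do
    echo "$y"                      # always emit the top fold first
    y=$((y + view_h))
    [ "$y" -lt "$page_h" ] || break
  done
}
```

Under these assumptions, a page shorter than one viewport yields a single offset of 0, which matches the single-fold fallback path reported for example.com.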

How to test

From the repo root after building vibium:

make build-go
mkdir -p /tmp/aesthetic-test
skills/vibe-aesthetic/aesthetic.sh /tmp/aesthetic-test https://example.com/ --quick --viewport desktop

Expected:

  • /tmp/aesthetic-test/sections/s00_top__desktop.png etc. exist and are non-zero
  • /tmp/aesthetic-test/probes/tokens_desktop.json parses as valid JSON
  • /tmp/aesthetic-test/PROMPT.filled.md contains the rubric + screenshot list + probe JSON
  • /tmp/aesthetic-test/RUN.md has the run summary
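
The expectations above could be checked mechanically with something like the following. It is a sketch under stated assumptions: `check_run_dir` is a hypothetical name, the `*__<viewport>.png` glob is inferred from the single filename shown above, and the JSON check uses `jq` or `python3` only if one of them happens to be installed.

```shell
#!/usr/bin/env bash
# Hypothetical verifier for one run directory; not part of the skill bundle.
# Usage: check_run_dir RUN_DIR [VIEWPORT]   (VIEWPORT defaults to "desktop")
check_run_dir() {
  local dir="$1" viewport="${2:-desktop}" fail=0 found=0 f

  # At least one section screenshot must exist, and none may be 0 bytes.
  for f in "$dir"/sections/*__"$viewport".png; do
    [ -e "$f" ] || continue        # unmatched glob stays literal; skip it
    found=1
    if [ ! -s "$f" ]; then
      echo "empty screenshot: $f" >&2
      fail=1
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no '$viewport' screenshots in $dir/sections" >&2
    fail=1
  fi

  # The probe output must be present and parse as JSON.
  local probe="$dir/probes/tokens_$viewport.json"
  if [ ! -s "$probe" ]; then
    echo "missing or empty probe: $probe" >&2
    fail=1
  elif command -v jq >/dev/null 2>&1; then
    jq empty "$probe" >/dev/null 2>&1 || { echo "invalid JSON: $probe" >&2; fail=1; }
  elif command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$probe" >/dev/null 2>&1 || { echo "invalid JSON: $probe" >&2; fail=1; }
  else
    echo "warning: neither jq nor python3 found; skipping JSON parse check" >&2
  fi

  # The rendered prompt and run summary must be non-empty.
  for f in "$dir/PROMPT.filled.md" "$dir/RUN.md"; do
    [ -s "$f" ] || { echo "missing or empty: $f" >&2; fail=1; }
  done

  return "$fail"
}
```

After the quick run, `check_run_dir /tmp/aesthetic-test desktop` would return 0 only if every listed artifact is present and non-empty. It does not check that PROMPT.filled.md actually contains the rubric, screenshot list, and probe JSON; that still needs a manual look.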

For a fuller test against a real multi-section page (auto-anchor discovery exercised):

skills/vibe-aesthetic/aesthetic.sh /tmp/aesthetic-full https://news.ycombinator.com/ --viewport desktop

Notes

  • The skill is read-only by construction — no login required for public pages, no destructive surface.
  • The shipped PROMPT.md template is editable — operators can retune the rubric (different dimension set, different acceptance bar, different domain register) without changing the capture pipeline.
  • The daemon-headless lesson in prep.sh was discovered in the field — without it, every capture command logs success but writes nothing. Worth documenting upstream so future skill authors don't repeat the pattern.
  • Tested against example.com (single-fold fallback path) and a real multi-section deployed site (auto-discovery captured 4 anchored sections plus top + bottom).
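
The daemon-headless note above suggests two cheap guards that other skill authors could reuse. The sketch below is illustrative only: `needs_headless` and `capture_checked` are hypothetical names, not functions from prep.sh, and how `--headless` is actually passed to the daemon is left to the real script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative and not taken from prep.sh.

# Succeeds when no X display is available (typical for SSH and CI sessions),
# which is the situation where headless mode has to be enforced.
needs_headless() {
  [ -z "${DISPLAY:-}" ]
}

# Run a capture command, then fail if the file it should have written is
# missing or 0 bytes, even when the command itself exited 0.
# Usage: capture_checked OUT_FILE COMMAND [ARGS...]
capture_checked() {
  local out="$1"
  shift
  "$@" || return $?                # propagate a real command failure as-is
  if [ ! -s "$out" ]; then
    echo "capture reported success but wrote nothing: $out" >&2
    return 1
  fi
}
```

Wrapping each capture this way turns the silent 0-byte case into a visible failure at the step that caused it, instead of a surprise when the evaluator prompt is rendered.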

…ompt

Walks a page at desktop + mobile, extracts a design-token probe (palette,
typography, spacing, surface tokens, composition, meta), captures section
screenshots auto-discovered from DOM anchors, and renders an LLM-agnostic
evaluator prompt that scores the page across nine dimensions:

  visual hierarchy, typography, color, spacing & layout, consistency,
  accessibility (visual), emotional impact, usability (perceived),
  archetypal coherence

The skill itself only captures evidence. The evaluation is produced by
the host LLM (Claude, GPT, etc.) reading the rendered prompt against the
attached screenshots and probe JSON. This split keeps capture mechanical
and reproducible while letting any LLM contribute the judgment.

Includes:
  - SKILL.md     — entry brief and pipeline doc
  - aesthetic.sh — runner (prep + walk + prompt render)
  - walk.sh      — section-aware walker with auto-anchor discovery and
                   viewport-stepped fallback
  - probe.js     — design-token probe (returns JSON.stringify(...))
  - prep.sh      — daemon hygiene enforcing --headless to prevent silent
                   0-byte captures on SSH/CI sessions with empty DISPLAY
  - PROMPT.md    — LLM-agnostic evaluator template with rubric, anchors,
                   output spec, and discipline notes
  - examples/reference-run.md — anonymized 8-iteration trajectory
                   (7.4 to 9.5) documenting realistic score movement, the
                   code ceiling at ~8.7, and the photography-as-ceiling
                   pattern

Tested against example.com (single-fold fallback path) and a real
multi-section page (auto-discovery captured 4 anchored sections + top +
bottom).

Sits alongside vibe-explore and vibe-recon as the design counterpart to
their capability and auth-wall mapping.