fix(crawler): enable crawlDetails on all active Cafe24 sites + color fallback chain #18

Open
KJaeKwan wants to merge 1 commit into dev from fix/crawl-details-color-fallback

KJaeKwan commented Jul 3, 2026

Contributor

Summary

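  • Fixes the main cause of import failures: most Cafe24 sites were being crawled with crawlDetails turned off. crawlDetails: true is now applied to every active site (adds 5: etcseoul, mardimercredi, matteveil, triplestore, roughside)
  • color extraction is now a multi-step fallback chain instead of a single attempt: detail-page selector value → product-name keyword match → _COLOR suffix / [COLOR] bracket pattern, with additional per-site fallbacks such as og:title parenthetical parsing
  • Includes config for the newly onboarded sites (yuse/ojos/goyowear) and stability fixes for listing-page selectors and pagination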
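The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code in cafe24-engine.ts: the function names, the `ColorResult` type, and the keyword list are all hypothetical, and the per-site fallbacks (such as og:title parsing) are left out.

```typescript
// Hypothetical sketch of a three-tier color fallback chain.
// None of these names come from the PR; they only illustrate the ordering.
type ColorSource = "detail-selector" | "name-keyword" | "name-pattern";

interface ColorResult {
  color: string;
  source: ColorSource;
}

// Assumed keyword list; the real normalizer vocabulary is not shown in the PR.
const COLOR_KEYWORDS = ["black", "white", "ivory", "navy", "indigo", "sand", "oat", "grey", "beige"];

// Tier 1: value scraped from the detail page, if the selector matched anything.
function fromDetailSelector(detailValue: string | null | undefined): string | null {
  const value = detailValue?.trim();
  return value ? value : null;
}

// Tier 2: a known color word appearing as a whole word in the product name.
function fromNameKeyword(name: string): string | null {
  const lower = name.toLowerCase();
  for (const keyword of COLOR_KEYWORDS) {
    if (new RegExp(`\\b${keyword}\\b`).test(lower)) return keyword;
  }
  return null;
}

// Tier 3: "[COLOR]" bracket or trailing "_COLOR" suffix in the product name.
function fromNamePattern(name: string): string | null {
  const bracket = name.match(/\[([^\]]+)\]/)?.[1]?.trim();
  if (bracket) return bracket;
  const suffix = name.match(/_([A-Za-z ]+)$/)?.[1]?.trim();
  return suffix ? suffix : null;
}

// Try each tier in order and report which one produced the value.
function resolveColor(detailValue: string | null | undefined, name: string): ColorResult | null {
  const detail = fromDetailSelector(detailValue);
  if (detail) return { color: detail, source: "detail-selector" };
  const keyword = fromNameKeyword(name);
  if (keyword) return { color: keyword, source: "name-keyword" };
  const pattern = fromNamePattern(name);
  if (pattern) return { color: pattern, source: "name-pattern" };
  return null; // caller decides how to handle a product with no color
}
```

Returning the source alongside the value is a design choice of this sketch; it makes it easy to log which tier filled the field when checking a small crawl. A bracket like `[COLOR]` can also hold non-color text on some storefronts, so a real implementation would presumably validate the result against the normalizer.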

Test plan

  • Confirm npx tsc --noEmit passes
  • Run a small crawl with npx dotenv -e .env -- tsx src/crawl.ts --site=etcseoul,mardimercredi,matteveil,triplestore,roughside --detail and confirm the color field is populated
  • Confirm with sample crawls that sites that already worked have not regressed

🗿 MoAI email@mo.ai.kr

…extraction fallback chain

Products were failing to import because most Cafe24 sites never had
crawlDetails enabled, so color (a NOT NULL DB field) was only ever
guessed from the listing-page product name — which rarely contains a
color word for Korean fashion titles.

- platforms.ts: add crawlDetails: true to the remaining 5 active
  Cafe24 sites that lacked it (etcseoul, mardimercredi, matteveil,
  triplestore, roughside), plus onboarding config for yuse/ojos/goyowear
  and selector/pagination robustness fixes (skeleton li filtering,
  img-alt name fallback, duplicate-page loop guard)
- cafe24-engine.ts: multi-tier color fallback — detail selector value,
  then name-based keyword match, then _COLOR suffix / [COLOR] bracket
  patterns; brand resolution now prefers config.brand over DOM guesses
  for single-brand storefronts
- selector-registry.ts / base-detail-parser.ts / strategies.ts: per-site
  color selector overrides (blankroom JS-rendered span, ojos/goyowear
  name-based color) and sienneboutique og:title parenthetical color
  parser for sites where option1 is always size, not color
- color-normalizer.ts: add Indigo, extend Sand synonym (oat)
- probe-list.ts: new listing-page selector diagnostic tool for site
  onboarding
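One way a duplicate-page loop guard like the one mentioned for platforms.ts could work is to fingerprint each listing page and stop when a fingerprint repeats. The sketch below assumes that approach; the class name `PageLoopGuard` and its method are hypothetical and the actual guard in this PR may be implemented differently.

```typescript
// Hypothetical sketch: stop paginating once a listing page repeats or is empty.
// Some storefronts keep serving the last page for any out-of-range page number,
// which would otherwise make a "next page" loop run until its page cap.
class PageLoopGuard {
  private seenSignatures = new Set<string>();

  /** Returns true when pagination should stop (empty page or already-seen page). */
  shouldStop(productUrls: string[]): boolean {
    if (productUrls.length === 0) return true;
    // Order-insensitive fingerprint of the page's product URLs.
    const signature = [...productUrls].sort().join("\n");
    if (this.seenSignatures.has(signature)) return true;
    this.seenSignatures.add(signature);
    return false;
  }
}
```

In a crawl loop this would be called once per fetched page, for example `if (guard.shouldStop(urls)) break;`, with a hard page cap kept as a second safety net.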

🗿 MoAI <email@mo.ai.kr>