A Claude Skill for Scrapling — an adaptive web-scraping framework that goes from a single request to a full-scale async crawl.
BeautifulSoup4 is roughly 780× slower than Scrapling on the official parsing benchmark, yet most Python scraping tutorials still default to it. The official Scrapling docs cover the API surface. This skill covers what to actually do with it — and is verified against the real v0.4.9 API, not from memory.
Three high-value capabilities aren't obvious from the README:
- Self-healing selectors: a two-phase workflow (`auto_save` then `adaptive`), not a magic flag.
- Fetcher selection: knowing when to use `Fetcher` vs `StealthyFetcher` vs `DynamicFetcher` (and their Session/Async variants) means reading several doc pages. This skill ships a one-glance decision tree.
- MCP + spiders: the native MCP server, multi-session spiders, lifecycle hooks (`on_scraped_item`, `is_blocked`, `retry_blocked_request`), and pause/resume crawling, with copy-paste-correct invocations.
- Fetcher selection decision tree (HTTP / stealth / dynamic × one-shot / session / async)
- Self-healing selectors with the correct two-phase `auto_save` → `adaptive` workflow
- Spider architecture: concurrency, download delays, real lifecycle hooks, multi-session routing
- Correct `ProxyRotator` usage (`proxy_rotator=`, not `proxy=rotator.next()`)
- Native MCP server wiring (`scrapling mcp`, stdio + HTTP transports)
- CLI usage for one-off scraping without writing a script
- BeautifulSoup → Scrapling migration cheat sheet
- A gotchas checklist (the `scrapling install` prerequisite, the v0.4.9 proxy-leak fix, etc.)
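
To make the fetcher-selection grid (HTTP / stealth / dynamic × one-shot / session / async) concrete, here is a minimal pure-Python sketch of how such a decision could be encoded. It does not import Scrapling. The function `choose_fetcher`, the `FetcherChoice` type, and the input flags are hypothetical names invented for illustration, and the ordering of the checks (anti-bot protection first, then JavaScript rendering, then plain HTTP) is an assumption, not the skill's actual decision tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FetcherChoice:
    """Result of the (hypothetical) decision: which family, in which mode."""
    family: str  # "Fetcher", "StealthyFetcher", or "DynamicFetcher"
    mode: str    # "one-shot", "session", or "async"


def choose_fetcher(
    needs_js: bool,
    anti_bot: bool,
    many_requests: bool = False,
    use_async: bool = False,
) -> FetcherChoice:
    """Pick a fetcher family and usage mode from coarse site traits.

    Assumed rules (illustrative only):
      - anti-bot protection -> stealth family
      - otherwise, JavaScript-rendered content -> dynamic family
      - otherwise -> plain HTTP family
    """
    if anti_bot:
        family = "StealthyFetcher"
    elif needs_js:
        family = "DynamicFetcher"
    else:
        family = "Fetcher"

    # Mode axis: async wins if requested, then session reuse, then one-shot.
    if use_async:
        mode = "async"
    elif many_requests:
        mode = "session"
    else:
        mode = "one-shot"

    return FetcherChoice(family, mode)


if __name__ == "__main__":
    # A static page fetched once, and a protected site crawled repeatedly.
    print(choose_fetcher(needs_js=False, anti_bot=False))
    print(choose_fetcher(needs_js=True, anti_bot=True, many_requests=True))
```

The sketch returns names rather than classes so it stays runnable without the library installed. Mapping a `(family, mode)` pair to the concrete Session or Async class is left to the skill's own decision tree.
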
# As a Claude Skill
claude skill install https://github.com/Thanane15M/scrapling-skill
# or copy SKILL.md into .claude/skills/scrapling/
# The underlying library (pin the version)
pip install "scrapling[fetchers]==0.4.9"
scrapling install  # required before any browser-based fetcher

This skill targets Scrapling v0.4.x (verified on 0.4.9). The 0.4 line introduced breaking API changes (async spiders, `ProxyRotator`). If you're on 0.3.x, upgrade before using these patterns.
Scrapling is authored by Karim Shoair (D4Vinci) and licensed BSD-3-Clause. It's intended for educational and research use — respect each target site's robots.txt, terms of service, and applicable data-protection law (e.g. GDPR). This skill repository is MIT-licensed; the Scrapling library is not.
MIT