Auto-detect Apple MPS device #17
Merged
- Convert emoji feature list to plain markdown bullets
- Qualify the 20x faster/cheaper claim as 'on an L4 GPU' (it's 20.1x on L4, 7.1x on A100 per the blog) so it doesn't read as universal
- Remove all em/en dashes from prose per style preference
Add default_device() (cuda -> mps -> cpu) and use it in Extractor and Pipeline when device is unspecified. Previously device=None fell back straight to CPU on Macs even when MPS was available, silently leaving Apple acceleration unused.
Pull request overview
Adds a shared device auto-detection helper so Extractor() and Pipeline() select Apple MPS on macOS when CUDA is unavailable, instead of falling straight to CPU.
Changes:
- Introduce `default_device()` in `model_utils` (CUDA → MPS → CPU).
- Use `default_device()` when `Extractor(device=None)` and `Pipeline(devices=None)` resolve their default device(s).
- Update README formatting/copy and attribution text.
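
The CUDA → MPS → CPU order described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `pick_device` is a hypothetical helper added here so the priority logic can be exercised without a GPU, and the fallback to `"cpu"` when PyTorch is not installed is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: apply the cuda -> mps -> cpu priority."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def default_device() -> str:
    """Sketch of a default_device()-style helper (not the PR's actual code)."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch there is nothing to accelerate.
        return "cpu"
    cuda = torch.cuda.is_available()
    # torch.backends.mps is absent on older PyTorch builds, so probe defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = mps_backend is not None and mps_backend.is_available()
    return pick_device(cuda, mps)
```

Keeping the priority logic separate from the hardware probes makes the ordering easy to unit-test on any machine.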
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/pulpie/model_utils.py` | Adds `default_device()` helper for consistent device selection. |
| `src/pulpie/extractor.py` | Switches default device selection to `default_device()`. |
| `src/pulpie/pipeline.py` | Uses `default_device()` when CUDA is unavailable and `devices` is not provided. |
| `README.md` | Reformats and rewrites copy; also changes footer attribution. |
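
One way the `pipeline.py` behavior in the table might look is sketched below. The function name `resolve_devices`, its parameters, and the choice to enumerate every visible CUDA device are all assumptions made for illustration; the PR only states that `default_device()` is used when CUDA is unavailable and `devices` is not provided.

```python
from typing import List, Optional, Sequence


def resolve_devices(
    devices: Optional[Sequence[str]],
    cuda_count: int,
    fallback: str,
) -> List[str]:
    """Hypothetical resolution of a Pipeline-style `devices` argument.

    devices:    explicit device list from the caller, or None
    cuda_count: number of visible CUDA devices (e.g. torch.cuda.device_count())
    fallback:   result of a default_device()-style helper, e.g. "mps" or "cpu"
    """
    if devices is not None:
        # An explicit choice always wins.
        return list(devices)
    if cuda_count > 0:
        # Assumption: with CUDA present, use one worker per visible GPU.
        return [f"cuda:{i}" for i in range(cuda_count)]
    # No CUDA: fall back to the auto-detected device (MPS on Apple Silicon).
    return [fallback]
```

On an Apple Silicon machine this sketch would resolve `devices=None` to `["mps"]` rather than `["cpu"]`, which is the change in default that the PR describes.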
Comment on lines +26 to +30
Pulpie extracts the main content from raw HTML, stripping navigation, ads, sidebars, and footers. It uses small encoder models that label every block in a single forward pass, approaching state-of-the-art extraction quality while running up to 20x faster and 20x cheaper than autoregressive extractors on an L4 GPU.
| **⚡ Fast** — an encoder labels every block in one forward pass (13.7 pages/sec on an L4) </br> | ||
| **🎯 Accurate** — matches SOTA quality: 0.862–0.873 ROUGE-5 F1 on WebMainBench </br> | ||
| **🪶 Small** — the recommended model is 210M params, fits on any GPU </br> | ||
| **💸 Cheap** — clean 1 billion pages for ~$7,900 vs ~$159,000 for the leading decoder </br> | ||
| **📦 Simple** — `pip install pulpie`, then `Extractor().extract(html)` </br> | ||
| **🔌 Batched** — overlapped CPU+GPU pipeline scales across multiple GPUs </br> | ||
- **Fast.** An encoder labels every block in one forward pass (13.7 pages/sec on an L4).
- **Accurate.** Matches state-of-the-art quality: 0.862 to 0.873 ROUGE-5 F1 on WebMainBench.
- **Small.** The recommended model is 210M parameters and fits on any GPU.
<div align="center">
Built by <a href="https://github.com/chonkie-inc">Chonkie</a>, the open-source work behind <a href="https://usefeyn.com">Feyn</a>.
Built by <a href="https://usefeyn.com">Feyn</a>.
`Extractor()`/`Pipeline()` with no `device` previously checked only CUDA, falling straight to CPU on Macs even when MPS was available. Adds `default_device()` (cuda → mps → cpu) and uses it in both. Verified: `Extractor()` now lands on `mps` on Apple Silicon and produces identical output to CPU; 65 parity tests pass.