Transform cramped homework PDFs into spacious, answer-friendly documents. Each question gets its own page with plenty of room to write.
Before: cramped questions. After: room to write.
- Preserves Everything — Visual snapshots keep diagrams, math formulas, and formatting intact
- Smart Detection — Auto-detects questions (1., Q1, Problem 1, Exercise 1) or use custom regex
- Multi-page Questions — Questions spanning multiple source pages are stacked seamlessly, with source margins stripped so no blank gaps appear between segments
- ≥ 50 % Writing Space Guaranteed — The last output page of every question contains at most half a page of content, leaving at least half blank for handwritten answers
- Content-Aware Page Splits — When a tall question must overflow to a second page, splits snap to the nearest whitespace row so no text line or table row is cut in half
- Title Page Generation — Automatically detects course number and homework number from PDF or filename
- Page Numbers — Each question page shows "Question X of Y" (multi-page: "page k of N") at the bottom
- Batch Processing — Upload multiple PDFs, download as ZIP or merged into a single PDF
- Smart Merging — Merge multiple homework PDFs sorted by homework number (HW1, HW2, HW3...)
- Works Offline — Run locally or build a standalone desktop app
# Install
pip install streamlit PyMuPDF Pillow
# Run
streamlit run app.py
Opens at http://localhost:8501
┌─────────────────┐     ┌─────────────────┐
│ Original PDF    │     │ Title Page      │
├─────────────────┤     │ MATH 201 - HW 3 │
│ 1. Question A   │     │ (8 Questions)   │
│ 2. Question B   │ ──► ├─────────────────┤
│ 3. Question C   │     │ 1. Question A   │
│ ...             │     │ (stem)          │
└─────────────────┘     │ (sub-parts)     │ ← segments stacked,
                        │                 │   margins stripped
                        │ ≥ 50% blank     │ ← writing space
                        │ Question 1 of 8 │
                        ├─────────────────┤
                        │ 2. Question B   │
                        │ ...             │
                        └─────────────────┘
Layout algorithm (per question):
- All source-page segments are cropped, margin-stripped, and stacked into one image
- The stacked image is chunked to fit output pages:
  - Non-last pages: filled to capacity, split at the nearest whitespace row (never mid-line)
  - Last page: ≤ 50 % content → ≥ 50 % blank writing area
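The chunking step above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `chunk_rows` and `snap_to_blank` are hypothetical names, the stacked image is reduced to one boolean per pixel row (`True` means the row is whitespace), and "nearest whitespace row" is assumed to mean the closest blank row at or before the page capacity. When no blank row is available, the sketch falls back to a hard cut.

```python
def snap_to_blank(blank_rows, start, limit):
    """Return the blank row closest to `limit`, searching backwards.

    Falls back to a hard cut at `limit` when no blank row exists in
    (start, limit]; that fallback is an assumption of this sketch.
    """
    for row in range(limit, start, -1):
        if blank_rows[row]:
            return row
    return limit


def chunk_rows(blank_rows, page_height):
    """Split a stacked image (one bool per pixel row) into (start, end) chunks."""
    if page_height < 2:
        raise ValueError("page_height must be at least 2 rows")
    total = len(blank_rows)
    half = page_height // 2
    chunks = []
    start = 0
    # Keep cutting pages until the remainder fits in half a page.
    while total - start > half:
        # Never cut past the page capacity, and always leave at least
        # one row for the last page.
        limit = min(start + page_height, total - 1)
        cut = snap_to_blank(blank_rows, start, limit)
        chunks.append((start, cut))
        start = cut
    # Last page: at most half a page of content, the rest stays blank.
    chunks.append((start, total))
    return chunks
```

For a 25-row image with a 10-row page and blank rows at 8, 17, and 21, this yields four chunks, the last holding 4 rows (under half a page). Each `(start, end)` pair is a half-open row range that would then be cropped from the stacked image.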
When you upload multiple homework PDFs, you can download them as:
- Individual PDFs (ZIP file)
- Single Merged PDF — All homework combined, sorted by homework number, with divider pages
┌─────────────────┐     ┌─────────────────┐
│ HW1.pdf         │     │ Merged PDF      │
│ HW2.pdf         │     ├─────────────────┤
│ HW3.pdf         │ ──► │ HW1 Title Page  │
└─────────────────┘     │ HW1 Questions   │
                        │ ─── Divider ─── │
                        │ HW2 Title Page  │
                        │ HW2 Questions   │
                        │ ─── Divider ─── │
                        │ HW3 Title Page  │
                        │ HW3 Questions   │
                        └─────────────────┘
Filename: MATH_201_HW1-3_spaced_merged.pdf
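The merge order and output filename could be derived along these lines. This is a sketch under assumptions: the helper names (`hw_number`, `sort_by_homework`, `merged_filename`) are hypothetical, the homework number is assumed to be read from the filename, and the `HW1-3` span is assumed to mean lowest to highest homework number.

```python
import re


def hw_number(filename):
    """Extract the homework number from a filename, or None if absent."""
    m = re.search(r"(?:hw|homework)[\s_-]*(\d+)", filename, re.IGNORECASE)
    return int(m.group(1)) if m else None


def sort_by_homework(filenames):
    """Sort numerically by homework number; files without one go last."""
    return sorted(
        filenames,
        key=lambda f: (hw_number(f) is None, hw_number(f) or 0),
    )


def merged_filename(course, numbers):
    """Build a name like MATH_201_HW1-3_spaced_merged.pdf."""
    lo, hi = min(numbers), max(numbers)
    span = f"HW{lo}" if lo == hi else f"HW{lo}-{hi}"
    return f"{course.replace(' ', '_')}_{span}_spaced_merged.pdf"
```

Sorting on the parsed integer rather than the raw string keeps `HW10.pdf` after `HW2.pdf`, which a plain alphabetical sort would not.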
Supported Patterns
| Pattern | Example | Regex |
|---|---|---|
| Numbered | 1., 2., 3. | ^\d+\. |
| Q-style | Q1, Q2 | ^Q\d+ |
| Problem | Problem 1 | ^Problem\s+\d+ |
| Exercise | Exercise 1 | ^Exercise\s+\d+ |
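As an illustration of how these presets behave, the sketch below applies one of the regexes from the table to a list of text lines. The `PRESETS` dictionary and `find_question_lines` are hypothetical names for this example; the app's own detection may work on PDF text blocks rather than plain lines.

```python
import re

# Regexes copied from the table above.
PRESETS = {
    "Numbered": r"^\d+\.",
    "Q-style": r"^Q\d+",
    "Problem": r"^Problem\s+\d+",
    "Exercise": r"^Exercise\s+\d+",
}


def find_question_lines(lines, pattern):
    """Return the indices of lines whose stripped text starts a question."""
    rx = re.compile(pattern)
    return [i for i, line in enumerate(lines) if rx.match(line.strip())]
```

Note that the Numbered regex also matches a line beginning with a decimal such as `3.14`, so a custom regex (for example one that requires whitespace after the dot) may be a better fit for documents with many numeric lines.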
Title Detection
The app automatically detects course and homework information:
From PDF content:
- Course: MATH 201, CS 10100, PHYS 201A (2-4 letters + any number of digits)
- Homework: Homework 3, HW 5, Assignment 2, Problem Set 4
From filename (fallback):
- math201_hw3.pdf → MATH 201 - Homework 3
- cs10100-assignment2.pdf → CS 10100 - Assignment 2
- random_document.pdf → "Random Document" (title case)
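The filename fallback could look something like the sketch below, which reproduces the three filename examples listed here. The regexes and the name `title_from_filename` are assumptions made for illustration, not the app's actual implementation. In particular, this sketch only builds a course title when both a course and a homework number are found, and otherwise title-cases the whole name.

```python
import re
from pathlib import PurePath

# Assumed patterns: 2-4 letters followed by digits, then a homework keyword.
COURSE_RE = re.compile(r"^([A-Za-z]{2,4})[\s_-]*(\d+)")
HOMEWORK_RE = re.compile(
    r"(homework|hw|assignment|problem[\s_-]*set)[\s_-]*(\d+)",
    re.IGNORECASE,
)


def title_from_filename(filename):
    """Derive a title such as 'MATH 201 - Homework 3' from a filename."""
    stem = PurePath(filename).stem
    course = COURSE_RE.match(stem)
    if course:
        # Look for the homework keyword only after the course code.
        homework = HOMEWORK_RE.search(stem, course.end())
        if homework:
            keyword = homework.group(1).lower()
            if keyword.startswith("problem"):
                label = "Problem Set"
            elif keyword.startswith("assign"):
                label = "Assignment"
            else:
                label = "Homework"
            code = f"{course.group(1).upper()} {course.group(2)}"
            return f"{code} - {label} {homework.group(2)}"
    # No recognizable course/homework pair: title-case the cleaned-up name.
    return re.sub(r"[\s_-]+", " ", stem).strip().title()
```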
Testing
pip install -r requirements.txt
pytest
Run with coverage:
pytest --cov=app --cov-report=term-missing
Tests cover pure functions, image processing, PDF manipulation, and title extraction. Streamlit is mocked so tests run without the UI.
Building Desktop App
pip install pyinstaller
python build.py --clean
- macOS: dist/Better Homework PDF.app
- Windows: dist/Better Homework PDF/Better Homework PDF.exe
Troubleshooting
No questions detected?
- Try a different pattern preset
- Use custom regex matching your format
- Ensure PDF has searchable text (not scanned images)
macOS app won't open?
- Right-click → "Open" to bypass Gatekeeper
License: MIT. Free to use, modify, and distribute.

