Problem or Motivation
Problem Statement
OpenMAIC's current architecture has inherent limitations when processing large educational materials:
- MAX_PDF_CONTENT_CHARS = 50,000: Text content is truncated, limiting classroom scope
- MAX_VISION_IMAGES = 20: Image quota is exceeded by most textbooks
- No Chapter Selection: Users must upload entire books and accept system failures for large documents
When students attempt to self-study large textbooks (170+ pages with 200+ images, such as Marketing Management or Physics Grade 8), the system encounters:
- Token Overflow: Text truncation to 50K characters loses important context
- Image Quota Overflow: 30+ images reduced to 20 via sampling, losing content fidelity
- User Experience: No way to select specific chapters before processing
This prevents OpenMAIC from being practical for traditional textbook-based learning scenarios.
Proposed Solution
Proposed Solution: Textbook Chapter Selection Plugin
We have developed and tested a lightweight, zero-cost plugin that enables chapter-level content filtering without requiring AI-based image classification or filesystem state management.
Architecture Overview
Upload PDF → Extract TOC (zero AI cost)
→ Show chapter tree UI
→ Student selects chapters
→ Compute pageRange
→ Feed to existing parsePDF() pipeline with pageRange parameter
Key Principle: Leverage the existing parsePDF() infrastructure's already-implemented pageRange support rather than adding new preprocessing layers.
How It Works
1. TOC Extraction (lib/textbook/toc-extractor.ts)
Three-strategy approach to extract chapter structure:
- Strategy 1: PDF outline/bookmarks (when valid)
- Strategy 2: Line-by-line text parsing (strict regex with ^ $ anchors)
- Strategy 3: Even-chunk fallback
Cost: 1-3 seconds, zero API calls, zero fees.
Example output:
{
  toc: [
    { id: "ch1", title: "Introduction", level: 1, pageStart: 1, pageEnd: 25,
      children: [
        { id: "ch1-s1", title: "Background", level: 2, pageStart: 1, pageEnd: 12 },
        { id: "ch1-s2", title: "Overview", level: 2, pageStart: 13, pageEnd: 25 }
      ]
    },
    // ... more chapters
  ],
  totalPages: 170,
  title: "Physics Grade 8 Upper"
}
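To make Strategies 2 and 3 concrete, here is a minimal sketch of an anchored-regex line parser with an even-chunk fallback. It is not the code in lib/textbook/toc-extractor.ts: the `TocEntry` shape is copied from the example output, while the `CHAPTER_LINE` pattern, the function names `parseTocLines`, `evenChunks`, and `extractToc`, and the 20-page chunk size are all assumptions made for illustration.

```typescript
// Hypothetical type mirroring the example output; the real definitions
// live in lib/textbook/types.ts and may differ.
interface TocEntry {
  id: string;
  title: string;
  level: number;
  pageStart: number;
  pageEnd: number;
}

// Assumed heading shape: "Chapter 3 Forces ..... 41". The ^ and $ anchors
// force the whole line to match, so body text that merely mentions
// "Chapter 3" is ignored.
const CHAPTER_LINE = /^Chapter\s+(\d+)[\s:.-]+(.+?)[\s.]+(\d+)$/;

// Strategy 2 (sketch): scan TOC lines and close each chapter on the page
// before the next chapter starts.
function parseTocLines(lines: string[], totalPages: number): TocEntry[] {
  const starts: { num: string; title: string; page: number }[] = [];
  for (const raw of lines) {
    const m = CHAPTER_LINE.exec(raw.trim());
    if (!m) continue;
    const page = parseInt(m[3], 10);
    if (page >= 1 && page <= totalPages) {
      starts.push({ num: m[1], title: m[2].trim(), page });
    }
  }
  starts.sort((a, b) => a.page - b.page);
  return starts.map((s, i) => ({
    id: `ch${s.num}`,
    title: s.title,
    level: 1,
    pageStart: s.page,
    pageEnd:
      i + 1 < starts.length
        ? Math.max(s.page, starts[i + 1].page - 1)
        : totalPages,
  }));
}

// Strategy 3 (sketch): split the book into fixed-size page chunks.
function evenChunks(totalPages: number, chunkSize = 20): TocEntry[] {
  const out: TocEntry[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, totalPages);
    out.push({
      id: `part${out.length + 1}`,
      title: `Pages ${start}-${end}`,
      level: 1,
      pageStart: start,
      pageEnd: end,
    });
  }
  return out;
}

// Fall back to even chunks only when no heading line matched.
function extractToc(lines: string[], totalPages: number): TocEntry[] {
  const parsed = parseTocLines(lines, totalPages);
  return parsed.length > 0 ? parsed : evenChunks(totalPages);
}
```

Because the pattern is anchored on both ends, a false positive requires an entire line to look like a heading, which keeps the parser conservative at the cost of missing unusual TOC layouts. Those cases then drop through to the chunk fallback.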
2. Chapter Selector UI (components/textbook/chapter-selector.tsx)
- Displays expandable chapter tree with checkboxes
- Shows page count for each section
- Real-time status display: "Selected 3 sections · ~60 pages"
- Computes merged page range when confirmed
UI Flow:
TextbookManager (upload)
↓
TextbookManager (extracting state)
↓
ChapterSelector (tree + checkboxes)
↓
Callback: (pdfFile, pageRange: {start, end}, chapterTitle)
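The merged page range and the "~60 pages" status figure can be derived from the checked entries alone. The sketch below is one possible way to do it, not the code in chapter-selector.tsx; `PageSpan`, `mergePageRange`, and `countSelectedPages` are hypothetical names. It assumes the merge takes the smallest start and largest end, since the callback carries a single `{start, end}` pair.

```typescript
// Hypothetical minimal view of a TOC entry: only the page bounds matter here.
interface PageSpan {
  pageStart: number;
  pageEnd: number;
}

interface PageRange {
  start: number;
  end: number;
}

// Collapse all checked entries into one range: earliest start, latest end.
// Returns null when nothing is selected so the caller can disable "confirm".
function mergePageRange(selected: PageSpan[]): PageRange | null {
  if (selected.length === 0) return null;
  let start = Infinity;
  let end = -Infinity;
  for (const s of selected) {
    start = Math.min(start, s.pageStart);
    end = Math.max(end, s.pageEnd);
  }
  return { start, end };
}

// Count distinct pages so that a chapter and its own sub-sections,
// when both are checked, are not counted twice in the status line.
function countSelectedPages(selected: PageSpan[]): number {
  const pages = new Set<number>();
  for (const s of selected) {
    for (let p = s.pageStart; p <= s.pageEnd; p++) pages.add(p);
  }
  return pages.size;
}
```

One consequence of a single merged range is that selecting two non-adjacent sections also pulls in the pages between them, so the status count can be smaller than the range actually parsed.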
3. Page Range Integration
Page range flows through the existing infrastructure:
Client Side (app/page.tsx):
const [chapterPageRange, setChapterPageRange] =
  useState<{ start: number; end: number } | null>(null);

const handleTextbookReady = (
  pdfFile: File,
  pageRange: { start: number; end: number },
  title: string
) => {
  setForm(prev => ({ ...prev, pdfFile }));
  setChapterPageRange(pageRange);
};

// In handleGenerate():
const sessionState = {
  // ... other fields
  chapterPageRange: chapterPageRange || undefined,
};
sessionStorage.setItem('generationSession', JSON.stringify(sessionState));
Generation Preview (app/generation-preview/page.tsx):
// When building FormData for PDF parsing:
if (currentSession.chapterPageRange) {
  parseFormData.append('pageRangeStart', String(currentSession.chapterPageRange.start));
  parseFormData.append('pageRangeEnd', String(currentSession.chapterPageRange.end));
}
API Route (app/api/parse-pdf/route.ts):
// Read page range from FormData
const pageRangeStart = formData.get('pageRangeStart') as string | null;
const pageRangeEnd = formData.get('pageRangeEnd') as string | null;
const pageRange = pageRangeStart && pageRangeEnd
  ? { start: parseInt(pageRangeStart, 10), end: parseInt(pageRangeEnd, 10) }
  : undefined;

// Pass to existing parsePDF with pageRange option
const result = await parsePDF(config, buffer, pageRange ? { pageRange } : undefined);
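Because the two form fields arrive as untrusted strings, the route could validate them before building `pageRange`. The helper below is a hedged sketch of that idea and is not part of the current change. `parsePageRange` is a hypothetical name, and rejecting an invalid range by returning `undefined` (which falls back to parsing the whole document) is an assumed policy.

```typescript
interface PageRange {
  start: number;
  end: number;
}

// Turn the raw form values into a page range, or undefined when either
// value is missing, non-numeric, fractional, below 1, or reversed.
function parsePageRange(
  startRaw: string | null,
  endRaw: string | null,
): PageRange | undefined {
  if (startRaw === null || endRaw === null) return undefined;
  const start = Number(startRaw);
  const end = Number(endRaw);
  // Number("abc") is NaN and Number("3.5") is fractional; both fail here.
  if (!Number.isInteger(start) || !Number.isInteger(end)) return undefined;
  // Pages are 1-indexed, and the range must not be reversed.
  if (start < 1 || end < start) return undefined;
  return { start, end };
}
```

Without a check like this, `parseInt` on a malformed value yields `NaN`, and a loop bounded by `NaN` never runs, so the parser would silently return no pages instead of reporting a bad request.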
PDF Parser (lib/pdf/pdf-providers.ts):
// parseWithUnpdf already supports pageRange:
async function parseWithUnpdf(
  pdf: PDFDocumentProxy,
  options?: { pageRange?: { start: number; end: number } }
): Promise<ParsedPdfContent> {
  const numPages = pdf.numPages;
  const startPage = options?.pageRange?.start ?? 1;
  const endPage = options?.pageRange?.end ?? numPages;

  // Extract text only from specified pages
  for (let pageNum = startPage; pageNum <= Math.min(endPage, numPages); pageNum++) {
    // ... text extraction
  }

  // Extract images only from specified pages
  for (let pageNum = startPage; pageNum <= Math.min(endPage, numPages); pageNum++) {
    // ... with size-based filtering: MIN_IMAGE_DIM = 50px
  }
}
4. Image Size Filtering (Bonus Optimization)
Small decorative images (either dimension under 50 px, or total area under 5,000 px²) are filtered out at extraction time:
const MIN_IMAGE_DIM = 50; // px
const MIN_IMAGE_AREA = 5000; // px²

if (imgData.width < MIN_IMAGE_DIM ||
    imgData.height < MIN_IMAGE_DIM ||
    imgData.width * imgData.height < MIN_IMAGE_AREA) {
  filteredCount++;
  continue; // Skip tiny decorative images
}
On a typical textbook excerpt this reduces roughly 35 images to 15-20, which fits within the MAX_VISION_IMAGES quota of 20.
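The same rule can be written as a standalone predicate, which makes the thresholds easy to unit test outside the extraction loop. This is an illustrative sketch only: the two constants come from the snippet above, while `ImageSize`, `isDecorative`, and `filterImages` are hypothetical names.

```typescript
const MIN_IMAGE_DIM = 50; // px, minimum width and height
const MIN_IMAGE_AREA = 5000; // px², minimum width * height

interface ImageSize {
  width: number;
  height: number;
}

// True when the image is too small to carry teaching content:
// either side is under 50 px, or the total area is under 5000 px².
function isDecorative(img: ImageSize): boolean {
  return (
    img.width < MIN_IMAGE_DIM ||
    img.height < MIN_IMAGE_DIM ||
    img.width * img.height < MIN_IMAGE_AREA
  );
}

// Split a list of images into the ones to keep and a count of those dropped.
function filterImages<T extends ImageSize>(
  images: T[],
): { kept: T[]; filteredCount: number } {
  const kept = images.filter((img) => !isDecorative(img));
  return { kept, filteredCount: images.length - kept.length };
}
```

The area check is not redundant with the dimension checks: a 60×60 image passes both side tests but has an area of 3,600 px², so only the area threshold removes it.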
Results on Physics Grade 8 Textbook (170 pages, 220+ images)
| Metric | Before Plugin | After Plugin |
| --- | --- | --- |
| Chapter Selection | ❌ Not possible | ✅ 6 chapters selectable |
| Pages 1-15 | 35 images, crashes | 15-18 images (filtered), succeeds |
| Text Content | 50K chars (truncated) | Full chapter context |
| AI Cost | ❌ Heavy preprocessing | ✅ Zero preprocessing cost |
| Time to Classroom | Failed | 45 seconds (1 chapter) |
Plugin Implementation Status
Completed Components
✅ lib/textbook/types.ts (35 lines)
- Simplified types: TocEntry, TocResult
✅ lib/textbook/toc-extractor.ts (325 lines)
- Three-strategy TOC extraction
- No external AI dependencies
✅ app/api/textbook/extract-toc/route.ts (44 lines)
- Fast, zero-cost API endpoint
✅ components/textbook/chapter-selector.tsx (197 lines)
- Full-featured chapter tree UI
- No preprocessing triggers
✅ components/textbook/textbook-manager.tsx (151 lines)
- Upload → TOC extraction → chapter selection flow
✅ app/page.tsx (modifications)
- Integrated textbook manager
- chapterPageRange state management
✅ app/generation-preview/types.ts (modifications)
- Added chapterPageRange to GenerationSessionState
✅ app/generation-preview/page.tsx (modifications)
- FormData pageRange passing
✅ app/api/parse-pdf/route.ts (modifications)
- pageRange reading and forwarding
✅ lib/pdf/pdf-providers.ts (modifications)
- Image size-based filtering (50×50px minimum)
- Comprehensive logging
Testing
- ✅ TypeScript compilation: 0 errors
- ✅ End-to-end flow: TOC extraction → selection → classroom generation
- ✅ Real textbook testing: Physics Grade 8 (170 pages)
- ✅ Page range filtering: Confirmed only selected pages parsed
- ✅ Image reduction: 35 images → 15-18 images after filtering
Architecture Principles
1. Zero New Dependencies
No Gemini API calls, no additional preprocessing pipelines, no new state management.
2. Minimal Core Changes
Page-range support touches only 3 files in OpenMAIC core (all reversible):
app/api/parse-pdf/route.ts: +6 lines
app/generation-preview/page.tsx: +5 lines
app/generation-preview/types.ts: +2 lines
3. Plugin-Style Decoupling
All textbook-specific code lives in:
lib/textbook/
components/textbook/
app/api/textbook/
app/page.tsx (light integration only)
Future Upgrade-Proof: If OpenMAIC's PDF parsing API changes, only lib/textbook/ needs updates. Core generation pipeline remains untouched.
4. Reuses Existing Infrastructure
- ✅ Leverages parsePDF(config, buffer, options) pageRange support
- ✅ Reuses storePdfBlob() for IndexedDB storage
- ✅ Reuses uniformSample() for image sampling
- ✅ Reuses existing generation pipeline unchanged
Benefits for OpenMAIC Users
- Enable Large Textbook Learning: Students can now self-study entire textbooks chapter-by-chapter
- Cost Efficiency: Zero API overhead for TOC extraction
- Better Image Handling: Automatic filtering of decorative images
- Improved UX: Clear feedback on page counts and chapter selection
- Production-Ready: Tested end to end on a real 170-page textbook, with zero TypeScript compilation errors
Why This Approach vs. Full AI Preprocessing
Rejected Approach: AI image classification (Gemini Flash) to mark images as "essential" vs. "decorative"
- ❌ $0.02-0.03 per textbook
- ❌ 5-10 minutes processing per book
- ❌ Complex state management (manifests, caching)
- ❌ Async preprocessing flow
- ❌ High maintenance burden
Our Approach: Size-based filtering + page range selection
- ✅ Zero cost
- ✅ Instant extraction (1-3 seconds)
- ✅ Stateless
- ✅ Synchronous UI flow
- ✅ Maintainable
Alternatives Considered
No response
Area
Other
Additional Context
