feat: crawl progress reporting — live status, ETA, and completion summary

## Problem

During a long crawl (`--house both` across 8 search groups × 5 ministries × ~270 RS sessions), the tool gives no indication of:
- How many records have been collected so far
- Which query bucket is currently running and how many remain
- The estimated time to completion
- What was retrieved overall, since no final summary is printed

Users are left polling the manifest (`wc -l manifest.jsonl`) and the log (`tail crawl.log`) manually, which is confusing for non-developer researchers running the tool for actual parliamentary research.

## Requested behaviour

1. **Live progress line** — after each bucket completes: `[LS 12/40] query='NTA' ministry=EDUCATION → 8 new | total=143`
2. **ETA estimate** — rolling estimate printed after the first 10% of buckets complete
3. **Completion summary** — final table showing records by house, tag, year, PDFs downloaded, and total duration
4. **`status` subcommand** — `sansad-semantic-crawler status --out data/my-crawl/` reads a running or completed crawl and prints the summary on demand, no Python one-liners needed
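
To make items 1 and 2 concrete, here is a minimal sketch of how the progress line and ETA might be produced. `ProgressTracker`, `format_duration`, and their parameters are hypothetical names, not part of the existing codebase. The ETA here is a plain cumulative average of time per completed bucket, recomputed after every bucket; a windowed average over recent buckets would be a reasonable alternative if bucket durations vary a lot.

```python
import time
from typing import Callable, Optional


def format_duration(seconds: float) -> str:
    """Render a duration as e.g. '1h 05m' or '4m 30s'."""
    whole = int(round(seconds))
    hours, rem = divmod(whole, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes:02d}m"
    return f"{minutes}m {secs:02d}s"


class ProgressTracker:
    """Hypothetical helper: counts finished buckets and formats status lines."""

    def __init__(
        self,
        total_buckets: int,
        eta_after_fraction: float = 0.10,  # issue asks for an ETA after the first 10%
        clock: Callable[[], float] = time.monotonic,  # injectable so it can be tested
    ) -> None:
        self.total_buckets = total_buckets
        self.eta_after_fraction = eta_after_fraction
        self.clock = clock
        self.started_at = clock()
        self.done = 0
        self.total_records = 0

    def bucket_done(self, house: str, query: str, ministry: str, new_records: int) -> str:
        """Record one finished bucket and return the live progress line."""
        self.done += 1
        self.total_records += new_records
        return (
            f"[{house} {self.done}/{self.total_buckets}] query={query!r} "
            f"ministry={ministry} → {new_records} new | total={self.total_records}"
        )

    def eta_seconds(self) -> Optional[float]:
        """Estimated seconds remaining, or None until enough buckets have finished."""
        if self.done == 0 or self.done < self.eta_after_fraction * self.total_buckets:
            return None
        elapsed = self.clock() - self.started_at
        # Cumulative mean time per bucket, scaled by the buckets still to run.
        return elapsed / self.done * (self.total_buckets - self.done)
```

The crawl loop would call `bucket_done(...)` once per bucket and print the returned line, then print `format_duration(tracker.eta_seconds())` whenever `eta_seconds()` is not `None`.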

## Why

The tool is for researchers, not just developers. A full two-house crawl takes 90+ minutes. Without feedback, users cannot tell if it is working, stuck, or nearly done — which erodes trust and forces manual inspection of internal files.
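
As a starting point for the completion summary and the `status` subcommand, the counts could be derived directly from the manifest that users currently inspect by hand. The sketch below assumes `manifest.jsonl` holds one JSON object per line and that each record carries `house`, `tag`, and `year` keys; those field names are guesses, and `summarise_manifest` and `format_summary` are hypothetical helpers. PDF counts and total duration are left out because the issue does not say where they are recorded.

```python
import json
from collections import Counter
from pathlib import Path


def summarise_manifest(manifest_path):
    """Count records by house, tag, and year in a JSON Lines manifest."""
    # Assumed field names; adjust to whatever the manifest actually stores.
    counts = {"house": Counter(), "tag": Counter(), "year": Counter()}
    total = 0
    with Path(manifest_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # A running crawl may be mid-write on the last line; skip it.
                continue
            if not isinstance(record, dict):
                continue
            total += 1
            for key, counter in counts.items():
                counter[str(record.get(key, "unknown"))] += 1
    return total, counts


def format_summary(total, counts):
    """Render the counts as plain text, one line per (field, value) pair."""
    lines = [f"records: {total}"]
    for key, counter in counts.items():
        for value, n in sorted(counter.items()):
            lines.append(f"  {key}={value}: {n}")
    return "\n".join(lines)
```

Because malformed trailing lines are skipped, the same function could serve both a crawl that is still running and one that has finished.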
