Summary
Kompl needs a controlled way to rerun a compile from a selected pipeline step.
Right now retry behavior is mostly automatic:
- session retry resumes from the first non-done step
- retry-failed targets failed staging rows, failed drafts, or unextracted sources
That is useful for crashes and failed items, but it does not cover intentional reruns after the user changes settings or wants to redo only part of the pipeline.
User problem
Sometimes the pipeline did not fail, but the output is still wrong enough that the user needs a partial redo.
Examples:
- extraction was fine, but planning produced too many noisy pages
- user changes entity_promotion_threshold and wants to rerun planning
- user changes min_draft_chars and wants to rerun drafting/commit
- drafts failed or were low quality, but fetching and extraction should not run again
- user cancelled during draft and wants to restart from draft, not from ingest
Today the practical options are either too broad or too manual.
Why current retry is not enough
/api/compile/retry follows step status. It does not let the user say, "I know extract/resolve/match are fine, rerun from plan."
/api/compile/retry-failed is narrower, but only for failed items. It does not handle quality reruns where the prior step is technically done.
Manual DB manipulation can work, but it is risky because compile_progress, page_plans, sources, extractions, and Saved Links all have related state.
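The gap can be illustrated with a minimal sketch of status-driven resumption. It assumes, hypothetically, that compile_progress.steps maps each step name to a status string where "done" marks completion, and that the supported steps run in the order shown. The function name and the non-"done" status values are illustrative, not Kompl's actual code.

```python
from typing import Optional

# Assumption: the pipeline steps named in this proposal, in pipeline order.
STEP_ORDER = ["ingest_urls", "extract", "resolve", "match", "plan",
              "draft", "crossref", "commit", "schema"]


def first_non_done_step(steps: dict[str, str]) -> Optional[str]:
    """Return the first step whose status is not "done", or None if all are done."""
    for name in STEP_ORDER:
        if steps.get(name) != "done":
            return name
    return None


# Every step finished, but planning produced noisy pages: nothing to resume.
all_done = {name: "done" for name in STEP_ORDER}
print(first_non_done_step(all_done))  # None

# A crash during draft: status-driven retry resumes at draft.
crashed = {**all_done, "draft": "failed", "crossref": "pending",
           "commit": "pending", "schema": "pending"}
print(first_non_done_step(crashed))  # draft
```

When every step is done, a status-driven retry has no step to pick, which is exactly the quality-rerun case described above.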
Proposed behavior
Add an advanced retry action:
POST /api/compile/retry-from
{
  "session_id": "...",
  "step": "plan"
}
Supported steps could include:
ingest_urls
extract
resolve
match
plan
draft
crossref
commit
schema
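As a rough sketch, the request body could be validated and turned into a retry range like this. It assumes the list above is in pipeline order. RetryFromRequest, parse_retry_from, and steps_from are hypothetical names, not existing Kompl code.

```python
from dataclasses import dataclass

# Assumption: the supported steps, in pipeline order.
STEP_ORDER = ("ingest_urls", "extract", "resolve", "match", "plan",
              "draft", "crossref", "commit", "schema")


@dataclass(frozen=True)
class RetryFromRequest:
    session_id: str
    step: str


def parse_retry_from(body: dict) -> RetryFromRequest:
    """Validate a retry-from request body; raise ValueError on bad input."""
    session_id = body.get("session_id")
    step = body.get("step")
    if not isinstance(session_id, str) or not session_id:
        raise ValueError("session_id must be a non-empty string")
    if step not in STEP_ORDER:
        raise ValueError(f"unsupported step: {step!r}")
    return RetryFromRequest(session_id=session_id, step=step)


def steps_from(step: str) -> list[str]:
    """Return the selected step and every step after it."""
    return list(STEP_ORDER[STEP_ORDER.index(step):])


# "abc123" is a made-up session id for illustration.
req = parse_retry_from({"session_id": "abc123", "step": "plan"})
print(steps_from(req.step))
# ['plan', 'draft', 'crossref', 'commit', 'schema']
```

Rejecting unknown step names up front keeps the reset logic from ever running with a range it cannot compute.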
The UI could expose this as an advanced action on the progress page: "Retry from...".
Important semantics
The route should reset compile_progress.steps from the selected step onward, but each step needs its own data rules.
Suggested rules:
- extract: keep existing extractions by default, retry only missing extractions unless force mode is explicit.
- resolve: rerun resolver from existing extractions.
- match: rerun match from existing sources/extractions.
- plan: clear non-committed page_plans, rebuild plans from current settings and existing extracted data.
- draft: rerun only planned/failed drafts by default, with optional force reset for drafted non-committed plans.
- crossref: rerun on drafted pages.
- commit: rerun on crossreffed pages.
- schema: rerun schema only.
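The step reset and the plan and draft rules above might look like the following sketch. The data shapes are assumptions: compile_progress.steps as a dict of step name to status string, and each page_plans row as a dict with a status field taking values such as "planned", "drafted", "failed", or "committed". All function names are hypothetical.

```python
# Assumption: the supported steps, in pipeline order.
STEP_ORDER = ("ingest_urls", "extract", "resolve", "match", "plan",
              "draft", "crossref", "commit", "schema")


def reset_steps_from(steps: dict[str, str], step: str) -> dict[str, str]:
    """Return a copy of steps with the selected step and all later steps set to "pending"."""
    start = STEP_ORDER.index(step)
    reset = dict(steps)
    for name in STEP_ORDER[start:]:
        reset[name] = "pending"
    return reset


def plans_kept_for_replan(plans: list[dict]) -> list[dict]:
    """plan rule: non-committed plans are cleared, committed ones survive."""
    return [p for p in plans if p["status"] == "committed"]


def plans_to_redraft(plans: list[dict], force: bool = False) -> list[dict]:
    """draft rule: planned/failed by default; force also includes drafted plans."""
    wanted = {"planned", "failed"}
    if force:
        wanted.add("drafted")
    return [p for p in plans if p["status"] in wanted]


steps = {name: "done" for name in STEP_ORDER}
print(reset_steps_from(steps, "crossref")["match"])     # done
print(reset_steps_from(steps, "crossref")["crossref"])  # pending

plans = [{"id": 1, "status": "committed"},
         {"id": 2, "status": "drafted"},
         {"id": 3, "status": "failed"}]
print([p["id"] for p in plans_kept_for_replan(plans)])         # [1]
print([p["id"] for p in plans_to_redraft(plans)])              # [3]
print([p["id"] for p in plans_to_redraft(plans, force=True)])  # [2, 3]
```

Keeping the selection rules as pure functions separate from the status reset makes each rule testable without a database.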
Safety requirements
- Refuse to start if another compile session is queued/running.
- Show what will be rerun before starting.
- Do not delete committed pages unless an explicit destructive mode exists.
- Keep the existing retry and retry-failed paths working.
- Log the selected retry step in activity/history.
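The first two requirements could be combined into a dry-run check, sketched below. The session dict shape, the status strings "queued" and "running", and the names CompileBusyError and preview_retry_from are assumptions for illustration only.

```python
# Assumption: the supported steps, in pipeline order.
STEP_ORDER = ("ingest_urls", "extract", "resolve", "match", "plan",
              "draft", "crossref", "commit", "schema")


class CompileBusyError(Exception):
    """Raised when another compile session is already queued or running."""


def preview_retry_from(sessions: list[dict], session_id: str, step: str) -> dict:
    """Apply the concurrency guard, then describe what would be kept and rerun."""
    busy = [s["id"] for s in sessions
            if s["id"] != session_id and s["status"] in ("queued", "running")]
    if busy:
        raise CompileBusyError(f"compile session(s) already active: {busy}")
    start = STEP_ORDER.index(step)
    return {
        "session_id": session_id,
        "kept": list(STEP_ORDER[:start]),
        "rerun": list(STEP_ORDER[start:]),
    }


# Made-up sessions: one finished, one cancelled during draft.
sessions = [{"id": "a", "status": "done"},
            {"id": "b", "status": "cancelled"}]
print(preview_retry_from(sessions, "b", "draft")["rerun"])
# ['draft', 'crossref', 'commit', 'schema']
```

The returned preview is what the UI could show for confirmation before any state is reset.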
Acceptance criteria
- User can change entity_promotion_threshold, retry from plan, and get new plans without refetching URLs.
- User can retry from draft without re-extracting sources.
- User can retry from crossref or commit for downstream repair.
- The progress UI shows the selected retry range accurately.
- Tests cover step reset behavior and page_plans cleanup rules.