C4: process_job HttpSpoof+Auto cutover to runner.run #56

Description

@filipeforattini

Parent

#46

What to build

Cut over the HttpSpoof path of Crawler::process_job from inline dispatch to runner.run(&job, &ctx).await. The SessionContext constructed in C3 is now consumed: the runner picks the spoof fetcher (held internally), runs fetch → ChallengeDetector-or-Fingerprinter → extract, and returns a JobOutcome. Crawler reads outcome.result, outcome.error, outcome.signals, and outcome.retry, then continues with the existing post-processing (storage, frontier feed, retry decision honoring caps and cooldowns, session-state commit, run-level events).
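A minimal sketch of the call shape after the cutover, assuming stand-in types. Only the names Crawler::process_job, runner.run, SessionContext, JobOutcome and its four fields come from this issue; the field types, PostAction, max_retries, the attempts parameter, and the tiny block_on helper are hypothetical and exist only to make the example self-contained.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins; the real definitions live in the crate.
#[allow(dead_code)]
struct Job { url: String }
#[allow(dead_code)]
struct SessionContext { session_id: u64 }

#[allow(dead_code)]
#[derive(Debug, Default)]
struct JobOutcome {
    result: Option<String>, // extracted payload (assumed type)
    error: Option<String>,  // failure description (assumed type)
    signals: Vec<String>,   // detector/fingerprinter signals (assumed type)
    retry: bool,            // runner's retry hint (assumed type)
}

struct Runner;

impl Runner {
    // Stand-in for fetch -> detect -> extract. The real runner holds the
    // spoof fetcher internally; this one fakes an outcome from the URL.
    async fn run(&self, job: &Job, _ctx: &SessionContext) -> JobOutcome {
        if job.url.contains("blocked") {
            JobOutcome {
                error: Some("challenge".to_string()),
                signals: vec!["challenge_detected".to_string()],
                retry: true,
                ..Default::default()
            }
        } else {
            JobOutcome { result: Some(format!("extracted:{}", job.url)), ..Default::default() }
        }
    }
}

// Hypothetical summary of what post-processing decides.
#[derive(Debug, PartialEq)]
enum PostAction { Store(String), Retry, Drop }

struct Crawler { runner: Runner, max_retries: u32 }

impl Crawler {
    async fn process_job(&self, job: &Job, ctx: &SessionContext, attempts: u32) -> PostAction {
        // The inline spoof block is replaced by this single call.
        let outcome = self.runner.run(job, ctx).await;
        // Post-processing stays in the crawler and only reads the outcome.
        if let Some(body) = outcome.result {
            return PostAction::Store(body);
        }
        if outcome.retry && attempts < self.max_retries {
            PostAction::Retry
        } else {
            PostAction::Drop
        }
    }
}

// Std-only executor so the sketch runs without an async runtime.
struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The point of the shape is that the crawler no longer knows which fetcher ran; it only interprets the returned outcome, so caps and cooldowns stay on the crawler side.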

The Render path stays inline for this PRD's scope. The Auto path delegates to spoof internally (per ADR-0002), so this cutover covers HttpSpoof + Auto via one code path.
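The routing described above could look roughly like this. FetchMethod, Dispatch, and dispatch_for are hypothetical names used for illustration; the issue only states that HttpSpoof and Auto share the runner path while Render stays inline.

```rust
// Hypothetical enum mirroring the three fetch methods named in this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FetchMethod { HttpSpoof, Auto, Render }

// Hypothetical marker for which code path handles a job.
#[derive(Debug, PartialEq)]
enum Dispatch { Runner, InlineRender }

fn dispatch_for(method: FetchMethod) -> Dispatch {
    match method {
        // Auto delegates to spoof inside the runner, so both share one arm.
        FetchMethod::HttpSpoof | FetchMethod::Auto => Dispatch::Runner,
        // Out of scope for this slice: Render keeps its inline block.
        FetchMethod::Render => Dispatch::InlineRender,
    }
}
```

Keeping the match exhaustive (no wildcard arm) means a future fetch method forces an explicit routing decision at compile time.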

Per-attempt events (JobStarted, FetchCompleted, ChallengeDetected, ExtractCompleted, JobFailed) now fire on the production NDJSON wire from inside the runner. The regression test from issue #16 is the byte-for-byte trip wire.
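As an illustration of what the regression test guards, here is a std-only sketch that writes per-attempt events as NDJSON lines and recovers the event-kind sequence from the wire. The five kind names come from this issue; the JSON field names ("kind", "job_id"), the emit and kind_sequence helpers, and the hand-rolled serialization are assumptions, not the project's actual wire format.

```rust
use std::fmt::Write;

// The five per-attempt event kinds named in this issue.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum EventKind { JobStarted, FetchCompleted, ChallengeDetected, ExtractCompleted, JobFailed }

// Append one event as a single NDJSON line (one JSON object per line).
// The field layout here is assumed for illustration.
fn emit(out: &mut String, kind: EventKind, job_id: u64) {
    writeln!(out, "{{\"kind\":\"{:?}\",\"job_id\":{}}}", kind, job_id).unwrap();
}

// Extract the ordered list of event kinds from NDJSON text.
// Lines without a "kind" field are skipped.
fn kind_sequence(ndjson: &str) -> Vec<&str> {
    const KEY: &str = "\"kind\":\"";
    ndjson
        .lines()
        .filter_map(|line| {
            let start = line.find(KEY)? + KEY.len();
            let end = line[start..].find('"')? + start;
            Some(&line[start..end])
        })
        .collect()
}
```

A regression check in this style would capture the kind sequence on main, then assert the sequence after the cutover is identical.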

Acceptance criteria

  • Crawler::process_job HttpSpoof path replaces the inline spoof block with runner.run(&job, &ctx).await
  • Auto method also flows through runner.run (per ADR-0002: an AutoFetcher would chain fetchers inline, which we don't do)
  • Render method stays inline; documented as out of scope for this slice
  • Per-attempt events fire from the runner on the production wire
  • cargo test --all-features --test runner_ndjson_regression: byte-stable, with the same event-kind sequence
  • cargo test --all-features --test runner_integration: 4/4 still pass
  • cargo test --all-features --lib --no-fail-fast: zero new failures vs main
  • PR description includes before/after LOC count for src/crawler.rs (target: drop measurably below 3600)
  • Existing wiremock-driven integration tests under tests/ continue to pass

Blocked by

Labels

enhancement, needs-triage, rust
