Refactor WorkerProxy._start_failure to store reconstruction params, not the live exception #233

Description

@conradbzura

Replace WorkerProxy._start_failure (declared at src/wool/runtime/worker/proxy.py:429, set at line 819, read at lines 809-810 and 813-814) with a cache of reconstruction parameters — the exception class plus its args tuple — rather than the live BaseException instance. Each subsequent lazy-dispatch entrant against a failed-start proxy constructs a fresh exception from those parameters and raises that.
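A minimal sketch of the proposed shape, not the actual `WorkerProxy` code. `LazyProxy`, `_ensure_started`, `_rebuild`, `dispatch`, and the `threading.Lock` are hypothetical stand-ins (the real proxy may well be async); only the idea of caching `(type(exc), exc.args)` comes from this issue.

```python
import threading
from typing import Any, Optional

# Hypothetical alias for the cached reconstruction parameters.
FailureParams = tuple[type[BaseException], tuple[Any, ...]]


class LazyProxy:
    """Simplified stand-in for WorkerProxy (names are assumptions)."""

    def __init__(self, starter):
        self._starter = starter  # callable that may raise during startup
        self._lock = threading.Lock()
        self._started = False
        self._start_failure: Optional[FailureParams] = None

    def _rebuild(self) -> BaseException:
        # Build a fresh instance from the cached class and args.
        cls, args = self._start_failure
        return cls(*args)

    def _ensure_started(self) -> None:
        # First check, without the lock: the outcome may already be known.
        if self._started:
            return
        if self._start_failure is not None:
            raise self._rebuild()
        with self._lock:
            # Second check: another entrant may have finished while we waited.
            if self._started:
                return
            if self._start_failure is not None:
                raise self._rebuild()
            try:
                self._starter()
            except BaseException as exc:
                # Cache only the class and args, never the live instance.
                self._start_failure = (type(exc), exc.args)
                raise  # the first-failure entrant keeps the original chain
            self._started = True

    def dispatch(self, payload):
        self._ensure_started()
        return payload


def failing_start():
    raise ConnectionError("worker unreachable")


proxy = LazyProxy(failing_start)
raised = []
for _ in range(3):
    try:
        proxy.dispatch("task")
    except ConnectionError as exc:
        raised.append(exc)
```

One caveat with this approach: `cls(*args)` only round-trips when the exception class's constructor accepts its own `args`. Classes with a custom `__init__` signature may need a fallback, which this sketch does not attempt.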

Motivation

Re-raising the same live BaseException accretes traceback frames onto its __traceback__ chain every time a new caller's raise self._start_failure runs. On a long-lived proxy whose start has failed and that the application keeps prodding — a health check, a polling supervisor, repeated client calls — this is a slow memory leak: the cached instance grows as more frames accrete, and the frame locals captured on the traceback chain pin objects that should be collectable.
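The accretion can be observed outside the proxy with a small standalone script. `cached`, `dispatch`, and `traceback_depth` below are illustrative names, not code from the repository.

```python
def traceback_depth(exc: BaseException) -> int:
    """Count the entries on exc.__traceback__ by walking tb_next."""
    depth = 0
    tb = exc.__traceback__
    while tb is not None:
        depth += 1
        tb = tb.tb_next
    return depth


# Stand-in for a live exception cached on the proxy (hypothetical message).
cached = RuntimeError("worker failed to start")


def dispatch() -> None:
    # Mirrors `raise self._start_failure`: the same instance every time.
    raise cached


depths = []
for _ in range(5):
    try:
        dispatch()
    except RuntimeError as exc:
        depths.append(traceback_depth(exc))

# Each re-raise adds entries on top of those already recorded on the
# instance, so the recorded depths keep increasing.
print(depths)
```

Every traceback entry also references its frame via `tb_frame`, which is what keeps the frame locals alive for as long as the cached instance is.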

WorkerProxy uses double-checked locking for lazy startup, so the short-circuit path is hot enough on degraded-state proxies for the accretion to matter. The trade-off — losing the __cause__ / __context__ chain on subsequent raises — is acceptable: the first-failure entrant who caught the live exception keeps the full chain; subsequent entrants want a typed diagnostic, not the original traceback.

Expected Outcome

  • WorkerProxy._start_failure stores tuple[type[BaseException], tuple[Any, ...]] (or equivalent reconstruction params) instead of a live BaseException.
  • Each subsequent lazy-dispatch entrant receives a fresh exception instance; the second entrant's __traceback__ is independent of the first's.
  • The first-failure entrant's path is unchanged — they still receive the original traceback chain. The trade-off applies only to subsequent entrants.
  • A regression test exercises at least 10 post-failure dispatch calls and asserts that the cached state's footprint stays bounded.

Metadata

    Labels

    refactor: Code restructuring without behavior change
