Description
Replace WorkerProxy._start_failure (declared at src/wool/runtime/worker/proxy.py:429, set at line 819, read at lines 809-810 and 813-814) with a cache of reconstruction parameters — the exception class plus its args tuple — rather than the live BaseException instance. Each subsequent lazy-dispatch entrant against a failed-start proxy constructs a fresh exception from those parameters and raises that.
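A minimal sketch of the proposed caching, not the actual `WorkerProxy` code: `ProxySketch`, `_start`, and `dispatch` are hypothetical names used only for illustration, and locking is omitted. It assumes the failing exception can be rebuilt as `exc_type(*exc.args)`, which holds for most built-in exceptions but not for classes whose `__init__` takes arguments that do not end up in `args`.

```python
from typing import Any, Optional, Tuple, Type


class ProxySketch:
    """Hypothetical stand-in for WorkerProxy; only failure caching is shown."""

    def __init__(self) -> None:
        # (exception class, args) captured from the first start failure.
        self._start_failure: Optional[
            Tuple[Type[BaseException], Tuple[Any, ...]]
        ] = None

    def _start(self) -> None:
        # Placeholder for the real startup logic; it always fails here.
        raise ConnectionError("worker unreachable")

    def dispatch(self) -> None:
        if self._start_failure is not None:
            exc_type, exc_args = self._start_failure
            # Subsequent entrants get a fresh instance with its own traceback.
            raise exc_type(*exc_args)
        try:
            self._start()
        except BaseException as exc:
            # Cache only the reconstruction parameters, never the live instance.
            self._start_failure = (type(exc), exc.args)
            # The first-failure entrant still sees the original exception.
            raise
```

Storing only the class and `args` means the cache holds no reference to any traceback or frame, so nothing raised later can grow it.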
Motivation
Re-raising the same live BaseException accretes traceback frames onto its __traceback__ chain every time a new caller's raise self._start_failure runs. On a long-lived proxy whose start has failed and that the application keeps prodding — a health check, a polling supervisor, repeated client calls — this is a slow memory leak: the cached instance grows as more frames accrete, and the frame locals captured on the traceback chain pin objects that should be collectable.
WorkerProxy uses double-checked locking for lazy startup, so the short-circuit path is hot enough on degraded-state proxies for the accretion to matter. The trade-off — losing the __cause__ / __context__ chain on subsequent raises — is acceptable: the first-failure entrant who caught the live exception keeps the full chain; subsequent entrants want a typed diagnostic, not the original traceback.
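The accretion described above can be observed in isolation. The snippet below is an illustrative sketch, independent of `WorkerProxy`; `tb_depth` and `reraise_cached` are hypothetical helpers. It raises one cached instance repeatedly and records the length of its `__traceback__` chain after each raise.

```python
def tb_depth(exc: BaseException) -> int:
    """Count the entries in an exception's traceback chain."""
    depth = 0
    tb = exc.__traceback__
    while tb is not None:
        depth += 1
        tb = tb.tb_next
    return depth


# A single live instance, reused for every raise (the pattern being replaced).
cached = RuntimeError("start failed")


def reraise_cached() -> None:
    raise cached


depths = []
for _ in range(5):
    try:
        reraise_cached()
    except RuntimeError as exc:
        # The chain keeps its earlier entries and gains new ones on each raise.
        depths.append(tb_depth(exc))

print(depths)  # each value is larger than the previous one
```

Each traceback entry references a frame, and each frame references its locals, which is why the growth pins more than just traceback objects.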
Expected Outcome
- WorkerProxy._start_failure stores tuple[type[BaseException], tuple[Any, ...]] (or equivalent reconstruction params) instead of a live BaseException.
- Each subsequent lazy-dispatch entrant receives a fresh exception instance; the second entrant's __traceback__ is independent of the first's.
- The first-failure entrant's path is unchanged — they still receive the original traceback chain. The trade-off applies only to subsequent entrants.
- A regression test exercises 10+ post-failure dispatch calls and asserts the cached state's footprint stays bounded.