[AAASM-3793] ✅ (test): Add stale-lock detection to packaging test lock #201
Record the holder's pid and acquisition time inside .packaging-test.lock so a later waiter can distinguish a live holder from an orphaned lock. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019mSz31RysZF6DYToUoBWLf
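A minimal sketch of the metadata write described above, assuming the lock is created with an exclusive-create flag and stores JSON. The name `writeLockWithMetadata`, the JSON shape, and the demo paths are illustrative assumptions (`readLockMetadata` borrows its name from the PR description, but its body here is a guess), not the actual contents of tests/packaging/lock.ts.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape: the PR only says the lock stores {pid, timestamp}.
interface LockMetadata {
  pid: number;
  timestamp: number; // ms since epoch at acquisition
}

// Create the lock exclusively and record who holds it.
// Throws with code "EEXIST" if another holder already created the file.
function writeLockWithMetadata(lockPath: string): LockMetadata {
  const meta: LockMetadata = { pid: process.pid, timestamp: Date.now() };
  const fd = fs.openSync(lockPath, "wx"); // "wx" fails if the file exists
  try {
    fs.writeFileSync(fd, JSON.stringify(meta));
  } finally {
    fs.closeSync(fd);
  }
  return meta;
}

// Returns null for a missing, empty (0-byte), or unparsable lock.
function readLockMetadata(lockPath: string): LockMetadata | null {
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(lockPath, "utf8"));
    if (typeof parsed === "object" && parsed !== null) {
      const { pid, timestamp } = parsed as Record<string, unknown>;
      if (typeof pid === "number" && typeof timestamp === "number") {
        return { pid, timestamp };
      }
    }
    return null;
  } catch {
    return null;
  }
}

// Demo in a throwaway temp directory (hypothetical location).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const demoLock = path.join(demoDir, ".packaging-test.lock");
const written = writeLockWithMetadata(demoLock);
const readBack = readLockMetadata(demoLock);
```

Because creating the file and writing its contents are two separate steps, a waiter can briefly observe a 0-byte lock; that window is one reason a reader might treat an empty lock as "unknown" rather than immediately stale.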
Replace the unbounded retry loop with stale-lock detection. An orphaned .packaging-test.lock (dead owner pid, or a 0-byte lock left by a previous run and older than a short grace period) is reclaimed and the acquire is retried; a large TTL backs that up against pid reuse without ever evicting a live, slow-but-legitimate build. Also bound the acquire wait to just under the 30s per-test timeout, so a lock held by a live, non-stale process fails with an actionable error instead of an opaque "Test timed out". Previously, a leaked lock made every later packaging test spin until its own timeout, cascading into unrelated failures.
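The staleness rules in this commit message can be sketched as a small decision function. This is an illustration under assumptions: `isLockStale`, both threshold constants, and their values are hypothetical (the PR does not state the real numbers), and `isProcessAlive` is only named, not shown, in the PR description.

```typescript
interface LockMetadata {
  pid: number;
  timestamp: number; // ms since epoch at acquisition
}

// Hypothetical thresholds; the PR does not state the actual values.
const EMPTY_LOCK_GRACE_MS = 2_000;
const LOCK_TTL_MS = 10 * 60_000;

// Signal 0 checks existence and permissions without delivering a signal.
function isProcessAlive(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user, so it is alive.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

function isLockStale(
  meta: LockMetadata | null,
  fileAgeMs: number,
  now: number = Date.now(),
  alive: (pid: number) => boolean = isProcessAlive,
): boolean {
  // No readable metadata (e.g. a 0-byte lock): stale only after the grace
  // period, since a holder may be between creating and writing the file.
  if (meta === null) return fileAgeMs > EMPTY_LOCK_GRACE_MS;
  // Recorded owner is gone: orphaned.
  if (!alive(meta.pid)) return true;
  // The pid looks alive, possibly through pid reuse: fall back to the TTL,
  // which is assumed to be far longer than any legitimate build.
  return now - meta.timestamp > LOCK_TTL_MS;
}
```

Passing `alive` and `now` as parameters keeps the decision testable without real processes or clocks; whether the actual helper is structured this way is not stated in the PR.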
Cover the path where the lock is held by a dead owner: the helper probes pid liveness, reclaims the orphan, and acquires the lock so the callback runs.
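The dead-owner path can be exercised end to end with a sketch like the following. It is a synchronous simplification under assumptions: `withLockSync`, `reclaimIfOwnerDead`, and `sleepSync` are hypothetical names, the real helper is presumably async, and the reclaim step here does not guard against two waiters reclaiming at once.

```typescript
import { spawnSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

function isProcessAlive(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0: existence check only
    return true;
  } catch (err) {
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Remove the lock only if its recorded owner is gone. Returns true on reclaim.
function reclaimIfOwnerDead(lockPath: string): boolean {
  let pid: unknown;
  try {
    pid = (JSON.parse(fs.readFileSync(lockPath, "utf8")) as { pid?: unknown }).pid;
  } catch {
    return false; // missing or unreadable: leave it to the normal retry
  }
  if (typeof pid !== "number" || isProcessAlive(pid)) return false;
  fs.rmSync(lockPath, { force: true });
  return true;
}

function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

function withLockSync<T>(lockPath: string, maxWaitMs: number, fn: () => T): T {
  const deadline = Date.now() + maxWaitMs;
  for (;;) {
    try {
      const meta = JSON.stringify({ pid: process.pid, timestamp: Date.now() });
      fs.writeFileSync(lockPath, meta, { flag: "wx" }); // exclusive create
      break;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
      if (reclaimIfOwnerDead(lockPath)) continue; // orphan removed: retry now
      if (Date.now() >= deadline) {
        throw new Error(`Lock ${lockPath} is still held by a live process after ${maxWaitMs}ms`);
      }
      sleepSync(25);
    }
  }
  try {
    return fn();
  } finally {
    fs.rmSync(lockPath, { force: true });
  }
}

// Demo: plant a lock owned by a process that has already exited.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "reclaim-demo-"));
const lockPath = path.join(demoDir, ".packaging-test.lock");
const deadPid = spawnSync(process.execPath, ["-e", ""]).pid;
fs.writeFileSync(lockPath, JSON.stringify({ pid: deadPid, timestamp: Date.now() }));
const state = { callbackRan: false };
withLockSync(lockPath, 1_000, () => {
  state.callbackRan = true;
});
```

The demo obtains a dead pid by running a child to completion; in principle the OS could reuse that pid before the probe, which is the case the TTL in the PR is meant to back up.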
Codecov Report ✅ All modified and coverable lines are covered by tests.
The new stale-detection path reads the lock file and probes the owner pid, so the existing contention-retry test became dependent on whatever real .packaging-test.lock a concurrent packaging test holds on CI. A stale orphan left by a sibling test made reclaim fire, which broke the assertions and, on the slower CI Node 18 runner, timed the test out. Model "no lock on disk" by mocking readFileSync/statSync so the test deterministically exercises the not-stale fast-retry path, independent of shared filesystem state.
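The idea of modelling "no lock on disk" can be illustrated without a test framework. The commit mocks `readFileSync`/`statSync` in the test; the sketch below swaps in a different technique, a hand-rolled stub passed by dependency injection, so it runs on its own. `LockFs`, `looksStale`, and `noLockOnDisk` are hypothetical names, and the staleness rule shown covers only the 0-byte case.

```typescript
// The two fs calls the stale-detection path is said to depend on.
interface LockFs {
  readFileSync(path: string, encoding: "utf8"): string;
  statSync(path: string): { mtimeMs: number };
}

function enoent(syscall: string, p: string): NodeJS.ErrnoException {
  const err: NodeJS.ErrnoException = new Error(
    `ENOENT: no such file or directory, ${syscall} '${p}'`,
  );
  err.code = "ENOENT";
  return err;
}

// Stub that behaves as if no lock file exists, whatever is really on disk.
const noLockOnDisk: LockFs = {
  readFileSync: (p) => {
    throw enoent("open", p);
  },
  statSync: (p) => {
    throw enoent("stat", p);
  },
};

// Hypothetical staleness check that takes its fs dependency as a parameter.
function looksStale(lockPath: string, fsImpl: LockFs, graceMs: number): boolean {
  let ageMs: number;
  let raw: string;
  try {
    ageMs = Date.now() - fsImpl.statSync(lockPath).mtimeMs;
    raw = fsImpl.readFileSync(lockPath, "utf8");
  } catch {
    return false; // lock vanished between attempts: not stale, just retry
  }
  return raw.length === 0 && ageMs > graceMs;
}
```

With the stub in place the check never consults the shared filesystem, so a sibling test's orphaned lock cannot make reclaim fire.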
🤖 Claude Code - PR Review (AAASM-3793)
Verdict: Approve (non-blocking observations below). Independent review of test-infra hardening for CI
All green at head
Scope / acceptance
Change is confined to
Locally verified:
Side-effect assessment (this lock guards ALL packaging tests)
Non-blocking observations
Neither observation blocks merge. Recommend addressing the 28s-vs-90s mismatch (and the comment wording) in a quick follow-up if convenient. Not merging; review only.



Target
Task summary:
Harden tests/packaging/lock.ts against an orphaned .packaging-test.lock cascade. When a packaging test timed out, vitest abandoned the promise before the finally removed the exclusive lock; the orphan then made every later packaging test spin in the unbounded retry loop until its own per-test timeout, producing a cascade of unrelated "Test timed out" failures.
Task tickets:
Key point change (optional):
{pid, timestamp} on acquire.
execSync) is never reclaimed, so two builds can never race the same dist/.
Effecting Scope
Test-infra only (tests/packaging/). No product/runtime code changes; no public API, transport, or build-output changes.
Description
tests/packaging/lock.ts:
{pid, timestamp} into the lock on acquire.
isProcessAlive (signal-0 probe), readLockMetadata, lockFileAgeMs, and reclaimStaleLock helpers; reclaim orphaned locks in the retry loop.
EEXIST as cause).
tests/packaging/lock-helper.test.ts:
fs.writeFileSync in the existing contention test now that the helper writes metadata.
How to verify
pnpm lint: clean.
pnpm typecheck: clean.
pnpm test: full suite green (338 passed, 2 skipped), including the new reclaim test.
Closes AAASM-3793
🤖 Generated with Claude Code