P0: Fix flaky SQLite WAL livenessTest — replace Date() comparison with monotonic clock, remove CI skip#98
…ve CI skip

`livenessTest()` used `Date()` timestamps, which have ~1µs resolution. On fast release-mode CI runners both timestamps could land in the same tick, making the strict `writeEnd < readEnd` comparison fail by a tie even when the underlying WAL concurrency was correct. The workaround was a `.disabled(if: CI)` annotation, a policy violation per CLAUDE.md.

Replace both `Date()` captures with `ContinuousClock.now` (nanosecond resolution, backed by `mach_absolute_time` on Darwin). `ContinuousClock` is already the established pattern in `SearchDeadlineContext`, `SQLiteStorage`, and `SQLiteReaderActor`. `ContinuousClock.Instant` is `Comparable` so the `#expect(writeEnd < readEnd)` assertion requires no change.

Remove the `.disabled(if: CI)` annotation. Verified 20/20 consecutive passes in both debug and release configurations locally.

Closes #95

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
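As a rough illustration of the timing primitive the commit message describes, the sketch below captures two instants from `ContinuousClock` and compares them. It is not code from the PR: `simulateWork` and `captureOrderedInstants` are hypothetical names invented here, and the loop only stands in for whatever runs between the two captures.

```swift
// Hypothetical placeholder for the work between the two captures.
func simulateWork() -> Int {
    var total = 0
    for i in 0..<10_000 { total &+= i }
    return total
}

// Capture one instant before and one after the placeholder work.
func captureOrderedInstants() -> (first: ContinuousClock.Instant, second: ContinuousClock.Instant) {
    let first = ContinuousClock.now
    _ = simulateWork()
    let second = ContinuousClock.now
    return (first, second)
}

let (first, second) = captureOrderedInstants()

// A monotonic clock does not run backwards, so the second capture is
// never earlier than the first, whatever happens to the wall clock.
print(first <= second)

// The gap between two instants is a `Duration`, not a `TimeInterval`.
let elapsed: Duration = first.duration(to: second)
print(elapsed >= .zero)
```

The non-strict `<=` is used here on purpose: the sketch makes no claim about how far apart two back-to-back captures land, only that they cannot go backwards.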
🏭 Fabrik - stage: Implement
The Fabrik engine will create the draft PR after completion.
Everything is committed and pushed.
Replaced the two …
Used 31/250 turns, 0k input / 11k output tokens.
Pull request overview
This PR fixes a flaky SQLite WAL concurrency test by replacing wall-clock Date() ordering comparisons with a monotonic clock, allowing the test to run reliably on CI without being disabled.
Changes:
- Remove the CI-only disablement from `SQLiteStorageConcurrencyTests.livenessTest()`.
- Capture `writeEnd`/`readEnd` using `ContinuousClock.now` instead of `Date()`.
- Update the `livenessTest()` doc comment to reflect the new timing approach.
/// `writeEnd < readEnd`. Uses `ContinuousClock` (nanosecond resolution)
/// so the assertion is reliable even when the FTS scan is fast.
🏭 Fabrik - stage: Review
The PR is clean. One commit, one file. Let me write up the review.

Review Findings

Verified: All Plan Tasks Complete
All 9 tasks from the implementation plan are confirmed complete in commit …

Out-of-Scope Code Untouched
Confirmed unchanged on this branch vs main: …

Test Results
Debug (… Release (…

No New Regressions
The diff is a single file (…

Copilot Review
Copilot summarized the change accurately with no actionable findings.

Branch State
Branch is one commit ahead of current …

Reviewed the …
Used 20/50 turns, 0k input / 7k output tokens.
🏭 Fabrik - stage: Validate
Both suites passed on clean runs. Let me compile the full validation report.

Validation Report

Branch state
Requirements: 7/7 passed
Test Suite Results
Tests skipped on CI (
Tests failing locally:
Both were confirmed pre-existing: identical code on …

Code Completeness

Regressions: None
The diff is 5 insertions / 20 deletions, all in …

Verdict: READY TO MERGE
All acceptance criteria met.

Used 40/50 turns, 0k input / 16k output tokens.
Summary
`SQLiteStorageConcurrencyTests.livenessTest()` was disabled on CI runners via `@Test(.disabled(if: ProcessInfo.processInfo.environment["CI"] != nil))` because its `writeEnd < readEnd` assertion uses `Date()` timestamps that lack the resolution to distinguish ticks on fast release-mode CI runners. This issue removes the skip and replaces the timing primitive with a monotonic, sub-microsecond clock so the assertion is both correct and reliable on CI. Per the project engineering policy (`.claude/CLAUDE.md`), the CI skip is itself a policy violation. This is P0: it blocks all merges until resolved.

Problem
Approach
The fix is surgical: four line changes and a doc comment update in a single test file. The research confirms that `ContinuousClock.now` is already the established pattern in this codebase (used in `SearchDeadlineContext`, `SQLiteStorage`, `SQLiteReaderActor`) and is available with no new imports. `ContinuousClock.Instant` is `Comparable`, so the `#expect(writeEnd < readEnd)` assertion needs no change; only the assignment lines do.

The implementation is: remove the `.disabled(if: CI)` annotation, swap the two `Date()` captures to `ContinuousClock.now`, and scrub the now-false doc comment text. Then verify with ≥20 consecutive runs in debug and release before committing.

No new files. No ADR. The `ContinuousClock` convention is already established; no new architectural decision is being introduced.

New/Modified Files
`Tests/SwitchcraftTests/SQLiteStorageConcurrencyTests.swift`: remove `.disabled(if: CI)`, replace `Date()` with `ContinuousClock.now` (×2), update doc comment

Key Decisions
- `ContinuousClock` over `mach_absolute_time` / `clock_gettime_nsec_np`: `ContinuousClock` is the idiomatic Swift 5.7+ monotonic clock, already used in `SearchDeadlineContext` and the SQLite actor layer. Raw Darwin APIs would work but are lower-level than necessary, require a `Darwin` import, and produce `UInt64` that needs bridging for comparison, which is exactly what the spec forbids. `ContinuousClock.Instant` compares natively with `<`.
- No `await` barrier / memory fence needed: `writeEnd` is captured synchronously after `await storage.upsertDocument(...)` returns; `readEnd` is captured synchronously after `_ = try await slowRead`. Swift `await` is a sequencing point: the compiler cannot reorder across it. No additional synchronization is required.
- `Task.yield()` × 8 loop untouched: This is a valid and necessary cooperative-scheduler nudge to give the reader actor time to begin the FTS scan before the write is issued. The clock change does not affect it.
- Out-of-scope code untouched: `measureMedian` (uses `Date()`, called only by `performanceAssertionTest()`), `safariUnfuckerRegressionTest()` (uses `Date()` for wall-time, already passing), and `performanceAssertionTest()`'s own `.disabled(if: CI)` are explicitly out of scope. No changes to these.
- No ADR warranted: This fix adopts an existing, documented convention. No new design decision is introduced that would constrain future contributors.
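The ordering argument in the decisions above (start the slow read, yield a few times, perform the write, capture `writeEnd`, then await the read and capture `readEnd`) can be sketched roughly as follows. This is an assumption-laden stand-in, not the project's test: `FakeReader`, `FakeWriter`, and `livenessSketch` are hypothetical names, and a 50 ms sleep substitutes for the FTS scan, so it illustrates only the capture sequence and says nothing about real WAL behaviour.

```swift
// Hypothetical stand-in for the reader actor; the sleep imitates a slow scan.
actor FakeReader {
    func slowScan() async throws -> Int {
        try await Task.sleep(for: .milliseconds(50))
        return 5_000
    }
}

// Hypothetical stand-in for the write path; the upsert is deliberately trivial.
actor FakeWriter {
    private(set) var count = 0
    func upsert() { count += 1 }
}

func livenessSketch() async throws
    -> (writeEnd: ContinuousClock.Instant, readEnd: ContinuousClock.Instant)
{
    let reader = FakeReader()
    let writer = FakeWriter()

    // Start the slow read as a child task so it runs concurrently.
    async let slowRead = reader.slowScan()

    // Cooperative nudge: give the child task a chance to start first.
    for _ in 0..<8 { await Task.yield() }

    // The capture runs only after the awaited write has returned.
    await writer.upsert()
    let writeEnd = ContinuousClock.now

    // The capture runs only after the awaited read has returned.
    _ = try await slowRead
    let readEnd = ContinuousClock.now

    return (writeEnd, readEnd)
}

let (writeEnd, readEnd) = try await livenessSketch()
print(writeEnd < readEnd)
```

Because both captures are plain synchronous statements placed directly after an `await`, no extra fence appears anywhere in the sketch; the comparison relies only on program order and the monotonic clock.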
Task Checklist
1. Remove the `.disabled(if: CI)` annotation: delete lines 82–85 of the `@Test` decorator on `livenessTest()` so only the test display name remains
2. Replace `let writeEnd = Date()` with `let writeEnd = ContinuousClock.now` at line 106
3. Replace `let readEnd = Date()` with `let readEnd = ContinuousClock.now` at line 110
4. Update the doc comment on `livenessTest()`: remove the "Disabled on CI runners." paragraph (lines 71–79) since it is now false; preserve the paragraph describing what the test asserts and why
5. Run `swift test --filter livenessTest` ×20 in debug mode and confirm all 20 pass; if any fail, stop and escalate rather than re-adding a skip
6. Run `swift test -c release --filter livenessTest` ×20 and confirm all 20 pass; if any fail, stop and escalate
7. Run the full debug suite (`swift test`) and confirm no newly failing or newly skipped tests
8. Run the full release suite (`swift test -c release`) and confirm no newly failing or newly skipped tests
9. Commit with `Closes #95` and push to `fabrik/issue-95`

Risks
- Residual flakiness after clock change: If `Task.yield()` × 8 is insufficient to ensure the reader actor has begun the FTS scan before the write completes, `writeEnd < readEnd` could still fail. This would indicate a real ordering race in the actor scheduler, not a timing-resolution issue. Per spec: do not re-skip; stop at Task 5 or 6 and post an escalation comment. In practice this is unlikely; a 5,000-document FTS scan takes tens of milliseconds vs. a single upsert taking microseconds.
- Type inference change: Replacing `Date()` with `ContinuousClock.now` changes the inferred type of `writeEnd`/`readEnd` from `Date` to `ContinuousClock.Instant`. The compiler will enforce this. The `#expect` comparison is unchanged. There are no other uses of `writeEnd`/`readEnd` in the function, so no cascading type errors.

Used 8/50 turns, 0k input / 2k output tokens.
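To make the type-inference point concrete, here is a small sketch showing that the same `<` comparison compiles for both `Date` and `ContinuousClock.Instant`, and that a tie makes a strict comparison false. The helper `isStrictlyBefore` and the sample values are hypothetical, chosen only for illustration.

```swift
import Foundation

// Hypothetical helper: any Comparable type works, so the assertion
// expression itself does not care which timestamp type is inferred.
func isStrictlyBefore<T: Comparable>(_ a: T, _ b: T) -> Bool {
    a < b
}

// Wall-clock timestamps, one second apart.
let d1 = Date(timeIntervalSinceReferenceDate: 0)
let d2 = Date(timeIntervalSinceReferenceDate: 1)

// Monotonic instants, one nanosecond apart.
let i1 = ContinuousClock.now
let i2 = i1.advanced(by: .nanoseconds(1))

print(isStrictlyBefore(d1, d2)) // true
print(isStrictlyBefore(i1, i2)) // true

// Two identical timestamps tie, so the strict comparison is false.
print(isStrictlyBefore(i1, i1)) // false

// Mixing the two types does not compile, which is how the compiler
// enforces the change of inferred type:
// isStrictlyBefore(d1, i1)
```

The tie case is the failure mode the original `Date()` version could hit when both captures landed on the same tick.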
Verification
Replaced the two `Date()` timestamp captures in `livenessTest()` with `ContinuousClock.now` (nanosecond resolution, natively `Comparable`) and removed the `.disabled(if: CI)` annotation. Verified 20/20 consecutive passes in both debug and release configurations locally. The only failing tests in the full suite are pre-existing timing-sensitive flakes that fail at the same rate on the unmodified `main` branch, confirmed by running the baseline twice before my changes.

Closes #95