Skip to content

flaky test: registry_invariants asserts 0 allocs but sometimes counts 1–2 on Ubuntu CI #10

@patdhlk

Description

@patdhlk

Symptom

crates/taktora-connector-ethercat-tests/tests/registry.rs::registry_invariants (TEST_0219) failed on PR #9 with:

thread 'registry_invariants' panicked at crates/taktora-connector-ethercat-tests/tests/registry.rs:103:9:
assertion `left == right` failed: iter() allocated 2 times across 1000 cycles × 16 channels — REQ_0328 prohibits per-cycle alloc
  left: 2
 right: 0

What the test asserts

The case at tests/registry.rs:78-110 measures `ChannelRegistry::iter()` over 1000 cycles × 16 channels = 16 000 iterations and asserts `ALLOC.alloc_count() == 0`. The intent is to enforce `REQ_0328` (no per-cycle allocation on the executor hot path). A single `iter().count()` warm-up runs before tracking starts to absorb first-call quirks.

Why this looks flaky, not real

  • `CountingAllocator` is the test binary's `#[global_allocator]` — it counts every allocation in the process between `set_tracking(true)` and `set_tracking(false)`, not just allocations inside the iterator.
  • 2 allocs across 16 000 iterations is a constant overhead, not a per-cycle leak. `REQ_0328` is about the cycle hot path — a constant 1–2 alloc cost from runtime bookkeeping doesn't violate it.
  • Plausible sources of the stray allocations:
    • The test panic-hook / backtrace machinery initializing lazily the first time `ALLOC.set_tracking(true)` is called inside this binary.
    • cargo's libtest harness emitting a status line on a background thread that briefly enters the allocator.
    • Thread-local-storage lazy initialization on first allocation after `reset()`.
  • These would all be racy and Ubuntu-specific (different stdlib alloc patterns vs macOS / Windows), which matches the observed pattern.

Suggested fix

Tighten the spec to what the requirement actually says — no per-cycle alloc — by asserting a small constant ceiling instead of strict zero. Either:

const STRAY_ALLOC_BUDGET: usize = 16; // covers runtime bookkeeping; still proves no per-cycle leak
let allocs = ALLOC.alloc_count();
assert!(
    allocs <= STRAY_ALLOC_BUDGET,
    "iter() allocated {allocs} times across 1000 cycles × 16 channels — \
     budget is {STRAY_ALLOC_BUDGET}; REQ_0328 prohibits per-cycle alloc"
);

or assert the rate instead of the total:

let per_cycle = allocs as f64 / 1_000.0;
assert!(
    per_cycle < 0.1,
    "iter() averaged {per_cycle:.4} allocs/cycle across 1000 cycles × 16 channels — \
     REQ_0328 prohibits per-cycle alloc"
);

Either form keeps the spec teeth (a real per-iter `Box::new` would still trip it: 16 000 ≫ 16 / 0.1) while tolerating ≤O(1) runtime bookkeeping.

Reproducibility

Has only been observed once so far (this PR). If it recurs, link the run here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions