Nightly workflow hardening follow-up: timeouts, test-presence gating, and soak implementation #496

@somethingwithproof

Description

The new nightly workflow is a strong start, but there are a few operational risks worth hardening before relying on it as a long-term signal.

Why this matters

Nightly jobs are intentionally heavy. If they hang, run zero tests, or execute only placeholders, they can report green while providing little real coverage and still consuming runner capacity.

Risks observed

  • No explicit timeout-minutes on nightly jobs (tsan, valgrind, soak-placeholder).
  • Multiple test paths can emit a notice and continue successfully when no tests are discovered.
  • soak-placeholder is currently informational only, so nightly lacks real soak/integration execution.

Suggested follow-up

  1. Add explicit timeout-minutes for each nightly job (and tune per job profile).
  2. Add a strict mode (or default behavior) to fail nightly when zero tests are detected in TSan/Valgrind paths.
  3. Replace soak-placeholder with an actual soak/integration scenario (or gate with a feature flag and mark pending clearly in status output).
  4. Optionally split nightly deps by job (tsan vs valgrind) to reduce install surface and variance.
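As a rough illustration of item 2, a strict-mode gate could look something like the sketch below. The function name `gate_on_test_count` and its `strict` parameter are hypothetical and not part of the existing workflow; how the discovered-test count is obtained depends on the test runner and is left to the caller. The `::error::` and `::notice::` prefixes are the standard GitHub Actions workflow commands for annotations.

```python
def gate_on_test_count(discovered: int, strict: bool = True) -> int:
    """Return an exit code for a test-discovery step (illustrative only).

    discovered: number of tests found; the caller supplies this value.
    strict:     hypothetical switch; when True, zero tests fails the job
                instead of only emitting a notice.
    """
    if discovered < 0:
        raise ValueError("discovered must be non-negative")
    if discovered == 0:
        if strict:
            # Fail loudly so the nightly run goes red.
            print("::error::no tests discovered; failing nightly job")
            return 1
        # Current behavior described in the issue: notice, then continue.
        print("::notice::no tests discovered; continuing")
        return 0
    return 0
```

A TSan or Valgrind step would then pass the returned value to `sys.exit`, so a job that finds no tests exits non-zero under strict mode.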

Acceptance criteria

  • Nightly jobs have deterministic time bounds.
  • Nightly fails (not just notices) when expected test suites are missing.
  • Soak stage executes real work or is explicitly disabled with a tracked flag.
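One possible shape for the third criterion, combined with a time bound, is sketched below. It assumes the soak scenario can be launched as a single command; the function name `run_soak`, the `enabled` flag, and the status strings are hypothetical and only meant to show that a disabled stage can report itself explicitly instead of passing silently. The job-level `timeout-minutes` setting would still be the outer bound; this only adds an inner one.

```python
import subprocess
import sys


def run_soak(command: list[str], enabled: bool, timeout_seconds: float) -> str:
    """Run a soak command under a hard time limit and report a status string.

    enabled is a hypothetical feature flag: when False the stage does no
    work and says so, rather than looking like a successful run.
    """
    if not enabled:
        return "disabled"
    try:
        # subprocess.run kills the child and raises if the timeout expires.
        completed = subprocess.run(command, timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        return "timed-out"
    return "passed" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # Placeholder command standing in for a real soak scenario.
    status = run_soak([sys.executable, "-c", "pass"], enabled=True, timeout_seconds=30)
    print(f"soak status: {status}")
    sys.exit(0 if status in ("passed", "disabled") else 1)
```

Whether "disabled" should count as success is a policy choice; the sketch treats it as non-failing but prints the status so it stays visible in the job log.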
