Nightly eval REGRESSION: benchmark 'pipeline' passed ALL trials in the previous run but failed BOTH trials on 2026-06-15.
Error category: [thrash_aborted]
Model: opencode-qwen3-5-35b-a3b-mxfp8 (local, tiers:smoke,core)
Results: /tmp/nightly_eval_20260615 (prev: /tmp/nightly_eval_20260614_rag_on/agent)
It was solid last night, so this is a fresh solid->broken regression — investigate.
Binary info (auto-attached):
ailang version: v0.25.0-71-g32cf1069
git commit: 32cf106
Reported by: nightly-eval via ailang messages