feat(M91): Smart Retry for Divergent Samples#195
Conversation
hlin99-Review-Bot
left a comment
There was a problem hiding this comment.
Review: Request Changes
Idea: ✅ Smart retry to classify divergences as deterministic vs stochastic is a solid addition — reduces false positives in CI and gives actionable signal.
Code issue — CI failing (6 test failures on Python 3.11):
Tests use asyncio.get_event_loop().run_until_complete() which raises RuntimeError: There is no current event loop in thread 'MainThread' on Python ≥3.10 (deprecated) / 3.12+ (removed).
Fix: Replace all instances in tests/test_smart_retry.py:
# Before
result = asyncio.get_event_loop().run_until_complete(
run_smart_retry(report, ...)
)
# After
result = asyncio.run(
run_smart_retry(report, ...)
)This affects tests at lines 128, 136, 157, 181, 203, 225.
Everything else looks clean — module structure, dataclasses, CLI integration, formatting all LGTM. Just fix the async test pattern and CI should go green.
hlin99-Review-BotX
left a comment
There was a problem hiding this comment.
Idea: ✅ Good — Smart retry to classify divergences as deterministic vs stochastic is valuable. Aligns well with the project's accuracy tooling goals.
Code: Needs fix — CI is red. All 6 async tests in tests/test_smart_retry.py fail on Python 3.11 with:
RuntimeError: There is no current event loop in thread 'MainThread'.
The tests use asyncio.get_event_loop().run_until_complete(), which was deprecated in 3.10 and raises on 3.11+. Replace with asyncio.run():
# Before
result = asyncio.get_event_loop().run_until_complete(
run_smart_retry(report, "http://base", "http://target")
)
# After
result = asyncio.run(
run_smart_retry(report, "http://base", "http://target")
)Fix the 6 occurrences, confirm CI passes, then re-request review.
hlin99-Review-Bot
left a comment
There was a problem hiding this comment.
CI green across all Python versions (3.10/3.11/3.12). The asyncio fix looks correct — asyncio.run() is the right replacement. Code, tests, and docs all good. ✅ Approved.
hlin99-Review-BotX
left a comment
There was a problem hiding this comment.
CI green across all Python versions. The asyncio.run() fix is correct — addresses the exact issue I flagged. Code, tests, docs all look good. ✅ Approved.
Adds smart-retry capability that automatically reruns divergent samples with deterministic settings (temperature=0, seed=42) to classify each divergence as 'deterministic' (real bug) or 'stochastic' (sampling noise). - SmartRetryResult and SampleRetryResult dataclasses with JSON export - run_smart_retry() async function reusing existing run_batch() - format_smart_retry() for rich terminal output - CLI: xpyd-acc smart-retry --report --baseline --target - Exit 1 if deterministic divergences found (CI-friendly) - Updates ROADMAP.md: marks M87, M89 as complete, adds M91 - 12 tests covering retry logic, classification, formatting, JSON export Closes #194
…ete with asyncio.run Fixes CI failures on Python 3.11+ where get_event_loop() raises RuntimeError in non-async context. All 6 affected test functions updated.
7c483ed to
af8c371
Compare
Summary
Adds smart-retry capability that automatically reruns divergent samples with deterministic settings (temperature=0, seed=42) to classify each divergence as
deterministic(real bug) orstochastic(sampling noise).Changes
smart_retry.py:SmartRetryResult,SampleRetryResult,run_smart_retry(),format_smart_retry()xpyd-acc smart-retry --report --baseline --target [--json]Closes #194