feat(M91): Smart Retry for Divergent Samples#195

Merged: hlin99 merged 2 commits into main from feat/m91-smart-retry on Apr 6, 2026

hlin99 (Member) commented Apr 6, 2026

Summary

Adds smart-retry capability that automatically reruns divergent samples with deterministic settings (temperature=0, seed=42) to classify each divergence as deterministic (real bug) or stochastic (sampling noise).
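
The classification rule described above can be sketched as follows. This is a minimal illustration, not the code in smart_retry.py: `RetryOutcome`, `fake_generate`, `classify`, and `retry_divergent` are hypothetical names, and the model call is replaced by a local stub so the snippet runs on its own. Only the settings temperature=0 and seed=42 come from the summary.

```python
import asyncio
from dataclasses import dataclass

# Deterministic sampling settings quoted in the summary.
DETERMINISTIC_PARAMS = {"temperature": 0, "seed": 42}


@dataclass
class RetryOutcome:
    """Hypothetical record for one retried sample (not the PR's SampleRetryResult)."""
    sample_id: str
    classification: str  # "deterministic" or "stochastic"


async def fake_generate(endpoint: str, prompt: str, **params) -> str:
    """Stand-in for a model call; a real version would send a request to the endpoint."""
    await asyncio.sleep(0)
    # Pretend the target endpoint has a real bug for prompts containing "bug".
    if endpoint == "target" and "bug" in prompt:
        return prompt.upper()
    return prompt


async def classify(sample_id: str, prompt: str) -> RetryOutcome:
    # Rerun the same prompt on both sides with sampling noise switched off.
    baseline, target = await asyncio.gather(
        fake_generate("baseline", prompt, **DETERMINISTIC_PARAMS),
        fake_generate("target", prompt, **DETERMINISTIC_PARAMS),
    )
    # Still different without sampling noise: treat it as a real divergence.
    label = "deterministic" if baseline != target else "stochastic"
    return RetryOutcome(sample_id, label)


async def retry_divergent(samples: dict[str, str]) -> list[RetryOutcome]:
    # Retry every previously divergent sample concurrently, preserving order.
    return list(await asyncio.gather(*(classify(sid, p) for sid, p in samples.items())))


outcomes = asyncio.run(retry_divergent({"s1": "hello", "s2": "bug here"}))
```

In this sketch a sample whose outputs agree once sampling is pinned is labelled stochastic, on the assumption that the original divergence came from sampling noise.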

Changes

  • smart_retry.py: SmartRetryResult, SampleRetryResult, run_smart_retry(), format_smart_retry()
  • CLI: xpyd-acc smart-retry --report --baseline --target [--json]
  • Exit 1 if deterministic divergences found (CI-friendly)
  • ROADMAP.md: marked M87/M89 complete, added M91
  • 12 tests covering retry logic, classification, formatting, JSON export

Closes #194
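
For readers unfamiliar with the CI-friendly exit convention listed above, here is a rough sketch of how a result object could drive both the JSON export and the process exit status. The class and field names (`SampleOutcome`, `RetrySummary`, `deterministic_count`) are assumptions made for illustration and may not match SmartRetryResult or SampleRetryResult in this PR.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SampleOutcome:
    """Hypothetical per-sample record."""
    sample_id: str
    classification: str  # "deterministic" or "stochastic"


@dataclass
class RetrySummary:
    """Hypothetical aggregate result for one smart-retry run."""
    samples: list[SampleOutcome] = field(default_factory=list)

    @property
    def deterministic_count(self) -> int:
        return sum(1 for s in self.samples if s.classification == "deterministic")

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses; the count is added by hand
        # because properties are not dataclass fields.
        payload = asdict(self)
        payload["deterministic_count"] = self.deterministic_count
        return json.dumps(payload, indent=2)


def exit_code(summary: RetrySummary) -> int:
    # Non-zero only when at least one divergence survived the deterministic rerun.
    return 1 if summary.deterministic_count > 0 else 0


noisy = RetrySummary([SampleOutcome("s1", "stochastic")])
buggy = RetrySummary([SampleOutcome("s1", "stochastic"), SampleOutcome("s2", "deterministic")])
```

A CLI entry point would typically pass the value of `exit_code(...)` to `sys.exit()`, so a pipeline step fails only on deterministic divergences and stays green when every divergence is stochastic.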

hlin99-Review-Bot left a comment

Review: Request Changes

Idea: ✅ Smart retry to classify divergences as deterministic vs stochastic is a solid addition — reduces false positives in CI and gives actionable signal.

Code issue — CI failing (6 test failures on Python 3.11):

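Tests use asyncio.get_event_loop().run_until_complete(), which raises RuntimeError: There is no current event loop in thread 'MainThread' when the main thread has no current event loop set. Relying on get_event_loop() to supply a loop outside a running coroutine has been deprecated since Python 3.10; the function itself has not been removed.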

Fix: Replace all instances in tests/test_smart_retry.py:

# Before
result = asyncio.get_event_loop().run_until_complete(
    run_smart_retry(report, ...)
)

# After
result = asyncio.run(
    run_smart_retry(report, ...)
)

This affects tests at lines 128, 136, 157, 181, 203, 225.

Everything else looks clean — module structure, dataclasses, CLI integration, formatting all LGTM. Just fix the async test pattern and CI should go green.
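
As a self-contained illustration of the suggested pattern: asyncio.run() creates a fresh event loop for each call and closes it afterwards, so a synchronous test does not depend on whatever loop state earlier tests left on the thread. The coroutine `run_smart_retry_stub` below is a hypothetical stand-in; the real run_smart_retry() takes different arguments.

```python
import asyncio


async def run_smart_retry_stub(report: dict) -> dict:
    """Hypothetical stand-in for run_smart_retry(); the real signature may differ."""
    await asyncio.sleep(0)
    return {"retried": len(report.get("divergent", []))}


def test_retry_counts_divergent_samples():
    # asyncio.run() owns the loop for the duration of this one call.
    result = asyncio.run(run_smart_retry_stub({"divergent": ["s1", "s2"]}))
    assert result == {"retried": 2}


def test_retry_handles_empty_report():
    # A second call gets a new loop, regardless of what the first test did.
    result = asyncio.run(run_smart_retry_stub({}))
    assert result == {"retried": 0}


# Run the tests directly so the sketch can be checked without a test runner.
test_retry_counts_divergent_samples()
test_retry_handles_empty_report()
```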

hlin99-Review-BotX left a comment

Idea: ✅ Good — Smart retry to classify divergences as deterministic vs stochastic is valuable. Aligns well with the project's accuracy tooling goals.

Code: Needs fix — CI is red. All 6 async tests in tests/test_smart_retry.py fail on Python 3.11 with:

RuntimeError: There is no current event loop in thread 'MainThread'.

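The tests use asyncio.get_event_loop().run_until_complete(), which depends on a current event loop already being set. That implicit-loop pattern has been deprecated since Python 3.10 and is what fails here on 3.11. Replace with asyncio.run():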

# Before
result = asyncio.get_event_loop().run_until_complete(
    run_smart_retry(report, "http://base", "http://target")
)

# After
result = asyncio.run(
    run_smart_retry(report, "http://base", "http://target")
)

Fix the 6 occurrences, confirm CI passes, then re-request review.

hlin99-Review-Bot left a comment

CI green across all Python versions (3.10/3.11/3.12). The asyncio fix looks correct — asyncio.run() is the right replacement. Code, tests, and docs all good. ✅ Approved.

hlin99-Review-BotX left a comment

CI green across all Python versions. The asyncio.run() fix is correct — addresses the exact issue I flagged. Code, tests, docs all look good. ✅ Approved.

hlin99 added 2 commits April 6, 2026 22:36
Adds smart-retry capability that automatically reruns divergent samples
with deterministic settings (temperature=0, seed=42) to classify each
divergence as 'deterministic' (real bug) or 'stochastic' (sampling noise).

- SmartRetryResult and SampleRetryResult dataclasses with JSON export
- run_smart_retry() async function reusing existing run_batch()
- format_smart_retry() for rich terminal output
- CLI: xpyd-acc smart-retry --report --baseline --target
- Exit 1 if deterministic divergences found (CI-friendly)
- Updates ROADMAP.md: marks M87, M89 as complete, adds M91
- 12 tests covering retry logic, classification, formatting, JSON export

Closes #194
…ete with asyncio.run

Fixes CI failures on Python 3.11+ where get_event_loop() raises RuntimeError
in non-async context. All 6 affected test functions updated.
hlin99 force-pushed the feat/m91-smart-retry branch from 7c483ed to af8c371 April 6, 2026 14:36
hlin99 merged commit fb72043 into main Apr 6, 2026
4 checks passed
hlin99 deleted the feat/m91-smart-retry branch April 6, 2026 14:38