feat: add Forager composite-score auto-exclude#146
Conversation
Complements the existing markout-based ``--auto-exclude`` with a multi-axis health score that captures coin-level pathologies the markout sensor misses: * **Activity** — fill recency, catches dead markets that never accumulate enough flow for the markout window to fire. * **Close quality** — maker-close rate, catches structural taker fallback even when markout looks neutral. * **Cost** — recent dollar-loss per 1K notional, catches gradual bleed. A rolling ``CoinHealthTracker`` records every fill (activity) and every position close (quality + cost) from the WS fill feed and computes a weighted composite in [0, 100]. Strategy ``_check_forager_health()`` samples the score on the existing per-coin run-loop tick; below ``--forager-threshold`` for ``--forager-consecutive`` checks in a row, the coin is paused via the shared ``_coin_cooldown_until`` map (same machinery as ``--auto-exclude`` and ``--loss-streak-limit``). Forager and ``--auto-exclude`` are independent — either may trigger. All thresholds, weights, and formula constants are sourced from ``ForagerConfig``; the tracker contains no hardcoded numeric tunables. The 7 most operationally relevant parameters are exposed as ``--forager-*`` CLI flags, and the 5 internal formula constants (window, idle grace, cost scale, min-closes gate, check interval) are overridable via env vars without bot.py CLI bloat. * New module: ``strategies/coin_health_tracker.py`` (CoinHealthTracker + CloseEvent / CoinHealth dataclasses). * ``strategies/mm_config.py``: new ``ForagerConfig`` dataclass wired into ``MMConfig`` via ``from_legacy_dict()``. * ``strategies/market_making_strategy.py``: instantiates the tracker on startup when ``forager_enabled``, calls ``_check_forager_health`` in the per-coin run loop right after ``_check_auto_exclude``, feeds ``record_fill`` from the in-loop fill detector. * ``ws/fill_feed.py``: routes WS userFills to ``record_fill`` (always) and ``record_close`` (when ``closedPnl != 0``) using ``crossed`` to derive maker/taker. * ``bot.py``: 7 new CLI flags, 12 new keys in ``default_configs`` and ``_STRATEGY_PARAMS``, link Forager into the FillFeed when enabled. * ``validation/strategy_validator.py``: range checks for the new params. * ``README.md``: human-readable section + AI YAML reference block. * Tests: ``test_coin_health_tracker.py`` (28 cases) for scoring math and config-driven tunability; ``test_forager.py`` for strategy integration including throttling, cooldown lifecycle, and co-existence with ``--auto-exclude``. Backward compatible: ``forager_enabled`` defaults to ``false``; with the flag absent the tracker is never instantiated and no fill feed hook fires. ``CoinHealthTracker`` always reads from ``ForagerConfig`` so env-only formula constants tune behaviour without code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
keitaj
left a comment
There was a problem hiding this comment.
Review: Forager composite-score auto-exclude
The implementation matches the design doc closely, with config-driven scoring throughout the tracker and a clean integration with the existing per-coin cooldown machinery. Backward compatibility is well preserved (default disabled, getattr fallbacks for fixtures bypassing __init__). 35 new tests cover both the math and the strategy integration. One should-fix on a hardcoded literal that escaped the "no-hardcode" goal, plus a few nits.
1. Hardcoded 50.0 activity threshold in the quality gate (should-fix)
strategies/market_making_strategy.py L1208-1212:
if (
health.n_closes < cfg.min_closes_for_quality
and health.activity_score > 50.0
):
returnThe design doc states "All thresholds, weights, and formula constants are read from ForagerConfig — the module contains no hardcoded numeric tunables." This literal 50.0 is the only remaining hardcoded score boundary. It also happens to coincide with CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE, which is the actual semantic anchor — "activity above the no-history neutral suggests the coin is warming up; give it more time."
Suggestion: Replace the literal with the named constant to document the connection:
# strategies/market_making_strategy.py
from strategies.coin_health_tracker import CoinHealthTracker
# inside _check_forager_health:
if (
health.n_closes < cfg.min_closes_for_quality
and health.activity_score > CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE
):
returnOr, if a future operator might want to tune it independently of the no-history neutral, expose it as a 6th env-only knob (forager_quality_gate_activity_min) with default 50.0. The named-constant approach is lighter and matches current operational intent.
Test gap: the boundary case activity_score == 50.0 is not covered by test_skips_when_quality_data_insufficient_and_active. Worth adding a parametrised case to lock the inequality semantics.
2. Redundant forager_enabled check in bot.py (nit)
bot.py L519-525:
coin_health_tracker = getattr(self.strategy, '_coin_health_tracker', None)
if (
self.fill_feed is not None
and coin_health_tracker is not None
and self.strategy_config.get('forager_enabled', False)
):The strategy only instantiates _coin_health_tracker when cfg.forager.enabled is True (see MarketMakingStrategy.__init__ L127-130), so coin_health_tracker is not None is already a sufficient guard. The third clause re-checks the source of truth one layer up and could drift if either from_legacy_dict or the dataclass default changes.
Suggestion:
if self.fill_feed is not None and coin_health_tracker is not None:
self.fill_feed.set_coin_health_tracker(coin_health_tracker)
logger.info("[ws] Forager linked to FillFeed (CoinHealthTracker)")3. tracked_coins() is unused outside tests (nit)
strategies/coin_health_tracker.py L181-184:
The diagnostic tracked_coins() returns {coin: n_closes_in_buffer} but no production caller invokes it. The next natural use is a periodic [forager] Summary log similar to the existing [adverse] Summary and [imb-guard] Summary lines (the design doc's ### 観察用ログ section anticipates this).
Not blocking — the method is cheap, harmless, and ready to be plumbed in. But adding the summary log in this PR would round out the observability story without adding much code.
4. Throttle test boundary comment is slightly off (nit)
tests/test_forager.py L177-187 (test_throttle_blocks_repeat_check_within_interval):
The comment says "the third is post-window" but with +30 then +30 the third call lands exactly at the 60s window edge, where now - last_check < cfg.check_interval_seconds evaluates 60.0 < 60.0 → False, so the third call is not throttled. The assertion call_count < 3 happens to pass because the second call (at +30s) is throttled, leaving counts of 1 or 2 — but the comment makes it sound like the third is also throttled.
Suggestion: Either expect exactly call_count == 2 or rewrite the comment to clarify only the second call is suppressed. The strict-inequality semantic is worth pinning down with an explicit assertion.
Verdict: LGTM
No blockers. One should-fix (the hardcoded 50.0 literal in the quality gate) and three nits. The should-fix is a small, isolated change and does not affect the default behaviour with forager_enabled=false (the gate is unreachable in that path). Recommend addressing the literal before enabling Forager in production so the rollout has a clean "no hardcoded tunables" story end-to-end, but the change is acceptable to merge as-is.
🤖 Generated with Claude Code
…nt (#147) The activity-score boundary in ``_check_forager_health`` was a bare ``50.0`` literal, which contradicted the design's "no hardcoded numeric tunables" intent. The value was always meant to track ``CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE`` (the midpoint the tracker emits for unknown coins, also 50.0); pin the dependency explicitly so adjusting the neutral score in one place updates both. Also add two parametrised tests that pin the strict-inequality boundary semantics: * ``activity == _NO_HISTORY_NEUTRAL_SCORE`` → gate does not engage, cooldown fires. * ``activity > _NO_HISTORY_NEUTRAL_SCORE + epsilon`` → gate engages, cooldown is suppressed. These regression-pin the boundary so a future refactor of ``>`` to ``>=`` is detected by CI. Addresses the should-fix from PR #146 review. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
--auto-exclude. Below a configurable composite threshold for N consecutive checks the coin is paused on the shared cooldown map.--forageropt-in). Both Forager and--auto-excludemay run side-by-side and either may trigger.Changes
strategies/coin_health_tracker.py(new):CoinHealthTrackerwithrecord_fill/record_closeandget_health()returning aCoinHealthdataclass. All scoring formulas read tunables fromForagerConfig— no hardcoded literals.strategies/mm_config.py: newForagerConfigdataclass wired intoMMConfigviafrom_legacy_dict(). Validation in__post_init__.strategies/market_making_strategy.py: instantiates the tracker on startup whenforager_enabled, calls_check_forager_healthin the per-coin run loop right after_check_auto_exclude, feedsrecord_fillfrom the in-loop fill detector. Defensive against fixtures that bypass__init__.ws/fill_feed.py: routes WS userFills torecord_fill(always) andrecord_close(whenclosedPnl != 0) derivingis_makerfrom thecrossedfield.bot.py: 7--forager-*CLI flags (enabled / threshold / consecutive / cooldown / 3 weights), 12 new keys indefault_configsand_STRATEGY_PARAMS, links Forager into the FillFeed when enabled.validation/strategy_validator.py: range checks for the new params.README.md: human-readable section under## Strategies+ AI YAML reference block.test_coin_health_tracker.py(28 cases) for scoring math + config-driven tunability;test_forager.py(15 cases) for strategy integration including throttling, cooldown lifecycle, and co-existence with--auto-exclude.Parameter exposure
7 CLI flags for the operationally tunable knobs (threshold, consecutive count, cooldown, weights). The 5 internal formula constants — rolling window, idle grace, cost scale, min-closes gate, check interval — are overridable via env vars without adding more CLI flags. This keeps
bot.pyfrom continuing to grow while preserving full configurability per the project policy of reading every tunable throughconfig.get('key', default).Test plan
test_coin_health_tracker.py+test_forager.py)flake8 . --max-line-length=120passespytest tests/ -qpasses (1004 tests, 0 failures)forager_enabled=false) preserves prior behaviour — tracker is never instantiated, fill feed hook does not fire🤖 Generated with Claude Code