Skip to content

feat: add Forager composite-score auto-exclude#146

Merged
keitaj merged 1 commit into
mainfrom
feat/forager-coin-health
May 4, 2026
Merged

feat: add Forager composite-score auto-exclude#146
keitaj merged 1 commit into
mainfrom
feat/forager-coin-health

Conversation

@keitaj
Copy link
Copy Markdown
Owner

@keitaj keitaj commented May 4, 2026

Summary

  • Add a multi-axis coin health score (activity / close-quality / cost) that complements the existing markout-based --auto-exclude. Below a configurable composite threshold for N consecutive checks the coin is paused on the shared cooldown map.
  • Catches three failure modes the markout sensor misses: dead markets (no fills), structural taker fallback (low maker-close rate even with neutral markout), and gradual cost bleed.
  • Default disabled (--forager opt-in). Both Forager and --auto-exclude may run side-by-side and either may trigger.

Changes

  • strategies/coin_health_tracker.py (new): CoinHealthTracker with record_fill / record_close and get_health() returning a CoinHealth dataclass. All scoring formulas read tunables from ForagerConfig — no hardcoded literals.
  • strategies/mm_config.py: new ForagerConfig dataclass wired into MMConfig via from_legacy_dict(). Validation in __post_init__.
  • strategies/market_making_strategy.py: instantiates the tracker on startup when forager_enabled, calls _check_forager_health in the per-coin run loop right after _check_auto_exclude, feeds record_fill from the in-loop fill detector. Defensive against fixtures that bypass __init__.
  • ws/fill_feed.py: routes WS userFills to record_fill (always) and record_close (when closedPnl != 0) deriving is_maker from the crossed field.
  • bot.py: 7 --forager-* CLI flags (enabled / threshold / consecutive / cooldown / 3 weights), 12 new keys in default_configs and _STRATEGY_PARAMS, links Forager into the FillFeed when enabled.
  • validation/strategy_validator.py: range checks for the new params.
  • README.md: human-readable section under ## Strategies + AI YAML reference block.
  • Tests: test_coin_health_tracker.py (28 cases) for scoring math + config-driven tunability; test_forager.py (15 cases) for strategy integration including throttling, cooldown lifecycle, and co-existence with --auto-exclude.

Parameter exposure

7 CLI flags for the operationally tunable knobs (threshold, consecutive count, cooldown, weights). The 5 internal formula constants — rolling window, idle grace, cost scale, min-closes gate, check interval — are overridable via env vars without adding more CLI flags. This keeps bot.py from continuing to grow while preserving full configurability per the project policy of reading every tunable through config.get('key', default).

Test plan

  • Unit tests added: 35 new cases (test_coin_health_tracker.py + test_forager.py)
  • flake8 . --max-line-length=120 passes
  • pytest tests/ -q passes (1004 tests, 0 failures)
  • No regressions in existing tests
  • Default config (forager_enabled=false) preserves prior behaviour — tracker is never instantiated, fill feed hook does not fire

🤖 Generated with Claude Code

Complements the existing markout-based ``--auto-exclude`` with a
multi-axis health score that captures coin-level pathologies the
markout sensor misses:

* **Activity** — fill recency, catches dead markets that never accumulate
  enough flow for the markout window to fire.
* **Close quality** — maker-close rate, catches structural taker fallback
  even when markout looks neutral.
* **Cost** — recent dollar-loss per 1K notional, catches gradual bleed.

A rolling ``CoinHealthTracker`` records every fill (activity) and every
position close (quality + cost) from the WS fill feed and computes a
weighted composite in [0, 100]. Strategy ``_check_forager_health()``
samples the score on the existing per-coin run-loop tick; below
``--forager-threshold`` for ``--forager-consecutive`` checks in a row,
the coin is paused via the shared ``_coin_cooldown_until`` map (same
machinery as ``--auto-exclude`` and ``--loss-streak-limit``). Forager
and ``--auto-exclude`` are independent — either may trigger.

All thresholds, weights, and formula constants are sourced from
``ForagerConfig``; the tracker contains no hardcoded numeric tunables.
The 7 most operationally relevant parameters are exposed as
``--forager-*`` CLI flags, and the 5 internal formula constants
(window, idle grace, cost scale, min-closes gate, check interval) are
overridable via env vars without bot.py CLI bloat.

* New module: ``strategies/coin_health_tracker.py`` (CoinHealthTracker
  + CloseEvent / CoinHealth dataclasses).
* ``strategies/mm_config.py``: new ``ForagerConfig`` dataclass wired
  into ``MMConfig`` via ``from_legacy_dict()``.
* ``strategies/market_making_strategy.py``: instantiates the tracker
  on startup when ``forager_enabled``, calls ``_check_forager_health``
  in the per-coin run loop right after ``_check_auto_exclude``,
  feeds ``record_fill`` from the in-loop fill detector.
* ``ws/fill_feed.py``: routes WS userFills to ``record_fill`` (always)
  and ``record_close`` (when ``closedPnl != 0``) using ``crossed`` to
  derive maker/taker.
* ``bot.py``: 7 new CLI flags, 12 new keys in ``default_configs`` and
  ``_STRATEGY_PARAMS``, link Forager into the FillFeed when enabled.
* ``validation/strategy_validator.py``: range checks for the new params.
* ``README.md``: human-readable section + AI YAML reference block.
* Tests: ``test_coin_health_tracker.py`` (28 cases) for scoring math
  and config-driven tunability; ``test_forager.py`` for strategy
  integration including throttling, cooldown lifecycle, and
  co-existence with ``--auto-exclude``.

Backward compatible: ``forager_enabled`` defaults to ``false``; with
the flag absent the tracker is never instantiated and no fill feed
hook fires. ``CoinHealthTracker`` always reads from ``ForagerConfig``
so env-only formula constants tune behaviour without code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@keitaj keitaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Forager composite-score auto-exclude

The implementation matches the design doc closely, with config-driven scoring throughout the tracker and a clean integration with the existing per-coin cooldown machinery. Backward compatibility is well preserved (default disabled, getattr fallbacks for fixtures bypassing __init__). 35 new tests cover both the math and the strategy integration. One should-fix on a hardcoded literal that escaped the "no-hardcode" goal, plus a few nits.


1. Hardcoded 50.0 activity threshold in the quality gate (should-fix)

strategies/market_making_strategy.py L1208-1212:

if (
    health.n_closes < cfg.min_closes_for_quality
    and health.activity_score > 50.0
):
    return

The design doc states "All thresholds, weights, and formula constants are read from ForagerConfig — the module contains no hardcoded numeric tunables." This literal 50.0 is the only remaining hardcoded score boundary. It also happens to coincide with CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE, which is the actual semantic anchor — "activity above the no-history neutral suggests the coin is warming up; give it more time."

Suggestion: Replace the literal with the named constant to document the connection:

# strategies/market_making_strategy.py
from strategies.coin_health_tracker import CoinHealthTracker

# inside _check_forager_health:
if (
    health.n_closes < cfg.min_closes_for_quality
    and health.activity_score > CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE
):
    return

Or, if a future operator might want to tune it independently of the no-history neutral, expose it as a 6th env-only knob (forager_quality_gate_activity_min) with default 50.0. The named-constant approach is lighter and matches current operational intent.

Test gap: the boundary case activity_score == 50.0 is not covered by test_skips_when_quality_data_insufficient_and_active. Worth adding a parametrised case to lock the inequality semantics.


2. Redundant forager_enabled check in bot.py (nit)

bot.py L519-525:

coin_health_tracker = getattr(self.strategy, '_coin_health_tracker', None)
if (
    self.fill_feed is not None
    and coin_health_tracker is not None
    and self.strategy_config.get('forager_enabled', False)
):

The strategy only instantiates _coin_health_tracker when cfg.forager.enabled is True (see MarketMakingStrategy.__init__ L127-130), so coin_health_tracker is not None is already a sufficient guard. The third clause re-checks the source of truth one layer up and could drift if either from_legacy_dict or the dataclass default changes.

Suggestion:

if self.fill_feed is not None and coin_health_tracker is not None:
    self.fill_feed.set_coin_health_tracker(coin_health_tracker)
    logger.info("[ws] Forager linked to FillFeed (CoinHealthTracker)")

3. tracked_coins() is unused outside tests (nit)

strategies/coin_health_tracker.py L181-184:

The diagnostic tracked_coins() returns {coin: n_closes_in_buffer} but no production caller invokes it. The next natural use is a periodic [forager] Summary log similar to the existing [adverse] Summary and [imb-guard] Summary lines (the design doc's ### 観察用ログ section anticipates this).

Not blocking — the method is cheap, harmless, and ready to be plumbed in. But adding the summary log in this PR would round out the observability story without adding much code.


4. Throttle test boundary comment is slightly off (nit)

tests/test_forager.py L177-187 (test_throttle_blocks_repeat_check_within_interval):

The comment says "the third is post-window" but with +30 then +30 the third call lands exactly at the 60s window edge, where now - last_check < cfg.check_interval_seconds evaluates 60.0 < 60.0 → False, so the third call is not throttled. The assertion call_count < 3 happens to pass because the second call (at +30s) is throttled, leaving counts of 1 or 2 — but the comment makes it sound like the third is also throttled.

Suggestion: Either expect exactly call_count == 2 or rewrite the comment to clarify only the second call is suppressed. The strict-inequality semantic is worth pinning down with an explicit assertion.


Verdict: LGTM

No blockers. One should-fix (the hardcoded 50.0 literal in the quality gate) and three nits. The should-fix is a small, isolated change and does not affect the default behaviour with forager_enabled=false (the gate is unreachable in that path). Recommend addressing the literal before enabling Forager in production so the rollout has a clean "no hardcoded tunables" story end-to-end, but the change is acceptable to merge as-is.

🤖 Generated with Claude Code

@keitaj keitaj merged commit 002ce24 into main May 4, 2026
6 checks passed
@keitaj keitaj deleted the feat/forager-coin-health branch May 4, 2026 01:42
keitaj added a commit that referenced this pull request May 4, 2026
…nt (#147)

The activity-score boundary in ``_check_forager_health`` was a bare
``50.0`` literal, which contradicted the design's "no hardcoded
numeric tunables" intent. The value was always meant to track
``CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE`` (the midpoint the
tracker emits for unknown coins, also 50.0); pin the dependency
explicitly so adjusting the neutral score in one place updates both.

Also add two parametrised tests that pin the strict-inequality
boundary semantics:

* ``activity == _NO_HISTORY_NEUTRAL_SCORE`` → gate does not engage,
  cooldown fires.
* ``activity > _NO_HISTORY_NEUTRAL_SCORE + epsilon`` → gate engages,
  cooldown is suppressed.

These regression-pin the boundary so a future refactor of ``>`` to
``>=`` is detected by CI.

Addresses the should-fix from PR #146 review.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant