feat: add Forager composite-score auto-exclude by keitaj · Pull Request #146 · keitaj/hyperliquid-bot

keitaj · 2026-05-04T01:39:29Z

Summary

Add a multi-axis coin health score (activity / close-quality / cost) that complements the existing markout-based --auto-exclude. Below a configurable composite threshold for N consecutive checks the coin is paused on the shared cooldown map.
Catches three failure modes the markout sensor misses: dead markets (no fills), structural taker fallback (low maker-close rate even with neutral markout), and gradual cost bleed.
Default disabled (--forager opt-in). Both Forager and --auto-exclude may run side-by-side and either may trigger.

Changes

strategies/coin_health_tracker.py (new): CoinHealthTracker with record_fill / record_close and get_health() returning a CoinHealth dataclass. All scoring formulas read tunables from ForagerConfig — no hardcoded literals.
strategies/mm_config.py: new ForagerConfig dataclass wired into MMConfig via from_legacy_dict(). Validation in __post_init__.
strategies/market_making_strategy.py: instantiates the tracker on startup when forager_enabled, calls _check_forager_health in the per-coin run loop right after _check_auto_exclude, feeds record_fill from the in-loop fill detector. Defensive against fixtures that bypass __init__.
ws/fill_feed.py: routes WS userFills to record_fill (always) and record_close (when closedPnl != 0) deriving is_maker from the crossed field.
bot.py: 7 --forager-* CLI flags (enabled / threshold / consecutive / cooldown / 3 weights), 12 new keys in default_configs and _STRATEGY_PARAMS, links Forager into the FillFeed when enabled.
validation/strategy_validator.py: range checks for the new params.
README.md: human-readable section under ## Strategies + AI YAML reference block.
Tests: test_coin_health_tracker.py (28 cases) for scoring math + config-driven tunability; test_forager.py (15 cases) for strategy integration including throttling, cooldown lifecycle, and co-existence with --auto-exclude.

Parameter exposure

7 CLI flags for the operationally tunable knobs (threshold, consecutive count, cooldown, weights). The 5 internal formula constants — rolling window, idle grace, cost scale, min-closes gate, check interval — are overridable via env vars without adding more CLI flags. This keeps bot.py from continuing to grow while preserving full configurability per the project policy of reading every tunable through config.get('key', default).

Test plan

Unit tests added: 35 new cases (test_coin_health_tracker.py + test_forager.py)
flake8 . --max-line-length=120 passes
pytest tests/ -q passes (1004 tests, 0 failures)
No regressions in existing tests
Default config (forager_enabled=false) preserves prior behaviour — tracker is never instantiated, fill feed hook does not fire

🤖 Generated with Claude Code

Complements the existing markout-based ``--auto-exclude`` with a multi-axis health score that captures coin-level pathologies the markout sensor misses: * **Activity** — fill recency, catches dead markets that never accumulate enough flow for the markout window to fire. * **Close quality** — maker-close rate, catches structural taker fallback even when markout looks neutral. * **Cost** — recent dollar-loss per 1K notional, catches gradual bleed. A rolling ``CoinHealthTracker`` records every fill (activity) and every position close (quality + cost) from the WS fill feed and computes a weighted composite in [0, 100]. Strategy ``_check_forager_health()`` samples the score on the existing per-coin run-loop tick; below ``--forager-threshold`` for ``--forager-consecutive`` checks in a row, the coin is paused via the shared ``_coin_cooldown_until`` map (same machinery as ``--auto-exclude`` and ``--loss-streak-limit``). Forager and ``--auto-exclude`` are independent — either may trigger. All thresholds, weights, and formula constants are sourced from ``ForagerConfig``; the tracker contains no hardcoded numeric tunables. The 7 most operationally relevant parameters are exposed as ``--forager-*`` CLI flags, and the 5 internal formula constants (window, idle grace, cost scale, min-closes gate, check interval) are overridable via env vars without bot.py CLI bloat. * New module: ``strategies/coin_health_tracker.py`` (CoinHealthTracker + CloseEvent / CoinHealth dataclasses). * ``strategies/mm_config.py``: new ``ForagerConfig`` dataclass wired into ``MMConfig`` via ``from_legacy_dict()``. * ``strategies/market_making_strategy.py``: instantiates the tracker on startup when ``forager_enabled``, calls ``_check_forager_health`` in the per-coin run loop right after ``_check_auto_exclude``, feeds ``record_fill`` from the in-loop fill detector. * ``ws/fill_feed.py``: routes WS userFills to ``record_fill`` (always) and ``record_close`` (when ``closedPnl != 0``) using ``crossed`` to derive maker/taker. * ``bot.py``: 7 new CLI flags, 12 new keys in ``default_configs`` and ``_STRATEGY_PARAMS``, link Forager into the FillFeed when enabled. * ``validation/strategy_validator.py``: range checks for the new params. * ``README.md``: human-readable section + AI YAML reference block. * Tests: ``test_coin_health_tracker.py`` (28 cases) for scoring math and config-driven tunability; ``test_forager.py`` for strategy integration including throttling, cooldown lifecycle, and co-existence with ``--auto-exclude``. Backward compatible: ``forager_enabled`` defaults to ``false``; with the flag absent the tracker is never instantiated and no fill feed hook fires. ``CoinHealthTracker`` always reads from ``ForagerConfig`` so env-only formula constants tune behaviour without code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

keitaj

Review: Forager composite-score auto-exclude

The implementation matches the design doc closely, with config-driven scoring throughout the tracker and a clean integration with the existing per-coin cooldown machinery. Backward compatibility is well preserved (default disabled, getattr fallbacks for fixtures bypassing __init__). 35 new tests cover both the math and the strategy integration. One should-fix on a hardcoded literal that escaped the "no-hardcode" goal, plus a few nits.

1. Hardcoded `50.0` activity threshold in the quality gate (should-fix)

strategies/market_making_strategy.py L1208-1212:

if (
    health.n_closes < cfg.min_closes_for_quality
    and health.activity_score > 50.0
):
    return

The design doc states "All thresholds, weights, and formula constants are read from ForagerConfig — the module contains no hardcoded numeric tunables." This literal 50.0 is the only remaining hardcoded score boundary. It also happens to coincide with CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE, which is the actual semantic anchor — "activity above the no-history neutral suggests the coin is warming up; give it more time."

Suggestion: Replace the literal with the named constant to document the connection:

# strategies/market_making_strategy.py
from strategies.coin_health_tracker import CoinHealthTracker

# inside _check_forager_health:
if (
    health.n_closes < cfg.min_closes_for_quality
    and health.activity_score > CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE
):
    return

Or, if a future operator might want to tune it independently of the no-history neutral, expose it as a 6th env-only knob (forager_quality_gate_activity_min) with default 50.0. The named-constant approach is lighter and matches current operational intent.

Test gap: the boundary case activity_score == 50.0 is not covered by test_skips_when_quality_data_insufficient_and_active. Worth adding a parametrised case to lock the inequality semantics.

2. Redundant `forager_enabled` check in `bot.py` (nit)

bot.py L519-525:

coin_health_tracker = getattr(self.strategy, '_coin_health_tracker', None)
if (
    self.fill_feed is not None
    and coin_health_tracker is not None
    and self.strategy_config.get('forager_enabled', False)
):

The strategy only instantiates _coin_health_tracker when cfg.forager.enabled is True (see MarketMakingStrategy.__init__ L127-130), so coin_health_tracker is not None is already a sufficient guard. The third clause re-checks the source of truth one layer up and could drift if either from_legacy_dict or the dataclass default changes.

Suggestion:

if self.fill_feed is not None and coin_health_tracker is not None:
    self.fill_feed.set_coin_health_tracker(coin_health_tracker)
    logger.info("[ws] Forager linked to FillFeed (CoinHealthTracker)")

3. `tracked_coins()` is unused outside tests (nit)

strategies/coin_health_tracker.py L181-184:

The diagnostic tracked_coins() returns {coin: n_closes_in_buffer} but no production caller invokes it. The next natural use is a periodic [forager] Summary log similar to the existing [adverse] Summary and [imb-guard] Summary lines (the design doc's ### 観察用ログ section anticipates this).

Not blocking — the method is cheap, harmless, and ready to be plumbed in. But adding the summary log in this PR would round out the observability story without adding much code.

4. Throttle test boundary comment is slightly off (nit)

tests/test_forager.py L177-187 (test_throttle_blocks_repeat_check_within_interval):

The comment says "the third is post-window" but with +30 then +30 the third call lands exactly at the 60s window edge, where now - last_check < cfg.check_interval_seconds evaluates 60.0 < 60.0 → False, so the third call is not throttled. The assertion call_count < 3 happens to pass because the second call (at +30s) is throttled, leaving counts of 1 or 2 — but the comment makes it sound like the third is also throttled.

Suggestion: Either expect exactly call_count == 2 or rewrite the comment to clarify only the second call is suppressed. The strict-inequality semantic is worth pinning down with an explicit assertion.

Verdict: LGTM

No blockers. One should-fix (the hardcoded 50.0 literal in the quality gate) and three nits. The should-fix is a small, isolated change and does not affect the default behaviour with forager_enabled=false (the gate is unreachable in that path). Recommend addressing the literal before enabling Forager in production so the rollout has a clean "no hardcoded tunables" story end-to-end, but the change is acceptable to merge as-is.

🤖 Generated with Claude Code

…nt (#147) The activity-score boundary in ``_check_forager_health`` was a bare ``50.0`` literal, which contradicted the design's "no hardcoded numeric tunables" intent. The value was always meant to track ``CoinHealthTracker._NO_HISTORY_NEUTRAL_SCORE`` (the midpoint the tracker emits for unknown coins, also 50.0); pin the dependency explicitly so adjusting the neutral score in one place updates both. Also add two parametrised tests that pin the strict-inequality boundary semantics: * ``activity == _NO_HISTORY_NEUTRAL_SCORE`` → gate does not engage, cooldown fires. * ``activity > _NO_HISTORY_NEUTRAL_SCORE + epsilon`` → gate engages, cooldown is suppressed. These regression-pin the boundary so a future refactor of ``>`` to ``>=`` is detected by CI. Addresses the should-fix from PR #146 review. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

keitaj commented May 4, 2026

View reviewed changes

keitaj merged commit 002ce24 into main May 4, 2026
6 checks passed

keitaj deleted the feat/forager-coin-health branch May 4, 2026 01:42

keitaj mentioned this pull request May 4, 2026

fix: replace Forager quality-gate hardcoded literal with named constant #147

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add Forager composite-score auto-exclude#146

feat: add Forager composite-score auto-exclude#146
keitaj merged 1 commit into
mainfrom
feat/forager-coin-health

keitaj commented May 4, 2026

Uh oh!

keitaj left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

keitaj commented May 4, 2026

Summary

Changes

Parameter exposure

Test plan

Uh oh!

keitaj left a comment

Choose a reason for hiding this comment

Review: Forager composite-score auto-exclude

1. Hardcoded 50.0 activity threshold in the quality gate (should-fix)

2. Redundant forager_enabled check in bot.py (nit)

3. tracked_coins() is unused outside tests (nit)

4. Throttle test boundary comment is slightly off (nit)

Verdict: LGTM

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. Hardcoded `50.0` activity threshold in the quality gate (should-fix)

2. Redundant `forager_enabled` check in `bot.py` (nit)

3. `tracked_coins()` is unused outside tests (nit)