feat(sentiment): score news sentiment with capable tier (sonnet-4.6) #21

Open
Zeeechenn wants to merge 1 commit into feat/news-sentiment-pack from feat/sentiment-model-tier
Conversation

@Zeeechenn

Owner

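Stacked on #20 (base = feat/news-sentiment-pack). Split out of the news-layer PR so it can be reviewed and merged on its own: it changes model and cost behavior, which is a different risk class from the purely defensive hardening of the news layer. Once #20 merges, this PR automatically retargets to main.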

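Contents (4 files only)

  • config.py: adds sentiment_model_tier (default capable → claude-sonnet-4-6; every provider maps to it).
  • analyze_news + m27 backfill: model_tier now reads from config instead of the hardcoded fast.
  • Tests: assert that sentiment is scored at the configured tier.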
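The change above, reading the tier from config instead of hardcoding it, can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Settings`, `score_sentiment`, and the `call_llm` callable are hypothetical names; only `sentiment_model_tier`, `model_tier`, and the `capable` / `fast` tier names come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Settings:
    # Name and default taken from the PR description; the class itself is hypothetical.
    sentiment_model_tier: str = "capable"


def score_sentiment(
    titles: Sequence[str],
    settings: Settings,
    call_llm: Callable[..., float],
) -> float:
    # Previously the tier was the literal "fast"; here it comes from config.
    return call_llm(list(titles), model_tier=settings.sentiment_model_tier)


# A stand-in LLM client that records which tier it was asked for.
seen_tiers = []


def fake_llm(titles, model_tier):
    seen_tiers.append(model_tier)
    return 0.0


score_sentiment(["Acme beats earnings"], Settings(), fake_llm)
score_sentiment(["Acme beats earnings"], Settings(sentiment_model_tier="fast"), fake_llm)
```

A test in this shape checks the recorded tier rather than the score, which matches the intent of "assert that sentiment is scored at the configured tier".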

Rationale

2026-06-15 clean single-provider OOS: sentiment IC 0.0735 @ sonnet-4.6 vs ~0.020 @ fast/Codex. Scoring-model quality is the biggest lever for this signal.

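⚠️ Enabling in production requires env configuration (.env is not committed)

  • AI_PROVIDER=anthropic (paid cloud API, stable, suited to the daily batch of 88 stocks); or
  • AI_PROVIDER=local_cli + LOCAL_CLI_PREFER_CODEX=false (uses the claude CLI / CC subscription, so no API fees, but batch runs risk being cut off by the daily quota).
  • If neither is set, local_cli still defaults to Codex and this change has no effect.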
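The env rules above can be condensed into a small pre-flight check. This is a sketch under stated assumptions, not code from the repository: the function name is hypothetical, an unset AI_PROVIDER is assumed to mean local_cli (inferred from the last bullet), LOCAL_CLI_PREFER_CODEX is assumed to default to true, and any non-local_cli provider is assumed to honor the tier.

```python
from typing import Mapping


def sentiment_tier_is_honored(env: Mapping[str, str]) -> bool:
    """Return True if the configured sentiment tier would actually be used."""
    # Assumption: an unset AI_PROVIDER behaves like local_cli.
    provider = env.get("AI_PROVIDER", "local_cli").strip().lower()
    if provider != "local_cli":
        # Assumption: every non-local_cli provider honors the tier.
        return True
    # Assumption: Codex is preferred unless the flag is explicitly "false".
    raw = env.get("LOCAL_CLI_PREFER_CODEX", "true").strip().lower()
    prefer_codex = raw != "false"
    # On the Codex path the model choice is ignored, so the change has no effect.
    return not prefer_codex
```

In practice this would be called with `os.environ` before a production batch, so a misconfigured run fails loudly instead of silently scoring on the Codex path.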

CI: runs with the PR. The Security/dependency audit failure comes from dependency CVEs such as cryptography 48.0.0→48.0.1 and is unrelated to this PR.

🤖 Generated with Claude Code

…titles from LLM input

Acts on the 2026-06-15 clean single-provider OOS: sentiment IC measured 0.0735
under sonnet-4.6 vs ~0.020 under the fast/Codex tier — scoring-model quality is
the dominant lever for this signal.

- config: sentiment_model_tier (default "capable" → claude-sonnet-4-6 on every
  provider). analyze_news + the m27 backfill tool now score at this tier instead
  of the hardcoded "fast". NOTE: with AI_PROVIDER=local_cli also set
  LOCAL_CLI_PREFER_CODEX=false, else the Codex path ignores the model.
- analyze_news: the company-evidence check is now a full filter — only
  company-specific titles (after market-flow + alias relevance) are sent to the
  LLM and used for the cache key; a window with none returns neutral and skips
  the call.
- backtest news_cache: resolves stock name+code aliases from Stock metadata and
  passes them through so the relevance filter applies on that path too.
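The "full filter" behavior described in the list above might look roughly like the following. It is a simplified sketch, not the PR's implementation: all function names are hypothetical, a plain case-insensitive alias substring match stands in for the real market-flow + alias relevance logic, 0.0 is assumed to be the neutral score, and the SHA-256 cache key is an illustrative choice.

```python
import hashlib
from typing import Callable, List, Sequence


def company_specific_titles(titles: Sequence[str], aliases: Sequence[str]) -> List[str]:
    # Stand-in relevance check: keep a title only if it mentions some alias.
    lowered = [a.lower() for a in aliases if a]
    return [t for t in titles if any(a in t.lower() for a in lowered)]


def cache_key(relevant: Sequence[str]) -> str:
    # The key is derived from the filtered titles only, so unrelated
    # market-wide headlines do not change it.
    return hashlib.sha256("\n".join(relevant).encode("utf-8")).hexdigest()


def analyze_window(
    titles: Sequence[str],
    aliases: Sequence[str],
    call_llm: Callable[[List[str]], float],
) -> float:
    relevant = company_specific_titles(titles, aliases)
    if not relevant:
        # No company-specific evidence: return neutral and skip the LLM call.
        return 0.0
    return call_llm(relevant)


# A stand-in LLM client that records what it was sent.
sent_to_llm = []


def fake_llm(titles):
    sent_to_llm.append(titles)
    return 0.5


no_evidence = analyze_window(["Market rallies on rate hopes"], ["Acme", "600000"], fake_llm)
calls_after_empty_window = len(sent_to_llm)
with_evidence = analyze_window(
    ["Acme beats earnings", "Market rallies on rate hopes"], ["Acme", "600000"], fake_llm
)
```

Passing the same alias list (stock name plus code) on the backtest path is what lets this one filter apply there too.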

Verification: full suite 1225 passed / 6 skipped; ruff + mypy clean. New tests
assert only company-specific titles reach the LLM, the backtest path forwards
aliases, and sentiment scores at the configured tier.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>