feat(sentiment): harden news layer — market-flow filter, evidence floor, event-override off by default#20
Open
Zeeechenn wants to merge 1 commit into
Owner
Author
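Clean single-provider OOS re-test confirmation (2026-06-15): all 387 sentiment windows in the full test2 pool were re-scored with Sonnet 4.6 as the sole provider (on a DB copy; the production database was not touched; 372/387 covered), removing the earlier Codex + 4.6 mixed-provider confound:
Caveats: same time window, 25 IC days, ICIR still < 0.4, exploratory. Merging and deployment still need your sign-off.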
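For readers unfamiliar with the ICIR figure quoted above, here is a minimal sketch of the usual definition: the mean of the daily IC values divided by their standard deviation. The function name `icir` and the choice of sample standard deviation are assumptions for illustration; the diagnostic script in this repository may compute it differently (for example with population standard deviation or annualization).

```python
from statistics import mean, stdev


def icir(daily_ics: list[float]) -> float:
    """Information-coefficient information ratio: mean(IC) / std(IC).

    Hypothetical helper, not the project's implementation.
    Uses the sample standard deviation (n - 1 denominator).
    """
    if len(daily_ics) < 2:
        raise ValueError("need at least two daily IC values")
    spread = stdev(daily_ics)
    if spread == 0:
        raise ValueError("daily IC values have zero variance")
    return mean(daily_ics) / spread
```

With only 25 IC days the ratio is a noisy estimate, which is consistent with the comment labelling the result exploratory.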
df122c0 to 477ad33
…oor, event-override off by default
Follow-up to the 2026-06-15 M27 sentiment IC diagnosis. The event_taxonomy
0.65 override was measured net-negative for IC across every variant tested
(event delta -0.0027 / -0.0146 / -0.0280), and noisy non-company headlines
drove individual-stock whipsaws (兆易 603986: 2 sector-IPO headlines, 0 company
news -> -71.3 sentiment).
Changes:
- event_taxonomy: market/fund-flow/list headlines are filtered before
classification so they cannot misfire as hard corporate events (for example,
"下滑" inside "市场下滑"). Adds is_market_flow / is_company_specific /
company_specific_titles helpers.
- event_score: the 0.65 override blend is now gated by
settings.sentiment_event_override_enabled (default OFF). Matched event types
are still reported for observability; raw sentiment is kept. Re-enableable for
a clean single-provider OOS re-test.
- sentiment.analyze_news: evidence floor returns neutral 0 and skips the LLM
when a window has no company-specific headline. Mixed windows now send only
company-specific titles into the LLM prompt/cache/event scoring.
- postmarket and backtest news-cache paths pass stock name+symbol aliases when
available, so cross-company contamination filtering is not production-only.
- m27_alpha_diagnostic --event-ab forces the override ON so it keeps measuring
the override's counterfactual IC effect independent of the production default.
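
The market-flow filter and the evidence floor listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the helper names match the ones the commit mentions, but their signatures, the keyword list `MARKET_FLOW_MARKERS`, and the `score_with_llm` callback are hypothetical.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical marker list; the real event_taxonomy patterns are not shown in this PR.
MARKET_FLOW_MARKERS = ("市场", "板块", "资金流", "榜单", "IPO")


def is_market_flow(title: str) -> bool:
    """True when a headline looks like market, fund-flow, or list news."""
    return any(marker in title for marker in MARKET_FLOW_MARKERS)


def is_company_specific(title: str, company_aliases: list[str]) -> bool:
    """True when the headline names the company and is not market-flow noise."""
    if is_market_flow(title):
        return False
    return any(alias in title for alias in company_aliases if alias)


def company_specific_titles(titles: list[str], company_aliases: list[str]) -> list[str]:
    """Keep only the headlines that pass is_company_specific."""
    return [t for t in titles if is_company_specific(t, company_aliases)]


def analyze_news(
    titles: list[str],
    company_aliases: list[str],
    score_with_llm: Callable[[list[str]], float],
) -> float:
    """Evidence floor: return neutral 0 without calling the LLM when no
    company-specific headline survives filtering; otherwise score only
    the surviving headlines."""
    kept = company_specific_titles(titles, company_aliases)
    if not kept:
        return 0.0
    return score_with_llm(kept)
```

In this sketch a window holding only sector-IPO or market headlines yields 0.0 and never reaches the scorer, while a mixed window passes just the headlines that name one of the aliases. A substring marker list is crude (it would also drop a company headline that happens to contain "市场"), so a real filter presumably needs finer rules.
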
Verification:
- PYTHONPATH=. pytest -q tests/test_news_sentiment_pack.py tests/test_m27_alpha_event_universe.py tests/test_m27_m28_integration.py tests/test_tavily_news.py tests/test_sentiment_cache.py tests/test_llm_runtime_provider.py
-> 46 passed / 1 skipped
- make verify PYTHON=/Users/zeeechenn/mingcang/.venv/bin/python
-> ruff clean, mypy clean, backend pytest 1224 passed / 6 skipped; local
frontend step not runnable in this linked worktree because frontend
node_modules is absent
- GitHub Actions on PR #20 after force-push -> backend lint/typecheck, backend
tests, frontend test/build, and security/dependency audit all passed for both
push and pull_request runs
This is a defensive change (removes a measured drag + adds robustness); it does
not manufacture new sentiment alpha. Production adoption for live test2 signals
should wait for a clean single-provider OOS confirmation (epoch reset).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
477ad33 to 97bbbbd
Owner
Author
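#2 (switching the scoring model) has been added to this PR (commit 4255c1b). Based on the clean OOS result (sentiment IC 0.0735 @ sonnet-4.6 vs ~0.02 @ fast/Codex), sentiment scoring is switched from the hard-coded
Verification: full suite 1225 passed / 6 skipped; ruff + mypy clean.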
4255c1b to 97bbbbd
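Background
Follow-through on the 2026-06-15 M27 sentiment IC diagnosis.
The event_taxonomy 0.65× event-polarity override was measured as net-negative for IC (event delta across three measurements: −0.0027 / −0.0146 / −0.0280), and non-company news drove whipsaws in individual stocks (兆易 603986: 2 sector-IPO headlines, 0 company headlines → sentiment −71.3).
Changes (stage 0 "cheap ablation", on the existing headline pipeline)
- Adds is_market_flow / is_company_specific / company_specific_titles. Market, fund-flow, and ranking-list headlines are removed before classification, so the "下滑" in "市场下滑" is no longer misclassified as a hard earnings_warning event.
- The override is gated by settings.sentiment_event_override_enabled (default False). Event types are still reported (for observability) but are no longer blended into the score. It can be re-enabled for a clean single-provider OOS re-test.
- In analyze_news, a window with no company-specific headline (only noise or cross-company items, as judged by company_aliases) returns neutral 0 and skips the LLM call, which also saves cost.
- m27 --event-ab always measures the override's counterfactual IC effect, independent of the production default.
Verification
- Targeted tests: 46 passed / 1 skipped
- Full backend suite: 1224 passed / 6 skipped
- Local frontend step: the worktree has no frontend/node_modules, so it was not completed locally; the GitHub Actions frontend test/build passed.
Nature and adoption threshold
This is a defensive change: it removes a measured IC drag and adds robustness (eliminating 兆易-style whipsaws). It does not manufacture new sentiment alpha. The ~0.02 ceiling on raw sentiment IC is set by the input data, and only the full stage-1 iFinD pack could test whether it can be raised.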
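The override gating that this change switches off by default can be sketched as below. This is a hedged illustration only: the PR states the 0.65 weight and the settings flag, but not the blend formula, so the convex blend, the [-100, 100] sentiment scale, the [-1, 1] polarity scale, and the `Settings` / `EventScore` / `event_score` shapes are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass

EVENT_OVERRIDE_WEIGHT = 0.65  # weight quoted in the PR


@dataclass
class Settings:
    # Flag name taken from the PR; default OFF as described there.
    sentiment_event_override_enabled: bool = False


@dataclass
class EventScore:
    score: float
    matched_events: list[str]


def event_score(
    raw_sentiment: float,       # assumed scale: [-100, 100]
    event_polarity: float,      # assumed scale: [-1, 1]
    matched_events: list[str],
    settings: Settings,
) -> EventScore:
    """Report matched events always; blend them into the score only when
    the override flag is on (assumed convex blend, not the real formula)."""
    if settings.sentiment_event_override_enabled and matched_events:
        blended = (
            (1 - EVENT_OVERRIDE_WEIGHT) * raw_sentiment
            + EVENT_OVERRIDE_WEIGHT * event_polarity * 100
        )
        return EventScore(blended, matched_events)
    # Default path: raw sentiment is kept, events stay visible for observability.
    return EventScore(raw_sentiment, matched_events)
```

Keeping the matched event types in the return value on both paths mirrors the stated intent that an A/B diagnostic can force the flag on and compare against the production default.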
🤖 Generated with Claude Code