Built-in domain source preset library

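## Background

Today, when the agent creates a topic, source discovery depends entirely on the agent's own reasoning: the agent reads the tracking intent the user describes and decides by itself which data sources to connect. In practice this causes two problems:

1. **The agent lacks domain knowledge**: for vertical domains (such as new energy vehicles or biotech), the agent may not be able to name high-quality sources.
2. **Repeated work**: when different users have similar tracking needs (such as "track AI browsers"), the agent has to reason out the source list from scratch every time.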

## Reference implementation

**OpenTrends** (`nexmoe/opentrends`) ships with **200+ source presets**, organized by domain into 9 topic pages, for example:

| Topic | Groups | Example sources |
|---|---|---|
| AI | News / Practitioners / Community / Research / Watchlist | HN, r/MachineLearning, arXiv, Qwen Research |
| Embodied AI | Industry / Community / Research | Crowd Supply, Kickstarter |
| Biotech | News / Communities / Research | Nature, arXiv bio |
| Hardware | News / Maker / Community | Hackaday, Crowd Supply |
| Programming | HN / GitHub / Community / Publications | GitHub Trending, r/programming |
| Chinese Web | Hot / Media / Tech / News / Finance / Blogs | 知乎热榜, 36氪, 微博, 掘金 |

Each topic defines `sections` (groups). Each group contains several `source preset` entries, and each preset specifies a provider type, its parameters, and a refresh policy (5 TTL tiers).

## Proposal

Build a **Source Preset Library** in Skrya that augments the agent's knowledge rather than replacing its reasoning:

### 1. Preset data structure

```yaml
# presets/ai-browser-tracking.yaml
preset_id: ai-browser-tracking
name: AI browser tracking
description: Main information sources covering AI browser products
tags: [ai, browser, product]

source_groups:
  - name: Product updates
    sources:
      - name: "The Verge - AI"
        type: rss
        url: "https://www.theverge.com/rss/ai-artificial-intelligence/index.xml"
      - name: "TechCrunch - AI"
        type: rss
        url: "https://techcrunch.com/category/artificial-intelligence/feed/"

  - name: Developer communities
    sources:
      - name: "Hacker News"
        type: rsshub
        route: "/hackernews/best"
      - name: "r/artificial"
        type: rsshub
        route: "/reddit/r/artificial/hot"

  # Group-level type: this group lists queries instead of fixed sources.
  - name: Deep tracking
    type: runtime-retrieval
    queries: ["AI browser launch", "AI 浏览器 发布"]
    capabilities: [news_search, web_search]
```
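
To make the shape of the preset above concrete, here is a minimal Python sketch of types that could hold it, plus a loader that checks the basic shape. The class names, the `parse_preset` helper, and the validation rules (an `rss` source needs a `url`, an `rsshub` source needs a `route`, a `runtime-retrieval` group needs at least one query) are assumptions read off the example, not an existing Skrya API. The input is a plain mapping, as a YAML or JSON parser would return.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Source:
    name: str
    type: str                     # "rss" or "rsshub" in the example preset
    url: Optional[str] = None     # used when type == "rss"
    route: Optional[str] = None   # used when type == "rsshub"


@dataclass
class SourceGroup:
    name: str
    sources: list[Source] = field(default_factory=list)
    # Group-level fields, as used by the runtime-retrieval group.
    type: Optional[str] = None
    queries: list[str] = field(default_factory=list)
    capabilities: list[str] = field(default_factory=list)


@dataclass
class Preset:
    preset_id: str
    name: str
    description: str
    tags: list[str]
    source_groups: list[SourceGroup]


def parse_preset(raw: dict[str, Any]) -> Preset:
    """Build a Preset from a parsed YAML/JSON mapping and check its basic shape."""
    groups = []
    for g in raw.get("source_groups", []):
        # Unknown keys in a source raise TypeError, which surfaces typos early.
        sources = [Source(**s) for s in g.get("sources", [])]
        for s in sources:
            if s.type == "rss" and not s.url:
                raise ValueError(f"rss source {s.name!r} needs a url")
            if s.type == "rsshub" and not s.route:
                raise ValueError(f"rsshub source {s.name!r} needs a route")
        group = SourceGroup(
            name=g["name"],
            sources=sources,
            type=g.get("type"),
            queries=list(g.get("queries", [])),
            capabilities=list(g.get("capabilities", [])),
        )
        if group.type == "runtime-retrieval" and not group.queries:
            raise ValueError(f"group {group.name!r} needs at least one query")
        groups.append(group)
    return Preset(
        preset_id=raw["preset_id"],
        name=raw["name"],
        description=raw.get("description", ""),
        tags=list(raw.get("tags", [])),
        source_groups=groups,
    )
```

Validating at load time keeps malformed preset files from reaching the agent, which matters if presets are contributed as plain data files.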

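### 2. Agent integration

Add the following to the `source-curation` SKILL.md:

```
## Preset library lookup

Before recommending sources, check whether the presets/ directory contains a preset that matches the current tracking intent:
- Match by tags
- Match keywords against name and description
- If a preset matches, use its source list as the starting point, then trim or extend it according to the user's specific needs
- If nothing matches, fall back to the agent's own reasoning
```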
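
The lookup steps above could be sketched roughly as follows. This is an illustration only: the `tokenize`, `score_preset`, and `find_preset` names, the scoring weights, and the `min_score` threshold are all hypothetical, and the proposal itself leaves the matching to the agent following SKILL.md rather than to code. The `\w+` tokenizer only suits space-separated text; Chinese intents would need a proper word segmenter.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a simplistic stand-in for real keyword extraction."""
    return set(re.findall(r"\w+", text.lower()))


def score_preset(preset: dict, intent: str) -> int:
    """Score a preset against an intent; tag hits weigh more than text hits."""
    words = tokenize(intent)
    tag_hits = len(words & {t.lower() for t in preset.get("tags", [])})
    text = f"{preset.get('name', '')} {preset.get('description', '')}"
    text_hits = len(words & tokenize(text))
    return 2 * tag_hits + text_hits


def find_preset(presets: list[dict], intent: str, min_score: int = 2):
    """Return the best-matching preset, or None to signal fallback to agent reasoning."""
    best = max(presets, key=lambda p: score_preset(p, intent), default=None)
    if best is None or score_preset(best, intent) < min_score:
        return None
    return best
```

Returning `None` instead of a weak match keeps the fallback path explicit: a preset that barely matches is not forced on the user.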

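### 3. Presets are not configuration

A preset is a **suggested template**, not something that is written automatically. The agent still needs to:
- Confirm with the user which sources to keep
- Add or remove sources according to the user's particular needs
- Write only the user-confirmed result to `sources.json`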
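
The separation between preset and configuration can be illustrated with a small Python sketch. It assumes the user's confirmation arrives as a set of source names to keep plus a list of extra sources. The `build_sources` and `write_sources` helpers and the flat-list layout of `sources.json` are hypothetical, since the issue does not define that file's schema.

```python
import json


def build_sources(preset: dict, keep: set[str], extra: list[dict]) -> list[dict]:
    """Keep only user-confirmed preset sources, then append user-added ones."""
    confirmed = []
    for group in preset.get("source_groups", []):
        for source in group.get("sources", []):
            if source["name"] in keep:
                # Record which preset group the source came from.
                confirmed.append({**source, "group": group["name"]})
    confirmed.extend(extra)
    return confirmed


def write_sources(path: str, sources: list[dict]) -> None:
    """Write the confirmed list as JSON; the flat-list layout is an assumption."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sources, f, ensure_ascii=False, indent=2)
```

Nothing from the preset reaches the file unless its name appears in `keep`, so an unconfirmed preset produces an empty list rather than a silent default.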

### 4. Built-in preset domains (suggested initial set)

| Domain | Estimated sources | Key coverage |
|---|---|---|
| AI / LLM | 15-20 | HN, arXiv, blogs of the major model vendors, Reddit, Twitter/X |
| New energy vehicles | 10-15 | 36氪, car review media, industry reports |
| Developer tools | 10-15 | GitHub Trending, HN, Product Hunt |
| Cryptocurrency | 10-15 | CoinDesk, r/cryptocurrency, 律动 |
| Tech news | 10-15 | The Verge, TechCrunch, 极客公园 |

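## Priority

**High**: low implementation cost, and it directly improves the agent's out-of-the-box experience. Preset files are pure data (YAML/JSON) and require no code changes.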

## Notes

Source: agent=牛马AI, model=mimo-v2.5-pro, reference=opentrends
