Commit f43f901

docs(svc): execute phase 5a and relocate shared-doc skill
- move canonical edit-shared-docs skill into InKCre/docs and expose a repo-root wrapper
- split glossary by redistributing local vocabulary into local guides
- upgrade info_base local guidance before further mixed-doc extraction
1 parent 8ae0dce commit f43f901

16 files changed

Lines changed: 320 additions & 107 deletions

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+---
+name: edit-shared-docs
+description: Thin discovery wrapper for the canonical shared-doc workflow in `docs/_shared/.agents/skills/edit-shared-docs`. Use when a Codex agent in this unit repo needs to edit shared durable docs or bump `docs/_shared`.
+---
+
+# Edit Shared Docs
+
+This is a thin repo-root discovery wrapper.
+
+Codex auto-loads repo-root `.agents/skills`, but it does not auto-load `docs/_shared/.agents/skills`.
+
+Before doing anything else:
+
+1. if `docs/_shared/` is missing, run `git submodule update --init --recursive`
+2. read the canonical skill at [docs/_shared/.agents/skills/edit-shared-docs/SKILL.md](../../../docs/_shared/.agents/skills/edit-shared-docs/SKILL.md)
+3. follow the canonical workflow there
+
+Do not maintain the real workflow in this wrapper.
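
As a rough illustration of step 1 in the wrapper above, the check could be scripted in Python. This is a minimal sketch and not part of the commit: `shared_docs_missing` and `ensure_shared_docs` are hypothetical helper names, and treating an empty `docs/_shared/` directory as "missing" is an assumption about how an uninitialized submodule looks on disk.

```python
import subprocess
from pathlib import Path

# The command quoted in the wrapper and in README.md.
SUBMODULE_INIT = ["git", "submodule", "update", "--init", "--recursive"]


def shared_docs_missing(repo_root: Path) -> bool:
    """Return True when docs/_shared is absent or empty (assumed: submodule not initialized)."""
    shared = repo_root / "docs" / "_shared"
    return not shared.is_dir() or not any(shared.iterdir())


def ensure_shared_docs(repo_root: Path, run=subprocess.run) -> bool:
    """Run the submodule init command only when needed; return True if it was run.

    `run` is injectable so the helper can be exercised without touching git.
    """
    if not shared_docs_missing(repo_root):
        return False
    run(SUBMODULE_INIT, cwd=repo_root, check=True)
    return True
```

Calling `ensure_shared_docs(Path("."))` from the repo root would then be a no-op on an already initialized checkout.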

AGENTS.md

Lines changed: 1 addition & 2 deletions
@@ -20,7 +20,6 @@ If a requested change would reduce readability or maintainability, stop and disc
 4. Read local runtime and deployment docs when they are relevant to the task:
    - `docs/40-deployment/`
 5. Read local transitional docs only when a local `AGENTS.md` or task explicitly points to them:
-   - `docs/15-alignment/`
    - `docs/20-product-tdd/`
 6. Read `tasks/` for volatile plans, exploration, and backlog items.

@@ -73,7 +72,7 @@ Before any reference-sensitive or logic-altering change, restate:
 - `docs/_shared/20-product-tdd/`: shared cross-unit technical contracts and architectural truths
 - local `AGENTS.md` near code: hard local design memory only when code and tests are not enough
 - `docs/40-deployment/`: runtime topology, deployment, CI, operational constraints
-- `docs/15-alignment/` and `docs/20-product-tdd/`: local transitional docs pending split; read only when explicitly referenced
+- `docs/20-product-tdd/`: local transitional docs pending split; read only when explicitly referenced
 - `tasks/`: plans, exploration, backlog, temporary reasoning, migration notes

 Do not store volatile plans in durable docs. Do not build a second software system out of prose.

CONTRIBUTING.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# Contributing
+
+## Development Setup
+
+```bash
+pdm install -G dev
+git submodule update --init --recursive
+```
+
+## Shared Docs And Skill Discovery
+
+`core-py` consumes shared durable docs from `docs/_shared/`.
+
+The canonical shared-doc editing skill lives in:
+
+- `docs/_shared/.agents/skills/edit-shared-docs/`
+
+Because Codex auto-loads repo-root `.agents/skills`, this repo also carries a thin discovery wrapper at:
+
+- `.agents/skills/edit-shared-docs/SKILL.md`
+
+Use that wrapper only to discover the canonical skill. Do not fork the workflow into the wrapper.
+
+## Shared-Doc Update Order
+
+1. Edit shared docs in `InKCre/docs`.
+2. Push the shared source commit first.
+3. Bump `core-py/docs/_shared` to that pushed commit.
+4. Keep unit-local runtime and implementation docs outside `docs/_shared`.

README.md

Lines changed: 3 additions & 1 deletion
@@ -10,6 +10,8 @@ pdm install -G dev
 pdm run uvicorn run:api_app --reload
 ```

+Developer setup and shared-skill notes: [CONTRIBUTING.md](CONTRIBUTING.md)
+
 ## Documentation Map

 If `docs/_shared/` is missing, run `git submodule update --init --recursive` before following shared-doc links.
@@ -20,7 +22,7 @@ If `docs/_shared/` is missing, run `git submodule update --init --recursive` bef
 - Shared product glossary: [docs/_shared/15-alignment/product-glossary.md](docs/_shared/15-alignment/product-glossary.md)
 - Shared cross-unit technical truth: [docs/_shared/20-product-tdd/](docs/_shared/20-product-tdd/)
 - Deployment and runtime truth: [docs/40-deployment/README.md](docs/40-deployment/README.md)
-- Transitional local docs: `docs/15-alignment/`, `docs/20-product-tdd/` (only when local guides point to them)
+- Transitional local docs: `docs/20-product-tdd/` (only when local guides point to them)
 - Volatile plans and backlog: [tasks/](tasks/)

 ## Generated Artifacts

app/business/extension/AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,8 @@
 - An extension ID maps to exactly one installation record per deployment.
 - Whether an extension runs is controlled per client; the state is stored in the `ExtensionModel.enabled` UUID array.
 - Installed does not mean enabled, and it does not mean running either.
+- `extension runtime class` refers to the Python `Extension` subclass loaded from `extensions.<ext_id>`.
+- `extension config` refers to the configuration payload persisted on the extension record; it is not the state of the running Python object.

 ### Lifecycle

app/business/info_base/AGENTS.md

Lines changed: 82 additions & 56 deletions
@@ -1,56 +1,82 @@
-## info_base/ - Information Base (core information management)
-
-Core information storage and management module; organizes information as a graph (Block + Relation).
-
-### Core Concepts
-
-- **Block**: a unit of information (text, image, video, etc.) with an ID, a type, and content
-- **Relation**: a directed relation between Blocks; relations form the information graph
-- **SubGraph**: a Block plus its incoming/outgoing edges, used for batch insertion
-- **Storage**: storage backend for Block content (DB, HTTP, etc.)
-- **Resolver**: parser for Block content (resolves it to text based on type)
-
-### Module Structure
-
-```
-info_base/
-├── main.py # InfoBaseManager - coordinates subgraph insertion
-├── block.py # BlockManager - Block CRUD and resolution
-├── relation.py # RelationManager - Relation CRUD
-├── storage/
-│ ├── main.py # StorageManager - storage backend management
-│ └── http.py # HTTP storage implementation
-└── resolver/
-    ├── main.py # ResolverManager - resolver registration
-    ├── text.py # text resolver
-    ├── html.py # HTML resolver
-    ├── image.py # image resolver
-    └── video.py # video resolver
-```
-
-### Core Flows
-
-**Insert a subgraph** (`InfoBaseManager.insert_subgraph`):
-1. fetchsert the Block (fetch it if it exists, otherwise create it)
-2. Recursively process in_arcs and out_arcs
-3. Create the Relations
-4. Return the inserted Blocks and Relations
-
-**Resolve Block content** (`BlockManager.get_content_as_text`):
-1. Fetch the content from the Storage matching storage_type
-2. Find the Resolver matching block_type
-3. Resolve to plain text and return it
-
-### Data Models
-
-[app/schemas/info_base/](../../schemas/info_base/) directory:
-- `BlockModel` - Block table model
-- `RelationModel` - Relation table model
-- `SubGraphForm` - subgraph insertion form
-
-### Coding Guidelines
-
-- Adding a Block type: no code change needed (types are stored in a DB enum)
-- Adding a Resolver: subclass `ResolverBase` and register it with `ResolverManager`
-- Adding a Storage: subclass `StorageBase` and register it in `StorageManager.setup_builtin_storages()`
-- Block uniqueness: guaranteed by the `identity` field (determined by the Source or Extension)
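+# info_base/ Local Guide
+
+This file describes only the local facts, terminology, and editing boundaries of `app/business/info_base/`. For the global execution protocol, see the repo-root [AGENTS.md](../../../AGENTS.md).
+
+## When to Read
+
+Read this before entering this directory in any of the following cases:
+
+- Modifying `InfoBaseManager`, `BlockManager`, or `RelationManager`
+- Modifying block / relation persistence or deduplication semantics
+- Modifying the division of responsibility between resolver and storage
+- Modifying the responsibility boundary for embedding updates during ingestion
+
+If a change affects cross-module contracts, first read [docs/_shared/20-product-tdd/](../../../docs/_shared/20-product-tdd/); if it also touches ingestion details that have not yet been fully split out in this repo, then read [docs/20-product-tdd/info-base-ingestion.md](../../../docs/20-product-tdd/info-base-ingestion.md).
+
+## Local Execution Rules
+
+- First distinguish product/shared truth from local implementation details. Treat `fetchsert`, the resolver call order, and the storage fetch path as local facts by default.
+- The local AGENTS file keeps only facts that the code can still prove. Do not describe a desired future ingestion architecture as the current state.
+- If a change alters block / relation deduplication, subgraph insertion order, or the resolver-storage split, first check `main.py`, `block.py`, `relation.py`, `resolver/main.py`, and `storage/main.py`.
+
+## Key Files
+
+- `app/business/info_base/main.py`: `InfoBaseManager`; recursively inserts subgraphs and arcs
+- `app/business/info_base/block.py`: `BlockManager`; handles block create / fetchsert / resolver coordination
+- `app/business/info_base/relation.py`: `RelationManager`; handles relation create / fetchsert
+- `app/business/info_base/resolver/main.py`: resolver registry and the entry point for raw/solved content
+- `app/business/info_base/storage/main.py`: storage registry and built-in storage setup
+- `app/business/sink/embedding.py`: block embedding upsert/query; a sink responsibility
+- `app/schemas/info_base/`: block / relation / subgraph forms and models
+
+## Terminology and Naming
+
+### Domain vs Python Names
+
+- `info-base`: the product/domain concept
+- `info_base`: the Python package and module path
+
+Do not mix these two levels.
+
+### Content Terms
+
+- `block content`: the string field persisted on `BlockModel.content`
+- `raw content`: the original content the resolver actually obtains; when `block.storage is None`, it usually comes directly from `block.content`
+- `solved content`: the content representation after the resolver interprets it, for downstream use
+
+### Graph Terms
+
+- `block`: a persisted information-unit record
+- `relation`: a directed edge connecting two blocks
+- `subgraph`: a block plus the form structure of its incoming/outgoing edges, used for recursive insertion
+
+## Current Stable Facts
+
+### Persistence Ownership
+
+- `InfoBaseManager` recursively expands a `SubGraphForm` into the session.
+- A block is persisted first through `BlockManager.fetchsert()`; the relation is then finalized through `RelationManager.fetchsert()`.
+- Relation identity is currently determined by `from_ + to_ + content`.
+
+### Block Dedup And Resolver Coupling
+
+- Whether a block already exists is decided by that block's resolver through `resolver.get_existing(...)`.
+- The current default behavior is neither route-layer deduplication nor `InfoBaseManager` deciding the deduplication rules itself.
+- To change block identity, check the resolver contract first rather than adding special-case branches in callers.
+
+### Resolver vs Storage Boundary
+
+- The resolver interprets the block.
+- Storage retrieves the raw content when `block.storage` is set.
+- Do not hard-code storage implementation details in a resolver unless that resolver's code anchor already explicitly requires it.
+
+### Embedding Ownership
+
+- Creating a block triggers an embedding upsert, but embedding remains a sink responsibility, not a responsibility of info-base itself.
+- `BlockManager.fetchsert()` triggering an embedding update does not make info-base the authoritative owner of embeddings.
+
+## Editing Guidelines
+
+- If a change affects only the local behavior of one resolver, prefer documenting it in the corresponding subdirectory guide or in code comments; do not widen the whole `info_base/` guide.
+- When adding a shared contract, first strip local manager / resolver details out of the shared wording, then write the abstracted version in `InKCre/docs`.
+- If `docs/20-product-tdd/info-base-ingestion.md` is split further, this file should become the main home for local mechanics.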
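
The dedup facts recorded in this guide (block existence decided by the block's resolver, relation identity keyed on `from_ + to_ + content`) can be pictured with a small in-memory sketch. Everything below is an assumption made for illustration: the `InMemory*` classes, the `TextResolver` equality rule, and the `get_existing(block, stored)` signature are hypothetical stand-ins, not the repository's actual `BlockManager`, `RelationManager`, or resolver API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    resolver: str          # name of the resolver that owns this block's identity
    content: str
    id: Optional[int] = None


@dataclass(frozen=True)
class Relation:
    from_: int
    to_: int
    content: str


class TextResolver:
    """Hypothetical resolver: blocks with equal content count as the same block."""

    def get_existing(self, block, stored):
        return next(
            (b for b in stored
             if b.resolver == block.resolver and b.content == block.content),
            None,
        )


class InMemoryBlockManager:
    """Stand-in for a block manager; the resolver, not the caller, decides dedup."""

    def __init__(self, resolvers):
        self.resolvers = resolvers
        self.stored = []

    def fetchsert(self, block):
        existing = self.resolvers[block.resolver].get_existing(block, self.stored)
        if existing is not None:
            return existing            # fetch: reuse the stored block
        block.id = len(self.stored) + 1
        self.stored.append(block)      # insert: first time this block is seen
        return block


class InMemoryRelationManager:
    """Stand-in for a relation manager; identity is (from_, to_, content)."""

    def __init__(self):
        self.stored = {}

    def fetchsert(self, from_, to_, content):
        key = (from_, to_, content)
        return self.stored.setdefault(key, Relation(from_, to_, content))


blocks = InMemoryBlockManager({"text": TextResolver()})
relations = InMemoryRelationManager()

a = blocks.fetchsert(Block("text", "hello"))
b = blocks.fetchsert(Block("text", "world"))
a_again = blocks.fetchsert(Block("text", "hello"))   # resolved to the stored block

r1 = relations.fetchsert(a.id, b.id, "mentions")
r2 = relations.fetchsert(a.id, b.id, "mentions")     # same identity, same relation
```

In this sketch, changing what counts as "the same block" means changing only the resolver's `get_existing`, which mirrors the guide's advice to check the resolver contract before adding special cases in callers.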

app/business/source/AGENTS.md

Lines changed: 8 additions & 0 deletions
@@ -9,6 +9,14 @@
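 - **SourceCollectJob**: a collection job, including configuration, status, and scheduling information
 - **Collect → Organize**: after collection, results are automatically organized into a Block + Relation structure

+### Terminology Boundaries
+
+- `source type`: the source class identifier registered in `sources_types`; it usually looks like an import path
+- `source instance`: a configuration record in the `sources` table
+- `collect job`: a single execution record in `sources_collect_jobs`
+
+Do not collapse these three terms into one level.
+
 ### Module Structure

 ```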

docs/15-alignment/glossary.md

Lines changed: 0 additions & 44 deletions
This file was deleted.

tasks/product-docs-repo-submodule/00-meta.md

Lines changed: 7 additions & 2 deletions
@@ -10,7 +10,9 @@ Progress checkpoint:
 - Phase 1~3: executed locally with concrete artifacts in `InKCre/docs`
 - Remote `InKCre/docs` created and source boundary pushed
 - SOP + Skill baseline completed for Phase 3
-- Phase 4 pilot implemented locally in `core-py`; validation passed, review pending
+- Phase 4 pilot implemented in `core-py` and pushed to `develop`
+- Phase 5 split gate recorded for remaining mixed docs
+- Phase 5A executed: shared skill relocated, local `info_base` guide upgraded, local mixed glossary removed

 ## Intent

@@ -43,8 +45,11 @@ The previous read-only mirror fan-out design was judged brittle and operationall
 - `40-phase-3-submodule-reliability-pack.md`
 - `50-phase-4-core-py-pilot.md`
 - `51-phase-4-execution-output.md`
+- `60-phase-5-mixed-doc-split-gate.md`
+- `61-phase-5-split-matrix.md`
+- `62-phase-5a-execution-output.md`
 - `90-review-checklist.md`

 ## Immediate Next Step

-Run independent review on the Phase 4 rollout, then commit and push if no blocker is found.
+Proceed with the next gated split only after confirming which shared slices of `extension-runtime` and `info-base-ingestion` truly survive the shared-admission gate.
