Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 10 additions & 3 deletions .github/workflows/build-images.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,11 @@ on:
description: "Services to build (comma-separated: gateway,api,worker,portal,all)"
required: false
default: "all"
portal_branch:
description: "Branch for portal submodule"
required: false
default: "dev"
type: string
push:
description: "Push images to registry"
required: false
Expand Down Expand Up @@ -93,11 +98,13 @@ jobs:
with:
submodules: recursive

- name: Update portal submodule to dev branch
- name: Update portal submodule to target branch
run: |
BRANCH="${{ inputs.portal_branch || 'dev' }}"
cd portal
git fetch origin '+refs/heads/dev:refs/remotes/origin/dev'
git checkout -B dev origin/dev
git fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
git checkout -B "${BRANCH}" "origin/${BRANCH}"
Comment on lines +103 to +106
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate portal_branch before using it in git commands.

workflow_dispatch input is user-provided. Without validation, malformed values can break the run and increase command-safety risk.

Suggested hardening
         run: |
           BRANCH="${{ inputs.portal_branch || 'dev' }}"
+          if ! git check-ref-format --branch "$BRANCH" >/dev/null 2>&1; then
+            echo "Invalid portal_branch: $BRANCH" >&2
+            exit 1
+          fi
           cd portal
           git fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
           git checkout -B "${BRANCH}" "origin/${BRANCH}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
BRANCH="${{ inputs.portal_branch || 'dev' }}"
cd portal
git fetch origin '+refs/heads/dev:refs/remotes/origin/dev'
git checkout -B dev origin/dev
git fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
git checkout -B "${BRANCH}" "origin/${BRANCH}"
BRANCH="${{ inputs.portal_branch || 'dev' }}"
if ! git check-ref-format --branch "$BRANCH" >/dev/null 2>&1; then
echo "Invalid portal_branch: $BRANCH" >&2
exit 1
fi
cd portal
git fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
git checkout -B "${BRANCH}" "origin/${BRANCH}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-images.yaml around lines 103 - 106, Validate and
sanitize the user-provided portal_branch before using it in the BRANCH variable
and in the git commands: enforce a safe branch-name pattern (e.g., allow only
alphanumerics, hyphens, underscores and slashes via a regex) and if validation
fails fall back to 'dev' or exit with an error; apply this check where BRANCH is
set and before the git fetch/checkout steps so the BRANCH value used by git
fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}" and git
checkout -B "${BRANCH}" "origin/${BRANCH}" is guaranteed safe and cannot inject
shell metacharacters.

echo "portal submodule branch: ${BRANCH}"
Comment on lines +101 to +107
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Scope the portal submodule checkout to portal builds only.

This step currently runs for every matrix service. Non-portal builds can fail due to portal branch issues even when portal is not requested.

Suggested fix
       - name: Update portal submodule to target branch
+        if: matrix.service == 'portal'
         run: |
           BRANCH="${{ inputs.portal_branch || 'dev' }}"
           cd portal
           git fetch origin "+refs/heads/${BRANCH}:refs/remotes/origin/${BRANCH}"
           git checkout -B "${BRANCH}" "origin/${BRANCH}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-images.yaml around lines 101 - 107, The "Update
portal submodule to target branch" step currently runs for every matrix service;
restrict it to only run for portal builds by adding a conditional to the step
(use the step name "Update portal submodule to target branch") that checks the
matrix/service or build input for portal (e.g., if: matrix.service == 'portal'
or an equivalent expression that checks the requested services), leaving the
existing BRANCH, cd portal, git fetch and git checkout commands intact so
non-portal matrix entries skip the submodule update.

echo "portal submodule commit: $(git rev-parse HEAD)"

- name: Set up QEMU
Expand Down
178 changes: 178 additions & 0 deletions docs/proposals/SKILL_DATA_DESIGN.md
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,184 @@ CREATE TABLE capability_versions (

---

## 技能统计与埋点方案

### 目标

`/api/items` 和 `/api/items/:id` 需要稳定返回以下三个核心统计字段:

- `preview_count`:详情页真实预览量
- `install_count`:真实安装量
- `favorite_count`:收藏量

这三个值都属于**聚合后的展示字段**,读接口只负责返回结果,不负责直接加数。所有增量更新统一由行为埋点或显式写接口驱动。

### 统计字段设计

在 `capability_items` 表中补充:

```sql
ALTER TABLE capability_items
ADD COLUMN preview_count INTEGER DEFAULT 0,
ADD COLUMN favorite_count INTEGER DEFAULT 0;
```

`install_count` 继续保留,最终三个字段统一由后端维护并直接随 item 返回,避免列表页每次现算聚合。

### 行为日志作为统计事实来源

现有 `behavior_logs` 机制继续作为用户行为明细表,承担两类职责:

- 保存可追溯的行为事件,用于推荐、分析、报表
- 驱动 `capability_items` 上的聚合计数字段更新

推荐的动作枚举:

- `view`:打开 item 详情页并完成渲染
- `install`:客户端确认安装成功
- `favorite`:用户收藏
- `unfavorite`:用户取消收藏

说明:

- `view` 和 `install` 适合保留在行为日志中
- `favorite` / `unfavorite` 既可以进入行为日志,也必须有独立收藏关系表保证幂等
- `GET /api/items`、`GET /api/items/:id` 这类读接口不应自动记行为,避免被预取、刷新、爬虫污染

### 三个指标的口径与埋点落点

#### 1. 预览量 `preview_count`

预览量的定义:
- 用户真正进入 item 详情页,并且前端完成首屏渲染后,记一次预览

埋点位置:
- 前端详情页加载成功后,调用 `POST /api/items/:id/behavior`
- `actionType = "view"`

明确不建议的做法:
- 不在 `GET /api/items/:id` 中自动 `preview_count + 1`
- 不把列表页曝光算作预览量

原因:
- 读接口可能被 SSR、预取、重试、机器人访问
- 同一接口刷新会带来脏数据
- 列表曝光和详情预览是两种不同业务语义

如需统计列表曝光,应单独定义 `impression`,不要复用 `view`。

#### 2. 安装量 `install_count`

安装量的定义:
- 用户在客户端完成一次有效安装后,记一次安装

推荐埋点位置:
- 客户端实际安装成功后,调用 `POST /api/items/:id/behavior`
- `actionType = "install"`

不推荐的口径:
- 直接把 `GET /api/items/:id/download` 视为安装成功

原因:
- 下载不等于安装
- 用户可能下载后未使用
- 未来可能同时支持压缩包、命令直连、远程 URL 三种安装模式,统一以“客户端确认成功”作为统计口径更一致

兼容策略:
- 若第一阶段仍需保留“下载即安装”的旧逻辑,则必须与 `install` 行为埋点二选一,避免重复累计
- 后续统一收敛到行为埋点模型,下载接口只负责返回文件或命令

#### 3. 收藏量 `favorite_count`

收藏量的定义:
- 当前仍处于收藏状态的用户数

这里不能只靠行为日志统计,因为:
- 用户可能重复点击收藏
- 用户可能取消收藏
- 需要一个可幂等的当前状态

因此需要新增关系表:

```sql
CREATE TABLE item_favorites (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
item_id UUID NOT NULL REFERENCES capability_items(id) ON DELETE CASCADE,
user_id VARCHAR(191) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),

UNIQUE(item_id, user_id)
);
```

配套接口:

```text
POST /api/items/:id/favorite
DELETE /api/items/:id/favorite
```

更新规则:

- `POST` 成功插入收藏关系时,`favorite_count + 1`
- `DELETE` 成功删除收藏关系时,`favorite_count - 1`
- 重复 `POST` 不重复加数
- 重复 `DELETE` 不重复减数

可选增强:
- 同时写入行为日志 `favorite` / `unfavorite`,用于推荐系统和用户画像

### 后端更新原则

统一原则:**计数更新只发生在写路径,不发生在读路径。**

建议按以下方式落地:

| 指标 | 更新入口 | 更新方式 |
|-----|---------|---------|
| `preview_count` | `POST /api/items/:id/behavior` with `view` | 异步或事务内 `preview_count + 1` |
| `install_count` | `POST /api/items/:id/behavior` with `install` | 异步或事务内 `install_count + 1` |
| `favorite_count` | `POST/DELETE /api/items/:id/favorite` | 按收藏关系表变化增减 |

### `/api/items` 返回约定

列表接口与详情接口统一返回:

```jsonc
{
"id": "uuid",
"name": "Code Reviewer",
"previewCount": 1280,
"installCount": 356,
"favoriteCount": 89
}
```

说明:

- 这三个值直接来自 `capability_items` 聚合字段
- `/api/items/:id/stats` 可继续保留,作为更完整的行为统计接口
- `/api/items` 只返回适合列表展示的轻量统计值

### 与现有项目实现的衔接建议

结合当前后端实现,建议这样演进:

1. 保留现有 `behavior_logs` 体系,继续承接 `view` / `install`
2. 将 `capability_items.install_count` 的更新入口统一收敛为行为埋点,避免与下载接口重复累加
3. 在 `capability_items` 上补 `preview_count`、`favorite_count`
4. 新增 `item_favorites` 表和收藏接口
5. 由前端在“详情页完成渲染”“安装成功”“点击收藏/取消收藏”这三个明确交互点进行埋点或写操作

### 风险与约束

- 预览量如果埋在详情读接口,会被预加载和刷新污染
- 安装量如果同时在下载接口和行为接口更新,会发生双写重复统计
- 收藏量如果只记录事件、不保存当前收藏关系,将无法可靠支持取消收藏与幂等
- 如果未来引入去重策略,可在 `behavior_logs` 中结合 `user_id + session_id + item_id + action_type + 时间窗口` 做防刷

---

## 三种场景的实现路径

### 场景一:研发共建(内部创作)
Expand Down
2 changes: 1 addition & 1 deletion portal
Submodule portal updated 329 files
17 changes: 17 additions & 0 deletions server/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
SWAG_VERSION := v1.16.6
SWAG := go run github.com/swaggo/swag/cmd/swag@$(SWAG_VERSION)

.PHONY: run dev swagger-serve swagger swagger-check

run:
go run ./cmd/api

dev: swagger run

swagger-serve: swagger run

swagger:
$(SWAG) init -g cmd/api/main.go -o docs --parseDependency --parseInternal

swagger-check:
$(SWAG) fmt -d ./cmd/api,./internal
4 changes: 3 additions & 1 deletion server/cmd/api/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,7 @@ func main() {
marketplace.GET("/items/new", recommendHandler.GetNewAndNoteworthy)
}
api.GET("/items/:id/similar", searchHandler.FindSimilar)
api.POST("/items/:id/behavior", recommendHandler.LogBehavior)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find the recommend handler file
fd -t f "recommend" server/ --type f | head -20

Repository: XDfield/costrict-web

Length of output: 147


🏁 Script executed:

# Search for LogBehavior handler implementation
rg -t go "LogBehavior" server/ -A 20

Repository: XDfield/costrict-web

Length of output: 18071


🏁 Script executed:

# Look at the main.go file around line 194 to understand middleware context
sed -n '180,210p' server/cmd/api/main.go

Repository: XDfield/costrict-web

Length of output: 1328


🏁 Script executed:

# Search for behavior types and validation
rg -t go "behavior" server/internal/services/ -i -A 5 | head -50

Repository: XDfield/costrict-web

Length of output: 3844


🏁 Script executed:

# Check ActionType definition and allowed values
rg -t go "ActionType" server/internal/models/ -A 3 | head -40

Repository: XDfield/costrict-web

Length of output: 1548


🏁 Script executed:

# Search for install_count and preview_count updates mentioned in review
rg -t go "install_count|preview_count" server/ -B 3 -A 3

Repository: XDfield/costrict-web

Length of output: 7128


🏁 Script executed:

# Check if there's rate-limiting middleware applied
rg -t go "rate|throttle|limit" server/cmd/api/main.go -i

Repository: XDfield/costrict-web

Length of output: 46


🏁 Script executed:

# Look for any behavior logging logic related to counters
rg -t go "LogBehavior" server/internal/services/behavior_service.go -A 40

Repository: XDfield/costrict-web

Length of output: 2094


🏁 Script executed:

# Check the full middleware setup for the public api group
sed -n '150,200p' server/cmd/api/main.go

Repository: XDfield/costrict-web

Length of output: 2273


🏁 Script executed:

# Verify there's no validation of ActionType values anywhere
rg -t go "ActionType" server/internal/handlers/recommend.go -B 5 -A 10

Repository: XDfield/costrict-web

Length of output: 1197


Do not expose a generic behavior endpoint as public without per-action validation and rate-limiting.

This endpoint accepts any arbitrary actionType string value with no validation or whitelist. The behavior service directly updates preview_count and install_count based on the client-supplied action type, allowing anonymous users to trivially forge item engagement metrics. Add strict ActionType validation (whitelist view-only actions if intended for anonymous use), implement per-IP rate-limiting, and consider a separate read-only endpoint for anonymous view tracking instead of exposing the generic write handler.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/cmd/api/main.go` at line 194, The POST route
api.POST("/items/:id/behavior") currently uses recommendHandler.LogBehavior and
accepts arbitrary actionType values allowing clients to modify
preview_count/install_count; update LogBehavior to validate actionType against a
strict whitelist (e.g., only allow read-only/view actions for anonymous users
and explicitly map allowed action strings to internal enums), enforce per-IP
rate-limiting on this route (use your existing rate limiter middleware or add
one tied to request.RemoteAddr/IP), reject or 403 any disallowed actions, and
consider adding a separate read-only endpoint (e.g., GET or POST
/items/:id/view) that only records view events with tight validation instead of
exposing the generic write handler.

api.GET("/items/:id/stats", recommendHandler.GetItemStats)

// All routes below require authentication
Expand Down Expand Up @@ -247,7 +248,8 @@ func main() {
authed.PUT("/items/:id/transfer", handlers.TransferItemToRepo)
authed.POST("/items/:id/scan", handlers.TriggerItemScan)
authed.POST("/scan-jobs/:id/cancel", handlers.CancelScanJob)
authed.POST("/items/:id/behavior", recommendHandler.LogBehavior)
authed.POST("/items/:id/favorite", recommendHandler.FavoriteItem)
authed.DELETE("/items/:id/favorite", recommendHandler.UnfavoriteItem)

authed.POST("/artifacts/upload", handlers.UploadArtifact)
authed.DELETE("/artifacts/:id", handlers.DeleteArtifact)
Expand Down
2 changes: 2 additions & 0 deletions server/cmd/migrate/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,8 @@ func main() {
&models.CapabilityVersion{},
&models.CapabilityAsset{},
&models.CapabilityArtifact{},
&models.BehaviorLog{},
&models.ItemFavorite{},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Don’t let AutoMigrate create item_favorites ahead of the SQL migration.

This PR already introduces the favorites schema via goose, but AutoMigrate now creates the table before goose runs. That gives fresh installs a different migration path from upgraded ones and can silently skip SQL-only defaults/constraints once the table already exists. Prefer keeping item_favorites owned by the migration file instead of adding it here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/cmd/migrate/main.go` at line 39, The AutoMigrate invocation is
currently including &models.ItemFavorite{}, which causes GORM to create the
item_favorites table before the SQL goose migration runs; remove
&models.ItemFavorite{} from the AutoMigrate arguments (leave other model types
intact) so the SQL migration file remains the authoritative source for creating
and configuring the item_favorites table (you can add a brief comment near the
db.AutoMigrate call referencing that item_favorites is managed by the goose
migration).

&models.SecurityScan{},
&models.ScanJob{},
&models.Device{},
Expand Down
39 changes: 24 additions & 15 deletions server/internal/handlers/capability_item.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,8 @@ import (
"fmt"
"io"
"log"
"path/filepath"
"net/http"
"path/filepath"
"strconv"
"strings"
"time"
Expand Down Expand Up @@ -220,6 +220,7 @@ func persistNewItem(db *gorm.DB, req createItemRequest, assets createItemAssets)
type ItemResponse struct {
models.CapabilityItem
RepoVisibility string `json:"repoVisibility,omitempty"`
Favorited bool `json:"favorited"`
}

// ListItems godoc
Expand Down Expand Up @@ -288,16 +289,16 @@ func ListItems(c *gin.Context) {
func CreateItem(c *gin.Context) {
registryId := c.Param("id")
var req struct {
Slug string `json:"slug" binding:"required"`
ItemType string `json:"itemType" binding:"required"`
Name string `json:"name" binding:"required"`
Description string `json:"description"`
Category string `json:"category"`
Version string `json:"version"`
Slug string `json:"slug" binding:"required"`
ItemType string `json:"itemType" binding:"required"`
Name string `json:"name" binding:"required"`
Description string `json:"description"`
Category string `json:"category"`
Version string `json:"version"`
Content string `json:"content"`
Metadata json.RawMessage `json:"metadata"`
SourcePath string `json:"sourcePath"`
CreatedBy string `json:"createdBy" binding:"required"`
CreatedBy string `json:"createdBy" binding:"required"`
}

if err := c.ShouldBindJSON(&req); err != nil {
Expand Down Expand Up @@ -368,6 +369,14 @@ func GetItem(c *gin.Context) {
if item.Registry != nil {
resp.RepoVisibility = getRepoVisibility(item.Registry.RepoID)
}
if userID := c.GetString(middleware.UserIDKey); userID != "" {
var count int64
if err := db.Model(&models.ItemFavorite{}).
Where("item_id = ? AND user_id = ?", item.ID, userID).
Count(&count).Error; err == nil {
resp.Favorited = count > 0
}
}

c.JSON(http.StatusOK, resp)
}
Expand Down Expand Up @@ -999,13 +1008,13 @@ func (h *ItemHandler) CreateItemDirect(c *gin.Context) {
// createItemFromJSON handles the original JSON-body item creation path.
func (h *ItemHandler) createItemFromJSON(c *gin.Context) {
var req struct {
RegistryID string `json:"registryId"`
Slug string `json:"slug"`
ItemType string `json:"itemType" binding:"required"`
Name string `json:"name" binding:"required"`
Description string `json:"description"`
Category string `json:"category"`
Version string `json:"version"`
RegistryID string `json:"registryId"`
Slug string `json:"slug"`
ItemType string `json:"itemType" binding:"required"`
Name string `json:"name" binding:"required"`
Description string `json:"description"`
Category string `json:"category"`
Version string `json:"version"`
Content string `json:"content"`
Metadata json.RawMessage `json:"metadata"`
CreatedBy string `json:"createdBy"`
Expand Down
Loading
Loading