Conversation
Integrate automatic category classification into the security scan workflow: - Add fixed taxonomy of 12 allowed categories (frontend-development, backend-development, etc.) - Update LLM system prompt to include category selection requirement - Add Category field to SecurityScan model - Validate and write back category to capability_items on scan completion - Auto-create category in item_categories via CategoryService.EnsureCategory() - Inject ScanJobService into SyncService for post-sync auto-scan trigger - Update SKILL_SCAN_DESIGN.md to reflect backend implementation status Signed-off-by: 林凯90331 <90331@sangfor.com> Co-authored-by: CoStrict <zgsm@sangfor.com.cn>
📝 WalkthroughWalkthroughThe pull request integrates category tracking into the capability item security scanning system. It updates the design document, adds scan backfill functionality, extends database models with category support, implements category validation and persistence in the scan service, and seeds taxonomy data via migration. Changes
Sequence DiagramsequenceDiagram
participant Client as CLI/Sync
participant Worker as ScanService
participant LLM as LLM API
participant DB as Database
participant Cat as CategoryService
Client->>Worker: ScanItem(capabilityItem)
Worker->>LLM: callLLM(scan prompt + expected JSON schema)
LLM-->>Worker: ScanReport{Category, RiskLevel, Verdict, ...}
Worker->>Worker: isValidScanCategory(category)
Worker->>DB: INSERT security_scans(category, ...)
Worker->>DB: UPDATE capability_items SET security_status, last_scan_id
alt category not empty
Worker->>DB: UPDATE capability_items SET category
Worker->>Cat: EnsureCategory(category slug)
Cat->>DB: INSERT/UPDATE item_categories
end
Worker-->>Client: ScanResult{Category, ...}
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~28 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (4)
server/cmd/worker/main.go (1)
124-129: Consider reusing theCategoryServiceinstance.
CategoryServiceis instantiated twice (line 88 and line 128). While functionally correct since both share the same DB, you could extract a single shared instance for minor memory efficiency and clearer dependency relationships.♻️ Proposed refactor
+ categorySvc := &services.CategoryService{DB: db} scanJobSvc := &services.ScanJobService{DB: db} syncSvc := &services.SyncService{ DB: db, Git: &services.GitService{TempBaseDir: tmpDir}, Parser: &services.ParserService{}, ScanJobService: scanJobSvc, - CategorySvc: &services.CategoryService{DB: db}, + CategorySvc: categorySvc, } // ... scanSvc := &services.ScanService{ DB: db, LLMClient: scanLLMClient, ModelName: llmCfg.Model, - CategorySvc: &services.CategoryService{DB: db}, + CategorySvc: categorySvc, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/cmd/worker/main.go` around lines 124 - 129, CategoryService is being instantiated twice; create a single shared instance (e.g., categorySvc := &services.CategoryService{DB: db}) and reuse that variable when constructing ScanService and any other consumers instead of allocating a second &services.CategoryService{DB: db}; update the ScanService initialization (scanSvc) to reference the shared categorySvc to eliminate the duplicate allocation and clarify dependencies.server/cmd/scan-backfill/main.go (1)
138-141: Uselog.Printfinstead oflog.Println(fmt.Sprintf(...)).
log.Println(fmt.Sprintf(...))is redundant and slightly less efficient. Uselog.Printfdirectly.♻️ Proposed fix
- log.Println(fmt.Sprintf( + log.Printf( "backfill complete: matched=%d enqueued=%d skipped=%d failed=%d", len(items), enqueued, skipped, failed, - )) + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/cmd/scan-backfill/main.go` around lines 138 - 141, Replace the redundant log.Println(fmt.Sprintf(...)) call with a direct log.Printf call: locate the logging statement that currently calls log.Println(fmt.Sprintf("backfill complete: matched=%d enqueued=%d skipped=%d failed=%d", len(items), enqueued, skipped, failed)) and change it to use log.Printf with the same format string and arguments so the formatted message is produced directly (no fmt.Sprintf wrapper).docs/proposals/SKILL_SCAN_DESIGN.md (1)
190-190: Minor: Add language specifier to fenced code block.The code block starting at line 190 is missing a language identifier, which triggers a markdownlint warning and may affect syntax highlighting.
📝 Proposed fix
-``` +```text SyncService 写入 CapabilityItem(新增或更新)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/proposals/SKILL_SCAN_DESIGN.md` at line 190, The fenced code block containing the line "SyncService 写入 CapabilityItem(新增或更新)" is missing a language specifier; update that fence to include a language identifier (e.g., add ```text before the line) so Markdown linting and syntax highlighting work correctly for the block in SKILL_SCAN_DESIGN.md.server/internal/services/scan_service.go (1)
226-228: Consider logging errors fromEnsureCategoryinstead of silently discarding them.The
_, _ = s.CategorySvc.EnsureCategory(...)pattern silently ignores any errors. While taxonomy creation failures shouldn't block scan persistence, silently swallowing errors makes debugging difficult when categories aren't being created as expected.♻️ Proposed fix to log errors
if scanRecord.Category != "" && s.CategorySvc != nil { - _, _ = s.CategorySvc.EnsureCategory(scanRecord.Category, "scan") + if _, err := s.CategorySvc.EnsureCategory(scanRecord.Category, "scan"); err != nil { + // Log but don't fail - category creation is non-critical + log.Printf("warn: failed to ensure category %q: %v", scanRecord.Category, err) + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/services/scan_service.go` around lines 226 - 228, The call to s.CategorySvc.EnsureCategory currently discards errors (_, _ = s.CategorySvc.EnsureCategory...), so change it to capture the error and log it instead of swallowing it; e.g., call _, err := s.CategorySvc.EnsureCategory(scanRecord.Category, "scan") and if err != nil log a warning or error containing scanRecord.Category and the err using the service's logger (e.g., s.Logger or s.log) or the standard log package if no logger field exists so failures to create categories are visible but do not block persistence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/proposals/SKILL_SCAN_DESIGN.md`:
- Line 190: The fenced code block containing the line "SyncService 写入
CapabilityItem(新增或更新)" is missing a language specifier; update that fence to
include a language identifier (e.g., add ```text before the line) so Markdown
linting and syntax highlighting work correctly for the block in
SKILL_SCAN_DESIGN.md.
In `@server/cmd/scan-backfill/main.go`:
- Around line 138-141: Replace the redundant log.Println(fmt.Sprintf(...)) call
with a direct log.Printf call: locate the logging statement that currently calls
log.Println(fmt.Sprintf("backfill complete: matched=%d enqueued=%d skipped=%d
failed=%d", len(items), enqueued, skipped, failed)) and change it to use
log.Printf with the same format string and arguments so the formatted message is
produced directly (no fmt.Sprintf wrapper).
In `@server/cmd/worker/main.go`:
- Around line 124-129: CategoryService is being instantiated twice; create a
single shared instance (e.g., categorySvc := &services.CategoryService{DB: db})
and reuse that variable when constructing ScanService and any other consumers
instead of allocating a second &services.CategoryService{DB: db}; update the
ScanService initialization (scanSvc) to reference the shared categorySvc to
eliminate the duplicate allocation and clarify dependencies.
In `@server/internal/services/scan_service.go`:
- Around line 226-228: The call to s.CategorySvc.EnsureCategory currently
discards errors (_, _ = s.CategorySvc.EnsureCategory...), so change it to
capture the error and log it instead of swallowing it; e.g., call _, err :=
s.CategorySvc.EnsureCategory(scanRecord.Category, "scan") and if err != nil log
a warning or error containing scanRecord.Category and the err using the
service's logger (e.g., s.Logger or s.log) or the standard log package if no
logger field exists so failures to create categories are visible but do not
block persistence.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 599b66a2-eaf3-48c5-9c8a-76e449cfb1ec
📒 Files selected for processing (9)
docs/proposals/SKILL_SCAN_DESIGN.mdpackage.jsonportalserver/cmd/scan-backfill/main.goserver/cmd/worker/main.goserver/internal/models/models.goserver/internal/services/scan_service.goserver/internal/services/scan_service_test.goserver/migrations/20260407110000_add_scan_category_and_seed_taxonomy.sql
Add a new CLI tool to backfill scan jobs for capability items. Features: - Filter by security status, registry ID, or item type - Support dry-run mode for previewing items - Configurable priority and max attempts for scan jobs - Add npm script 'backfill:scan' for convenient execution Signed-off-by: 林凯90331 <90331@sangfor.com> Co-authored-by: CoStrict <zgsm@sangfor.com.cn>
5962260 to
92ab675
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/cmd/worker/main.go (1)
63-65:⚠️ Potential issue | 🟡 MinorReassign the post-migration GORM session to restore logging.
Line 82 discards the return value from
db.Session(), which means thedbvariable retains the silent logger. The session with the restoredoriginalLoggeris never used. All subsequent code—includingrunPostMigrations(db)and service initialization—continues using the silent logger.Suggested fix
- // Re-enable GORM logging after migration - _ = db.Session(&gorm.Session{Logger: originalLogger}) + // Re-enable GORM logging after migration + db = db.Session(&gorm.Session{Logger: originalLogger})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/cmd/worker/main.go` around lines 63 - 65, The current code sets db = db.Session(&gorm.Session{Logger: db.Logger.LogMode(gormlogger.Silent)}) to silence logs but never restores the original logger; reassign the returned session that uses originalLogger after migrations so subsequent calls (e.g., runPostMigrations(db) and service initialization) use the restored logger. Locate the variables db and originalLogger and replace the discarded Session call after AutoMigrate with a reassignment that sets db = db.Session(&gorm.Session{Logger: originalLogger}) (or equivalent) to re-enable the original GORM logger.
🧹 Nitpick comments (1)
server/cmd/worker/main.go (1)
11-11: Useerrors.IsforErrScanJobAlreadyQueued.
server/internal/services/scan_job_service.go:12exposes a sentinel error. Message matching on Line 293 is brittle and will stop classifying wrapped sentinel errors correctly.Suggested fix
- "strings" + "errors" @@ - case err != nil && strings.Contains(err.Error(), services.ErrScanJobAlreadyQueued.Error()): + case errors.Is(err, services.ErrScanJobAlreadyQueued):Also applies to: 293-295
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/cmd/worker/main.go` at line 11, The code is brittle because it matches the error string to detect the sentinel ErrScanJobAlreadyQueued; instead import the standard errors package and replace string-based message matching with errors.Is(err, server/internal/services.ErrScanJobAlreadyQueued) (or errors.As if you need to extract a wrapped error type), updating the check in the worker main where the current err.Error() substring check is used (apply the same change to the other occurrence in this file). Ensure the errors import is added and remove the fragile string comparison so wrapped sentinel errors are correctly recognized.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/cmd/worker/main.go`:
- Around line 206-211: The CLI currently allows negative values for the flags
`priority` and `maxAttempts` (set via `fs.IntVar`) which are later passed
through unchanged; add validation immediately after `_ = fs.Parse(args)` in
`main` to reject negatives: if `priority < 0` or `maxAttempts < 0` print a clear
error and exit (non-zero) so invalid `--priority`/`--max-attempts` are refused
rather than persisted; ensure the check uses the same variable names
(`priority`, `maxAttempts`, `dryRun`, `limit`, `triggerType`) so callers fail
fast before any call into the scan job creation/`scan_job_service` code.
- Around line 271-274: The current fallback to revision = 1 for items with no
version row (using revisionByItemID[item.ID]) can enqueue jobs for items lacking
any models.CapabilityVersion; instead, check the map presence with the comma-ok
idiom (or equivalent) when reading revisionByItemID for item.ID and skip the
item if the revision is missing or <= 0 (e.g., continue the loop) so only items
with an actual version row are enqueued; update the logic around
revisionByItemID, item.ID and the surrounding enqueuing code in main.go
accordingly.
---
Outside diff comments:
In `@server/cmd/worker/main.go`:
- Around line 63-65: The current code sets db = db.Session(&gorm.Session{Logger:
db.Logger.LogMode(gormlogger.Silent)}) to silence logs but never restores the
original logger; reassign the returned session that uses originalLogger after
migrations so subsequent calls (e.g., runPostMigrations(db) and service
initialization) use the restored logger. Locate the variables db and
originalLogger and replace the discarded Session call after AutoMigrate with a
reassignment that sets db = db.Session(&gorm.Session{Logger: originalLogger})
(or equivalent) to re-enable the original GORM logger.
---
Nitpick comments:
In `@server/cmd/worker/main.go`:
- Line 11: The code is brittle because it matches the error string to detect the
sentinel ErrScanJobAlreadyQueued; instead import the standard errors package and
replace string-based message matching with errors.Is(err,
server/internal/services.ErrScanJobAlreadyQueued) (or errors.As if you need to
extract a wrapped error type), updating the check in the worker main where the
current err.Error() substring check is used (apply the same change to the other
occurrence in this file). Ensure the errors import is added and remove the
fragile string comparison so wrapped sentinel errors are correctly recognized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2720797f-b89c-432d-8b00-e8adf2fc7dda
📒 Files selected for processing (2)
package.jsonserver/cmd/worker/main.go
✅ Files skipped from review due to trivial changes (1)
- package.json
| fs.IntVar(&limit, "limit", 0, "max number of items to enqueue (0 = no limit)") | ||
| fs.StringVar(&triggerType, "trigger-type", "manual", "scan job trigger type to record") | ||
| fs.IntVar(&priority, "priority", 1, "scan job priority") | ||
| fs.IntVar(&maxAttempts, "max-attempts", 2, "scan job max attempts") | ||
| fs.BoolVar(&dryRun, "dry-run", false, "preview items without inserting scan jobs") | ||
| _ = fs.Parse(args) |
There was a problem hiding this comment.
Reject negative queue controls.
Line 282 forwards priority and maxAttempts verbatim, and server/internal/services/scan_job_service.go:24-63 only normalizes zero. scan-backfill --priority=-1 or --max-attempts=-1 therefore persists invalid job settings instead of failing fast.
Suggested fix
fs.BoolVar(&dryRun, "dry-run", false, "preview items without inserting scan jobs")
_ = fs.Parse(args)
+ if priority < 0 {
+ log.Fatal("--priority must be >= 0")
+ }
+ if maxAttempts < 0 {
+ log.Fatal("--max-attempts must be >= 0")
+ }
cfg := config.Load()Also applies to: 282-285
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/cmd/worker/main.go` around lines 206 - 211, The CLI currently allows
negative values for the flags `priority` and `maxAttempts` (set via `fs.IntVar`)
which are later passed through unchanged; add validation immediately after `_ =
fs.Parse(args)` in `main` to reject negatives: if `priority < 0` or `maxAttempts
< 0` print a clear error and exit (non-zero) so invalid
`--priority`/`--max-attempts` are refused rather than persisted; ensure the
check uses the same variable names (`priority`, `maxAttempts`, `dryRun`,
`limit`, `triggerType`) so callers fail fast before any call into the scan job
creation/`scan_job_service` code.
| revision := revisionByItemID[item.ID] | ||
| if revision <= 0 { | ||
| revision = 1 | ||
| } |
There was a problem hiding this comment.
Skip items that have no version row.
server/internal/services/scan_job_service.go:24-63 inserts whatever itemRevision it receives. Falling back to 1 on Lines 272-274 can enqueue pending jobs for items that have no models.CapabilityVersion at all.
Suggested fix
- revision := revisionByItemID[item.ID]
- if revision <= 0 {
- revision = 1
- }
+ revision, ok := revisionByItemID[item.ID]
+ if !ok || revision <= 0 {
+ skipped++
+ log.Printf("skipped item=%s slug=%s reason=no-capability-version", item.ID, item.Slug)
+ continue
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| revision := revisionByItemID[item.ID] | |
| if revision <= 0 { | |
| revision = 1 | |
| } | |
| revision, ok := revisionByItemID[item.ID] | |
| if !ok || revision <= 0 { | |
| skipped++ | |
| log.Printf("skipped item=%s slug=%s reason=no-capability-version", item.ID, item.Slug) | |
| continue | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/cmd/worker/main.go` around lines 271 - 274, The current fallback to
revision = 1 for items with no version row (using revisionByItemID[item.ID]) can
enqueue jobs for items lacking any models.CapabilityVersion; instead, check the
map presence with the comma-ok idiom (or equivalent) when reading
revisionByItemID for item.ID and skip the item if the revision is missing or <=
0 (e.g., continue the loop) so only items with an actual version row are
enqueued; update the logic around revisionByItemID, item.ID and the surrounding
enqueuing code in main.go accordingly.
Summary by CodeRabbit
New Features
Documentation
Chores