Skip to content

chore: rename artifact_version.go + drop 6 P2 todos fixed by PR #14#15

Merged
yhuan123 merged 2 commits into
mainfrom
chore/rename-artifact-version-file
May 19, 2026
Merged

chore: rename artifact_version.go + drop 6 P2 todos fixed by PR #14#15
yhuan123 merged 2 commits into
mainfrom
chore/rename-artifact-version-file

Conversation

@yhuan123

@yhuan123 yhuan123 commented May 19, 2026

Copy link
Copy Markdown
Collaborator

背景

Plan 拆出三个 PR:

PR 3 原计划要删的 `createArtifactVersion` 在 PR 2 的 commit `12f63d2` 一并清掉了。这个 PR 把另外两个 long-overdue 的 cleanup 也一起收尾。

Summary

Commit 1 — rename Go 源文件 typo

  • `git mv pkg/operator/operatorhub/artifact_versiong.go → artifact_version.go`(多了个尾巴的 g)
  • 同步更新 `docs/brainstorms/`、`docs/plans/`、todos 中的旧文件名引用

Commit 2 — 清理 todos/

PR #14 的 `a036487` + `12f63d2` 已经把 6 个 P2 finding 实际修复。todos/ 目录定位为「看板」而非「档案」,按"完成即删"约定直接移除对应 pending 文件:

删除文件 修复 commit 改动要点
001 http-download-no-timeout a036487 downloadToTemp 用 `context.WithTimeout(ctx, o.timeout)` 包裹
004 package-manifest-substring-csv-match a036487 `strings.Contains` → 精确匹配 + surface NestedString type error
005 wait-av-no-retry-on-transient-errors a036487 `isTransientAPIError` + switch:NotFound / Transient 继续轮询
006 download-close-error-swallowed a036487 explicit Close + error check,防止 truncated tgz 绕过 sha256
007 empty-csv-fed-to-subscription a036487 `InstallArtifactVersion` 返回 `(av, csv, err)`
008 channel-default-inconsistency a036487 + 12f63d2 删 "stable" 静默 fallback;channel required-check 移到 `validateConfig`

剩余 pending P2(002 / 003 / 009 / 010 / 011 / 012)继续保留 — 代码尚未落地,归档会变成"编辑报告掩盖未做工作"。

验证

  • `go vet ./...` 通过
  • `go build ./...` 通过
  • `go test ./pkg/operator/operatorhub/... ./pkg/exec/...` 通过
  • `git grep -n "versiong"` 已无残留

回退性

纯 rename + 字符串替换 + 删 stale 文档,git revert 一键恢复,无运行时影响。

…o (PR 3/3)

修复 PR 1 之前就遗留的文件名 typo(多了一个尾巴的 g)。

PR 3 原计划要清理的 dead code `createArtifactVersion` 已经在 PR 2 的
commit 12f63d2 (Drop dead createArtifactVersion: violet binary owns the
write path now) 一并清掉了——此 PR 收尾把这个一直没人收的文件名 typo
也解决掉,确保仓库里再没有 `versiong` 这个错字。

同步更新引用:docs/brainstorms、docs/plans、todos/004/005/007/010。
其中 `:行号` 是 2026-05-18 plan / PR 2 review 时刻的历史快照,PR 2
重写后行号已不对应,但属于历史性引用,保留原值不动。

Plan: docs/plans/2026-05-18-refactor-externalize-violet-binary-plan.md
@alaudabot

alaudabot commented May 19, 2026

Copy link
Copy Markdown
Contributor

🤖 AI Code Review

Property Value
Model opencode/minimax-m2.5-free
Style strict
Issues Found 0
Config Source centralized
Profile ❌ Not Found
Personalized Prompt ❌ No
Prompt Path .github/review/profiles/alaudadevops/upgrade-test/pr-review.md
Alauda Skills ✅ base-acp-operator-list, base-acp-operator-release, base-authoring, base-m365, base-ocp-operator-list, base-skill-setup, builders-alauda-component-e2e-release, builders-alauda-component-upgrade, builders-alauda-pipeline, builders-claudetask-submit, builders-component-knowledge, builders-confluence, builders-dev-mesh-qa, builders-edge-ci-trace, builders-gitlab-ops, builders-helm-operator-generator, builders-install-cluster-plugin, builders-jira, builders-notify-wecom, builders-olm-operator-lifecycle, builders-prd-to-testcase, builders-publish-errata, builders-roadmap-studio, builders-story-split, builders-violet, builders-webapp-testing, cross-repo-add-mirror, cross-repo-publish, devops-add-bug-release-notes, devops-autodns, devops-bundle-csv-baseline-diff, devops-candidate-version-supervisor, devops-connectors-acceptance-test, devops-connectors-explore, devops-connectors-poc-case, devops-connectors-review, devops-connectors-unit-test, devops-connectors-upgrade-test, devops-connectors-write-user-docs, devops-creating-tekton-pipelines, devops-fix-go-vulns, devops-fork-alauda-binary-release, devops-gen-advanced-form-descriptors, devops-jira-rfd-acceptance, devops-knowledge-adoption, devops-pr-review, devops-refresh-containerfile-digests, devops-refresh-containerfile-tags, devops-replace-strings, devops-scan-docker-keywords, devops-sync-alauda-github-releases, devops-tekton-dynamic-form-optimizer, devops-tekton-operator-task-e2e, devops-tekton-pipeline-delivery, devops-tekton-refresh-results-tag, devops-tekton-task-delivery, devops-tekton-task-overview-template, devops-tekton-task-version-upgrade, devops-tekton-upgrade-notes, devops-tool-report-troubleshoot, devops-ui-e2e-code-audit, devops-ui-e2e-fix-base-on-report, devops-ui-e2e-regression-and-fix, devops-ui-generate-e2e-from-feature, devops-ui-pre-setup, devops-upgrade-go, devops-upstream-backport-cve, devops-upstream-upgrade
Reviewed at 2026-05-19 07:44:56 UTC

Summary

PR 3/3 renames artifact_versiong.goartifact_version.go and removes 7 stale todo files that tracked bugs already fixed in PRs 1–2. All doc references and todo bodies have been updated to the new filename. No code logic changes.

Review Statistics

Category Count
Critical Issues 0
Warnings 0
Suggestions 0
Files Reviewed 10

Critical Issues

(None)

Warnings

(None)

Suggestions

  • [todos/010-pending-p2-violet-config-validate-method.md:18] The artifact_versiongartifact_version fix is present but the file's issue_id still shows 010 and status: pending. If the Validate method has been implemented elsewhere, consider updating this todo to reflect its current state (e.g., status: in-progress or status: resolved).

Positive Feedback

  • Clean file rename with 100% similarity — no content risk
  • Stale todos from the earlier review cycle have been properly pruned, keeping the todos/ directory accurate
  • Documentation references in docs/brainstorms/ and docs/plans/ were consistently updated alongside the rename
  • The code in artifact_version.go (lines 15–111) shows good engineering: the InstallArtifactVersion signature now returns (*unstructured.Unstructured, string, error) — the CSV is extracted once and passed up, avoiding the silent-empty-csv bug (former todo 007)


ℹ️ About this review

This review was automatically generated using the run-actions workflow.

  • Shared prompt: .github/prompts/code-review.md
  • Config source: centralized
  • Profile path: Not Found
  • Profile ref: e75e733e9aa1b417a8b3c6441e53495dbcb418ad
  • No repository-specific prompt configured
  • Alauda skills: base-acp-operator-list, base-acp-operator-release, base-authoring, base-m365, base-ocp-operator-list, base-skill-setup, builders-alauda-component-e2e-release, builders-alauda-component-upgrade, builders-alauda-pipeline, builders-claudetask-submit, builders-component-knowledge, builders-confluence, builders-dev-mesh-qa, builders-edge-ci-trace, builders-gitlab-ops, builders-helm-operator-generator, builders-install-cluster-plugin, builders-jira, builders-notify-wecom, builders-olm-operator-lifecycle, builders-prd-to-testcase, builders-publish-errata, builders-roadmap-studio, builders-story-split, builders-violet, builders-webapp-testing, cross-repo-add-mirror, cross-repo-publish, devops-add-bug-release-notes, devops-autodns, devops-bundle-csv-baseline-diff, devops-candidate-version-supervisor, devops-connectors-acceptance-test, devops-connectors-explore, devops-connectors-poc-case, devops-connectors-review, devops-connectors-unit-test, devops-connectors-upgrade-test, devops-connectors-write-user-docs, devops-creating-tekton-pipelines, devops-fix-go-vulns, devops-fork-alauda-binary-release, devops-gen-advanced-form-descriptors, devops-jira-rfd-acceptance, devops-knowledge-adoption, devops-pr-review, devops-refresh-containerfile-digests, devops-refresh-containerfile-tags, devops-replace-strings, devops-scan-docker-keywords, devops-sync-alauda-github-releases, devops-tekton-dynamic-form-optimizer, devops-tekton-operator-task-e2e, devops-tekton-pipeline-delivery, devops-tekton-refresh-results-tag, devops-tekton-task-delivery, devops-tekton-task-overview-template, devops-tekton-task-version-upgrade, devops-tekton-upgrade-notes, devops-tool-report-troubleshoot, devops-ui-e2e-code-audit, devops-ui-e2e-fix-base-on-report, devops-ui-e2e-regression-and-fix, devops-ui-generate-e2e-from-feature, devops-ui-pre-setup, devops-upgrade-go, devops-upstream-backport-cve, devops-upstream-upgrade

PR #14 的 commit a036487 + 12f63d2 已经把以下 6 个 P2 finding 实际修复,
对应 todo 文件失去价值,直接删除(按"todos/ 是看板不是档案"约定)。

| 删除文件 | 修复 commit | 改动要点 |
|---------|-------------|---------|
| 001 http-download-no-timeout            | a036487 | downloadToTemp 用 context.WithTimeout(ctx, o.timeout) 包裹 |
| 004 package-manifest-substring-csv-match| a036487 | strings.Contains → 精确匹配 + surface NestedString type error |
| 005 wait-av-no-retry-on-transient-errors| a036487 | 引入 isTransientAPIError + switch 分支:NotFound/Transient 继续轮询 |
| 006 download-close-error-swallowed      | a036487 | explicit Close + 显式 error check,避免 truncated tgz 绕过 sha256 |
| 007 empty-csv-fed-to-subscription       | a036487 | InstallArtifactVersion 返回 (av, csv, err) 替代二次 NestedString |
| 008 channel-default-inconsistency       | a036487 + 12f63d2 | 删 "stable" 静默 fallback;channel required-check 移到 validateConfig |

剩余 pending 的 P2(002 / 003 / 009 / 010 / 011 / 012)继续保留——
代码尚未落地,归档为 completed 会变成"修报告掩盖未做工作"。
@yhuan123 yhuan123 changed the title chore(operatorhub): rename artifact_versiong.go to artifact_version.go (PR 3/3) chore: rename artifact_version.go + drop 6 P2 todos fixed by PR #14 May 19, 2026
@yhuan123 yhuan123 merged commit dcb1780 into main May 19, 2026
4 checks passed
@yhuan123 yhuan123 deleted the chore/rename-artifact-version-file branch May 19, 2026 07:48
yhuan123 pushed a commit that referenced this pull request May 19, 2026
Three silent-failure paths fixed (P1), four hardening / cleanup changes
applied (P2). All P1 fixes close anti-patterns where preflight reported
"clean" while actually bypassing the guard it was built to enforce.

P1 — silent-failure closures:
- #13/#14 cluster guard: in-cluster mode now requires --confirm-cluster
  matching violet.clusters; empty kubeconfig CurrentContext is now a hard
  error with an actionable 'kubectl config use-context' hint. Previously
  both paths log.Warn'd and returned nil, defeating the guard exactly when
  it mattered most (CI pods with violet.clusters set, multi-context kubeconfigs).
- #15 BundleVersion required for operatorhub: empty value used to slip
  past validateConfig, producing 'operatorhub-foo.' AV names with a
  trailing dot — preflight reported clean while upgrade failed late with
  cryptic 404s. Required at load time.

P2 — internal hardening:
- #16 Drop the seen map from runPreflight: dead code under fail-fast
  semantics (architecture-strategist + simplicity-reviewer both flagged
  it independently). -14 LOC.
- #018 checkInstallPlanResidue no longer discards the NestedString error.
  status.phase type drift now surfaces a wrapped error instead of being
  swallowed and producing a 'every IP is non-terminal' noise storm.
- #019 New preflight.NewResidual constructor centralises kubectl cleanup
  template + %q shell-quoting. Three call sites in operatorhub/preflight.go
  go from 5-line struct literals to one-line constructor calls.

P3 follow-ups (020-023) recorded as separate todos for later triage.

Tests for the cmd-layer changes (#17) land in the next commit.
yhuan123 added a commit that referenced this pull request May 20, 2026
* feat(operator): scaffold PreflightBaseline interface

- new pkg/operator/preflight package holds the Residual value type so
  both pkg/operator and its concrete impls can import without cycles
- OperatorInterface gains PreflightBaseline(ctx, version)
  ([]preflight.Residual, error); contract: read-only, baseline-only
- local: no-op returning (nil, nil) — README will document the assumption
  that `make deploy` must be idempotent
- operatorhub: scaffolding stub; real check logic lands in follow-up
- plan: docs/plans/2026-05-19-feat-upgrade-preflight-check-plan.md

* feat(operatorhub): implement PreflightBaseline residue checks

Three read-only checks against the baseline of an upgrade path:

- Subscription/<name> in <namespace>
- ArtifactVersion/<artifact>.<bundleVersion> in cpaas-system
- non-terminal InstallPlan in <namespace>, filtered by the OLM-managed
  label operators.coreos.com/<package>.<ns> so we never scan the
  namespace's full historical IP archive

Behavior:
- 30s context timeout caps the whole call so a hung apiserver cannot
  block the upgrade for the default o.timeout (10m)
- transient API errors (ServerTimeout / TooManyRequests /
  ServiceUnavailable) are wrapped with a 'retry the run' hint
- permanent errors propagate with the check name attached
- kubectl cleanup hints quote both name and namespace via %q so
  copy-pasted commands stay shell-safe

CSV residue is intentionally NOT checked: an independent CSV without
a corresponding Subscription or ArtifactVersion is a degenerate state
already covered by the other three. Plan documents how to add
PackageManifest-driven CSV resolution if a future incident requires it.

Tests cover clean cluster, single/multi residue, non-terminal phases
(empty/Planning/RequiresApproval/Installing), terminal phases
(Complete/Failed) ignored, transient + permanent error paths, and a
read-only contract assertion (spy reactor fails the test on any
Create/Update/Patch/Delete).

* feat(config): validate namespace + bundleVersion for operatorhub

Two new load-time checks gated on Type == operatorhub:

1. operatorConfig.namespace must be non-empty. An empty namespace
   would let preflight's Subscription / InstallPlan queries silently
   degrade to cluster-scope, producing false-positive residue
   reports from completely unrelated operators.

2. Version.bundleVersion must match ^[a-zA-Z0-9._-]+$. The field is
   interpolated into kubectl cleanup hints emitted by preflight and
   into violet argv, so allowing shell metacharacters here would
   propagate $, backticks, semicolons or quotes straight into the
   user's terminal. Centralising the check at load time closes the
   injection vector for every downstream consumer with a single
   chokepoint.

Both checks fire only for operatorhub type — local mode runs
`make deploy` and has no equivalent risk surface.

Tests cover: namespace required (operatorhub), namespace optional
(local), shell-active characters rejected, realistic versions
(v4.6.3 / 4.6.3-rc.91 / v0.74.0) accepted.

* feat(cmd): wire preflight + cluster guard into Execute

Three additions to the upgrade entry point, all ahead of the upgrade loop:

1. *PreflightError (cmd-internal) formats []preflight.Residual into a
   multi-line copy-pasteable report: one cleanup block per residual,
   then a finalizer-unstuck command template, then the bypass hint.
   The type stays in cmd/ so pkg/operator never picks up a dependency
   on how the CLI talks to the user. Decision C (all-English text) is
   the locked default; Error() is the single place to change wording
   without touching any contract.

2. assertClusterMatch enforces the operator≡kubectl cluster identity
   contract whenever operatorConfig.violet.clusters is set. The
   recommended rule (Decision B, locked at default B1) is exact
   string equality against KUBECONFIG's CurrentContext via
   --confirm-cluster. In-cluster runs (no kubeconfig file) degrade
   to a loud warning so CI pods aren't silently blocked.

3. runPreflight iterates UpgradePaths, calls op.PreflightBaseline on
   each baseline (Versions[0]), and fails fast at the first dirty
   path with a *PreflightError. (Kind, NS, Name) dedup is in place
   for future aggregation; --skip-preflight bypasses the whole step
   with a Warn-level audit line.

Cobra's SilenceUsage is set in AddFlags so that PreflightError's
actionable kubectl lines are not buried under --help output on a
non-zero exit. Flag-parsing errors still print usage (they execute
before RunE), so users who typo a flag are not stranded.

* docs(preflight): README usage + CLAUDE.md hard constraints

- README: new 'Preflight 前置检查' section documents what gets
  scanned (Subscription / ArtifactVersion / non-terminal InstallPlan),
  the 30s timeout, a sample error message, the two new flags
  (--skip-preflight / --confirm-cluster), and the minimum RBAC verbs
  needed to run preflight in isolation.

- CLAUDE.md:
  - 数据流: preflight + cluster guard inserted ahead of the upgrade
    loop; explicit note that preflight does NOT honour cfg.Immediate
    (quality gate, not part of the upgrade chain).
  - Operator 抽象 paragraph mentions PreflightBaseline and the leaf
    pkg/operator/preflight package created to avoid an import cycle.
  - 硬约束: four new entries — PreflightBaseline must stay read-only
    and baseline-only (with a note on why CSV is intentionally
    excluded and where to add it back if needed); cmd.SilenceUsage
    is mandatory; --confirm-cluster matching rule (B1 default);
    BundleVersion regex chokepoint.

- Plan acceptance criteria all marked complete.

* docs(plan): mark preflight-check plan as completed

* fix: address P1+P2 review findings for preflight check

Three silent-failure paths fixed (P1), four hardening / cleanup changes
applied (P2). All P1 fixes close anti-patterns where preflight reported
"clean" while actually bypassing the guard it was built to enforce.

P1 — silent-failure closures:
- #13/#14 cluster guard: in-cluster mode now requires --confirm-cluster
  matching violet.clusters; empty kubeconfig CurrentContext is now a hard
  error with an actionable 'kubectl config use-context' hint. Previously
  both paths log.Warn'd and returned nil, defeating the guard exactly when
  it mattered most (CI pods with violet.clusters set, multi-context kubeconfigs).
- #15 BundleVersion required for operatorhub: empty value used to slip
  past validateConfig, producing 'operatorhub-foo.' AV names with a
  trailing dot — preflight reported clean while upgrade failed late with
  cryptic 404s. Required at load time.

P2 — internal hardening:
- #16 Drop the seen map from runPreflight: dead code under fail-fast
  semantics (architecture-strategist + simplicity-reviewer both flagged
  it independently). -14 LOC.
- #018 checkInstallPlanResidue no longer discards the NestedString error.
  status.phase type drift now surfaces a wrapped error instead of being
  swallowed and producing a 'every IP is non-terminal' noise storm.
- #019 New preflight.NewResidual constructor centralises kubectl cleanup
  template + %q shell-quoting. Three call sites in operatorhub/preflight.go
  go from 5-line struct literals to one-line constructor calls.

P3 follow-ups (020-023) recorded as separate todos for later triage.

Tests for the cmd-layer changes (#17) land in the next commit.

* test(cmd): unit tests for assertClusterMatch and runPreflight

Closes P2 #17 — the two cmd-layer functions added by PR #17 carried
zero coverage (pr-test-analyzer criticality 9).

Coverage:
- assertClusterMatch (10 sub-tests): violet nil / clusters empty no-ops;
  in-cluster missing/mismatched/matching --confirm-cluster; file-mode
  unreadable kubeconfig, empty current-context (pins the #14 fix in
  place), missing/mismatched/matching --confirm-cluster.
- runPreflight (5 sub-tests): clean cluster, empty Versions skipped,
  baseline-only invariant (regression guard: a refactor scanning all
  versions would surface mid-upgrade AVs as false positives), fail-fast
  across paths (second path's PreflightBaseline is verifiably not
  reached), operator errors propagate unwrapped (must NOT be wrapped
  as *PreflightError).

The fake operator records the Version passed to each PreflightBaseline
call so the baseline-only test asserts the actual contract rather than
just call count.

* chore(todos): remove completed P1/P2 review findings

The 7 todos (013-019) that drove the silent-failure fixes + cmd-layer
tests are done; their substance lives in the PR description and the
commits that closed them (1201665 + f721f8e). Removing the files
keeps todos/ as a list of OPEN work, which is its only useful state.

Remaining in todos/: 4 P3 follow-ups for this PR (020-023) and 6 P2
items from prior PRs that have not been actioned yet.

* chore(todos): drop P3 review follow-ups for PR #17

The four P3 nice-to-haves (test variants, internal simplifications,
agent-ergonomics knobs, misc cleanups) are not on the roadmap and
filing them as todos would just decorate the backlog with low-value
items nobody plans to pick up. The PR description still records the
reviewer recommendations for archeology; the file copies were
redundant.

---------

Co-authored-by: huanyang@alauda.io <huanyang@alauda.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants