-
Notifications
You must be signed in to change notification settings - Fork 216
NO-ISSUE: OTA-1605 Automate OCP-42543 #1309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@JianLi-RH: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughAdds a new cluster-version-operator payload and a Ginkgo test to verify manifests annotated Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: JianLi-RH The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 99-105: Update the assertion message to accurately describe the
expectation: change the misleading message on the
o.Expect(err).NotTo(o.HaveOccurred(), ...) call so it states that manifest
extraction should succeed (or that no error is expected) when invoking
ocClient.AdmReleaseExtract with manifestDir
(ocClient.AdmReleaseExtract(manifestDir)); reference the manifest extraction
operation and use clear text like "expected manifest extraction to succeed" or
"no error expected when extracting manifests" instead of "The NotFound error
should occur when extract manifests".
- Around line 117-119: The loop opens files with os.Open and uses defer
file.Close(), which leaks descriptors until the enclosing function returns;
replace the deferred close by closing each file immediately after its processing
(call file.Close() directly at the end of the loop iteration) or move the
per-file logic into a helper function (e.g., processManifestFile(filePath) that
opens the file and defers file.Close() inside that helper) so file handles are
released promptly; update references to the file variable and remove the in-loop
defer file.Close() in the code around os.Open and file usage.
🧹 Nitpick comments (1)
test/oc/cli/cli.go (1)
118-125: Duplicate logging inRunmethod.The
ocExecutor.runmethod (lines 38-42) already logs success/error with command details. ThisRunwrapper adds redundant logging, resulting in double log entries for each command execution.Consider removing the duplicate logging here since the executor already handles it:
♻️ Proposed simplification
func (c *client) Run(args ...string) ([]byte, error) { - b, err := c.executor.run(args...) - if err != nil { - c.logger.Error(err, "running command failed", "output", string(b)) - } else { - c.logger.Info("running command succeeded.") - } - return b, err + return c.executor.run(args...) }
8d09bb9 to
f175131
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 134-140: The YAML decode loop currently swallows decode errors
with a continue, but the test should fail instead; replace the silent continue
with an assertion that fails the test on decode errors by calling
o.Expect(err).NotTo(o.HaveOccurred(), "decode manifest %s failed", filePath)
immediately after decoder.Decode(&doc) (remove the unreachable continue) so any
decoder.Decode failure in the loop (the symbol decoder.Decode and variable doc)
causes the test to fail and reports the filePath context.
- Around line 141-145: The annotation check in the loop (variables meta, ann and
the key annotation in test/cvo/cvo.go) wrongly assumes the value is the string
"true"; change the check to handle both boolean true and the string "true" (and
treat other truthy forms if present) by retrieving v := ann[annotation] and
using a type switch or explicit type assertions to accept v == true or v ==
"true" (and skip if ann is nil or the value is absent), so resources with
unquoted YAML true are correctly detected for deletion.
- Around line 103-108: The current code uses a fixed manifestDir.To =
"/tmp/OTA-42543-manifest" which can collide in parallel runs; replace that with
a unique temp directory created via os.MkdirTemp and assign the returned path to
manifestDir.To (handle and return/log any error from MkdirTemp), keep the defer
to os.RemoveAll(manifestDir.To) for cleanup, and then call
ocClient.AdmReleaseExtract(manifestDir) as before; update the code around the
manifestDir variable, ocapi.ReleaseExtractOptions initialization, and the defer
cleanup to use the MkdirTemp-created path.
| nameLower := strings.ToLower(entry.Name()) | ||
| if strings.Contains(nameLower, "cleanup") { | ||
| logger.Info(fmt.Sprintf("Skipping file %s because it matches cleanup filter", entry.Name())) | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with the logging I do not see why we skip it.
Could a resource have cleanup in its file name and annotation release.openshift.io/delete=true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a manifest can have cleanup in its name and annotation release.openshift.io/delete=true in its content at same time.
We can ignore this situation, in my impression this object can be deployed (not entirely sure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we find it out and make a comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added in https://github.com/openshift/openshift-tests-private/pull/26185, but there is no comment. maybe we discussed it out of git :(
| args = append(args, "-n", namespace) | ||
| } | ||
| _, err := ocClient.Run(args...) | ||
| o.Expect(err).To(o.HaveOccurred(), "The deleted manifest should not be installed, but actually installed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha. If I understand the code correctly, you are doing the following command with oc-cli:
$ oc get <kind> <name> -n <namespace>Because KIND|GROUP|VERSION are dynamic, it is not easy to do it via client-go (See how CVO does it). Correct?
It is really nasty, and I really do not want it but I do not have a better way.
HOWEVER, I think this should work (which is much simpler if it does) for your case:
cluster-version-operator/vendor/github.com/openshift/library-go/pkg/manifest/manifest.go
Line 386 in e9762c6
| func ParseManifests(r io.Reader) ([]Manifest, error) { |
- Parse manifests out of files in payload
- check if a manifest.Raw contains string
release.openshift.io/delete=true;
if yes, save it to a temp file and dooc get -fcommand with the temp file and expect not-found error(s).
You do not need to any yaml/json parsing here. And you will get Manifest for free. GVK is also difficult to use correctly and the way you do it now might not be accurate (for example, you are using Kind only, not Version nor Group).
Let me know what you think about it or you need more clarification.
Unlike other cases, a simple shell script would do the case. But we like Go code more. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me give a try today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in recent commit and passed in my local machine:
[jianl@jianl-thinkpadt14gen4 cluster-version-operator]$ _output/linux/amd64/cluster-version-operator-tests run-test "[Jira:\"Cluster Version Operator\"] cluster-version-operator should not install resources annotated with release.openshift.io/delete=true"
Running Suite: - /home/jianl/1_code/cluster-version-operator
=============================================================
Random Seed: 1770004261 - will randomize all specs
Will run 1 of 1 specs
------------------------------
[Jira:"Cluster Version Operator"] cluster-version-operator should not install resources annotated with release.openshift.io/delete=true [Conformance, High, 42543]
/home/jianl/1_code/cluster-version-operator/test/cvo/cvo.go:90
STEP: Setting up oc @ 02/02/26 11:51:01.098
cluster-version-operator-tests "level"=0 "msg"="will use the environment timeout variable to run command: 90s"
cluster-version-operator-tests "level"=0 "msg"="timeout is: 1m30s"
STEP: Extracting manifests in the release @ 02/02/26 11:51:01.099
cluster-version-operator-tests "level"=0 "msg"="Extract manifests to: /tmp/OTA-42543-manifest"
cluster-version-operator-tests "level"=0 "msg"="the output directory does not exist, will create it: /tmp/OTA-42543-manifest"
cluster-version-operator-tests "level"=0 "msg"="the output directory has been created: /tmp/OTA-42543-manifest"
cluster-version-operator-tests "level"=0 "msg"="Running command succeeded." "cmd"="/usr/local/sbin/oc" "args"="adm release extract --to=/tmp/OTA-42543-manifest"
STEP: Checking if getting manifests with release.openshift.io/delete on the cluster led to not-found error @ 02/02/26 11:51:31.859
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_31_cluster-baremetal-operator_06_deployment-hostedcluster-delete.yaml" "output"="Error from server (NotFound): deployments.apps \"cluster-baremetal-operator-hostedcluster\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_50_cloud-credential-operator_01-service-delete.yaml" "output"="Error from server (NotFound): services \"controller-manager-service\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_70_cluster-network-operator_02_rbac.yaml" "output"="NAME SECRETS AGE\nserviceaccount/cluster-network-operator 1 96m\n\nNAME ROLE AGE\nclusterrolebinding.rbac.authorization.k8s.io/cluster-network-operator ClusterRole/cluster-admin 96m\nError from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-cluster-network-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_cluster-authentication-operator_03_prometheusrule.yaml" "output"="Error from server (NotFound): prometheusrules.monitoring.coreos.com \"authentication-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="failed to parse manifest file: /tmp/OTA-42543-manifest/release-metadata" "error"="error parsing: Resource with fields Group: \"\" Kind: \"cincinnati-metadata-v0\" Name: \"\" must contain kubernetes required fields kind and name"
• [46.763 seconds]
------------------------------
Ran 1 of 1 Specs in 46.764 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
{
"name": "[Jira:\"Cluster Version Operator\"] cluster-version-operator should not install resources annotated with release.openshift.io/delete=true",
"lifecycle": "blocking",
"duration": 46764,
"startTime": "2026-02-02 03:51:01.095874 UTC",
"endTime": "2026-02-02 03:51:47.859924 UTC",
"result": "passed",
"output": " STEP: Setting up oc @ 02/02/26 11:51:01.098\n STEP: Extracting manifests in the release @ 02/02/26 11:51:01.099\n STEP: Checking if getting manifests with release.openshift.io/delete on the cluster led to not-found error @ 02/02/26 11:51:31.859\n"
}
]
[jianl@jianl-thinkpadt14gen4 cluster-version-operator]$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some errors from the new output:
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="Running command failed" "error"="exit status 1" "cmd"="/usr/local/sbin/oc" "args"="get -f /tmp/OTA-42543-manifest/0000_90_machine-config_90_deletion.yaml" "output"="Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io \"default-account-openshift-machine-config-operator\" not found\nError from server (NotFound): cronjobs.batch \"machine-config-nodes-crd-cleanup\" not found\nError from server (NotFound): services \"machine-config-operator\" not found\nError from server (NotFound): servicemonitors.monitoring.coreos.com \"machine-config-operator\" not found\n"
cluster-version-operator-tests "msg"="failed to parse manifest file: /tmp/OTA-42543-manifest/release-metadata" "error"="error parsing: Resource with fields Group: \"\" Kind: \"cincinnati-metadata-v0\" Name: \"\" must contain kubernetes required fields kind and name"
This is the content of related manifest:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
include.release.openshift.io/single-node-developer: "true"
release.openshift.io/delete: "true"
name: default-account-openshift-machine-config-operator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: default
namespace: openshift-machine-config-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
annotations:
include.release.openshift.io/self-managed-high-availability: "true"
include.release.openshift.io/single-node-developer: "true"
release.openshift.io/delete: "true"
release.openshift.io/feature-set: Default
name: machine-config-nodes-crd-cleanup
namespace: openshift-machine-config-operator
It also not found in previous code.
| type OC interface { | ||
| AdmReleaseExtract(o ReleaseExtractOptions) error | ||
| Version(o VersionOptions) (string, error) | ||
| Run(args ...string) ([]byte, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get(args ...string) (string, error) should be enough for your case.
The method Run() is be for oc run command.
If the idea https://github.com/openshift/cluster-version-operator/pull/1309/changes#r2751255955 works out, I would just do
GetFileExpectNotFoundError(args ...string) (string, error) to avoid abuse of oc get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like I said before https://github.com/openshift/cluster-version-operator/pull/1267/changes#r2579341159
If we adding GetFileExpectNotFoundError(), Run to OC client, we have to adding them to the interface as well. This is really not a good practice for using interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like to implement interface for a single instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implement interface for a single instance
Do you mean the interface OC has only one struct that implements it at the moment?
We thought we have gone thro it in the last round.
Moreover, the interface tells us how many oc cmds we have to reply on.
Everyone of them is a compromise we made between go-library and os.exec.
It will save the time for us to make a Sheet like David did the last time.
If you want to have more discussions about it, we can certainly do it.
We can always revisit it in the future if it brings us more pains than gains.
My wild guess is that it will settle down after a while.
But now we just started to use it, the methods there might grow.
But I will try my best to keep the things there under control. 🤞
As we discussed, direct calls of an oc cmd is the last resort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure we can have a discussion.
I agree works will be settle down after a while. But this is completely different from how I used to use interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to convince me of making Run function to run any oc cmd, should your reason be more specific? 🙂
My reasons are given in the previous comment.
test/oc/cli/cli.go
Outdated
| _, err := os.Stat(o.To) | ||
| if errors.Is(err, os.ErrNotExist) { | ||
| c.logger.Info(fmt.Sprintf("the output directory does not exist, will create it: %s", o.To)) | ||
| if err = os.Mkdir(o.To, 0755); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let us make the directory in the case, instead of the function.
The method here just calls oc adm release extract (maybe include some logs for debugging), nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also discuss this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reason is to keep this pkg as thin as possible because this (spawning a process to call oc) is the last resort.
Pushing more logic into this is the other way around.
test/oc/cli/cli.go
Outdated
| @@ -70,12 +70,13 @@ func NewOCCli(logger logr.Logger) (api.OC, error) { | |||
| timeout := 30 * time.Second | |||
| timeoutStr := os.Getenv("OC_CLI_TIMEOUT") | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 30s too short for oc adm release extract? You want 90s for it?
In that case, let us do another function.
func NewOCCliWithTimeout(logger logr.Logger, timeout time.Duration) (api.OC, error)
and
func NewOCCli(logger logr.Logger) (api.OC, error) {
return NewOCCli(logger, 30 * time.Second) (api.OC, error)
}
We could remove the logic about OC_CLI_TIMEOUT (i think no one is using it at the moment). I have to admit that I did not understand your request here.
Please do it in another commit. I can do it too if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, please go ahead, I will not update my code for now.
BTW, I really do not want to introduce a new function for a parameter.
Today we add NewOCCliWithTimeout for timeout, tomorrow we may add other functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today we add NewOCCliWithTimeout for timeout, tomorrow we may add other functions.
Fair point. I will make an option for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There it goes:
f175131 to
788dc98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 121-160: The deferred close inside the loop captures the loop
variable file by reference and is unnecessary because file.Close() is already
called explicitly later; remove the inline defer that references closeFilePass
and the closeFilePass variable declaration, and rely on the existing explicit
err = file.Close() error handling block (keep the o.Expect checks around
file.Close()); ensure no other code paths expect closeFilePass and that
file.Close() remains invoked for each opened file.
test/cvo/cvo.go
Outdated
| file, err := os.Open(filePath) | ||
| o.Expect(err).NotTo(o.HaveOccurred()) | ||
| defer func() { | ||
| if !closeFilePass { | ||
| // Close the file again | ||
| if err = file.Close(); err != nil { | ||
| o.Expect(err).NotTo(o.HaveOccurred(), "close file failed") | ||
| } | ||
| } | ||
| }() | ||
| decoder := yamlv3.NewDecoder(file) | ||
| for { | ||
| var doc map[string]interface{} | ||
| if err := decoder.Decode(&doc); err != nil { | ||
| if err == io.EOF { | ||
| break | ||
| } | ||
| continue | ||
| } | ||
| meta, _ := doc["metadata"].(map[string]interface{}) | ||
| ann, _ := meta["annotations"].(map[string]interface{}) | ||
| if ann == nil || ann[annotation] != "true" { | ||
| continue | ||
| } | ||
| kind, _ := doc["kind"].(string) | ||
| name, _ := meta["name"].(string) | ||
| namespace, _ := meta["namespace"].(string) | ||
| args := []string{"get", kind, name} | ||
| if namespace != "" { | ||
| args = append(args, "-n", namespace) | ||
| } | ||
| _, err := ocClient.Run(args...) | ||
| o.Expect(err).To(o.HaveOccurred(), "The deleted manifest should not be installed, but actually installed") | ||
| } | ||
| // close each file | ||
| err = file.Close() | ||
| if err != nil { | ||
| closeFilePass = false | ||
| o.Expect(err).NotTo(o.HaveOccurred(), "close file failed") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File handle management is overly complex and still has a subtle bug.
The defer at line 123 captures file by reference in a loop—all accumulated defers will operate on the last assigned file when the function exits. The closeFilePass flag adds complexity without fixing this. Since you already call file.Close() explicitly at line 156, remove the inner defer entirely.
🐛 Proposed simplification
filePath := filepath.Join(manifestDir.To, manifest.Name())
file, err := os.Open(filePath)
o.Expect(err).NotTo(o.HaveOccurred())
- defer func() {
- if !closeFilePass {
- // Close the file again
- if err = file.Close(); err != nil {
- o.Expect(err).NotTo(o.HaveOccurred(), "close file failed")
- }
- }
- }()
decoder := yamlv3.NewDecoder(file)
for {
// ... decode loop unchanged ...
}
- // close each file
- err = file.Close()
- if err != nil {
- closeFilePass = false
- o.Expect(err).NotTo(o.HaveOccurred(), "close file failed")
- }
+ file.Close()
}Also remove the closeFilePass variable declaration at line 114.
🤖 Prompt for AI Agents
In `@test/cvo/cvo.go` around lines 121 - 160, The deferred close inside the loop
captures the loop variable file by reference and is unnecessary because
file.Close() is already called explicitly later; remove the inline defer that
references closeFilePass and the closeFilePass variable declaration, and rely on
the existing explicit err = file.Close() error handling block (keep the o.Expect
checks around file.Close()); ensure no other code paths expect closeFilePass and
that file.Close() remains invoked for each opened file.
159290d to
875c19b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 136-137: The test currently calls ocClient.Run("get", "-f",
filePath) which queries the whole manifest; instead parse the manifest
referenced by filePath to extract the specific resource's kind, name, and
namespace (or default namespace) and call ocClient.Run("get", <kind>, <name>,
"-n", <namespace>) asserting the returned error is a NotFound (use the same
error check helper used elsewhere or assert the error string contains
"NotFound"); update the assertion around ocClient.Run and replace the file-based
get with the per-resource get so the deleted resource is verified precisely
(refer to ocClient.Run and filePath variables to locate the code to change).
| _, err := ocClient.Run("get", "-f", filePath) | ||
| o.Expect(err).To(o.HaveOccurred(), "The deleted manifest should not be installed, but actually installed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file structure and find the exact code
head -n 150 test/cvo/cvo.go | tail -n 50Repository: openshift/cluster-version-operator
Length of output: 2095
🏁 Script executed:
# Get more context around lines 136-137
sed -n '120,160p' test/cvo/cvo.goRepository: openshift/cluster-version-operator
Length of output: 822
🏁 Script executed:
# Check if ms is defined and what ms.Obj refers to
sed -n '100,200p' test/cvo/cvo.goRepository: openshift/cluster-version-operator
Length of output: 1765
Query the specific resource instead of the entire manifest file to ensure accurate NotFound verification.
oc get -f filePath processes all resources in the manifest file, which makes it impossible to verify that a specific resource is deleted when the file contains multiple resources. Query the individual resource by kind/name/namespace extracted from the manifest and verify the error is NotFound.
Proposed fix
- _, err := ocClient.Run("get", "-f", filePath)
- o.Expect(err).To(o.HaveOccurred(), "The deleted manifest should not be installed, but actually installed")
+ kind := ms.Obj.GetKind()
+ name := ms.Obj.GetName()
+ namespace := ms.Obj.GetNamespace()
+ args := []string{"get", kind, name}
+ if namespace != "" {
+ args = append(args, "-n", namespace)
+ }
+ output, err := ocClient.Run(args...)
+ o.Expect(err).To(o.HaveOccurred(), "The deleted manifest should not be installed, but actually installed")
+ o.Expect(string(output)).To(o.ContainSubstring("NotFound"))🤖 Prompt for AI Agents
In `@test/cvo/cvo.go` around lines 136 - 137, The test currently calls
ocClient.Run("get", "-f", filePath) which queries the whole manifest; instead
parse the manifest referenced by filePath to extract the specific resource's
kind, name, and namespace (or default namespace) and call ocClient.Run("get",
<kind>, <name>, "-n", <namespace>) asserting the returned error is a NotFound
(use the same error check helper used elsewhere or assert the error string
contains "NotFound"); update the assertion around ocClient.Run and replace the
file-based get with the per-resource get so the deleted resource is verified
precisely (refer to ocClient.Run and filePath variables to locate the code to
change).
|
/hold |
875c19b to
c43b487
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/oc/cli/cli.go`:
- Around line 92-100: os.Stat call result is only checked for os.ErrNotExist, so
other errors (e.g., permission denied) are ignored; modify the block around
os.Stat(o.To) to handle non-nil errors that are not os.ErrNotExist by returning
or logging the error (include context about o.To) before proceeding to os.Mkdir;
preserve the existing os.ErrNotExist path that uses os.Mkdir and c.logger.Info,
but ensure any other err from os.Stat is propagated (or wrapped) so downstream
code doesn't run on invalid assumptions.
🧹 Nitpick comments (1)
test/cvo/cvo.go (1)
121-129: Simplify error handling and consider stricter parse-error handling.The
if err != nilcheck at line 122 is redundant sinceo.Expect()handles nil errors correctly. For parse errors (line 126-129), silently continuing could mask issues with actual manifest files. Consider distinguishing between expected non-manifest files (like release-metadata) and unexpected parse failures.♻️ Suggested simplification
raw, err := os.ReadFile(filePath) - if err != nil { - o.Expect(err).NotTo(o.HaveOccurred(), "failed to read manifest file") - } + o.Expect(err).NotTo(o.HaveOccurred(), "failed to read manifest file: %s", filePath) manifests, err := manifest.ParseManifests(bytes.NewReader(raw)) if err != nil { - // files like release-metadata are not manifest file, so skip them - logger.Error(err, "failed to parse manifest file: "+filePath) + // Non-manifest files (e.g., release-metadata) are expected to fail parsing + logger.Info(fmt.Sprintf("skipping non-manifest file %s: %v", filePath, err)) + continue }
test/oc/cli/cli.go
Outdated
| _, err := os.Stat(o.To) | ||
| if errors.Is(err, os.ErrNotExist) { | ||
| c.logger.Info(fmt.Sprintf("the output directory does not exist, will create it: %s", o.To)) | ||
| if err = os.Mkdir(o.To, 0755); err != nil { | ||
| err = fmt.Errorf("failed to create directory: %v", err) | ||
| return err | ||
| } | ||
| c.logger.Info(fmt.Sprintf("the output directory has been created: %s", o.To)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-ErrNotExist errors from os.Stat are silently ignored.
If os.Stat fails with an error other than os.ErrNotExist (e.g., permission denied), the code continues execution without reporting it, which could lead to confusing failures downstream.
Proposed fix
func (c *client) AdmReleaseExtract(o api.ReleaseExtractOptions) error {
_, err := os.Stat(o.To)
- if errors.Is(err, os.ErrNotExist) {
+ if err != nil && !errors.Is(err, os.ErrNotExist) {
+ return fmt.Errorf("failed to stat output directory %s: %w", o.To, err)
+ } else if errors.Is(err, os.ErrNotExist) {
c.logger.Info(fmt.Sprintf("the output directory does not exist, will create it: %s", o.To))
if err = os.Mkdir(o.To, 0755); err != nil {
- err = fmt.Errorf("failed to create directory: %v", err)
+ err = fmt.Errorf("failed to create directory: %w", err)
return err
}
c.logger.Info(fmt.Sprintf("the output directory has been created: %s", o.To))
}🤖 Prompt for AI Agents
In `@test/oc/cli/cli.go` around lines 92 - 100, os.Stat call result is only
checked for os.ErrNotExist, so other errors (e.g., permission denied) are
ignored; modify the block around os.Stat(o.To) to handle non-nil errors that are
not os.ErrNotExist by returning or logging the error (include context about
o.To) before proceeding to os.Mkdir; preserve the existing os.ErrNotExist path
that uses os.Mkdir and c.logger.Info, but ensure any other err from os.Stat is
propagated (or wrapped) so downstream code doesn't run on invalid assumptions.
c43b487 to
df9c6b0
Compare
df9c6b0 to
2edaa45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 132-136: The code currently calls
manifest.ParseManifests(bytes.NewReader(raw)) and only logs parse errors with
logger.Error, which hides real failures; change this by first skipping known
non-manifest filenames (e.g., "release-metadata") before calling ParseManifests
(use a small skip list keyed on filePath) and, for any other file where
manifest.ParseManifests returns an error, treat it as a test/fatal error instead
of just logging (fail the test or return the error), so that true parse failures
are surfaced; update the ParseManifests call/site and replace the logger.Error
swallow with a conditional skip or a fatal/return on error.
In `@test/util/io_util.go`:
- Around line 11-19: CreateFolder currently ignores non-NotExist errors from
os.Stat and silently succeeds if path exists but is a file; update CreateFolder
to check the error returned by os.Stat: if err != nil and not errors.Is(err,
os.ErrNotExist) return a wrapped error (include context), and if os.Stat
succeeds verify the FileInfo.IsDir() and return an error if it's not a
directory; when directory doesn't exist, use os.Mkdir (or os.MkdirAll if nested
paths are expected) and propagate/wrap any mkdir errors from os.Mkdir so callers
see failures.
🧹 Nitpick comments (1)
test/cvo/cvo.go (1)
128-131: Redundant error check before assertion.The
if err != nilblock is unnecessary sinceo.Expect(err).NotTo(o.HaveOccurred())will fail the test iferris non-nil regardless of the condition.♻️ Simplified code
raw, err := os.ReadFile(filePath) - if err != nil { - o.Expect(err).NotTo(o.HaveOccurred(), "failed to read manifest file") - } + o.Expect(err).NotTo(o.HaveOccurred(), "failed to read manifest file: "+filePath)
| manifests, err := manifest.ParseManifests(bytes.NewReader(raw)) | ||
| if err != nil { | ||
| // files like release-metadata are not manifest file, so skip them | ||
| logger.Error(err, "failed to parse manifest file: "+filePath) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parse errors are silently ignored, potentially masking real failures.
While the comment notes that some files like release-metadata aren't manifests, logging the error and continuing means actual manifest parsing failures will also be silently skipped. This could lead to false negatives where annotated resources aren't checked.
💡 Suggested improvement
manifests, err := manifest.ParseManifests(bytes.NewReader(raw))
if err != nil {
- // files like release-metadata are not manifest file, so skip them
- logger.Error(err, "failed to parse manifest file: "+filePath)
+ // Non-manifest files (release-metadata, image-references) are expected to fail parsing
+ if !strings.HasPrefix(manifestFile.Name(), "release-") && !strings.HasPrefix(manifestFile.Name(), "image-") {
+ o.Expect(err).NotTo(o.HaveOccurred(), "failed to parse manifest file: "+filePath)
+ }
+ continue
}Or explicitly check for known non-manifest filenames before attempting to parse.
🤖 Prompt for AI Agents
In `@test/cvo/cvo.go` around lines 132 - 136, The code currently calls
manifest.ParseManifests(bytes.NewReader(raw)) and only logs parse errors with
logger.Error, which hides real failures; change this by first skipping known
non-manifest filenames (e.g., "release-metadata") before calling ParseManifests
(use a small skip list keyed on filePath) and, for any other file where
manifest.ParseManifests returns an error, treat it as a test/fatal error instead
of just logging (fail the test or return the error), so that true parse failures
are surfaced; update the ParseManifests call/site and replace the logger.Error
swallow with a conditional skip or a fatal/return on error.
test/util/io_util.go
Outdated
| func CreateFolder(path string, perm os.FileMode) error { | ||
| _, err := os.Stat(path) | ||
| if errors.Is(err, os.ErrNotExist) { | ||
| if err = os.Mkdir(path, perm); err != nil { | ||
| err = fmt.Errorf("failed to create directory: %v", err) | ||
| return err | ||
| } | ||
| } | ||
| return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent failure when os.Stat returns unexpected errors.
If os.Stat fails with an error other than os.ErrNotExist (e.g., permission denied), the function returns nil without creating the directory and without reporting the error. Additionally, if path exists but is a file rather than a directory, no error is returned.
🐛 Proposed fix
func CreateFolder(path string, perm os.FileMode) error {
- _, err := os.Stat(path)
- if errors.Is(err, os.ErrNotExist) {
- if err = os.Mkdir(path, perm); err != nil {
- err = fmt.Errorf("failed to create directory: %v", err)
- return err
- }
+ info, err := os.Stat(path)
+ if err == nil {
+ if !info.IsDir() {
+ return fmt.Errorf("path exists but is not a directory: %s", path)
+ }
+ return nil
+ }
+ if !errors.Is(err, os.ErrNotExist) {
+ return fmt.Errorf("failed to stat path: %w", err)
+ }
+ if err := os.Mkdir(path, perm); err != nil {
+ return fmt.Errorf("failed to create directory: %w", err)
}
return nil
}🤖 Prompt for AI Agents
In `@test/util/io_util.go` around lines 11 - 19, CreateFolder currently ignores
non-NotExist errors from os.Stat and silently succeeds if path exists but is a
file; update CreateFolder to check the error returned by os.Stat: if err != nil
and not errors.Is(err, os.ErrNotExist) return a wrapped error (include context),
and if os.Stat succeeds verify the FileInfo.IsDir() and return an error if it's
not a directory; when directory doesn't exist, use os.Mkdir (or os.MkdirAll if
nested paths are expected) and propagate/wrap any mkdir errors from os.Mkdir so
callers see failures.
f82f200 to
1576833
Compare
1576833 to
a110f44
Compare
…se.openshift.io/delete=true
a110f44 to
21ec0e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/cvo/cvo.go`:
- Around line 101-104: oc.NewOC is being called with a logger value but its
signature expects an ocapi.Options struct; update the call site so you construct
and pass an ocapi.Options (e.g. create an ocapi.Options instance with the Logger
field set to logger) instead of passing logger directly, ensuring option wiring
is preserved (adjust variable names around oc.NewOC and ocClient if needed).
|
@JianLi-RH: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test case: https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-42543
Test it locally:
/cc @hongkailiu @DavidHurta