[WiP]SREP-3024: Add E2E tests for ACME DNS-01 challenge verification by TheUndeadKing · Pull Request #424 · openshift/certman-operator

TheUndeadKing · 2026-02-19T09:08:34Z

Summary

Adds comprehensive E2E tests for ACME DNS-01 challenge verification with real AWS Route53 integration.

Changes Made

New E2E Tests

TXT Record Creation Test: Verifies that _acme-challenge TXT records are created in Route53 for DNS-01 challenge
Certificate Creation Test: Validates primary-cert-bundle-secret creation with proper certificate data
TXT Record Cleanup Test: Ensures TXT records are deleted from Route53 after certificate issuance
Certificate Renewal Test: Simulates and validates certificate renewal process

Utility Functions Added

CreateRoute53Client(): Creates AWS Route53 client for DNS operations
GetHostedZoneIDFromDNSZone(): Extracts hosted zone ID from Hive DNSZone resource
ListAcmeChallengeTXTRecords(): Lists ACME challenge TXT records from Route53
FindDNSZoneForClusterDeployment(): Finds DNSZone resource for a ClusterDeployment
VerifyDNSZoneReady(): Verifies DNSZone is ready with valid zone ID
ParseCertificateFromSecret(): Parses X.509 certificate from Kubernetes secret
VerifyCertificateExpiry(): Validates certificate expiration dates
GetCertificateRemainingDays(): Calculates remaining days until certificate expiry

Code Cleanup

Removed duplicate import of k8s.io/apimachinery/pkg/api/errors
Removed duplicate constant namespace_certman_operator (consolidated to operatorNS)
Removed unused function SanitizeInput()
Fixed incorrect usage of errors.IsNotFound → apierrors.IsNotFound

Test Coverage

Tests verify the complete certificate lifecycle:

ClusterDeployment creation → CertificateRequest creation
TXT records created in Route53 for DNS-01 challenge
Certificate issued and stored in secret
TXT records cleaned up after issuance
Certificate renewal when secret is deleted

Technical Details

Tests use hybrid approach: Mock ACME client + Real AWS Route53
All tests are sequential using ginkgo.Ordered to maintain proper lifecycle flow
Tests only modify files in test/e2e/ directory
All error cases properly handled with gomega assertions

Testing

E2E tests run against real OpenShift cluster
AWS Route53 integration verified with actual DNS operations
Certificate validation includes expiry checks and DNS name verification

openshift-ci-robot · 2026-02-19T09:08:38Z

@TheUndeadKing: This pull request references SREP-3024 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Adds comprehensive E2E tests for ACME DNS-01 challenge verification with real AWS Route53 integration.

Changes Made

New E2E Tests

TXT Record Creation Test: Verifies that _acme-challenge TXT records are created in Route53 for DNS-01 challenge

Certificate Creation Test: Validates primary-cert-bundle-secret creation with proper certificate data

TXT Record Cleanup Test: Ensures TXT records are deleted from Route53 after certificate issuance

Certificate Renewal Test: Simulates and validates certificate renewal process

Utility Functions Added

CreateRoute53Client(): Creates AWS Route53 client for DNS operations

GetHostedZoneIDFromDNSZone(): Extracts hosted zone ID from Hive DNSZone resource

ListAcmeChallengeTXTRecords(): Lists ACME challenge TXT records from Route53

FindDNSZoneForClusterDeployment(): Finds DNSZone resource for a ClusterDeployment

VerifyDNSZoneReady(): Verifies DNSZone is ready with valid zone ID

ParseCertificateFromSecret(): Parses X.509 certificate from Kubernetes secret

VerifyCertificateExpiry(): Validates certificate expiration dates

GetCertificateRemainingDays(): Calculates remaining days until certificate expiry

Code Cleanup

Removed duplicate import of k8s.io/apimachinery/pkg/api/errors

Removed duplicate constant namespace_certman_operator (consolidated to operatorNS)

Removed unused function SanitizeInput()

Fixed incorrect usage of errors.IsNotFound → apierrors.IsNotFound

Test Coverage

Tests verify the complete certificate lifecycle:

ClusterDeployment creation → CertificateRequest creation

TXT records created in Route53 for DNS-01 challenge

Certificate issued and stored in secret

TXT records cleaned up after issuance

Certificate renewal when secret is deleted

Technical Details

Tests use hybrid approach: Mock ACME client + Real AWS Route53

All tests are sequential using ginkgo.Ordered to maintain proper lifecycle flow

Tests only modify files in test/e2e/ directory

All error cases properly handled with gomega assertions

Testing

E2E tests run against real OpenShift cluster

AWS Route53 integration verified with actual DNS operations

Certificate validation includes expiry checks and DNS name verification

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-02-19T09:08:57Z

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)

do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Walkthrough

Adds Route53 DNS-01 TXT record verification and certificate lifecycle checks to end-to-end tests, plus Route53 and TLS utilities (Route53 client, hosted zone discovery, TXT record listing, DNSZone readiness) and centralized certificate parsing/expiry helpers. Updates namespace usage and error handling in tests.

Changes

Cohort / File(s)	Summary
E2E tests (DNS-01 & cert lifecycle) `test/e2e/certman_operator_tests.go`	Adds Route53 DNS-01 TXT record verification (hosted zone lookup, polling, TXT record attribute checks), duplicate DNS-01 test blocks, TXT cleanup checks, certificate renewal simulation (delete/recreate secret), replaces inline PEM/x509 parsing with utils.ParseCertificateFromSecret, adds expiry/remaining-days checks, enforces DNSName assertions, and switches namespace refs to `operatorNS` plus `apierrors.IsNotFound` usage.
Test utilities (Route53 / TLS) `test/e2e/utils/utils.go`	Adds Route53 and DNS helpers and certificate utilities: `CreateRoute53Client`, `GetHostedZoneIDFromDNSZone`, `ListAcmeChallengeTXTRecords`, `FindDNSZoneForClusterDeployment`, `VerifyDNSZoneReady`, `ParseCertificateFromSecret`, `VerifyCertificateExpiry`, `GetCertificateRemainingDays`. Removes `SanitizeInput`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

test/e2e/utils/utils.go (1)

1688-1709: ListAcmeChallengeTXTRecords does not handle Route53 pagination.

ListResourceRecordSets returns at most 300 records per call. If the hosted zone contains more records, ACME challenge TXT records could be missed. For a controlled E2E test zone this is unlikely to matter, but if the zone grows or is shared, results will silently be incomplete.

Consider using ListResourceRecordSetsPages (or checking IsTruncated / NextRecordName) to paginate, or at minimum starting the listing at the _acme-challenge prefix via StartRecordName + StartRecordType to reduce the result set.

♻️ Suggested fix using pagination or a targeted start

 func ListAcmeChallengeTXTRecords(client *route53.Route53, hostedZoneID string) ([]*route53.ResourceRecordSet, error) {
-	input := &route53.ListResourceRecordSetsInput{
-		HostedZoneId: aws.String(hostedZoneID),
-	}
-
-	result, err := client.ListResourceRecordSets(input)
-	if err != nil {
-		return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err)
-	}
-
-	// Filter for _acme-challenge TXT records
 	var challengeRecords []*route53.ResourceRecordSet
-	for _, record := range result.ResourceRecordSets {
-		if record.Type != nil && *record.Type == "TXT" &&
-			record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") {
-			challengeRecords = append(challengeRecords, record)
+	input := &route53.ListResourceRecordSetsInput{
+		HostedZoneId:    aws.String(hostedZoneID),
+		StartRecordName: aws.String("_acme-challenge"),
+		StartRecordType: aws.String("TXT"),
+	}
+
+	err := client.ListResourceRecordSetsPages(input,
+		func(page *route53.ListResourceRecordSetsOutput, lastPage bool) bool {
+			for _, record := range page.ResourceRecordSets {
+				if record.Type != nil && *record.Type == "TXT" &&
+					record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") {
+					challengeRecords = append(challengeRecords, record)
+				}
+			}
+			return !lastPage
+		})
+	if err != nil {
+		return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err)
 	}
-	}
 
 	return challengeRecords, nil
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/utils/utils.go` around lines 1688 - 1709,
ListAcmeChallengeTXTRecords currently calls route53.ListResourceRecordSets once
and can miss records if the hosted zone has >300 entries; update the function to
handle pagination by using route53.ListResourceRecordSetsPages (or loop using
IsTruncated/NextRecordName/NextRecordType) and accumulate ResourceRecordSets
across pages, while still filtering for TXT records whose Name contains
"_acme-challenge"; alternatively, to reduce scanned records, set StartRecordName
to "_acme-challenge" and StartRecordType to "TXT" when constructing the
ListResourceRecordSetsInput and iterate pages until IsTruncated is false,
returning the combined filtered slice.

test/e2e/certman_operator_tests.go (1)

654-771: This single It block conflates TXT-record cleanup verification with certificate renewal simulation — two independent concerns.

If the TXT cleanup assertion passes but renewal fails, the test name ("should delete TXT records … after certificate issuance") is misleading, and the renewal logic (lines 729-770) largely duplicates the existing "delete secret primary-cert-bundle-secret if exists" test (lines 852-878). Splitting into two It blocks would improve failure diagnosis and remove duplication.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/certman_operator_tests.go` around lines 654 - 771, The single It
block "should delete TXT records from Route53 after certificate issuance" mixes
two concerns — TXT-record cleanup and certificate renewal simulation — causing
misleading failures and duplicated logic; split the renewal simulation into a
separate ginkgo.It (e.g., "should renew certificate when secret is deleted") and
move the code that calls utils.FindCertificateRequestForClusterDeployment,
utils.GetCertificateSecretNameFromCR, clientset.CoreV1().Secrets().Delete/Get,
utils.ParseCertificateFromSecret and utils.VerifyCertificateExpiry into that new
It; keep the original It focused only on DNS TXT cleanup
(ListAcmeChallengeTXTRecords/Route53 checks) and remove the duplicated renewal
block from the original test so each It targets one behavior and the existing
"delete secret primary-cert-bundle-secret if exists" test isn’t duplicated.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/certman_operator_tests.go`:
- Around line 514-547: The test currently dereferences record.TTL (and also
*record.ResourceRecords[0].Value in the log) which can panic if TTL or
ResourceRecords/Value is nil; update the validation loop over challengeRecords
to first assert TTL is not nil (e.g.
gomega.Expect(record.TTL).NotTo(gomega.BeNil(), "Record TTL should not be nil")
) before using *record.TTL, and similarly ensure ResourceRecords and
ResourceRecords[0].Value are non-nil before dereferencing for logging; adjust
the ginkgo.GinkgoLogr.Info calls to check pointers and use the safe,
already-asserted values (record.TTL, record.ResourceRecords) rather than
directly dereferencing without guards.

---

Nitpick comments:
In `@test/e2e/certman_operator_tests.go`:
- Around line 654-771: The single It block "should delete TXT records from
Route53 after certificate issuance" mixes two concerns — TXT-record cleanup and
certificate renewal simulation — causing misleading failures and duplicated
logic; split the renewal simulation into a separate ginkgo.It (e.g., "should
renew certificate when secret is deleted") and move the code that calls
utils.FindCertificateRequestForClusterDeployment,
utils.GetCertificateSecretNameFromCR, clientset.CoreV1().Secrets().Delete/Get,
utils.ParseCertificateFromSecret and utils.VerifyCertificateExpiry into that new
It; keep the original It focused only on DNS TXT cleanup
(ListAcmeChallengeTXTRecords/Route53 checks) and remove the duplicated renewal
block from the original test so each It targets one behavior and the existing
"delete secret primary-cert-bundle-secret if exists" test isn’t duplicated.

In `@test/e2e/utils/utils.go`:
- Around line 1688-1709: ListAcmeChallengeTXTRecords currently calls
route53.ListResourceRecordSets once and can miss records if the hosted zone has
>300 entries; update the function to handle pagination by using
route53.ListResourceRecordSetsPages (or loop using
IsTruncated/NextRecordName/NextRecordType) and accumulate ResourceRecordSets
across pages, while still filtering for TXT records whose Name contains
"_acme-challenge"; alternatively, to reduce scanned records, set StartRecordName
to "_acme-challenge" and StartRecordType to "TXT" when constructing the
ListResourceRecordSetsInput and iterate pages until IsTruncated is false,
returning the combined filtered slice.

coderabbitai · 2026-02-19T09:12:29Z

test/e2e/certman_operator_tests.go

+			// Verify we have records for our domains
+			for _, record := range records {
+				ginkgo.GinkgoLogr.Info("Found TXT record",
+					"name", *record.Name,
+					"type", *record.Type,
+					"ttl", *record.TTL)
+			}
+
+			return true
+		}, 10*time.Minute, 15*time.Second).Should(gomega.BeTrue(),
+			"TXT records should be created in Route53 for DNS-01 challenge")
+
+		// Validate record format
+		for _, record := range challengeRecords {
+			// Verify it's a TXT record
+			gomega.Expect(*record.Type).To(gomega.Equal("TXT"), "Record should be TXT type")
+
+			// Verify name starts with _acme-challenge
+			gomega.Expect(*record.Name).To(gomega.ContainSubstring("_acme-challenge"),
+				"Record name should contain _acme-challenge")
+
+			// Verify TTL is reasonable (usually 60 seconds for ACME challenges)
+			gomega.Expect(*record.TTL).To(gomega.BeNumerically("<=", 300),
+				"TTL should be short for challenge records")
+
+			// Verify record has a value
+			gomega.Expect(len(record.ResourceRecords)).To(gomega.BeNumerically(">", 0),
+				"TXT record should have a value")
+
+			ginkgo.GinkgoLogr.Info("✅ TXT record validated",
+				"name", *record.Name,
+				"ttl", *record.TTL,
+				"value", *record.ResourceRecords[0].Value)
+		}


⚠️ Potential issue | 🟡 Minor

Potential nil-pointer dereference on record.TTL will produce a panic instead of a clear test failure.

TTL is a *int64 and will be nil for alias-type record sets. Although the filter selects only TXT records (which should always carry a TTL), a defensive nil-guard keeps the failure message useful if an unexpected record slips through.

🛡️ Suggested nil-safe logging and assertion

for _, record := range records { + ttl := int64(0) + if record.TTL != nil { + ttl = *record.TTL + } ginkgo.GinkgoLogr.Info("Found TXT record", "name", *record.Name, "type", *record.Type, - "ttl", *record.TTL) + "ttl", ttl) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Verify we have records for our domains

for _, record := range records {

ginkgo.GinkgoLogr.Info("Found TXT record",

"name", *record.Name,

"type", *record.Type,

"ttl", *record.TTL)

}

return true

}, 10*time.Minute, 15*time.Second).Should(gomega.BeTrue(),

"TXT records should be created in Route53 for DNS-01 challenge")

// Validate record format

for _, record := range challengeRecords {

// Verify it's a TXT record

gomega.Expect(*record.Type).To(gomega.Equal("TXT"), "Record should be TXT type")

// Verify name starts with _acme-challenge

gomega.Expect(*record.Name).To(gomega.ContainSubstring("_acme-challenge"),

"Record name should contain _acme-challenge")

// Verify TTL is reasonable (usually 60 seconds for ACME challenges)

gomega.Expect(*record.TTL).To(gomega.BeNumerically("<=", 300),

"TTL should be short for challenge records")

// Verify record has a value

gomega.Expect(len(record.ResourceRecords)).To(gomega.BeNumerically(">", 0),

"TXT record should have a value")

ginkgo.GinkgoLogr.Info("✅ TXT record validated",

"name", *record.Name,

"ttl", *record.TTL,

"value", *record.ResourceRecords[0].Value)

}

// Verify we have records for our domains

for _, record := range records {

ttl := int64(0)

if record.TTL != nil {

ttl = *record.TTL

}

ginkgo.GinkgoLogr.Info("Found TXT record",

"name", *record.Name,

"type", *record.Type,

"ttl", ttl)

}

return true

}, 10*time.Minute, 15*time.Second).Should(gomega.BeTrue(),

"TXT records should be created in Route53 for DNS-01 challenge")

// Validate record format

for _, record := range challengeRecords {

// Verify it's a TXT record

gomega.Expect(*record.Type).To(gomega.Equal("TXT"), "Record should be TXT type")

// Verify name starts with _acme-challenge

gomega.Expect(*record.Name).To(gomega.ContainSubstring("_acme-challenge"),

"Record name should contain _acme-challenge")

// Verify TTL is reasonable (usually 60 seconds for ACME challenges)

gomega.Expect(*record.TTL).To(gomega.BeNumerically("<=", 300),

"TTL should be short for challenge records")

// Verify record has a value

gomega.Expect(len(record.ResourceRecords)).To(gomega.BeNumerically(">", 0),

"TXT record should have a value")

ginkgo.GinkgoLogr.Info("✅ TXT record validated",

"name", *record.Name,

"ttl", *record.TTL,

"value", *record.ResourceRecords[0].Value)

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/certman_operator_tests.go` around lines 514 - 547, The test currently dereferences record.TTL (and also *record.ResourceRecords[0].Value in the log) which can panic if TTL or ResourceRecords/Value is nil; update the validation loop over challengeRecords to first assert TTL is not nil (e.g. gomega.Expect(record.TTL).NotTo(gomega.BeNil(), "Record TTL should not be nil") ) before using *record.TTL, and similarly ensure ResourceRecords and ResourceRecords[0].Value are non-nil before dereferencing for logging; adjust the ginkgo.GinkgoLogr.Info calls to check pointers and use the safe, already-asserted values (record.TTL, record.ResourceRecords) rather than directly dereferencing without guards.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/certman_operator_tests.go`:
- Around line 550-554: The initial call to FindDNSZoneForClusterDeployment is
not retried and can fail if Hive hasn't created the DNSZone yet; wrap that
lookup in a gomega.Eventually closure that repeatedly calls
FindDNSZoneForClusterDeployment (using the same dynamicClient,
certConfig.TestNamespace and clusterDeploymentName) and asserts no error and
non-nil DNSZone before proceeding; apply the exact same change to the
TXT-cleanup test where the identical pattern appears (the second
FindDNSZoneForClusterDeployment usage).

In `@test/e2e/utils/utils.go`:
- Around line 1688-1709: ListAcmeChallengeTXTRecords currently calls
client.ListResourceRecordSets once and drops records when Route53 pages results;
fix by iterating pages until result.IsTruncated is false and accumulate all
ResourceRecordSets before filtering. Implement pagination by either using
client.ListResourceRecordSetsPages (if available) or by repeatedly calling
client.ListResourceRecordSets with input.StartRecordName and
input.StartRecordType set from result.NextRecordName and result.NextRecordType,
appending each page's ResourceRecordSets to a slice, then apply the existing TXT
+ "_acme-challenge" filter to the combined slice before returning.

---

Duplicate comments:
In `@test/e2e/certman_operator_tests.go`:
- Around line 621-654: The loop validating challengeRecords dereferences
optional fields without nil checks and can panic; before asserting or logging
use explicit nil guards: for each record in challengeRecords first
gomega.Expect(record.TTL).NotTo(gomega.BeNil(), ...) then perform
gomega.Expect(*record.TTL).To(gomega.BeNumerically("<=", 300), ...); similarly
ensure gomega.Expect(len(record.ResourceRecords)).To(gomega.BeNumerically(">",
0), ...) then gomega.Expect(record.ResourceRecords[0]).NotTo(gomega.BeNil(),
...) and gomega.Expect(record.ResourceRecords[0].Value).NotTo(gomega.BeNil(),
...) before dereferencing *record.ResourceRecords[0].Value for assertions or
logging; update the ginkgo.GinkgoLogr.Info calls to only dereference after those
nil checks (or log safely by checking nil).

coderabbitai · 2026-02-19T09:22:29Z

test/e2e/certman_operator_tests.go

+	ginkgo.It("should create TXT records in Route53 for DNS-01 challenge", func(ctx context.Context) {
+		// Find the DNSZone for our cluster
+		dnsZone, err := utils.FindDNSZoneForClusterDeployment(ctx, dynamicClient,
+			certConfig.TestNamespace, clusterDeploymentName)
+		gomega.Expect(err).ShouldNot(gomega.HaveOccurred(), "DNSZone should exist for ClusterDeployment")


⚠️ Potential issue | 🟠 Major

Initial FindDNSZoneForClusterDeployment call is not retried — test fails immediately if the DNSZone hasn't been created yet.

Hive creates the DNSZone asynchronously after ClusterDeployment creation. The bare gomega.Expect at line 554 asserts the DNSZone exists right now; if Hive hasn't reconciled yet, the test hard-fails before reaching the Eventually readiness poll. The identical pattern is repeated at lines 762–765 in the TXT-cleanup test.

🛡️ Proposed fix — wrap initial lookup in Eventually

- dnsZone, err := utils.FindDNSZoneForClusterDeployment(ctx, dynamicClient, - certConfig.TestNamespace, clusterDeploymentName) - gomega.Expect(err).ShouldNot(gomega.HaveOccurred(), "DNSZone should exist for ClusterDeployment") - - // Verify DNSZone is ready before querying Route53 - gomega.Eventually(func() bool { - // Re-fetch DNSZone to get latest status - freshDNSZone, err := utils.FindDNSZoneForClusterDeployment(ctx, dynamicClient, - certConfig.TestNamespace, clusterDeploymentName) - if err != nil { - return false - } - ready, err := utils.VerifyDNSZoneReady(freshDNSZone) - if err != nil { - ginkgo.GinkgoLogr.Error(err, "Error checking DNSZone readiness") - return false - } - if !ready { - ginkgo.GinkgoLogr.Info("DNSZone not ready yet, waiting...") - } else { - // Update dnsZone with fresh data when ready - dnsZone = freshDNSZone - } - return ready - }, 5*time.Minute, 10*time.Second).Should(gomega.BeTrue(), "DNSZone should be ready") + // Wait for DNSZone to exist AND be ready (creation is asynchronous) + var dnsZone *unstructured.Unstructured + gomega.Eventually(func() bool { + freshDNSZone, err := utils.FindDNSZoneForClusterDeployment(ctx, dynamicClient, + certConfig.TestNamespace, clusterDeploymentName) + if err != nil { + ginkgo.GinkgoLogr.Info("DNSZone not found yet, waiting...", "error", err) + return false + } + ready, err := utils.VerifyDNSZoneReady(freshDNSZone) + if err != nil { + ginkgo.GinkgoLogr.Error(err, "Error checking DNSZone readiness") + return false + } + if !ready { + ginkgo.GinkgoLogr.Info("DNSZone not ready yet, waiting...") + return false + } + dnsZone = freshDNSZone + return true + }, 5*time.Minute, 10*time.Second).Should(gomega.BeTrue(), "DNSZone should exist and be ready")

Apply the same fix to lines 762–782.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/certman_operator_tests.go` around lines 550 - 554, The initial call to FindDNSZoneForClusterDeployment is not retried and can fail if Hive hasn't created the DNSZone yet; wrap that lookup in a gomega.Eventually closure that repeatedly calls FindDNSZoneForClusterDeployment (using the same dynamicClient, certConfig.TestNamespace and clusterDeploymentName) and asserts no error and non-nil DNSZone before proceeding; apply the exact same change to the TXT-cleanup test where the identical pattern appears (the second FindDNSZoneForClusterDeployment usage).

coderabbitai · 2026-02-19T09:22:29Z

test/e2e/utils/utils.go

+// ListAcmeChallengeTXTRecords queries Route53 for _acme-challenge TXT records in a hosted zone
+func ListAcmeChallengeTXTRecords(client *route53.Route53, hostedZoneID string) ([]*route53.ResourceRecordSet, error) {
+	input := &route53.ListResourceRecordSetsInput{
+		HostedZoneId: aws.String(hostedZoneID),
+	}
+
+	result, err := client.ListResourceRecordSets(input)
+	if err != nil {
+		return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err)
+	}
+
+	// Filter for _acme-challenge TXT records
+	var challengeRecords []*route53.ResourceRecordSet
+	for _, record := range result.ResourceRecordSets {
+		if record.Type != nil && *record.Type == "TXT" &&
+			record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") {
+			challengeRecords = append(challengeRecords, record)
+		}
+	}
+
+	return challengeRecords, nil
+}


⚠️ Potential issue | 🟠 Major

ListAcmeChallengeTXTRecords silently drops records when the hosted zone is paginated.

ListResourceRecordSets returns at most 300 records per call and sets result.IsTruncated = true when more exist. The current implementation makes exactly one call, so ACME-challenge TXT records beyond the first page are silently ignored. The Eventually poll loop in the test would then incorrectly conclude that no challenge records exist.

🛡️ Proposed pagination fix

func ListAcmeChallengeTXTRecords(client *route53.Route53, hostedZoneID string) ([]*route53.ResourceRecordSet, error) { - input := &route53.ListResourceRecordSetsInput{ - HostedZoneId: aws.String(hostedZoneID), - } - - result, err := client.ListResourceRecordSets(input) - if err != nil { - return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err) - } - - // Filter for _acme-challenge TXT records var challengeRecords []*route53.ResourceRecordSet - for _, record := range result.ResourceRecordSets { - if record.Type != nil && *record.Type == "TXT" && - record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") { - challengeRecords = append(challengeRecords, record) + input := &route53.ListResourceRecordSetsInput{ + HostedZoneId: aws.String(hostedZoneID), + } + for { + result, err := client.ListResourceRecordSets(input) + if err != nil { + return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err) + } + for _, record := range result.ResourceRecordSets { + if record.Type != nil && *record.Type == "TXT" && + record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") { + challengeRecords = append(challengeRecords, record) + } + } + if result.IsTruncated == nil || !*result.IsTruncated { + break } + input.StartRecordName = result.NextRecordName + input.StartRecordType = result.NextRecordType + input.StartRecordIdentifier = result.NextRecordIdentifier } return challengeRecords, nil }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// ListAcmeChallengeTXTRecords queries Route53 for _acme-challenge TXT records in a hosted zone

func ListAcmeChallengeTXTRecords(client *route53.Route53, hostedZoneID string) ([]*route53.ResourceRecordSet, error) {

input := &route53.ListResourceRecordSetsInput{

HostedZoneId: aws.String(hostedZoneID),

}

result, err := client.ListResourceRecordSets(input)

if err != nil {

return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err)

}

// Filter for _acme-challenge TXT records

var challengeRecords []*route53.ResourceRecordSet

for _, record := range result.ResourceRecordSets {

if record.Type != nil && *record.Type == "TXT" &&

record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") {

challengeRecords = append(challengeRecords, record)

}

}

return challengeRecords, nil

}

// ListAcmeChallengeTXTRecords queries Route53 for _acme-challenge TXT records in a hosted zone

func ListAcmeChallengeTXTRecords(client *route53.Route53, hostedZoneID string) ([]*route53.ResourceRecordSet, error) {

var challengeRecords []*route53.ResourceRecordSet

input := &route53.ListResourceRecordSetsInput{

HostedZoneId: aws.String(hostedZoneID),

}

for {

result, err := client.ListResourceRecordSets(input)

if err != nil {

return nil, fmt.Errorf("failed to list Route53 resource record sets: %w", err)

}

for _, record := range result.ResourceRecordSets {

if record.Type != nil && *record.Type == "TXT" &&

record.Name != nil && strings.Contains(*record.Name, "_acme-challenge") {

challengeRecords = append(challengeRecords, record)

}

}

if result.IsTruncated == nil || !*result.IsTruncated {

break

}

input.StartRecordName = result.NextRecordName

input.StartRecordType = result.NextRecordType

input.StartRecordIdentifier = result.NextRecordIdentifier

}

return challengeRecords, nil

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/utils/utils.go` around lines 1688 - 1709, ListAcmeChallengeTXTRecords currently calls client.ListResourceRecordSets once and drops records when Route53 pages results; fix by iterating pages until result.IsTruncated is false and accumulate all ResourceRecordSets before filtering. Implement pagination by either using client.ListResourceRecordSetsPages (if available) or by repeatedly calling client.ListResourceRecordSets with input.StartRecordName and input.StartRecordType set from result.NextRecordName and result.NextRecordType, appending each page's ResourceRecordSets to a slice, then apply the existing TXT + "_acme-challenge" filter to the combined slice before returning.

codecov · 2026-02-19T09:25:27Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.15%. Comparing base (4b2fc9d) to head (cb49991).

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #424   +/-   ##
=======================================
  Coverage   57.15%   57.15%           
=======================================
  Files          29       29           
  Lines        2138     2138           
=======================================
  Hits         1222     1222           
  Misses        802      802           
  Partials      114      114

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

openshift-ci · 2026-02-19T09:26:24Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: TheUndeadKing
Once this PR has been reviewed and has the lgtm label, please assign rbhilare for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2026-02-19T11:51:15Z

@TheUndeadKing: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/validate	`cb49991`	link	true	`/test validate`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 19, 2026

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 19, 2026

coderabbitai bot reviewed Feb 19, 2026

View reviewed changes

Added E2E testcase for ACME DNS-01 challenge

92eea8f

TheUndeadKing force-pushed the SREP-3024 branch from e6986ec to 92eea8f Compare February 19, 2026 09:17

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 19, 2026

coderabbitai bot reviewed Feb 19, 2026

View reviewed changes

openshift-ci bot requested review from Tafhim and ravitri February 19, 2026 09:23

chores

5100c2d

TheUndeadKing changed the title ~~SREP-3024: Add E2E tests for ACME DNS-01 challenge verification~~ [WiP]SREP-3024: Add E2E tests for ACME DNS-01 challenge verification Feb 19, 2026

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 19, 2026

TheUndeadKing force-pushed the SREP-3024 branch from 81199e0 to 5100c2d Compare February 19, 2026 11:04

Fixing chores

cb49991

TheUndeadKing force-pushed the SREP-3024 branch from be85501 to cb49991 Compare February 19, 2026 11:41

Comments

Conversation

TheUndeadKing commented Feb 19, 2026

Summary

Changes Made

New E2E Tests

Utility Functions Added

Code Cleanup

Test Coverage

Technical Details

Testing

Uh oh!

openshift-ci-robot commented Feb 19, 2026 • edited by openshift-ci bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes Made

New E2E Tests

Utility Functions Added

Code Cleanup

Test Coverage

Technical Details

Testing

Uh oh!

coderabbitai bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

openshift-ci bot commented Feb 19, 2026

Uh oh!

openshift-ci bot commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

openshift-ci-robot commented Feb 19, 2026 •

edited by openshift-ci bot

Loading

coderabbitai bot commented Feb 19, 2026 •

edited

Loading

codecov bot commented Feb 19, 2026 •

edited

Loading