feat(dis-cache-operator): scaffold self-service Azure Managed Redis #3472
arealmaas wants to merge 11 commits
…perator (RFC 0014) Adds a new operator that reconciles a Redis CR into Azure Managed Redis (Microsoft.Cache/redisEnterprise) via ASO. Mirrors dis-vault-operator (single-resource-per-CR, federated-identity-owned) and dis-pgsql-operator (private endpoint + private DNS + AKS VNet link). Entra-only data-plane access; no shared keys exposed. Shared privatelink.redis.azure.net zone (get-or-create, label-managed). Access policy assignment is deferred to a follow-up PR pending the ASO type. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR introduces dis-cache-operator: a Kubernetes operator that provisions Azure Managed Redis (Redis Enterprise) via a Redis CRD, handling identity resolution, deterministic naming, ASO resource creation (cluster, database, private endpoint, DNS), status aggregation, manifests, build tooling, and CI/CD workflows.
Changes: DIS Redis Cache Operator
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 16
🧹 Nitpick comments (3)
pr_description.md (1)
34-81: ⚡ Quick win
Add language identifier to fenced code block.
The ASCII diagram uses a fenced code block without a language specifier. While the content renders fine, adding a language identifier (e.g., `text` or `ascii`) improves tooling compatibility and silences the linter.
📝 Proposed fix
~~~diff
-```
+```text
 ┌──────────────────────────────┐
~~~
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pr_description.md` around lines 34 - 81, The fenced ASCII diagram in pr_description.md lacks a language identifier; update the opening triple-backtick that precedes the diagram to include a language specifier (e.g., change ``` to ```text) so linters and tooling recognize it as plain text; look for the fenced block containing the big ASCII diagram and modify its opening fence accordingly.

services/dis-redis-operator/.omc/sessions/0f974381-8734-4756-95d3-bc013825cdcd.json (1)
1-8: 💤 Low value
Consider excluding AI session artifacts from version control.
This file appears to be a session artifact from an AI code generation tool (OMC/Claude Code). Development tool artifacts are typically excluded from source control via `.gitignore` to reduce repository noise and prevent accumulation of ephemeral session data.
If these artifacts serve a purpose (e.g., audit trail, reproducibility), please clarify; otherwise, consider adding `.omc/` to `.gitignore`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/dis-redis-operator/.omc/sessions/0f974381-8734-4756-95d3-bc013825cdcd.json` around lines 1 - 8, This file is an ephemeral AI session artifact stored under the .omc session output (e.g., the session file named 0f974381-8734-4756-95d3-bc013825cdcd.json); exclude these artifacts from version control by adding the .omc/ directory to .gitignore, remove any already-committed session files from the repo (unstage/remove them so they no longer track), and commit the updated .gitignore; if you need to retain certain session logs for auditing, move those specific files into a dedicated tracked audit directory and document that policy so only intended files are committed.

services/dis-redis-operator/config/samples/kustomization.yaml (1)
1-4: ⚡ Quick win
Consider including the service-account-based Redis sample.
The review context indicates two sample Redis CRs exist (`redis_v1alpha1_redis.yaml` and `redis_v1alpha1_service_account_redis.yaml`), but only the first is listed in this kustomization. If both samples demonstrate different authentication patterns (ApplicationIdentity vs ServiceAccount) and have distinct names, including both would help users test each pattern.
If they would conflict or should be applied separately, this is fine as-is.
📝 Proposed addition if both samples should be included
```diff
 namespace: default
 resources:
 - redis_v1alpha1_redis.yaml
+- redis_v1alpha1_service_account_redis.yaml
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/dis-redis-operator/config/samples/kustomization.yaml` around lines 1 - 4, Update the kustomization resource list to include the service-account-based sample by adding an entry for "redis_v1alpha1_service_account_redis.yaml" alongside "redis_v1alpha1_redis.yaml" in the resources block of the kustomization.yaml; ensure the CR filenames and metadata (names/namespaces) in the manifest "redis_v1alpha1_service_account_redis.yaml" do not conflict with the existing Redis CR (check metadata.name and metadata.namespace) so both can be applied together, otherwise document/keep them separate and leave only the intended sample referenced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/dis-redis-lint-test.yml:
- Around line 58-61: The CI step currently runs a mutating command ("go mod
tidy") before tests; replace that with non-mutating verification targets so CI
fails on drift instead of fixing files. In the "Running Tests" job step remove
"go mod tidy" and invoke the Make targets that validate dependencies and run
tests without modifying files (e.g., run "make verify-deps && make test-ci" or
your project's equivalent), ensuring the step keeps the existing "make test"
behavior replaced by the CI-safe "make test-ci" target.
In @.github/workflows/dis-redis-release.yml:
- Around line 48-49: The checkout step using
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd should set
persist-credentials: false to avoid exposing the workflow token in
release/publish jobs; update the Checkout step where uses:
actions/checkout@de0fac2e... to include the persist-credentials: false input,
and make the same change for the other Checkout occurrence referenced (the one
also using actions/checkout in the same workflow).
- Around line 25-28: The workflow-wide permissions block currently grants
packages: write and id-token: write; change the global permissions block
(permissions) to minimal defaults (e.g., contents: read, packages: read or none,
id-token: none) and then grant packages: write and id-token: write only inside
the specific job(s) that perform publishing (e.g., the release/publish job) by
adding a permissions stanza under that job; update the permissions keys
(packages, id-token) accordingly so only publishing jobs get write access while
the global scope remains minimal.
- Around line 51-52: The second invocation of the action
./actions/flux/build-push-image in the build-release-flux-oci-release job is
missing required inputs (azure_subscription_id, azure_app_id, azure_tenant_id);
update the with block for that invocation to include the same three Azure
credential inputs used in the first invocation (azure_subscription_id,
azure_app_id, azure_tenant_id) so the action metadata requirements are satisfied
and the job can authenticate successfully.
In `@services/dis-redis-operator/api/v1alpha1/redis_types.go`:
- Around line 76-88: Add a kubebuilder XValidation marker to RedisPersistence to
reject specs that set both AOF and RDB. Place a single marker comment above the
RedisPersistence type that carries both the rule and the message, for example
// +kubebuilder:validation:XValidation:rule="!(has(self.aof) && has(self.rdb))",message="Only one of 'aof' or 'rdb' may be set"
so the API server enforces mutual exclusivity of the AOF and RDB fields on the
RedisPersistence struct.
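A minimal sketch of how the annotated type might look. The nested types `AOFPersistence` and `RDBPersistence` and their `Frequency` field are assumptions for illustration and are not taken from `redis_types.go`. The helper `validatePersistence` is hypothetical as well: it only mirrors the CEL rule in plain Go so the behavior can be checked locally, while the real enforcement happens in the API server once the CRD is regenerated.

```go
package main

import (
	"errors"
	"fmt"
)

// AOFPersistence is a hypothetical placeholder for the AOF settings.
type AOFPersistence struct {
	Frequency string `json:"frequency,omitempty"`
}

// RDBPersistence is a hypothetical placeholder for the RDB settings.
type RDBPersistence struct {
	Frequency string `json:"frequency,omitempty"`
}

// RedisPersistence allows at most one persistence mode. The marker below
// carries both the CEL rule and its message in a single comment.
// +kubebuilder:validation:XValidation:rule="!(has(self.aof) && has(self.rdb))",message="Only one of 'aof' or 'rdb' may be set"
type RedisPersistence struct {
	// +optional
	AOF *AOFPersistence `json:"aof,omitempty"`
	// +optional
	RDB *RDBPersistence `json:"rdb,omitempty"`
}

// validatePersistence mirrors the CEL rule in Go (illustrative only).
func validatePersistence(p RedisPersistence) error {
	if p.AOF != nil && p.RDB != nil {
		return errors.New("only one of 'aof' or 'rdb' may be set")
	}
	return nil
}

func main() {
	both := RedisPersistence{AOF: &AOFPersistence{}, RDB: &RDBPersistence{}}
	fmt.Println(validatePersistence(both) != nil)               // true
	fmt.Println(validatePersistence(RedisPersistence{}) == nil) // true
}
```

Setting neither field stays valid under this rule, which matches the "only one of" wording in the message.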
In `@services/dis-redis-operator/config/manager/manager.yaml`:
- Around line 46-72: The pod spec currently sets
securityContext.readOnlyRootFilesystem: true but leaves volumeMounts and volumes
empty; add an emptyDir volume (name e.g. tmp-writable) to the pod spec's volumes
and add a matching volumeMount entry in the container's volumeMounts with
mountPath: /tmp and no readOnly flag so the container (and libraries used by
controller-runtime) have a writable /tmp; reference the existing keys
securityContext, volumeMounts, and volumes when making the change.
In `@services/dis-redis-operator/config/rbac/role.yaml`:
- Around line 69-80: The Role currently grants overly broad write permissions on
the source Redis CRs: remove the "create" and "delete" verbs from the verbs list
for the resource "redises" in the Role/ClusterRole defined in role.yaml so the
service account keeps read/watch and needed update/patch operations (e.g. keep
"get", "list", "watch", "patch", "update") but cannot create or delete
user-authored Redis CRs; update the verbs array under the "resources: - redises"
block accordingly.
In `@services/dis-redis-operator/Dockerfile`:
- Line 2: The Dockerfile's builder stage uses the mutable tag
"golang:1.26.3"—replace it with a digest-pinned image (e.g., the same
golang:1.26.3 image referenced by its sha256 digest) so builds are reproducible;
update the FROM line in the Dockerfile's builder stage (the "FROM golang:1.26.3
AS builder" statement) to use the digest form, obtaining the correct sha256 from
Docker Hub or by pulling the image locally, and follow the same digest-pinning
pattern used in the other services (lakmus, k6-image).
In `@services/dis-redis-operator/go.mod`:
- Around line 67-75: Update the vulnerable gRPC dependency in go.mod by changing
the google.golang.org/grpc version to at least v1.79.3 (replace the current
v1.78.0 reference) to remediate CVE-2026-33186, then run go get
google.golang.org/grpc@v1.79.3 and go mod tidy to ensure the lockfile and
indirect deps are updated; verify the service builds and tests pass (check for
any usages in your modules that import google.golang.org/grpc and update imports
if tooling reports mismatches).
In `@services/dis-redis-operator/internal/controller/redis_controller_network.go`:
- Around line 30-72: Both ensureSharedDNSZone and ensureSharedVNetLink are
create-only and skip reconciliation when the resource exists; change them to
upsert (create-or-update) so drift in spec/labels/annotations is reconciled. For
each function (ensureSharedDNSZone calling redispkg.BuildSharedPrivateDNSZone
and ensureSharedVNetLink calling redispkg.BuildSharedVNetLink) after r.Get(...)
finds an existing resource, compare and merge the desired fields (spec, labels,
annotations, relevant status-less fields) into the current object and call
r.Update (or use controllerutil.CreateOrUpdate) instead of returning nil; keep
the create path but handle apierrors.IsAlreadyExists gracefully and handle
update conflicts by retrying or returning the wrapped error. Ensure you update
the same resource types used (networkv1.PrivateDnsZone and
networkv1.PrivateDnsZonesVirtualNetworkLink) and preserve immutable fields (UID,
ResourceVersion, OwnerReferences) when copying values.
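The create-or-update behavior requested above can be sketched without any Kubernetes dependency. Everything here is an assumption for illustration: `Resource` stands in for an ASO object such as the private DNS zone, `Store` stands in for the cluster client, and the `managed-by` label is hypothetical. In the operator itself, `controllerutil.CreateOrUpdate` from controller-runtime (named in the comment above) would play the role of `Upsert`.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for an ASO object.
type Resource struct {
	Name            string
	ResourceVersion string            // server-owned; must survive an update
	Labels          map[string]string // desired labels are merged, extras are kept
	Location        string            // stands in for the reconciled spec
}

// Store is a hypothetical in-memory substitute for the Kubernetes client.
type Store map[string]*Resource

// Upsert creates the desired resource when it is missing. When it exists,
// it merges the desired labels and spec into the current object instead of
// returning early, and reports whether anything was written.
func (s Store) Upsert(desired Resource) bool {
	current, found := s[desired.Name]
	if !found {
		created := desired
		created.Labels = copyLabels(desired.Labels)
		created.ResourceVersion = "1"
		s[desired.Name] = &created
		return true
	}
	changed := false
	if current.Labels == nil && len(desired.Labels) > 0 {
		current.Labels = map[string]string{}
	}
	for k, v := range desired.Labels {
		if got, ok := current.Labels[k]; !ok || got != v {
			current.Labels[k] = v
			changed = true
		}
	}
	if current.Location != desired.Location {
		current.Location = desired.Location
		changed = true
	}
	return changed
}

// copyLabels returns an independent copy so callers cannot alias stored state.
func copyLabels(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

func main() {
	store := Store{}
	desired := Resource{
		Name:     "privatelink.redis.azure.net",
		Labels:   map[string]string{"managed-by": "dis-cache-operator"},
		Location: "global",
	}
	fmt.Println(store.Upsert(desired)) // true: created
	delete(store[desired.Name].Labels, "managed-by")
	fmt.Println(store.Upsert(desired)) // true: label drift repaired
	fmt.Println(store.Upsert(desired)) // false: already in sync
}
```

The merge keeps labels that other actors added and leaves `ResourceVersion` untouched, which corresponds to the "preserve immutable fields" part of the comment.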
In `@services/dis-redis-operator/internal/controller/redis_controller_test.go`:
- Around line 9-14: Replace the tautological spec in the "smoke check:
reconciler scheme is registered" test with a real reconciler behavior assertion:
instantiate the controller's Reconcile logic (or the reconciler type used in
redis_controller_test.go), use a controller-runtime fake client to create a
Redis custom resource in the identity-pending state, call the reconciler's
Reconcile(ctx, request) and assert the returned Result requests a requeue (or
non-zero requeue duration) and that the object's status or conditions
transitioned as expected; update the test to refer to the reconciler type/method
used in this package (the reconciler struct and its Reconcile method) and the
Redis CR name/type to locate the code under test.
In `@services/dis-redis-operator/internal/controller/redis_controller.go`:
- Around line 165-167: The current reconcile early-return checks only
clusterReady and databaseReady (vars clusterReady, databaseReady) and stops
requeueing once they are true, which ignores private networking conditions;
update the readiness gate in the reconcile function so it also checks the
PrivateEndpoint and PrivateDNS condition objects (e.g., look up vars/conditions
named privateEndpointReady and privateDNSReady or their ConditionType values)
and include them in the OR expression that triggers ctrl.Result{RequeueAfter:
provisioningRequeueDelay}, nil so the controller continues periodic requeues
until all four conditions (clusterReady, databaseReady, PrivateEndpoint,
PrivateDNS) have Found==true and Status==metav1.ConditionTrue; ensure you
reference the same condition names used elsewhere in this file and preserve
existing requeue delay (provisioningRequeueDelay) behavior.
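A minimal sketch of the widened readiness gate, assuming simplified local types in place of `metav1.Condition`. The condition type strings and the 30-second delay are hypothetical; the controller's own constants should be used instead.

```go
package main

import (
	"fmt"
	"time"
)

// ConditionStatus and Condition are simplified stand-ins for the metav1 types.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type Condition struct {
	Type   string
	Status ConditionStatus
}

// provisioningRequeueDelay uses a hypothetical value.
const provisioningRequeueDelay = 30 * time.Second

// requiredConditions lists hypothetical names for the four gates.
var requiredConditions = []string{
	"ClusterReady", "DatabaseReady", "PrivateEndpointReady", "PrivateDNSReady",
}

// isTrue reports whether the named condition exists and has status True.
func isTrue(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == ConditionTrue
		}
	}
	return false // a missing condition counts as not ready
}

// requeueAfter returns the provisioning delay until all required conditions
// are found and True, and zero once nothing is left to wait for.
func requeueAfter(conds []Condition) time.Duration {
	for _, want := range requiredConditions {
		if !isTrue(conds, want) {
			return provisioningRequeueDelay
		}
	}
	return 0
}

func main() {
	conds := []Condition{
		{"ClusterReady", ConditionTrue},
		{"DatabaseReady", ConditionTrue},
		{"PrivateEndpointReady", ConditionUnknown},
	}
	fmt.Println(requeueAfter(conds)) // 30s: endpoint unknown, DNS missing
}
```

Driving the gate from a list keeps the check in one place if further dependencies are added later.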
In `@services/dis-redis-operator/internal/redis/builders.go`:
- Around line 175-214: BuildPrivateEndpoint currently uses the
clusterKubernetesName parameter to populate PrivateLinkServiceReference.Name
without validation; ensure clusterKubernetesName is non-empty before composing
the pev1.PrivateLinkServiceConnection (e.g., validate at start of
BuildPrivateEndpoint that clusterKubernetesName != "" and return a descriptive
error if empty) so PrivateLinkServiceReference.Name is never left blank during
reconciliation.
In `@services/dis-redis-operator/internal/redis/identity.go`:
- Around line 62-77: In ActiveAuthReferenceName, add an explicit check that
rejects the case when both r.Spec.ServiceAccountRef and r.Spec.IdentityRef are
non-nil and return an error (e.g., "exactly one of identityRef or
serviceAccountRef must be set") before the existing switch so you don't silently
prefer ServiceAccountRef; keep the existing empty-name validation for
ServiceAccountRef.Name and IdentityRef.Name unchanged and mirror the same
invariant enforced by ResolveOwnerIdentity.
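The exactly-one invariant described above could look roughly like this. `NameRef` and `RedisSpec` are simplified, hypothetical stand-ins for the CRD types, and the error strings other than the one quoted in the comment are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// NameRef and RedisSpec are hypothetical, simplified versions of the CRD types.
type NameRef struct {
	Name string
}

type RedisSpec struct {
	IdentityRef       *NameRef
	ServiceAccountRef *NameRef
}

var errExactlyOne = errors.New("exactly one of identityRef or serviceAccountRef must be set")

// activeAuthReferenceName rejects the both-set and neither-set cases before
// picking a reference, so neither reference is silently preferred.
func activeAuthReferenceName(spec RedisSpec) (string, error) {
	switch {
	case spec.IdentityRef != nil && spec.ServiceAccountRef != nil:
		return "", errExactlyOne
	case spec.ServiceAccountRef != nil:
		if spec.ServiceAccountRef.Name == "" {
			return "", errors.New("serviceAccountRef.name must not be empty")
		}
		return spec.ServiceAccountRef.Name, nil
	case spec.IdentityRef != nil:
		if spec.IdentityRef.Name == "" {
			return "", errors.New("identityRef.name must not be empty")
		}
		return spec.IdentityRef.Name, nil
	default:
		return "", errExactlyOne
	}
}

func main() {
	_, err := activeAuthReferenceName(RedisSpec{
		IdentityRef:       &NameRef{Name: "app"},
		ServiceAccountRef: &NameRef{Name: "sa"},
	})
	fmt.Println(err != nil) // true
}
```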
In `@services/dis-redis-operator/internal/redis/status.go`:
- Around line 50-88: AggregateReadyCondition currently returns Ready=True when
called with an empty conditions slice; add an explicit guard at the top of
AggregateReadyCondition to detect when len(conditions) == 0 and return a
non-ready condition (e.g., metav1.ConditionUnknown) by calling NewCondition with
redisv1alpha1.ConditionReady, the provided generation, metav1.ConditionUnknown
and a clear reason/message like "NoDependencies" / "no dependency conditions
present". Keep the subsequent hasFalse/hasUnknown logic unchanged so existing
cases still apply when conditions are provided.
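A sketch of the empty-input guard, again with simplified local types rather than `metav1.Condition` and `NewCondition`. Only the `NoDependencies` reason comes from the comment above; the other reason strings are assumptions.

```go
package main

import "fmt"

// ConditionStatus and Condition are simplified stand-ins for the metav1 types.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type Condition struct {
	Type   string
	Status ConditionStatus
}

// aggregateReady folds dependency conditions into one Ready status and reason.
// An empty input yields Unknown instead of True, so a resource with no
// reported dependencies is never shown as ready.
func aggregateReady(conds []Condition) (ConditionStatus, string) {
	if len(conds) == 0 {
		return ConditionUnknown, "NoDependencies"
	}
	hasFalse, hasUnknown := false, false
	for _, c := range conds {
		switch c.Status {
		case ConditionFalse:
			hasFalse = true
		case ConditionTrue:
			// nothing to record
		default:
			hasUnknown = true // Unknown or any unexpected status
		}
	}
	switch {
	case hasFalse:
		return ConditionFalse, "DependencyNotReady" // hypothetical reason
	case hasUnknown:
		return ConditionUnknown, "DependencyUnknown" // hypothetical reason
	default:
		return ConditionTrue, "AllDependenciesReady" // hypothetical reason
	}
}

func main() {
	status, reason := aggregateReady(nil)
	fmt.Println(status, reason) // Unknown NoDependencies
}
```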
In `@services/dis-redis-operator/Makefile`:
- Around line 221-227: The install/uninstall Makefile targets currently suppress
kustomize build failures via the "|| true" in the command assigned to variable
out, which hides real rendering errors; update the install and uninstall recipe
lines that set out (the ones referencing "$(KUSTOMIZE)" build config/crd) to
remove the "|| true" so the shell command fails on kustomize errors, and then
guard the subsequent apply/delete on whether $$out is empty (i.e. only skip when
build produces no output) so install/uninstall for targets install and uninstall
(and their use of KUSTOMIZE/KUBECTL) fail fast on build errors instead of
silently succeeding.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c402d165-943c-4a8b-bceb-3e2b36b866b7
⛔ Files ignored due to path filters (2)
- rfcs/0014-self-service-managed-redis.md is excluded by !rfcs/**
- services/dis-redis-operator/go.sum is excluded by !**/*.sum
📒 Files selected for processing (64)
- .github/workflows/dis-redis-lint-test.yml
- .github/workflows/dis-redis-release.yml
- .release-please-manifest.json
- pr_description.md
- release-please-config.json
- services/dis-redis-operator/.omc/sessions/0f974381-8734-4756-95d3-bc013825cdcd.json
- services/dis-redis-operator/AGENTS.md
- services/dis-redis-operator/CHANGELOG.md
- services/dis-redis-operator/Dockerfile
- services/dis-redis-operator/Makefile
- services/dis-redis-operator/PROJECT
- services/dis-redis-operator/README.md
- services/dis-redis-operator/api/v1alpha1/groupversion_info.go
- services/dis-redis-operator/api/v1alpha1/redis_types.go
- services/dis-redis-operator/api/v1alpha1/zz_generated.deepcopy.go
- services/dis-redis-operator/cmd/main.go
- services/dis-redis-operator/config/crd/bases/redis.dis.altinn.cloud_redises.yaml
- services/dis-redis-operator/config/crd/kustomization.yaml
- services/dis-redis-operator/config/default/deploy_vars_patch.yaml
- services/dis-redis-operator/config/default/kustomization.yaml
- services/dis-redis-operator/config/default/manager_metrics_patch.yaml
- services/dis-redis-operator/config/default/metrics_service.yaml
- services/dis-redis-operator/config/kind/applicationidentities.yaml
- services/dis-redis-operator/config/kind/kustomization.yaml
- services/dis-redis-operator/config/kind/manager_kind_patch.yaml
- services/dis-redis-operator/config/kind/serviceaccounts.yaml
- services/dis-redis-operator/config/manager/kustomization.yaml
- services/dis-redis-operator/config/manager/manager.yaml
- services/dis-redis-operator/config/network-policy/allow-metrics-traffic.yaml
- services/dis-redis-operator/config/network-policy/kustomization.yaml
- services/dis-redis-operator/config/prometheus/kustomization.yaml
- services/dis-redis-operator/config/prometheus/monitor.yaml
- services/dis-redis-operator/config/rbac/kustomization.yaml
- services/dis-redis-operator/config/rbac/leader_election_role.yaml
- services/dis-redis-operator/config/rbac/leader_election_role_binding.yaml
- services/dis-redis-operator/config/rbac/metrics_auth_role.yaml
- services/dis-redis-operator/config/rbac/metrics_auth_role_binding.yaml
- services/dis-redis-operator/config/rbac/metrics_reader_role.yaml
- services/dis-redis-operator/config/rbac/role.yaml
- services/dis-redis-operator/config/rbac/role_binding.yaml
- services/dis-redis-operator/config/rbac/service_account.yaml
- services/dis-redis-operator/config/samples/kustomization.yaml
- services/dis-redis-operator/config/samples/redis_v1alpha1_redis.yaml
- services/dis-redis-operator/config/samples/redis_v1alpha1_service_account_redis.yaml
- services/dis-redis-operator/go.mod
- services/dis-redis-operator/hack/boilerplate.go.txt
- services/dis-redis-operator/internal/config/config.go
- services/dis-redis-operator/internal/config/config_test.go
- services/dis-redis-operator/internal/controller/redis_auth_watch.go
- services/dis-redis-operator/internal/controller/redis_controller.go
- services/dis-redis-operator/internal/controller/redis_controller_network.go
- services/dis-redis-operator/internal/controller/redis_controller_role.go
- services/dis-redis-operator/internal/controller/redis_controller_status.go
- services/dis-redis-operator/internal/controller/redis_controller_test.go
- services/dis-redis-operator/internal/controller/suite_test.go
- services/dis-redis-operator/internal/redis/builders.go
- services/dis-redis-operator/internal/redis/builders_test.go
- services/dis-redis-operator/internal/redis/identity.go
- services/dis-redis-operator/internal/redis/identity_test.go
- services/dis-redis-operator/internal/redis/labels.go
- services/dis-redis-operator/internal/redis/naming.go
- services/dis-redis-operator/internal/redis/naming_test.go
- services/dis-redis-operator/internal/redis/status.go
- services/dis-redis-operator/version.txt
- requeue gate now also waits on private endpoint and shared DNS readiness
- getSharedDNSReady checks both PrivateDnsZone and VNet link before reporting True
- skip stale ASO condition reporting while identity is pending; emit IdentityNotReady instead
- enforce mutual exclusivity of aof/rdb on RedisPersistence via CRD XValidation
- validate clusterKubernetesName in BuildPrivateEndpoint, mutual identity/SA refs in ActiveAuthReferenceName, and empty input in AggregateReadyCondition
- Makefile: stop swallowing kustomize errors in install/uninstall, add test-e2e with guaranteed cleanup, align GOLANGCI_LINT_VERSION with CI (v2.11.4)
- workflows: persist-credentials: false on checkouts, narrow global permissions with per-job write grants, replace mutating go mod tidy with make verify-deps
- redis_types.go: fix Enum marker syntax and restore resource path plural to unblock make manifests
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…operator
Mechanical rename of the operator prefix from dis-redis to dis-cache:
- services/dis-redis-operator/ moved to services/dis-cache-operator/
- workflow files dis-redis-*.yml renamed to dis-cache-*.yml
- Go module path, image name, release tag prefix, Flux artifact name, leader-election ID, and all kustomize labels/selectors updated
- CRD Kind (Redis), API group (redis.dis.altinn.cloud), and internal/redis package are unchanged; only the operator name carries the new prefix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Actionable comments posted: 2
♻️ Duplicate comments (2)
services/dis-cache-operator/config/rbac/role.yaml (1)
74-75: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Drop `create`/`delete` on source Redis CRs.
Line 74 and Line 75 still allow creating/deleting user-authored `redises`, which is over-privileged for this controller pattern.
Suggested RBAC narrowing
```diff
 - apiGroups:
   - redis.dis.altinn.cloud
   resources:
   - redises
   verbs:
-  - create
-  - delete
   - get
   - list
   - patch
   - update
   - watch
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/dis-cache-operator/config/rbac/role.yaml` around lines 74 - 75, The Role currently grants overly broad permissions by including the "create" and "delete" verbs for the Redis custom resources; remove the "create" and "delete" entries from the verbs list that reference the Redis CR (resource name "redises") in role.yaml so the controller cannot create or delete user-authored Redis CRs; keep only the necessary read/modify verbs (e.g., get, list, watch, update, patch) required by the controller's reconcile operations.

services/dis-cache-operator/go.mod (1)
93-93: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Upgrade `google.golang.org/grpc` from v1.78.0 before release.
Line 93 pins a version with a reported critical auth-bypass advisory. Please bump to a patched release and re-resolve the module graph.
```shell
#!/bin/bash
set -euo pipefail
cd services/dis-cache-operator
echo "Current grpc version:"
GOWORK=off go list -m -f '{{.Path}} {{.Version}}' google.golang.org/grpc
echo "Available updates:"
GOWORK=off go list -m -u -json google.golang.org/grpc | sed -n '1,160p'
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/dis-cache-operator/go.mod` at line 93, The go.mod pins google.golang.org/grpc at v1.78.0 which has a critical advisory; update the module to a patched release (bump the version for google.golang.org/grpc in go.mod) and re-resolve the module graph by running Go module update commands (e.g., use go get to fetch the newer grpc version and then run go mod tidy) so the dependency tree is refreshed; target the latest patched release reported by go list -m -u for google.golang.org/grpc and verify with GOWORK=off go list -m -f '{{.Path}} {{.Version}}' google.golang.org/grpc that the version has been updated.
🧹 Nitpick comments (1)
.github/workflows/dis-cache-lint-test.yml (1)
58-61: ⚡ Quick win
Prefer `make test-ci` in this workflow.
Line 61 currently runs `make test`; switching to `make test-ci` keeps this workflow on the CI-oriented path (DISREDIS_SKIP_ENVTEST=1) and avoids an unnecessary envtest dependency in CI.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/dis-cache-lint-test.yml around lines 58 - 61, Replace the CI test invocation to use the CI-optimized target: change the workflow step that runs the Make target from "make test" to "make test-ci" so the job follows the CI path (DISREDIS_SKIP_ENVTEST=1) and avoids pulling in envtest; locate the step containing the two commands "make verify-deps" and "make test" and update the second command to "make test-ci".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/dis-cache-lint-test.yml:
- Line 29: The checkout steps use
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd (the two occurrences
of the checkout step) without disabling credential persistence; update both
checkout steps to include persist-credentials: false so the job does not
automatically expose the GITHUB_TOKEN to subsequent steps. Locate the two
occurrences of the actions/checkout usage in the workflow and add the
persist-credentials: false input to each step, keeping the existing version pin
intact.
In `@services/dis-cache-operator/Makefile`:
- Around line 199-202: The Makefile currently prefixes the build step with a
leading hyphen which causes make to ignore failures; edit the recipe so the line
containing "$(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag
${IMG} -f Dockerfile.cross ." is not prefixed with '-' (and likewise remove any
'-' before the matching "$(CONTAINER_TOOL) buildx rm dis-cache-operator-builder"
cleanup if present) so that buildx publish failures propagate and the target
fails instead of reporting success.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d008b9cb-d32f-4b98-84e3-fce749798707
⛔ Files ignored due to path filters (2)
- rfcs/0014-self-service-managed-redis.md is excluded by !rfcs/**
- services/dis-cache-operator/go.sum is excluded by !**/*.sum
📒 Files selected for processing (64)
- .github/workflows/dis-cache-lint-test.yml
- .github/workflows/dis-cache-release.yml
- .release-please-manifest.json
- pr_description.md
- release-please-config.json
- services/dis-cache-operator/.omc/sessions/0f974381-8734-4756-95d3-bc013825cdcd.json
- services/dis-cache-operator/AGENTS.md
- services/dis-cache-operator/CHANGELOG.md
- services/dis-cache-operator/Dockerfile
- services/dis-cache-operator/Makefile
- services/dis-cache-operator/PROJECT
- services/dis-cache-operator/README.md
- services/dis-cache-operator/api/v1alpha1/groupversion_info.go
- services/dis-cache-operator/api/v1alpha1/redis_types.go
- services/dis-cache-operator/api/v1alpha1/zz_generated.deepcopy.go
- services/dis-cache-operator/cmd/main.go
- services/dis-cache-operator/config/crd/bases/redis.dis.altinn.cloud_redises.yaml
- services/dis-cache-operator/config/crd/kustomization.yaml
- services/dis-cache-operator/config/default/deploy_vars_patch.yaml
- services/dis-cache-operator/config/default/kustomization.yaml
- services/dis-cache-operator/config/default/manager_metrics_patch.yaml
- services/dis-cache-operator/config/default/metrics_service.yaml
- services/dis-cache-operator/config/kind/applicationidentities.yaml
- services/dis-cache-operator/config/kind/kustomization.yaml
- services/dis-cache-operator/config/kind/manager_kind_patch.yaml
- services/dis-cache-operator/config/kind/serviceaccounts.yaml
- services/dis-cache-operator/config/manager/kustomization.yaml
- services/dis-cache-operator/config/manager/manager.yaml
- services/dis-cache-operator/config/network-policy/allow-metrics-traffic.yaml
- services/dis-cache-operator/config/network-policy/kustomization.yaml
- services/dis-cache-operator/config/prometheus/kustomization.yaml
- services/dis-cache-operator/config/prometheus/monitor.yaml
- services/dis-cache-operator/config/rbac/kustomization.yaml
- services/dis-cache-operator/config/rbac/leader_election_role.yaml
- services/dis-cache-operator/config/rbac/leader_election_role_binding.yaml
- services/dis-cache-operator/config/rbac/metrics_auth_role.yaml
- services/dis-cache-operator/config/rbac/metrics_auth_role_binding.yaml
- services/dis-cache-operator/config/rbac/metrics_reader_role.yaml
- services/dis-cache-operator/config/rbac/role.yaml
- services/dis-cache-operator/config/rbac/role_binding.yaml
- services/dis-cache-operator/config/rbac/service_account.yaml
- services/dis-cache-operator/config/samples/kustomization.yaml
- services/dis-cache-operator/config/samples/redis_v1alpha1_redis.yaml
- services/dis-cache-operator/config/samples/redis_v1alpha1_service_account_redis.yaml
- services/dis-cache-operator/go.mod
- services/dis-cache-operator/hack/boilerplate.go.txt
- services/dis-cache-operator/internal/config/config.go
- services/dis-cache-operator/internal/config/config_test.go
- services/dis-cache-operator/internal/controller/redis_auth_watch.go
- services/dis-cache-operator/internal/controller/redis_controller.go
- services/dis-cache-operator/internal/controller/redis_controller_network.go
- services/dis-cache-operator/internal/controller/redis_controller_role.go
- services/dis-cache-operator/internal/controller/redis_controller_status.go
- services/dis-cache-operator/internal/controller/redis_controller_test.go
- services/dis-cache-operator/internal/controller/suite_test.go
- services/dis-cache-operator/internal/redis/builders.go
- services/dis-cache-operator/internal/redis/builders_test.go
- services/dis-cache-operator/internal/redis/identity.go
- services/dis-cache-operator/internal/redis/identity_test.go
- services/dis-cache-operator/internal/redis/labels.go
- services/dis-cache-operator/internal/redis/naming.go
- services/dis-cache-operator/internal/redis/naming_test.go
- services/dis-cache-operator/internal/redis/status.go
- services/dis-cache-operator/version.txt
💤 Files with no reviewable changes (26)
- services/dis-cache-operator/version.txt
- services/dis-cache-operator/config/crd/kustomization.yaml
- services/dis-cache-operator/config/rbac/kustomization.yaml
- services/dis-cache-operator/config/manager/kustomization.yaml
- services/dis-cache-operator/internal/controller/suite_test.go
- services/dis-cache-operator/hack/boilerplate.go.txt
- services/dis-cache-operator/config/default/manager_metrics_patch.yaml
- services/dis-cache-operator/config/rbac/metrics_auth_role_binding.yaml
- services/dis-cache-operator/.omc/sessions/0f974381-8734-4756-95d3-bc013825cdcd.json
- services/dis-cache-operator/config/samples/kustomization.yaml
- services/dis-cache-operator/Dockerfile
- services/dis-cache-operator/config/default/deploy_vars_patch.yaml
- services/dis-cache-operator/CHANGELOG.md
- services/dis-cache-operator/config/kind/applicationidentities.yaml
- services/dis-cache-operator/config/network-policy/kustomization.yaml
- services/dis-cache-operator/internal/controller/redis_controller_test.go
- services/dis-cache-operator/config/rbac/metrics_auth_role.yaml
- services/dis-cache-operator/internal/redis/naming_test.go
- services/dis-cache-operator/api/v1alpha1/groupversion_info.go
- services/dis-cache-operator/config/prometheus/kustomization.yaml
- services/dis-cache-operator/config/rbac/metrics_reader_role.yaml
- services/dis-cache-operator/internal/redis/naming.go
- services/dis-cache-operator/internal/config/config_test.go
- services/dis-cache-operator/internal/config/config.go
- services/dis-cache-operator/config/kind/serviceaccounts.yaml
- services/dis-cache-operator/internal/controller/redis_controller_role.go
✅ Files skipped from review due to trivial changes (9)
- services/dis-cache-operator/config/rbac/leader_election_role_binding.yaml
- services/dis-cache-operator/README.md
- services/dis-cache-operator/config/rbac/role_binding.yaml
- services/dis-cache-operator/config/kind/kustomization.yaml
- services/dis-cache-operator/PROJECT
- services/dis-cache-operator/AGENTS.md
- services/dis-cache-operator/config/default/kustomization.yaml
- services/dis-cache-operator/api/v1alpha1/zz_generated.deepcopy.go
- pr_description.md
…w fixes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- workflow: set persist-credentials: false on both checkout steps in dis-cache-lint-test.yml so GITHUB_TOKEN is not implicitly exposed to subsequent steps
- Makefile: drop the leading '-' from the docker-buildx push command so publish failures propagate (the '-' on the create/rm cleanup steps is retained intentionally; those legitimately tolerate failures)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- verify-deps cleanup now chmod -R u+w the temp module cache before rm, since go mod download writes files read-only and the trap-rm previously failed with Permission denied in CI
- add empty .trivyignore so the image-scan workflow can find the expected ignorefile (mirrors dis-vault-operator)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Dockerfile: pin golang:1.26.3 to its OCI image-index digest (sha256:6df14f4a...) so the builder image is reproducible, matching the lakmus pattern
- go.mod: bump google.golang.org/grpc v1.78.0 -> v1.79.3 to remediate CVE-2026-33186 (indirect, pulled by ASO/controller-runtime deps); cel.dev/expr carried up to v0.25.1 via go mod tidy
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review (functionality focus)
Reviewed against
1. CRITICAL: The RFC explicitly calls out creating a
2. MAJOR:
altinn-platform/services/dis-cache-operator/api/v1alpha1/redis_types.go Lines 126 to 130 in 8238be5
altinn-platform/services/dis-cache-operator/internal/redis/builders.go Lines 73 to 122 in 8238be5
3. MAJOR: Access policy assignment deferred → deployed cluster is unauthenticatable, and
4. MAJOR: "Shared" DNS zone is actually per-namespace, but every K8s CR points at the same Azure ARM resource.
altinn-platform/services/dis-cache-operator/internal/redis/builders.go Lines 227 to 275 in 8238be5
5. NIT: Dead code in
Comparison summary: identity resolution, status condition shape, deterministic naming, and the controller skeleton mirror
🤖 Generated with Claude Code
Trivy image scan flagged two HIGH CVEs in the manager binary:
- CVE-2026-29181 in go.opentelemetry.io/otel v1.40.0 (fixed in 1.41.0)
- CVE-2026-39883 in go.opentelemetry.io/otel/sdk v1.40.0 (fixed in 1.43.0)
Align with dis-pgsql-operator and dis-vault-operator, which already pin the OTel stack at v1.43.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…NS pattern
The PE was being created but no PrivateEndpointsPrivateDnsZoneGroup bound it into the shared privatelink.redis.azure.net zone, so DNS resolution from AKS to the cluster's private IP would never work. This adds the missing per-CR DnsZoneGroup, owner-referenced to the Redis CR (cascades on delete), pointing at the shared zone via ARM ID.
Also aligns the shared-DNS reconciliation with dis-pgsql-operator:
- Use SyncSpecAndLabels for in-place drift detection instead of create-only get-or-create.
- Tag shared resources with managed-by=dis-cache-operator.
- Wire the new zone group into getSharedDNSReady so PrivateDNSReady only flips true once the full DNS path (zone + link + zone group) is ready.
- Replace inline groupID literal with a named constant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
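The readiness gating described in this commit (PrivateDNSReady is true only once the zone, the VNet link, and the zone group are all ready) can be sketched roughly as below. This is a minimal illustration, not the operator's code: `componentStatus` and `sharedDNSReady` are hypothetical names, and the real `getSharedDNSReady` presumably reads Ready conditions from the ASO objects rather than plain booleans.

```go
package main

// componentStatus is a hypothetical stand-in for the Ready condition
// observed on one ASO resource in the DNS path.
type componentStatus struct {
	Name  string
	Ready bool
}

// sharedDNSReady reports true only when every component is ready.
// When it reports false, it also returns the name of the first
// component that is still pending, which can feed a condition message.
func sharedDNSReady(components ...componentStatus) (bool, string) {
	for _, c := range components {
		if !c.Ready {
			return false, c.Name
		}
	}
	return true, ""
}
```

Calling `sharedDNSReady` with a ready zone and link but a pending zone group yields `false` plus the zone group's name, so the aggregate condition stays false until the last piece of the DNS path is in place.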
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/dis-cache-operator/internal/redis/builders.go (1)
278-284: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Use a fixed operator namespace for shared DNS ASO objects.
These builders model shared Azure resources, but they place the K8s ASO objects in the caller namespace. Across multiple Redis namespaces, this can create multiple ASO objects targeting the same ARM resource, risking reconciliation contention and ambiguous lifecycle semantics.
Also applies to: 302-310
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/dis-cache-operator/internal/redis/builders.go` around lines 278 - 284, Builders like BuildSharedPrivateDNSZone currently set the Kubernetes object Namespace to the caller-provided namespace, causing duplicate ASO objects; change these builders to use a fixed operator namespace from the OperatorConfig (e.g., cfg.OperatorNamespace) instead of the function parameter namespace so shared Azure resources are represented by a single ASO object; update BuildSharedPrivateDNSZone and the other shared-resource builders referenced around lines 302-310 to set ObjectMeta.Namespace = cfg.OperatorNamespace and adjust any callers if necessary.
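A minimal sketch of the suggested change, under stated assumptions: `OperatorConfig.OperatorNamespace` is the field name the comment proposes as an example (it may not exist in the repository), `objectMeta` is a simplified stand-in for `metav1.ObjectMeta`, `sharedZoneMeta` is a hypothetical helper, and the label key is illustrative.

```go
package main

// OperatorConfig is a hypothetical stand-in for the operator's config type;
// OperatorNamespace is the field the review comment suggests.
type OperatorConfig struct {
	OperatorNamespace string
}

// objectMeta is a simplified stand-in for metav1.ObjectMeta.
type objectMeta struct {
	Name      string
	Namespace string
	Labels    map[string]string
}

// sharedZoneMeta builds metadata for the Kubernetes object that represents
// the shared private DNS zone. It deliberately takes no caller namespace:
// the object always lands in the operator's own namespace, so every Redis
// CR resolves to the same single ASO object.
func sharedZoneMeta(cfg OperatorConfig, zoneName string) objectMeta {
	return objectMeta{
		Name:      zoneName,
		Namespace: cfg.OperatorNamespace,
		Labels:    map[string]string{"managed-by": "dis-cache-operator"},
	}
}
```

Because the namespace comes only from configuration, two reconciles triggered from different team namespaces would produce identical metadata and therefore target one object instead of creating competing ones.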
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@services/dis-cache-operator/internal/k8s/resource_sync.go`:
- Around line 12-23: SyncSpecAndLabels dereferences existingSpec unconditionally
which can panic if callers pass nil; add a nil guard at the start of
SyncSpecAndLabels: if existingSpec == nil { updated = true /* mark spec as
changed since no existing value */ } else { keep the existing
equality.Semantic.DeepEqual(*existingSpec, desiredSpec) branch and assign
*existingSpec = desiredSpec when different }. Ensure the rest of the function
(labels handling and returned map,bool) still executes so callers get the
expected label merge and update flag even when existingSpec is nil.
---
Outside diff comments:
In `@services/dis-cache-operator/internal/redis/builders.go`:
- Around line 278-284: Builders like BuildSharedPrivateDNSZone currently set the
Kubernetes object Namespace to the caller-provided namespace, causing duplicate
ASO objects; change these builders to use a fixed operator namespace from the
OperatorConfig (e.g., cfg.OperatorNamespace) instead of the function parameter
namespace so shared Azure resources are represented by a single ASO object;
update BuildSharedPrivateDNSZone and the other shared-resource builders
referenced around lines 302-310 to set ObjectMeta.Namespace =
cfg.OperatorNamespace and adjust any callers if necessary.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 28430fea-a542-4fe6-86c2-a9708e9cf99a
📒 Files selected for processing (7)
- services/dis-cache-operator/config/rbac/role.yaml
- services/dis-cache-operator/internal/controller/redis_controller.go
- services/dis-cache-operator/internal/controller/redis_controller_network.go
- services/dis-cache-operator/internal/k8s/resource_sync.go
- services/dis-cache-operator/internal/redis/builders.go
- services/dis-cache-operator/internal/redis/builders_test.go
- services/dis-cache-operator/internal/redis/labels.go
func SyncSpecAndLabels[S any](
	existingSpec *S,
	desiredSpec S,
	existingLabels map[string]string,
	desiredLabels map[string]string,
) (map[string]string, bool) {
	updated := false

	if !equality.Semantic.DeepEqual(*existingSpec, desiredSpec) {
		*existingSpec = desiredSpec
		updated = true
	}
Add a nil guard for existingSpec to avoid panic.
*existingSpec is dereferenced unconditionally at Line 20. Since this is an exported helper, a nil caller input will crash reconciliation paths unexpectedly.
Suggested patch
 func SyncSpecAndLabels[S any](
 	existingSpec *S,
 	desiredSpec S,
 	existingLabels map[string]string,
 	desiredLabels map[string]string,
 ) (map[string]string, bool) {
 	updated := false
+
+	if existingSpec == nil {
+		return existingLabels, false
+	}
 	if !equality.Semantic.DeepEqual(*existingSpec, desiredSpec) {
 		*existingSpec = desiredSpec
 		updated = true
 	}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-func SyncSpecAndLabels[S any](
-	existingSpec *S,
-	desiredSpec S,
-	existingLabels map[string]string,
-	desiredLabels map[string]string,
-) (map[string]string, bool) {
-	updated := false
-	if !equality.Semantic.DeepEqual(*existingSpec, desiredSpec) {
-		*existingSpec = desiredSpec
-		updated = true
-	}
+func SyncSpecAndLabels[S any](
+	existingSpec *S,
+	desiredSpec S,
+	existingLabels map[string]string,
+	desiredLabels map[string]string,
+) (map[string]string, bool) {
+	updated := false
+	if existingSpec == nil {
+		return existingLabels, false
+	}
+	if !equality.Semantic.DeepEqual(*existingSpec, desiredSpec) {
+		*existingSpec = desiredSpec
+		updated = true
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/dis-cache-operator/internal/k8s/resource_sync.go` around lines 12 -
23, SyncSpecAndLabels dereferences existingSpec unconditionally which can panic
if callers pass nil; add a nil guard at the start of SyncSpecAndLabels: if
existingSpec == nil { updated = true /* mark spec as changed since no existing
value */ } else { keep the existing equality.Semantic.DeepEqual(*existingSpec,
desiredSpec) branch and assign *existingSpec = desiredSpec when different }.
Ensure the rest of the function (labels handling and returned map,bool) still
executes so callers get the expected label merge and update flag even when
existingSpec is nil.
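The prompt above describes a guard that differs from the early return in the suggested patch: it marks the spec as changed and still runs the label handling. A rough sketch of that variant follows, with loud assumptions. `syncSpecAndLabels` is an illustrative copy, not the repository's function. `reflect.DeepEqual` stands in for `equality.Semantic.DeepEqual` to keep the example free of external modules. The label merge (copy existing, overlay desired) is a guess, since the rest of the real function is not shown in this review.

```go
package main

import "reflect"

// syncSpecAndLabels is an illustrative variant of the reviewed helper.
func syncSpecAndLabels[S any](
	existingSpec *S,
	desiredSpec S,
	existingLabels map[string]string,
	desiredLabels map[string]string,
) (map[string]string, bool) {
	updated := false

	if existingSpec == nil {
		// Nothing to compare against or write into: report a change
		// instead of dereferencing a nil pointer.
		updated = true
	} else if !reflect.DeepEqual(*existingSpec, desiredSpec) {
		*existingSpec = desiredSpec
		updated = true
	}

	// Assumed label handling: keep existing labels, overlay desired ones,
	// and flag an update when any desired label is new or different.
	merged := make(map[string]string, len(existingLabels)+len(desiredLabels))
	for k, v := range existingLabels {
		merged[k] = v
	}
	for k, v := range desiredLabels {
		if cur, ok := merged[k]; !ok || cur != v {
			merged[k] = v
			updated = true
		}
	}
	return merged, updated
}
```

With this shape a nil `existingSpec` no longer panics, and callers still receive the merged labels and an update flag. Whether reporting `true` or returning early is the right behavior depends on how callers treat the flag, which this review does not show.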
Add dis-cache-operator (RFC 0014) — self-service Azure Managed Redis
This PR scaffolds a new operator, `dis-cache-operator`, that reconciles a `Redis` CR into Azure Managed Redis (`Microsoft.Cache/redisEnterprise`) via Azure Service Operator. It mirrors the proven patterns from `dis-vault-operator` (single-resource-per-CR, federated-identity-owned) and `dis-pgsql-operator` (private endpoint + shared private DNS + AKS VNet link).
See RFC 0014 for the full design.
Feature Behavior (BDD)
Given a `Redis` custom resource in the team namespace that references a ready `ApplicationIdentity` via `spec.identityRef`,
When the operator reconciles the CR,
Then it computes a deterministic Azure cluster name from `namespace + name + environment`,
And it creates ASO `RedisEnterprise` (cluster) and `RedisEnterpriseDatabase` resources with `accessKeysAuthentication=Disabled`, TLS-only client protocol by default, and port 10000,
And it creates an ASO `PrivateEndpoint` targeting the cluster in the configured AKS data subnet,
And it get-or-creates the shared `privatelink.redis.azure.net` private DNS zone and the AKS VNet link to it (label-managed, not owner-referenced to any single CR),
And it publishes status conditions `IdentityReady`, `ClusterReady`, `DatabaseReady`, `PrivateEndpointReady`, `PrivateDNSReady`, `AccessPolicyReady`, and an aggregated `Ready`,
And it populates `status.azureName`, `status.hostName`, `status.port`, `status.clusterResourceId`, `status.databaseResourceId`, and `status.ownerPrincipalId`.

Given the referenced identity is not yet ready (missing `status.principalId` or `Ready=True`),
When the operator reconciles,
Then it sets `IdentityReady=False` with reason `IdentityNotReady`, emits dependent ASO conditions as `IdentityNotReady` (rather than reading stale state), leaves Azure resources untouched, and requeues after 5 seconds.

Given the `Redis` CR specifies `spec.serviceAccountRef` instead of `spec.identityRef`,
When the operator reconciles,
Then it resolves the principal from the workload-identity annotations (`azure.workload.identity/client-id` and `dis.altinn.cloud/principal-id`) on the referenced `ServiceAccount`.

Given the `Redis` CR is deleted,
When the operator observes the deletion,
Then Kubernetes garbage collection cascades deletion to the owner-referenced cluster, database, and private endpoint resources, while the shared DNS zone and VNet link remain (they outlive any single CR).

Given the `PrivateDnsZonesVirtualNetworkLink` is not yet ready,
When the operator computes `PrivateDNSReady`,
Then it requires both the zone and the VNet link to be Ready before reporting True, so applications never see a True DNS condition while name resolution from AKS still fails.

Given any of `ClusterReady`, `DatabaseReady`, `PrivateEndpointReady`, or `PrivateDNSReady` is not yet True,
When the reconcile loop completes,
Then the controller requeues after `provisioningRequeueDelay` instead of waiting only for an owned-resource watch event (the shared DNS zone is label-managed and never fires watches).

ASCII Diagram
Test plan
- `cd services/dis-cache-operator && make fmt vet test lint`: all green locally
- `make manifests`: CRD reflects the `RedisPersistence` `aof`/`rdb` mutual-exclusion XValidation rule
- `dis-cache-lint-test.yml` runs golangci-lint and `make test` on the new path filter
- `dis-cache-release.yml` builds the image on merge to main; release-please opens a `dis-cache-v0.1.0` PR
- `config/samples/redis_v1alpha1_redis.yaml` against a Kind cluster; verify the CR is admitted and conditions surface as expected

Summary by CodeRabbit
New Features
Documentation
Tests
Chores