feat(stacks): add NetworkPolicy micro-segmentation controlled by Settings#396
feat(stacks): add NetworkPolicy micro-segmentation controlled by Settings#396
Conversation
…ings Add network isolation between services within a Stack namespace. When enabled via `networkpolicies.enabled=true` Setting, creates 3 NetworkPolicies: - default-deny-ingress: blocks all ingress traffic - allow-gateway-ingress: allows all ingress to gateway pods - allow-from-gateway: allows ingress only from gateway to other services Policies are owned by the Stack (garbage-collected on deletion) and can be toggled on/off dynamically. Egress is left unrestricted.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: Organization UI Review profile: CHILL Plan: Pro Disabled knowledge base sources:
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdds network policy reconciliation to the Stack controller. A Settings feature flag ( Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(240,248,255,0.5)
participant Settings
participant Controller
participant Stack
participant K8sAPI
end
Settings->>Controller: patch `networkpolicies.enabled` (stack or "*")
Controller->>Stack: enqueue reconcile request
Controller->>Controller: Reconcile(...) -> reconcileNetworkPolicies(ctx, stack)
alt enabled
Controller->>K8sAPI: CreateOrUpdate NetworkPolicy `default-deny-ingress`
Controller->>K8sAPI: CreateOrUpdate NetworkPolicy `allow-gateway-ingress`
Controller->>K8sAPI: CreateOrUpdate NetworkPolicy `allow-from-gateway`
K8sAPI-->>Controller: return created/updated objects
else disabled
Controller->>K8sAPI: DeleteIfExists for 3 NetworkPolicies
K8sAPI-->>Controller: deletion results
end
Controller->>K8sAPI: set ownerRef -> Stack on created policies
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Document the new network micro-segmentation feature in the Settings reference page with usage example and behavior details.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
internal/resources/stacks/networkpolicies.go (1)
102-110: Centralize policy names/labels to avoid drift.
The policy names and gateway label are repeated across create/delete. A small set of constants keeps them in sync.♻️ Suggested refactor
+const ( + npDefaultDenyIngress = "default-deny-ingress" + npAllowGatewayIngress = "allow-gateway-ingress" + npAllowFromGateway = "allow-from-gateway" + gatewayLabelKey = "app.kubernetes.io/name" + gatewayLabelValue = "gateway" +) + +var networkPolicyNames = []string{ + npDefaultDenyIngress, + npAllowGatewayIngress, + npAllowFromGateway, +} @@ - Name: "default-deny-ingress", + Name: npDefaultDenyIngress, @@ - Name: "allow-gateway-ingress", + Name: npAllowGatewayIngress, @@ - MatchLabels: map[string]string{ - "app.kubernetes.io/name": "gateway", - }, + MatchLabels: map[string]string{ + gatewayLabelKey: gatewayLabelValue, + }, @@ - Name: "allow-from-gateway", + Name: npAllowFromGateway, @@ - MatchLabels: map[string]string{ - "app.kubernetes.io/name": "gateway", - }, + MatchLabels: map[string]string{ + gatewayLabelKey: gatewayLabelValue, + }, @@ - for _, name := range []string{"default-deny-ingress", "allow-gateway-ingress", "allow-from-gateway"} { + for _, name := range networkPolicyNames {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/resources/stacks/networkpolicies.go` around lines 102 - 110, The network policy names and gateway label are duplicated and can drift; introduce a small set of constants (e.g., PolicyDefaultDenyIngress, PolicyAllowGatewayIngress, PolicyAllowFromGateway, and GatewayLabelKey/Value) and use them instead of string literals in deleteNetworkPolicies and the corresponding create function(s) (look for functions that construct these policies). Update DeleteIfExists calls in deleteNetworkPolicies to reference the new constants and replace any repeated label usage in policy selectors/annotations with the GatewayLabel constants so names/labels stay centralized and consistent.internal/resources/stacks/init.go (1)
416-444: Log wildcard list failures to avoid silent skips.
If listing stacks fails, the handler currently returns nil with no signal. A log helps diagnose missing reconciles.🔎 Suggested logging
if s.IsWildcard() { stackList := &v1beta1.StackList{} if err := ctx.GetClient().List(watchCtx, stackList); err != nil { + log.FromContext(watchCtx).Error(err, "unable to list stacks for wildcard networkpolicies setting") return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/resources/stacks/init.go` around lines 416 - 444, The code path inside WithRaw[*v1beta1.Stack] that calls ctx.GetClient().List(stackList) silently returns nil on error; change it to capture the error and emit a log entry before returning so the failure is visible. Specifically, in the handler.EnqueueRequestsFromMapFunc closure where ctx.GetClient().List(watchCtx, stackList) is called, assign the error to a variable (err) and call the context logger (e.g., ctx.GetLogger().Error or ctx.Log.Error) with a clear message and the err value (include operation and resource like "listing stacks failed") and then return nil as before; keep the surrounding logic in WithRaw[*v1beta1.Stack] and the use of stackList and reconcile.Request unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/resources/stacks/init.go`:
- Around line 416-444: The code path inside WithRaw[*v1beta1.Stack] that calls
ctx.GetClient().List(stackList) silently returns nil on error; change it to
capture the error and emit a log entry before returning so the failure is
visible. Specifically, in the handler.EnqueueRequestsFromMapFunc closure where
ctx.GetClient().List(watchCtx, stackList) is called, assign the error to a
variable (err) and call the context logger (e.g., ctx.GetLogger().Error or
ctx.Log.Error) with a clear message and the err value (include operation and
resource like "listing stacks failed") and then return nil as before; keep the
surrounding logic in WithRaw[*v1beta1.Stack] and the use of stackList and
reconcile.Request unchanged.
In `@internal/resources/stacks/networkpolicies.go`:
- Around line 102-110: The network policy names and gateway label are duplicated
and can drift; introduce a small set of constants (e.g.,
PolicyDefaultDenyIngress, PolicyAllowGatewayIngress, PolicyAllowFromGateway, and
GatewayLabelKey/Value) and use them instead of string literals in
deleteNetworkPolicies and the corresponding create function(s) (look for
functions that construct these policies). Update DeleteIfExists calls in
deleteNetworkPolicies to reference the new constants and replace any repeated
label usage in policy selectors/annotations with the GatewayLabel constants so
names/labels stay centralized and consistent.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/09-Configuration reference/01-Settings.md (1)
412-436: State the default behavior explicitly.The new section explains behavior “when enabled,” but doesn’t explicitly say that policies are disabled by default unless
networkpolicies.enabled=true. This can mislead users skimming the doc. Consider a one-line callout.📌 Suggested doc tweak
The operator can create Kubernetes NetworkPolicies to enforce network micro-segmentation within a Stack namespace. When enabled, the following rules are applied: +By default, no NetworkPolicies are created unless `networkpolicies.enabled` is set to `true`.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/09-Configuration` reference/01-Settings.md around lines 412 - 436, Add an explicit one-line statement clarifying the default: NetworkPolicies are disabled unless the Settings resource with kind "Settings" and spec.key "networkpolicies.enabled" is set to "true" (e.g., the shown spec). Update the "Configure Network Policies" section to prepend or append a short callout like: "Note: NetworkPolicies are disabled by default; enable them by setting networkpolicies.enabled = \"true\" in a Settings resource."
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/09-Configuration` reference/01-Settings.md:
- Around line 412-436: Add an explicit one-line statement clarifying the
default: NetworkPolicies are disabled unless the Settings resource with kind
"Settings" and spec.key "networkpolicies.enabled" is set to "true" (e.g., the
shown spec). Update the "Configure Network Policies" section to prepend or
append a short callout like: "Note: NetworkPolicies are disabled by default;
enable them by setting networkpolicies.enabled = \"true\" in a Settings
resource."
Keep v3 module paths from main and clean YAML formatting from the network-policies branch. Remove duplicate --provenance=false entries.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/09-Configuration` reference/01-Settings.md:
- Around line 415-423: Update the "NetworkPolicies" section that begins with
"The operator can create Kubernetes NetworkPolicies..." to explicitly state the
default configuration: add a clear sentence (e.g., "Disabled by default; no
NetworkPolicies are created unless this setting is explicitly enabled/ set to
true") immediately before or after the existing description and bullet list so
readers know the operator does not create policies unless enabled.
In `@internal/resources/stacks/init.go`:
- Around line 426-428: The call to ctx.GetClient().List(watchCtx, stackList)
currently swallows errors by returning nil, which prevents the controller from
retrying; change the handler in init.go to propagate the error (or log and
return it) instead of returning nil so that List failures surface to the
reconciler and trigger retries—locate the block containing ctx.GetClient().List,
watchCtx, and stackList and replace the silent-return with a return of the error
(or a wrapped error with context) so wildcard fan-out failures are not dropped.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (3)
.goreleaser.ymlis excluded by!**/*.ymlconfig/rbac/role.yamlis excluded by!**/*.yamlhelm/operator/templates/gen/rbac.authorization.k8s.io_v1_clusterrole_formance-manager-role.yamlis excluded by!**/gen/**,!**/*.yaml,!**/gen/**
📒 Files selected for processing (2)
docs/09-Configuration reference/01-Settings.mdinternal/resources/stacks/init.go
| The operator can create Kubernetes NetworkPolicies to enforce network micro-segmentation within a Stack namespace. When enabled, the following rules are applied: | ||
|
|
||
| - **All ingress traffic is denied by default** to all pods in the namespace | ||
| - **The Gateway is accessible by everyone** (it is the entry point) | ||
| - **All other services** (Ledger, Payments, Auth, etc.) **are only accessible from the Gateway** | ||
|
|
||
| Egress traffic is not restricted — pods can still reach DNS, databases, brokers, and external services. | ||
|
|
||
| NetworkPolicies are owned by the Stack and are automatically garbage-collected when the Stack is deleted. |
There was a problem hiding this comment.
Document the default state explicitly (disabled unless set to true).
The section explains enabled behavior, but it does not clearly state that no NetworkPolicies are created by default. Please make this explicit to avoid rollout confusion.
📝 Suggested doc patch
The operator can create Kubernetes NetworkPolicies to enforce network micro-segmentation within a Stack namespace. When enabled, the following rules are applied:
+By default, NetworkPolicies are **not** created. Set `networkpolicies.enabled` to `"true"` to enable this behavior.
+
- **All ingress traffic is denied by default** to all pods in the namespace
- **The Gateway is accessible by everyone** (it is the entry point)
- **All other services** (Ledger, Payments, Auth, etc.) **are only accessible from the Gateway**📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| The operator can create Kubernetes NetworkPolicies to enforce network micro-segmentation within a Stack namespace. When enabled, the following rules are applied: | |
| - **All ingress traffic is denied by default** to all pods in the namespace | |
| - **The Gateway is accessible by everyone** (it is the entry point) | |
| - **All other services** (Ledger, Payments, Auth, etc.) **are only accessible from the Gateway** | |
| Egress traffic is not restricted — pods can still reach DNS, databases, brokers, and external services. | |
| NetworkPolicies are owned by the Stack and are automatically garbage-collected when the Stack is deleted. | |
| The operator can create Kubernetes NetworkPolicies to enforce network micro-segmentation within a Stack namespace. When enabled, the following rules are applied: | |
| By default, NetworkPolicies are **not** created. Set `networkpolicies.enabled` to `"true"` to enable this behavior. | |
| - **All ingress traffic is denied by default** to all pods in the namespace | |
| - **The Gateway is accessible by everyone** (it is the entry point) | |
| - **All other services** (Ledger, Payments, Auth, etc.) **are only accessible from the Gateway** | |
| Egress traffic is not restricted — pods can still reach DNS, databases, brokers, and external services. | |
| NetworkPolicies are owned by the Stack and are automatically garbage-collected when the Stack is deleted. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/09-Configuration` reference/01-Settings.md around lines 415 - 423,
Update the "NetworkPolicies" section that begins with "The operator can create
Kubernetes NetworkPolicies..." to explicitly state the default configuration:
add a clear sentence (e.g., "Disabled by default; no NetworkPolicies are created
unless this setting is explicitly enabled/ set to true") immediately before or
after the existing description and bullet list so readers know the operator does
not create policies unless enabled.
| if err := ctx.GetClient().List(watchCtx, stackList); err != nil { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Don’t silently drop wildcard fan-out failures.
On Line 426–Line 428, a List failure returns nil with no signal. This can skip all stack reconciles for a wildcard settings update and leave NetworkPolicies stale until some unrelated reconcile event occurs.
🔧 Minimal reliability fix
if s.IsWildcard() {
stackList := &v1beta1.StackList{}
if err := ctx.GetClient().List(watchCtx, stackList); err != nil {
+ log.FromContext(watchCtx).Error(err, "cannot list stacks for wildcard networkpolicies.enabled setting")
return nil
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/resources/stacks/init.go` around lines 426 - 428, The call to
ctx.GetClient().List(watchCtx, stackList) currently swallows errors by returning
nil, which prevents the controller from retrying; change the handler in init.go
to propagate the error (or log and return it) instead of returning nil so that
List failures surface to the reconciler and trigger retries—locate the block
containing ctx.GetClient().List, watchCtx, and stackList and replace the
silent-return with a return of the error (or a wrapped error with context) so
wildcard fan-out failures are not dropped.
The module path was incorrectly changed from github.com/formancehq/operator/v3 to github.com/formancehq/operator, breaking CI since Go requires the /v3 suffix for modules with v3.x.x tags.
3445da6 to
5a5573f
Compare
Summary
networkpolicies.enabledSettingUsage
Test plan
networkpolicies.enabled=true: 3 NetworkPolicies created, controlled by Stacktrue→false: all 3 NetworkPolicies deleted*): works for all stacks