Skip to content

CFP-45607: enable delegated-ipam in conjunction with Gateway API#93

Open
matmerr wants to merge 1 commit into
cilium:mainfrom
matmerr:matmerr/delegated-ingress-ipam
Open

CFP-45607: enable delegated-ipam in conjunction with Gateway API#93
matmerr wants to merge 1 commit into
cilium:mainfrom
matmerr:matmerr/delegated-ingress-ipam

Conversation

@matmerr
Copy link
Copy Markdown

@matmerr matmerr commented Apr 24, 2026

tl;dr: when running with delegated ipam enabled, allow delegated ipam to also allocate ingress IP's for gateway API.

Cilium feature issue here: cilium/cilium#45607

Implementation here: cilium/cilium#45818

@xmulligan
Copy link
Copy Markdown
Member

cc: @cilium/sig-agent

Comment thread cilium/CFP-45607-delegated-ipam-ingress.md Outdated
Comment thread cilium/CFP-45607-delegated-ipam-ingress.md Outdated
Comment thread cilium/CFP-45607-delegated-ipam-ingress.md Outdated
@matmerr matmerr force-pushed the matmerr/delegated-ingress-ipam branch 2 times, most recently from 020435c to 754f6d2 Compare April 27, 2026 15:16

### Allocation notes

* If ingress IPs are already present in memory, skip allocation to avoid leaking IPs via duplicate CNI ADD calls.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if it calls delegated ipam cni again, there won't be leak since you are using same container ID?


### Deallocation

Deallocation issues a CNI DEL with the same stable container ID.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so delete will be invoked always even if gateway API disabled and if delegated-ipam-bin-path set right?

what if some one allocated cilium ingress ip and accidentally removed delegated-ipam-bin-path and restart cilium. will that leak ip?


### Skipping identity restoration:

We also need to consider skipping identity restoration from the [local IP cache.](https://github.com/cilium/cilium/blob/6f1c09105e01105d8dca315036a244505997e893/pkg/ipcache/restore/local_identity_restorer.go#L187).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this different only for ingress IP allocated via delegated ipam? if so, why it's different for this case?

Add CFP for enabling Gateway API with delegated IPAM by having
cilium-agent invoke the external IPAM plugin directly using a stable
container ID. Scoped to ingress IPs as an incremental step building
on previous CFP: Delegated IPAM CiliumIPs.

Ref: cilium/cilium#45607
Signed-off-by: Mathew Merrick <matmerr@microsoft.com>
@matmerr matmerr force-pushed the matmerr/delegated-ingress-ipam branch from 754f6d2 to 8a96ded Compare April 29, 2026 00:07
matmerr added a commit to matmerr/cilium that referenced this pull request May 6, 2026
Relates to cilium/design-cfps#93.

When using delegated IPAM mode (ipam.mode=delegated-plugin), enabling
gateway API was rejected at startup because checkIPAMDelegatedPlugin
returned an error when EnableEnvoyConfig was set, which gatewayAPI
implies. This prevented running gateway API with delegated IPAM entirely.

Remove the EnableEnvoyConfig guard and add a delegated IPAM code path
for gateway ingress IP allocation. On allocation, the agent invokes the
external IPAM plugin directly via the CNI exec interface, since there is
no kubelet-driven CNI call chain for agent-owned IPs. On deallocation,
the same path is used to release the IPs back to the plugin. Do this by
referencing the CNI config manager into infraIPAllocator so it can read
the conflist and discover the configured IPAM plugin.

Identity restoration for ingress IPs is skipped under delegated IPAM
since the IPs are re-allocated on agent restart.

Signed-off-by: Mathew Merrick <matmerr@microsoft.com>
@joestringer
Copy link
Copy Markdown
Member

@cilium/sig-ipam could you also take a look? Seems like you might have some thoughts or insights into the structure.

Copy link
Copy Markdown
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, from an IPAM perspective this looks reasonable to me. I have one concern though.


The other CNI parameters have defaults in the CNI call since we're only doing IPAM, not actual network plumbing:

* `CNI_NETNS` = `/proc/1/ns/net` (host network namespace)
Copy link
Copy Markdown
Member

@gandro gandro May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about this. What if the delegated IPAM plugin starts messing with eth0 in the host network namespace (for example, removes IPs from eth0)?

Have you considered spawning a fake network namespace for the ingress IP instead? Similarly to how Cilium does it with the cilium-health endpoint?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second though, it's only an IPAM plugin, not a full CNI delegation. So maybe this is less bad than I thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants