CFP-45836: Multicast Egress as CiliumEgressGatewayPolicy Extension#95
Open
kkroo wants to merge 1 commit into
Extends CiliumEgressGatewayPolicy.destinationCIDRs to accept multicast CIDRs (e.g., 232.0.0.0/8). On match, the BPF datapath redirects via the existing VXLAN egress-gateway path to the selected gateway node, SNATs the source IP to egressIP, and clone_redirects to the gateway's primary interface, letting the host kernel handle MAC-layer multicast. No new CRD type. Same operator UX as unicast egress. Designed atop the existing LPM_TRIE map; minimal datapath additions on origin (bpf_host.c:1425) and gateway (bpf_overlay.c:359 + new clone_redirect branch).

Issue: cilium/cilium#45836

Signed-off-by: Omar Ramadan <ramadan@blockcast.net>
Signed-off-by: Omar Ramadan <omar@blockcast.net>
This CFP proposes extending CiliumEgressGatewayPolicy.destinationCIDRs to accept multicast CIDRs (e.g., 232.0.0.0/8, the SSM range). When matched, the BPF datapath redirects via the existing VXLAN egress-gateway path to the selected gateway node, SNATs the source IP to egressIP, and clone_redirects to the gateway's primary interface, letting the host kernel handle MAC-layer multicast and the upstream network handle PIM-SSM tree distribution.

No new CRD type, no new fields. Same operator UX as unicast egress.
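To make "accept multicast CIDRs" concrete, here is a minimal Go sketch of how a destination CIDR could be classified as multicast before it is handed to the datapath. The helper name classifyDestination is hypothetical and not part of Cilium. The sketch assumes IPv4 only and treats a prefix as multicast only if it lies entirely inside 224.0.0.0/4 (the SSM range 232.0.0.0/8 is a subset of it).

```go
package main

import (
	"fmt"
	"net/netip"
)

// classifyDestination is a hypothetical helper, not a Cilium API.
// It reports whether an IPv4 destination CIDR lies entirely inside
// the IPv4 multicast range 224.0.0.0/4.
func classifyDestination(cidr string) (bool, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, fmt.Errorf("parse %q: %w", cidr, err)
	}
	if !p.Addr().Is4() {
		return false, fmt.Errorf("%q: only IPv4 is handled in this sketch", cidr)
	}
	// Zero the host bits so that the base address is canonical.
	p = p.Masked()

	allMulticast := netip.MustParsePrefix("224.0.0.0/4")
	// The prefix is fully multicast when it is at least as specific as
	// 224.0.0.0/4 and its base address falls inside that range.
	inside := p.Bits() >= allMulticast.Bits() && allMulticast.Contains(p.Addr())
	return inside, nil
}
```

With this helper, classifyDestination("232.0.0.0/8") reports a multicast prefix, while "10.0.0.0/8" and the too-broad "224.0.0.0/3" do not. Whether such a check belongs in the operator or the agent is left open here.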
Why now
The existing multicast-enabled feature handles cluster-internal pod-to-pod fanout via VXLAN. CiliumEgressGatewayPolicy handles unicast egress to external networks with SNAT. There's a gap between the two: pods publishing SSM multicast to upstream multicast-aware networks (PIM-SSM peers, AMT relays per RFC 7450, middle-mile AMT tunneling per draft-zzhang-mboned-dynamic-internet-mcast-tunnel-00) have no path. Today their multicast packets are dropped between the pod's veth and the host's eth0.

Use cases include live media distribution (SSM video CDN-style), financial market data feeds (most exchanges deliver via SSM), and internal enterprise multicast trees.
Why CEGP extension vs. new CRD
Reading the existing code in bpf/lib/egress_gateway.h, bpf/bpf_host.c:1425, bpf/bpf_overlay.c, and pkg/egressgateway/policy.go showed the surface area is much smaller than initially scoped:

- The cilium_egress_gw_policy_v4 map is already an LPM_TRIE, so a single 232.0.0.0/8 entry covers SSM with one map key. No schema migration.
- The policy code holds destinations as dstCIDRs []netip.Prefix without filtering.
- bpf/bpf_lxc.c:1715 already has multicast detection for the multicast-enabled feature, with fall-through to standard egress when no in-cluster subscribers exist. No pod-side hook changes needed.
- The path bpf_host.c:1425 → egress_gw_handle_request → egress_gw_handle_packet already does an LPM lookup and would already match multicast policy entries.

The substantive datapath work is small: confirm lookup_ip4_remote_endpoint returns a sensible identity for multicast destinations (probably WORLD, but verify), and add a clone_redirect-instead-of-fib_lookup branch on the gateway-node from-overlay path. Estimated 2–3 weeks of focused work.

Issue
cilium/cilium#45836
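The LPM argument above (one multicast prefix entry, matched by the existing lookup) can be illustrated with a small user-space model of longest-prefix matching. This is only a sketch: policyEntry, parseEntry, and longestMatch are hypothetical names, the addresses are documentation examples, and the model matches on destination only, whereas the real policy map key is understood to include the source pod IP as well.

```go
package main

import "net/netip"

// policyEntry is a hypothetical, simplified stand-in for one egress
// gateway policy map entry (destination prefix -> egress IP).
type policyEntry struct {
	Dst      netip.Prefix
	EgressIP netip.Addr
}

// parseEntry builds an entry from strings and panics on malformed
// input, which is acceptable for a short illustration.
func parseEntry(dstCIDR, egressIP string) policyEntry {
	return policyEntry{
		Dst:      netip.MustParsePrefix(dstCIDR).Masked(),
		EgressIP: netip.MustParseAddr(egressIP),
	}
}

// longestMatch returns the entry whose prefix contains dst and has the
// most prefix bits, which is the semantics an LPM trie lookup provides.
// It returns false when dst is unparsable or no entry matches.
func longestMatch(entries []policyEntry, dst string) (policyEntry, bool) {
	addr, err := netip.ParseAddr(dst)
	if err != nil {
		return policyEntry{}, false
	}
	var best policyEntry
	found := false
	for _, e := range entries {
		if !e.Dst.Contains(addr) {
			continue
		}
		if !found || e.Dst.Bits() > best.Dst.Bits() {
			best, found = e, true
		}
	}
	return best, found
}
```

With entries for 232.0.0.0/8 and 232.1.0.0/16, a packet to 232.1.2.3 selects the /16 entry and a packet to 232.9.9.9 falls back to the /8 entry, so a single broad SSM entry and narrower overrides can coexist in the same map.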
Coexistence with multicast-enabled
The two features operate on disjoint subscriber sets but share the policy / map infrastructure. When both are enabled and a publisher pod sends to a group with both local-pod subscribers and a CEGP egress entry, both deliveries happen: packets replicate to local pods (existing path) and are also emitted on the egress-gateway node's eth0 (new path). The two paths don't conflict because the subscriber types are disjoint.
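The coexistence rule can be written down as a tiny decision function. This is a sketch of the intended behavior as described in this section, not of any existing Cilium code. The type and function names are hypothetical.

```go
package main

// delivery names one way a published multicast packet leaves the
// publisher's node in this sketch.
type delivery string

const (
	deliverLocalPods     delivery = "local-pods"     // existing multicast-enabled path
	deliverEgressGateway delivery = "egress-gateway" // proposed CEGP path
)

// plannedDeliveries lists the deliveries for one packet. The two
// conditions are evaluated independently, so both can apply at once.
func plannedDeliveries(multicastEnabled bool, localSubscribers int, cegpMatch bool) []delivery {
	var out []delivery
	if multicastEnabled && localSubscribers > 0 {
		out = append(out, deliverLocalPods)
	}
	if cegpMatch {
		out = append(out, deliverEgressGateway)
	}
	return out
}
```

For a group with three local subscribers and a matching CEGP entry, plannedDeliveries(true, 3, true) yields both deliveries. With no subscribers and no policy match it yields none, which corresponds to today's drop behavior.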
Asks
Comment on the design, especially:
- Whether to extend CiliumEgressGatewayPolicy rather than create a parallel CiliumMulticastEgressPolicy type.
- Whether lookup_ip4_remote_endpoint returns WORLD identity for multicast destinations today, or if we need an explicit IN_MULTICAST(daddr) → WORLD carve-out before the existing identity_is_cluster() check.
- Which SIG should own this: sig-datapath? sig-policy? Both?

Also: is anyone already working on multicast egress upstream? Is Isovalent's enterprise IP Multicast feature on a path to OSS contribution that would obviate this?
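For discussion, the explicit carve-out from the second ask can be modeled in user space as follows. This is a sketch under assumptions: the identity labels, the map-based cache (a stand-in for lookup_ip4_remote_endpoint, which in the real datapath is a prefix lookup and not an exact match), and both function names are hypothetical. It only shows the ordering being proposed: the multicast check runs before the cluster-identity check.

```go
package main

import "net/netip"

// identity is a hypothetical label standing in for Cilium's numeric
// security identities.
type identity string

const (
	identityWorld   identity = "world"
	identityCluster identity = "cluster"
)

// resolveDstIdentity models the proposed carve-out. A multicast
// destination is treated as WORLD without consulting the cache at all.
// cache is an exact-match stand-in for the remote endpoint lookup.
func resolveDstIdentity(dst string, cache map[string]identity) (identity, error) {
	addr, err := netip.ParseAddr(dst)
	if err != nil {
		return "", err
	}
	// Equivalent of IN_MULTICAST(daddr) for IPv4 addresses.
	if addr.Is4() && addr.IsMulticast() {
		return identityWorld, nil
	}
	if id, ok := cache[addr.String()]; ok {
		return id, nil
	}
	// Assumption of this sketch: unknown destinations count as WORLD.
	return identityWorld, nil
}

// egressGatewayEligible mirrors the idea behind the cluster-identity
// check: traffic to in-cluster destinations skips the egress gateway.
func egressGatewayEligible(id identity) bool {
	return id != identityCluster
}
```

In this model a group address such as 232.1.2.3 stays eligible for the egress gateway even if a stale or overlapping cache entry would have classified it as in-cluster, which is the failure mode the carve-out is meant to rule out.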
Happy to iterate before implementation begins.