Feature gap: DataPlane interface contract is BPF-shaped; userspace Manager embeds eBPF Manager #1381

@psaab

Gap

The dataplane.DataPlane interface (pkg/dataplane/dataplane.go:104-300) is BPF-shaped: ~80 methods are direct BPF map writers (SetZone, SetPolicyRule, SetSNATRule, SetDNATEntry, SetNAT64Config, SetScreenConfig, SetMirrorConfig, SetPolicerConfig, SetFlowConfig, etc., plus the matching ClearXxx / DeleteStaleXxx / ReadXxxCounter methods). The userspace Manager (pkg/dataplane/userspace/manager.go:57-123) embeds dataplane.DataPlane, so it satisfies the interface only through the embedded eBPF dataplane.Manager.

As a result, every config write from the daemon still flows through bpf_map_update_elem even when running on userspace-dp, although the userspace XDP shim doesn't consult those maps. After Phase 4 (BPF source removal) those maps no longer exist, and every embedded method becomes either undefined or a compile failure.

This isn't a forwarding feature gap. It's the plumbing concern that blocks Phase 3, and all Phase 0–2 work must eventually surface to it.

eBPF implementation (source of truth)

  • pkg/dataplane/dataplane.go:104-300 — full DataPlane interface, 80+ methods, every one BPF-shaped
  • pkg/dataplane/dataplane.go:13: var _ DataPlane = (*Manager)(nil) — eBPF Manager satisfies it directly
  • pkg/dataplane/userspace/manager.go:57-58:
    type Manager struct {
        dataplane.DataPlane     // <- this embed
        inner *dataplane.Manager
        // ... (remaining fields not shown)
    }
  • pkg/dataplane/userspace/manager.go:148-156: New() returns &Manager{ DataPlane: inner, inner: inner, ... } (the embed IS the eBPF manager)
  • pkg/dataplane/userspace/snapshot.go:458-472: userspaceMapPins() references BPF pin paths the userspace helper opens (Ctrl, Bindings, Heartbeat, XSK redirect map, conntrack v4/v6, DNAT tables, Trace)

Userspace-dp gap

pkg/dataplane/userspace/manager.go adds only a handful of methods of its own (Mode, EventStream, Status, SetForwardingArmed, InjectPacket, DrainSessionDeltas, etc.); the other ~75 methods come from the embedded eBPF type. There is no standalone DataPlane implementation for userspace.
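To make the embedding problem concrete, here is a minimal sketch of the same shape with a two-method interface. All type names (ebpfManager, userspaceManager), the SetZone signature, and the mapWrites field are hypothetical stand-ins rather than the real code; the point is only that a call the userspace type does not override lands on the embedded eBPF value.

```go
package main

import "fmt"

// DataPlane is a tiny stand-in for the real interface; only two of the
// ~80 methods are modelled, with assumed signatures.
type DataPlane interface {
	SetZone(name string, id uint16) error
	Mode() string
}

// ebpfManager is a hypothetical stand-in for the eBPF dataplane.Manager.
type ebpfManager struct {
	mapWrites []string // records what would be bpf_map_update_elem calls
}

func (m *ebpfManager) SetZone(name string, id uint16) error {
	m.mapWrites = append(m.mapWrites, fmt.Sprintf("zone_map[%s]=%d", name, id))
	return nil
}

func (m *ebpfManager) Mode() string { return "ebpf" }

// userspaceManager mirrors the shape quoted in the issue: it embeds the
// interface and fills the embed with the eBPF manager.
type userspaceManager struct {
	DataPlane
	inner *ebpfManager
}

func newUserspaceManager() *userspaceManager {
	inner := &ebpfManager{}
	return &userspaceManager{DataPlane: inner, inner: inner}
}

// Mode is one of the few methods the userspace type defines itself.
func (m *userspaceManager) Mode() string { return "userspace" }

// Compile-time check: satisfied only because of the embed.
var _ DataPlane = (*userspaceManager)(nil)

func main() {
	m := newUserspaceManager()
	_ = m.SetZone("trust", 1) // promoted method: runs on the eBPF manager
	fmt.Println(m.Mode(), m.inner.mapWrites)
}
```

In this sketch, deleting ebpfManager leaves nothing to put in the embed. A nil embedded interface still compiles, but the promoted SetZone call then panics at run time, and any code that constructs or names the removed type fails to build.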

Recommended fix

The Phase 3 plan needs to address this before Phase 4 can land. Two reasonable approaches:

  1. Split the interface: define ConfigSink (high-level: ApplyConfig(*config.Config) error) and ControlPlane (read-only: Status, Counters, Sessions). Move BPF-specific methods onto *ebpf.Manager directly, not the abstract interface. Userspace Manager implements only the new high-level interface.

  2. Stub out BPF writers in userspace: keep the DataPlane interface name but replace every Set*/Clear*/Delete* method on userspace Manager with a no-op (or no-op-with-snapshot-mutation). Daemon code that builds BPF entries continues to compile but is dead code; remove at Phase 3.
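As a rough illustration of approach (1), the sketch below defines the two narrow interfaces and a userspace type that implements only those. ConfigSink, ControlPlane, ApplyConfig, and Status come from the text above; the Config and Status struct fields and the userspaceManager body are assumptions made for this example, and Counters and Sessions are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical stand-in for *config.Config.
type Config struct {
	Zones []string
}

// ConfigSink is the high-level write side named in approach (1).
type ConfigSink interface {
	ApplyConfig(cfg *Config) error
}

// Status is a hypothetical read-only snapshot of backend state.
type Status struct {
	Backend string
	Zones   int
}

// ControlPlane is the read-only side. The issue also lists Counters and
// Sessions; only Status is modelled here.
type ControlPlane interface {
	Status() Status
}

// userspaceManager implements only the narrow interfaces, with no
// BPF-shaped Set*/Clear* methods at all.
type userspaceManager struct {
	zones []string
}

func (m *userspaceManager) ApplyConfig(cfg *Config) error {
	if cfg == nil {
		return errors.New("nil config")
	}
	// Copy so later mutation of cfg does not change applied state.
	m.zones = append([]string(nil), cfg.Zones...)
	return nil
}

func (m *userspaceManager) Status() Status {
	return Status{Backend: "userspace", Zones: len(m.zones)}
}

// Compile-time checks that the type satisfies both interfaces.
var (
	_ ConfigSink   = (*userspaceManager)(nil)
	_ ControlPlane = (*userspaceManager)(nil)
)

func main() {
	m := &userspaceManager{}
	if err := m.ApplyConfig(&Config{Zones: []string{"trust", "untrust"}}); err != nil {
		fmt.Println("apply failed:", err)
		return
	}
	fmt.Printf("%+v\n", m.Status())
}
```

Under this shape the daemon would hand over a whole config and never name a BPF map, which is what lets the BPF-specific writers move onto the eBPF type alone.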

(1) is cleaner; (2) is faster to land. Either way, this issue is the central plumbing problem Phase 3 must resolve before Phase 4 can rm -rf bpf/ without orphaning ~80 interface methods.
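For comparison, approach (2) might look roughly like the following. The three-method interface, the method signatures, and the zones map standing in for snapshot state are all hypothetical; the sketch only shows the two stub flavours mentioned above, a no-op with snapshot mutation and a pure no-op.

```go
package main

import "fmt"

// DataPlane is a three-method stand-in for the real ~80-method interface.
// The signatures are assumptions for this sketch.
type DataPlane interface {
	SetZone(name string, id uint16) error
	ClearZones() error
	SetMirrorConfig(ifindex int, rate uint32) error
}

// userspaceManager no longer embeds anything; it stubs each method itself.
type userspaceManager struct {
	zones map[string]uint16 // hypothetical snapshot state for the helper
}

func newUserspaceManager() *userspaceManager {
	return &userspaceManager{zones: map[string]uint16{}}
}

// SetZone is a "no-op with snapshot mutation": no BPF map is touched,
// but the in-memory snapshot is updated.
func (m *userspaceManager) SetZone(name string, id uint16) error {
	m.zones[name] = id
	return nil
}

// ClearZones resets the snapshot state, again without any BPF map access.
func (m *userspaceManager) ClearZones() error {
	m.zones = map[string]uint16{}
	return nil
}

// SetMirrorConfig is a pure no-op: the call compiles and succeeds, and
// nothing is recorded.
func (m *userspaceManager) SetMirrorConfig(ifindex int, rate uint32) error {
	return nil
}

// Compile-time check that the stubbed type still satisfies DataPlane.
var _ DataPlane = (*userspaceManager)(nil)

func main() {
	var dp DataPlane = newUserspaceManager()
	_ = dp.SetZone("trust", 1)
	_ = dp.SetMirrorConfig(3, 100)
	fmt.Println(len(dp.(*userspaceManager).zones))
}
```

The daemon-side callers keep compiling unchanged, which is why this route is faster to land, but every stub is a method that silently does nothing and has to be tracked until it is removed.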

The userspace XDP shim itself (userspace-xdp/src/lib.rs) is unaffected — it stays. The XSK redirect map, userspace_bindings/ctrl/heartbeat/trace pins, and conntrack pins it uses are the boundary the user-space helper crosses regardless. Only the legacy xdp_main / xdp_screen / xdp_zone / xdp_policy etc. and their associated BPF maps are retired.

Blocker for #1373

This is the Phase 3 blocker. Without splitting or stubbing this interface, pkg/dataplane/*.go cannot shrink to userspace-only and Phase 4 cannot proceed.


Refined contract (added 2026-05-17 after triple-review of #1383)

See docs/pr/1381-dataplane-interface-split/plan.md for the full implementation contract refined through 4 rounds of Claude+Codex+Gemini Pro 3 review. New since the original issue body:

  • Import-cycle avoidance contract: SessionDeltaSource interface lives in pkg/dataplane/runtime (NOT pkg/dataplane/userspace). DTOs (SessionDelta, SessionDeltaSnapshot) live in the same neutral runtime package. Public interface must not reference any pkg/dataplane/userspace type. Backed by an import canary test.
  • ApplyResult widened with FilterIDs map[string]uint32, FilterSpans map[string]FilterCounterSpan (FilterID, RuleStart, RuleCount), NATCounterIDs map[string]uint32 — required by current server_show_firewall.go:41, cli_show_nat.go:352, cli_show_security.go:1398 callers.
  • GC migration side-effect ownership explicitly assigned: session-change telemetry (GlobalCtrSessionsNew/Closed) → Telemetry.GlobalCounter; per-IP session-limit map publish → backend-private; persistent-NAT preservation → SessionStore.Delete* atomic or BeforeDelete hook; DNAT/NAT64 reverse cleanup → backend session/NAT store owns it; v4/v6 stats counting → backend-neutral.
  • Cluster stale-reconcile path at pkg/cluster/sync.go:556,571,599 must use same SessionStore.DeleteWithCompanionsV4/V6 / ReconcileClusterBulk semantics as GC. Phase 1 acceptance gate requires tests that fail if cluster sync keeps a local DeleteDNATEntry* cleanup copy.
  • Architectural-mismatch rationale added: it explicitly distinguishes this plan from the killed patterns in #961 "Extract PacketContext: explicit per-packet ownership state machine" (PacketContext), #946 "Refactor: Pipeline / Chain of Responsibility Pattern for Vector Packet Processing" Phase 2 (batched pipeline), and #964 "Refactor: SessionTable is a 'Memory Allocator' — Missing Multi-Index Data-Oriented Design" Step 3 (cache-line packing), so reviewers don't reject by reflex.
  • Compatibility matrix (REST/gRPC/CLI/cluster sync/metrics/config-only mode) + Risks section (stale handles, lifetime ordering, atomic apply visibility, replay windows, backend escape hatches, failure attribution) + Phase 1 acceptance gate with explicit go/no-go criteria.
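The widened ApplyResult from the list above could be sketched as follows. The field names and map types are taken from the bullet; the uint32 field types inside FilterCounterSpan, the ruleCounterIndex helper, and the reading of RuleStart as the first counter slot of a filter are assumptions for illustration, not the contract in plan.md.

```go
package main

import "fmt"

// FilterCounterSpan carries the three fields named in the issue.
// The uint32 types are an assumption.
type FilterCounterSpan struct {
	FilterID  uint32
	RuleStart uint32
	RuleCount uint32
}

// ApplyResult holds the handles that show/CLI callers need after an apply.
type ApplyResult struct {
	FilterIDs     map[string]uint32
	FilterSpans   map[string]FilterCounterSpan
	NATCounterIDs map[string]uint32
}

// ruleCounterIndex is a hypothetical helper: it maps a filter name and a
// zero-based rule offset to a counter slot, assuming RuleStart is the
// first slot belonging to that filter.
func (r ApplyResult) ruleCounterIndex(filter string, rule uint32) (uint32, bool) {
	span, ok := r.FilterSpans[filter]
	if !ok || rule >= span.RuleCount {
		return 0, false
	}
	return span.RuleStart + rule, true
}

func main() {
	res := ApplyResult{
		FilterIDs: map[string]uint32{"protect-re": 7},
		FilterSpans: map[string]FilterCounterSpan{
			"protect-re": {FilterID: 7, RuleStart: 10, RuleCount: 3},
		},
		NATCounterIDs: map[string]uint32{"snat-pool-1": 4},
	}
	idx, ok := res.ruleCounterIndex("protect-re", 2)
	fmt.Println(idx, ok)
}
```

Returning these handles from the apply call, rather than letting callers read backend maps, is one way the show commands could stay backend-neutral.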
