Request
Chris Putnam wants an easy, integrated edge-gateway feature set:
multi-WAN with failover
DHCP leases registered into DNS, with DNS cleanup on lease expiry/reassignment
DNS proxy/forwarder with secure upstream DNS over TLS or HTTPS, and secure downstream service where appropriate
easy WireGuard setup
policy-based routing for multi-WAN
fq_codel/CAKE-like smart queueing that is easy to enable or default-on where safe
This issue expands that request into an implementation plan and calls out what xpf already has versus what is still missing.
Current repo status
Already present or partially present
HA chassis/failover foundation exists. README documents chassis cluster failover, VRRP, session sync, dual fabric links, fabric cross-chassis forwarding, and ISSU at README.md:111-124. HA validation docs exercise failover and traffic continuity at testing-docs/ha-cluster.md:1-45 and userspace RG move validation at testing-docs/ha-cluster.md:139-150.
Routing instances, VRFs, and PBR exist. README lists VRFs, inter-VRF leaking, GRE/XFRM/PBR at README.md:103-107; feature-gap docs summarize static routes, ECMP, VRFs, GRE/IPIP, rib-groups, next-table route leaking, and PBR at docs/feature-gaps.md:250-252. The kernel/eBPF compiler maps firewall-filter then routing-instance into route action/table IDs at pkg/dataplane/compiler_filter.go:363-371. Userspace snapshot generation imports routing instances and synthetic inter-VRF leak routes at pkg/dataplane/userspace/snapshot.go:1059-1075.
DHCP server exists. Kea-backed DHCPv4/v6 config rendering exists in pkg/dhcpserver/dhcpserver.go:183-258 and pkg/dhcpserver/dhcpserver.go:273-329; pool config already models DNS servers, lease time, and domain at pkg/config/types.go:1004-1014. Historical commits include dc3f1621 (DHCP server support), feb927fe (RA + DHCPv6), and 6de466e7 (DHCP lease display via CLI/gRPC).
DHCP lease DNS updates are already tracked separately. See #1387, "DHCP server: dynamic DNS updates and stale lease cleanup". That issue should remain the detailed DDNS implementation tracker.
Router Advertisement can advertise recursive DNS. RA config carries DNSServers at pkg/config/types.go:1655-1666, and RA sender builds RDNSS options in pkg/ra/sender.go.
DNS proxy syntax/plan exists, but the runtime is missing. #660, "dns-proxy: replace systemd-resolved toggle with real firewall DNS proxy runtime", was the earlier DNS proxy runtime tracker, and docs/next-features/dns-proxy.md:1-80 records the plan. However, current config still warns that system services dns dns-proxy has no real runtime at pkg/config/compiler.go:613-615, and docs/feature-gaps.md:407 still marks DNS proxy as missing.
WireGuard has architecture research but no implementation. docs/vpp-dataplane-assessment.md:716-849 evaluates kernel WireGuard + TC/veth, userspace WireGuard + AF_XDP, VPP WG, and DPDK-integrated WG. It recommends kernel WG + veth/XDP as the pragmatic first step, but there is no production config/runtime code for WireGuard today.
fq_codel is not implemented as an xpf feature. Current docs mention Linux tc-fq/tc-fq_codel as external/test-host mitigations (docs/per-5-tuple/even-flows-recipe.md:97-100, docs/cross-worker-flow-fairness-research.md:636-641), but there is no first-class WAN smart-queueing config/runtime. Userspace CoS has MQFQ/DRR/fairness work, but that is not the same as operator-facing fq_codel/CAKE-style AQM on WAN egress.
Product direction
This should be treated as an "edge gateway profile" rather than six unrelated knobs. The operator should be able to turn up a multi-WAN site with sane defaults:
define WAN uplinks and health probes;
define LANs/prefixes and DHCP pools;
enable local DNS service and DDNS registration;
optionally enable WireGuard remote/site access;
attach policies that select WANs by source, destination, app, or routing-instance;
get smart queueing automatically where the link is shaped or measured.
The CLI should expose the underlying knobs for advanced users, but the primary workflow should be short and hard to misconfigure.
Proposed work plan
Phase 1: Multi-WAN service model and health state
Add a first-class multi-WAN model above raw routing instances:
set services multi-wan uplink wan-a interface reth0.100
set services multi-wan uplink wan-a routing-instance ISP-A
set services multi-wan uplink wan-a priority 100
set services multi-wan uplink wan-a weight 1
set services multi-wan uplink wan-a health-check target 1.1.1.1 protocol icmp
set services multi-wan uplink wan-b interface reth0.200
set services multi-wan uplink wan-b routing-instance ISP-B
set services multi-wan policy default mode failover primary wan-a backup wan-b
Implementation notes:
Reuse existing routing instances/PBR instead of inventing a parallel forwarding plane.
Reuse/extend RPM probes for health checks where possible.
Keep health state separate from config so route changes are event-driven and observable.
Add hysteresis: consecutive-success/consecutive-fail thresholds, hold-down, min-up-time, and flap counters.
Add show services multi-wan with uplink state, selected primary, failover reason, last probe result, active sessions, and route/PBR installs.
HA behavior: only the active RG owner should publish live route/NAT changes for that RG; failover must reconstruct selected uplinks from health state or immediately re-probe.
Acceptance:
Losing primary WAN probe removes that uplink from selection within a bounded time.
Existing sessions either stay pinned when possible or are explicitly drained/reset with visible reason counters.
New sessions use backup WAN after failover.
Recovery honors hold-down before moving traffic back.
Phase 2: Policy-based routing for multi-WAN
Build a friendly policy layer on top of existing filter then routing-instance support:
set services multi-wan policy video match application [ zoom teams ] uplink wan-a
set services multi-wan policy backup match source-prefix 10.10.50.0/24 uplink wan-b
set services multi-wan policy default mode load-balance uplinks [ wan-a wan-b ] algorithm weighted-flow-hash
Implementation notes:
Compile policies into existing firewall-filter/PBR/routing-instance primitives where possible.
For userspace dataplane, carry the selected WAN/uplink in the session so return path, SNAT, and failover behavior are stable.
NAT must bind to the chosen egress uplink/pool and remain sticky for session lifetime.
Phase 3: DHCP DDNS integration
Use #1387 as the detailed tracker. This umbrella issue depends on it for:
DHCPv4/v6 active lease watcher over Kea lease CSV/state.
hostname/FQDN normalization.
A/AAAA/PTR creation.
cleanup on expiry, release, reassignment, and pool removal.
xpf-owned-record state store so we never delete records we did not create.
HA-aware emission only from the active DHCP-serving owner.
Multi-WAN-specific requirements:
DDNS must work for LANs behind any WAN uplink.
DNS update transport should be able to choose an upstream routing-instance/uplink if the DNS server is reachable only through a specific WAN.
DDNS failures must not block DHCP leases by default; they must surface in counters/status and retry.
Phase 4: Real DNS proxy with secure upstream/downstream
Extend the #660 / docs/next-features/dns-proxy.md plan into a secure DNS runtime:
Config sketch:
set system services dns dns-proxy listen-interface reth1.0
set system services dns dns-proxy default-domain home.example
set system services dns dns-proxy upstream quad9 address 9.9.9.9 protocol dot tls-server-name dns.quad9.net
set system services dns dns-proxy upstream cloudflare url https://cloudflare-dns.com/dns-query protocol doh
set system services dns dns-proxy cache-size 10000
set system services dns dns-proxy downstream plain enable
set system services dns dns-proxy downstream dot enable certificate local-dns-cert
set system services dns dns-proxy downstream doh enable certificate local-dns-cert path /dns-query
Implementation notes:
First runtime target should probably still be managed unbound for plain DNS + DoT upstream and caching. DoH upstream/downstream may need CoreDNS, dnsdist, or a small purpose-built proxy if unbound is insufficient for the full product goal.
Separate host resolver behavior from client-facing firewall DNS proxy behavior. Do not use systemd-resolved as the fake DNS proxy runtime.
Bind listeners per interface/routing-instance. Avoid accidentally exposing DNS on WAN.
Support ACLs/default-deny by source prefix/zone/interface.
Add certificate management integration for downstream DoT/DoH, or clearly require an existing local certificate object for phase 1.
Acceptance:
Clients can query firewall DNS on intended LAN interfaces.
Upstream forwarding works over plain DNS and DoT in phase 1; DoH is either implemented or explicitly tracked as phase 2.
Downstream secure DNS is either DoT first or DoH first, with a documented cert path and access control.
DNS proxy honors routing-instance/uplink selection for upstreams.
HA failover moves listener ownership cleanly.
Phase 5: Easy WireGuard setup
Use the existing WireGuard architecture decision in docs/vpp-dataplane-assessment.md:716-849.
Recommended phase-1 architecture:
Kernel WireGuard for crypto.
veth/tun plumbing so decrypted traffic re-enters xpf policy/routing/NAT path.
Zone binding for the decrypted WireGuard side.
XDP/userspace dataplane still handles physical NIC traffic and encrypted UDP on WAN.
Config sketch:
set services wireguard interface wg0 listen-port 51820
set services wireguard interface wg0 address 10.44.0.1/24
set services wireguard interface wg0 wan-uplink wan-a
set services wireguard peer phone public-key <key> allowed-address 10.44.0.2/32
set services wireguard peer phone preshared-key <secret>
set services wireguard peer phone route-mode remote-access
Operator tooling:
Generate server keypair.
Add peer with generated client config/QR output.
Optional dynamic endpoint/roaming peer support.
Automatic firewall policy template for remote-access and site-to-site modes.
Optional multi-WAN endpoint failover: publish endpoint DNS, prefer current healthy uplink, support keepalive.
Acceptance:
show services wireguard shows interface/peer handshake/bytes/endpoint/allowed IPs.
Remote-access peer can reach selected LAN through xpf policy.
Site-to-site peer can route prefixes through selected WAN/uplink.
Multi-WAN failover either preserves tunnel via endpoint roaming or reconnects within documented bounds.
Phase 6: Smart queueing / fq_codel-like default
The product ask is "fq_codel or similar easily enabled or on by default." For xpf that needs two paths:
Kernel path / management interfaces: render a Linux tc qdisc (fq_codel or cake) when traffic actually egresses through a kernel qdisc.
Userspace dataplane / AF_XDP path: implement native AQM in the userspace CoS queues because Linux qdisc is bypassed.
Config sketch:
set class-of-service smart-queueing enable
set class-of-service smart-queueing default-profile wan
set class-of-service smart-queueing profile wan algorithm fq-codel
set class-of-service smart-queueing profile wan target 5ms
set class-of-service smart-queueing profile wan interval 100ms
set class-of-service smart-queueing profile wan ecn
set interfaces reth0 unit 100 family inet smart-queueing profile wan
Implementation notes:
For shaped WANs, default smart queueing should be enabled unless the operator disables it. For unshaped high-speed LAN/datacenter interfaces, default-off or passive telemetry-only is safer.
Userspace path should build on existing CoS per-flow fairness buckets, adding sojourn-time tracking and CoDel-style ECN/drop decisions at admission/dequeue.
Preserve current CoS exact guarantees: AQM cannot let best-effort steal from exact queues.
Add a show class-of-service smart-queueing command.
Suggested implementation slices
WireGuard kernel+veth MVP: key/config renderer, wg lifecycle, zone/interface binding, show command.
Smart queueing kernel path: tc qdisc renderer for non-AF_XDP egress.
Smart queueing userspace path: native CoDel/AQM in CoS queues, with metrics and validation harness.
Integrated edge-gateway validation: one lab profile with two WANs, DHCP+DDNS, DNS proxy, WG remote client, PBR, failover, and smart queueing under load.
Validation matrix
Single WAN baseline: DHCP, DNS proxy, DDNS, smart queueing, WireGuard all work.
Dual WAN healthy: policy chooses expected uplink; weighted mode distributes new flows by hash/weight.
Primary WAN failure: new sessions use backup; DNS upstream and DDNS update path continue if reachable.
Primary WAN recovery: hold-down prevents flap; optional preempt moves back only after stable.
DHCP lease expires/reassigns: DNS records cleaned.
DNS upstream DoT failure: fallback upstream selected; counters show failure.
WireGuard peer during WAN failover: reconnect behavior within documented bound.
Bufferbloat test: bulk download/upload under shape with ping/TCP echo latency gate.
HA RG failover during multi-WAN traffic: active owner owns DNS/DHCP/WG/listeners and route state.
Open design questions
Should multi-WAN be expressed as a new high-level services multi-wan tree, or as templates that compile into existing routing-instances/firewall filters/NAT rules?
Do we want DoT-only first for upstream secure DNS, or is DoH a phase-1 requirement?
Which downstream secure DNS mode matters first: DoT on 853, DoH on 443, or both?
For WireGuard, should phase 1 use TC on wg0 or veth re-entry into the xpf pipeline? Existing docs recommend veth/XDP for fuller pipeline reuse.
For smart queueing, should default-on apply only when an interface has an explicit shape, or should we infer shape from measured WAN speed?
How should smart queueing interact with current CoS exact/surplus/equal-flow enforcement when all are enabled?