feat(config): enable netfilter, bridge, VLAN and VXLAN for container networking#6

Merged
appcypher merged 2 commits into krunfw from appcypher/enable-container-networking
Apr 26, 2026

@appcypher appcypher commented Apr 23, 2026

Summary

  • Enable the kernel-side primitives that Linux container runtimes (Docker, Podman, CNI) and Tailscale expect to find in the guest: netfilter core + conntrack, full IPv4/IPv6 iptables (filter/mangle/nat) and nftables (inet + bridge families) with MASQUERADE/REDIRECT/REJECT, ipset, and a focused set of xtables matches and targets.
  • Turn on CONFIG_BRIDGE + CONFIG_BRIDGE_NETFILTER so docker0-style bridges can have iptables/nftables rules applied to bridged traffic, plus CONFIG_NF_CONNTRACK_BRIDGE for modern userland-proxy-free port publishing.
  • Add L2 overlay/segmentation drivers: CONFIG_VLAN_8021Q (802.1Q tagging) and CONFIG_VXLAN (L2-over-UDP overlay used by Docker Swarm, Flannel, Weave and Cilium).
  • Apply the same block identically to all three architecture configs (config-libkrunfw_x86_64, _aarch64, _riscv64) so guest behavior is uniform across builds.
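The uniformity claim in the last bullet can be checked mechanically. The sketch below is one way to do that, not part of the PR: `parse_config` and `divergent_options` are hypothetical helper names, and treating an option that is absent from a config as `n` is an assumption. It reports any option whose value differs between architecture configs.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_* name found in a kconfig file to its value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def divergent_options(configs, names):
    """Return {option: {arch: value}} for options that differ across configs.

    `configs` maps an architecture label to the text of its config file.
    Options missing from a config are treated as "n" (an assumption).
    """
    parsed = {arch: parse_config(text) for arch, text in configs.items()}
    result = {}
    for name in names:
        values = {arch: opts.get(name, "n") for arch, opts in parsed.items()}
        if len(set(values.values())) > 1:
            result[name] = values
    return result
```

In practice the three texts would be read from config-libkrunfw_x86_64, config-libkrunfw_aarch64 and config-libkrunfw_riscv64, and `names` would be the list of options added by this PR. An empty result means the block is uniform.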

Motivation: the previous configs left CONFIG_NETFILTER disabled (# CONFIG_NETFILTER is not set), which prevents any container runtime, and Tailscale outside userspace-networking mode, from working inside the guest. Since CONFIG_MODULES is off, every option is built-in (=y), and make olddefconfig resolves any remaining dependencies at build time. The curated scope is intentionally tighter than a full netfilter dump: legacy helpers (FTP/IRC/H323 ALGs), flow offload, SYNPROXY/TPROXY, IPVS, XFRM-backed matches, raw/security tables, and most niche xtables matches are omitted to keep the size increase of the embedded kernel bundle modest (an estimated ~400–700 KB of added kernel text baked into the shipped .so/.dylib).
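Because make olddefconfig also silently resets symbols whose dependencies are not met, it can be worth comparing the requested options against the final .config after the build. The following is a minimal sketch of such a check; `enabled_options` and `dropped_options` are hypothetical helpers, not part of the libkrunfw build.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names that are built in (=y)."""
    enabled = set()
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.startswith("CONFIG_") and value == "y":
            enabled.add(name)
    return enabled


def dropped_options(requested, final_config_text):
    """List requested options that are not =y in the final config.

    A non-empty result suggests olddefconfig discarded the option,
    typically because one of its dependencies is not enabled.
    """
    return sorted(set(requested) - enabled_options(final_config_text))
```

Any name returned here is an option that was asked for in the architecture config but did not survive dependency resolution, so it would be absent from the built kernel.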

Test Plan

  • Native build succeeds on x86_64: make -j"$(nproc)" produces libkrunfw.so.5.2.1 without kconfig errors from make olddefconfig.
  • Native build succeeds on aarch64: same command on ubuntu-24.04-arm produces the aarch64 .so.
  • Cross-build succeeds for riscv64 via the existing cross-build flow.
  • Resulting .so size delta vs. the previous build is within expected range (~400–700 KB of added kernel text).
  • Boot a guest with the new kernel via libkrun and confirm the netfilter stack is live: /proc/net/nf_conntrack exists, iptables -L -n works (via iptables-nft compat), and nft list ruleset works.
  • Docker smoke test in the guest: dockerd starts without errors, default docker0 bridge comes up, docker run --rm -p 8080:80 nginx publishes a port (exercises MASQUERADE + bridge-netfilter + conntrack).
  • VXLAN smoke test: ip link add vxlan0 type vxlan id 42 dev eth0 dstport 4789 succeeds.
  • VLAN smoke test: ip link add link eth0 name eth0.100 type vlan id 100 succeeds.
  • Tailscale smoke test (optional): tailscaled in default mode successfully programs ts-input/ts-forward/ts-postrouting chains; tailscale up brings the node online.
  • Release workflow produces all four artifacts (libkrunfw-linux-x86_64.so.5.2.1, libkrunfw-linux-aarch64.so.5.2.1, libkrunfw-macos-x86_64.5.dylib, libkrunfw-macos-aarch64.5.dylib) on a test release tag.
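The in-guest checks above could be scripted along these lines. This is only a sketch: the command strings are copied from the test plan (including the eth0 interface name), while `run_command` and `run_smoke_tests` are hypothetical helpers. It assumes Python and the relevant tools are available in the guest and that it runs as root.

```python
import shlex
import subprocess

# Commands taken from the test plan; eth0 is the interface name used there.
SMOKE_COMMANDS = [
    "iptables -L -n",
    "nft list ruleset",
    "ip link add vxlan0 type vxlan id 42 dev eth0 dstport 4789",
    "ip link add link eth0 name eth0.100 type vlan id 100",
]


def run_command(command):
    """Run one command and return True if it exits with status 0."""
    try:
        result = subprocess.run(
            shlex.split(command), capture_output=True, text=True
        )
    except OSError:
        # The tool is missing or not executable in this guest image.
        return False
    return result.returncode == 0


def run_smoke_tests(commands=SMOKE_COMMANDS, runner=run_command):
    """Return {command: passed} for every command, in order."""
    return {command: runner(command) for command in commands}
```

Calling `run_smoke_tests()` inside the guest and printing the resulting dictionary gives a pass/fail line per check. The two `ip link add` commands leave vxlan0 and eth0.100 behind, so a repeated run would need them deleted first (for example with `ip link del`).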

…networking

Turn on the kernel primitives needed for container runtimes (Docker,
Podman, CNI) and Tailscale to function inside the guest. Applied to all
three architecture configs (x86_64, aarch64, riscv64) so behavior is
uniform across builds.

Netfilter:
- Core framework, conntrack, NAT (with REDIRECT and MASQUERADE)
- nftables engine with inet and bridge families, plus CT/LOG/LIMIT/
  MASQ/REDIR/NAT/REJECT/COMPAT verbs and bridge conntrack
- Legacy iptables/ip6tables (filter, mangle, nat) with REJECT,
  MASQUERADE and REDIRECT targets
- xtables matches (addrtype, comment, conntrack, limit, multiport,
  state) and shared MARK/LOG/CHECKSUM/SET targets
- ipset with the four common hash shapes (ip, ipport, net, netport)

L2 networking:
- CONFIG_BRIDGE + CONFIG_BRIDGE_NETFILTER so docker0-style bridges
  can have iptables/nftables rules applied to bridged traffic
- CONFIG_VLAN_8021Q for 802.1Q VLAN tagging
- CONFIG_VXLAN for L2-over-UDP overlay networks used by Docker Swarm,
  Flannel, Weave and Cilium

Since CONFIG_MODULES is off, every option is built-in (=y); olddefconfig
will resolve any remaining dependencies at build time.
@ddelbondio

Following up on the discussion in superradcompany/microsandbox#598:
These flags are required according to https://github.com/moby/moby/blob/master/contrib/check-config.sh

CONFIG_IP_NF_RAW=y
CONFIG_IP6_NF_RAW=y
CONFIG_NETFILTER_XT_MATCH_IPVS=y
CONFIG_POSIX_MQUEUE=y
CONFIG_NF_CT_NETLINK=y

All other flags in my local config are already included in the PR. I quickly tested the config changes locally, and Docker seems to run fine.

…SIX mqueue

Round out the netfilter surface and enable POSIX message queues:

- CONFIG_IP_NF_RAW + CONFIG_IP6_NF_RAW for the raw table (NOTRACK
  rules and pre-conntrack mangling)
- CONFIG_NF_CT_NETLINK so userspace tools (conntrack-tools, conntrackd,
  systemd-networkd, libnetfilter_conntrack) can read and modify the
  conntrack table over netlink
- CONFIG_NETFILTER_XT_MATCH_IPVS for matching IPVS connections via
  iptables/nftables
- CONFIG_POSIX_MQUEUE on x86_64 (already on for aarch64 and riscv64)
  for POSIX message queue IPC

Applied uniformly across all three architecture configs where relevant.
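Applying the same set of options to a config where some are already on (as with CONFIG_POSIX_MQUEUE on aarch64 and riscv64) needs an idempotent edit. The sketch below illustrates one approach; `apply_builtin_options` is a hypothetical helper and not how the commit was produced. The kernel tree ships scripts/kconfig/merge_config.sh for this kind of merge.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=.*$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def apply_builtin_options(config_text, options):
    """Return config_text with every option in `options` set to =y.

    Existing "CONFIG_X=..." or "# CONFIG_X is not set" lines are rewritten
    in place; options that do not appear at all are appended at the end.
    """
    wanted = list(dict.fromkeys(options))  # de-duplicate, keep order
    seen = set()
    out = []
    for line in config_text.splitlines():
        match = _SET.match(line) or _UNSET.match(line)
        if match and match.group(1) in wanted:
            out.append(f"{match.group(1)}=y")
            seen.add(match.group(1))
        else:
            out.append(line)
    out.extend(f"{name}=y" for name in wanted if name not in seen)
    return "\n".join(out) + "\n"
```

Running the function twice yields the same text, so it is safe to apply to all three architecture configs regardless of which options each one already has. The result would still need to go through make olddefconfig so that dependencies are resolved.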
@appcypher appcypher merged commit 502b728 into krunfw Apr 26, 2026
@appcypher appcypher deleted the appcypher/enable-container-networking branch April 26, 2026 13:21