- envs
- replicas
- nodeCount
- microservicesPerEnv
- totalThroughputGbps
- configObjectCount
- churnFactor
- connectionsPerPod
- l7TrafficRatio
totalPods = microservicesPerEnv * replicas * envs
connectionsPerNode = (connectionsPerPod * totalPods) / nodeCount
- Each pod maintains connectionsPerPod connections
- Connections are evenly distributed across nodes
throughputPerNodeGbps = totalThroughputGbps / nodeCount
- Assumes even traffic distribution across worker nodes
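The derived per-node inputs above can be sketched in Python as follows. The function name `per_node_load` and the snake_case parameter names are illustrative choices, not part of the model; the arithmetic follows the three formulas as written.

```python
def per_node_load(envs, replicas, microservices_per_env,
                  connections_per_pod, total_throughput_gbps, node_count):
    """Return (total_pods, connections_per_node, throughput_per_node_gbps).

    Assumes, as the model does, that connections and traffic are spread
    evenly across worker nodes.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    total_pods = microservices_per_env * replicas * envs
    connections_per_node = connections_per_pod * total_pods / node_count
    throughput_per_node_gbps = total_throughput_gbps / node_count
    return total_pods, connections_per_node, throughput_per_node_gbps


# 400 microservices x 2 replicas x 1 env on 10 nodes, 20 connections per pod, 10 Gbps
pods, conns, gbps = per_node_load(1, 2, 400, 20, 10, 10)
```

If load is known to be skewed, the even-distribution assumption understates the busiest node, so the per-node figures are best read as averages.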
ztunnelCpuPerNode = MAX(
connectionsPerNode / 2000,
throughputPerNodeGbps * 0.8
)
- connectionsPerNode / 2000 → empirical: ~1 vCPU handles ~2000 active L4 connections
- throughputPerNodeGbps * 0.8 → network-bound cost (~0.8 vCPU per Gbps with mTLS)
- MAX() → ztunnel is constrained by whichever is higher: connections or throughput
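A minimal sketch of the MAX() rule, which also reports which constraint is binding. The constant names and the returned label are assumptions added for illustration; the two coefficients are the ones stated above.

```python
CONNECTIONS_PER_VCPU = 2000  # model's empirical figure for active L4 connections
VCPU_PER_GBPS = 0.8          # model's network-bound cost with mTLS


def ztunnel_cpu_per_node(connections_per_node, throughput_per_node_gbps):
    """Return (vcpu, binding) where binding names the larger of the two costs."""
    by_connections = connections_per_node / CONNECTIONS_PER_VCPU
    by_throughput = throughput_per_node_gbps * VCPU_PER_GBPS
    # Ties are reported as "connections"; the CPU figure is the same either way.
    binding = "connections" if by_connections >= by_throughput else "throughput"
    return max(by_connections, by_throughput), binding


cpu, binding = ztunnel_cpu_per_node(1500, 0.1)
```

Knowing the binding constraint shows which input is worth measuring more carefully before trusting the estimate.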
ztunnelMemPerNodeMB = 150 + (connectionsPerNode * 0.05)
- 150 MB base → idle ztunnel footprint
- 0.05 MB per connection → connection tracking + buffers
ztunnelCpuTotal = ztunnelCpuPerNode * nodeCount
ztunnelMemTotalMB = ztunnelMemPerNodeMB * nodeCount
- ztunnel runs once per node (DaemonSet model)
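Because ztunnel runs once per node, cluster totals are the per-node figures multiplied by the node count. The sketch below assumes a hypothetical helper `ztunnel_totals` that takes the per-node CPU figure as an input.

```python
ZTUNNEL_BASE_MB = 150        # idle footprint per node, from the model
ZTUNNEL_MB_PER_CONN = 0.05   # connection tracking + buffers, from the model


def ztunnel_totals(cpu_per_node, connections_per_node, node_count):
    """Return (cpu_total_vcpu, mem_total_mb) for the ztunnel DaemonSet."""
    mem_per_node_mb = ZTUNNEL_BASE_MB + connections_per_node * ZTUNNEL_MB_PER_CONN
    return cpu_per_node * node_count, mem_per_node_mb * node_count


cpu_total, mem_total_mb = ztunnel_totals(0.8, 1600, 10)
```

Note that the 150 MB base is paid on every node, so adding nodes raises total ztunnel memory even when the connection count stays the same.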
l7ThroughputGbps = totalThroughputGbps * l7TrafficRatio
- Only a fraction of traffic is processed at L7
- Typical baseline: 20–40%
waypointCpuTotal = l7ThroughputGbps * 3
- L7 processing (HTTP, routing, auth) is expensive
- Empirical: ~2–4 vCPU per Gbps
- Using 3 vCPU/Gbps as midpoint baseline
waypointMemTotalMB = 300 + (l7ThroughputGbps * 1000 * 0.2)
- 300 MB base → Envoy footprint
- Traffic term reflects buffering, routing tables, telemetry
- 0.2 MB per Mbps of L7 traffic (l7ThroughputGbps * 1000 converts Gbps to Mbps) → heuristic from Envoy scaling patterns
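The waypoint formulas can be sketched as below. The `vcpu_per_gbps` parameter is an assumption added here so the 2–4 vCPU/Gbps range can be explored; it defaults to the model's midpoint of 3.

```python
def waypoint_sizing(total_throughput_gbps, l7_traffic_ratio, vcpu_per_gbps=3.0):
    """Return (cpu_vcpu, mem_mb) for all waypoint proxies combined."""
    if not 0.0 <= l7_traffic_ratio <= 1.0:
        raise ValueError("l7_traffic_ratio must be between 0 and 1")
    l7_gbps = total_throughput_gbps * l7_traffic_ratio
    cpu = l7_gbps * vcpu_per_gbps
    # 300 MB base plus 0.2 MB per Mbps of L7 traffic
    mem_mb = 300 + l7_gbps * 1000 * 0.2
    return cpu, mem_mb


# Bracket the CPU estimate with the low and high ends of the empirical range.
low_cpu, _ = waypoint_sizing(10, 0.3, vcpu_per_gbps=2.0)
mid_cpu, mem_mb = waypoint_sizing(10, 0.3)
high_cpu, _ = waypoint_sizing(10, 0.3, vcpu_per_gbps=4.0)
```

Computing the low and high bounds alongside the midpoint makes the spread in the empirical figure visible instead of hiding it behind a single number.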
istiodCpuTotal = (totalPods / 1500) * churnFactor
- ~1500 pods per vCPU baseline
- churnFactor accounts for:
  - config updates
  - deployments
  - endpoint changes
- Higher churn → more recomputation (xDS pushes)
istiodMemTotalMB =
(totalPods * 1.5) + (configObjectCount * 0.5)
- 1.5 MB per pod → endpoint + metadata storage
- 0.5 MB per config object → routing rules, policies
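A short sketch of the istiod formulas, assuming a hypothetical function name `istiod_sizing`. It makes the two independent drivers explicit: pods and churn for CPU, pods and config objects for memory.

```python
PODS_PER_VCPU = 1500       # baseline from the model
MB_PER_POD = 1.5           # endpoint + metadata storage
MB_PER_CONFIG_OBJECT = 0.5 # routing rules, policies


def istiod_sizing(total_pods, config_object_count, churn_factor):
    """Return (cpu_vcpu, mem_mb) for the istiod control plane."""
    cpu = total_pods / PODS_PER_VCPU * churn_factor
    mem_mb = total_pods * MB_PER_POD + config_object_count * MB_PER_CONFIG_OBJECT
    return cpu, mem_mb


cpu, mem_mb = istiod_sizing(total_pods=3000, config_object_count=1200, churn_factor=2)
```

In this model churn multiplies CPU only; memory depends on how much state istiod holds, not on how often that state changes.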
Final Recommendation (Based on General Consensus and Benchmarks)
cniCpuTotal = 0.4
- Fixed default of 0.4 vCPU for the whole cluster. CNI CPU use is typically negligible unless network policies are complex or there is a significant amount of encrypted pod-to-pod traffic.
cniMemTotalMB = nodeCount * 100
- ~100 MB per node, covering the network management, routing, and metadata tracking that the CNI handles.
These figures assume a typical Kubernetes and Istio ambient mesh deployment in which network policies are not overly complex and traffic is balanced. In most Istio deployments, the CNI's load is small compared with the load on the proxies (such as ztunnel) and the control plane (istiod), unless the cluster relies heavily on network policies.
Note: A commonly quoted rule of thumb for CNI providers such as Calico, Cilium, and Flannel is around 500 pods per vCPU under moderate load, with CPU utilization scaling roughly linearly with pod count. This model does not use that rule; it keeps the fixed default above because of its Istio ambient mesh assumptions.
totalCpu =
ztunnelCpuTotal + waypointCpuTotal + istiodCpuTotal + cniCpuTotal
totalMemoryMB =
ztunnelMemTotalMB + waypointMemTotalMB + istiodMemTotalMB + cniMemTotalMB
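Putting the pieces together, the whole model fits in one function. This is a sketch under the model's stated coefficients; the `MeshInputs` dataclass, the `size_mesh` function, and the dictionary layout are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MeshInputs:
    envs: int
    replicas: int
    node_count: int
    microservices_per_env: int
    total_throughput_gbps: float
    config_object_count: int
    churn_factor: float
    connections_per_pod: float
    l7_traffic_ratio: float


def size_mesh(inp: MeshInputs) -> dict:
    """Return {component: (cpu_vcpu, mem_mb)} plus a 'total' entry."""
    total_pods = inp.microservices_per_env * inp.replicas * inp.envs
    conns_per_node = inp.connections_per_pod * total_pods / inp.node_count
    gbps_per_node = inp.total_throughput_gbps / inp.node_count
    l7_gbps = inp.total_throughput_gbps * inp.l7_traffic_ratio

    ztunnel_cpu_node = max(conns_per_node / 2000, gbps_per_node * 0.8)
    ztunnel_mem_node = 150 + conns_per_node * 0.05

    sizing = {
        "ztunnel": (ztunnel_cpu_node * inp.node_count,
                    ztunnel_mem_node * inp.node_count),
        "waypoint": (l7_gbps * 3, 300 + l7_gbps * 1000 * 0.2),
        "istiod": (total_pods / 1500 * inp.churn_factor,
                   total_pods * 1.5 + inp.config_object_count * 0.5),
        "cni": (0.4, inp.node_count * 100),
    }
    # The sums are evaluated before the 'total' key is added.
    sizing["total"] = (sum(cpu for cpu, _ in sizing.values()),
                       sum(mem for _, mem in sizing.values()))
    return sizing


prod = size_mesh(MeshInputs(envs=1, replicas=2, node_count=10,
                            microservices_per_env=400, total_throughput_gbps=10,
                            config_object_count=3200, churn_factor=2,
                            connections_per_pod=20, l7_traffic_ratio=0.3))
```

With these inputs the totals come to roughly 18.47 vCPU and 7000 MB, before any safety margin is applied.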
- High throughput (Gbps)
- Moderate pod count
- Example: production traffic-heavy systems
- High pod count
- Many config objects
- High churn (CI/CD, ephemeral workloads)
- High percentage of L7 traffic
- Heavy use of:
  - routing rules
  - auth policies
  - observability
| Component | Scaling Driver | Rule of Thumb |
|---|---|---|
| ztunnel CPU | Connections / Throughput | max(conn/2000, 0.8 CPU/Gbps) |
| ztunnel Memory | Connections | 150 MB + 0.05 MB/conn |
| waypoint CPU | L7 throughput | ~3 CPU/Gbps |
| waypoint Memory | Traffic | ~300 MB + l7ThroughputGbps * 1000 * 0.2 |
| istiod CPU | Pods + churn | pods/1500 * churn |
| istiod Memory | Pods + config | 1.5 MB/pod + 0.5 MB/config object |
- This is a baseline model, not a guarantee
- Real-world variance depends on:
  - protocol mix (HTTP vs gRPC vs TCP)
  - TLS settings
  - telemetry volume
- Always validate with load testing
- Recommended safety margin: +30–50%
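The safety margin can be applied as a simple multiplier. The sketch below assumes a hypothetical helper `with_headroom` and rounds CPU up to whole vCPUs, which is a provisioning convenience and not part of the model.

```python
import math


def with_headroom(value, low=0.30, high=0.50):
    """Return (low_bound, high_bound) after adding the +30-50% safety margin."""
    return value * (1 + low), value * (1 + high)


# Example: a baseline estimate of 18.5 vCPU
cpu_low, cpu_high = with_headroom(18.5)
# Round up to whole vCPUs when turning the estimate into a capacity request.
cpu_request_range = (math.ceil(cpu_low), math.ceil(cpu_high))
```

The margin is applied once to the final total here; applying it per component and then summing gives the same result, since the multiplier is linear.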
- envs = 1
- nodeCount = 10
- microservicesPerEnv = 400
- replicas = 2
- totalThroughputGbps = 10
- connectionsPerPod = 20
- configObjectCount = 3200
- churnFactor = 2
- l7TrafficRatio = 0.3
totalPods = 800
connectionsPerNode = (20 * 800) / 10 = 1600
throughputPerNodeGbps = 10 / 10 = 1 Gbps
ztunnelCpuPerNode = MAX(1600 / 2000, 1 * 0.8)
= MAX(0.8, 0.8) = 0.8 vCPU
ztunnelMemPerNodeMB = 150 + (1600 * 0.05)
= 150 + 80 = 230 MB
- ztunnelCpuTotal = 0.8 * 10 = 8 vCPU
- ztunnelMemTotalMB = 230 * 10 = 2300 MB (~2.3 GB)
l7ThroughputGbps = 10 * 0.3 = 3 Gbps
waypointCpuTotal = 3 * 3 = 9 vCPU
waypointMemTotalMB = 300 + (3 * 1000 * 0.2)
= 300 + 600 = 900 MB
istiodCpuTotal = (800 / 1500) * 2 ≈ 1.07 vCPU
istiodMemTotalMB = (800 * 1.5) + (3200 * 0.5)
= 1200 + 1600 = 2800 MB (~2.8 GB)
cniCpuTotal = 0.4 vCPU
cniMemTotalMB = (100 * 10) = 1000 MB (1 GB)
- Total CPU = 8 + 9 + 1.07 + 0.4 = 18.47 ≈ 18.5 vCPU
- Total Memory = 2300 MB + 900 MB + 2800 MB + 1000 MB = 7000 MB (~7.0 GB)
- envs = 10
- nodeCount = 10
- microservicesPerEnv = 300
- replicas = 1
- totalThroughputGbps = 1
- connectionsPerPod = 5
- configObjectCount = 1200
- churnFactor = 2
- l7TrafficRatio = 0.3
totalPods = 3000
connectionsPerNode = (5 * 3000) / 10 = 1500
throughputPerNodeGbps = 1 / 10 = 0.1 Gbps
ztunnelCpuPerNode = MAX(1500 / 2000, 0.1 * 0.8)
= MAX(0.75, 0.08) = 0.75 vCPU
ztunnelMemPerNodeMB = 150 + (1500 * 0.05)
= 150 + 75 = 225 MB
- ztunnelCpuTotal = 0.75 * 10 = 7.5 vCPU
- ztunnelMemTotalMB = 225 * 10 = 2250 MB (~2.25 GB)
l7ThroughputGbps = 1 * 0.3 = 0.3 Gbps
waypointCpuTotal = 0.3 * 3 = 0.9 vCPU
waypointMemTotalMB = 300 + (0.3 * 1000 * 0.2)
= 300 + 60 = 360 MB
istiodCpuTotal = (3000 / 1500) * 2 = 4 vCPU
istiodMemTotalMB = (3000 * 1.5) + (1200 * 0.5)
= 4500 + 600 = 5100 MB (~5.1 GB)
cniCpuTotal = 0.4 vCPU
cniMemTotalMB = (100 * 10) = 1000 MB (1 GB)
- Total CPU = 7.5 + 0.9 + 4 + 0.4 = 12.8 vCPU
- Total Memory = 2250 MB + 360 MB + 5100 MB + 1000 MB = 8710 MB (~8.7 GB)
| Component | Prod CPU | Dev CPU | Prod Mem | Dev Mem |
|---|---|---|---|---|
| ztunnel | 8 vCPU | 7.5 vCPU | 2.3 GB | 2.25 GB |
| waypoint | 9 vCPU | 0.9 vCPU | 0.9 GB | 0.36 GB |
| istiod | 1.1 vCPU | 4 vCPU | 2.8 GB | 5.1 GB |
| cni | 0.4 vCPU | 0.4 vCPU | 1 GB | 1 GB |
| TOTAL | 18.5 vCPU | 12.8 vCPU | 7.0 GB | 8.7 GB |
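To read the comparison table at a glance, the sketch below picks the dominant component in each column and its share of the total. The figures are copied from the two worked examples (memory in MB); the `dominant` helper is a hypothetical name.

```python
cpu_vcpu = {
    "prod": {"ztunnel": 8.0, "waypoint": 9.0, "istiod": 1.07, "cni": 0.4},
    "dev": {"ztunnel": 7.5, "waypoint": 0.9, "istiod": 4.0, "cni": 0.4},
}
mem_mb = {
    "prod": {"ztunnel": 2300, "waypoint": 900, "istiod": 2800, "cni": 1000},
    "dev": {"ztunnel": 2250, "waypoint": 360, "istiod": 5100, "cni": 1000},
}


def dominant(components):
    """Return (name, share) of the component with the largest value."""
    name = max(components, key=components.get)
    return name, components[name] / sum(components.values())


for profile in ("prod", "dev"):
    cpu_name, cpu_share = dominant(cpu_vcpu[profile])
    mem_name, mem_share = dominant(mem_mb[profile])
    print(f"{profile}: CPU led by {cpu_name} ({cpu_share:.0%}), "
          f"memory led by {mem_name} ({mem_share:.0%})")
```

In these two examples, waypoint leads production CPU, ztunnel leads dev CPU, and istiod leads memory in both profiles.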