Most Kubernetes clusters are born with a networking decision nobody revisits until it breaks: the CIDR ranges. You type --pod-network-cidr=10.244.0.0/16 once during kubeadm init, the cluster comes up, pods schedule, and life is good. Then, eighteen months later, a kubectl describe pod shows "failed to allocate for range 0: no IP addresses available in range set". Now you are re-IPing a production cluster, which is roughly as fun as it sounds.
The fix is to treat cluster CIDRs as a capacity-planning problem on day one. That means understanding the three independent IP ranges every cluster carries, and doing a little subnet arithmetic before you commit. Here is the whole picture.
A Cluster Has Three Separate IP Spaces
These do not overlap and they answer different questions:
- Pod CIDR: the pool every pod IP comes from. Set with --pod-network-cidr (kubeadm) or by your CNI’s config. Example: 10.244.0.0/16.
- Service CIDR: the virtual range for ClusterIP services. Set with --service-cidr. Default is 10.96.0.0/12. These IPs are not real; kube-proxy programs them into iptables/IPVS rules.
- Node CIDR mask: how big a *slice* of the pod CIDR each node gets carved off for its local pods. Controlled by the controller-manager’s --node-cidr-mask-size, default /24 for IPv4.
The trap lives in the relationship between the first and the third: the pod CIDR and the node CIDR mask.
The Pod CIDR + Node Mask Math (the 256-Node Ceiling)
By default, the controller-manager hands each node a /24 block out of the pod CIDR. A /24 has 256 addresses, so each node can host up to ~254 pods — usually fine. But now count how many /24 blocks fit inside your pod CIDR:
pod CIDR /16 ÷ per-node /24 = 2^(24-16) = 256 nodes maximum
That 10.244.0.0/16 you copy-pasted caps the cluster at 256 nodes, full stop, no matter how much compute you add. Register node 257 and the controller-manager, which owns per-node allocation, has no /24 block left to hand it. In general, the number of nodes a cluster can ever hold is 2^(node_mask − pod_mask), and the pods-per-node ceiling is 2^(32 − node_mask) − 2. Two knobs, one equation:
- Need more nodes? Widen the pod CIDR (/15, /14…) or shrink the per-node block (--node-cidr-mask-size=25 → 512 nodes from a /16, but only 126 pods/node).
- Need more pods per node? Enlarge the per-node block (/23 → ~510 pods/node), which costs you node count.
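The two equations above can be sketched in a few lines of Python using the standard-library ipaddress module. The function name cluster_capacity is made up for this illustration, and the sketch assumes IPv4 and the "subtract 2" host-count convention used in this article.

```python
import ipaddress


def cluster_capacity(pod_cidr: str, node_mask: int) -> tuple[int, int]:
    """Return (max_nodes, max_pods_per_node) for an IPv4 pod CIDR and node mask.

    Hypothetical helper: it only applies the two power-of-two formulas from
    the text and does not talk to a cluster.
    """
    net = ipaddress.ip_network(pod_cidr)
    if net.version != 4:
        raise ValueError("this sketch only covers IPv4")
    if not net.prefixlen <= node_mask <= 30:
        raise ValueError("node mask must lie between the pod CIDR prefix and /30")
    max_nodes = 2 ** (node_mask - net.prefixlen)   # per-node blocks in the pod CIDR
    pods_per_node = 2 ** (32 - node_mask) - 2      # usable addresses per block
    return max_nodes, pods_per_node


candidates = [
    ("10.244.0.0/16", 24),
    ("10.244.0.0/16", 25),
    ("10.244.0.0/16", 23),
    ("10.244.0.0/14", 24),
]
for cidr, mask in candidates:
    nodes, pods = cluster_capacity(cidr, mask)
    print(f"{cidr} with /{mask} per node: {nodes} nodes, {pods} pods/node")
```

These are address ceilings only. The kubelet's own max-pods setting (110 by default) will usually bind before a /24 block runs out.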
Because every choice is a power-of-two trade between “how many nodes” and “how many pods each,” it pays to compute the host counts explicitly instead of guessing. Dropping the candidate prefix into a subnet calculator gives you the usable-host count and block count for each mask in one shot, so you can see that a /14 pod CIDR with a /24 node mask buys 1,024 nodes before you ever touch a cluster. Pick the masks on paper; commit them once.
Sizing the Service CIDR
The service CIDR is simpler but people still undersize it. The maximum number of ClusterIP services you can ever create equals the usable host count of that range. The default /12 (≈1,048,574 addresses) is enormous and almost never the problem. But teams that “tidy up” to a /24 service CIDR to look neat have shipped clusters that hit a hard wall at ~254 services — and a busy microservices platform blows past that fast. Unless you have a specific reason, leave the service CIDR generous; it costs nothing because the IPs are virtual.
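As a rough check of those numbers, the usable host count of a candidate service range can be computed with Python's ipaddress module. The helper name max_cluster_ips is an assumption for this sketch, and it follows the article's rule of thumb (total addresses minus 2) rather than the API server's exact allocator behavior.

```python
import ipaddress


def max_cluster_ips(service_cidr: str) -> int:
    """Approximate ClusterIP ceiling: usable host count of the service range.

    Hypothetical helper. A real cluster also spends a few of these on
    built-in services, so treat the result as an upper bound.
    """
    net = ipaddress.ip_network(service_cidr)
    return net.num_addresses - 2


for cidr in ("10.96.0.0/12", "10.96.0.0/16", "10.96.0.0/24"):
    print(f"{cidr}: up to {max_cluster_ips(cidr)} ClusterIP services")
```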
One real constraint: the pod CIDR and service CIDR must not overlap each other, and neither should collide with the node/host network or anything reachable over your VPN. Sketch all three on the same address plan before init.
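The overlap check is easy to automate before init. The sketch below compares every pair of ranges in an address plan; the function name find_overlaps and the node and VPN ranges are invented for illustration, so substitute your own.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of named CIDRs in the plan that overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


plan = {
    "pods": "10.244.0.0/16",
    "services": "10.96.0.0/12",    # kubeadm default
    "nodes": "192.168.10.0/24",    # hypothetical host network
    "vpn": "10.100.0.0/16",        # hypothetical VPN range
}
print(find_overlaps(plan))
```

In this made-up plan the VPN range collides with the default service CIDR, because 10.96.0.0/12 spans 10.96.0.0 through 10.111.255.255. That kind of collision is easy to miss by eye.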
The Cloud Gotcha: When Pods Eat Real VPC IPs
Everything above assumes an overlay CNI (Flannel, Calico in overlay mode) where pod IPs are internal and cheap. On managed clouds with a “native” CNI — most notably the AWS VPC CNI on EKS — every pod gets a *real* VPC IP from the subnet the node sits in. Suddenly your Kubernetes pod density is bounded by your VPC subnet size, not the pod CIDR.
Do the math that actually bites here: an AWS /24 subnet has 256 addresses, minus 5 that AWS reserves, leaving 251 usable, shared across nodes, pods, load-balancer ENIs, and everything else in that subnet. A handful of m5.large nodes running close to 30 pods each will drain a /24 before lunch. The symptoms are identical to the on-prem case (no IP addresses available) but the fix is different: provision larger node subnets (/22 = 1,019 usable, /20 = 4,091), add a secondary CIDR to the VPC for pods, or enable prefix delegation so each ENI hands out /28 prefixes instead of single IPs. Note that prefix delegation raises how many pods a node can address; the prefixes still come out of the subnet, 16 addresses at a time, so it does not replace a bigger subnet.
This is where a quick host-count check earns its keep: before you carve a VPC into subnets, confirm each one is big enough for (nodes × pods_per_node) + overhead. A /24 “because that’s what the template had” is the single most common reason EKS clusters run dry.
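A minimal sketch of that host-count check, assuming the 5-address AWS reservation described above and the (nodes × pods_per_node) + overhead formula. The function names are hypothetical, and overhead here is meant to include the nodes' own primary IPs as well as load-balancer ENIs.

```python
AWS_RESERVED_PER_SUBNET = 5  # addresses AWS holds back in every subnet


def aws_usable_ips(prefix_len: int) -> int:
    """Usable addresses in an AWS IPv4 subnet of the given prefix length."""
    return 2 ** (32 - prefix_len) - AWS_RESERVED_PER_SUBNET


def ips_needed(nodes: int, pods_per_node: int, overhead: int = 0) -> int:
    """(nodes x pods_per_node) + overhead, as in the text."""
    return nodes * pods_per_node + overhead


def smallest_prefix(needed: int) -> int:
    """Longest prefix (smallest subnet) between /28 and /16 that fits `needed`."""
    for prefix_len in range(28, 15, -1):
        if aws_usable_ips(prefix_len) >= needed:
            return prefix_len
    raise ValueError("no single subnet between /28 and /16 is large enough")


# Hypothetical workload: 8 nodes, 30 pods each, 20 addresses of overhead.
needed = ips_needed(nodes=8, pods_per_node=30, overhead=20)
print(needed, aws_usable_ips(24), smallest_prefix(needed))
```

With these example numbers the workload needs 260 addresses, which already exceeds the 251 a /24 offers, and the smallest subnet that fits is a /23 with no room to grow.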
A Pre-`init` Checklist
- Estimate peak nodes and peak pods per node for the cluster’s lifetime, then double both. Re-IPing later is far more expensive than over-allocating now.
- Choose a pod CIDR big enough that 2^(node_mask − pod_mask) ≥ peak_nodes.
- Choose a node-cidr-mask-size so 2^(32 − node_mask) − 2 ≥ peak_pods_per_node.
- Leave the service CIDR generous (/16 or the /12 default).
- On AWS VPC CNI (or any native CNI), size the node subnets for real pod IP consumption, not just nodes.
- Verify nothing overlaps the host network, the VPN, or peered VPCs.
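The first three checklist items can be run in reverse: start from the peak estimates and derive the masks. The sketch below is one way to do that under this article's formulas; plan_masks and its headroom parameter are names invented here, and the /8 floor is an assumption based on 10.0.0.0/8 being the largest private IPv4 block.

```python
def plan_masks(peak_nodes: int, peak_pods_per_node: int, headroom: int = 2) -> tuple[int, int]:
    """Return (pod_cidr_prefix, node_mask) that satisfy the checklist inequalities.

    Hypothetical helper: multiplies both peaks by `headroom` (the "double both"
    rule), then picks the smallest power-of-two sizes that fit.
    """
    nodes = peak_nodes * headroom
    pods = peak_pods_per_node * headroom

    # Smallest host_bits with 2**host_bits - 2 >= pods.
    host_bits = (pods + 1).bit_length()
    node_mask = 32 - host_bits

    # Smallest node_bits with 2**node_bits >= nodes.
    node_bits = (nodes - 1).bit_length()
    pod_prefix = node_mask - node_bits

    if pod_prefix < 8:  # assumption: nothing larger than a /8 is available
        raise ValueError("requirements do not fit in a /8 pod CIDR")
    return pod_prefix, node_mask


print(plan_masks(100, 50))    # 200 nodes, 100 pods/node after doubling
print(plan_masks(500, 110))   # 1000 nodes, 220 pods/node after doubling
```

The result is a starting point for the paper exercise, not a recommendation: rounding up to a wider pod CIDR costs nothing on an overlay network.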
The Takeaway
Kubernetes networking failures rarely look like networking failures — they look like pods stuck in ContainerCreating at 2 a.m. Nearly all of them trace back to a CIDR that was picked once, by default, with no host-count math behind it. Spend ten minutes with the two power-of-two equations above before kubeadm init, and you trade a future production migration for a one-line config you set correctly the first time. Subnetting is boring. Re-subnetting a live cluster is not.




