During the lifecycle of a deployment that uses whereabouts for IP reservations, an unexpected reboot (a worker node crash or another unexpected incident) can leave IP reservations allocated to the crashed pods without ever being freed.
If the IP pool is relatively small, this can lead to IP exhaustion and pods failing to start.
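A quick way to confirm the symptom (a minimal sketch; <NAMESPACE> and <POD_NAME> are placeholders) is to look for pods stuck in ContainerCreating and check their events for the CNI/IPAM failure:
kubectl get pods -n <NAMESPACE> -o wide | grep -v Running
kubectl describe pod <POD_NAME> -n <NAMESPACE>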
TKGm 2.x
Confirm the NetworkAttachmentDefinition
kubectl get network-attachment-definitions -n nstest -oyaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: test
  namespace: nstest
spec:
  config: '{ "cniVersion": "0.3.1", "type": "ipvlan", "master": "Net2", "mode": "l2",
    "ipam": { "type": "whereabouts", "range": "172.xx.xx.192/27", "range_start":
    "172.xx.xx.196", "range_end": "172.xx.xx.200", "routes": [ { "dst": "172.xx.xx.110/32",
    "gw": "172.xx.xx.194" }, { "dst": "10.xx.xx.235/32", "gw": "172.xx.xx.194"
    }, { "dst": "172.xx.xx.200/32", "gw": "172.xx.xx.194" }, { "dst": "172.xx.xx.40/32",
    "gw": "172.xx.xx.194" }, { "dst": "172.xx.xx.88/32", "gw": "172.xx.xx.194" },
    { "dst": "172.xx.xx.80/32", "gw": "172.xx.xx.194" } ] } }'
From the range definition we can see that there are 5 IPs available for allocation, from "range_start": "172.xx.xx.196" to "range_end": "172.xx.xx.200".
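If only the embedded IPAM configuration is of interest, the config string can be printed on its own (a minimal sketch using the object and namespace names from the example above):
kubectl get network-attachment-definitions test -n nstest -o jsonpath='{.spec.config}'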
Get the current reservations:
kubectl get ippools.whereabouts.cni.cncf.io -n kube-system
172.xx.xx.192-27
kubectl get ippools.whereabouts.cni.cncf.io -n kube-system 172.xx.xx.192-27 -o yaml
apiVersion: whereabouts.cni.cncf.io/v1alpha1
kind: IPPool
metadata:
  name: 172.xx.xx.192-27
  namespace: kube-system
spec:
  allocations:
    "4":
      id: 1e88...fb5dec7c
    "5":
      id: 3484..........6381
    "6":
      id: c002...500f35b0
    "7":
      id: 7213.....7a16b0f67
    "8":
      id: 8cb....63d9cc07
  range: 172.xx.xx.192/27
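To list only the allocations map without the rest of the object (a sketch using the pool name from the example), jsonpath can be used:
kubectl get ippools.whereabouts.cni.cncf.io 172.xx.xx.192-27 -n kube-system -o jsonpath='{.spec.allocations}'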
Information we can extract from this object:
range: 172.xx.xx.192/27
The CIDR block managed by this IPPool. A /27 subnet has a total of 32 IPs (172.xx.xx.192 to 172.xx.xx.223).
allocations
A mapping of IP addresses within the range to their respective allocations. The numbers ("4", "5", etc.) correspond to host offsets within the range.
For example:
"4" corresponds to IP 172.xx.xx.196 (4th usable address in the range).
"5" corresponds to IP 172.xx.xx.197.
Allocation IDs
Each allocation has an associated unique ID (typically a hashed identifier for the Pod/container using the IP). For instance:
"4": id: 1e88...fb5dec7c
This indicates that IP 172.xx.xx.196 is allocated and tracked by that ID.
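As a quick sanity check of this offset arithmetic (a minimal sketch; the subnet below is an illustrative placeholder for the masked range), the IP behind an allocation index is simply the network address of the range plus the index:
python3 -c 'import ipaddress; print(ipaddress.ip_network("172.16.0.192/27").network_address + 4)'   # prints 172.16.0.196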
There are two methods to correct this issue:
Method 1:
This method outlines a process for manually clearing stale IP address allocations held by whereabouts.
Detailed steps:
1. Examine the network attachment definition for the deployment to determine the IP pools in use:
kubectl get net-attach-def -n namespace -o yaml
2. Stop all pods which use Multus + Whereabouts
kubectl scale deployment deployName -n namespace --replicas=0
3. Clear IP allocations
kubectl get crds | grep -i whereabouts
a. This should return two CRDs:
overlappingrangeipreservations.whereabouts.cni.cncf.io
ippools.whereabouts.cni.cncf.io
b. For each of those, perform:
kubectl get overlappingrangeipreservations.whereabouts.cni.cncf.io -A
kubectl get ippools.whereabouts.cni.cncf.io -A
c. Then, for each item returned, delete the corresponding object:
kubectl delete ippools.whereabouts.cni.cncf.io x.x.x.x -n <NAMESPACE>
kubectl delete overlappingrangeipreservations.whereabouts.cni.cncf.io x.x.x.x -n <NAMESPACE>
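After the stale objects are deleted, the workloads that were scaled down in step 2 can be restored, for example (the replica count is a placeholder):
kubectl scale deployment deployName -n namespace --replicas=1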
The drawback of this approach is that the application has to be scaled down to allow the cleanup of the related objects.
Method 2:
Identify the IPs currently in use from the Pod definitions of the running containers.
Once the in-use IPs are identified, edit the object below and remove the stale records.
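One way to identify the in-use IPs (a minimal sketch; it assumes Multus publishes the attachment details in the k8s.v1.cni.cncf.io/network-status pod annotation, and <NAMESPACE> is a placeholder) is to read that annotation from the running pods:
kubectl get pods -n <NAMESPACE> -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}'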
If having above example tells us we have 5 IPs in use but only IPs
"4" corresponds to IP 172.xx.xx196 (4th usable address in the range).
"5" corresponds to IP 172.xx.xx.197.
"6" corresponds to IP 172.xx.xx.198.
We can remove the IDs 7 and 8 and this will allow the pending pods to receive the freed IPs
(in bold to be removed)
kubectl get ippools.whereabouts.cni.cncf.io -n kube-system 172.xx.xx.192-27 -o yaml
apiVersion: whereabouts.cni.cncf.io/v1alpha1
kind: IPPool
metadata:
  name: 172.xx.xx.192-27
  namespace: kube-system
spec:
  allocations:
    "4":
      id: 1e88...fb5dec7c
    "5":
      id: 3484..........6381
    "6":
      id: c002...500f35b0
    "7":                      # stale - remove
      id: 7213.....7a16b0f67
    "8":                      # stale - remove
      id: 8cb....63d9cc07
  range: 172.xx.xx.192/27
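To apply the change, the object can be edited in place, or the two stale entries can be removed with a JSON patch (a sketch; the pool name and allocation indices follow the example above):
kubectl edit ippools.whereabouts.cni.cncf.io 172.xx.xx.192-27 -n kube-system
kubectl patch ippools.whereabouts.cni.cncf.io 172.xx.xx.192-27 -n kube-system --type=json -p='[{"op":"remove","path":"/spec/allocations/7"},{"op":"remove","path":"/spec/allocations/8"}]'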
Source: KB 374129 for TCA, updated and modified with new information.