Clean up and re-pull incomplete images from the remote repository when a container fails to start with an ImagePullBackOff or CrashLoopBackOff status caused by image pull failures.
VMSP 9.0.0.0
Example Failure Message
When a pod is in the CrashLoopBackOff state, you may see output similar to the following:
vco-app-1 0/2 Init:CrashLoopBackOff 6 (3m30s ago) 9m50s X.X.X.X automation-mrmnp <none> <none>
Set the Kubernetes Configuration:
Export the KUBECONFIG environment variable to point to the appropriate configuration file:
export KUBECONFIG=path_to_kubeconfig_file
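As a minimal sketch of the step above, the export can be guarded so that a mistyped path is caught before any kubectl command runs. The function name use_kubeconfig is hypothetical and is not part of the product or of kubectl.

```shell
# Hypothetical helper: export KUBECONFIG only if the file is readable.
use_kubeconfig() {
  if [ -r "$1" ]; then
    export KUBECONFIG="$1"
  else
    echo "kubeconfig not readable: $1" >&2
    return 1
  fi
}

# Usage:
#   use_kubeconfig path_to_kubeconfig_file && kubectl get nodes
```

If the final kubectl get nodes call lists the cluster nodes, the configuration file is being picked up.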
Retrieve Node Information:
Run the following command to get detailed pod and node information:
kubectl describe pod <pod-name>
The Node field in the output shows the name and IP address of the node on which the failing pod is running.
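If only the node name and IP address are needed, a jsonpath query can print them directly instead of scanning the full describe output. This is a sketch, not the documented procedure: the function name pod_node is hypothetical, and the namespace argument is an assumption (it falls back to default when omitted).

```shell
# Hypothetical helper: print the node name and host IP of a pod, tab-separated.
# $1 = pod name, $2 = namespace (assumed to be "default" when omitted)
pod_node() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.nodeName}{"\t"}{.status.hostIP}{"\n"}'
}

# Usage:
#   pod_node <pod-name> <namespace>
```

The second value printed is the address to use for the SSH login in the next step.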
Access the Node via SSH:
Use the node IP address obtained in the previous step to log in to the node over SSH with the vmware-system-user credentials:
ssh vmware-system-user@<node-ip>
If the issue is caused by incomplete image downloads, first switch to the root user with sudo su, then run the following script on the node to retry pulling the incomplete images:
# Re-pull every image that containerd reports as incomplete.
for img in $(ctr -n k8s.io images check | grep incomplete | awk '{print $1}'); do
  ctr -n k8s.io images pull -k "$img"
done
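This script runs once: it lists the images that containerd reports as incomplete in the k8s.io namespace and attempts to pull each of them again. The -k flag skips TLS certificate verification for the registry.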
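Because the loop makes a single pass, it can be useful to check afterwards whether any image is still incomplete. The sketch below reuses the same filter as the script; the function name incomplete_refs is hypothetical, and it assumes, as the script does, that the image reference is the first column of the ctr images check output.

```shell
# Hypothetical helper: read `ctr images check` output on stdin and print
# the reference (first column) of every image reported as incomplete.
incomplete_refs() {
  awk '/incomplete/ {print $1}'
}

# Usage on the node, as root:
#   ctr -n k8s.io images check | incomplete_refs
```

No output means that containerd no longer reports any incomplete image. If references are still listed, run the pull loop again or inspect the pull errors for those images.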