Containers are in Init:ErrImageNeverPull error state
Article ID: 306254
Products
VMware Aria Suite
Issue/Introduction
Symptoms:
Users cannot work in the environment because the services are not up and running. Pods in the Init:ErrImageNeverPull or ErrImageNeverPull state can be seen on one or more nodes. To see the states of the pods, run:
kubectl get pods -n prelude
Example of pods with this error:
assessment-service-app-##########-24nw8        0/1  Init:ErrImageNeverPull  0  5h30m  10.244.0.235  prelude-004.example.com  <none>  <none>
symphony-logging-daemonset-7phb9               0/1  ErrImageNeverPull       0  5h12m  10.244.0.239  prelude-004.example.com  <none>  <none>
tango-blueprint-service-app-##########-2l8xg   0/1  Init:ErrImageNeverPull  0  5h28m  10.244.0.236  prelude-004.example.com  <none>  <none>
tango-vro-gateway-app-##########-rs7hs         0/1  Init:ErrImageNeverPull  0  5h35m  10.244.0.234  prelude-004.example.com  <none>  <none>
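On a busy appliance the affected pods can be hard to spot in the full listing. The following is a minimal sketch, not part of the official procedure, that filters the listing down to the affected pods and the nodes they run on. The function name list_never_pull_pods is hypothetical, and the column positions assume the wide layout shown in the example above (produced with -o wide) with no extra text in the RESTARTS column.

```shell
# Read `kubectl get pods -n prelude -o wide` output on stdin and print
# "<node> <pod> <status>" for every pod stuck in an ErrImageNeverPull state.
# Assumed column order: NAME READY STATUS RESTARTS AGE IP NODE ...
list_never_pull_pods() {
  # $3 is STATUS; matches both ErrImageNeverPull and Init:ErrImageNeverPull
  awk '$3 ~ /ErrImageNeverPull/ { print $7, $1, $3 }'
}

# Possible usage on an appliance node:
#   kubectl get pods -n prelude -o wide | list_never_pull_pods | sort
```

Sorting the result groups the pods by node, which makes it easier to see which node(s) need attention.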
Environment
VMware vRealize Automation 8.x
Cause
There are several possible causes for this issue:
Ephemeral storage in Prelude has reached 100% of the disk.
One or more of the Prelude storage disks are completely full or more than 80% full.
(Example: the /data disk has only 17% free, which means it is more than 80% used, which is a problem.)
The node has been restarted because of an unhealthy node status.
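To check whether a disk is the cause, disk usage can be inspected on each node. The sketch below is an illustration only: flag_full_disks is a hypothetical helper name, and the default threshold of 80 is taken from the 80% figure mentioned above. It reads the portable output of df -P and prints every mount point at or above the threshold.

```shell
# Read `df -P` output on stdin and print "<mount point> <use%>" for every
# filesystem whose usage is at or above the given percentage (default 80).
# df -P columns: Filesystem 1024-blocks Used Available Capacity Mounted-on
flag_full_disks() {
  awk -v limit="${1:-80}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)              # strip the percent sign
    if (use + 0 >= limit) print $6, $5
  }'
}

# Possible usage on each node:
#   df -P | flag_full_disks 80
```

A mount point such as /data appearing in the output indicates a disk that should be resized before attempting recovery.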
Resolution
Steps to recover from this state:
Resize the affected disk, adding at least 20 GB to it; the more space added, the better.
Resizing is done through vSphere.
Reboot the affected node and wait about 30-50 minutes for the environment to return to a normal state.
As an alternative to the reboot, run the "/opt/scripts/restore_docker_images.sh" script on the affected node(s).
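After the reboot or the restore script, recovery can be confirmed by polling the pod list until no pod reports ErrImageNeverPull. The following is a rough sketch under stated assumptions: wait_for_images and the POD_CMD variable are hypothetical names introduced here, and the defaults of 60 checks at 60-second intervals are chosen only to cover the 30-50 minute window mentioned above.

```shell
# Command used to list pods; override POD_CMD for a dry run.
POD_CMD="${POD_CMD:-kubectl get pods -n prelude}"

# Poll until no pod reports ErrImageNeverPull, or give up.
# $1 = number of checks (default 60), $2 = seconds between checks (default 60)
wait_for_images() {
  tries="${1:-60}"
  delay="${2:-60}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # A failed listing counts as "not recovered yet", not as success.
    if out=$($POD_CMD 2>/dev/null) &&
       ! printf '%s\n' "$out" | grep -q 'ErrImageNeverPull'; then
      echo "no pods in ErrImageNeverPull"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "pods still in ErrImageNeverPull after $tries checks" >&2
  return 1
}

# Possible usage on the affected node:
#   /opt/scripts/restore_docker_images.sh && wait_for_images
```

The check only looks for the ErrImageNeverPull status; pods may still be starting after it succeeds, so a final look at the full pod listing is still worthwhile.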