Troubleshooting Kubernetes pods not starting after changing the virtual machine hostname
Article ID: 326025
Updated On:
Products
VMware Aria Suite
Issue/Introduction
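This article describes the symptoms you may encounter when the hostname of the Aria Automation VM changes from the FQDN to the shortname, and the steps used to recover.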
Symptoms:
- You deploy and install a new Aria Automation 8.x instance.
- The hostname of the VM changes from the FQDN to the shortname.
- Running kubectl get nodes -o wide lists a duplicate K8s node: the FQDN (original, good) and the shortname (bad).
- The original K8s node (the master / control-plane) is listed in a NotReady status, while the bad node is listed as Ready.
- As a result of the hostname change, the critical kube-system pods are not functional.
- Pods try to use the bad / ghost K8s node instead of the proper, original node in FQDN format.
- The prelude pods try to start but also fail.
- Removing the bad / ghost node does not help: it always returns after a VM reboot or after the kubelet service is restarted.
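The duplicate-node symptom can be checked with a small script. The following is a minimal sketch, not part of the original procedure: the function names and the example FQDN vra01.example.com are hypothetical, and it assumes the ghost node is registered under exactly the short form of the FQDN.

```shell
#!/bin/sh
# Sketch: detect a shortname "ghost" node next to the FQDN node.
# Function names and the example FQDN are hypothetical.

# Print the shortname (everything before the first dot) of an FQDN.
short_name() {
  printf '%s\n' "${1%%.*}"
}

# Read `kubectl get nodes --no-headers` output on stdin and print any
# node whose name equals the shortname of the expected FQDN ($1).
find_ghost_nodes() {
  ghost_fqdn="$1"
  ghost_short="$(short_name "$ghost_fqdn")"
  # If the expected name has no domain part, there is nothing to compare.
  [ "$ghost_short" = "$ghost_fqdn" ] && return 0
  awk -v s="$ghost_short" '$1 == s { print $1 }'
}

# Example usage on the appliance (commented out so the sketch has no side effects):
# hostname        # should print the FQDN, not the shortname
# kubectl get nodes --no-headers | find_ghost_nodes "vra01.example.com"
```

If find_ghost_nodes prints a name, a node registered under the shortname exists alongside the original FQDN node.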
Environment
VMware vRealize Automation 8.x
VMware Aria Automation 8.x
Resolution
- Change the hostname from the shortname back to the FQDN (replace fqdn with the fully qualified domain name of the VM):
hostnamectl set-hostname fqdn
- Stop services
/opt/scripts/deploy.sh --shutdown
- Remove the bad / ghost node (replace shortname with the name of the bad node):
kubectl drain shortname --ignore-daemonsets --delete-local-data && kubectl delete node shortname
- Restart the kubelet service and check that the bad / ghost node has been removed.
- Reboot the VM.
- After the VM reboots, confirm that the bad / ghost node has not returned.
- Confirm that the original K8s node is now listed in a Ready state.
- Start services
/opt/scripts/deploy.sh
- Confirm that the kube-system pods are now reporting healthy.
- Confirm that the kubelet service is now reporting as healthy.
- Wait for the prelude pods to come up. This takes about 10-15 minutes.
- Confirm that the Aria Automation user interface is accessible again.
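The verification steps above can be scripted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the prelude pods are assumed to run in a namespace named prelude, and the column positions assume the default kubectl get output with --no-headers.

```shell
#!/bin/sh
# Sketch: post-recovery health checks. Function names are hypothetical.

# Read `kubectl get nodes --no-headers` on stdin and print nodes whose
# STATUS column does not start with "Ready" (for example, NotReady).
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Read `kubectl get pods --no-headers` on stdin and print pods that are
# neither Completed nor Running with all containers ready (READY n/n).
unhealthy_pods() {
  awk '
    $3 == "Completed" { next }
    $3 != "Running"   { print $1; next }
    { split($2, r, "/"); if (r[1] != r[2]) print $1 }
  '
}

# Example usage on the appliance (commented out so the sketch has no side effects):
# kubectl get nodes --no-headers | not_ready_nodes
# kubectl get pods -n kube-system --no-headers | unhealthy_pods
# kubectl get pods -n prelude --no-headers | unhealthy_pods   # namespace name is an assumption
# systemctl status kubelet
```

Empty output from both functions suggests the nodes and pods have recovered. Because the prelude pods need time to start, it is reasonable to rerun the pod check after the 10-15 minute window.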