This is a known issue affecting VMware Integrated OpenStack 7.x.
Workaround:
- Manually restart the kubelet service on the affected controller node:
#viossh <controller node with problem>
#systemctl restart kubelet
A cron job can be created to check the logs and restart the service automatically.
- Create a script with any desired name and a .sh extension, containing the following:
#!/bin/bash
# Restart kubelet if the known error appears in the last 10 lines of its journal
output=$(journalctl -u kubelet -n 10 | grep "use of closed network connection")
if [[ $? != 0 ]]; then
    echo "Error not found in logs"
elif [[ $output ]]; then
    echo "Restart kubelet"
    systemctl restart kubelet
fi
Note: In the above script, -n indicates the number of journal lines to be retrieved. At the time of the issue there will be many matching entries, which will cause the kubelet service to be restarted. Once the service has been restarted these errors no longer appear in the most recent lines, so we need not worry about kubelet being restarted on every run even though older occurrences of the error remain in the logs.
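As a quick sanity check, the snippet below demonstrates the grep behavior the script depends on, using fabricated sample log lines rather than real journalctl output: grep succeeds (exit status 0) only when the error string is present, and that is what selects the restart branch.

```shell
# Sample lines for illustration only -- not real kubelet journal output
error_line="kubelet[1234]: http: use of closed network connection"
clean_line="kubelet[1234]: Started kubelet"

# grep -q exits 0 on a match, non-zero otherwise
printf '%s\n' "$error_line" | grep -q "use of closed network connection" \
  && echo "match: kubelet would be restarted"
printf '%s\n' "$clean_line" | grep -q "use of closed network connection" \
  || echo "no match: kubelet left running"
```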
- Make the shell script executable:
#chmod +x filename.sh
- Run crontab -l to list any existing cron entries, then create a cron job to run the script every 5 minutes (or less frequently if the customer prefers). To configure it:
#crontab -e
Add the following line:
*/5 * * * * /path_to_script/filename.sh >> /path_for_output_file/xxx.txt 2>&1
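The `>> file 2>&1` suffix appends both the script's standard output and standard error to the log file. A minimal sketch of that redirection, using a temporary file in place of the real log path:

```shell
# Illustrative only: a temp file stands in for /path_for_output_file/xxx.txt
logfile=$(mktemp)

# Emit one line on stdout and one on stderr; both are appended to the file
{ echo "stdout line"; echo "stderr line" >&2; } >> "$logfile" 2>&1

cat "$logfile"
```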
In some instances the cron service will not be installed. Install and enable it with:
#tdnf install -y cronie
#systemctl enable --now crond.service
If the controllers do not have internet access, download the cronie RPM package from the link below and manually transfer it to the controller:
https://packages.vmware.com/photon/3.0/photon_release_3.0_x86_64/x86_64/cronie-1.5.1-1.ph3.x86_64.rpm
To install:
#rpm --install cronie-1.5.1-1.ph3.x86_64.rpm --noscripts
#systemctl start crond