In a Supervisor cluster, one of the nodes is not healthy. On vCenter GUI > Workload Management, you can see the following error:
Config Status
Current Status: Error
Status Messages:
System error occurred on Master node with identifier ################################. Details: Script ['/us/bin/kubectl', '-kubeconfig'. '/etc/kubernetes/admin.conf, 'get', 'node', '################################', '-o'. 'jsonpath=\'(.status.addresses[?(@.type == "InternaliP")].address)\"] failed: Command '['/usr/bin/kubectl', '-kubeconfig'. */etc/kubernetes/admin.conf, 'get', 'node', '################################', -0', jsonpath=\{.status.addresses[[email protected] == "InternaIIP")].address)\"]' returned non-zero exit status 1...
System error occurred on Master node with identifier ################################. Details: Failed to sync changes: Command '['/ust/bin/kubectl, '-kubeconfig', */etc/kubernetes/admin.conf, 'get', 'daemonset, '--namespace', 'vmware-system-logging'. '-o', 'json']' returned non-zero exit status 1. Will be retried...
From Control Plane nodes ssh session, checking the cluster nodes all nodes are ok and healthy, but the error on the vCenter GUI environment is still there.
File /etc/kubernetes/admin.conf
has been changed, or certificates are not correctly updated, or expired. Certificates on this file does not match the correct string to be validated.
grep "certificate-authority-data: " /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates
grep "client-certificate-data: " /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates
kubeadm certs renew all
grep "certificate-authority-data: " /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates
grep "client-certificate-data: " /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates