When running kubectl commands from a jumpbox or a control plane node, the command either returns the correct output or, intermittently, fails with the error: Unable to connect to the server: x509: certificate has expired or is not yet valid
VMware vSphere with Tanzu Guest Cluster
One or more TKG Guest Cluster control plane nodes have expired certificate data in /etc/kubernetes/admin.conf.
When a kubectl command is served by one of these nodes, it fails with the x509 error.
When /etc/kubernetes/admin.conf contains expired certificate data, as in the example below, kubectl cannot access the cluster from that node.
# cat /etc/kubernetes/admin.conf
# echo <certificate-authority-data-from-above> | base64 -d | openssl x509 -noout -dates
notBefore=Nov 22 22:00:00 2022 GMT
notAfter=Nov 19 22:00:00 2024 GMT
and/or
# echo <client-certificate-data-from-above> | base64 -d | openssl x509 -noout -dates
notBefore=Nov 22 22:00:00 2022 GMT
notAfter=Dec 05 10:08:26 2024 GMT
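The manual copy-and-decode check above can also be scripted. The following is a minimal sketch, assuming a Linux node with awk, base64 and openssl available; the function names decode_kubeconfig_field and report_cert are hypothetical helpers introduced here for illustration and are not part of kubectl or kubeadm.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of any Kubernetes tool).

# Print the decoded value of a base64 field from a kubeconfig file.
#   $1 = kubeconfig path, $2 = field name (e.g. client-certificate-data)
decode_kubeconfig_field() {
    awk -v key="$2:" '$1 == key {print $2}' "$1" | base64 -d
}

# Print the validity dates of the certificate stored in the given field and
# report whether it is still valid right now.
#   $1 = kubeconfig path, $2 = field name
report_cert() {
    pem=$(decode_kubeconfig_field "$1" "$2") || return 2
    printf '%s\n' "$pem" | openssl x509 -noout -dates
    # -checkend 0 exits non-zero when the certificate has already expired
    # (it also exits non-zero if the data cannot be parsed as a certificate).
    if printf '%s\n' "$pem" | openssl x509 -noout -checkend 0 >/dev/null 2>&1; then
        echo "$2: still valid"
    else
        echo "$2: EXPIRED or unreadable"
    fi
}
```

For example, after sourcing the functions, report_cert /etc/kubernetes/admin.conf client-certificate-data prints the notBefore/notAfter dates followed by a one-line status.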
Log in to each Guest Cluster control plane node and perform the following steps:
1. Check if the certificates in admin.conf have expired.
# ls -la /etc/kubernetes/admin.conf
# awk '$0 ~ /certificate-authority-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
# awk '$0 ~ /client-certificate-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
2. Once you have confirmed that one or both of those certificates have expired, run the command below to renew the certificates.
# kubeadm certs renew all
3. Confirm those certificates are now renewed.
# ls -la /etc/kubernetes/admin.conf
# awk '$0 ~ /certificate-authority-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
# awk '$0 ~ /client-certificate-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
4. Check that the following containers are running: kube-apiserver, etcd, kube-controller-manager and kube-scheduler.
# crictl ps | awk '$0~/^CONTAINER|kube-api|etcd |kube-controller-manager|kube-scheduler /{print $0}'
5. Stop the kube-apiserver, etcd, kube-controller-manager and kube-scheduler containers.
a. Generate the stop commands.
# crictl ps | awk '$0 ~ /kube-api|etcd |kube-controller-manager|kube-scheduler /{print "crictl stop",$1}'
b. Run the generated "crictl stop" commands.
6. After 15 seconds, check that the kubelet service has restarted them.
# crictl ps | awk '$0~/^CONTAINER|kube-api|etcd |kube-controller-manager|kube-scheduler /{print $0}'
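Steps 4 to 6 can be combined into a small script if preferred. This is only a sketch, assuming crictl is configured on the node exactly as in the steps above; it reuses the same match pattern, and the names PATTERN, control_plane_ids and restart_control_plane_containers are hypothetical. Review it before running it on a production control plane node.

```shell
#!/bin/sh
# Sketch only: variable and function names are illustrative.

# Same pattern as in the steps above; the trailing spaces after "etcd" and
# "kube-scheduler" avoid matching longer names that merely start with them.
PATTERN='kube-api|etcd |kube-controller-manager|kube-scheduler '

# Read `crictl ps` output on stdin and print the IDs of matching containers.
control_plane_ids() {
    awk -v pat="$PATTERN" '$0 ~ pat {print $1}'
}

# Stop each matching container, wait, then list what kubelet has recreated.
restart_control_plane_containers() {
    crictl ps | control_plane_ids | while read -r id; do
        crictl stop "$id"
    done
    # Give kubelet time to recreate the static pod containers.
    sleep 15
    # Print the header line plus the matching containers.
    crictl ps | awk -v pat="^CONTAINER|$PATTERN" '$0 ~ pat {print $0}'
}
```

Sourcing the file only defines the functions; nothing is stopped until restart_control_plane_containers is called explicitly on the node.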