"x509: certificate has expired or is yet to be valid" - kubectl commands giving intermittent errors

Article ID: 383505

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

When running kubectl commands from a jumpbox or a control plane node, the command either returns the expected output or, intermittently, fails with the error: Unable to connect to the server: x509: certificate has expired or is yet to be valid

Environment

VMware vSphere with Tanzu Guest Cluster

Cause

One or more TKG Guest Cluster control plane nodes have expired certificate data in /etc/kubernetes/admin.conf.

When a command is served by one of these nodes, it fails with the x509 error. Commands served by the other nodes succeed, which is why the error is intermittent.

When the /etc/kubernetes/admin.conf file on a node contains expired certificate data, as in the example below, kubectl cannot access the cluster from that node.

# cat /etc/kubernetes/admin.conf

# echo <certificate-authority-data-from-above> | base64 -d | openssl x509 -noout -dates
notBefore=Nov 22 22:00:00 2022 GMT
notAfter=Nov 19 22:00:00 2024 GMT

and/or

# echo <client-certificate-data-from-above> | base64 -d | openssl x509 -noout -dates
notBefore=Nov 22 22:00:00 2022 GMT
notAfter=Dec 05 10:08:26 2024 GMT
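
The manual decode shown above can be wrapped in a small helper. The following is a minimal sketch, not part of the original procedure: the function name check_kubeconfig_certs is hypothetical, and it assumes each certificate is stored inline as a single-line base64 value in the kubeconfig file. It relies on "openssl x509 -checkend 0", which exits nonzero when the certificate has already passed its notAfter date. It does not detect a certificate that is not yet valid.

```shell
# Hypothetical helper: report whether each certificate embedded in a
# kubeconfig file has expired. Assumes inline, single-line base64 values.
check_kubeconfig_certs() {
  conf="${1:-/etc/kubernetes/admin.conf}"
  for field in certificate-authority-data client-certificate-data; do
    # Extract the base64 value that follows "<field>:".
    data=$(awk -v key="$field:" '$1 == key {print $2}' "$conf")
    if [ -z "$data" ]; then
      echo "$field: not found"
      continue
    fi
    # -checkend 0 exits 0 only if the certificate has not expired yet.
    if printf '%s' "$data" | base64 -d | openssl x509 -noout -checkend 0 >/dev/null 2>&1; then
      echo "$field: valid"
    else
      echo "$field: expired or unreadable"
    fi
  done
}
```

Calling check_kubeconfig_certs with no argument on a control plane node reads /etc/kubernetes/admin.conf and prints one status line per certificate field.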

Resolution

Log in to each Guest Cluster control plane node and perform the following steps:

1. Check if the certificates in admin.conf have expired.

# ls -la /etc/kubernetes/admin.conf
# awk '$0 ~ /certificate-authority-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
# awk '$0 ~ /client-certificate-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates

2. If one or both of those certificates have expired, run the following command to renew the certificates:

# kubeadm certs renew all

3. Confirm that the certificates have been renewed.

# ls -la /etc/kubernetes/admin.conf
# awk '$0 ~ /certificate-authority-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates
# awk '$0 ~ /client-certificate-data/ {print $2}' /etc/kubernetes/admin.conf | base64 -d | openssl x509 -noout -dates

4. Check that the following containers are running: kube-api, etcd, kube-controller-manager and kube-scheduler.

# crictl ps | awk '$0~/^CONTAINER|kube-api|etcd |kube-controller-manager|kube-scheduler /{print $0}'

5. Stop the kube-api, etcd, kube-controller-manager and kube-scheduler containers.

a. Generate the stop command

# crictl ps | awk '$0 ~ /kube-api|etcd |kube-controller-manager|kube-scheduler /{print "crictl stop",$1}'

b. Run the generated "crictl stop" commands
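
The generate-then-run approach in step 5 can be sketched as a reusable filter. This is an illustrative sketch only: the function name gen_stop_cmds is hypothetical, and it uses the same awk pattern as the command in step 5a. It reads "crictl ps" output on standard input and only prints the stop commands, so they can be reviewed before anything is stopped.

```shell
# Hypothetical helper: read `crictl ps` output on stdin and print one
# "crictl stop <container-id>" line per matching control plane container.
gen_stop_cmds() {
  # The container ID is the first column of `crictl ps` output.
  awk '$0 ~ /kube-api|etcd |kube-controller-manager|kube-scheduler /{print "crictl stop", $1}'
}

# Example use on a control plane node (not run here):
#   crictl ps | gen_stop_cmds        # review the generated commands
#   crictl ps | gen_stop_cmds | sh   # run them once reviewed
```

Keeping generation and execution as separate invocations preserves the review point that the original two-part step provides.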

6. After 15 seconds, check that the kubelet service has restarted them.

# crictl ps | awk '$0~/^CONTAINER|kube-api|etcd |kube-controller-manager|kube-scheduler /{print $0}'
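
As an alternative to waiting a fixed 15 seconds, the restart check can be polled. The sketch below is an assumption-laden illustration, not part of the original procedure: the function name missing_components and the 60-second limit in the commented loop are hypothetical, and the match strings are copied from the awk pattern used in the steps above. The function reads "crictl ps" output on standard input and prints each component that has no matching line.

```shell
# Hypothetical helper: read `crictl ps` output on stdin and print the match
# string of every control plane component that does not appear in it.
missing_components() {
  ps_output=$(cat)
  # Same match strings as the awk pattern used in the steps above.
  for pat in "kube-api" "etcd " "kube-controller-manager" "kube-scheduler "; do
    case "$ps_output" in
      *"$pat"*) ;;              # at least one line mentions this component
      *) echo "${pat% }" ;;     # strip the trailing space before printing
    esac
  done
}

# Example polling loop (not run here): check every 5 seconds, up to 60 seconds.
#   i=0
#   while [ -n "$(crictl ps | missing_components)" ] && [ "$i" -lt 12 ]; do
#     sleep 5
#     i=$((i + 1))
#   done
```

An empty result means all four components appear in the "crictl ps" output. Any printed name identifies a container that the kubelet has not yet restarted.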