TKGs - TKC cluster antrea-agent Pods fail with "x509: certificate signed by unknown authority" errors

Article ID: 345900

Products

VMware vSphere ESXi, VMware vSphere Kubernetes Service

Issue/Introduction

This article outlines a workaround to recover antrea-agent Pods that fail with certificate errors in a Tanzu Kubernetes Cluster (TKC).

    • The antrea-agent Pods show 1/2 containers running; the antrea-agent container fails its readiness probe.

    • The Pod logs show certificate errors:

      <timestamp>.542780073Z W1110 HH:MM:SS.542694    1 egress_controller.go:777] Failed to start watch for EgressGroup: Get "https://<IP>:443/apis/controlplane.antrea.io/v1beta2/egressgroups?fieldSelector=nodeName%3D<nodename>&watch=true": x509: certificate signed by unknown authority

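The symptoms above can be checked from the command line. The following is a minimal sketch, not part of the original procedure: antrea_agents_not_ready and agent_x509_errors are hypothetical helper names, and the container name antrea-agent and the default kubectl column layout (NAME, READY, ...) are assumptions.

```shell
# Sketch: list antrea-agent Pods whose READY column is not N/N.
# Assumes the default `kubectl get pods` column layout (NAME READY STATUS ...).
antrea_agents_not_ready() {
  kubectl get pods -n kube-system --no-headers \
    | awk '$1 ~ /^antrea-agent-/ { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Sketch: print the x509 errors logged by the antrea-agent container of one Pod.
# The container name "antrea-agent" is an assumption based on the symptom above.
agent_x509_errors() {
  kubectl logs -n kube-system "$1" -c antrea-agent \
    | grep 'x509: certificate signed by unknown authority'
}
```

For example, `for pod in $(antrea_agents_not_ready); do agent_x509_errors "$pod" | tail -n 3; done` would print the most recent matching errors for each affected Pod.
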
Environment

VMware vSphere 7.0 with Tanzu

Cause

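The antrea-agent cannot validate the certificate authority (CA) that signed the certificate presented by the Antrea control plane API, so its watch requests fail with x509 errors.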

Resolution

  • Manually recreate the antrea-ca ConfigMap:

    • Take a backup of the ConfigMap

      $ kubectl get cm -n kube-system antrea-ca -o yaml > antrea-ca.yaml && cat antrea-ca.yaml
       
    • Delete the ConfigMap

      $ kubectl delete cm -n kube-system antrea-ca
       

    • Restart the antrea-controller Pod, which in turn recreates the ConfigMap

      $ kubectl delete pod antrea-controller-<>-<> -n kube-system

    • Confirm that the antrea-controller Pod has restarted successfully and that the antrea-ca ConfigMap has been recreated (both show a recent AGE).

      $ kubectl get cm,pod -n kube-system | grep antrea

      configmap/antrea-agent-tweaker                    1      4d19h
      configmap/antrea-ca                               1      2m6s
      configmap/antrea-cluster-identity                 1      4d19h
      configmap/antrea-config                           3      4d17h
      configmap/antrea-config-########                  3      4d19h
      configmap/antrea-resource-init-config-########    1      4d19h
      pod/antrea-agent-######                 2/2   Running   3 (4d14h ago)   4d17h
      pod/antrea-agent-######                 2/2   Running   8 (4d14h ago)   4d15h
      pod/antrea-controller-########-#####    1/1   Running   0               2m17s

  • Manually recreate the kube-root-ca.crt ConfigMap:

    • Take a backup of the ConfigMap

      $ kubectl get cm -n kube-system kube-root-ca.crt -o yaml > kube-root-ca.crt.yaml && cat kube-root-ca.crt.yaml

    • Delete the ConfigMap. It is then recreated automatically.

      $ kubectl delete cm -n kube-system kube-root-ca.crt

    • Confirm the ConfigMap has been recreated

      $ kubectl get cm -n kube-system kube-root-ca.crt
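
The backup, delete, and confirm steps above follow the same pattern for both ConfigMaps, so they can be sketched as two small shell helpers. This is an illustration under stated assumptions, not part of the original procedure: backup_and_delete_cm and wait_for_cm are hypothetical names, the default namespace kube-system is taken from the commands above, and the 2-second poll interval and 120-second timeout are arbitrary choices. The sketch refuses to delete a ConfigMap when the backup fails or the backup file is empty.

```shell
# Sketch: back up a ConfigMap to <name>.yaml, then delete it.
# Usage: backup_and_delete_cm <name> [namespace]
backup_and_delete_cm() {
  local name="$1"
  local ns="${2:-kube-system}"
  local backup="${name}.yaml"

  # Abort before deleting if the backup command fails or writes nothing.
  kubectl get cm -n "$ns" "$name" -o yaml > "$backup" || return 1
  [ -s "$backup" ] || return 1

  kubectl delete cm -n "$ns" "$name"
}

# Sketch: poll until the ConfigMap exists again or the timeout expires.
# Usage: wait_for_cm <name> [namespace] [timeout_seconds]
wait_for_cm() {
  local name="$1"
  local ns="${2:-kube-system}"
  local timeout="${3:-120}"
  local waited=0

  until kubectl get cm -n "$ns" "$name" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for configmap/$name" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "configmap/$name is present"
}
```

For kube-root-ca.crt, `backup_and_delete_cm kube-root-ca.crt && wait_for_cm kube-root-ca.crt` would cover all three steps. For antrea-ca, the antrea-controller Pod still has to be deleted between the two calls, as described in the first procedure, because the ConfigMap is only recreated when the controller restarts.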