The "pinniped supervisor" system pod is not running after a vSphere Supervisor Cluster upgrade.
The Supervisor Cluster upgrade gets stuck when trying to update the Pinniped supervisor pods and the pinniped-supervisor pod fails to reach a "Ready" state.
The tkg-controller restarts continuously.
The following errors appear in the pinniped-supervisor pod logs:
kubectl logs -n vmware-system-pinniped pinniped-supervisor
{"level":"error","timestamp":"yyyy-mm-ddThh:mm:ss.356276Z","caller":"k8s.io/[email protected]/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"supervisor-serving-cert\"\n"}
{"level":"error","timestamp":"yyyy-mm-ddThh:mm:ss.675830Z","caller":"go.pinniped.dev/internal/controllerlib/controller.go:219$controllerlib.(*controller).handleKey","message":"certs-manager-controller: {vmware-system-pinniped pinniped-supervisor-api-tls-serving-certificate} failed with: could not create secret: Post \"https://##.###.#.1:443/api/v1/namespaces/vmware-system-pinniped/secrets\": middleware request for &kubeclient.request{verb:\"create\", namespace:\"vmware-system-pinniped\", resource:schema.GroupVersionResource{Group:\"\", Version:\"v1\", Resource:\"secrets\"}, reqFuncs:[]func(kubeclient.Object) error{(func(kubeclient.Object) error)(0x1e589a0), (func(kubeclient.Object) error)(0x1dc48e0), (func(kubeclient.Object) error)(0x1dc48e0)}, respFuncs:[]func(kubeclient.Object) error(nil), subresource:\"\"} failed to mutate: request mutation failed: write attempt rejected as client is not leader\n"}
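To check whether a pod's logs show the same two signatures, the output can be filtered for the message strings quoted above. The following is a minimal sketch, not a supported tool: the function name pinniped_log_signatures is hypothetical, and the pod name in the usage comment is a placeholder that must be replaced with the name reported by kubectl get pods.

```shell
# Hypothetical helper: reads log text on stdin and counts the two
# error messages quoted in this article.
pinniped_log_signatures() {
  input="$(cat)"
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps the function usable in scripts that run with "set -e".
  empty_cert=$(printf '%s\n' "$input" | grep -c 'not loading an empty serving certificate' || true)
  not_leader=$(printf '%s\n' "$input" | grep -c 'write attempt rejected as client is not leader' || true)
  echo "empty_serving_cert=${empty_cert} not_leader=${not_leader}"
}

# Usage (the pod name is a placeholder):
#   kubectl get pods -n vmware-system-pinniped
#   kubectl logs -n vmware-system-pinniped <pinniped-supervisor-pod> | pinniped_log_signatures
```

Non-zero counts for both messages are consistent with the symptom described here; they do not by themselves confirm the cause.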
vSphere Kubernetes Service 7.x
vSphere Kubernetes Service 8.x
The Supervisor upgrade stalls during the component upgrade phase. This is often related to the Pinniped upgrade failing (e.g., from v0.13.0 to v0.22.0): as the logs above show, the serving certificate secret cannot be created because the write is rejected while the client does not hold the leader election lock.
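The state described above can be inspected without changing the cluster. The sketch below rests on assumptions: the function name pinniped_inspect and the KUBECTL override are hypothetical, the secret name is taken from the certs-manager-controller log line, and it assumes that leader election is recorded as a Lease object in the same namespace, which this article does not confirm.

```shell
# Read-only inspection; nothing here modifies the cluster.
# KUBECTL may be set to a wrapper or alternate binary; defaults to kubectl.
pinniped_inspect() {
  kc="${KUBECTL:-kubectl}"
  ns="vmware-system-pinniped"
  # Secret named in the certs-manager-controller error message.
  "$kc" get secret pinniped-supervisor-api-tls-serving-certificate -n "$ns" \
    || echo "secret not found or not readable"
  # Assumption: leader election state is stored as a Lease in this namespace.
  "$kc" get lease -n "$ns" -o wide || true
  # Pod readiness and restart counts.
  "$kc" get pods -n "$ns" -o wide || true
}

# Usage:
#   pinniped_inspect
```

The output is intended as supporting information only; it should not be used as a basis for deleting or recreating objects.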
The resolution typically involves advanced troubleshooting and potential manual intervention at the Kubernetes layer to address the missing secret and resolve the leader election lock.
Please engage Broadcom Support to validate the environment and take corrective action to fix the reported issue.