Kubernetes Status Shows System Pod "Pinniped Supervisor" Not Running After Supervisor Cluster Upgrade


Article ID: 345693


Products

  • VMware vSphere ESXi

  • VMware vSphere Kubernetes Service

Issue/Introduction

  • The "pinniped supervisor" system pod is not running after a vSphere Supervisor Cluster upgrade.

  • The Supervisor Cluster upgrade stalls while updating the Pinniped Supervisor pods, and the pinniped-supervisor pod never reaches the "Ready" state.

  • Other critical pods, such as tkg-controller, restart continuously.

The pinniped-supervisor pod logs show the following errors:

kubectl logs -n vmware-system-pinniped pinniped-supervisor

{"level":"error","timestamp":"yyyy-mm-ddThh:mm:ss.356276Z","caller":"k8s.io/[email protected]/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"supervisor-serving-cert\"\n"}

{"level":"error","timestamp":"yyyy-mm-ddThh:mm:ss.675830Z","caller":"go.pinniped.dev/internal/controllerlib/controller.go:219$controllerlib.(*controller).handleKey","message":"certs-manager-controller: {vmware-system-pinniped pinniped-supervisor-api-tls-serving-certificate} failed with: could not create secret: Post \"https://##.###.#.1:443/api/v1/namespaces/vmware-system-pinniped/secrets\": middleware request for &kubeclient.request{verb:\"create\", namespace:\"vmware-system-pinniped\", resource:schema.GroupVersionResource{Group:\"\", Version:\"v1\", Resource:\"secrets\"}, reqFuncs:[]func(kubeclient.Object) error{(func(kubeclient.Object) error)(0x1e589a0), (func(kubeclient.Object) error)(0x1dc48e0), (func(kubeclient.Object) error)(0x1dc48e0)}, respFuncs:[]func(kubeclient.Object) error(nil), subresource:\"\"} failed to mutate: request mutation failed: write attempt rejected as client is not leader\n"}
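The two error signatures above can be searched for mechanically when the log output is long. The following is a minimal Python sketch, not a supported tool: it assumes each log line is a single JSON object with "level" and "message" fields, as in the excerpts above, and the names SIGNATURES, classify_log_line, and summarize are hypothetical.

```python
import json

# Substrings copied from the log excerpts in this article.
SIGNATURES = {
    "empty_serving_cert": "not loading an empty serving certificate",
    "not_leader": "write attempt rejected as client is not leader",
}


def classify_log_line(line):
    """Return the names of the known signatures found in one JSON log line."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return set()  # not a JSON log line; ignore it
    if not isinstance(entry, dict) or entry.get("level") != "error":
        return set()  # only error-level entries are of interest here
    message = str(entry.get("message", ""))
    return {name for name, text in SIGNATURES.items() if text in message}


def summarize(lines):
    """Count how many lines match each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name in classify_log_line(line):
            counts[name] += 1
    return counts
```

To use it, save the output of the kubectl logs command to a file and pass the open file object to summarize(). Non-zero counts for both signatures match the symptoms described in this article.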
 

Environment

  • vSphere Kubernetes Service 7.x

  • vSphere Kubernetes Service 8.x

Cause

The Supervisor upgrade stalls during the component upgrade phase. This is often caused by a failed Pinniped upgrade (for example, from v0.13.0 to v0.22.0), in which a leader election lock blocks creation of the serving certificate secret.
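The "write attempt rejected as client is not leader" message can be easier to reason about with a toy model. The Python sketch below is an illustration only and is not Pinniped's implementation: the classes Lease and LeaderGatedClient are hypothetical, and the model simply assumes that writes are allowed only for the replica currently holding the lock.

```python
class NotLeaderError(Exception):
    """Raised when a replica that does not hold the lock attempts a write."""


class Lease:
    """Toy stand-in for a leader election lock: at most one holder at a time."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, identity):
        # The lock is granted only if nobody holds it (or the caller already does).
        if self.holder is None:
            self.holder = identity
        return self.holder == identity

    def release(self, identity):
        if self.holder == identity:
            self.holder = None


class LeaderGatedClient:
    """Allows reads, but rejects writes unless this replica holds the lease."""

    def __init__(self, identity, lease, store):
        self.identity = identity
        self.lease = lease
        self.store = store  # dict standing in for the secrets in the namespace

    def get_secret(self, name):
        return self.store.get(name)

    def create_secret(self, name, data):
        if self.lease.holder != self.identity:
            raise NotLeaderError("write attempt rejected as client is not leader")
        self.store[name] = data
```

In this model, a replica that cannot acquire the lock never creates the secret, so anything waiting on that secret (such as a serving certificate) stays empty until the lock is released or reacquired.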

 

Resolution

The resolution typically involves advanced troubleshooting and potential manual intervention at the Kubernetes layer to address the missing secret and resolve the leader election lock.

Engage Broadcom Support to validate the environment and take corrective action.
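Before opening a support case, it can help to capture the current state of the namespace. The Python sketch below only builds and runs read-only kubectl get commands and is not a fix procedure. The namespace comes from the kubectl logs command in this article. That leader election state is visible as Lease objects in that namespace is an assumption, and the function names diagnostic_commands and collect are hypothetical.

```python
import subprocess

NAMESPACE = "vmware-system-pinniped"  # namespace used in this article's kubectl command


def diagnostic_commands(namespace=NAMESPACE):
    """Build read-only kubectl commands; none of them modifies the cluster."""
    return [
        ["kubectl", "get", "pods", "-n", namespace, "-o", "wide"],
        ["kubectl", "get", "secrets", "-n", namespace],
        # Assumption: leader election state is kept in Lease objects in this namespace.
        ["kubectl", "get", "leases", "-n", namespace],
        ["kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp"],
    ]


def collect(commands, run=subprocess.run):
    """Run each command and return a dict of command string to combined output."""
    results = {}
    for argv in commands:
        proc = run(argv, capture_output=True, text=True, check=False)
        results[" ".join(argv)] = proc.stdout + proc.stderr
    return results
```

Calling collect(diagnostic_commands()) from a session that already has kubectl access to the Supervisor returns the outputs keyed by command, which can be attached to the support request.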
