VKS cluster upgrade may report failed state due to spherelet upgrade failure on node

Article ID: 416665

Updated On:

Products

Tanzu Kubernetes Runtime

Issue/Introduction

  • The output of "kubectl get nodes" from the Supervisor environment shows that control plane nodes are upgraded, while the ESXi worker nodes are not.

    kubectl get nodes
    NAME                        STATUS   ROLES                  AGE     VERSION
    <control plane node1>       Ready    control-plane,master   46h     v1.29.7+vmware.wcp.1
    <control plane node2>       Ready    control-plane,master   46h     v1.29.7+vmware.wcp.1
    <control plane node3>       Ready    control-plane,master   47h     v1.29.7+vmware.wcp.1
    <Workernode1>               Ready    agent                  374d    v1.28.2-sph-5111a65
    <Workernode2>               Ready    agent                  374d    v1.28.2-sph-5111a65
    <Workernode3>               Ready    agent                  374d    v1.28.2-sph-5111a65
    <Workernode4>               Ready    agent                  374d    v1.28.2-sph-5111a65

  • In the vSphere Client, under the cluster's Updates tab, the compliance status of the ESXi worker nodes shows as 'Non-Compliant' with the cluster image.
  • The vCenter Update Manager log indicates that the hosts are not compliant with the cluster image, which prevents the Spherelet VIB upgrade.

    /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log

    YYYY-MM-DD:T:HH:MM:SS  error vmware-vum-server[######] [Originator@### sub=ClusterApplySolutionTask] [Task, ###] Task:com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask ID:#######-###-###-###-#########. Task Failed. Error: Error:
    -->    com.vmware.vapi.std.errors.error
    --> Messages:
    -->    com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask.IncompatibleHosts<Solution specification in the image are incompatible with hosts

    YYYY-MM-DD:T:HH:MM:SS warning vmware-vum-server[######] [Originator@#### sub=TaskStatsCollector] [taskStatsCollector ###] Task type or creation time not present
    YYYY-MM-DD:T:HH:MM:SS info vmware-vum-server[#####] [Originator@#### sub=PM.AsyncTask.ClusterApplySolutionTask{####}] [vciTaskBase ####] SerializeToVimFault fault:
    --> (vmodl.fault.SystemError) {
    -->    faultCause = (vmodl.MethodFault) null,
    -->    faultMessage = (vmodl.LocalizableMessage) [
    -->       (vmodl.LocalizableMessage) {
    -->          key = "com.vmware.vcIntegrity.lifecycle.ClusterApplySolutionTask.IncompatibleHosts",
    -->          arg = (vmodl.KeyAnyValue) [
    -->             (vmodl.KeyAnyValue) {
    -->                key = "1",
    -->                value = "<workernode1 >,<workernode2 >,<workernode3 >,<workernode4 >,<workernode5 >,<workernode6 > "

  • Identify the conflicting VIB by searching the same log with the following command.

    grep "are not supported" /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log
    Example output:
    default_message": "Downgrades of Components Host Based Replication Agent for ESX(HBR component for ESX - version 9.0.0 build 24556354) in Solution VMware-HBR-Agent are not supported
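
The two checks above can be condensed with small shell helpers. This is a minimal sketch, not part of the original procedure: the function names `list_agent_versions` and `conflicting_solutions` are hypothetical, the column positions assume the default `kubectl get nodes` table layout, and the log pattern is inferred from the single example message shown above, so other message variants may not match.

```shell
# Print "<name> <version>" for every node whose ROLES column is "agent"
# (the ESXi worker nodes). Reads `kubectl get nodes` output on stdin.
# Assumption: default column order NAME STATUS ROLES AGE VERSION.
list_agent_versions() {
  awk 'NR > 1 && $3 == "agent" { print $1, $5 }'
}

# Print the unique Solution names from "... in Solution <name> are not
# supported" messages. Reads the files given as arguments, or stdin if
# none are given. Assumption: the message format matches the example
# output in this article.
conflicting_solutions() {
  sed -n 's/.*in Solution \([^ ]*\) are not supported.*/\1/p' "$@" | sort -u
}

# Usage (hypothetical session):
#   kubectl get nodes | list_agent_versions
#   conflicting_solutions /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log
```

If the worker nodes all report an older version than the control plane nodes and the second helper prints a Solution name, the symptoms match this article.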

Environment

vSphere Supervisor 8.0

Cause

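The ESXi hosts (worker nodes) are non-compliant with the cluster image because a VIB installed on the hosts conflicts with it (in the example above, an unsupported component downgrade). This non-compliance prevents the Spherelet upgrade.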

Resolution

To resolve the issue:

  • Manually remove the VIB that is causing the host to be non-compliant.

    • Place the host in Maintenance Mode.
    • Enable SSH service and log in to the host using SSH.
    • List the complete VIB details:

      esxcli software vib list | grep -i <VIB from the command output>
      Sample:
      esxcli software vib list | grep hbr
      vmware-hbr-agent               9.0.0-0.24556354                      VMware  VMwareCertified   YYYY-MM-DD    host

    • Validate whether the VIB is compatible or in use.
    • After validation, uninstall the VIB:

      esxcli software vib remove --vibname=<VIB>
      Sample:
      esxcli software vib remove --vibname=vmware-hbr-agent


  • After uninstalling the VIB, run a compliance check to verify that the host is now compliant.
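
Before running the compliance check, it can be useful to confirm on the host that the VIB is actually gone. The following is a minimal sketch under stated assumptions: `vib_installed` is a hypothetical helper, not an esxcli feature, and it assumes the VIB name is the first column of the `esxcli software vib list` output, as in the sample above.

```shell
# Exit 0 if the VIB named in $1 appears in `esxcli software vib list`
# output supplied on stdin, exit 1 otherwise. The comparison is
# case-insensitive and matches the whole first column, so a partial
# name such as "hbr" does not match "vmware-hbr-agent".
vib_installed() {
  awk -v name="$1" '
    tolower($1) == tolower(name) { found = 1 }
    END { exit !found }
  '
}

# Usage on the ESXi host (hypothetical session):
#   if esxcli software vib list | vib_installed vmware-hbr-agent; then
#     echo "VIB is still installed"
#   else
#     echo "VIB has been removed"
#   fi
```

An exact match on the first column avoids false positives from `grep`, which would also match other VIBs that merely contain the same substring.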

Additional Information

To remove VIBs from an ESXi host, refer to "Instructions for uninstalling third-party VIBs from an ESXi host".