vSphere Tanzu Guest Cluster Nodes are getting automatically recreated continuously

Article ID: 415939

Products

VMware vSphere ESXi, VMware vSphere Kubernetes Service

Issue/Introduction

  • Some of the guest cluster nodes are recreated repeatedly. To verify this, run the following command on the Supervisor and compare the age of each node VM. Recreated nodes are much younger than the rest of the cluster.

    kubectl get vm -n <Namespace> | grep <Guest cluster name>
    ########-Workernode1            PoweredOn     91d
    ########-Workernode2            PoweredOn     16d
    ########-Workernode3            PoweredOn     91d
    ########-Workernode4            PoweredOn     2d12h
    ########-Workernode5            PoweredOn     2d12h
    ########-Controlplane node1     PoweredOn     2d12h
    ########-Controlplane node2     PoweredOn     3d11h
    ########-Controlplane node3     PoweredOn     91d
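
On a large cluster it can be easier to filter the listing by age than to read it by eye. The following is a minimal sketch, not an official tool: `recent_vms` is a hypothetical helper name, and it assumes the age is the last column and uses the usual kubectl duration format (for example `91d` or `2d12h`).

```shell
#!/usr/bin/env bash
# recent_vms: hypothetical helper (not part of kubectl or vSphere).
# Reads `kubectl get vm` style rows on stdin and prints the name and age
# of every VM younger than the given number of days (default: 7).
recent_vms() {
  local max_days="${1:-7}"
  awk -v max_hours="$((max_days * 24))" '
    # Convert an age such as "2d12h" or "91d" into hours.
    function age_hours(age,    total, num, unit) {
      total = 0
      while (match(age, /^[0-9]+[ydhms]/)) {
        num  = substr(age, 1, RLENGTH - 1) + 0
        unit = substr(age, RLENGTH, 1)
        if      (unit == "y") total += num * 8760
        else if (unit == "d") total += num * 24
        else if (unit == "h") total += num
        else if (unit == "m") total += num / 60
        else if (unit == "s") total += num / 3600
        age = substr(age, RLENGTH + 1)
      }
      return total
    }
    # Skip header or malformed rows: the last field must look like an age.
    $NF ~ /^([0-9]+[ydhms])+$/ && age_hours($NF) < max_hours { print $1, $NF }
  '
}
```

Example usage, assuming the same placeholders as above:

    kubectl get vm -n <Namespace> | grep <Guest cluster name> | recent_vms 7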


  • The capi-controller-manager logs on the Supervisor show that the node's Ready condition is reported as Unknown and that the machine is marked for remediation.

    /var/log/pods/capi-controller-manager
    
    YYYY-MM-DD:T:HH:MM:SS stderr F I1017 YYYY-MM-DD:T:HH:MM:SS  1 machinehealthcheck_controller.go:506] "Target has failed health check, marking for remediation" controller="machinehealthcheck" controllerGroup="cluster.x-k8s.io" controllerKind="MachineHealthCheck" MachineHealthCheck="<Namespace>/<Nodename>" namespace="<Namespace>" name="<Nodename>" reconcileID="########-####-####-####-############" Cluster="###-###/###-###-###" target="###-###/###-###-###-######-##-#####/###-###-###-######-##-#####-###############-#####/###-###-###-######-##-#####-###############-#####" reason="UnhealthyNode" message="Condition Ready on node is reporting status Unknown for more than 5m0s"
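
Because these log lines are long, it may help to reduce them to the MachineHealthCheck name and the failure message. The following is a minimal sketch that assumes the log format shown above; `mhc_failures` is a hypothetical helper name, and the log file path is left as a placeholder.

```shell
#!/usr/bin/env bash
# mhc_failures: hypothetical helper, assuming the key="value" log format
# shown in this article. Reads capi-controller-manager log lines on stdin
# and prints "<MachineHealthCheck>: <message>" for each failed health check.
mhc_failures() {
  grep 'Target has failed health check' |
    sed -n 's/.*MachineHealthCheck="\([^"]*\)".*message="\([^"]*\)".*/\1: \2/p'
}
```

Example usage:

    cat <capi-controller-manager log file> | mhc_failures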
    

     

  • The /var/run/log/vobd.log file on the ESXi host where the affected node was registered shows the following duplicate IP address errors:

    YYYY-MM-DD:T:HH:MM:SS In(14) vobd[2097669]:  [netCorrelator] ##############us: [vob.net.vmknic.ip.duplicate] A duplicate IP address was detected for ##.##.##.## on interface vmk#. The current owner is <MAC address of ESXi>
    YYYY-MM-DD:T:HH:MM:SS In(14) vobd[2097669]:  [netCorrelator] ##############us: [esx.problem.net.vmknic.ip.duplicate] Duplicate IP address detected for ##.##.##.## on interface vmk#, current owner being <MAC address of ESXi>.
    YYYY-MM-DD:T:HH:MM:SS In(14) vobd[2097669]:  [netCorrelator] ##############us: [vob.net.vmknic.ip.duplicate] A duplicate IP address was detected for ##.##.##.## on interface vmk#. The current owner is <MAC address of ESXi>.
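
To summarize which addresses and interfaces are affected, the duplicate IP events can be reduced to unique IP and interface pairs. The following is a minimal sketch that assumes the message wording shown above; `duplicate_ip_events` is a hypothetical helper name.

```shell
#!/bin/sh
# duplicate_ip_events: hypothetical helper, assuming the vobd.log wording
# shown in this article. Reads vobd.log lines on stdin and prints each
# unique "<IP address> <vmk interface>" pair reported as a duplicate.
duplicate_ip_events() {
  grep 'vmknic.ip.duplicate' |
    sed -n 's/.*detected for \([^ ]*\) on interface \(vmk[^ ,.]*\).*/\1 \2/p' |
    sort -u
}
```

Example usage on the ESXi host:

    duplicate_ip_events < /var/run/log/vobd.log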
    


    For more information, see "Duplicate IP address detected".

Environment

  • vSphere 8.0.
  • VMware vSphere Kubernetes Service.

Cause

  • An IP conflict on the ESXi host's management NIC makes the Supervisor worker node (the ESXi host) unhealthy. Guest cluster nodes on that host then report a Ready status of Unknown for more than 5 minutes, fail the MachineHealthCheck, and are recreated.

Resolution

Resolve the IP conflict on the ESXi host, then confirm that all Supervisor worker nodes (ESXi hosts) are in the Ready state.
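
One way to confirm the node state is to list the Supervisor nodes and print only those that are not Ready. The following is a minimal sketch that assumes the standard `kubectl get nodes` column layout (NAME, STATUS, ...); `not_ready_nodes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# not_ready_nodes: hypothetical helper. Reads `kubectl get nodes` output
# on stdin and prints the name and status of every node whose status does
# not start with "Ready". A status such as "Ready,SchedulingDisabled" is
# treated as Ready here.
not_ready_nodes() {
  awk 'NR > 1 { split($2, s, ","); if (s[1] != "Ready") print $1, $2 }'
}
```

Example usage on the Supervisor:

    kubectl get nodes | not_ready_nodes

No output means that every listed node reports Ready.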