Cluster Diagnostics report NodePolicy and NodeProfileStatus as failed.
Article ID: 382404

Updated On:

Products

VMware Telco Cloud Automation

Issue/Introduction

  • TCA cluster diagnostics report the errors below when diagnostics are run:

             NodePolicy Diagnosis                                                  | FAIL |
             Got error messages:

    • <NPXX> doesn't work as expected
    • <NPXX> doesn't work as expected
      ------------------------------------------------------------------------------
      NodePolicyMachineStatus Diagnosis                                             | FAIL |
      Got error messages:
    • <Machine name> doesn't work as expected
    • <Machine name> doesn't work as expected
    • <Machine name> doesn't work as expected

 

  • The affected VMs report as "Configuring" under the node pool configuration.

Environment

VMware Telco Cloud Automation 3.1 or earlier.

Cause

NodeProfileStatus is not stable when the creation timestamps of different NodeConfig CRs are exactly the same; the NodeProfile status keeps flipping between "Running" and "Normal".
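To confirm the timestamp collision, you can list the NodeConfig CRs alongside their creation timestamps. This is an illustrative check only (the custom-columns invocation is not part of the official workaround) and it must be run against the affected cluster:

```shell
# List NodeConfig CRs with their creation timestamps; two CRs that share
# the exact same CREATED value can trigger the flapping NodeProfile status.
kubectl get nodeconfig -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,CREATED:.metadata.creationTimestamp'
```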

Resolution

The issue is fixed in TCA 3.2.

Run the commands below to work around the issue:

  • kubectl get nodeconfig -A | grep -v <Nodepool name>
  • kubectl get nodeconfig -n tca-system <kubeconfig-nodeprofile-xx> -o yaml > kube.yaml
  • kubectl get nodeconfig -n tca-system <addon-np-airgap-xxx> -o yaml > airgap.yaml
  • kubectl delete -f kube.yaml
  • kubectl delete -f airgap.yaml
  • kubectl apply -f kube.yaml
  • kubectl apply -f airgap.yaml
  • watch "kubectl get nps -A | grep -v Normal"

Deleting and re-applying the CRs gives them fresh, distinct creation timestamps. Watch the output of the last command until all node policies report "Normal".
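The steps above can be collected into a single script. This is a sketch only: the CR names below are placeholders taken from the commands above, and you must substitute the actual NodeConfig names from your cluster before running it.

```shell
#!/usr/bin/env bash
# Sketch of the workaround: back up the two NodeConfig CRs, then delete and
# re-apply them so the API server assigns fresh, distinct creation timestamps.
# KUBE_CR and AIRGAP_CR are placeholders -- substitute your real CR names.
set -euo pipefail

NS=tca-system
KUBE_CR="<kubeconfig-nodeprofile-xx>"
AIRGAP_CR="<addon-np-airgap-xxx>"

# Back up both CRs before touching them.
kubectl get nodeconfig -n "$NS" "$KUBE_CR"   -o yaml > kube.yaml
kubectl get nodeconfig -n "$NS" "$AIRGAP_CR" -o yaml > airgap.yaml

# Delete and re-apply to reset the creation timestamps.
kubectl delete -f kube.yaml
kubectl delete -f airgap.yaml
kubectl apply  -f kube.yaml
kubectl apply  -f airgap.yaml

# Wait until every node policy reports Normal (the output goes empty).
watch 'kubectl get nps -A | grep -v Normal'
```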