Error: Unable to create k8s cluster with error management cluster[xxx] is not in running state



Article ID: 321872


Updated On:

Products

VMware Telco Cloud Automation

Issue/Introduction

Kubernetes cluster creation fails. The TCA-CP log (app.log) shows the following error:

 

2021-04-21 14:59:44.126 UTC [InfraAutomationService_SvcThread-4, Ent: HybridityAdmin, Usr: [email protected], , TxId: ########-####-####-####-############] ERROR c.v.h.s.i.k8s.CreateK8sClusterJob- Unable to create k8s cluster with error management cluster [Management-cluster-name] is not in running status
java.lang.RuntimeException: management cluster [Management-cluster-name] is not in running status
        at com.vmware.hybridity.bootstrapper.adapter.BootStrapperAdapter.createWorkloadCluster(BootStrapperAdapter.java:497)
        at com.vmware.hybridity.service.infraautomation.k8s.CreateK8sClusterJob.doBegin(CreateK8sClusterJob.java:127)
        at com.vmware.hybridity.service.infraautomation.k8s.CreateK8sClusterJob.run(CreateK8sClusterJob.java:55)
        at com.vmware.vchs.hybridity.messaging.LoggingJobWrapper.run(LoggingJobWrapper.java:41)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

The bootstrapperd.log shows the following:
Apr 21 14:59:44 apiserverd[426] : [Err-ApiServer] : management cluster [Management-cluster-name] is not in running status
Apr 21 14:59:44 apiserverd[426] : [Err-ApiServer] : checking the object status before creating workload cluster[Cluster-Name: ########-####-####-####-############], err is [management cluster [Management-cluster-name] is not in running status]
Apr 21 14:59:44 apiserverd[426] : [Err-ApiServer] : HTTP 400 - bad_request - management cluster [Management-cluster-name] is not in running status
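To confirm you are hitting this specific failure, the error signature can be checked against app.log. A minimal sketch follows; the sample line is inlined here for illustration, and in practice you would point grep at your TCA-CP app.log file (its path varies by appliance layout):

```shell
# Count occurrences of the failure signature. The sample line below stands in
# for a real app.log entry; replace the pipeline's input with the actual file.
sample='ERROR c.v.h.s.i.k8s.CreateK8sClusterJob- Unable to create k8s cluster with error management cluster [Management-cluster-name] is not in running status'
printf '%s\n' "$sample" | grep -c 'is not in running status'
```

A non-zero count indicates the cluster creation was rejected because the management cluster is not in the Running state.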




Environment

VMware Telco Cloud Automation 2.x

Cause

The issue occurs because the nodeconfig-operator pod in the management cluster is not in a healthy state.

Resolution

Step A: Verify the management cluster status

  1. SSH into the TCA-CP appliance.
  2. Run the command "ccli" to enter its interactive shell.
  3. Run "list mc" to get the index of the management cluster Management-Cluster-Name.
  4. Run "go index_of_management_cluster" to switch context to that cluster.
  5. Run "show status" to display the management cluster status.

The status should show failed.
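The ccli session above can be sketched as the following transcript (the index 0 is a placeholder; substitute the index reported by "list mc" for your management cluster):

```
tca-cp$ ccli            # enter the interactive CCLI shell
ccli> list mc           # list management clusters with their indexes
ccli> go 0              # replace 0 with the index of your management cluster
ccli> show status       # the status is expected to show failed
```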

Step B: Recover the nodeconfig-operator pod and reset the cluster status

  1. Check the pods in the management cluster by running kubectl get pods.
  2. Identify the pod named nodeconfig-operator, which will be in an error state.
  3. Take a snapshot of the management cluster node VMs, then delete the pod so that it is recreated.
  4. On the TCA-CP appliance, run kbsctl debug set-cluster-status -i UUID-of-Mgmt-Cluster -s Running to set the management cluster status back to Running.
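The commands in Step B can be sketched as follows. This is an illustrative sequence run against a live management cluster, not verbatim vendor output; the pod's namespace and exact name, and the management cluster UUID, are placeholders you must substitute from your own environment:

```
# On the management cluster: locate the unhealthy operator pod
# (-A searches all namespaces in case it is not in the default one)
kubectl get pods -A | grep nodeconfig-operator

# After snapshotting the node VMs, delete the pod; its controller recreates it
kubectl delete pod <nodeconfig-operator-pod-name> -n <namespace>

# Back on the TCA-CP appliance: reset the recorded cluster status
kbsctl debug set-cluster-status -i <UUID-of-Mgmt-Cluster> -s Running
```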