Diagnosis Controlplane Service stuck in STARTING state
search cancel

Diagnosis Controlplane Service stuck in STARTING state

book

Article ID: 325376

calendar_today

Updated On:

Products

VMware Telco Cloud Automation

Issue/Introduction

  • TCA Diagnosis API Service (nfv-tca-diagnosis-api) required to be in running state to perform diagnosis of TKG cluster.
  • As below services are not being used by TCA
    • Diagnosis Control Plane Service (tca-diagnosis-controlplane)
    • Esxi Diagnosis Plugin Service (tca-diagnosis-esxi-plugin)
    • Node Diagnosis Plugin Service (tca-diagnosis-node-plugin)
    • Diagnosis Controlplane service  (tca-diagnosis-controlplane) within TCA is stuck at STARTING state when logged in to TCA URL:944
  • Performing a Diagnosis on TKG cluster works from Caas Infrastructure, the status is pass (all green)
  • [admin@tca-cp ~]$ systemctl status tca-diagnosis-controlplane
    tca-diagnosis-controlplane.service - TCA Diagnosis ControlPlane Service

Loaded: loaded (/etc/systemd/system/tca-diagnosis-controlplane.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2024-02-28 07:43:32 UTC; 3s ago
Process: 72183 ExecStart=/bin/sh -a -c POSTGRESQL_PASSWORD=$(cat /common/pgsql/passwords/tca_admin);/opt/vmware/tca-diagnosis-controlplane/tca-diagnosis-controlplane (code=exited, status=1/FAILURE)

Feb 28 07:51:03 tca-cp sh[74660]: time="28-02-2024 07:51:03.754 UTC" level=error msg="unable to bootstrap kafka topics or failed to connect with kafka brokers,diagnosis control plane will not able to communicate with plugins" error=>
Feb 28 07:51:03 tca-cp sh[74660]: time="28-02-2024 07:51:03.754 UTC" level=error msg="failed to bootstrap tca-cp diagnosis controlplane" error="failed to dial: failed to open connection to [::ffff:127.0.0.1:9092]:9092: dial tcp: loo>


Environment

2.3

Cause

  • Issue is due to kafka IPv6 [::ffff:127.0.0.1:9092]:9092 connectivity.
  • Impacts clusters with IPv6 enabled and will have no impact on TCA functionality as the disabled services are not in use.

Resolution

Resolved in TCA 3.0.

Workaround:

  • Perform a disable operation such that the service never starts:
    systemctl disable tca-diagnosis-controlplane