NSX-T Cloud Stuck in “In Progress” State Due to Cloud Connector Crash

Article ID: 431584

Products

VMware Avi Load Balancer

Issue/Introduction

  • In the VMware Avi Load Balancer UI, the NSX-T Data Center Cloud may remain stuck in the “In Progress” state under Infrastructure --> Clouds.

  • On the Controller leader node, the journal logs for cloudconnectorgo.service show the error “signal SIGSEGV: segmentation violation code=0x1”. To view them, run:

         journalctl -u cloudconnectorgo.service

Feb 25 00:59:29  cloudconnectorgo-service[232755]: Wed Feb 25 12:59:29 AM /run/systemd/system/env/cloudconnectorgo.service.env found, sourced. action: start source: local_process.py:_start
Feb 25 00:59:29 cloudconnectorgo-service[232756]: Wed Feb 25 12:59:29 AM UTC  /run/systemd/system/env/cloudconnectorgo.service.env found, sourced. action: post_start source: local_process.py:_start
Feb 25 00:59:29  cloudconnectorgo-service[232755]: init global config instance failed. If you do not use this, just ignore it.  open conf/app.conf: no such file or directory
Feb 25 00:59:29  cloudconnectorgo-service[232755]: WARNING: proto: file "auth.proto" is already registered
Feb 25 00:59:29  cloudconnectorgo-service[232755]: See https://protobuf.dev/reference/go/faq#namespace-conflict
Feb 25 00:59:30  systemd[1]: Started cloudconnectorgo.service - Cloud connector Go service.
Feb 25 01:00:17  cloudconnectorgo-service[232755]: panic: runtime error: invalid memory address or nil pointer dereference
Feb 25 01:00:17  cloudconnectorgo-service[232755]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x696bd39]
Feb 25 01:00:17  cloudconnectorgo-service[232755]: goroutine 84 [running]:

 

 

  • The cloudconnectorgo.service is in a failed state on the Controller leader node. To verify this, run the following command:

         systemctl status cloudconnectorgo.service
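
The two checks above can be combined into a small script. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes it is run in a shell on the Controller leader node with permission to read the journal.

```shell
#!/bin/sh
# Hypothetical helper, not part of the product: reads journal text on stdin
# and prints how many lines carry the SIGSEGV signature quoted above.
count_cloudconnector_panics() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'signal SIGSEGV: segmentation violation' || true
}

# Hypothetical helper: prints the unit state reported by systemd
# (for example "active" or "failed"). It only runs when called.
cloudconnector_state() {
    systemctl is-active cloudconnectorgo.service || true
}

# Example usage on the Controller leader node:
#   journalctl -u cloudconnectorgo.service --no-pager | count_cloudconnector_panics
#   cloudconnector_state
```

A non-zero panic count together with a "failed" state matches the symptoms described in this article.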

 

 

Environment

  • Avi Load Balancer
  • NSX-T Cloud
  • Affected Versions: All

Cause

  • The issue occurs when the data network configuration of the cloud is deleted via the CLI while Virtual Services still exist on that cloud.
  • During route programming, the cloud connector attempts to process routes. Because the data network configuration is empty, the cloud connector service crashes with the nil pointer dereference shown in the journal logs, which leaves the cloud stuck in “In Progress.”

Resolution

Workaround:

Follow the steps below to remediate the issue:

  1. Disable all Virtual Services under the affected cloud. In the Avi UI, navigate to Applications, select the required Virtual Services, and click Disable.

  2. Reconfigure the Data Network for the cloud via the CLI, replacing <name>, <tier1-lr-id>, and <segment-id> with the values for your environment:

[admin:controller]: > configure cloud <name>
[admin:controller]: cloud> nsxt_configuration
[admin:controller]: cloud:nsxt_configuration> data_network_config
[admin:controller]: cloud:nsxt_configuration:data_network_config> tier1_segment_config
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config> manual
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config:manual> tier1_lrs
New object being created
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config:manual:tier1_lrs> tier1_lr_id <tier1-lr-id>
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config:manual:tier1_lrs> segment_id <segment-id>
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config:manual:tier1_lrs> save
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config:manual> save
[admin:controller]: cloud:nsxt_configuration:data_network_config:tier1_segment_config> save
[admin:controller]: cloud:nsxt_configuration:data_network_config> save
[admin:controller]: cloud:nsxt_configuration> save
[admin:controller]: cloud> save

  3. Once the configuration is saved successfully, re-enable all Virtual Services.
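
After the workaround, it can be useful to confirm that cloudconnectorgo.service is running again before relying on the cloud status in the UI. The sketch below is a hypothetical polling helper, not an official procedure. This article does not state how long recovery takes or whether the service restarts on its own, so the attempt count and delay in the usage example are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: runs the given command up to <attempts> times,
# sleeping <delay> seconds between tries, until it prints "active".
# Prints the last observed state; returns 0 only if that state is "active".
wait_until_active() {
    attempts=$1
    delay=$2
    shift 2
    state=""
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The state command may exit non-zero (systemctl is-active does so
        # for any state other than active), so mask the exit status.
        state=$("$@" 2>/dev/null || true)
        if [ "$state" = "active" ]; then
            echo "active"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "${state:-unknown}"
    return 1
}

# Example usage on the Controller leader node (30 tries, 10 seconds apart):
#   wait_until_active 30 10 systemctl is-active cloudconnectorgo.service
```

If the helper never reports "active", check the journal logs again for the SIGSEGV panic described under Issue/Introduction.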

 

 

Fix:

A defensive validation check will be introduced in a future release to prevent the cloud connector service from crashing.