VCF Operations deployment fails with Error: LCMVROPSSYSTEM29002 | Operations collector group operation failure



Article ID: 418548


Updated On:

Products

VCF Operations

Issue/Introduction

The VCF stack deployment stalls because the VCF Operations deployment stage fails with the following error:

"messageId": "LCMVROPSSYSTEM29002",
"message": "Operations collector group operation failure",
"exceptionMessage": "Wait timeout, Cloud Proxy node didn't join the Operations cluster, retry after sometime."
 

Environment

VCF Operations 9.x

Cause

When the HAProxy service starts, it attempts to resolve every FQDN specified in its configuration. In this case, it attempts to resolve the following hostnames:

  • api-devlvn.broadcom.net
  • eapi-gcpstg.broadcom.com
  • eapi.broadcom.com

Normally, the DNS server quickly returns an NXDOMAIN response for these names, even with recursive DNS enabled. In this situation, the DNS queries timed out instead.

A timeout points to a problem with the DNS infrastructure, the network firewall configuration, or overall DNS responsiveness, rather than with recursion itself.
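To check whether lookups of these names fail quickly or hang, each lookup can be timed. The following is a minimal sketch, not a supported tool: it assumes Python 3 is available on the machine where it runs, and the 2-second threshold (SLOW_SECONDS) and the function name probe are illustrative choices, not values taken from the product.

```python
import socket
import sys
import time

# Hypothetical threshold: a lookup slower than this is treated as a timeout symptom.
SLOW_SECONDS = 2.0


def probe(host, resolver=socket.getaddrinfo, clock=time.monotonic):
    """Resolve one hostname and report the outcome and the elapsed time."""
    start = clock()
    try:
        resolver(host, 443)
        outcome = "resolved"
    except socket.gaierror as exc:
        # A fast failure is the healthy case for a name that does not exist.
        outcome = f"failed ({exc})"
    elapsed = clock() - start
    verdict = "slow" if elapsed >= SLOW_SECONDS else "fast"
    return host, outcome, verdict, elapsed


if __name__ == "__main__":
    # Hostnames are passed as command-line arguments.
    for name in sys.argv[1:]:
        host, outcome, verdict, elapsed = probe(name)
        print(f"{host}: {outcome}, {verdict} ({elapsed:.2f}s)")
```

Running the script with the three hostnames as arguments prints one line per name. A "failed" result reported as "fast" matches the expected NXDOMAIN behavior, while a "slow" result suggests the lookup is waiting on the resolver timeout.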

Resolution

Engineering is aware of this issue, and it will be resolved in an upcoming release.

Workaround:

As a temporary workaround to unblock the VCF deployment, add the following entries to the end of the /etc/hosts file on the Cloud Proxy, replacing <GW IP Addr> with the IP address of the default gateway:

<GW IP Addr> api-devlvn.broadcom.net 
<GW IP Addr> eapi-gcpstg.broadcom.com
<GW IP Addr> eapi.broadcom.com
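
If the entries need to be added on several Cloud Proxy nodes, the edit can be scripted. The sketch below is one possible approach under stated assumptions: Python 3 is present on the appliance, the script is run with sufficient privileges to write the file, and the function name append_entries is hypothetical. It appends only the hostnames that are not already mapped, so running it twice does not duplicate lines. Taking a copy of /etc/hosts before editing is a sensible precaution.

```python
import ipaddress
import sys

# The three hostnames listed in this article.
HOSTNAMES = (
    "api-devlvn.broadcom.net",
    "eapi-gcpstg.broadcom.com",
    "eapi.broadcom.com",
)


def append_entries(hosts_path, gateway_ip):
    """Append one '<ip> <hostname>' line per hostname not already in the file."""
    ipaddress.ip_address(gateway_ip)  # raises ValueError on a malformed address
    with open(hosts_path, "r", encoding="utf-8") as handle:
        existing = handle.read()
    # Collect names that are already mapped, ignoring comments.
    present = set()
    for line in existing.splitlines():
        fields = line.split("#", 1)[0].split()
        present.update(fields[1:])
    missing = [name for name in HOSTNAMES if name not in present]
    if missing:
        with open(hosts_path, "a", encoding="utf-8") as handle:
            # Make sure the new entries start on their own line.
            if existing and not existing.endswith("\n"):
                handle.write("\n")
            for name in missing:
                handle.write(f"{gateway_ip} {name}\n")
    return missing


if __name__ == "__main__" and len(sys.argv) == 3:
    # Arguments: path to the hosts file, then the default gateway IP address.
    print(append_entries(sys.argv[1], sys.argv[2]))
```

The function returns the list of hostnames it added, which is empty when all three entries already exist.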