Node status changing to Offline / Going Online / Waiting on Analytics after cluster initialization
Article ID: 310712
Products
VMware Aria Suite
Issue/Introduction
Symptoms:
The Aria Operations Cluster Status shows as Online / Going Online after an Aria Operations upgrade
Node statuses will constantly swap between Online, Waiting on Analytics, and Offline while services remain online and functional
If vRealize Operations HA is enabled, the HA status will show as Degraded
The /storage/vcops/log/api.log file contains errors similar to the following:
2019-09-09 17:42:05,767 [ajp-nio-12x.xx.xx.x-8010-exec-1, amhIkx04tLi48M2HNeneSagVW7DmAewX] ERROR service.impl.ControllerFacadeFactory$ControllerFacadeInvocationHandler - Controller API call 'getShardRemoveProgress' has failed! Reason = [Federation call is not implemented.]
The latest analytics log in /storage/vcops/log/ contains errors similar to the following:
2022-03-23T20:10:45,338+0000 ERROR [ServerConnection on port 10000 Thread 1] [ohYuLzDeu1a5OKEQtyUPIbBlTEZBUqp9] com.vmware.vcops.controller.BaseExecutor.executeWithPagination - getCollectors failed: Federation call is not implemented.
2022-03-23T20:10:45,594+0000 ERROR [ServerConnection on port 10000 Thread 1] [ekIGRfJ7xGD1cGZYrStzB7IxdjwJFWHs] com.vmware.vcops.controller.BaseExecutor.executeWithPagination - getCollectors failed: Federation call is not implemented.
Note: These messages appear in the log after the Analytics startup complete message.
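As a rough aid for confirming the symptom, the log excerpts above can be searched for the "Federation call is not implemented" text. The following is a minimal Python sketch, not an official tool. The function name `find_federation_errors` is hypothetical, and the timestamp pattern is assumed from the two excerpts shown in this article.

```python
import re

# Marker text taken from the log excerpts in this article.
MARKER = "Federation call is not implemented"

# Assumption: lines start with one of the two timestamp layouts seen above,
# e.g. "2019-09-09 17:42:05,767" or "2022-03-23T20:10:45,338+0000".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2},\d{3})")


def find_federation_errors(log_text):
    """Return (timestamp, line) pairs for ERROR lines containing the marker.

    The timestamp is None when a line does not start with a recognized one.
    """
    hits = []
    for line in log_text.splitlines():
        if MARKER in line and " ERROR " in line:
            match = TIMESTAMP.match(line)
            hits.append((match.group(1) if match else None, line.strip()))
    return hits
```

For example, the contents of /storage/vcops/log/api.log could be read into a string and passed to `find_federation_errors`. An empty result suggests this article may not apply to the node being checked.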
Cause
This issue is caused by a misconfiguration of the $ALIVE_BASE/user/conf/cis.properties file. The MY_LDUID property must have the same value across all nodes in the vRealize Operations cluster.
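To check whether MY_LDUID differs between nodes, the contents of each node's cis.properties file can be compared. The sketch below is read-only and illustrative. It assumes the file uses plain `key=value` lines, and the function names and the node names in the usage note are hypothetical. It does not change any configuration.

```python
def read_property(properties_text, key):
    """Return the value of `key` from key=value properties text, or None."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def compare_lduid(files_by_node, key="MY_LDUID"):
    """Group node names by the value each node has for `key`.

    `files_by_node` maps a node name to the text of that node's
    cis.properties file. More than one group in the result means the
    nodes do not agree on the value.
    """
    groups = {}
    for node, text in files_by_node.items():
        groups.setdefault(read_property(text, key), []).append(node)
    return groups
```

Calling `compare_lduid({"node-a": text_a, "node-b": text_b})` with the file contents collected from each node returns a dictionary keyed by MY_LDUID value. A single key means all nodes agree. Multiple keys, or a `None` key, point to the mismatch described above.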
Resolution
Resolving this issue requires database intervention. Contact Broadcom Support and open a support request (SR) for further investigation.