When deploying or expanding a VMware Aria Operations cluster using VMware Aria Suite Lifecycle (LCM), the request fails at the "Initializing Cluster" or "Cluster Configuration" stage.
Error Code: LCMVROPSYSTEM25001
Error Message: VMware Aria Operations initializing cluster failure. VMware Aria Operations cluster configurations failed.
The Aria Operations admin UI may show a status of "Failed" or remain stuck at "Waiting for Analytics."
In /storage/log/vcops/log/analytics/Analytics-<UUID>.log, errors such as the following indicate communication breakdowns between nodes:
Port 6061 (Locator) Error:
INFO [ajp-nio-127.0.0.1-8010-exec-4, Info] client.internal.AutoConnectionSourceImpl - locator /<Node-IP>:6061 is not running.
java.net.ConnectException: Connection refused (Connection refused)
Port 10008 (Internal Data Communication) Error:
WARN [Membership Messenger Sender Non-Blocking] com.vmware.gemfire.tcpmessenger.internal.ClientHandler.exceptionCaught - Asynchronous Messaging Client (local addy: /<Source-IP>:59268, remote addy: /<Destination-IP>:10008) got an I/O exception communicating with server: java.io.IOException: Connection reset by peer
VMware Aria Operations 8.18.x
VMware Aria Suite Lifecycle 8.x
This issue is typically caused by environmental restrictions preventing the nodes from communicating or synchronizing correctly:
Network/Port Restrictions: Firewalls or network security groups are blocking internal communication ports between the nodes, resulting in "Connection refused" or connection timeouts in the logs.
Time Synchronization (NTP): A clock skew between nodes (typically >60 seconds) causes JWT (JSON Web Token) authentication failures. The logs will show VcopsJwtAuthenticationFilter errors indicating the token has expired or is not yet valid.
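To narrow down which of the two causes applies, the log signatures quoted in this article can be counted with grep. The following is a minimal sketch, not a supported tool: the function name scan_analytics_logs, the labels, and the exact regular expressions are assumptions derived from the sample log lines above. The article does not state which log file contains the VcopsJwtAuthenticationFilter messages, so pass whichever log files you want to inspect.

```shell
#!/usr/bin/env bash
# Sketch: count occurrences of the error signatures described in this article.
# Function name, labels, and patterns are illustrative assumptions based on
# the sample log lines; adjust them if the text in your logs differs.

scan_analytics_logs() {
    # Arguments: one or more log files to scan.
    local label pattern count
    while IFS='|' read -r label pattern; do
        # Concatenate all files so grep -c yields one total per signature.
        # grep exits non-zero on zero matches, hence the "|| true".
        count=$(cat -- "$@" 2>/dev/null | grep -cE -- "$pattern" || true)
        printf '%-28s %d\n' "$label" "$count"
    done <<'EOF'
locator-6061-not-running|locator .*:6061 is not running
port-10008-io-exception|remote addy: [^)]*:10008\).*I/O exception
jwt-auth-filter|VcopsJwtAuthenticationFilter
EOF
}

# Example invocation on an Aria Operations node:
# scan_analytics_logs /storage/log/vcops/log/analytics/Analytics-*.log
```

Non-zero counts on the first two rows point toward the network/port cause; a non-zero count on the last row points toward the time synchronization cause.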
To resolve this issue, you must ensure that all nodes can communicate over the required ports and share a synchronized clock.
Step 1: Verify and Open Required Network Ports
Ensure that your network allows unrestricted communication between all vROps nodes (Primary, Replica, and Data nodes).
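Test connectivity to specific ports from the node console using the curl command. In the verbose output, a "Connected to" line means the TCP connection succeeded, while "Connection refused" or a timeout means the port is not reachable. For example: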
curl -v <Target-Node-IP>:6061
curl -v <Target-Node-IP>:10008
Important: While ports 6061 and 10008 are critical for GemFire cluster membership, vROps requires a wider range of ports for full functionality. Please ensure your firewalls are configured according to the master port list found in: TCP and UDP ports required to access VMware vRealize Operations Manager.
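When the cluster has several nodes, the two port checks can be repeated for every node in a loop. The sketch below is one possible approach rather than an official procedure: check_port and check_cluster_ports are hypothetical helper names, and it assumes that bash (with its /dev/tcp redirection feature) and the coreutils timeout command are available on the node where it runs. It covers only ports 6061 and 10008, not the full port list.

```shell
#!/usr/bin/env bash
# Sketch: test TCP reachability of ports 6061 and 10008 on each target node.
# Helper names are hypothetical; assumes bash /dev/tcp support and timeout(1).

check_port() {
    # Usage: check_port <host> <port> [timeout-seconds]
    local host=$1 port=$2 wait=${3:-5}
    # Opening /dev/tcp/<host>/<port> makes bash attempt a TCP connection.
    # timeout(1) bounds the attempt so silently dropped packets do not hang.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

check_cluster_ports() {
    # Usage: check_cluster_ports <node-ip>...
    # Returns 0 if every port on every node accepted a connection, else 1.
    local node port failures=0
    for node in "$@"; do
        for port in 6061 10008; do
            if check_port "$node" "$port"; then
                echo "OK          $node:$port"
            else
                echo "UNREACHABLE $node:$port"
                failures=$((failures + 1))
            fi
        done
    done
    return "$(( failures > 0 ))"
}

# Example invocation from one cluster node, listing the other nodes:
# check_cluster_ports <Target-Node-IP-1> <Target-Node-IP-2>
```

An UNREACHABLE result covers both a refused connection and a timeout, so it does not by itself distinguish a firewall block from a service that is not listening. Run the check from each node toward the others, since firewall rules may be directional.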
Step 2: Correct NTP Synchronization
Verify that all deployed nodes (as well as the Aria Suite Lifecycle appliance) are configured to use the same, reachable NTP server.
Check the time on all nodes via the command line to ensure there is zero or near-zero clock skew.
Address any "Receive timed out" errors between the vROps appliances and the NTP server.
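The clock comparison can be approximated by reading each node's epoch time over SSH and comparing it with the local clock. This is a rough sketch under stated assumptions: abs_skew and report_skew are hypothetical helper names, SSH access as root to each node is assumed to be enabled, and the 60-second threshold is taken from the typical figure given in the cause description. SSH round-trip latency is included in the measured value, so treat small differences as noise.

```shell
#!/usr/bin/env bash
# Sketch: estimate clock skew between this machine and each listed node.
# Helper names are hypothetical; assumes root SSH access to the nodes.

MAX_SKEW_SECONDS=60   # threshold based on the ">60 seconds" figure in this article

abs_skew() {
    # Usage: abs_skew <epoch-a> <epoch-b>; prints |a - b| in seconds.
    local diff=$(( $1 - $2 ))
    echo "${diff#-}"   # strip a leading minus sign, if any
}

report_skew() {
    # Usage: report_skew <node-ip>...
    local node remote local_now skew
    for node in "$@"; do
        # Read the remote clock as seconds since the Unix epoch.
        remote=$(ssh -o ConnectTimeout=5 "root@$node" date -u +%s) || {
            echo "ERROR   $node: could not read remote clock"
            continue
        }
        local_now=$(date -u +%s)
        skew=$(abs_skew "$local_now" "$remote")
        if (( skew > MAX_SKEW_SECONDS )); then
            echo "SKEWED  $node: ${skew}s"
        else
            echo "OK      $node: ${skew}s"
        fi
    done
}

# Example invocation, for instance from the Aria Suite Lifecycle appliance:
# report_skew <Node-IP-1> <Node-IP-2>
```

Any node reported as SKEWED should have its NTP configuration corrected before the workflow is retried.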
Once the network blockages are removed and the time is synchronized, retry the deployment or cluster expansion workflow from Aria Suite Lifecycle.