Cluster unable to come online after restore

Article ID: 415594

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • While investigating a cluster outage, you find that one of the nodes is unable to participate in the Primary Election or come online.
  • You are unable to enter emergency mode or boot past GRUB.
  • The logs do not allow you to trace the issue at the OS level.
  • No services are able to come online.
  • In /usr/lib/vmware-vcops/user/log/analytics-*.log, you see the following error:

 ERROR [Analytics Main Thread]  com.integrien.analytics.AnalyticsMain.uncaughtException - Thread Analytics Main Thread threw an uncaught exception. Exception was:  java.lang.NoClassDefFoundError: org/apache/geode/StatisticsFactory
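
If the node's file system is still readable, a short script can confirm which analytics logs contain this error. The following is a minimal sketch, not a supported tool: it assumes Python 3 is available wherever the logs can be read, and the function name `find_signature` is hypothetical. The log directory and the error text are taken from the symptoms above.

```python
import glob
import os

# Error text copied from the log excerpt in this article.
SIGNATURE = "java.lang.NoClassDefFoundError: org/apache/geode/StatisticsFactory"


def find_signature(log_dir="/usr/lib/vmware-vcops/user/log", signature=SIGNATURE):
    """Return (path, line_number, text) for each analytics log line containing the signature."""
    hits = []
    # Only files matching analytics-*.log are scanned, as described in the symptoms.
    for path in sorted(glob.glob(os.path.join(log_dir, "analytics-*.log"))):
        # errors="replace" keeps the scan going if a log contains undecodable bytes.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if signature in line:
                    hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, text in find_signature():
        print(f"{path}:{number}: {text}")
```

An empty result does not rule out this issue, since a node that cannot boot may not have written the error to the logs being scanned.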


Environment

Aria Operations 8.X

Cause

This can be caused by corruption resulting from storage failures or from an incomplete restore.

Resolution

Note: If this issue occurs in a single-node cluster and no other resolution is possible, a fresh single-node deployment is required and all historical data will be lost.

 

  1. In the Aria Operations Admin console, select the node to be removed.
  2. Once the node is highlighted, select the red X in the navigation pane to remove the node.
  3. While the node is being removed from the cluster, deploy its replacement. Be sure to match the CPU, memory, and storage of the removed node exactly.
  4. Once deployment is complete, log in to the node's UI or complete the registration from the cluster's admin console.
  5. Verify the node is completely added to the cluster.
  6. Allow 24 hours for the cluster to balance and perform a sizing analysis for verification.
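
Because step 3 requires the replacement to match the removed node's CPU, memory, and storage exactly, it can help to record both sets of values and compare them before registering the new node. The following is a minimal sketch of such a check. The `NodeSizing` class, the `sizing_mismatches` function, and the numbers are hypothetical illustrations only; the values would be entered by hand from your own deployment records, and nothing here queries the product.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NodeSizing:
    """Hand-recorded sizing of a node (hypothetical structure for illustration)."""
    cpu_count: int
    memory_gb: int
    storage_gb: int


def sizing_mismatches(original, replacement):
    """Return {field: (original_value, replacement_value)} for every field that differs."""
    return {
        f.name: (getattr(original, f.name), getattr(replacement, f.name))
        for f in fields(NodeSizing)
        if getattr(original, f.name) != getattr(replacement, f.name)
    }


if __name__ == "__main__":
    # Example values only; substitute the figures recorded for your own nodes.
    removed = NodeSizing(cpu_count=8, memory_gb=32, storage_gb=500)
    new = NodeSizing(cpu_count=8, memory_gb=32, storage_gb=400)
    for name, (old_value, new_value) in sizing_mismatches(removed, new).items():
        print(f"{name}: removed node had {old_value}, replacement has {new_value}")
```

An empty dictionary means the recorded values match; any entry names a resource to correct before the node is added to the cluster.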

 

Additional Information