vRealize Log Insight 8.10 upgrade hung due to Cassandra bouncing

Article ID: 315942

Updated On:

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
You may experience one or more of the following symptoms:
  • During an upgrade to vRealize Log Insight 8.10, the upgrade appears to fail or hang because the Cassandra service is continuously restarting.
  • After upgrading to vRealize Log Insight 8.10, the Cassandra service on one or more nodes is continuously restarting.
  • In either case, the /storage/var/loginsight/runtime.log file shows errors similar to:
java.lang.StackOverflowError: null
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:317) ~[analytics-lib.jar:?]
 ....
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:317) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:317) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:317) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:317) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndex(PIQLQueryHelper.java:318) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.piql.PIQLQueryHelper.extractIndexField(PIQLQueryHelper.java:296) ~[analytics-lib.jar:?]
 at com.vmware.loginsight.analytics.MultiLogSearcher.search(MultiLogSearcher.java:411) ~[analytics-service.jar:?]
 at com.vmware.loginsight.analytics.distributed.AbstractLogSearchService.runSearch(AbstractLogSearchService.java:105) ~[analytics-service.jar:?]
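To check whether a node is hitting this error, the log can be searched from the shell. The following is a minimal sketch, not an official VMware tool: the function names count_stack_overflows and count_extract_index_frames are hypothetical and exist only for illustration. It assumes the log path shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Log Insight) for spotting the symptom.

# Print the number of lines in the given log that contain the
# StackOverflowError marker. Prints 0 when there are no matches.
count_stack_overflows() {
    log_file="$1"
    # -c counts matching lines, -F treats the pattern as a fixed string.
    # grep exits non-zero on zero matches, so "|| true" keeps callers
    # running under "set -e" from aborting.
    grep -cF 'java.lang.StackOverflowError' "$log_file" || true
}

# Print the number of PIQLQueryHelper.extractIndex frames in the log.
# The trailing "(" keeps extractIndexField frames out of the count.
count_extract_index_frames() {
    log_file="$1"
    grep -cF 'PIQLQueryHelper.extractIndex(' "$log_file" || true
}

# Usage on a node (path taken from the article):
#   count_stack_overflows /storage/var/loginsight/runtime.log
#   count_extract_index_frames /storage/var/loginsight/runtime.log
```

A non-zero overflow count together with a large number of repeated extractIndex frames is consistent with the trace above. It is an indication only, not proof of the cause.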


Environment

VMware vRealize Log Insight 8.10.x

Cause

Starting in vRealize Log Insight 8.10, the default stack allocation size was decreased from 1024k to 256k. When a more complex alert configured in vRealize Log Insight is triggered, it can exceed this limit and cause a stack overflow.
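To see which stack size a configuration file sets, one option is to print the thread-stack-size line from its service-runtime section. This is a minimal sketch: the helper name show_thread_stack_size is hypothetical, and it assumes the setting appears within 8 lines after the opening tag, mirroring the grep command used in the Resolution steps.

```shell
#!/bin/sh
# Hypothetical helper (not part of Log Insight): print the line(s) that
# mention thread-stack-size inside the <service-runtime> section.
show_thread_stack_size() {
    config_file="$1"
    # First grep: the opening tag plus the 8 lines that follow it.
    # Second grep: keep only lines mentioning the stack-size setting.
    grep -A 8 '<service-runtime>' "$config_file" | grep -F 'thread-stack-size'
}

# Usage on a node (path taken from the article):
#   show_thread_stack_size /usr/lib/loginsight/application/etc/loginsight-config-base.xml
```

The function only prints the matching lines as they appear in the file. It makes no assumption about how the value is written.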

Resolution

This issue is resolved in the latest patched release of VMware Aria Operations for Logs 8.12, released 05/01/2023 (displayed date 04/20/2023), build number 21696970, available on the Broadcom Support Portal.

If you are attempting the upgrade using the original release (21618456), it is recommended to revert to snapshots and attempt the upgrade again using the newly released 8.12 build.

If you are unable to use the patched release, you can instead use the resolution below.


To resolve this issue, increase the stack allocation size.

  1. Log in to the target node as root via SSH or Console.
  2. Run the following command to generate a new configuration file:
/usr/lib/loginsight/application/sbin/li-utility.sh --generate_new_config --force

Note: Once the command completes, it outputs the file name to be used in step 5.
  3. Run the following command to output the <service-runtime> configuration section from the /usr/lib/loginsight/application/etc/loginsight-config-base.xml file:
cat /usr/lib/loginsight/application/etc/loginsight-config-base.xml | grep "<service-runtime>" -A 8
  4. Copy the output from <service-runtime> to </service-runtime>.
  5. Open the latest configuration file in a text editor:
vi /storage/core/loginsight/config/loginsight-config.xml#number

Note: Replace number with the number of the configuration file that was output in step 2.
Example: vi /storage/core/loginsight/config/loginsight-config.xml#43
  6. Paste the information copied in step 4 on a new line after the </distributed> line.
Note: Press i to enter insert mode, navigate to the end of the </distributed> line and press Enter, then paste.
  7. Change the value of thread-stack-size to 1024.
  8. Press Escape, then run the following command to save and close the file:
:wq!
  9. Run the following command to restart the loginsight service:
systemctl restart loginsight
  10. Repeat steps 1-9 on all other nodes in the cluster.
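When the file name printed by the generate step is no longer on screen, the highest-numbered configuration file can be located from the shell. The following is a minimal sketch under stated assumptions: the function name latest_config is hypothetical, and it assumes that the configuration files follow the loginsight-config.xml#number naming shown in the steps above and that the highest number is the latest file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Log Insight): print the path of the
# loginsight-config.xml#<number> file with the highest numeric suffix.
# Returns 1 and prints nothing if no such file exists.
latest_config() {
    config_dir="$1"
    best=""
    best_num=-1
    for f in "$config_dir"/loginsight-config.xml#*; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$f" ] || continue
        # Keep only the text after the last '#'.
        num="${f##*\#}"
        # Ignore suffixes that are empty or not purely numeric.
        case "$num" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$num" -gt "$best_num" ]; then
            best_num="$num"
            best="$f"
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    else
        return 1
    fi
}

# Usage on a node (directory taken from the article):
#   latest_config /storage/core/loginsight/config
```

Comparing the suffix numerically rather than alphabetically matters here, because a plain sorted listing would place #9 after #43.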