After changing Elasticsearch disk paths, 3 pods started to crash after restart

Article ID: 247612

Products

DX Application Performance Management

Issue/Introduction

We changed the NFS path configuration for Elasticsearch, following this procedure: https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/dx-platform-on-premise/21-3/Administrating/Reconfigure-NFS-path-for-existing-ad-hoc-elasticsearch-nodes.html#concept.dita_1c3762a9-8271-42c9-8b19-7e6e4a34e00a_DeleteHotandWarmNodes

After restarting the whole system, the pods jarvis-verifier-5d995889d4-jf5f8 and jarvis-indexer-77654b46d9-99gjz and the deployment jarvis-lean-jarvis-indexer keep crashing. We cannot log in to the DX platform.

 

The jarvis-indexer log shows this error:

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 1
 at com.ca.jarvis.es.service.ClientService.constructDefaultHttpHostsWithTcpHosts(ClientService.java:253)
 at com.ca.jarvis.es.service.ClientService.<init>(ClientService.java:122)
 at com.ca.jarvis.es.service.ESClusterManagerInstanceHolder.getESClusterManager(ESClusterManagerInstanceHolder.java:12)
 at com.ca.jarvis.jmetrics.util.JMetricsUtil.createClusterManager(JMetricsUtil.java:64)
 at com.ca.jarvis.jmetrics.util.JMetricsUtil.<init>(JMetricsUtil.java:60)
 at com.ca.jarvis.indexer.utils.JMetricWrapper.<init>(JMetricWrapper.java:41)
 at com.ca.jarvis.indexer.JarvisIndexerMain.init(JarvisIndexerMain.java:68)
 at com.ca.jarvis.indexer.JarvisIndexerMain.<init>(JarvisIndexerMain.java:52)
 at com.ca.jarvis.indexer.JarvisIndexerMain.main(JarvisIndexerMain.java:168)

 

Environment

Release : 21.3

Component : Introscope

Resolution

The customer reconfigured the NFS path to add more disk space and restarted the cluster afterwards.

They then found that multiple jarvis pods, including jarvis-indexer, jarvis-verifier, and kron, kept crashing with an "ArrayIndexOutOfBoundsException" error, and that nearly all DOI pods were stuck in the Init state.

The cause was the jarvis ConfigMap: its Elasticsearch (ES) entries were clearly wrong, containing empty elements and duplicated elements. Most likely the ConfigMap was not updated properly when warm nodes were removed and added to reconfigure NFS.
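A quick way to spot such entries before editing anything is to check each value for empty or repeated elements. The following is a minimal sketch, not part of the product: the function name check_host_list is hypothetical, and the commented kubectl line assumes the keys sit under .data of jarvis-configmap in the dxi namespace.

```shell
#!/bin/sh
# check_host_list (hypothetical helper): inspect one comma-separated host list.
# Prints a line for every problem found and returns 1 if the list is malformed,
# 0 if it looks clean.
check_host_list() {
  printf '%s\n' "$1" | tr ',' '\n' | awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }              # trim each comma-separated element
    $0 == ""   { print "empty element at position " NR; bad = 1; next }
    / /        { print "missing comma in: " $0; bad = 1 }
    seen[$0]++ { print "duplicate element: " $0; bad = 1 }
    END        { exit bad + 0 }
  '
}

# Possible usage against a live cluster (assumes the key is stored under .data):
#   check_host_list "$(kubectl get cm jarvis-configmap -n dxi \
#       -o jsonpath='{.data.ES_TRANSPORT_URLS}')"
```

A non-zero return for any of the ES keys suggests the ConfigMap needs the manual cleanup described in this article.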


Edit the ConfigMap:

kubectl edit cm jarvis-configmap -n dxi

Remove the duplicate and empty elements from the values of these keys:
ES_TRANSPORT_URLS:
ES_HTTP_URL:
ES_HTTP_HOSTS:

For example, change

jarvis-elasticsearch:9300, jarvis-elasticsearch-2:9300, jarvis-elasticsearch-3:9300,,, jarvis-elasticsearch-warm1:9300, jarvis-elasticsearch-warm2:9300 jarvis-elasticsearch-warm1:9300, jarvis-elasticsearch-warm2:9300 jarvis-elasticsearch-warm1:9300, jarvis-elasticsearch-warm2:9300

to

jarvis-elasticsearch:9300, jarvis-elasticsearch-2:9300, jarvis-elasticsearch-3:9300, jarvis-elasticsearch-warm1:9300, jarvis-elasticsearch-warm2:9300
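For long values, the cleaned list can be generated instead of edited by hand. This is a rough sketch under stated assumptions: dedupe_hosts is a hypothetical helper, it treats both commas and whitespace as separators (the broken value above mixes the two), and it rejoins elements with ", " as shown in the corrected value. Review the output before pasting it into the ConfigMap.

```shell
#!/bin/sh
# dedupe_hosts (hypothetical helper): split a host list on commas and
# whitespace, drop empty elements, drop repeats while keeping first-seen
# order, and rejoin the result with ", ".
dedupe_hosts() {
  printf '%s\n' "$1" \
    | tr ', \t' '\n\n\n' \
    | awk 'NF && !seen[$0]++' \
    | paste -sd, - \
    | sed 's/,/, /g'
}

# Possible usage (assumes the key is stored under .data of the ConfigMap):
#   dedupe_hosts "$(kubectl get cm jarvis-configmap -n dxi \
#       -o jsonpath='{.data.ES_TRANSPORT_URLS}')"
```

The helper only prints the normalized list; applying it still goes through kubectl edit, so nothing in the cluster changes until the value is saved there.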


After fixing the ConfigMap and restarting the impacted pods, everything worked again. We verified that the AXA and DOI UIs come up and show data.

 

 

Additional Information

The file containing this information is jarvis_cm.