AIOps - dxi-es-admin.sh fails with error "Jarvis Elasticsearch Pod is not running"

Article ID: 221399

Products

DX Operational Intelligence, DX Application Performance Management, CA App Experience Analytics

Issue/Introduction

The dxi-es-admin.sh script cannot add Elasticsearch HOT or WARM nodes. When the script is run to add HOT or WARM nodes, it returns the message:

**** Jarvis Elasticsearch Pod is not running

Environment

DX Platform 20.2.x ONLY

 

Cause

This problem is related to defect DE511341, which is fixed in DX Platform 21.3.

 

IMPORTANT:

DO NOT use the dxi-es-admin.sh script from DX Platform 21.3 in a 20.2.1 environment. The script is not backward compatible because it requires specific templates that are bundled with DX Platform 21.3 only.

Resolution

 

How to add additional HOT/WARM nodes?

  1. Download dxi-es-admin-fix_20210923.sh
  2. Copy the script to <DX-Platform Installer>/tools folder
  3. chmod +x dxi-es-admin-fix_20210923.sh
  4. Execute the script:

To add hot Elasticsearch nodes:  ./dxi-es-admin-fix_20210923.sh --add HOT
To add warm Elasticsearch nodes: ./dxi-es-admin-fix_20210923.sh --add WARM

NOTES

- For OpenShift setups: you can use this script to add HOT or WARM nodes.
- For Kubernetes setups: you can use this script to add HOT nodes only. The attached script still has an open issue when adding WARM nodes; if you need a fix, contact Broadcom Support.

  5. Example:

Configuring 001497example.com as an additional Elasticsearch HOT node:

./dxi-es-admin-fix.sh --add HOT
Please enter namespace
dxi        
Enter the number of nodes to be labeled to deploy ES
1
##### List of available nodes in cluster #####
001493example.com
001494example.com
001495example.com
001496example.com
001497example.com
001498example.com
##############################################
Please enter the ES Node 1: 001497example.com
##############################################
escount ====== 4
***** Labelling ES node 001497example.com *****
node/001497example.com labeled
***** Checking Memory Setting on ES node *****
vm.max_map_count=262144 already exists
***** Checking Max System Open Files Limit (65536) on ES node *****
fs.file-max=6296074 already exists
***** Checking Max User Open Files Limit (65536) on ES node *****
001497example.com Already has nproc=65536 and nofiles=65536 values set !!!
deployment.extensions/jarvis-elasticsearch-4 created
Error from server (AlreadyExists): error when creating "ES-service-001497example.com.yml": services "jarvis-elasticsearch-4" already exists
escount value ...  4
ingress.extensions/jarvis-es configured
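
The kernel settings that the script reports in the output above can also be checked by hand on a candidate node before it is added. The following is a minimal sketch and is not part of dxi-es-admin.sh: `meets_minimum` is a hypothetical helper name, and the thresholds (262144 for vm.max_map_count, 65536 for the system open files limit) are taken from the example output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dxi-es-admin.sh).
# Usage: meets_minimum <setting-name> <actual-value> <required-minimum>
# Prints a one-line result and returns non-zero when the value is too low.
meets_minimum() {
  if [ "$2" -ge "$3" ]; then
    echo "$1=$2 OK"
  else
    echo "$1=$2 is below the required $3"
    return 1
  fi
}

# Example calls, to be run on the ES node itself:
#   meets_minimum vm.max_map_count "$(sysctl -n vm.max_map_count)" 262144
#   meets_minimum fs.file-max      "$(sysctl -n fs.file-max)"      65536
```

The example calls are left as comments so that the block only defines the function when sourced.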

 

VERIFICATION


kubectl get nodes --show-labels | grep master-data
001494example.com   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-1,kubernetes.io/hostname=001494example.com,node-role.kubernetes.io/compute=true
001495example.com   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-2,kubernetes.io/hostname=001495example.com,node-role.kubernetes.io/compute=true
001496example.com   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-3,kubernetes.io/hostname=001496example.com,node-role.kubernetes.io/compute=true
001497example.com   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-4,kubernetes.io/hostname=001497example.com,node-role.kubernetes.io/compute=true
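
Because the label list is long, the dxi-es-node value can be hard to spot in the output above. As a minimal sketch, the filter below reduces the same output to one "node label-value" pair per line; `es_node_labels` is a hypothetical helper name, and the sketch assumes that the labels are the last column, as in the output shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --show-labels` output on stdin
# and prints "<node-name> <dxi-es-node value>" for every labeled node.
es_node_labels() {
  awk '{
    # The LABELS column is the last field; its entries are comma-separated.
    n = split($NF, labels, ",")
    for (i = 1; i <= n; i++) {
      if (labels[i] ~ /^dxi-es-node=/) {
        sub(/^dxi-es-node=/, "", labels[i])
        print $1, labels[i]
      }
    }
  }'
}

# Example call:
#   kubectl get nodes --show-labels | es_node_labels
```

For the fourth node in the output above, this prints `001497example.com master-data-4`. Alternatively, `kubectl get nodes -L dxi-es-node` shows the label value as an extra column.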


kubectl get pods | grep elastic

jarvis-elasticsearch-2-56989f9675-5jtdh               1/1       Running     0          4d
jarvis-elasticsearch-3-76959ccf5f-cpjsq               1/1       Running     0          4d
jarvis-elasticsearch-4-69cf97f4db-hccx6               1/1       Running     0          52s
jarvis-elasticsearch-7b57545f9b-q4j5v                 1/1       Running     0          4d
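
To confirm at a glance that every Elasticsearch pod is healthy, the pod list can be filtered for anything that is not in the Running state. This is a minimal sketch: `es_pods_not_running` is a hypothetical helper name, and it assumes the default `kubectl get pods` column order (NAME, READY, STATUS, RESTARTS, AGE) shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints the
# name and status of every jarvis-elasticsearch pod that is not "Running".
# No output means all Elasticsearch pods are running.
es_pods_not_running() {
  awk '$1 ~ /^jarvis-elasticsearch/ && $3 != "Running" { print $1, $3 }'
}

# Example call (namespace dxi, as in the example above):
#   kubectl get pods -n dxi | es_pods_not_running
```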

 

How to delete the additional HOT/WARM nodes?

You cannot use dxi-es-admin.sh to remove the additional HOT or WARM nodes. Follow the process below to delete the node(s) manually:

For HOT nodes:

1) show the hot nodes
kubectl get nodes --show-labels | grep master-data

2) remove the label
kubectl label nodes <target-node-name> dxi-es-node-

3) show the hot deployments
kubectl get deployments -n <namespace> | grep jarvis-elasticsearch

4) delete the ES HOT deployment and its service
kubectl delete deploy <jarvis-elasticsearch-#> -n <namespace>
kubectl delete svc <jarvis-elasticsearch-#> -n <namespace>


For WARM nodes:

1) show the warm nodes
kubectl get nodes --show-labels | grep eswarm

2) remove the label
kubectl label nodes <node-name> dxi-es-node-

3) show the warm deployments
kubectl get deployments -n <namespace> | grep warm

4) delete the ES WARM deployment and its service
kubectl delete deploy <jarvis-elasticsearch-warm#> -n <namespace>
kubectl delete svc <jarvis-elasticsearch-warm#> -n <namespace>

In the example below, four additional ES nodes were added using dxi-es-admin.sh and now need to be removed:

- 2 x HOT nodes: 001497example.com and 001498example.com
- 2 x WARM nodes: 001499example.com and 001500example.com

SOLUTION:

a) Remove HOT nodes:

kubectl get nodes --show-labels | grep master-data
kubectl label nodes 001497example.com dxi-es-node-
kubectl label nodes 001498example.com dxi-es-node-
kubectl get deployments -ndxi | grep jarvis-elasticsearch
kubectl delete deploy jarvis-elasticsearch-4 -ndxi
kubectl delete deploy jarvis-elasticsearch-5 -ndxi
kubectl delete svc jarvis-elasticsearch-4 -ndxi
kubectl delete svc jarvis-elasticsearch-5 -ndxi

b) Remove WARM nodes:

kubectl get nodes --show-labels | grep eswarm
kubectl label nodes 001499example.com dxi-es-node-
kubectl label nodes 001500example.com dxi-es-node-
kubectl get deployments -ndxi | grep warm
kubectl delete deploy jarvis-elasticsearch-warm1 -ndxi
kubectl delete deploy jarvis-elasticsearch-warm2 -ndxi
kubectl delete svc jarvis-elasticsearch-warm1 -ndxi
kubectl delete svc jarvis-elasticsearch-warm2 -ndxi
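
The per-node commands above follow the same pattern each time, so they can be generated rather than typed. The sketch below only prints the commands for review and does not execute anything. `es_node_removal_cmds` is a hypothetical helper name, and the sketch assumes that the service has the same name as the deployment, as in the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) the kubectl commands that remove
# one additional ES node.
# Usage: es_node_removal_cmds <namespace> <node-name> <deployment-name>
es_node_removal_cmds() {
  if [ "$#" -ne 3 ]; then
    echo "usage: es_node_removal_cmds <namespace> <node-name> <deployment-name>" >&2
    return 1
  fi
  # Remove the dxi-es-node label from the node.
  printf 'kubectl label nodes %s dxi-es-node-\n' "$2"
  # Delete the deployment and the service of the same name.
  printf 'kubectl delete deploy %s -n %s\n' "$3" "$1"
  printf 'kubectl delete svc %s -n %s\n' "$3" "$1"
}

# Example calls:
#   es_node_removal_cmds dxi 001497example.com jarvis-elasticsearch-4
#   es_node_removal_cmds dxi 001499example.com jarvis-elasticsearch-warm1
```

Once the printed commands look correct, they can be pasted into the shell or piped to `sh`.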

Additional Information

DX AIOPs Troubleshooting, Common Issues and Best Practices

Attachments

1633354960533__dxi-es-admin-fix_20210923.sh