DX Platform - Unable to add Elasticsearch WARM/HOT nodes using the dxi-es-admin.sh script

Article ID: 221399

Products

DX Operational Intelligence DX Application Performance Management CA App Experience Analytics

Issue/Introduction

Attempting to add Elasticsearch WARM or HOT nodes to a Kubernetes cluster fails when using either of the following commands:

./dxi-es-admin.sh --add WARM

or

./dxi-es-admin.sh --add HOT

The script returns the following error:

Please enter namespace
**** Jarvis Elasticsearch Pod is not running

Cause

This problem is related to defect DE511341 

 

Environment

DX Operational Intelligence 20.x
DX Application Performance Management 20.x
DX AXA 20.x

Resolution

The fix is included in the upcoming DX OI 21.3 release.

 

Current workaround: use the attached fixed script.

1. Download the attached dxi-es-admin-fix.sh and copy it to <DX-Platform Installer HOME directory>

2. Run: ./dxi-es-admin-fix.sh --add HOT|WARM  (the HOT or WARM argument is required)

NOTE: You can add at most 2 WARM nodes, and only in Medium Elastic deployments (those that already have 3 Elastic nodes).
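
The limits in the note above can be checked before running the script. The following is a minimal sketch, not part of the product: the function names count_es_labels and check_can_add_warm are hypothetical, and it assumes that HOT nodes are the ones whose labels match "master-data" and WARM nodes the ones matching "eswarm", as in the grep commands used elsewhere in this article. It also reads "already has 3 Elastic nodes" as "at least 3 HOT nodes", which is an interpretation.

```shell
# Sketch only: helper names are illustrative, not provided by DX Platform.

count_es_labels() {
  # Reads `kubectl get nodes --show-labels` output on stdin and prints the
  # number of lines matching the pattern in $1 ("master-data" or "eswarm").
  # grep -c exits non-zero when the count is 0, so that status is ignored.
  grep -c "$1" || true
}

check_can_add_warm() {
  # $1 = current HOT node count, $2 = current WARM node count
  if [ "$1" -lt 3 ]; then
    echo "fewer than 3 HOT nodes found; WARM nodes cannot be added"
    return 1
  fi
  if [ "$2" -ge 2 ]; then
    echo "2 WARM nodes already present; limit reached"
    return 1
  fi
  echo "ok to add a WARM node"
}
```

A possible way to use it on a live cluster:

hot=$(kubectl get nodes --show-labels | count_es_labels master-data)
warm=$(kubectl get nodes --show-labels | count_es_labels eswarm)
check_can_add_warm "$hot" "$warm"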

 

Below is an example of adding a 4th Elastic HOT node:

[[email protected] tools]# ./dxi-es-admin-fix.sh --add HOT
Please enter namespace
dxi        
Enter the number of nodes to be labeled to deploy ES
1
##### List of available nodes in cluster #####
munqa001493.bpc.broadcom.net
munqa001494.bpc.broadcom.net
munqa001495.bpc.broadcom.net
munqa001496.bpc.broadcom.net
munqa001497.bpc.broadcom.net
munqa001498.bpc.broadcom.net
##############################################
Please enter the ES Node 1: munqa001497.bpc.broadcom.net
##############################################
escount ====== 4
***** Labelling ES node munqa001497.bpc.broadcom.net *****
node/munqa001497.bpc.broadcom.net labeled
***** Checking Memory Setting on ES node *****
vm.max_map_count=262144 already exists
***** Checking Max System Open Files Limit (65536) on ES node *****
fs.file-max=6296074 already exists
***** Checking Max User Open Files Limit (65536) on ES node *****
munqa001497.bpc.broadcom.net Already has nproc=65536 and nofiles=65536 values set !!!
deployment.extensions/jarvis-elasticsearch-4 created
Error from server (AlreadyExists): error when creating "ES-service-munqa001497.bpc.broadcom.net.yml": services "jarvis-elasticsearch-4" already exists
escount value ...  4
ingress.extensions/jarvis-es configured

 

VERIFICATION: Confirm that the new node label and the new Elasticsearch pod are present:


[[email protected] tools]# kubectl get nodes --show-labels | grep master-data
munqa001494.bpc.broadcom.net   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-1,kubernetes.io/hostname=munqa001494.bpc.broadcom.net,node-role.kubernetes.io/compute=true
munqa001495.bpc.broadcom.net   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-2,kubernetes.io/hostname=munqa001495.bpc.broadcom.net,node-role.kubernetes.io/compute=true
munqa001496.bpc.broadcom.net   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-3,kubernetes.io/hostname=munqa001496.bpc.broadcom.net,node-role.kubernetes.io/compute=true
munqa001497.bpc.broadcom.net   Ready     compute                5d        v1.11.0+d4cacc0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dxi-es-node=master-data-4,kubernetes.io/hostname=munqa001497.bpc.broadcom.net,node-role.kubernetes.io/compute=true
[[email protected] tools]# kubectl get pods | grep elastic
jarvis-elasticsearch-2-56989f9675-5jtdh               1/1       Running     0          4d
jarvis-elasticsearch-3-76959ccf5f-cpjsq               1/1       Running     0          4d
jarvis-elasticsearch-4-69cf97f4db-hccx6               1/1       Running     0          52s
jarvis-elasticsearch-7b57545f9b-q4j5v                 1/1       Running     0          4d
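
The pod check above can also be done without reading the listing by eye. This is a minimal sketch under the assumption that the output has the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE), as in the listing above; the function name es_pods_not_running is hypothetical.

```shell
# Sketch only: es_pods_not_running is an illustrative helper name.

es_pods_not_running() {
  # Reads `kubectl get pods` output on stdin and prints the name of every
  # jarvis-elasticsearch pod whose STATUS column (3rd field) is not "Running".
  # Prints nothing when all matching pods are Running.
  awk '/^jarvis-elasticsearch/ && $3 != "Running" { print $1 }'
}
```

For example: kubectl get pods -n <namespace> | es_pods_not_running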

 

 

IMPORTANT

The dxi-es-admin.sh script does not provide an option to remove the additional HOT or WARM nodes that it adds.

To undo the changes, you need to remove the labels and deployments manually. An example follows:

Use case:

Using dxi-es-admin.sh, you have deployed the following 4 additional ES nodes and need to remove them:

- 2 x HOT nodes on munqa001497 and munqa001498

- 2 x WARM nodes on munqa001499 and munqa001500

Solution:

- Use the following commands to undo this allocation:

For HOT nodes:

1) show the hot nodes

kubectl get nodes --show-labels | grep master-data

2) remove the label

kubectl label nodes <node-name> dxi-es-node-

3) show the hot deployments

kubectl get deployments -n <namespace> | grep jarvis-elasticsearch

4) delete the ES HOT deployment

kubectl delete deploy <jarvis-elasticsearch-#> -n <namespace>


For WARM nodes:

1) show the warm nodes

kubectl get nodes --show-labels | grep eswarm

2) remove the label

kubectl label nodes <node-name> dxi-es-node-

3) show the warm deployments

kubectl get deployments -n <namespace> | grep warm

4) delete the ES WARM deployment

kubectl delete deploy <jarvis-elasticsearch-warm#> -n <namespace>

- For this example:

Removing HOT nodes:

kubectl get nodes --show-labels | grep master-data
kubectl label nodes munqa001497.bpc.broadcom.net dxi-es-node-
kubectl label nodes munqa001498.bpc.broadcom.net dxi-es-node-
kubectl get deployments -ndxi | grep jarvis-elasticsearch
kubectl delete deploy jarvis-elasticsearch-4 -ndxi
kubectl delete deploy jarvis-elasticsearch-5 -ndxi

Removing WARM nodes:

kubectl get nodes --show-labels | grep eswarm
kubectl label nodes munqa001499.bpc.broadcom.net dxi-es-node-
kubectl label nodes munqa001500.bpc.broadcom.net dxi-es-node-
kubectl get deployments -ndxi | grep warm
kubectl delete deploy jarvis-elasticsearch-warm1 -ndxi
kubectl delete deploy jarvis-elasticsearch-warm2 -ndxi
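
Because these commands are destructive, it may help to preview them first. The following is a minimal sketch that wraps the two removal steps (unlabel the node, delete the deployment) in a dry-run mode. The names DRY_RUN, run and remove_es_node are hypothetical and not part of the product; the kubectl commands themselves are the ones listed above.

```shell
# Sketch only: DRY_RUN, run and remove_es_node are illustrative names.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

remove_es_node() {
  # $1 = node name, $2 = deployment name, $3 = namespace
  run kubectl label nodes "$1" dxi-es-node-
  run kubectl delete deploy "$2" -n "$3"
}
```

For example, remove_es_node munqa001497.bpc.broadcom.net jarvis-elasticsearch-4 dxi prints the two commands for the first HOT node. Setting DRY_RUN=0 before the call executes them instead.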

Additional Information

DX AIOPs Troubleshooting, Common Issues and Best Practices

Attachments

1630674256441__dxi-es-admin-fix.sh