AIOps - How to gracefully restart all DX services and servers?

Article ID: 202717

Products

DX Operational Intelligence
DX Application Performance Management
CA App Experience Analytics

Issue/Introduction

We need to reboot the servers for maintenance. What are the correct steps to gracefully restart all DX services and servers?

 

Environment

DX Operational Intelligence 1.3.x, 20.x
DX Application Performance Management 11.x, 20.x
DX App Experience Analytics (AXA) 20.x

Resolution

DX Platform 20.x

1) Stop DX Platform services 
 
cd <DX-HOME>/tools
./dx-admin.sh stop
 

2) Wait until all the pods are down. Run the following command to check the status of the pods:

kubectl get pods -n <your project name>
 
You can ignore pods in "Completed" status.
 
 
3) Gracefully reboot all the OpenShift or Kubernetes servers:
 
reboot
 

4) Once all the servers are back up and running, verify the cluster and make sure all nodes are in "Ready" status:

kubectl get nodes
 
For example:
NAME                             STATUS    ROLES                  AGE       VERSION
lvntest010772.bpc.net   Ready     compute,infra,master   177d      v1.11.0+d4cacc0
lvntest010773.bpc.net   Ready     compute                177d      v1.11.0+d4cacc0
lvntest010774.bpc.net   Ready     compute                177d      v1.11.0+d4cacc0
...
 
5) Start DX Platform
 
cd <DX-HOME>/tools
./dx-admin.sh start
 
This process can take several minutes.
 

6) Wait until all the pods are up. Run the following command to check the status of the pods:

kubectl get pods -n <your project name>
 

7) (Optional) You can safely delete any evicted pods in the project:
for evicted in $(kubectl get pods -n <your project name> | grep "Evicted" | awk '{print $1}'); do kubectl delete pod -n <your project name> ${evicted}; done
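
The waiting in steps 2 and 6 can be scripted instead of re-running the status command by hand. The following is a minimal sketch, not part of the product: the function names and the 10-second polling interval are assumptions chosen for illustration, and it relies only on the standard column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS).

```shell
#!/usr/bin/env bash
# Sketch only: function names and the polling interval are hypothetical.

# Count pods that are not in "Completed" status.
# Reads `kubectl get pods --no-headers` output on stdin (column 3 is STATUS).
count_active_pods() {
  awk '$3 != "Completed" { n++ } END { print n + 0 }'
}

# Count pods that are not yet fully up: anything that is neither "Completed"
# nor "Running" with all containers ready (READY column such as 2/2).
count_pods_not_up() {
  awk '$3 != "Completed" {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) n++
       }
       END { print n + 0 }'
}

# Step 2: poll until no active pods remain in the given namespace.
wait_for_pods_down() {
  namespace="$1"
  while true; do
    remaining=$(kubectl get pods -n "$namespace" --no-headers 2>/dev/null | count_active_pods)
    [ "$remaining" -eq 0 ] && break
    echo "Waiting: $remaining pod(s) still active in $namespace"
    sleep 10
  done
}

# Step 6: poll until every pod is Running and ready, or Completed.
wait_for_pods_up() {
  namespace="$1"
  while true; do
    pending=$(kubectl get pods -n "$namespace" --no-headers 2>/dev/null | count_pods_not_up)
    [ "$pending" -eq 0 ] && break
    echo "Waiting: $pending pod(s) not ready yet in $namespace"
    sleep 10
  done
}
```

After stopping the services you would call `wait_for_pods_down "<your project name>"`, and after starting them `wait_for_pods_up "<your project name>"`. Note that `wait_for_pods_up` returns immediately if no pods exist yet, so allow the start script to finish first.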
 
 

DOI 1.3.2

1) Stop DOI services 
 
cd <DOI-HOME>/bin
./stopServices.sh
 
 
2) Wait until all the pods are down. Run the following command to check the status of the pods:
oc get pods -n <your project name>
 
You can ignore pods in "Completed" status.
 

3) Gracefully reboot the servers:
 
reboot
 

4) Once all the servers are back up and running, verify the cluster and make sure all nodes are in "Ready" status:
 
oc get nodes
 
For example:
NAME                             STATUS    ROLES                  AGE       VERSION
lvntest010772.bpc.net   Ready     compute,infra,master   177d      v1.11.0+d4cacc0
lvntest010773.bpc.net   Ready     compute                177d      v1.11.0+d4cacc0
lvntest010774.bpc.net   Ready     compute                177d      v1.11.0+d4cacc0
 

5) Start DOI services
 
cd <DOI-HOME>/bin
./startServices.sh
 

6) Wait until all the pods are up. Run the following command to check the status of the pods:
oc get pods -n <your project name>
 

7) (Optional) You can safely delete any evicted pods in the project:
for evicted in $(oc get pods -n <your project name> | grep "Evicted" | awk '{print $1}'); do oc delete pod -n <your project name> ${evicted}; done
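
The node check in step 4 can also be automated rather than read by eye. The sketch below is an illustration only: the function names are hypothetical, and it assumes the standard `oc get nodes --no-headers` column layout shown in the example output (NAME first, STATUS second).

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.

# Print the names of nodes whose STATUS is not exactly "Ready".
# Reads `oc get nodes --no-headers` output on stdin.
list_not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Return 0 when every node is Ready; otherwise list the offenders and return 1.
check_nodes_ready() {
  not_ready=$(oc get nodes --no-headers | list_not_ready_nodes)
  if [ -n "$not_ready" ]; then
    echo "Nodes not in Ready status:"
    echo "$not_ready"
    return 1
  fi
  echo "All nodes are in Ready status"
}
```

Because the comparison is exact, a cordoned node reported as `Ready,SchedulingDisabled` is also listed, which is usually what you want to know before starting the services. Run `check_nodes_ready` after the reboot and continue to step 5 only when it succeeds.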
 
 

Additional Information

DX AIOPs - Troubleshooting, Common Issues and Best Practices
https://knowledge.broadcom.com/external/article/190815