
DX OI - doi-situations-0 pod in Terminating status

Article ID: 225626

Products

DX Operational Intelligence

Issue/Introduction

The doi-situations-0 pod is not running and remains in Terminating status.

How can we fix the issue?

Cause

doi-situations is a StatefulSet. Like any other Kubernetes workload object, its pod is recreated by Kubernetes once the pod has been terminated. A pod in the "Terminating" state, however, has not finished terminating: it is still waiting for something to happen before it is removed.
 
Sometimes the pod never leaves the Terminating state. This can happen either because the pod has a finalizer associated with it that is not completing, or because the pod is not responding to the termination signals sent by Kubernetes.
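
To check for the finalizer case described above, a minimal sketch such as the following can be used. The function name pod_deletion_info is hypothetical and chosen only for illustration. It prints the time at which deletion was requested, followed by any finalizers still listed on the pod.

```shell
# Hypothetical helper (not part of the product): show when deletion was
# requested and which finalizers, if any, are still set on the pod.
# Usage: pod_deletion_info <PODNAME> <NAMESPACE>
pod_deletion_info() {
  pod="$1"
  ns="$2"
  # Line 1: metadata.deletionTimestamp (set once the delete request was accepted)
  # Line 2: metadata.finalizers (empty when no finalizer is blocking removal)
  kubectl get pod "$pod" --namespace "$ns" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}
```

If the second line of output is empty, a finalizer is unlikely to be the cause and the node or container state should be checked next.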
 

This is also expected behaviour if the node on which the StatefulSet pod was running is unreachable.

The following Kubernetes documentation explains this:

https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/ 

"A Pod is not deleted automatically when a node is unreachable. The Pods running on an unreachable Node enter the 'Terminating' or 'Unknown' state after a timeout. Pods may also enter these states when the user attempts graceful deletion of a Pod on an unreachable Node. The only ways in which a Pod in such a state can be removed from the apiserver are as follows:

The Node object is deleted (either by you, or by the Node Controller).
The kubelet on the unresponsive Node starts responding, kills the Pod and removes the entry from the apiserver.
Force deletion of the Pod by the user."
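
To check whether the unreachable-node case applies, the following sketch looks up the node the pod is assigned to and shows that node's status. The function name pod_node_status is hypothetical. A STATUS of NotReady in the second command's output suggests the node is not responding.

```shell
# Hypothetical helper: find the node hosting the pod and show the node status.
# Usage: pod_node_status <PODNAME> <NAMESPACE>
pod_node_status() {
  pod="$1"
  ns="$2"
  # spec.nodeName holds the name of the node the pod was scheduled on
  node="$(kubectl get pod "$pod" --namespace "$ns" -o jsonpath='{.spec.nodeName}')"
  echo "Pod $pod is scheduled on node: $node"
  # The STATUS column shows Ready or NotReady for that node
  kubectl get node "$node"
}
```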

 

 
 

Environment

DX Platform 20.2.x

DX Operational Intelligence 20.2.x

Resolution

Run the following command to force-delete the Pod from the system. A new Pod is then created on the Application node:
 
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
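
As a sketch of how the force deletion can be followed by a check that the replacement Pod has been created, the steps can be wrapped as shown below. The function name force_delete_and_check is hypothetical. The second command may need to be repeated until the new Pod reports Running.

```shell
# Hypothetical helper: force-delete the stuck pod, then show the state of
# the pod with the same name, which the StatefulSet should recreate.
# Usage: force_delete_and_check <PODNAME> <NAMESPACE>
force_delete_and_check() {
  pod="$1"
  ns="$2"
  # Same command as above: remove the pod without waiting for graceful shutdown
  kubectl delete pod "$pod" --grace-period=0 --force --namespace "$ns"
  # -o wide adds the NODE column, so the hosting node is visible as well
  kubectl get pod "$pod" --namespace "$ns" -o wide
}
```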
 
 
NOTE: The actual reason the Pod is stuck in the Terminating state can be found under the Events section in the output of the following command:
 
kubectl describe pod <PODNAME> --namespace <NAMESPACE>
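
If only the events are of interest, they can also be listed directly. This is a minimal sketch with a hypothetical function name, pod_events. It filters the namespace events to the given Pod and sorts them by last occurrence.

```shell
# Hypothetical helper: list only the events recorded for one pod.
# Usage: pod_events <PODNAME> <NAMESPACE>
pod_events() {
  pod="$1"
  ns="$2"
  # involvedObject.name restricts the list to events about this pod;
  # sorting by lastTimestamp puts the most recent events at the bottom
  kubectl get events --namespace "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```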
 

Additional Information

DX AIOPs - Troubleshooting, Common Issues and Best Practices
