CNF: pod deletion fails, pod remains Ready on cluster node and cannot be stopped

Article ID: 400809

Products

VMware Telco Cloud Automation
VMware Telco Cloud Platform

Issue/Introduction

- While decommissioning the CNF, it is observed that the pod cannot be deleted and remains on the cluster node indefinitely.

- Pod status: the container is still reported as Running on the node, although the pod no longer exists in Kubernetes:

sudo crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                    ATTEMPT             POD ID              POD
49123a3fce0ef       5dde1f80b4d4c       2 days ago          Running             paps                    0                   e7287b63125ed       testpod

The pod sandbox also still consumes resources, as shown by sudo crictl statsp:

POD                 POD ID              CPU %               MEM
testpod             e7287b63125ed       0.41                777.3MB

Command to stop the pod sandbox:

sudo crictl stopp $podid

Error:

E0520 13:32:03.181573  774987 remote_runtime.go:205] "StopPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="e7287b63125ed"
FATA[0002] stopping the pod sandbox "e7287b63125ed": rpc error: code = DeadlineExceeded desc = context deadline exceeded 

Because the sandbox cannot be stopped, it cannot be removed either:

sudo crictl rmp $podid
pod sandbox "e7287b63125ed" is running, please stop it first
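
The symptom above (a sandbox that is Ready in the container runtime but absent from Kubernetes) can be confirmed by comparing the two views. The following is a minimal sketch and not part of the original procedure: the function name orphaned_sandboxes and the file paths are hypothetical, and it assumes the default `crictl pods` table layout, in which the NAME and NAMESPACE columns directly follow STATE.

```shell
#!/bin/sh
# orphaned_sandboxes K8S_PODS_FILE CRICTL_PODS_FILE
#   K8S_PODS_FILE:    lines of "<namespace> <pod name>" as known to Kubernetes
#   CRICTL_PODS_FILE: saved output of `sudo crictl pods` from one node
# Prints "<pod id> <namespace>/<name>" for every Ready sandbox that
# Kubernetes does not list. (Hypothetical helper, for illustration only.)
orphaned_sandboxes() {
    awk '
        # First file: remember every namespace/name pair Kubernetes knows.
        FILENAME == ARGV[1] { known[$1 "/" $2] = 1; next }
        # Second file: the CREATED column has a variable number of words,
        # so locate the STATE field and read NAME and NAMESPACE after it.
        {
            for (i = 2; i + 2 <= NF; i++) {
                if ($i == "Ready") {
                    key = $(i + 2) "/" $(i + 1)
                    if (!(key in known)) print $1, key
                    break
                }
            }
        }
    ' "$1" "$2"
}

# Example (paths are placeholders):
#   On a Control Plane Node:
#     kubectl get pods -A --no-headers \
#       -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name > /tmp/k8s_pods.txt
#   On the cluster node:
#     sudo crictl pods > /tmp/crictl_pods.txt
#   Then, with both files on one machine:
#     orphaned_sandboxes /tmp/k8s_pods.txt /tmp/crictl_pods.txt
```

Matching on the STATE field rather than on fixed column positions keeps the sketch tolerant of CREATED values such as "2 days ago" or "About an hour ago". Any sandbox it prints should still be verified manually before action is taken.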

 

Environment

3.2

Cause

This issue can occur when a container in the pod is stuck or unresponsive, so the runtime's request to stop the pod sandbox times out.

Resolution

Workaround:

Identify and remove the stuck/unresponsive pods from the CLI:

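  1. SSH to a Control Plane Node and run the following command, which lists the pod sandboxes of the CNF namespace on every node and so shows the stuck/unresponsive pods and the hosts they are on:
     kubectl get node --no-headers | awk '{print $1}' | xargs -I{} ssh -o StrictHostKeyChecking=no {} 'hostname; sudo crictl pods | grep <K8sNameSpace>'
  2. SSH to each host that has a stuck/unresponsive pod.
  3. Run 'sudo top'.
  4. Locate the stuck/unresponsive containerd process: press <shift + L>, paste the pod ID returned by the first command (example: e7287b63125ed), and press <enter>. If the search finds nothing, press 'c' to display full command lines and search again.
  5. Kill the process: press 'k', confirm that the PID offered in the prompt is the located process, press <enter> to accept it, then press <enter> again to send the default signal.
  6. Within about a minute, the pod disappears from the 'sudo crictl pods' output.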
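
Steps 3 to 5 can also be approximated without the interactive top session. The sketch below is an illustration under assumptions, not the documented procedure: shim_pids is a hypothetical helper, and it assumes the stuck process is a containerd shim whose command line contains `-id` followed by the full sandbox ID (which begins with the short pod ID printed by crictl).

```shell
#!/bin/sh
# shim_pids POD_ID
# Reads "PID COMMAND-LINE" lines on stdin (for example from `ps -eo pid=,args=`)
# and prints the PID of every containerd shim whose -id argument starts
# with POD_ID. (Hypothetical helper, for illustration only.)
shim_pids() {
    awk -v id="$1" '
        index($0, "containerd-shim") && index($0, "-id " id) { print $1 }
    '
}

# Example on the affected node:
#   ps -eo pid=,args= | shim_pids e7287b63125ed
#   sudo kill <PID>      # sends SIGTERM, the default signal that top also uses
```

Printing the PIDs first and killing them in a separate step leaves room to confirm that each process really belongs to the stuck pod. If nothing is printed, the assumption about the shim command line may not hold on that node, and the interactive steps above remain the reference.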