etcd and kube-apiserver pods are in CrashLoopBackOff on Guest Cluster after a power outage event

Article ID: 409549

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • The etcd and kube-apiserver pods are in CrashLoopBackOff in the Guest Cluster.
  • This is observed after a power outage. It can also occur after a node is powered off.
  • The etcd and kube-apiserver containers keep restarting.
  • The etcd container logs show the following error:

YYYY-MM-DDTHH:MM:SS.092110285Z stderr F panic: assertion failed: Page expected to be: <#123#>, but self identifies as <#1234567#>

  • To retrieve the container logs:

crictl logs <container id>
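
The log check above can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function names has_page_panic and print_page_panic are hypothetical helpers written for this article and are not part of crictl or etcd. They only read log text from standard input and do not change anything on the node.

```shell
#!/bin/sh
# Hypothetical helpers (not part of crictl or etcd).
# The page IDs differ per incident, so the pattern accepts any value in their place.
PAGE_PANIC_PATTERN='panic: assertion failed: Page expected to be: .+, but self identifies as .+'

# Succeeds (exit status 0) only if the page-assertion panic appears in the log text on stdin.
has_page_panic() {
  grep -Eq "$PAGE_PANIC_PATTERN"
}

# Prints every log line on stdin that contains the panic, e.g. to attach to a support case.
print_page_panic() {
  grep -E "$PAGE_PANIC_PATTERN"
}
```

As a usage example, crictl ps -a --name etcd lists the etcd containers, including exited ones, and crictl logs <container id> 2>&1 | print_page_panic prints any matching lines for one of them. The 2>&1 redirect is included on the assumption that the panic is written to the container's stderr stream, as the "stderr" field in the sample log line suggests.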

Environment

vSphere with Tanzu

VKS

VKS Guest Cluster

Cause

The panic is raised by a consistency check on the etcd database file: a page that is expected to carry one page ID identifies itself with a different ID. The assertion fails and the etcd process panics. This points to corruption of the etcd database file, most likely caused by the abrupt power loss.

Resolution

Contact Broadcom VCF Support to recover the node state.
The troubleshooting steps are determined after a thorough examination of the health of the current etcd database and of the other etcd members. Remediation may include, but is not limited to, recreating the affected node(s) or repairing the etcd database.