ESXi host in "not responding" state

Article ID: 398420

Products

  • VMware vSphere ESX 7.x
  • VMware vSphere ESX 8.x

Issue/Introduction

Symptoms:

  • An ESXi host shows as "not responding" in vCenter Server.
  • Cannot connect the ESXi host to vCenter Server.
  • Virtual machines running on the host are still accessible.
  • The ESXi host enters the "not responding" state because the /etc RAM disk is full. vmkernel.log contains entries similar to the following:

YYYY-MM-DDTHH-MM-SS In vmkernel: cpu50:2101467) Activating Jumpstart plugin nicd.
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu63:2101481) Activating Jumpstart plugin vmfstraced.
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu37:2101485) Activating Jumpstart plugin lbtd.
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu22:2101882) Admission failure in path: host/system/visorfs/ramdisks/etc:etc
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu22:2101882) etc (270) requires 4 KB, asked 4 KB from etc (269) which has 28672 KB occupied and 0 KB available.
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu22:2101882) Admission failure in path: host/system/visorfs/ramdisks/etc:etc
YYYY-MM-DDTHH-MM-SS In vmkernel: cpu22:2101882) etc (270) requires 4 KB, asked 4 KB from etc (269) which has 28672 KB occupied and 0 KB available.
YYYY-MM-DDTHH-MM-SS Wa vmkwarning: cpu22:2101882) WARNING: VisorFSRam: 220: Cannot extend visorfs file /etc/vmsyslog.conf.d/portlldpd.conf because its ramdisk (etc) is full.

  • RAM disk usage for /etc is at 100%.

  • Compared with the other configuration files in the NSX directory (/etc/vmware/nsx), controller-info.xml alone is unusually large. The file is expected to be around 8 to 10 MB.
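
As a rough way to confirm the log symptom from the ESXi shell, the following sketch counts the RAM-disk-full warnings in a log file. The function name count_ramdisk_full is a hypothetical helper, the match pattern is taken from the warning line in the log excerpt, and the log path in the usage comment is an assumption that may need adjusting.

```shell
# Hypothetical helper: count "ramdisk ... is full" warnings in a vmkernel log.
# $1 = path to the log file to scan.
count_ramdisk_full() {
    # grep -c prints the number of matching lines (0 when none match).
    grep -c 'VisorFSRam.*ramdisk (.*) is full' "$1"
}

# Example (the log path is an assumption):
#   count_ramdisk_full /var/log/vmkernel.log
```

A non-zero count only indicates that some RAM disk filled up; the name in parentheses in the matching lines shows which one.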

Environment

  • VMware vSphere ESXi 7.0.x
  • VMware vSphere ESXi 8.0.x

Cause

The issue is caused by a corrupted controller-info.xml file in /etc/vmware/nsx. The corruption can include stray whitespace or other invalid content, which inflates the file (often to more than 10 MB) and fills the /etc RAM disk.
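
To get a feel for whether whitespace accounts for the growth, a minimal sketch such as the one below can report how many bytes of a file are whitespace. The helper name whitespace_report is hypothetical, and the approach (comparing byte counts before and after stripping whitespace) is only an illustration, not a documented diagnostic.

```shell
# Hypothetical helper: print total bytes and whitespace bytes of a file.
# $1 = path to the file to inspect.
whitespace_report() {
    total=$(wc -c < "$1" | tr -d ' ')
    # Strip spaces, tabs, carriage returns and newlines, then count the rest.
    non_ws=$(tr -d ' \t\r\n' < "$1" | wc -c | tr -d ' ')
    echo "total=$total whitespace=$((total - non_ws))"
}

# Example:
#   whitespace_report /etc/vmware/nsx/controller-info.xml
```

A whitespace count that makes up most of the total would be consistent with the corruption described here.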

Resolution

To resolve the issue, follow the steps below:

Step 1: Clear Old Log Files

Step 2: Validate the controller-info.xml File

  • If the issue persists after clearing log files, check the size of the NSX configuration file:
    ls -lh /etc/vmware/nsx/controller-info.xml
  • If controller-info.xml is larger than 10 MB, proceed with the steps below.
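
The size check can also be scripted. The sketch below is a hypothetical helper (is_oversized is not an ESXi command) that succeeds when a file is larger than a limit in MB, defaulting to the 10 MB threshold used in this article.

```shell
# Hypothetical helper: exit 0 if a file is larger than a limit, 1 if not,
# and 2 if the file does not exist.
# $1 = file path, $2 = limit in MB (defaults to 10).
is_oversized() {
    limit_mb=${2:-10}
    [ -f "$1" ] || return 2
    size_bytes=$(wc -c < "$1" | tr -d ' ')
    [ "$size_bytes" -gt "$((limit_mb * 1024 * 1024))" ]
}

# Example:
#   if is_oversized /etc/vmware/nsx/controller-info.xml; then
#       echo "controller-info.xml exceeds 10 MB"
#   fi
```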

Step 3: Remove and Recreate controller-info.xml

  1. Stop NSX Proxy Service:
    /etc/init.d/nsx-proxy stop
  2. Remove the Corrupted File:
    rm /etc/vmware/nsx/controller-info.xml
  3. Restart Critical Services:
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart
  4. Start NSX Proxy Again:
    /etc/init.d/nsx-proxy start
  5. Verify File Re-Creation:
    • Ensure that the controller-info.xml file is automatically regenerated:
      ls -lh /etc/vmware/nsx/controller-info.xml
  6. Check NSX Controller Connectivity:
    • Confirm that the host has re-established communication with the NSX Controller.
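
The commands in Step 3 can be collected into a small script for review before running them. This is a sketch only: run and recreate_controller_info are hypothetical helpers, and the DRY_RUN variable is an assumption of this example. By default the commands are printed rather than executed.

```shell
# Hypothetical wrapper: with DRY_RUN=1 (the default here) a command is only
# printed; with DRY_RUN=0 it is executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The Step 3 commands in order; the chain stops at the first failure.
recreate_controller_info() {
    run /etc/init.d/nsx-proxy stop &&
    run rm /etc/vmware/nsx/controller-info.xml &&
    run /etc/init.d/hostd restart &&
    run /etc/init.d/vpxa restart &&
    run /etc/init.d/nsx-proxy start &&
    run ls -lh /etc/vmware/nsx/controller-info.xml
}

# Example:
#   recreate_controller_info             # print the commands only
#   DRY_RUN=0 recreate_controller_info   # execute them
```

The final ls may fail if the file has not yet been regenerated when it runs; in that case, repeat the check after a short wait.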

Step 4: Reboot the Host

  • Perform a clean reboot of the ESXi host.

Step 5: Post-Reboot Validation

  • After the reboot, verify that all services are running properly.
  • Check the RAM disk status using vdf -h and confirm that RAM disk usage is within normal limits.
  • Confirm that the controller-info.xml file has been reset and is correctly populated.
  • Confirm that NSX connectivity is restored.
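
For the RAM disk check, a filter like the one below can highlight rows that are close to full. It is a sketch under assumptions: flag_full_ramdisks is a hypothetical helper, the 90 percent default is an arbitrary example value, and it assumes only that vdf -h prints a percentage column in the style of df.

```shell
# Hypothetical filter: read df-style output on stdin and print the first
# field and the percentage of every row at or above the limit.
# $1 = limit in percent (defaults to 90, an arbitrary example value).
flag_full_ramdisks() {
    awk -v limit="${1:-90}" '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print $1, $i
                }
            }
        }'
}

# Example (assumes vdf -h prints a Use% column):
#   vdf -h | flag_full_ramdisks 90
```

No output means no row reached the limit. Because the filter looks at every row with a percentage, it may also list filesystems other than RAM disks.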

Additional Information