Reducing the ESXi host's memory by half triggered a PSOD.

Article ID: 439989

Products

VMware vSphere ESXi

Issue/Introduction

 

  • An ESXi 8.0.x host fails to boot and experiences a Purple Screen of Death (PSOD).

  • The PSOD screen indicates a Page Fault exception related to the Fault Tolerance network module:

      #PF Exception 14 in world [ID]: jumper2 IP [Address] addr [Address]
      Module(s) involved in panic: [ftcpt]

  • The PSOD backtrace shows the crash occurring during network initialization:

      FTCptNetGetListenerNetStackForBind
      FTCpt_Init

  • At the bottom of the PSOD screen, or in the console logs prior to the crash, the following error messages are observed:

      Unable to restore Resource Pool settings for host/user/pool[ID]. It is possible hardware or memory constraints have changed. Please verify settings in the vSphere Client.
      Default TCP/IP stack missing. Aborting ESX Firewall.
      Jumpstart plugin network-support activation failed.

  • Attempting to restart or power cycle the ESXi host via BMC (e.g., iDRAC, iLO, XClarity) does not resolve the issue; the host consistently encounters the exact same PSOD on every boot attempt.
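The symptoms above can be checked mechanically against a saved console log or a transcription of the PSOD screen. The following is a minimal sketch, assuming the text has already been captured into a string. The function names and the signature labels are hypothetical; only the matched message fragments come from this article.

```python
import re

# Signature patterns taken from the messages quoted in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "page_fault": re.compile(r"#PF Exception 14 in world"),
    "ftcpt_module": re.compile(r"Module\(s\) involved in panic:.*\bftcpt\b"),
    "ftcpt_backtrace": re.compile(r"FTCptNetGetListenerNetStackForBind|FTCpt_Init"),
    "resource_pool_restore": re.compile(r"Unable to restore Resource Pool settings"),
    "tcpip_stack_missing": re.compile(r"Default TCP/IP stack missing"),
    "network_support_failed": re.compile(
        r"Jumpstart plugin network-support activation failed"
    ),
}


def match_psod_signatures(log_text: str) -> dict[str, bool]:
    """Report which of the known signatures appear in the captured text."""
    return {name: bool(p.search(log_text)) for name, p in SIGNATURES.items()}


def looks_like_this_issue(log_text: str) -> bool:
    """Heuristic: the ftcpt page fault plus at least one pre-crash message."""
    hits = match_psod_signatures(log_text)
    crash = hits["page_fault"] and hits["ftcpt_module"]
    precursor = (
        hits["resource_pool_restore"]
        or hits["tcpip_stack_missing"]
        or hits["network_support_failed"]
    )
    return crash and precursor
```

A match is only a hint that the host fits this article's symptom pattern; it does not replace reviewing the full PSOD screen and logs.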

 

Environment

VMware ESXi 8.0

VCF 5.x

Cause

This issue occurs when a significant amount of physical memory (RAM) is removed from the ESXi host (e.g., reducing from 32 DIMMs to 16 DIMMs) without modifying the system's resource reservations prior to shutdown.

ESXi strictly manages memory reservations for both virtual machines and internal system services (System Resource Pools) via the esx.conf file. When physical memory is drastically reduced, the available RAM falls below the previously configured hard reservations. Consequently, the host's resource scheduler fails to build the resource tree, leaving core components like the TCP/IP stack without the allocated memory required to initialize.

During the boot sequence, the ftcpt (Fault Tolerance Checkpoint) module attempts to bind to the TCP/IP network stack. Because the stack failed to load due to the memory constraint, the module attempts an invalid memory access, triggering a Page Fault (Exception 14) and forcing a kernel panic.
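The failure condition described above comes down to a budget check: the sum of hard reservations must fit within installed memory. The following is a simplified sketch of that arithmetic, not the ESXi scheduler's actual algorithm. The `Reservation` type, the function name, the 32 GB DIMM size, and all figures are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """A hard memory reservation held by a VM or a system resource pool."""
    owner: str        # illustrative label only
    reserved_gb: int  # reserved memory in GB


def reservations_fit(installed_gb: int, reservations: list[Reservation]) -> bool:
    """True when the summed reservations do not exceed installed memory.

    Simplified model: it ignores overhead memory and any other factors
    the real scheduler considers.
    """
    total = sum(r.reserved_gb for r in reservations)
    return total <= installed_gb


# Illustrative numbers only, assuming 32 GB DIMMs (not taken from a real host).
pools = [
    Reservation("system", 64),
    Reservation("vm-a", 384),
    Reservation("vm-b", 256),
]  # 704 GB reserved in total

fits_with_32_dimms = reservations_fit(32 * 32, pools)  # 1024 GB installed
fits_with_16_dimms = reservations_fit(16 * 32, pools)  # 512 GB installed
```

In this example the reservations fit in 1024 GB but not in 512 GB, which mirrors the situation where the host boots with the original memory and fails after the reduction.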

Resolution

To resolve this issue, you must force the ESXi host to recalculate its resource pools based on the new physical hardware footprint. This is achieved by performing an in-place upgrade/repair using the ESXi installation ISO.

  1. Mount or insert the ESXi 8.0.x installation media (ISO) to the affected host via virtual media (iLO/iDRAC/IPMI) or USB.

  2. Boot the host from the installation media.

  3. Follow the ESXi installer prompts until it scans the local disks and detects the existing ESXi installation.

  4. When prompted to select an installation type, select "Upgrade ESXi, preserve VMFS datastore". Caution: Do not select "Install and overwrite VMFS datastore", as this will result in data loss.

  5. Complete the installation wizard and reboot the host.

  6. Upon reboot, the system generates a fresh configuration file with system resource pool allocations that match the current hardware state (16 DIMMs in the example above), allowing the network stack to load and preventing the PSOD.

Workaround:

If an in-place repair is not immediately feasible, the issue can be bypassed by temporarily reinstalling the removed hardware:

  1. Power off the host and reinstall the removed physical memory (reverting to the original 32 DIMM configuration).

  2. Boot the ESXi host. It should boot successfully without the PSOD.

  3. Once booted, remove any strict memory reservations configured on Resource Pools or Virtual Machines via the vSphere Client.

  4. Place the host into Maintenance Mode and gracefully shut it down.

  5. With the host powered off, remove the physical memory again. The host should now boot normally with the reduced memory footprint.
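Step 3 of the workaround requires locating every object that still holds a memory reservation. The following is a minimal sketch, assuming the inventory has been exported to CSV text by some other means. The column names `name` and `mem_reservation_mb` and the function name are hypothetical and do not correspond to a specific vSphere export format.

```python
import csv
import io


def nonzero_reservations(csv_text: str) -> list[tuple[str, int]]:
    """Return (name, reservation in MB) for every row with a reservation > 0."""
    rows = csv.DictReader(io.StringIO(csv_text))
    found = []
    for row in rows:
        reserved = int(row["mem_reservation_mb"])
        if reserved > 0:
            found.append((row["name"], reserved))
    return found


# Hypothetical export: one row per VM or resource pool.
inventory = (
    "name,mem_reservation_mb\n"
    "vm-a,0\n"
    "vm-b,4096\n"
    "pool-1,8192\n"
)
to_clear = nonzero_reservations(inventory)
```

Each entry returned is a candidate for having its reservation removed in the vSphere Client before the host is shut down.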