During an NSX Edge node upgrade, the process fails at the install_os step, and an alarm similar to the following is observed in the UI or API:
Fail Message:
Edge group upgrade status is FAILED for group <UUID> eXXXXXdg01: install OS task failed on edge TransportNode <UUID>: return status Polling install_os failed to complete in time.
System top output shows high disk I/O wait:
top - 07:24:49 ...
%Cpu(s): 3.1 us, 3.2 sy, 15.3 id, **78.0 wa**, 0.0 hi, 0.1 si, 0.0 st
load average: **37.03, 27.01, 13.94**
VMware NSX-T
The upgrade failed due to poor storage performance, resulting in:
Excessive disk I/O wait time (%wa)
install_os step within the allowed time windowInvestigate storage performance on the Edge VM.
Switch to a healthier datastore policy or migrate the Edge VM to faster-performing storage.
Include disk I/O health checks as part of your upgrade readiness validation, especially in shared or constrained environments.