NSX Edge Upgrade Fails with “Polling install_os failed to complete in time” Due to High Disk Latency
search cancel

NSX Edge Upgrade Fails with “Polling install_os failed to complete in time” Due to High Disk Latency

book

Article ID: 403911

calendar_today

Updated On:

Products

VMware NSX

Issue/Introduction

During an NSX Edge node upgrade, the process fails at the install_os step, and an alarm similar to the following is observed in the UI or API:

Fail Message:

Edge group upgrade status is FAILED for group <UUID> eXXXXXdg01: install OS task failed on edge TransportNode <UUID>: return status Polling install_os failed to complete in time.

System top output shows high disk I/O wait:

top - 07:24:49 ...
%Cpu(s): 3.1 us, 3.2 sy, 15.3 id, **78.0 wa**,  0.0 hi,  0.1 si,  0.0 st
load average: **37.03, 27.01, 13.94**

Environment

VMware NSX-T

Cause

The upgrade failed due to poor storage performance, resulting in:

  • Excessive disk I/O wait time (%wa)

  • Failure to complete the install_os step within the allowed time window

Resolution

Investigate storage performance on the Edge VM.

Switch to a healthier datastore policy or migrate the Edge VM to faster-performing storage.

Additional Information

Include disk I/O health checks as part of your upgrade readiness validation, especially in shared or constrained environments.