When you run the VCF Edge resizing utility during an infrastructure maintenance window or upgrade, the script halts and blocks further automated cluster operations. The utility log shows Edge node state validation failing on nodes that have active resource constraints or alarms.
The edge_node_resizer_<TIMESTAMP>.log file contains the following error:
[ERROR resize.py::_getEdgeNodeStates::<LINE>] Edge node <HOSTNAME> state is not 'success'
At the same time, the terminal running the utility displays:
Edge cluster's nodes are not all healthy, stopping now. Resizing utility is stopping now.
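To identify which Edge nodes the utility rejected, the log can be scanned for the error line quoted above. The following is a minimal sketch, not part of the resizing utility: the function names are hypothetical, and the pattern assumes the log line has exactly the format shown in this article, with <LINE> being a number and <HOSTNAME> containing no spaces.

```python
import re
from pathlib import Path

# Assumption: the error line matches the format quoted in this article.
# Any prefix before "[ERROR" (for example a timestamp) is ignored.
STATE_ERROR = re.compile(
    r"\[ERROR resize\.py::_getEdgeNodeStates::\d+\] "
    r"Edge node (?P<host>\S+) state is not 'success'"
)


def failing_edge_nodes(log_text: str) -> list[str]:
    """Return hostnames reported as not 'success', in first-seen order."""
    seen: dict[str, None] = {}
    for match in STATE_ERROR.finditer(log_text):
        # A dict keeps insertion order and drops repeated hostnames.
        seen.setdefault(match.group("host"), None)
    return list(seen)


def failing_edge_nodes_in_file(path: str) -> list[str]:
    """Hypothetical helper: read a resizer log file and scan its contents."""
    return failing_edge_nodes(Path(path).read_text(errors="replace"))
```

The resulting hostnames can then be compared against the Configuration State and Node Status shown for the same nodes in the NSX Manager UI, and included when a support case is opened.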
In the NSX Manager UI, the affected Edge transport node shows a Configuration State of Success and a Node Status of Up, but has active memory or resource utilization alarms (indicated by a yellow warning/alarm badge).
The exact cause of this issue is not known at this time.
If the resizing script fails and its output conflicts with the Edge node status or health shown in the NSX UI, open a support case with Broadcom Support. For more information, see Creating and managing Broadcom support cases.