In VMware Cloud Foundation (VCF) and Telco Cloud environments that use Mellanox InfiniBand adapters, NSX Manager may report an ESXi host status as Degraded. This occurs even when the InfiniBand adapters are not used for NSX-managed traffic.
Symptoms:
VMware NSX
NSX does not distinguish between a port that carries NSX traffic (such as VM networks) and a port dedicated to a specialized role (such as InfiniBand SR-IOV). Because Mellanox adapters must be attached as a vSwitch/VDS uplink to expose SR-IOV Virtual Functions (VFs), NSX sees the "Link Down" state of the InfiniBand port (which has no RoCE/Ethernet link) and erroneously treats it as a failure in the host's networking fabric, regardless of whether the pNIC is part of an NSX Transport Node Profile.
This is currently expected behavior given the NSX health monitoring architecture. Follow the steps below to verify that the port NSX is reporting on is the InfiniBand port and not a genuine Ethernet failure.
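As a rough illustration of the distinction described above, the following Python sketch reads saved output from `esxcli network nic list` and lists the pNICs that are link-down and bound to a Mellanox driver. It is a minimal sketch under stated assumptions, not an NSX or VMware tool: the sample text, the vmnic names, and the `nmlx` driver prefix are assumptions for illustration, and the function name `down_mellanox_nics` is hypothetical. The result only narrows down candidates; it does not confirm that a port is running in InfiniBand mode.

```python
# Hypothetical sample in the usual `esxcli network nic list` column layout.
# All values below are invented for illustration only.
SAMPLE = """\
Name    PCI Device    Driver      Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
------  ------------  ----------  ------------  -----------  -----  ------  -----------------  ----  -----------
vmnic0  0000:3b:00.0  i40en       Up            Up           10000  Full    00:00:00:00:00:01  1500  Example Ethernet adapter
vmnic4  0000:5e:00.0  nmlx5_core  Up            Down             0  Half    00:00:00:00:00:02  1500  Example Mellanox adapter
"""


def down_mellanox_nics(text, driver_prefix="nmlx"):
    """Return names of pNICs that are link-down and use a Mellanox driver.

    Assumption: Mellanox native drivers start with "nmlx" (e.g. nmlx5_core).
    """
    flagged = []
    # Skip the header row and the dashed separator row.
    for line in text.splitlines()[2:]:
        # The first nine columns are single tokens; the description may
        # contain spaces, so it is kept as one trailing field.
        parts = line.split(None, 9)
        if len(parts) < 9:
            continue  # blank or malformed line
        name, driver, link_status = parts[0], parts[2], parts[4]
        if link_status == "Down" and driver.startswith(driver_prefix):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(down_mellanox_nics(SAMPLE))
```

If the pNIC named in the NSX alarm appears in this list, it is a candidate for the InfiniBand case; a link-down pNIC on a non-Mellanox driver is more likely a genuine Ethernet problem.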
Steps to do this: