vSAN Proactive Network Performance Test fails during 2-node Direct-Connect failover validation

Article ID: 441518

Products

VMware vSAN

Issue/Introduction

When simulating or validating network redundancy in a vSAN 2-node Direct-Connect topology (e.g., by administratively downing a physical uplink/vmnic), the following symptoms are observed:

  • The vSAN Proactive Network Performance Test returns a Failed status (Red).
  • The bandwidth throughput reports as 0 Mb/s.
  • The vSAN Health UI may display "Network Performance Test: Failed" despite active data paths.
  • The vmkping command between hosts over the remaining active link is successful.
  • Manually running iperf3 to verify paths may result in the following error: iperf3: error - unable to start listener for connections: Operation not permitted

Environment

  • VMware vSAN 7.x or 8.x (OSA and ESA).
  • 2-node Direct-Connect architecture.
  • Uplink failure simulated via esxcli network nic down or physical cable disconnection.
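
To confirm which uplinks are currently down in such an environment, the output of esxcli network nic list can be filtered. The following is a minimal sketch: the function name list_down_uplinks is hypothetical, and it assumes that Link Status is the fifth whitespace-separated column of each vmnic row, which should be checked against the output on the host in question.

```shell
#!/bin/sh
# Sketch: read `esxcli network nic list` output on stdin and print the
# name of every vmnic whose link is not Up.
# Assumption: the fifth whitespace-separated field of each vmnic row is
# the Link Status column. Verify this layout on your ESXi build.
list_down_uplinks() {
    awk '$1 ~ /^vmnic/ && $5 != "Up" { print $1 }'
}
```

On a host, this could be used as: esxcli network nic list | list_down_uplinks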

Cause

The vSAN Proactive Network Performance Test iterates over a circular list of hosts to initiate iperf3 sessions between all hosts in the cluster.

  1. The test expects every configured VMkernel interface in the vSAN network to be reachable to complete the aggregate bandwidth calculation.
  2. In a 2-node direct-connect environment where an uplink is down, the specific iperf3 session mapped to that hardware path fails.
  3. Because the test design treats any interrupted session as a total test failure, the result returns 0 Mb/s, even if the partner node remains reachable over the redundant failover path.
  4. The Operation not permitted error is caused by a security policy (appDom) that prevents applications from binding to listening ports.

This behavior is a limitation of the proactive test's reporting logic and does not necessarily indicate a total loss of vSAN data path connectivity.
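
The all-or-nothing reporting described above can be modeled with a short shell sketch. This is an illustrative model only, not the health service's actual code: the function name aggregate_result, the FAIL marker, the simple summation of per-session results, and the throughput figures in the usage example are all assumptions.

```shell
#!/bin/sh
# Illustrative model only: NOT the actual vSAN health service logic.
# Each argument is one iperf3 session result in Mb/s, or the literal
# string FAIL for a session that could not complete.
aggregate_result() {
    total=0
    for session in "$@"; do
        if [ "$session" = "FAIL" ]; then
            # A single interrupted session fails the whole test,
            # regardless of what the other sessions measured.
            echo "Failed: 0 Mb/s"
            return 1
        fi
        total=$((total + session))
    done
    echo "Passed: ${total} Mb/s"
}
```

In this model, aggregate_result 9400 9350 reports a pass with the summed throughput, while aggregate_result 9400 FAIL reports 0 Mb/s even though one session completed normally. This mirrors why the test shows a failure while the surviving link is still carrying vSAN traffic.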

Resolution

During a degraded network state (failover validation), rely on CLI-based verification rather than the Proactive Network Performance Test.

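  1. Verify Redundant Connectivity: Manually verify that the vSAN VMkernel adapters can reach each other over the surviving path. Before running the command, replace vmkX with the local vSAN VMkernel adapter and [PARTNER_VMK_IP] with the partner node's vSAN IP address.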
    # Verify reachability via specific VMkernel adapter
    vmkping -I vmkX [PARTNER_VMK_IP]
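  2. Manual iperf3 Testing: If manual iperf3 testing is required and it returns the "Operation not permitted" error, temporarily disable the security policy:
    # Disable the security policy
    localcli system secpolicy domain set -n appDom -l disabled
    
    # Start the iperf3 listener (server side) on this host
    /usr/lib/vmware/vsan/bin/iperf3 -s
    # The listener only measures traffic once a client connects. For example,
    # from the partner host (placeholder: this host's vSAN VMkernel IP):
    #   /usr/lib/vmware/vsan/bin/iperf3 -c [SERVER_VMK_IP]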
    
    # Re-enable the security policy after testing
    localcli system secpolicy domain set -n appDom -l enforcing
  3. Confirm Teaming Policy: Ensure the vSAN Distributed Portgroup is configured with Active/Unused (or Active/Standby) teaming and that each VMkernel adapter is pinned to its respective direct-connect uplink.
  4. Perform Post-Restoration Testing: Only run the vSAN Proactive Network Performance Test once the simulated failure has been reverted and all uplinks are in an Up state. The test will then accurately report the aggregate throughput of both links.
  5. Reference Design: Ensure the environment aligns with the vSAN 2-Node Direct Connect Network Design Guide.
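
The reachability check in step 1 can be repeated across several VMkernel adapters with a small wrapper. The following is a minimal sketch, not a supported tool: the function name check_vsan_paths and the VMKPING override variable are hypothetical, and the adapter names and IP address in the usage example are placeholders.

```shell
#!/bin/sh
# Sketch: ping the partner node through each listed vSAN VMkernel adapter
# and report one line per adapter.
# VMKPING is a hypothetical override for dry runs; it defaults to vmkping.
check_vsan_paths() {
    partner_ip=$1
    shift
    failed=0
    for vmk in "$@"; do
        # -I selects the outgoing VMkernel adapter, -c limits the ping count.
        if "${VMKPING:-vmkping}" -I "$vmk" -c 3 "$partner_ip" >/dev/null 2>&1; then
            echo "$vmk -> $partner_ip: reachable"
        else
            echo "$vmk -> $partner_ip: UNREACHABLE"
            failed=1
        fi
    done
    # Non-zero exit status if any adapter could not reach the partner.
    return "$failed"
}
```

For example, check_vsan_paths 192.168.10.2 vmk1 vmk2 would test both adapters against a partner at that placeholder address. During failover validation, at least one adapter reporting "reachable" indicates that the vSAN data path has survived the uplink failure.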

Additional Information

Related articles
vSAN -- Health Service -- Proactive Tests -- Network Performance Test
Network Performance Test reporting Health Status as Failed