This is a known issue with the Intel X710 NIC, confirmed by Intel and VMware.
VxRail vSAN hosts running the Intel X710 NIC randomly go into a Not Responding state in the cluster.
Around the time of the issue, the host's vmkernel logs may show errors like the following for the Ethernet Controller X710 for 10GbE SFP+ NICs (for example, vmnic0 and vmnic1 in the logs below):
2018-10-20T01:04:55.604Z cpu21:140896553)NetPort: 1662: enabled port 0x4000011 with mac ##:##:##:##:##:df
2018-10-20T01:05:15.214Z cpu42:678436159)WARNING: NetPort: 1934: failed to disable port 0x4000028 on DvsPortset-0: Busy
2018-10-20T01:05:15.214Z cpu42:678436159)netschedHClk: NetSchedHClkPortQuiesce:4918: vmnic1: received a force quiesce for port 0x4000028
2018-10-20T01:05:15.214Z cpu42:678436159)netschedHClk: NetSchedHClkPortQuiesce:4918: vmnic0: received a force quiesce for port 0x4000028
2018-10-20T01:05:15.214Z cpu42:678436159)netschedHClk: NetSchedHClkHashQuiesceHierarchyIter:396: vmnic0: dropped 506 pkts from queue netsched.pools.vm.67108904 while quiescing port 0x4000028
2018-10-20T01:05:15.215Z cpu42:678436159)Vmxnet3: 15916: There is still packet not transmitted when the device isdisabled, port:0x4000028, queue: 2
2018-10-20T01:05:15.215Z cpu42:678436159)NetPort: 1881: disabled port 0x4000028
2018-10-20T01:05:15.227Z cpu42:678436159)Vmxnet3: 17293: Disable Rx queuing; queue size 1024 is larger than Vmxnet3RxQueueLimit limit of 64.
2018-10-20T01:05:15.227Z cpu42:678436159)Vmxnet3: 17651: Using default queue delivery for vmxnet3 for port 0x4000028
2018-10-20T01:05:15.227Z cpu42:678436159)NetPort: 3208: resuming traffic on DV port 28679
2018-10-20T01:05:15.227Z cpu42:678436159)Team.etherswitch: TeamESPolicySet:5942: Port 0x4000028 frp numUplinks 2 active 2(max 2) standby 0
2018-10-20T01:05:15.227Z cpu42:678436159)Team.etherswitch: TeamESPolicySet:5950: Update: Port 0x4000028 frp numUplinks 2 active 2(max 2) standby 0
2018-10-20T01:05:15.227Z cpu42:678436159)NetPort: 1662: enabled port 0x4000028 with mac ##:##:##:##:##:14
2018-10-20T01:05:15.228Z cpu42:678436159)NetPort: 1881: disabled port 0x4000028
2018-10-20T01:05:15.259Z cpu42:678436159)Vmxnet3: 17293: Disable Rx queuing; queue size 1024 is larger than Vmxnet3RxQueueLimit limit of 64.
2018-10-20T01:05:15.259Z cpu42:678436159)Vmxnet3: 17651: Using default queue delivery for vmxnet3 for port 0x4000028
2018-10-20T01:05:15.259Z cpu42:678436159)NetPort: 3208: resuming traffic on DV port 28679
2018-10-20T01:05:15.259Z cpu42:678436159)Team.etherswitch: TeamESPolicySet:5942: Port 0x4000028 frp numUplinks 2 active 2(max 2) standby 0
2018-10-20T01:05:15.259Z cpu42:678436159)Team.etherswitch: TeamESPolicySet:5950: Update: Port 0x4000028 frp numUplinks 2 active 2(max 2) standby 0
--------------------------------------------------------------------------------------------------------------------------------------------------------
vmkernel.0:2018-10-30T02:51:47.141Z cpu58:672058372)netschedHClk: NetSchedHClkHashQuiesceHierarchyIter:396: vmnic0: dropped 8 pkts from queue netsched.pools.persist.default while quiescing port 0x300003f
vmkernel.0:2018-10-30T02:51:47.141Z cpu58:672058372)netschedHClk: NetSchedHClkHashQuiesceHierarchyIter:396: vmnic0: dropped 503 pkts from queue netsched.pools.vm.50331711 while quiescing port 0x300003f
vmkernel.0:2018-10-30T02:51:52.066Z cpu53:74667289)netschedHClk: NetSchedHClkPortQuiesce:4918: vmnic0: received a force quiesce for port 0x3000012
--------------------------------------------------------------------------------------------------------------------------------------------------------
vmkernel.0:2018-10-30T02:51:47.140Z cpu58:672058372)netschedHClk: NetSchedHClkPortQuiesce:4918: vmnic1: received a force quiesce for port 0x300003f
vmkernel.0:2018-10-30T02:51:52.066Z cpu53:74667289)netschedHClk: NetSchedHClkPortQuiesce:4918: vmnic1: received a force quiesce for port 0x3000012
--------------------------------------------------------------------------------------------------------------------------------------------------------
2018-10-30T02:51:47.141Z cpu58:672058372)netschedHClk: NetSchedHClkHashQuiesceHierarchyIter:396: vmnic0: dropped 8 pkts from queue netsched.pools.persist.default while quiescing port 0x300003f
2018-10-30T02:51:47.141Z cpu58:672058372)netschedHClk: NetSchedHClkHashQuiesceHierarchyIter:396: vmnic0: dropped 503 pkts from queue netsched.pools.vm.50331711 while quiescing port 0x300003f
--------------------------------------------------------------------------------------------------------------------------------------------------------
grep "while quiescing port" vmkernel.* | wc
1128 15792 228683
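The raw count above does not show which uplink is affected. As a rough sketch, the same log lines can be summarized per vmnic. The function name quiesce_drops_per_nic is hypothetical, and the pattern assumes the "vmnicN: dropped N pkts ... while quiescing port" message format shown in the excerpts above.

```shell
# Hypothetical helper: summarize "dropped N pkts ... while quiescing port"
# messages per uplink. Prints one line per vmnic: <vmnic> <events> <packets>.
# Assumes the vmkernel message format shown in the log excerpts above.
quiesce_drops_per_nic() {
    grep -h "while quiescing port" "$@" \
        | sed -n 's/.*: \(vmnic[0-9]*\): dropped \([0-9]*\) pkts.*/\1 \2/p' \
        | awk '{ events[$1]++; pkts[$1] += $2 }
               END { for (n in events) print n, events[n], pkts[n] }' \
        | sort
}

# Example usage, from the directory holding the vmkernel logs:
#   quiesce_drops_per_nic vmkernel.*
```

A high event count on every X710 uplink is consistent with the driver and firmware issue described here, while drops confined to a single uplink may point to a cabling or switch-port problem instead.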
--------------------------------------------------------------------------------------------------------------------------------------------------------
grep "time since last heartbeat" vpxd-*.log
vpxd-681.log:2018-10-30T02:38:32.961Z info vpxd[7F086F66C700] [Originator@6876 sub=HostCnx opID=CheckforMissingHeartbeats-7460ee5a] [VpxdHostCnx] No heartbeats received from host; cnx: ########-####-####-####-########8547, h: host-115, time since last heartbeat: 63271ms
The Intel X710 NIC is running driver and firmware versions that are outdated according to the VMware HCL. Install the latest supported driver and firmware.
Intel(R) Ethernet 10G 4P X710 Compatibility link
Recommended driver: i40en version 1.7.11 or later
Recommended firmware: 18.5.0 or later
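To compare a host against the recommended driver version, the output of "esxcli network nic get -n vmnic0" can be checked from the ESXi shell. The sketch below is illustrative only: version_ge and check_nic_driver are hypothetical helper names, and the parsing assumes the driver version appears on an indented "Version:" line under "Driver Info:" in that output. The firmware string reported by esxcli is printed in a different format, so it is best compared manually against the HCL entry.

```shell
# Hypothetical helper: succeed (return 0) if dotted version $1 >= $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: read "esxcli network nic get" output on stdin and
# report whether the driver version meets the minimum named in this article.
# Assumes an indented "Version: x.y.z" line (not "Firmware Version:").
check_nic_driver() {
    min="1.7.11"
    ver=$(awk -F': ' '/^ *Version:/ { gsub(/[ \t\r]+$/, "", $2); print $2; exit }')
    if [ -z "$ver" ]; then
        echo "driver version not found"
        return 2
    fi
    if version_ge "$ver" "$min"; then
        echo "OK: driver version $ver"
    else
        echo "OUTDATED: driver version $ver (want $min or later)"
        return 1
    fi
}

# Example usage on an ESXi host:
#   esxcli network nic get -n vmnic0 | check_nic_driver
```

Repeat the check for each X710 uplink (vmnic0 and vmnic1 in the logs above), since the ports share one installed driver but the check confirms which driver each uplink is actually bound to.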