Packet loss and intermittent connectivity on Network Extensions can have several causes that are identifiable only through detailed packet analysis.
Packet captures make it easier to determine which cause is behind the intermittent connectivity issues on a specific VLAN.
a. Connect to your ESXi host via SSH.
b. Find your HCX Network Extension appliance's World ID:
esxcli network vm list | grep HCX
c. Map the Network Extension's switchports to their corresponding VLANs by following the detailed correlation process in KB 388893: Correlating HCX Network Extension Switchports to VLANs
d. Document the switchport ID for the VLAN experiencing issues:
VLAN ID: 222
Switchport ID: 78453###
MAC Address: 00:50:56:00:74:22
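The lookup in step b can be scripted so that only the World ID is printed. The following is a minimal sketch, not part of the KB procedure: `world_ids_matching` is a hypothetical helper, and it assumes that `esxcli network vm list` prints two header lines followed by one row per VM with the World ID in the first column.

```shell
# Print the World ID of every VM whose row contains the given text.
# Reads `esxcli network vm list` output on stdin.
# Assumption: two header lines, then one row per VM, World ID first.
world_ids_matching() {
    awk -v pat="$1" 'NR > 2 && index($0, pat) > 0 { print $1 }'
}

# Usage on the ESXi host (not executed here):
#   esxcli network vm list | world_ids_matching HCX
#   esxcli network vm port list -w <world_ID>
```

The second usage line lists the ports that belong to that World ID, which is the starting point for the switchport-to-VLAN correlation described in KB 388893.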
a. Connect to the ESXi host via SSH and navigate to a suitable datastore for storing captures:
cd /vmfs/volumes/<datastore_name>
mkdir -p hcx_packet_captures
b. Execute a focused packet capture on the specific switchport:
pktcap-uw --switchport <switchport_ID> -o /vmfs/volumes/<datastore_name>/hcx_packet_captures/vlan_capture.pcap
c. For detailed information about additional pktcap-uw options, refer to KB 341568: Packet capture on ESXi using the pktcap-uw tool.
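Repeated captures written to the same file name overwrite each other, so it can help to put the VLAN ID and a timestamp in the name. The sketch below is one way to do that: `capture_path` is a hypothetical helper, and the `vlan<ID>_<timestamp>.pcap` naming scheme is an assumption, not a convention from the KB.

```shell
# Build a capture file path that includes the VLAN ID and a timestamp.
# $1 = capture directory, $2 = VLAN ID, $3 = optional timestamp (defaults to now)
capture_path() {
    stamp=${3:-$(date +%Y%m%d-%H%M%S)}
    printf '%s/vlan%s_%s.pcap\n' "$1" "$2" "$stamp"
}

# Usage on the ESXi host (not executed here):
#   pktcap-uw --switchport <switchport_ID> \
#     -o "$(capture_path /vmfs/volumes/<datastore_name>/hcx_packet_captures 222)"
```

The optional third argument exists only so that the name can be fixed when several captures should share one timestamp.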
a. Capture at the VM switchport level to see traffic from the HCX Network Extension appliance:
pktcap-uw --switchport <switchport_ID> --capture VnicTx,VnicRx -o /vmfs/volumes/<datastore>/switchport_capture.pcap
b. Simultaneously capture at the physical uplink to determine if packets are being dropped between the virtual and physical layers:
pktcap-uw --uplink vmnic<#> --capture UplinkSndKernel,UplinkRcvKernel -o /vmfs/volumes/<datastore>/uplink_capture.pcap
c. Add filtering to focus on relevant traffic:
# Only capture traffic to/from a specific endpoint
pktcap-uw --switchport <switchport_ID> --ip 10.#.#.# -o /vmfs/volumes/<datastore>/filtered_capture.pcap
# Limit packet size to just capture headers (reduces file size)
pktcap-uw --switchport <switchport_ID> -s 128 -o /vmfs/volumes/<datastore>/headers_only.pcap
# Capture only a specific protocol (e.g., ICMP)
pktcap-uw --switchport <switchport_ID> --proto 0x01 -o /vmfs/volumes/<datastore>/icmp_capture.pcap
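The three filters above are shown separately, but they can likely be combined in one invocation to capture only ICMP headers for a single endpoint. The sketch below prints such a command rather than running it, so it can be reviewed first. `icmp_header_capture_cmd` is a hypothetical helper, and the assumption that pktcap-uw applies all supplied filter options together should be checked against KB 341568.

```shell
# Print (do not run) a pktcap-uw command that combines the endpoint,
# protocol, and snapshot-length options from the examples above.
# $1 = switchport ID, $2 = endpoint IP, $3 = output file
icmp_header_capture_cmd() {
    printf 'pktcap-uw --switchport %s --ip %s --proto 0x01 -s 128 -o %s\n' "$1" "$2" "$3"
}

# Usage (prints the command for review; paste it into the ESXi shell to run):
#   icmp_header_capture_cmd <switchport_ID> 10.#.#.# /vmfs/volumes/<datastore>/icmp_headers.pcap
```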
Download the capture files and analyze them in Wireshark, looking specifically for:
a. TCP retransmissions indicating packet loss:
tcp.analysis.retransmission
b. Round-trip time anomalies showing latency issues:
tcp.analysis.ack_rtt
c. Duplicate ACKs signaling packet loss:
tcp.analysis.duplicate_ack
d. Zero window conditions indicating buffer issues:
tcp.analysis.zero_window
e. TCP connection resets:
tcp.flags.reset == 1
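For a quick first pass, the loss-related filters above can be joined into one expression and counted from the command line. This is a sketch under assumptions: `join_filters` and `LOSS_FILTER` are hypothetical names, and the usage line requires tshark (the Wireshark command-line tool) on the analysis workstation, not on the ESXi host.

```shell
# Join display filters with Wireshark's logical OR operator.
join_filters() {
    out=""
    for f in "$@"; do
        if [ -z "$out" ]; then
            out="$f"
        else
            out="$out || $f"
        fi
    done
    printf '%s\n' "$out"
}

# Combined filter for the loss and reset indicators listed above.
LOSS_FILTER=$(join_filters \
    tcp.analysis.retransmission \
    tcp.analysis.duplicate_ack \
    tcp.analysis.zero_window \
    "tcp.flags.reset == 1")

# Usage on the analysis workstation (not executed here):
#   tshark -r vlan_capture.pcap -Y "$LOSS_FILTER" | wc -l
```

The round-trip time filter is left out because it matches every acknowledged segment and is better examined on its own.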
While the packet capture is running, collect system performance data:
a. Run esxtop to monitor real-time network statistics:
# Press 'n' after starting esxtop to view network statistics
esxtop
b. Look specifically for these indicators:
c. Take screenshots of esxtop output at regular intervals or when users report issues.
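As an alternative to screenshots, esxtop can write its counters to a CSV file in batch mode (`-b`), sampling every `-d` seconds for `-n` iterations. The sketch below only computes the iteration count needed to cover a capture window. `esxtop_samples` is a hypothetical helper, and the output file name in the usage line is an assumption.

```shell
# Number of esxtop samples needed to cover a capture window, rounded up.
# $1 = window length in seconds, $2 = sampling interval in seconds
esxtop_samples() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Usage on the ESXi host (not executed here): 5 minutes at 5-second intervals
#   esxtop -b -d 5 -n "$(esxtop_samples 300 5)" > /vmfs/volumes/<datastore>/esxtop_stats.csv
```

Batch output covers all esxtop counters, so the resulting file can be large; a longer interval keeps it manageable for long capture windows.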
a. If you see packet loss at the virtual switchport but not at the physical uplink:
# Capture at both points at the same time
pktcap-uw --switchport <switchport_ID> -o /vmfs/volumes/<datastore>/switchport_capture.pcap &
pktcap-uw --uplink vmnic<#> -o /vmfs/volumes/<datastore>/uplink_capture.pcap &
b. If you see packet loss at both virtual and physical layers:
c. Important note on packet loss diagnosis:
d. If you see intermittent complete traffic disruption:
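Captures started in the background with `&`, as in step a, keep running until they are stopped explicitly. One way to bound them is to wrap each capture in a helper that ends it after a fixed time. This is a minimal sketch: `run_for` is a hypothetical helper, and it assumes pktcap-uw closes its output file cleanly when it receives the default kill signal.

```shell
# Run a command in the background for a fixed time, then stop it.
# $1 = seconds to run, remaining arguments = command to run
run_for() {
    secs=$1
    shift
    "$@" &
    pid=$!
    sleep "$secs"
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    return 0
}

# Usage on the ESXi host (not executed here): capture both points for 60 seconds
#   run_for 60 pktcap-uw --switchport <switchport_ID> -o /vmfs/volumes/<datastore>/switchport_capture.pcap &
#   run_for 60 pktcap-uw --uplink vmnic<#> -o /vmfs/volumes/<datastore>/uplink_capture.pcap &
#   wait
```

Because each `run_for` call is itself backgrounded, both captures start at nearly the same time and cover the same window, which makes the two files easier to compare.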
If your packet capture analysis reveals performance-related issues, refer to KB 389281: HCX Network Extension Performance Tuning for performance optimization techniques, including:
a. Enabling Generic Receive Offload (GRO) to improve inbound traffic performance
b. Configuring Application Path Resiliency (APR) to create multiple transport tunnels
c. Enabling TCP MSS Clamping to optimize transport performance
d. Optimizing CPU Thread Allocation for high-density environments
e. Configuring Network Extension Appliance Scale Out for high-traffic environments
If escalating to Broadcom Support is necessary:
a. Provide the following evidence:
b. Include: