You cannot typically predict when an ESX/ESXi host will become Not Responding in vCenter Server. This makes it difficult to collect a network trace, which has to run for a long period of time until the problem occurs.
It is possible to set up network traces that limit the size of each capture file and rotate through a fixed number of files, so that the space requirements are predictable (that is, the maximum file size multiplied by the number of files).
ESX hosts
To capture packets on the vmkernel interface vmk0, use the tcpdump-uw command with these options:
tcpdump-uw -i vmk0 -C 64 -W 10 -s 0 -w /scratch/log/packets.cap udp port 902
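- -i specifies the interface to capture on (vmk0 in this example)
- -W limits the number of files created to the specified number
- -C limits the size of each file to this number of MB
- -s 0 captures complete packets rather than truncating them
- -w specifies the file to write the trace to (here, packets.cap under /scratch/log)
- udp port 902 captures only UDP traffic on port 902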
Because the oldest file is overwritten once the file limit is reached, the trace can run for a long period of time while using no more than about 640 MB (10 files of 64 MB) on the host.
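The space calculation can be sketched as a small shell helper. This is an illustration only: the function name capture_budget_bytes is hypothetical, and it assumes that tcpdump-uw follows upstream tcpdump, where -C counts in units of 1,000,000 bytes. Files can overshoot the limit slightly, so treat the result as an approximate upper bound.

```shell
# Approximate worst-case disk usage of a rotating capture:
# per-file size limit multiplied by the number of files.
# Assumption: -C is counted in units of 1,000,000 bytes, as in upstream tcpdump.
capture_budget_bytes() {
    size_units=$1   # value passed to -C
    file_count=$2   # value passed to -W
    echo $(( size_units * 1000000 * file_count ))
}

# Values from the example command: -C 64 -W 10
capture_budget_bytes 64 10
```

Compare the result with the free space on the volume that holds /scratch/log before starting the trace.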
vCenter Server
Using Wireshark:
- Select Capture > Options > Use multiple files.
- Select Next file every 10MB.
- Select Ring buffer with: 10 files.
In this example, the capture writes to a 10MB file and moves on to a new file each time the limit is reached. After 10 files have been created, the oldest file is discarded, so no more than 10 files (about 100MB) are kept at any time.
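If the capture needs to run without the Wireshark window open, the same ring buffer can probably be set up with dumpcap, the command-line capture tool that ships with Wireshark. The sketch below only prints the command line so that it can be reviewed before it is run. The helper name ring_capture_cmd, the interface name eth0, and the output path are hypothetical. The filesize value is assumed to be in kB, so 10240 is roughly 10MB.

```shell
# Build (but do not run) a dumpcap command line for a ring-buffer capture.
# -b filesize:N  switch to the next file after about N kB
# -b files:N     keep at most N files in the ring
ring_capture_cmd() {
    iface=$1      # capture interface (hypothetical name below)
    size_kb=$2    # per-file size limit in kB
    files=$3      # number of files in the ring
    outfile=$4    # base name for the capture files
    echo "dumpcap -i $iface -b filesize:$size_kb -b files:$files -w $outfile"
}

# Roughly the Wireshark settings above: 10MB per file, 10 files
ring_capture_cmd eth0 10240 10 /tmp/vc-trace.pcapng
```

Running dumpcap -D lists the interfaces available on the system, which helps in choosing the value for -i.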
When a host alarm is triggered for a Not Responding state, the network trace files should be copied off both vCenter Server and the ESX/ESXi host before they rotate.
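One way to keep the files from being overwritten is to copy the whole ring into a timestamped directory as soon as the alarm is noticed. The following is a minimal sketch, not a supported procedure: the function name preserve_capture and the destination directory are hypothetical, and it assumes the capture was started with -w /scratch/log/packets.cap so that every file in the ring shares that prefix.

```shell
# Copy every file of a rotating capture into a new timestamped directory.
preserve_capture() {
    src_prefix=$1   # value given to -w, for example /scratch/log/packets.cap
    dest_root=$2    # existing directory with enough free space (hypothetical)
    dest="$dest_root/trace-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # The glob matches every numbered file that the rotation created
    cp -p "$src_prefix"* "$dest"/ || return 1
    echo "$dest"
}

# Example call (destination path is a placeholder):
# preserve_capture /scratch/log/packets.cap /vmfs/volumes/datastore1
```

The function prints the directory it created, which can then be copied off the host together with the trace files from vCenter Server.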