A virtual machine loses network connectivity to expected destinations while running on a particular ESXi host, and migrating the VM to a different host restores connectivity. The condition is intermittent, often appearing after the VM is relocated to a given host by DRS or by a manual migration, and is difficult to reproduce on demand.
Common observations reported with this issue:
This article is the starting point for isolating where the traffic is dropped and which known cause applies. The Resolution section walks through the dataplane checks that confirm whether the DFW is the drop point, then routes to the article that matches the confirmed cause. Because migrating the VM clears the runtime state on the originating host, the verification steps and any packet captures must be collected while the VM is still on the affected host, before the migration workaround is applied.
The DFW filter applied to the VM's vNIC is not enforcing the expected ruleset on the affected host, so traffic falls through to a default deny or reject action and is dropped. Several distinct root causes produce this same symptom, and they are distinguished by the dataplane state observed during the failure: whether the filter has rules at all, whether the VM's current IP is present in the relevant address set, and whether the rule was applied and then removed. The verification steps in the Resolution identify which root cause applies, and each is addressed in its own article.
Run the following while the VM is still on the affected host and the condition is active. Do not migrate the VM until verification and any captures are complete, because migration clears the failure state.
Open an SSH session to the affected ESXi host as root and identify the slot-2 filter:
summarize-dvfilter | grep -i <VM-Name> -A 16
Note the slot-2 filter name in the form nic-XXXXXXXX-eth0-vmware-sfw.2, the world ID, the decimal port ID, and the failurePolicy value. A failurePolicy of failClosed means traffic is dropped whenever the filter has no valid ruleset. For the command and example output, see NSX-T DFW rules are not applied to VMs in security only environments.
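When several vNICs or VMs match the grep, it can help to pull only the slot-2 filter names out of the output. The following is a minimal sketch, assuming the names follow the nic-XXXXXXXX-eth0-vmware-sfw.2 form described above and that the host's grep supports the -E and -o options; the function name extract_sfw2_filters is hypothetical.

```shell
#!/bin/sh
# Print every slot-2 DFW filter name found on stdin, one per line.
# Assumption: names follow the nic-<digits>-eth<N>-vmware-sfw.2 form;
# adjust the pattern if the output on your host differs.
extract_sfw2_filters() {
    grep -Eo 'nic-[0-9]+-eth[0-9]+-vmware-sfw\.2'
}
```

A possible invocation, with VM_NAME as a hypothetical shell variable holding the VM name: `summarize-dvfilter | grep -i "$VM_NAME" -A 16 | extract_sfw2_filters`. The world ID, port ID, and failurePolicy still need to be read from the full output.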
vsipioctl getrules -f nic-XXXXXXXX-eth0-vmware-sfw.2
vsipioctl getaddrsets -f nic-XXXXXXXX-eth0-vmware-sfw.2
vsipioctl getfwconfig -f nic-XXXXXXXX-eth0-vmware-sfw.2
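Because the failure state is lost on migration, it may be worth saving the output of all three queries to files rather than reading it only on screen. The sketch below is one way to do that; the function name save_dfw_state and the one-file-per-subcommand layout are assumptions for illustration, not part of any documented procedure.

```shell
#!/bin/sh
# Save the output of the three vsipioctl queries for one filter.
# $1 = slot-2 filter name
# $2 = output directory (hypothetical layout: one .txt file per subcommand)
save_dfw_state() {
    filter=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    for sub in getrules getaddrsets getfwconfig; do
        # Errors go into the same file, so a failed query is still recorded
        # and the remaining queries are still collected.
        vsipioctl "$sub" -f "$filter" > "$outdir/$sub.txt" 2>&1
    done
}
```

For example, `save_dfw_state nic-XXXXXXXX-eth0-vmware-sfw.2 "$CASE_DIR"` with the real filter name substituted, where CASE_DIR is a hypothetical variable pointing at a case-specific folder on a datastore.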
These commands and their use against a VM's slot-2 filter are documented in NSX-T DFW rules not getting applied to virtual machines in NSX-T Security Only prepared cluster. Interpret the output as follows, then go to the matching article in Step 4.
| Observation | What it indicates |
|---|---|
| getrules returns No rules or No root rule set | The filter has no ruleset on this host. Traffic is dropped by failClosed. Continue to the rule-realization causes in Step 4. |
| Rules are present, but the VM's current IP is missing from the expected address set | An address set / IP discovery binding problem. Continue to the IP discovery cause in Step 4. |
| Rules and address sets look correct, but a default reject or drop rule is taking the hits | Traffic is not matching an allow rule. Continue to the default-rule cause in Step 4. |
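The table can be read as a three-way decision applied in order. The sketch below encodes that order against getrules and getaddrsets output that has been saved to files. It is a rough first pass only: the function name and result labels are hypothetical, the message strings are taken from the table rather than from verified command output, and the address check only tests whether the IP appears anywhere in the getaddrsets output, not whether it is in the specific address set the rule references.

```shell
#!/bin/sh
# Classify the DFW state for one filter from saved command output.
# $1 = file containing getrules output
# $2 = file containing getaddrsets output
# $3 = the VM's current IP address
classify_dfw_state() {
    rules_file=$1
    addrsets_file=$2
    vm_ip=$3
    # Row 1: the filter has no ruleset at all (strings assumed from the table).
    if grep -Eqi 'No rules|No root rule set' "$rules_file"; then
        echo "no-ruleset"
    # Row 2: rules exist, but the IP is absent from every address set.
    # -F matches the IP literally; -w avoids 10.0.0.1 matching 10.0.0.10.
    elif ! grep -Fqw -- "$vm_ip" "$addrsets_file"; then
        echo "ip-missing-from-addrsets"
    # Row 3: nothing obviously wrong; check the default rule by hand.
    else
        echo "check-default-rule-hits"
    fi
}
```

A result of check-default-rule-hits does not confirm the third row by itself; it only means the first two rows were ruled out by this coarse check.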
To prove that the DFW, rather than a forwarding or overlay problem, is dropping the traffic, capture before and after the filter while a continuous ping runs from the affected VM to the unreachable destination. Write the output to a case-specific subfolder on a non-vSAN datastore, not to /tmp.
pktcap-uw --dvFilter nic-XXXXXXXX-eth0-vmware-sfw.2 --capture PreDVFilter --ng --count 200 -o /vmfs/volumes/<datastore>/<case>/vm_pre_dfw.pcapng
pktcap-uw --dvFilter nic-XXXXXXXX-eth0-vmware-sfw.2 --capture PostDVFilter --ng --count 200 -o /vmfs/volumes/<datastore>/<case>/vm_post_dfw.pcapng
Packets present before the filter but absent after it confirm the DFW is dropping the traffic. The --capture PreDVFilter and --capture PostDVFilter syntax is documented in How to Capture Packets at DVFilter Level, and the requirement to write captures to a datastore rather than /tmp is noted in Packet capture on ESXi using the pktcap-uw tool. For the full data-path capture procedure (vNIC, switchport, uplink, and kernel capture points), see Datapath capture to Diagnose Datapath Connectivity Issues in NSX Environments.
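To avoid typing errors in the filter name or output path while the condition is active, the two capture command lines can be generated from the values noted earlier. This is a minimal sketch that only prints the commands and does not run them; the function name print_capture_cmds is hypothetical, and the flags and file names are the ones used in the capture commands above.

```shell
#!/bin/sh
# Print the pre- and post-filter capture commands for one filter.
# $1 = slot-2 filter name
# $2 = case-specific directory on a non-vSAN datastore
print_capture_cmds() {
    filter=$1
    casedir=$2
    for point in PreDVFilter PostDVFilter; do
        # Pick the output file name that matches the capture point.
        case $point in
            PreDVFilter)  out=vm_pre_dfw.pcapng ;;
            PostDVFilter) out=vm_post_dfw.pcapng ;;
        esac
        echo "pktcap-uw --dvFilter $filter --capture $point --ng --count 200 -o $casedir/$out"
    done
}
```

Printing rather than executing leaves it to the operator to create the case directory first and to decide how to run the two captures, for example in separate SSH sessions so that both cover the same ping run.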
Use the dataplane evidence from Step 2 to select the matching article. Each contains its own verification detail, affected versions, and resolution.
Collect an NSX-generated support bundle while the condition is still present, including all NSX Manager nodes, the NSX Edge nodes, the affected ESXi host, and the destination ESXi host, plus a vCenter Server support bundle. A bundle collected during the active condition captures the configuration agent state and local control-plane state from the affected host, which is not present in a bundle collected after migration. After collection, migrate the VM to restore service and record the restoration time.
If none of the matching articles resolve the issue, contact Broadcom Support and provide the dataplane command output, the packet captures, the support bundle, and the start and restoration timestamps.
Cause-specific articles:
Capture and data-collection procedures: