Troubleshooting Distributed Firewall (DFW)

Article ID: 379438

Products

VMware NSX
VMware vDefend Firewall

Issue/Introduction

This article provides basic troubleshooting steps to determine whether the Distributed Firewall (DFW) is causing issues in the environment.

Resolution

Below are some basic steps to troubleshoot DFW:

1. Check if the rule has the appropriate source, destination, services, profiles, and action. Confirm that each field matches the traffic in question and that the action (allow or block) is the one intended.

2. Check if the rule has an appropriate Applied To field. Check that the rule is applied to the correct targets, such as specific VMs, networks, or security groups.

3. Check if the DFW rules are in the correct order. Firewall rules are processed top-down and the first matching rule is applied, so a broader rule placed above a more specific one can take effect instead of it. Make sure the rules are ordered so that the intended rule is the first to match.

4. Ensure that the intended rule is actually enabled.

5. Use Traceflow to diagnose packet flow. Perform a Traceflow between the source and destination to observe the path of the traffic, pinpoint where packets are being dropped, and validate the effect of firewall rules. Refer to https://techdocs.broadcom.com/us/en/vmware-cis/nsx/vmware-nsx/4-2/administration-guide/network-monitoring/advanced-monitoring-tools/traceflow.html
 
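6. Review the dfwpktlogs on the ESXi host to see which rule matched the traffic and which action was taken. Entries are written only for rules that have logging enabled.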
 
7. Check if DFW rules for the VM are realized on the ESXi host using the summarize-dvfilter and vsipioctl getrules commands. Refer to https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/vdefend/vdefend-firewall/4-2/vdefend-distributed-firewall/monitoring-and-troubleshooting-dfw/troubleshooting-distributed-firewall-on-esx-hosts.html
 
8. Check if the DFW modules are correctly installed and loaded on the hosts. Confirm that the DFW modules are properly installed on the ESXi hosts and that they are active and running.



9. Check the connectivity to NSX management and control planes. Check that there is proper communication between the firewall components and the NSX management and control planes, ensuring that configurations and updates are synchronized. 
 
 
10. Check NSX agents status. Several NSX agents run on the ESXi hosts. Check their status to make sure they are operational and correctly reporting to the management plane. This check can also be done via API calls; refer to https://techdocs.broadcom.com/us/en/vmware-cis/nsx/vmware-nsx/4-2/installation-guide/troubleshooting-installation-issues/troubleshooting-host-transport-nodes/transport-node-is-down-as-agent-service-is-down.html
 
 
 
11. Check if adding the VM or a group of VMs to the Firewall Exclusion List remediates the issue. If it does, DFW rule processing for those VMs is likely involved.
 
12. Capture traffic at the VM vNIC. Monitor the virtual network interface card (vNIC) of the VM to capture and analyze the traffic, which can help identify whether traffic is being incorrectly blocked or allowed. Refer to https://knowledge.broadcom.com/external/article?articleNumber=425834
 
13. Check for scale-related issues. Investigate whether there are any scale-related issues, such as too many rules or connections, which could affect the performance and functionality of the DFW. Refer to:
https://knowledge.broadcom.com/external/article/371437
https://knowledge.broadcom.com/external/article/372896
 
 
15. If just a few packets briefly hit the default rule for no apparent reason, check whether the VM was up and ready at the time of the incident. For example, if the VM was being restarted around that time, it is likely that the full DFW configuration had not yet been realized from the NSX Manager and only the default rule was in place. Under these circumstances, you might see any of the following log messages on the ESXi host under /var/run/log:
hostd.log
Registering virtual machine 
 
vmkernel.log
connected <VM name>.eth.<index> to eth.<index> vDS, portID <ID>
Whitelist empty, IP/ARP traffic may get black holed
 
 
Example:
connected VM123.eth0 eth0 to vDS, portID 0x6000043
Filter 0x6000043 Whitelist empty, IP/ARP traffic may get black holed

Such messages rule out the DFW as the root cause and require further investigation into the state of the VM itself.

16. For L7 FQDN Filtering issues, review the configuration guide to ensure the rules are appropriately configured.
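
The effect of rule order described in step 3 (and of a disabled rule, step 4) can be illustrated with a minimal sketch of top-down, first-match evaluation. This is a simplified model and not NSX code: the Rule class, the rule IDs, and the VM and service names are hypothetical, and real DFW evaluation also takes policy categories and the Applied To scope into account.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    rule_id: int          # hypothetical rule ID
    source: str           # "any" or an exact name
    destination: str      # "any" or an exact name
    service: str          # "any" or an exact service label
    action: str           # "ALLOW" or "DROP"
    enabled: bool = True


def _field_matches(rule_value: str, packet_value: str) -> bool:
    """A rule field matches if it is 'any' or equals the packet value."""
    return rule_value == "any" or rule_value == packet_value


def first_match(rules: List[Rule], source: str, destination: str,
                service: str) -> Optional[Rule]:
    """Walk the rules top-down and return the first enabled rule that matches."""
    for rule in rules:
        if not rule.enabled:
            continue  # disabled rules are skipped entirely
        if (_field_matches(rule.source, source)
                and _field_matches(rule.destination, destination)
                and _field_matches(rule.service, service)):
            return rule
    return None


# A broad DROP placed above a specific ALLOW shadows the ALLOW.
shadowed = [
    Rule(1001, "any", "db-vm", "any", "DROP"),
    Rule(1002, "web-vm", "db-vm", "tcp/3306", "ALLOW"),
    Rule(2, "any", "any", "any", "ALLOW"),
]
# Moving the specific rule to the top lets it match first.
reordered = [shadowed[1], shadowed[0], shadowed[2]]

for name, rules in (("shadowed", shadowed), ("reordered", reordered)):
    hit = first_match(rules, "web-vm", "db-vm", "tcp/3306")
    print(name, hit.rule_id, hit.action)
```

In the first ordering the web-to-database traffic is dropped by rule 1001 even though rule 1002 allows it; in the second ordering rule 1002 matches first. Comparing the rule ID reported by Traceflow or the dfwpktlogs with the rule you expected to match is a quick way to spot this kind of shadowing.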
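
For step 7, the two commands are typically used together: summarize-dvfilter lists the filters attached to each vNIC, and vsipioctl getrules -f <filter name> shows the rules realized on one filter. The sketch below is a hedged helper for that workflow. It assumes the summarize-dvfilter output has been saved as text and that DFW filters appear on lines of the form "name: ...vmware-sfw..."; the sample text is illustrative only, and the function names are hypothetical. Verify the layout against the output on your own host.

```python
import re
from typing import List

# Assumption: DFW filter names contain "vmware-sfw" and appear on "name:" lines.
FILTER_NAME = re.compile(r"^\s*name:\s*(\S*vmware-sfw\S*)\s*$")


def dfw_filter_names(summarize_output: str) -> List[str]:
    """Extract DFW filter names from saved summarize-dvfilter output."""
    names = []
    for line in summarize_output.splitlines():
        match = FILTER_NAME.match(line)
        if match:
            names.append(match.group(1))
    return names


def getrules_command(filter_name: str) -> List[str]:
    """Build the vsipioctl command that lists rules realized on one filter."""
    return ["vsipioctl", "getrules", "-f", filter_name]


# Illustrative sample only; the values are made up.
sample_output = """\
 port 67108881 VM123.eth0
  vNic slot 2
   name: nic-1000342-eth0-vmware-sfw.2
   agentName: vmware-sfw
"""

for name in dfw_filter_names(sample_output):
    print(" ".join(getrules_command(name)))
```

If the rule list returned for the filter is missing the rule you expect, the rule has not been realized on that host, which points toward the Applied To field (step 2) or management and control plane connectivity (step 9).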
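
The log check in step 15 can be sketched as a small scan over vmkernel.log lines. This is a minimal example, assuming the messages appear as quoted in step 15; the function name and the labels it returns are hypothetical, and the sample lines are the ones from the example in that step.

```python
import re
from typing import Iterable, List, Tuple

# Message text as quoted in step 15.
WHITELIST_EMPTY = "Whitelist empty, IP/ARP traffic may get black holed"
# Matches lines such as "connected VM123.eth0 eth0 to vDS, portID 0x6000043".
VNIC_CONNECTED = re.compile(r"connected \S+ .*portID \S+")


def find_startup_indicators(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, label, text) for lines suggesting a VM was starting up."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if WHITELIST_EMPTY in text:
            hits.append((number, "whitelist_empty", text))
        elif VNIC_CONNECTED.search(text):
            hits.append((number, "vnic_connected", text))
    return hits


sample_lines = [
    "connected VM123.eth0 eth0 to vDS, portID 0x6000043",
    "some unrelated vmkernel message",
    "Filter 0x6000043 Whitelist empty, IP/ARP traffic may get black holed",
]

for number, label, text in find_startup_indicators(sample_lines):
    print(number, label, text)

# On a host, the same function could be fed the real file, for example:
#   with open("/var/run/log/vmkernel.log") as log:
#       hits = find_startup_indicators(log)
```

Hits whose timestamps fall close to the time the packets reached the default rule support the conclusion that the VM was still starting up rather than that a DFW rule was misconfigured.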