Userworld "vsfwd" generates core dumps due to an array overflow in NSX-V 6.4.6 (no business impact)

Article ID: 338628

Products

VMware NSX

Issue/Introduction

Symptoms:
  • Core dumps are generated under /var/core on the affected ESXi host:

var/core$ ls
total 10M
drwxrwxr-x 2 svc.datamover support   33 May 11 22:03 .
drwxrwxr-x 6 svc.datamover support   85 May 11 22:03 ..
-rw-rw-r-- 2 svc.datamover support 8.7M May 11 22:03 vsfwd-zdump.000
  • You may start receiving alerts from vRealize Log Insight three or four times a week, rotating across several hosts in the cluster.

Alert example: 

[UserWorldCorrelator] 9064747351462us: [esx.problem.application.core.dumped] An application (/usr/lib/vmware/vsfwd/bin/vsfwd) running on ESXi host has crashed (4 time(s) so far). A core file may have been created at /var/core/vsfwd-zdump.000.


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
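If you want to track how often the crash recurs per host, the alert text can be parsed with a short script. The following is a minimal sketch: the pattern is modeled only on the single alert example above, the function name parse_core_dump_alert is hypothetical, and other ESXi builds may word the message differently.

```python
import re

# Pattern modeled on the one alert example in this article (assumption:
# other ESXi builds may format the message differently).
CORE_DUMP_ALERT = re.compile(
    r"\[esx\.problem\.application\.core\.dumped\]\s+"
    r"An application \((?P<binary>[^)]+)\) running on ESXi host has crashed "
    r"\((?P<count>\d+) time\(s\) so far\)\."
    r"(?: A core file may have been created at (?P<core_file>/\S+))?"
)


def parse_core_dump_alert(line):
    """Return binary, crash count, and core file path, or None if no match."""
    match = CORE_DUMP_ALERT.search(line)
    if match is None:
        return None
    core_file = match.group("core_file")
    if core_file is not None:
        # The message ends the sentence with a period; drop it from the path.
        core_file = core_file.rstrip(".")
    return {
        "binary": match.group("binary"),
        "count": int(match.group("count")),
        "core_file": core_file,
    }
```

Feeding each alert line through this function gives the crashed binary, the running crash count, and the core file path, which can then be grouped by host in whatever reporting tool you use.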

Environment

VMware NSX Data Center for vSphere 6.4.x

Cause

vsfwd crashes because of an out-of-bounds array count in rule hit processing.

The crash occurs when rule hit counts are aggregated while the rule counts being combined do not match.
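The following is a purely illustrative sketch of this class of problem. It is not vsfwd source code, and the function name aggregate_hits and the list-based data layout are hypothetical. It shows per-rule hit counters being summed by index, with a guard that rejects a batch whose rule count differs from the running totals. Without such a guard, a native process that indexes past the end of the shorter array can crash.

```python
def aggregate_hits(total_hits, new_hits):
    """Add new_hits into total_hits per rule index.

    Both arguments are lists of per-rule hit counters (hypothetical layout).
    """
    if len(total_hits) != len(new_hits):
        # Mismatched rule counts: indexing by the larger count would run
        # past the end of the shorter list, so reject the batch instead.
        raise ValueError(
            f"rule count mismatch: {len(total_hits)} vs {len(new_hits)}"
        )
    return [total + new for total, new in zip(total_hits, new_hits)]
```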

Resolution


This is a known issue that will be fixed in NSX 6.4.7.

Workaround:
Disable NSX Distributed Firewall (DFW) rule statistics collection by following the steps below:

1. Retrieve the current DFW global configuration:
GET /api/4.0/firewall/config/globalconfiguration

Example of expected output:
<globalConfiguration>
  <layer3RuleOptimize>false</layer3RuleOptimize>
  <layer2RuleOptimize>true</layer2RuleOptimize>
  <tcpStrictOption>false</tcpStrictOption>
  <ruleStatsDisabled>false</ruleStatsDisabled>
</globalConfiguration>

2. Push the DFW global configuration back with "<ruleStatsDisabled>true</ruleStatsDisabled>":

PUT /api/4.0/firewall/config/globalconfiguration

Example of the request body:
<globalConfiguration>
  <layer3RuleOptimize>false</layer3RuleOptimize>
  <layer2RuleOptimize>true</layer2RuleOptimize>
  <tcpStrictOption>false</tcpStrictOption>
  <ruleStatsDisabled>true</ruleStatsDisabled>
</globalConfiguration>
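The two steps above can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the NSX Manager accepts HTTP Basic authentication and an application/xml request body. The function names, the manager hostname, and the credentials are placeholders, not values from this article. Only the ruleStatsDisabled element is changed; every other setting returned by the GET is sent back unmodified.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

API_PATH = "/api/4.0/firewall/config/globalconfiguration"


def disable_rule_stats(config_xml):
    """Return the globalConfiguration XML with ruleStatsDisabled set to true."""
    root = ET.fromstring(config_xml)
    if root.tag != "globalConfiguration":
        raise ValueError(f"unexpected root element: {root.tag}")
    element = root.find("ruleStatsDisabled")
    if element is None:
        # Assumption: if the element is absent, adding it is acceptable.
        element = ET.SubElement(root, "ruleStatsDisabled")
    element.text = "true"
    return ET.tostring(root, encoding="unicode")


def build_request(manager, user, password, method="GET", body=None):
    """Build a Basic-authenticated request for the DFW global configuration."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/xml"
        data = body.encode("utf-8")
    return urllib.request.Request(
        f"https://{manager}{API_PATH}", data=data, headers=headers, method=method
    )


def apply_workaround(manager, user, password, context=None):
    """GET the current configuration, set ruleStatsDisabled, and PUT it back.

    context is an optional ssl.SSLContext, for example one that trusts the
    NSX Manager certificate.
    """
    get = build_request(manager, user, password)
    with urllib.request.urlopen(get, context=context) as resp:
        current = resp.read().decode("utf-8")
    updated = disable_rule_stats(current)
    put = build_request(manager, user, password, method="PUT", body=updated)
    with urllib.request.urlopen(put, context=context) as resp:
        return resp.status
```

A call such as apply_workaround("nsxmgr.example.com", "admin", "password") would run both steps against a hypothetical manager. Repeating the GET afterwards is a simple way to confirm that ruleStatsDisabled now reads true.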

Additional Information

Impact/Risks:
None. We validated that the vsfwd service is able to start again and that no service interruption is experienced.