NSX VDPI service crashing on ESXi host and logs flooded with "L7DPIDNS: duplicate transaction".

Article ID: 317691

Updated On:

Products

VMware vDefend Firewall

Issue/Introduction

  • IDS/IPS or L7 firewall rules are in use.
  • FQDN context profiles are being ignored, resulting in firewall rules not working as expected.
  • The NSX UI raises an alarm for an application crash on an NSX node. This can be confirmed by checking /var/log/syslog in the NSX Manager support bundle:
    ######.####.local NSX 12145 MONITORING [nsx@6876 alarmId="4fa34ef3-####-####-####-ae8b3d02b0be" alarmState="OPEN" comp="nsx-manager" entId="aca279ee-####-####-####-##########13" errorCode="MP701099" eventFeatureName="infrastructure_service" eventSev="CRITICAL" eventState="On" eventType="application_crashed" level="FATAL" nodeId="aca279ee-####-####-####-##########13" subcomp="monitoring"] Application on NSX node ######.####.local has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.
  • In /var/run/log/vobd.log on the ESXi host, the following error shows VDPI crashing:
    [UserWorldCorrelator] 3417085232754us: [esx.problem.application.core.dumped] An application (/usr/lib/vmware/nsx-vdpi/bin/vdpi) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/vdpi-zdump.000
  • In /var/core on the ESXi host, vdpi-zdump core files are present:
    -rw-rw-r-- 1 ######## support 65M Jul 18 09:22 1.vdpi-zdump.000
    -rw-rw-r-- 1 ######## support 65M Jul 18 09:22 vdpi-zdump.000
  • In /var/run/log/nsx-syslog on the ESXi host, or in the ESXi host support bundle mentioned in the NSX alarm, the log shows repeating "L7DPIDNS: duplicate transaction" errors:
    vdpi[95757027]: NSX 95757027 - [nsx@6876 comp="nsx-esx" subcomp="nsx-vdpi" tid="95757057" level="ERROR" invalid="true"] L7DPIDNS: duplicate transaction : 14371 : 855515354825


NOTE: The preceding log excerpts are only examples. Dates, times, and environment-specific values will vary depending on your environment.
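To gauge how heavily a log is flooded, the "L7DPIDNS: duplicate transaction" message can be counted with a short script. The following is a minimal sketch, not a supported tool. The function name and the regular expression are illustrative, and the meaning of the two numeric fields at the end of the message is not stated in this article, so the script only groups by the first field without interpreting it.

```python
import re
from collections import Counter

# Matches the message text shown in the nsx-syslog excerpt above.
# The two trailing numeric fields are captured as-is; their meaning
# is not documented in this article.
DUP_TXN = re.compile(r"L7DPIDNS: duplicate transaction\s*:\s*(\d+)\s*:\s*(\d+)")


def count_duplicate_transactions(lines):
    """Return (total matches, Counter keyed by the first numeric field)."""
    per_first_field = Counter()
    total = 0
    for line in lines:
        match = DUP_TXN.search(line)
        if match:
            total += 1
            per_first_field[match.group(1)] += 1
    return total, per_first_field


# Sample input taken from the log excerpt above, plus one unrelated line.
sample = [
    'vdpi[95757027]: NSX 95757027 - [nsx@6876 comp="nsx-esx" '
    'subcomp="nsx-vdpi" tid="95757057" level="ERROR" invalid="true"] '
    'L7DPIDNS: duplicate transaction : 14371 : 855515354825',
    'an unrelated log line',
]
total, per_first_field = count_duplicate_transactions(sample)
print(total, dict(per_first_field))
```

To run it against a copy of the log extracted from a support bundle, pass an open file object instead of the sample list, for example `count_duplicate_transactions(open(path))`, where `path` is wherever the log was saved.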

Environment

VMware NSX 4.x

Cause

This issue is due to a memory leak in DNS transaction handling that can occur when clients send multiple DNS requests over a single connection using the same port. This results in long-lived flows that do not time out.


Resolution

This issue is resolved in VMware NSX 4.1.1, available at Broadcom downloads.
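When checking several environments, the installed NSX version can be compared against the fixed release programmatically. The sketch below assumes version strings are dotted numeric values (for example "4.1.0.2") and that releases numbered at or above 4.1.1 retain the fix. Both are assumptions, and the function names are illustrative.

```python
# Release in which this article states the issue is resolved.
FIXED_IN = (4, 1, 1)


def parse_version(text):
    """Convert a dotted numeric version string into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """Return True if the version is at or above FIXED_IN.

    Assumption: later releases keep the fix. Shorter tuples are padded
    with zeros so that "4.1.1" and "4.1.1.0" compare as equal.
    """
    version = parse_version(version_text)
    width = max(len(version), len(FIXED_IN))
    version = version + (0,) * (width - len(version))
    fixed = FIXED_IN + (0,) * (width - len(FIXED_IN))
    return version >= fixed


for candidate in ("4.0.1", "4.1.0.2", "4.1.1", "4.2.0"):
    print(candidate, has_fix(candidate))
```

Build numbers or suffixes that are not purely numeric would need to be stripped before calling `parse_version`.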

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.