The DNS Forwarder service fails to start when the cache size is set to 0.

Article ID: 422826

Products

VMware NSX

Issue/Introduction

  • The DNS Forwarder service remains in an ERROR state, and there is one DNS Forwarder DOWN alarm raised on the NSX UI.
  • The DNS Forwarder datapath becomes non-functional, and subsequent updates do not recover the service.
  • The DNS Forwarder backend datapath container shows "Exited" status when you run `docker ps -a | grep service_dns` on the Edge node.

    Impact to customer:
    1. DNS queries sent to the DNS Forwarder service fail.
    2. The DNS Forwarder service status remains in the ERROR state, and the alarm stays active.

Edge node log var/log/dns/prestart.log:

2025-04-28 10:46:28,353 14 dns.dns_utils ERROR Failed to run cmd /opt/vmware/nsx-edge/bin/dns/dnsconf_gen.py ####c20d-####-4d5f-####-767a854b#### with error Traceback (most recent call last):
2025-04-28 10:46:28,355 14 dns.dns_fdr_prestart ERROR Failed to generate dnsmasq/iptable config file with cmd: /opt/vmware/nsx-edge/bin/dns/dnsconf_gen.py ####c20d-####-4d5f-####-767a854b####
2025-09-22 03:04:26,528 13 dns.dns_utils ERROR Failed to run cmd /opt/vmware/nsx-edge/bin/dns/dnsconf_gen.py ####c20d-####-4d5f-####-767a854b#### with error Traceback (most recent call last):
2025-09-22 03:06:23,224 13 dns.dns_utils ERROR Failed to run cmd /opt/vmware/nsx-edge/bin/dns/dnsconf_gen.py ####c20d-####-4d5f-####-767a854b#### with error Traceback (most recent call last):
KeyError: 'cache_size'
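
To check a collected Edge log for this signature, something like the following Python sketch may help. It only looks for the two markers visible in the excerpt (a config-generation failure and `KeyError: 'cache_size'`); the function name is hypothetical and not part of any NSX tooling.

```python
import re

# Markers taken from the prestart.log excerpt in this article.
KEY_ERROR = "KeyError: 'cache_size'"
GEN_FAILURE = re.compile(
    r"ERROR Failed to (run cmd|generate dnsmasq/iptable config file)"
)


def has_cache_size_failure(log_text: str) -> bool:
    """Return True if the log contains both a config-generation
    failure and the cache_size KeyError."""
    lines = log_text.splitlines()
    saw_failure = any(GEN_FAILURE.search(line) for line in lines)
    saw_key_error = any(KEY_ERROR in line for line in lines)
    return saw_failure and saw_key_error
```

A match only suggests this issue; confirm it against the cache size recorded in the NSX Manager log before acting.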

NSX Manager log var/log/proton/nsxapi.log:

./var/log/proton/nsxapi.4.log:2025-09-22T09:13:27.084Z INFO providerTaskExecutor-1-7 DNSForwarderProviderNsxT 77770 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] set tenancy context in updateDnsForwarder() with policyDnsForwarder DnsForwarder [listenerIp=##.##.##.##, logLevel=INFO, cacheSize=0, defaultForwarderZonePath=/infra/dns-forwarder-zones/####c20d-####-4d5f-####-767a854b####, conditionalForwarderZonePath=[], enabled=true, getForwardRelationShips()=[RelationshipInfo{targetPath=/infra/dns-forwarder-zones/####c20d-####-4d5f-####-767a854b####, relationshipType=DEFAULT_DNS_FORWARDER_ZONE_RELATIONSHIP}]][policyPath=/infra/tier-0s/Provider-LR/dns-forwarder, markedForDelete=false] for dnsForwarderModel DnsForwarder [logicalRouter=LogicalRouter/####1abc-####-465a-####-cfd66e85####, srClusterId=null, cacheSize=0, listenerIp=##.##.##.##, defaultZone=DnsForwarderZone [sourceIp=null, domainNames=[], upstreamServers=[##.##.##.##]], conditionalZones=null, logLevel=INFO, enabled=true, msgTimestamp=0, serviceGroupId=null, isStandbySite=false] 
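
The `cacheSize=0` value in the entry above is the detail to look for. As a rough sketch, assuming the log format shown here, the following Python helper (hypothetical name) returns the `updateDnsForwarder()` entries that carry a zero cache size.

```python
import re

# Matches the cacheSize=<number> fields seen in the nsxapi.log entry.
CACHE_SIZE = re.compile(r"\bcacheSize=(\d+)")


def zero_cache_size_updates(log_text: str) -> list[str]:
    """Return updateDnsForwarder() log lines that contain cacheSize=0."""
    hits = []
    for line in log_text.splitlines():
        if "updateDnsForwarder()" not in line:
            continue
        sizes = [int(value) for value in CACHE_SIZE.findall(line)]
        if any(size == 0 for size in sizes):
            hits.append(line)
    return hits
```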

Environment

VMware NSX

Cause

  • The Policy service received a DNS Forwarder CREATE API request that specified a cache size of 0, an invalid value that the API validation logic should have rejected.
  • The DNS Forwarder backend did not expect to receive an invalid cache size from the Policy service, and the datapath crashed as a result.

Resolution

  • Perform the following action during a maintenance window.
  • Delete the existing DNS Forwarder service and recreate it with a non-zero cache size.
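
When recreating the service through automation, it may be worth validating the cache size before any request is sent. The Python sketch below builds a request body and rejects a zero or negative cache size. The field names (`listener_ip`, `default_forwarder_zone_path`, `cache_size`, `enabled`) and the default of 1024 are assumptions based on the Policy DNS Forwarder schema as commonly documented; verify them against the NSX Policy API reference for your version. The function name is hypothetical.

```python
def build_dns_forwarder_payload(
    listener_ip: str,
    default_zone_path: str,
    cache_size: int = 1024,  # assumed default; confirm in the API reference
) -> dict:
    """Build a DNS Forwarder request body, refusing an invalid cache size."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(cache_size, bool) or not isinstance(cache_size, int):
        raise ValueError("cache_size must be an integer")
    if cache_size <= 0:
        raise ValueError("cache_size must be greater than 0")
    return {
        "listener_ip": listener_ip,
        "default_forwarder_zone_path": default_zone_path,
        "cache_size": cache_size,
        "enabled": True,
    }
```

For example, `build_dns_forwarder_payload("192.0.2.53", "/infra/dns-forwarder-zones/example-zone")` returns a body with `cache_size` set to 1024, while passing `cache_size=0` raises `ValueError`. The listener IP and zone path here are placeholders.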

 

Additional Information

This issue is fixed in version 9.1.