Transport nodes fail to resolve the NSX Manager FQDN and record DNS lookup failures in their logs. The error messages indicate trouble resolving the NSX Manager FQDN via DNS, and the following alarms are raised for DNS lookup and reverse DNS lookup:
Alarm Example #1: DNS lookup failed for Manager node <uuid> with FQDN <fqdn>.
Alarm Example #2: Reverse DNS lookup failed for Manager node <uuid> with IP address <ip-address>.
And/or
The nslookup <fqdn> command returns correct output but dig fails: the output of dig <fqdn> and dig -x <ip> may not contain an "Answer" section.
DNS resolution failures typically trace back to the DNS server configuration or the underlying network connectivity. Use the following steps to narrow down the cause:
1. Run the /usr/bin/dig <fqdn> command and verify that the "Answer" section of the output is correct.
Example of correct output:
root@edge01:~# /usr/bin/dig nsx-mngr-01.#.#
; <<>> DiG 9.18.28-0ubuntu0.22.04.1-Ubuntu <<>> nsx-mngr-01.#.#
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15328
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;nsx-mngr-01.#.#. IN A
;; ANSWER SECTION:
nsx-mngr-01.#.#. 3600 IN A 192.#.#.#
;; Query time: 0 msec
;; SERVER: #.#.#.10#53(192.#.#.#) (UDP)
;; WHEN: Thu Feb 20 14:44:32 UTC 2025
;; MSG SIZE rcvd: 67
2. Run the /usr/bin/dig -x <IP-Address> command and verify that the "Answer" section of the output is correct.
Example of correct output:
root@edge01:~# /usr/bin/dig -x 192.#.#.#
; <<>> DiG 9.18.28-0ubuntu0.22.04.1-Ubuntu <<>> -x 192.#.#.#
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30823
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;10.#.#.#.in-addr.arpa. IN PTR
;; ANSWER SECTION:
10.#.#.#.#. 3600 IN PTR controlcenter.#.#.
;; Query time: 4 msec
;; SERVER: #.#.#.10#53(#.#.#.#) (UDP)
;; WHEN: Thu Feb 20 15:02:32 UTC 2025
;; MSG SIZE rcvd: 94
3. If the "Answer" section in either of the first two steps is missing or shows incorrect values, the problem lies with the DNS server configuration or the underlying connectivity.
4. Trace the queries to and from the DNS server to confirm that the server is receiving and answering them.
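The check in steps 1 and 2 can be scripted. The following is a minimal sketch, not an NSX tool: check_dig_answer is a hypothetical helper that reads dig output on standard input and reports whether the header shows status NOERROR with an ANSWER count of at least 1. It assumes the header layout shown in the example outputs above.

```shell
# Hypothetical helper (not part of NSX): reads dig output on stdin and
# reports whether it carries a usable answer.
check_dig_answer() {
    local output status answers
    output=$(cat)
    # Extract the status from the ->>HEADER<<- line, e.g. "status: NOERROR,"
    status=$(printf '%s\n' "$output" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1)
    # Extract the answer count from the flags line, e.g. "ANSWER: 1,"
    answers=$(printf '%s\n' "$output" | sed -n 's/.*ANSWER: \([0-9]*\),.*/\1/p' | head -n 1)
    if [ "$status" = "NOERROR" ] && [ "${answers:-0}" -ge 1 ]; then
        echo "OK"
        return 0
    fi
    echo "FAIL status=${status:-none} answers=${answers:-0}"
    return 1
}

# Usage (placeholders as in the steps above):
#   /usr/bin/dig <fqdn> | check_dig_answer
#   /usr/bin/dig -x <IP-Address> | check_dig_answer
```

A result of OK only confirms that an answer was returned. The record in the "Answer" section still needs to be compared against the expected address or hostname.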
When the publish_fqdns flag is set to True, the /usr/bin/dig command runs periodically to resolve the FQDN. If the command fails, it writes errors to the logs, which in turn raise the DNS failure alarm in the NSX Manager UI.
ESXi hosts use nslookup, whereas the dig command was introduced in NSX-T Data Center 3.2.3. Depending on the code version, edge nodes can use the /usr/bin/getent hosts or nslookup commands to resolve the FQDN if dig is not present. Note that the hosts file is not used in NSX-T Data Center 3.2.3 or higher.
The NSX nodes try to resolve FQDNs using the following commands, in this order:
1. The /usr/bin/dig command.
2. The nslookup command.
3. The /usr/bin/getent hosts command.
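The lookup order above can be imitated from a shell for troubleshooting. This is an illustrative sketch only, not the NSX implementation: resolve_with_fallback is a hypothetical function, the +short flag for dig is a choice made here for brevity, and moving on to the next command both when a tool is missing and when it fails is an assumption.

```shell
# Hypothetical sketch (not NSX code): try each resolver command in order
# and stop at the first one that returns a non-empty result.
# Usage: resolve_with_fallback <fqdn> [command ...]
resolve_with_fallback() {
    local fqdn cmd out
    fqdn=$1
    shift
    # With no commands given, default to the order listed above
    if [ "$#" -eq 0 ]; then
        set -- "/usr/bin/dig +short" "nslookup" "/usr/bin/getent hosts"
    fi
    for cmd in "$@"; do
        # Skip a tool that is not installed on this node
        command -v "${cmd%% *}" >/dev/null 2>&1 || continue
        # $cmd is left unquoted on purpose so its flags split into arguments
        if out=$($cmd "$fqdn" 2>/dev/null) && [ -n "$out" ]; then
            echo "resolved via: $cmd"
            return 0
        fi
    done
    echo "unresolved: $fqdn"
    return 1
}

# Usage (placeholder as in the steps above):
#   resolve_with_fallback <fqdn>
```

Passing the commands as arguments makes it possible to test a single resolver in isolation, for example resolve_with_fallback <fqdn> "nslookup".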