Symptoms:
- BFD tunnels on the Edge are down.
- Edge TEP is not reachable.
- Oversized AppHA packets can be confirmed by checking syslog for entries like the following, where the trailing value is the packet size (1976b in this example). Any size above 1472b indicates the problem.
Command, run from /var/log on the Edge: grep "AppHA-tx-Bridge" syslog
2023-02-13T23:15:12.799Z 10-172-23-51 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="db-config" level="INFO"] AppHA-tx-Bridge(00085,00000): ANNO.REQ.0000000000:0000000000,peer=0c0f0304-####-####-####-##########4b,1976b
2023-02-13T23:15:12.803Z 10-172-23-51 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="db-config" level="INFO"] AppHA-tx-Bridge(00086,00000): ANNO.REQ.0000000000:0000000000,peer=0c0f0304-####-####-####-##########4b,1976b
2023-02-13T23:15:13.121Z 10-172-23-51 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="db-config" level="INFO"] AppHA-tx-Bridge(00087,00000): ANNO.REQ.0000000000:0000000000,peer=0c0f0304-####-####-####-##########4b,1976b
2023-02-13T23:15:13.160Z 10-172-23-51 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="db-config" level="INFO"] AppHA-tx-Bridge(00088,00000): ANNO.REQ.0000000000:0000000000,peer=0c0f0304-####-####-####-##########4b,1976b
- As an external symptom, Edge datapath CLI commands such as "get logical-routers" do not return a response.
2022-10-23T22:45:42.329Z <edge FQDN> NSX 6534 - [nsx@6876 comp="nsx-edge" subcomp="cli" username="admin" level="INFO"] CMD: get logical-routers
The following error is logged in /var/log/syslog on the Edge after the command is run:
2022-10-23T22:45:42.444603+00:00 <edge FQDN> NSX 6536 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="edge-appctl" s2comp="unixctl" level="WARN"] failed to connect to /var/run/vmware/edge/dpd.ctl
- Blocked dp-ipc threads show up in /var/log/syslog, with the reported blocked time steadily increasing. For example, in the log lines below, thread urcu2 reports waiting on dp-ipc31 for 4000 ms, then 8000 ms, then 16000 ms.
2022-10-23T21:51:05.725Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 4000 ms waiting for dp-ipc31 to quiesce
2022-10-23T21:51:09.724Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 8000 ms waiting for dp-ipc31 to quiesce
2022-10-23T21:51:17.725Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 16000 ms waiting for dp-ipc31 to quiesce
2022-10-23T21:51:24.979Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="dp-si-purge5" level="WARN"] blocked 1000 ms waiting for dp-ipc31 to quiesce
2022-10-23T21:51:25.978Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="dp-si-purge5" level="WARN"] blocked 2000 ms waiting for dp-ipc31 to quiesce
2022-10-23T21:51:27.978Z <Edge FQDN> NSX 4468 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="dp-si-purge5" level="WARN"] blocked 4000 ms waiting for dp-ipc31 to quiesce
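
The two log checks above (AppHA packet sizes above 1472b, and growing "blocked ... ms waiting for ... to quiesce" warnings) can be sketched as a small script. This is only an illustrative sketch, not a supported tool: the function names and regular expressions are assumptions derived from the sample log lines shown in this article, and the log format may differ between releases.

```python
import re

# Threshold from the symptom description: any size above 1472b is a problem.
APPHA_MAX_SIZE = 1472

# Assumed pattern: AppHA-tx-Bridge lines end with ",<size>b" (e.g. ",1976b").
APPHA_RE = re.compile(r"AppHA-tx-Bridge\(.*,(\d+)b\s*$")

# Assumed pattern for ovs-rcu warnings, e.g.
#   tname="urcu2" ... blocked 4000 ms waiting for dp-ipc31 to quiesce
BLOCKED_RE = re.compile(
    r'tname="([^"]+)".*blocked (\d+) ms waiting for (\S+) to quiesce'
)


def oversized_appha_sizes(lines):
    """Return the sizes of AppHA-tx-Bridge packets above the threshold."""
    sizes = []
    for line in lines:
        m = APPHA_RE.search(line)
        if m and int(m.group(1)) > APPHA_MAX_SIZE:
            sizes.append(int(m.group(1)))
    return sizes


def longest_block_per_thread(lines):
    """Map (waiting thread, awaited thread) to the largest blocked time in ms."""
    worst = {}
    for line in lines:
        m = BLOCKED_RE.search(line)
        if m:
            key = (m.group(1), m.group(3))
            worst[key] = max(worst.get(key, 0), int(m.group(2)))
    return worst


def scan_syslog(path="/var/log/syslog"):
    """Hypothetical helper: run both checks against a syslog file."""
    with open(path, errors="replace") as f:
        lines = f.readlines()
    return oversized_appha_sizes(lines), longest_block_per_thread(lines)
```

Calling scan_syslog() against a copy of the Edge syslog returns the list of oversized AppHA packet sizes and, for each pair of waiting and awaited threads, the longest blocked time seen. A non-empty list, or blocked times that reach many seconds, would be consistent with the symptoms described above.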