Symptoms:
The Virtual Infrastructure Latency feature of Aria Operations for Networks uses the BFD service on NSX-enabled hosts to establish tunnels between hosts. The PSOD occurs when the NSX kernel module responds to a detailed BFD tunnel query from the control plane agent with the states of all BFD sessions maintained by the kernel.
Note: The PSOD is not observed when the number of BFD tunnels is in the low hundreds; it occurs only when the number of tunnels exceeds 900.
To determine the number of BFD tunnels in the environment, use the following formula:
For example, in a cluster of 4 hosts with 2 VTEPs each, the number of tunnels each host would see is:
This issue is resolved in VMware NSX for vSphere 6.4.8.
Use this option if the Virtual Infrastructure Latency feature is enabled through VRNI.
Use this option if the VRNI appliance is not accessible or if the Virtual Infrastructure Latency feature is enabled through the NSX API.
GET /api/2.0/vdn/bfd/configuration/global
Response:
<bfdGlobalConfiguration>
<enabled>true</enabled>
<pollingIntervalSecondsForHost>180</pollingIntervalSecondsForHost>
<bfdIntervalMillSecondsForHost>120000</bfdIntervalMillSecondsForHost>
</bfdGlobalConfiguration>
PUT /api/2.0/vdn/bfd/configuration/global
Request Body:
<bfdGlobalConfiguration>
<enabled>false</enabled>
<pollingIntervalSecondsForHost>180</pollingIntervalSecondsForHost>
<bfdIntervalMillSecondsForHost>120000</bfdIntervalMillSecondsForHost>
</bfdGlobalConfiguration>
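The GET and PUT calls above can be chained so that the PUT body is derived from the current configuration instead of being typed by hand. The following is a minimal sketch, not a supported procedure. The variables NSX_MGR, NSX_USER and NSX_PASS and the function names are hypothetical, and it assumes that curl and sed are available on the machine issuing the requests.

```shell
#!/bin/sh
# Hypothetical environment variables (not from the article):
#   NSX_MGR  - NSX Manager IP address or hostname
#   NSX_USER - API user name
#   NSX_PASS - API password
BFD_URL_PATH="/api/2.0/vdn/bfd/configuration/global"

# Read a bfdGlobalConfiguration document on stdin and write it to stdout
# with <enabled> set to false; all other elements pass through unchanged.
bfd_disable_body() {
    sed 's|<enabled>true</enabled>|<enabled>false</enabled>|'
}

# Fetch the current global BFD configuration.
bfd_get() {
    curl -s -k -u "${NSX_USER}:${NSX_PASS}" \
        -H "Content-Type: application/xml" \
        -X GET "https://${NSX_MGR}${BFD_URL_PATH}"
}

# Send the configuration read from stdin back to NSX Manager.
bfd_put() {
    curl -s -k -u "${NSX_USER}:${NSX_PASS}" \
        -H "Content-Type: application/xml" \
        -X PUT --data-binary @- "https://${NSX_MGR}${BFD_URL_PATH}"
}

# Usage (nothing is sent until this pipeline is run explicitly):
#   bfd_get | bfd_disable_body | bfd_put
```

Deriving the body from the GET response keeps the polling and interval values exactly as they are currently configured, so only the enabled flag changes.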
Steps to block the BFD API in NSX Manager
Take an FTP backup of NSX Manager before executing the steps below.
cd /usr/appmgmt-webserver/webapps/ROOT/WEB-INF/
mkdir temp
cp mapping.conf /usr/appmgmt-webserver/webapps/ROOT/WEB-INF/temp/mapping.conf.orig
/etc/init.d/app-mgmt restart
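The backup of mapping.conf in the steps above can be wrapped in a small helper that refuses to overwrite an earlier backup and confirms that the copy is identical before any change is made. This is a sketch only: the function name backup_orig is hypothetical, and it assumes that mkdir, cp, cmp and basename are available in the NSX Manager shell.

```shell
#!/bin/sh
# Copy a file into a backup directory as <name>.orig and verify the copy.
# Arguments: $1 = file to back up, $2 = backup directory (created if missing).
backup_orig() {
    src=$1
    dest_dir=$2
    dest="${dest_dir}/$(basename "$src").orig"
    # Refuse to overwrite an earlier backup, which may be the only pristine copy.
    if [ -e "$dest" ]; then
        echo "backup already exists: $dest" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest" || return 1
    # cmp -s exits 0 only when both files are byte-for-byte identical.
    cmp -s "$src" "$dest"
}

# Usage with the paths from the steps above:
#   backup_orig /usr/appmgmt-webserver/webapps/ROOT/WEB-INF/mapping.conf \
#               /usr/appmgmt-webserver/webapps/ROOT/WEB-INF/temp
```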
Request:
root@vmware:~# curl -H "Content-Type: text/xml" -k -u username:password -X PUT https://<NSX manager IP>/api/2.0/vdn/bfd/configuration/global
Response:
<!doctype html><html lang="en"><head><title>HTTP Status 403 – Forbidden</title><style type="text/css">h1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} h2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} h3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} b {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} p {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;} a {color:black;} a.name {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 403 – Forbidden</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Description</b> The server understood the request but refuses to authorize it.</p><hr class="line" /></body></html>root@vmware:~#
Request:
root@vmware:~# curl -H "Content-Type:application/xml" -k -u username:password -X GET https://<NSX manager IP>/api/2.0/vdn/bfd/configuration/global
Response:
<!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">h1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} h2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} h3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} b {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} p {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;} a {color:black;} a.name {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> /api/2.0/vdn/bfd/configuration/global</p><p><b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.</p><hr class="line" /></body></html>root@vmware:~#
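The two checks above can be reduced to a comparison of HTTP status codes rather than reading the HTML error pages. The following sketch assumes the same hypothetical NSX_MGR, NSX_USER and NSX_PASS variables as a stand-in for the manager address and credentials; the function names are illustrative and not part of any NSX tooling.

```shell
#!/bin/sh
BFD_URL_PATH="/api/2.0/vdn/bfd/configuration/global"

# Print only the HTTP status code for a request to the BFD endpoint.
# Argument: $1 = HTTP method (GET or PUT).
# As in the requests shown above, the PUT is sent without a body.
bfd_status() {
    curl -s -k -o /dev/null -w '%{http_code}' \
        -u "${NSX_USER}:${NSX_PASS}" \
        -H "Content-Type: application/xml" \
        -X "$1" "https://${NSX_MGR}${BFD_URL_PATH}"
}

# Succeed only if the status codes match the responses shown above:
# 403 for PUT ($1) and 404 for GET ($2).
bfd_api_blocked() {
    [ "$1" = "403" ] && [ "$2" = "404" ]
}

# Usage:
#   if bfd_api_blocked "$(bfd_status PUT)" "$(bfd_status GET)"; then
#       echo "BFD API is blocked"
#   fi
```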