This article contains the basic information for troubleshooting the NSX Native Load Balancer and the data required when opening a support request with Broadcom.
VMware NSX
Enable DEBUG logging on the Load Balancer page and Access Logs on the Virtual Server page in the NSX-T UI.
SSH into the Edge where the LB is active as the root user, change to the LB log directory, and search error.log for 502 responses:
cd /var/log/lb/<LB ID>/logs
grep -a "502" error.log
error.log:2019/09/25 11:54:16 [debug] 32409#0: *11258672 HTTP/1.1 502 Bad Gateway
error.log:2019/09/25 11:54:24 [debug] 32406#0: *11292889 HTTP/1.1 502 Bad Gateway
Virtual server access logs are written to /var/log/syslog. Search for subcomp="lb" and s2comp="access", as in the following entry:
LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="access" level="INFO"] [######-####-####-####-############][########-####-####-####-############] Operation.Category: 'LbAccessLog', Operation.Type: 'Http', Lb.UUID: '#######-####-####-####-############', Lb.Name: 'LB-NAME', Vs.UUID: '########-####-####-####-############', Vs.Name: '####', Vs.Ip: '##.###.#.#', Vs.Port: '443', Pool.UUID: '########-####-####-####-##########ab', Pool.Name: 'POOL-NAME', PoolMember.Ip: '##.###.#.#', PoolMember.Port: '443', Client.Ip: '##.##.##.#', Client.Port: '####', Snat.Ip: '##.###.#.#', Snat.Port: '63069', HttpRequest.Method: 'POST', HttpRequest.UserAgent:, HttpRequest.X-Fwd-For: '-', HttpRequest.Uri: '/auth/access_token', HttpRequest.Host: '#######', HttpResponse.Status: '502', HttpResponse.StatusCategory: '5xx', HttpResponse.Size: '0', HttpResponse.ServerTime: '31.624', HttpResponse.TotalTime: '31.628', Error.Reason: '-'.
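Access-log entries are long, so it can help to reduce each one to the few fields that matter when chasing a 502. The following is a minimal sketch, assuming the fields appear in the same order and the same Name: 'value' form as in the entry above. The function name lb_access_summary is hypothetical and not part of NSX.

```shell
# Summarize LB access-log entries read on stdin.
# Assumes the field order and quoting shown in the sample entry above.
lb_access_summary() {
  grep -a 's2comp="access"' |
    sed -nE "s/.*PoolMember\.Ip: '([^']*)'.*HttpRequest\.Uri: '([^']*)'.*HttpResponse\.Status: '([^']*)'.*HttpResponse\.ServerTime: '([^']*)'.*/status=\3 member=\1 server_time=\4 uri=\2/p"
}

# Usage on the edge (path taken from the text above):
#   grep -a 'subcomp="lb"' /var/log/syslog | lb_access_summary
```

Entries that do not match the assumed layout are skipped silently, so an empty result does not prove that no access-log entries exist.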
In /var/log/lb/<LoadBalancer-UUID>/logs/error.log, look specifically for the 502 error. The surrounding entries should clearly indicate whether the header is too big.
Note: A header that is too big is usually caused by a large cookie.
If this is the case, the header size can be changed in the UI.
The Manager interface supports changing only the request header size, while the Policy interface supports changing both the request and response header sizes.
If only a request header size change is required, a new policy profile can be created and attached to the manager load balancer.
Make sure to revert debug logging when done.
NOTE: Debug logs are deleted the moment DEBUG is turned off. So always gather logs from the edge BEFORE disabling debug logging.
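Because the debug logs disappear when DEBUG is turned off, it is worth archiving the log directory first. The following is a minimal sketch, assuming tar is available on the edge. The function name collect_lb_logs and the output path are hypothetical.

```shell
# Archive an LB log directory before DEBUG logging is switched off.
# $1: log directory, for example /var/log/lb/<LB ID>/logs
# $2: output archive path (hypothetical; choose a location with free space)
collect_lb_logs() {
  log_dir=$1
  archive=$2
  [ -d "$log_dir" ] || { echo "no such directory: $log_dir" >&2; return 1; }
  # Store paths relative to the parent so the archive unpacks into one folder.
  tar -czf "$archive" -C "$(dirname "$log_dir")" "$(basename "$log_dir")"
}

# Usage on the edge as root:
#   collect_lb_logs "/var/log/lb/<LB ID>/logs" /tmp/lb-debug-logs.tgz
```

Copy the archive off the edge before reverting the logging level.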
On the edge, check /var/log/lb/<loadbalancer-UUID>/lbconf_gen.log for configuration generation errors:
2020-08-03 09:01:35,301 204 lb ERROR failed to build nginx config
2020-08-03 09:01:35,301 204 lb ERROR 'ascii' codec can't encode character '\u2013' in position 6950: ordinal not in range(128)
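The codec error above shows that the configuration generator hit a character outside ASCII (U+2013 is an en dash). One way to hunt for such characters is sketched below, assuming the suspect configuration text has been saved to a local file in raw UTF-8. The function name find_non_ascii is hypothetical, and the sketch relies on grep treating input as bytes under LC_ALL=C.

```shell
# Print "line-number:line" for every line of file $1 that contains a byte
# that is neither printable ASCII nor whitespace (for example the UTF-8
# encoding of an en dash). Returns non-zero when nothing is found.
find_non_ascii() {
  LC_ALL=C grep -an '[^[:print:][:space:]]' "$1"
}

# Usage (file name is an example):
#   find_non_ascii lb.json
```

If the saved file escapes such characters as \uXXXX sequences instead of raw bytes, this check finds nothing; in that case search for the literal text \u2013 instead.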
Check for errors in /var/log/syslog on the edge:
<5>1 2020-08-03T09:26:27.355660+00:00 w1-dmz-edge04 kernel - - - [ 3789.995524] grsec: [e01e2f34####] denied RWX mmap of <anonymous mapping> by /opt/vmware/nsx-edge/bin/lbconf_gen.py[lbconf_gen.py:12455] uid/euid:134/134 gid/egid:140/140, parent /opt/vmware/nsx-edge/bin/nginx[nginx:8167] uid/euid:134/134 gid/egid:140/140
<25>1 2020-08-03T09:01:35.370199+00:00 e01e2f34#### NSX 61 LB [nsx@6876 comp="nsx-edge" subcomp="nsx-edge-lb.lb" level="FATAL"] [bba2bc6c-7b00-402e-84af-############] cfg: failed to signal config change to engine (Connection refused).
<25>1 2020-08-03T09:01:35.370406+00:00 e01e2f3486a5 NSX 61 LB [nsx@6876 comp="nsx-edge" subcomp="nsx-edge-lb.lb" level="FATAL"] [bba2bc6c-7b00-402e-84af-############] cfg: failed to generate Lb configuration
On the edge node, the nginx.conf file is empty. As the root user, run the following command and note that the size of the file is 0:
ls -l /config/vmware/edge/lb/etc/bba2bc6c-7b00-402e-84af-############/nginx.conf
-rw-r----- 1 lb lb 0 Aug 3 09:26 /config/vmware/edge/lb/etc/bba2bc6c-7b00-402e-84af-############/nginx.conf
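When several load balancers run on the same edge, the same check can be applied to all of them at once. The following is a minimal sketch, assuming every LB has its own subdirectory containing nginx.conf under the path shown above. The function name empty_nginx_confs is hypothetical.

```shell
# List every zero-byte nginx.conf below an LB configuration root.
# $1: root directory, /config/vmware/edge/lb/etc in the listing above
empty_nginx_confs() {
  # -size 0 matches only files that occupy zero blocks, that is, empty files.
  find "$1" -type f -name nginx.conf -size 0
}

# Usage on the edge as root:
#   empty_nginx_confs /config/vmware/edge/lb/etc
```

Each path printed belongs to a load balancer whose configuration was not generated.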

Check services used by LB (nsxcli commands)
get service dataplane
get service dispatcher
get service nsx-control-plane-agent
get service nestdb

Check LB configuration (nsxcli commands)
get load-balancers
get load-balancer <lb-uuid>
get load-balancer <lb-uuid> virtual-servers
get load-balancer <lb-uuid> virtual-server <vs-uuid>
get load-balancer <lb-uuid> virtual-server <vs-uuid> lbrules
get load-balancer <lb-uuid> pools
get load-balancer <lb-uuid> pool <pool-uuid>
get load-balancer <lb-uuid> monitors
get load-balancer <lb-uuid> monitor <monitor-uuid>

Check LB status (nsxcli commands)
get load-balancers status
get load-balancer <lb-uuid> status
get load-balancer <lb-uuid> virtual-servers status
get load-balancer <lb-uuid> virtual-server <vs-uuid> status
get load-balancer <lb-uuid> pools status
get load-balancer <lb-uuid> pool <pool-uuid> status
get load-balancer <lb-uuid> monitor <monitor-uuid> status

Check LB statistics (nsxcli commands)
get load-balancer <lb-uuid> stats
get load-balancer <lb-uuid> stats verbose
get load-balancer <lb-uuid> virtual-servers stats
get load-balancer <lb-uuid> virtual-server <vs-uuid> stats
get load-balancer <lb-uuid> pools stats
get load-balancer <lb-uuid> pool <pool-uuid> stats
clear load-balancer <lb-uuid> stats
clear load-balancer <lb-uuid> pools stats
clear load-balancer <lb-uuid> pool <pool-uuid> stats
clear load-balancer <lb-uuid> virtual-servers stats
clear load-balancer <lb-uuid> virtual-server <vs-uuid> stats

Check LB HA (nsxcli commands)
get load-balancer <lb-uuid> high-availability-state

Check LB logs (nsxcli commands)
get load-balancer <lb-uuid> error-log
get load-balancer <lb-uuid> virtual-server <vs-uuid> access-log
set load-balancer <lb-uuid> rule-log

Check kni interface is created (only L7) (root commands)
# ifconfig | grep kni-lrport

Check namespaces (root commands)
# set debug
# get namespaces

Check LB status (CPU, memory)
GET /api/v1/loadbalancer/services/<LB-Service_UUID>/status

List the containers running on the edge (root command):
docker ps
Example:
root@edge:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9564####bd78 nsx-edge-frr:current "/opt/vmware/edge/fr…" 12 days ago Up 12 days service_frr
d102####036c nsx-edge-base:current "/bin/sleep infinity" 12 days ago Up 12 days 22/tcp plr_sr
476f####67bb nsx-edge-mdproxy:current "/opt/vmware/edge/md…" 12 days ago Up 12 days service_md_proxy
abcb####8c91 nsx-edge-base:current "/bin/sleep infinity" 12 days ago Up 12 days 22/tcp mdproxy
a681####9352 nsx-edge-dhcp:current "/opt/vmware/edge/dh…" 12 days ago Up 12 days service_dhcp
044c####82dc nsx-edge-mdproxy:current "/opt/vmware/edge/md…" 12 days ago Up 12 days service_md_agent
c957####f87b nsx-edge-dispatcher:current "/opt/vmware/edge/lb…" 12 days ago Up 12 days service_dispatcher
5451####63ef nsx-edge-datapath:current "/opt/vmware/edge/dp…" 12 days ago Up 12 days service_datapath
1071####ab35 nsx-edge-nsxa:current "/opt/vmware/edge/ns…" 12 days ago Up 12 days service_nsxa
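The listing above can also be checked automatically for containers that should be present. The following is a minimal sketch, assuming the container names from the example output. Which containers matter for a given LB issue is an assumption here (service_dispatcher and service_datapath are used only as examples), and the function name missing_containers is hypothetical.

```shell
# Read container names on stdin (one per line) and print every expected name,
# given as an argument, that is not among them. Returns 1 if any is missing.
missing_containers() {
  running=$(cat)
  rc=0
  for name in "$@"; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    printf '%s\n' "$running" | grep -qxF -- "$name" || { echo "$name"; rc=1; }
  done
  return $rc
}

# Usage on the edge as root:
#   docker ps --format '{{.Names}}' | missing_containers service_dispatcher service_datapath
```

No output and a zero exit status mean that every expected container is listed as running.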
If a load balancer is not owned by NSX-T and cannot be resized through the UI, use the API calls below:
Get the current LB configuration:
curl -X GET -H Content-Type:application/json -ku username:password https://NSXManagerIPAddress/api/v1/loadbalancer/services/<LoadBalancerUUID> > lb.json
Edit the lb.json file and set the "size" entry to the desired form factor. The value must match a form factor supported for the edge, as listed in NSX Edge VM System Requirements.
Push the modified configuration:
curl -X PUT -H Content-Type:application/json -H X-Allow-Overwrite:True -ku username:password https://NSXManagerIPAddress/api/v1/loadbalancer/services/<LoadBalancerUUID> -d @lb.json
Note: Take a backup before making any modifications.
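The edit step in the procedure above can be scripted so that a copy of the original file is always kept. The following is a minimal sketch, assuming lb.json contains a single "size" entry on one line and the new value is a plain word such as MEDIUM or LARGE. The accepted values are an assumption here and should be confirmed against the NSX API documentation. The function name set_lb_size is hypothetical.

```shell
# Set the "size" value in a saved LB service JSON file, keeping a backup copy.
# $1: JSON file (lb.json in the steps above)
# $2: target size, a plain word without "/" or "&" (assumed, e.g. MEDIUM)
set_lb_size() {
  file=$1
  size=$2
  grep -q '"size"' "$file" || { echo "no \"size\" entry in $file" >&2; return 1; }
  # Keep the untouched original next to the edited file.
  cp -p "$file" "$file.bak" || return 1
  sed -E 's/("size"[[:space:]]*:[[:space:]]*")[^"]*(")/\1'"$size"'\2/' "$file.bak" > "$file"
}

# Usage:
#   set_lb_size lb.json MEDIUM
```

This only rewrites the local file. The PUT call shown above is still required to push the change, and lb.json.bak holds the original for comparison or rollback.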
To review the details of the certificates applied to a Virtual Server, follow these steps:
cd /config/vmware/edge/lb/etc/<Load-Balancer ID>/certs/

get firewall [Logical interface UUID] ruleset rules
get firewall [Logical interface UUID] ruleset stats
get firewall [Logical interface UUID] interface stats
get load-balancer [Load balancer UUID] pool [pool UUID] status
get dataplane cpu stats
restart service dataplane (try this to restart datapathd)

Check the size of the LB and make sure that it does not exceed the maximums for that size.
Refer to the Maximums section in this document.