- The Load Balancer (LB) is reconfigured and an nginx core file is generated. These core files can be found on the Edge Node, typically in the /var/dump directory, with names similar to the following:
root@edge_name:/var/dump# ls
total 454M
-rw-rw-rw- 1 root root 321M Jun 26 12:39 core.nginx.####.gz
-rw-rw-rw- 1 root root 321M Jun 26 12:37 core.nginx.####.gz
- Pool members may report "Connect to Peer Failure" or "TCP Handshake Timeout".
- In /var/log/syslog on the Edge Node, you see log entries for "all pool members are down":
2022-12-27T01:22:23.064227+00:00 <edge_name> NSX 6552 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="ERROR" errorCode="EDG1200000"] [########-####-####-####-##########34] Operation.Category: 'LbEvent', Operation.Type: 'StatusChange', Obj.Type: 'Pool', Obj.UUID: '####9c89-########-####-####-##########95', Obj.Name: 'cluster:<name>', Lb.UUID: '########-####-####-####-##########34', Lb.Name: '<LB_LBname>', Vs.UUID: '########-####-####-####-##########f8', Vs.Name: '<name>', Status.NewStatus: 'Down', Status.Msg: 'all pool members are down'.
2022-12-27T01:22:23.064913+00:00 <edge_name> NSX 6552 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="ERROR" errorCode="EDG9999999"] [########-####-####-####-##########34] Operation.Category: 'LbEvent', Operation.Type: 'StatusChange', Obj.Type: 'VirtualServer', Obj.UUID: '########-####-####-####-##########f8', Obj.Name: 'cluster:<name>', Lb.UUID: '########-####-####-####-##########34', Lb.Name: '<LB_LBname>', Status.NewStatus: 'Down', Status.Msg: 'all pool members are down'.
- The LB CONF process for the LB instance is not running. This can be confirmed with the following steps:
1. Run the following command from the root CLI of the Edge Node. It requires the UUID of the LB.
#ps -ef | grep lb | grep nginx | grep <LB UUID>
example:
root@edge_name:~# ps -ef | grep lb | grep nginx | grep ########-####-####-####-##########a8
lb 9568 9481 0 Jun23 ? 00:00:00 /opt/vmware/nsx-edge/bin/nginx -u ########-####-####-####-##########a8 -g daemon off;
Note: To retrieve the LB UUID, run get load-balancer from the admin CLI of the active Edge Node. In the above example, the LB UUID is ########-####-####-####-##########a8.
2. Use the nginx process ID (9568 in the example above) in the following command to confirm that it has an LB CONF process running. If the command returns no output, no LB CONF process is running and the issue has been encountered.
ps -ef | grep <nginx process ID> | grep CONF
example:
Impacted (the command returns no output):
root@edge02:~# ps -ef | grep 9568 | grep CONF
Not impacted (the LB CONF process is listed):
root@edge02:~# ps -ef | grep 9568 | grep CONF
lb 9572 9568 0 Jun23 ? 00:00:06 nginx: LB CONF process
NOTE: The preceding log excerpts are only examples. Dates, times, and other values vary depending on your environment.
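The two-step LB CONF check above can be combined into a single pass over the ps -ef output. The following is a minimal sketch, not a supported tool. The function name check_lb_conf is hypothetical, and the script assumes the ps -ef column layout shown in the examples above: PID in column 2, parent PID in column 3, and the command starting in column 8.

```shell
#!/bin/sh
# Sketch only: check_lb_conf is a hypothetical helper, not part of NSX.
# It reads `ps -ef` output on stdin and takes the LB UUID as its argument.
# Assumes the column layout seen in the examples: $2 = PID, $3 = PPID, $8 = command.
check_lb_conf() {
    if [ -z "$1" ]; then
        echo "usage: ps -ef | check_lb_conf <LB UUID>" >&2
        return 64
    fi
    awk -v uuid="$1" '
        # Step 1: the nginx process started for this LB UUID; remember its PID.
        $8 ~ /nginx$/ && index($0, uuid) { master = $2 }
        # Step 2: every "nginx: LB CONF process" line; remember its parent PID.
        $8 == "nginx:" && /LB CONF process/ { conf_parent[$3] = 1 }
        END {
            if (master == "")          { print "nginx process not found"; exit 2 }
            if (master in conf_parent) { print "not impacted"; exit 0 }
            print "impacted"; exit 1
        }'
}

# Usage from the root CLI of the Edge Node (replace the placeholder):
#   ps -ef | check_lb_conf "<LB UUID>"
```

The function prints "impacted" when the nginx process for the given LB UUID has no LB CONF child, "not impacted" when it does, and "nginx process not found" when no nginx process matches the UUID. Reading from stdin also allows the check to be run against saved ps -ef output.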