In AWI, we try to switch the PWP (primary work process) from the first node to a WP on the second node.
No error message is displayed. After about 20-30 seconds, the message "Server mode successfully changed" appears, but the PWP is not actually switched and remains on the first node.
Switching the PWP to another WP on the first node completes without a problem.
Release: 12.3.4
Network problems were disrupting communication between the two AE servers.
The logs show hundreds of messages like the following:
U00003413 Socket call 'recv(47)' returned error code '104'
U00003413 Socket call 'recv(115)' returned error code '110'
U00003413 Socket call 'send(1)' returned error code '32'
U00003413 Socket call 'bind' returned error code '98'
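The numeric error codes in these U00003413 messages are the operating system's errno values. The sketch below decodes the four codes seen above, assuming the AE servers run on Linux (an assumption, although the "Address already in use" text logged with code 98 is consistent with it). Other platforms use different numbers. The table and the `explain` helper are illustrative and not part of Automic.

```python
import re

# Linux errno values. Assumption: the AE server host is Linux;
# Windows and other Unix systems number these errors differently.
LINUX_ERRNO = {
    32: ("EPIPE", "Broken pipe"),
    98: ("EADDRINUSE", "Address already in use"),
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches the U00003413 message format shown in the log excerpt.
PATTERN = re.compile(
    r"U00003413 Socket call '([^']+)' returned error code '(\d+)'"
)


def explain(line):
    """Return a readable summary of a U00003413 line, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    call, code = match.group(1), int(match.group(2))
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return f"{call}: {code} {name} ({text})"


if __name__ == "__main__":
    print(explain("U00003413 Socket call 'recv(47)' returned error code '104'"))
```

Read this way, codes 104, 110 and 32 indicate connections that were reset, timed out or closed by the peer, which fits a network problem between the nodes. Code 98 indicates that a listen port was still occupied.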
At the time the switch to the second node is attempted, the trace logs show the following:
mqsrv_get_primary(2267): rslt = 0,msqh = 340078
mqsrv_get_primary <-- (no primary)
try2be_pwp(7559): an older PWP found, let's bind PWP port(s)
bind_primary_ports() -->
bind_primary_ports(2407): retry(cnt=3,wait=30)
U00003413 Socket call 'bind' returned error code '98'.
Address already in use
U00003487 ListenSocket with port number '49502' could not be created.
U00003413 Socket call 'bind' returned error code '98'.
Address already in use
U00003487 ListenSocket with port number '49502' could not be created.
U00003413 Socket call 'bind' returned error code '98'.
Address already in use
U00003487 ListenSocket with port number '49502' could not be created.
bind_primary_ports <-- (PWP port couldn't be binded)
try2be_pwp <-- (not able to bind PWP port)
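In the trace above, the candidate WP makes three attempts to bind port 49502, and each fails with error 98 (Address already in use), so it gives up on becoming the PWP. A quick way to check whether a port can currently be bound on a node is sketched below. The `port_is_free` helper is hypothetical and not an Automic tool. It binds without `SO_REUSEADDR`, which may differ from how AE opens its listen sockets, so treat the result as a hint only.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the same condition as error code '98' in the trace.
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error (for example, permissions) is unexpected
        return True


if __name__ == "__main__":
    # 49502 is the PWP port number reported in the trace above.
    print("port 49502 free:", port_is_free(49502))
```

Run it on the second node while AE is stopped there. If it reports that the port is not free, some process or lingering connection still holds the port and the bind attempts in the trace would be expected to fail.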
Fully shut down AE on both nodes. On Windows systems, reboot the servers.
Clean out the logs in the /temp folder, then restart AE one node at a time.
Watch the logs for U00003413 messages and engage your network team if they persist.
Socket errors do not originate in Automic. They are OS-level or network errors that the operating system reports to Automic.
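To watch for recurring U00003413 messages after the restart, a small script can tally them by socket call and error code. This is a minimal sketch. The log file path is a placeholder to replace with an actual AE server log, and the message pattern is taken from the excerpts above.

```python
import re
from collections import Counter

# Matches the U00003413 message format shown in the log excerpts.
PATTERN = re.compile(
    r"U00003413 Socket call '([^']+)' returned error code '(\d+)'"
)


def count_socket_errors(lines):
    """Count U00003413 messages, keyed by (socket call, error code)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_socket_errors.py <path to an AE server log>
    with open(sys.argv[1], errors="replace") as handle:
        for (call, code), n in count_socket_errors(handle).most_common():
            print(f"{n:6d}  {call}  error {code}")
```

A count that keeps growing after the clean restart suggests that the underlying network problem is still present and should go to the network team.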