Most of the time, the IIS Web Agent fails to start and returns a 500 error when a client tries to connect.
Nothing is written to the Web Agent log or trace. The only evidence is an "Sm_AgentApi_Init Failed" error message reported in the Windows Event Viewer.
When the problem occurs, the LLAWP process attempts to start but fails.
Policy Server: all versions
Web Agent R12.52 SP1 CR06 on IIS 7.5 (Windows Server 2008 R2)
IIS 7.5 running a mix of Classic and Integrated pipeline mode application pools (32-bit and 64-bit)
This issue is caused by the difference between the time the operating system takes to shut down and the time LLAWP takes to shut down.
In a scenario where the Web Agent machine is rebooted unexpectedly and frequently (for instance, in a test environment or an otherwise unstable one), this problem may occur because of the mechanism the IIS Web Agent uses to manage the connections between the IIS process, LLAWP, and the Policy Server:
However, if the operating system shuts down before these 20 seconds elapse, LLAWP never closes its connections to the Policy Server. This leaves connections open on the Policy Server machine, which does not know that the other side is no longer there. Running a 'netstat' command on the Policy Server machine shows the connections coming from the Web Agent IP and port as 'ESTABLISHED'. If the Web Agent machine is restarted very frequently, many connections are left as 'ESTABLISHED' on the Policy Server machine, even though there is no corresponding LLAWP process on the other side.
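The 'netstat' check described above can be scripted. The following is a minimal sketch, assuming the output of 'netstat -an' has been captured as text on the Policy Server machine. The function name established_from and all addresses and ports in the sample are illustrative only and are not taken from the product.

```python
def established_from(netstat_output, agent_ip):
    """Return (ip, port) pairs for ESTABLISHED connections whose remote end is agent_ip."""
    matches = []
    for line in netstat_output.splitlines():
        tokens = line.split()
        if "ESTABLISHED" not in tokens:
            continue
        # Address columns are the tokens containing ':'.
        # The first is the local address, the second is the foreign address.
        addrs = [t for t in tokens if ":" in t]
        if len(addrs) < 2:
            continue
        ip, _, port = addrs[1].rpartition(":")
        if ip == agent_ip:
            matches.append((ip, int(port)))
    return matches


# Hypothetical capture: 10.0.0.5 stands in for the Policy Server,
# 10.0.0.20 for the Web Agent machine.
SAMPLE = """
  TCP    10.0.0.5:44443    10.0.0.20:51234    ESTABLISHED
  TCP    10.0.0.5:44443    10.0.0.20:51240    ESTABLISHED
  TCP    10.0.0.5:44443    10.0.0.31:50011    ESTABLISHED
  TCP    10.0.0.5:44442    10.0.0.20:51300    TIME_WAIT
"""

if __name__ == "__main__":
    for ip, port in established_from(SAMPLE, "10.0.0.20"):
        print(ip, port)
```

If the Web Agent machine is down and this still lists entries for its IP, those entries are likely the abandoned connections described above.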
During the frequent reboots, the Web Agent may at some point try to create a connection and be assigned a source port that matches one of these abandoned connections. The Policy Server machine sees the incoming SYN as irregular, because it still holds an 'ESTABLISHED' connection for that IP and port, and it does not send a SYN/ACK packet in response. The packet it sends instead carries the sequence numbers of the abandoned connection, so the Web Agent machine discards it as out of sequence and does not treat it as a response to the SYN.
The Web Agent then waits for the SYN/ACK for 2 seconds and times out, failing to initialize.
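To check from the Web Agent machine whether a Policy Server port completes a TCP handshake within the same 2-second window, a small probe can help. This is a sketch only: the helper name can_handshake is hypothetical, it is not part of the Web Agent, and because the operating system picks a fresh source port for each attempt, it does not reproduce the port collision itself.

```python
import socket


def can_handshake(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) is established within timeout seconds."""
    try:
        # create_connection performs the SYN, SYN/ACK, ACK exchange and
        # raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as can_handshake(policy_server_host, policy_server_port), where both arguments are placeholders for the values in your environment, returns False if no handshake completes in time.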
This is not really an error or defect, but the expected behavior in the scenario described (very frequent reboots of the IIS agent machine). Thus, it is unlikely to happen in a production environment, which should be very stable by design.
However, if there is a concern that such a situation may happen, there are several possible corrective actions available to mitigate or eliminate it: