A Windows crash (blue screen of death, BSOD) may occur during or immediately after a system reboot.
A memory dump or minidump from the crash may show the following:
FAILURE_BUCKET_ID: 0xEF_wininit.exe_BUGCHECK_CRITICAL_PROCESS_TERMINATED_BY_controller.exe_c9f84880
BUCKET_ID: 0xEF_wininit.exe_BUGCHECK_CRITICAL_PROCESS_TERMINATED_BY_controller.exe_c9f84880
PRIMARY_PROBLEM_CLASS: 0xEF_wininit.exe_BUGCHECK_CRITICAL_PROCESS_TERMINATED_BY_controller.exe_c9f84880
Windows - any version
Robot versions prior to 9.33
When a robot restarts, it tries to shut down all running probes. If a probe does not shut down within 10 seconds, the controller issues a 'kill' command against its PID. At this time the controller also records the PIDs of these processes. When the controller starts up again, it checks whether those PIDs are still active and, if so, kills the processes that hold them before starting the probes again. Below is an example of what this looks like in the controller.log file:
Aug 27 19:07:15:050 [139974114551552] Controller: Stopping processes from previous run
Aug 27 19:07:15:050 [139974114551552] Controller: ProcessControl: Sending SIGTERM signal to spooler (24711)...
Aug 27 19:07:15:050 [139974114551552] Controller: ProcessControl: Unable to send stop signal to process spooler (24711)
Aug 27 19:07:16:050 [139974114551552] Controller: ProcessControl: Child exited
Aug 27 19:07:16:050 [139974114551552] Controller: ProcessControl: Sending SIGTERM signal to hdb (24745)...
Aug 27 19:07:16:050 [139974114551552] Controller: ProcessControl: Unable to send stop signal to process hdb (24745)
Aug 27 19:07:17:050 [139974114551552] Controller: ProcessControl: Child exited
Aug 27 19:07:17:050 [139974114551552] Controller: ProcessControl: Sending SIGTERM signal to snmptd (24771)...
Aug 27 19:07:17:050 [139974114551552] Controller: ProcessControl: Unable to send stop signal to process snmptd (24771)
Aug 27 19:07:18:051 [139974114551552] Controller: ProcessControl: Child exited
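To see which probes and PIDs the controller targeted after a restart, the stop attempts can be pulled out of controller.log. The following is a minimal sketch, not a vendor tool: the function name parse_stop_attempts is hypothetical, and the pattern is taken only from the sample lines above, so other robot versions may word these messages differently.

```python
import re
from typing import Iterable, List, Tuple

# Pattern derived from the sample log lines above (an assumption, not a
# documented log format): "... Sending SIGTERM signal to <probe> (<pid>)..."
STOP_LINE = re.compile(r"ProcessControl: Sending SIGTERM signal to (\S+) \((\d+)\)")


def parse_stop_attempts(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (probe_name, pid) for each stop attempt found in the log lines."""
    attempts = []
    for line in lines:
        match = STOP_LINE.search(line)
        if match:
            attempts.append((match.group(1), int(match.group(2))))
    return attempts


sample = [
    "Aug 27 19:07:15:050 [139974114551552] Controller: ProcessControl: Sending SIGTERM signal to spooler (24711)...",
    "Aug 27 19:07:15:050 [139974114551552] Controller: ProcessControl: Unable to send stop signal to process spooler (24711)",
    "Aug 27 19:07:16:050 [139974114551552] Controller: ProcessControl: Sending SIGTERM signal to hdb (24745)...",
]
print(parse_stop_attempts(sample))  # [('spooler', 24711), ('hdb', 24745)]
```

The PIDs returned here are the ones recorded before the restart, so they are the values to compare against whatever process owned those PIDs at the time of the crash.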
Sometimes during a reboot, one or more probes take longer to shut down and the reboot interrupts this process. After the reboot, a new process may have been assigned a PID that previously belonged to a probe, and the controller terminates that process. If it is a system-critical process, such as wininit.exe in the dump output above, Windows crashes with a BSOD.
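The PID reuse problem described above can be illustrated with a small sketch. It shows one common guard, which is to record a process's start time alongside its PID and to treat the PID as stale if the start time no longer matches. This is an assumption for illustration only: the article does not describe how robot 9.33 implements its fix, and the names RecordedProcess and is_same_process are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RecordedProcess:
    """What a supervisor might persist before a restart (hypothetical)."""
    name: str
    pid: int
    start_time: float  # process creation time, in seconds


def is_same_process(
    record: RecordedProcess,
    current_start_time: Callable[[int], Optional[float]],
    tolerance: float = 1.0,
) -> bool:
    """Return True only if the PID is still held by the recorded process.

    current_start_time is an injected lookup (an assumption of this sketch)
    that returns the creation time of the process now holding the PID,
    or None if the PID is not in use.
    """
    started = current_start_time(record.pid)
    if started is None:
        return False  # PID is free; nothing to kill
    return abs(started - record.start_time) <= tolerance


# After a reboot, PID 24711 belongs to a different, newer process.
after_reboot = {24711: 2000.0}
spooler = RecordedProcess("spooler", 24711, 1000.0)
print(is_same_process(spooler, after_reboot.get))  # False
```

Killing by PID alone corresponds to skipping this check, which is why a recycled PID can lead to an unrelated process being terminated.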
A fix that prevents this issue was released in robot 9.33. Deploy robot 9.33 or any later version to obtain it.
For robot versions from 7.80HF21 up to (but not including) 9.33, the following can be done to work around the issue. Keep in mind that adding these settings may cause robot restarts to take longer than usual.
There is no fix for robot versions prior to 7.80HF21.