Sometimes the MACHINE is not displayed in the "CAUAJM_I_40245 EVENT: ALARM ALARM: JOBFAILURE" message.
Is there a reason for this?
Sometimes the following message is written to the event demon log:
CAUAJM_I_40245 EVENT: ALARM ALARM: JOBFAILURE JOB: TEST_JOB MACHINE: xxxxxxx
At other times, the same message appears without the MACHINE field:
CAUAJM_I_40245 EVENT: ALARM ALARM: JOBFAILURE JOB: TEST_JOB
Root cause:
=======
This happens when a KILLJOB event is sent to a box containing a child job that has the attribute 'job_terminator:1' and has not yet run on any machine (i.e. it is in the ACTIVATED state). The KILLJOB terminates the child job, but because the child job was never RUNNING on a machine, the Scheduler has no machine name to populate in the alarm.
In the same scenario, if the child job is already in the STARTING or RUNNING state, the KILLJOB terminates it and the alarm includes the machine.
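
Because the MACHINE field is optional for the reason described above, any script that reads these alarms from the log should not assume it is present. The following is a minimal Python sketch, not part of the product: the function name parse_alarm and the regular expression are hypothetical, and the field layout is assumed only from the two sample lines shown earlier. Real log lines may carry a timestamp prefix or additional fields.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the two sample messages in this article (an assumption,
# not an official format definition). The MACHINE part is optional.
ALARM_RE = re.compile(
    r"CAUAJM_I_40245\s+EVENT:\s+ALARM\s+ALARM:\s+(?P<alarm>\S+)"
    r"\s+JOB:\s+(?P<job>\S+)"
    r"(?:\s+MACHINE:\s+(?P<machine>\S+))?"
)


def parse_alarm(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return alarm, job and machine from a CAUAJM_I_40245 line.

    'machine' is None when the Scheduler did not populate it, for example
    when the job was terminated while still in the ACTIVATED state.
    Returns None if the line is not a CAUAJM_I_40245 alarm message.
    """
    match = ALARM_RE.search(line)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    samples = [
        "CAUAJM_I_40245 EVENT: ALARM ALARM: JOBFAILURE JOB: TEST_JOB MACHINE: xxxxxxx",
        "CAUAJM_I_40245 EVENT: ALARM ALARM: JOBFAILURE JOB: TEST_JOB",
    ]
    for sample in samples:
        print(parse_alarm(sample))
```

A missing machine is returned as None rather than an empty string, so callers can distinguish "not populated by the Scheduler" from a real machine name.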