We are seeing many instances of the processes probe going down on a group of servers that run jobs. Often, but not always, an Internal Error is also generated on the server.
Sometimes the cdm probe also reports 100% swap memory used.
The processes probe may restart or fail.
Often, simply restarting the processes probe works, and it remains green/active. Other times, the processes probe cannot be activated, and the processes probe log contains "Cannot allocate memory". For example:
Controller: Probe 'processes' FAILED to start (fork) (command = processes) error = (0) Cannot allocate memory
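To confirm that the fork failure coincides with memory exhaustion, it can help to look at swap and available memory on the affected server at the time of the failure. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo; the function names are hypothetical and are not part of the probe or of any vendor tooling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_percent(meminfo):
    """Return the percentage of swap in use, or 0.0 if no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


def read_host_meminfo(path="/proc/meminfo"):
    """Read and parse the memory counters of the local host (Linux assumption)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

On an affected server, swap_used_percent(read_host_meminfo()) returning a value close to 100 would be consistent with the 100% swap usage reported by the cdm probe.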
Note:
When the system runs out of memory, the probe should not go down. Instead, it should continue checking at regular intervals and resume monitoring once sufficient memory becomes available.
Tested and confirmed in a large environment.
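The retry behavior described in the note can be illustrated with a short sketch. It is not the probe's or the controller's actual implementation; start_with_retry, its parameters, and the 60-second interval are assumptions chosen for illustration. The idea is that a start attempt that fails with "Cannot allocate memory" (ENOMEM) is retried at a fixed interval instead of being treated as fatal.

```python
import errno
import time


def start_with_retry(start, interval_s=60.0, max_attempts=None, sleep=time.sleep):
    """Call start() until it succeeds, retrying only on ENOMEM.

    start        -- hypothetical callable that launches the monitored process
    interval_s   -- seconds to wait between attempts (assumed value)
    max_attempts -- optional cap on attempts; None means retry indefinitely
    sleep        -- injectable sleep function, useful for testing
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return start()
        except OSError as exc:
            # Any error other than "Cannot allocate memory" is not retried.
            if exc.errno != errno.ENOMEM:
                raise
            # Give up once the optional attempt cap is reached.
            if max_attempts is not None and attempts >= max_attempts:
                raise
            sleep(interval_s)
```

A fixed interval keeps the sketch simple; a real implementation might prefer a backoff so that retries do not add load to a host that is already short on memory.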