We are experiencing more than 200 intermittent job failures with exit code 11. No agent logs exist for these jobs, which hinders troubleshooting.
03/11/2024 10:00:29.926-0400: Preparing job 503539.50598967_1/WAAE_WF0.1/MAIN
03/11/2024 10:00:29.929-0400: Job 503539.50598967_1/WAAE_WF0.1/MAIN starting
03/11/2024 10:00:30.014-0400: Preparing job 498528.50598974_1/WAAE_WF0.1/MAIN
03/11/2024 10:00:30.017-0400: Job 498528.50598974_1/WAAE_WF0.1/MAIN starting
03/11/2024 10:00:30.042-0400: Preparing job 503448.50598975_1/WAAE_WF0.1/MAIN
03/11/2024 10:00:30.044-0400: Cannot fork a new process to execute the job:503448.50598975_1/WAAE_WF0.1/MAIN, reason:
Resource temporarily unavailable
03/11/2024 10:00:30.044-0400: Job 503448.50598975_1/WAAE_WF0.1/MAIN failed - Submission error
Platform: Linux
Agent Version: 12.x
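The agent process could not create a new task because it had reached its task limit. The fork failed with "Resource temporarily unavailable", which is the message for EAGAIN (errno 11 on Linux). The system default limit is 512 tasks for a user. We raised TasksMax to 2048 with the command below and verified that the change persists after a reboot.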
systemctl set-property waae_agent-WA_AGENT.service TasksMax=2048
## ROOT ##> sudo systemctl show waae_agent-WA_AGENT.service | grep -i Task
TasksCurrent=145
TasksAccounting=yes
TasksMax=2048
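
To catch the limit before jobs start failing again, the two counters above can be compared periodically. The following is a minimal sketch, not part of the product: the function name tasks_usage_pct is hypothetical, and it assumes the Key=Value output format that systemctl show prints above. It reads that output on stdin and prints TasksCurrent as a whole-number percentage of TasksMax, or "n/a" when either value is not a number (for example TasksMax=infinity).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the agent): reads "Key=Value" lines,
# as printed by `systemctl show`, on stdin and prints TasksCurrent as an
# integer percentage of TasksMax. Prints "n/a" if either value is missing or
# non-numeric (e.g. TasksMax=infinity) or if TasksMax is 0.
tasks_usage_pct() {
    awk -F= '
        $1 == "TasksCurrent" { cur = $2 }
        $1 == "TasksMax"     { max = $2 }
        END {
            if (cur !~ /^[0-9]+$/ || max !~ /^[0-9]+$/ || max == 0) {
                print "n/a"
            } else {
                print int(cur * 100 / max)
            }
        }'
}

# Example invocation on the agent host (unit name taken from the article):
#   systemctl show -p TasksCurrent -p TasksMax waae_agent-WA_AGENT.service | tasks_usage_pct
```

A value that stays close to 100 would suggest that TasksMax is still too low for the job load; which threshold to alert on is a site-specific choice.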