An AutoSys user reported the following issue:
The files listed in the output below have been deleted but are still held open by a Java process. The user runs a separate process that raises alerts for files in this state, so whenever the application is started through AutoSys, alerts are raised for these deleted files. The same issue occurs on all of their instances [Wxx and Wxx].
The user restarted the server to see whether that would resolve the issue; it did not. The files still appear whenever the process is started through AutoSys.
[[email protected] ~]$ /gfi/newMerit/resources/scripts/fatjars/cbw_mt/gfi-stop.sh
Applying /home/user/gfiInit/gfienv.rdeploy
[[email protected] ~]$ fsj WMA_GFI_5322_DEV4_New_Merit_START_CBW_MT_2969
[[email protected] ~]$ ps -ef | grep cbw
user 281468 1 99 08:38 ? 00:00:10 /etc/alternatives/jre_1.8.0/bin/java -server -Djava.security.egd=file:///dev/urandom -XX:NativeMemoryTracking=summary -Xmx1024m -XX:MaxMetaspaceSize=512m -XX:CompressedClassSpaceSize=256m -Dspring.profiles.active=dev4 -agentlib:jdwp=transport=dt_socket,address=8804,suspend=n,server=y -jar merit2-7.5.0-SNAPSHOT.jar --spring.config.additional-location=/gfi/newMerit/app_env.cfg
user 281503 278812 0 08:38 pts/1 00:00:00 grep --color=auto cbw
[[email protected] ~]$ lsof | grep deleted | grep 281468
lsof: WARNING: can't stat() cgroup file system /opt/sentinelone/cgroups/memory
Output information may be incomplete.
lsof: WARNING: can't stat() debugfs file system /opt/sentinelone/mount
Output information may be incomplete.
java 281468 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281471 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281472 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281473 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281474 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281475 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281476 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281477 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281478 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
java 281468 281479 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
VM 281468 281480 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
Reference 281468 281481 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
Finalizer 281468 281482 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
Signal 281468 281483 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
JDWP 281468 281484 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
JDWP 281468 281485 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
C2 281468 281486 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
C2 281468 281487 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
C2 281468 281488 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
C1 281468 281489 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
Service 281468 281490 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
VM 281468 281491 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
Log4j2-TF 281468 281498 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
AsyncAppe 281468 281499 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
GC 281468 281510 user 4w REG 253,1 392 730402 /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9 (deleted)
[[email protected] ~]$ ls -la /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9
ls: cannot access /app/CA/WorkloadAutomationAE/SystemAgent/WA_AGENT/spool/WUD_SCH/MAIN/WAAE_WF0.1/80569.12776646_9: No such file or directory
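The same check can be made without lsof by reading /proc directly. The following is a minimal sketch, assuming a Linux host; the function name list_deleted_fds is hypothetical and is not part of AutoSys or of the user's scripts. It walks the descriptors of the process rather than of each thread, so a deleted file is reported once instead of once per thread as in the lsof output above.

```shell
#!/bin/sh
# Sketch only: list descriptors of one process that point at deleted files.
# Assumes Linux, where /proc/<pid>/fd/<n> is a symlink to the open file and
# the kernel appends " (deleted)" to the target once the file is unlinked.
list_deleted_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the descriptor has gone away (or the PID does
        # not exist); skip those entries instead of aborting.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                # Print "<fd number><TAB><target>"
                printf '%s\t%s\n' "${fd##*/}" "$target"
                ;;
        esac
    done
}

# Example call for the PID from the transcript above:
#   list_deleted_fds 281468
```

For the process shown above, lsof reports the spool file as 4w, so this function would be expected to report descriptor 4 for that PID. Reading another user's /proc/<pid>/fd requires the same privileges that lsof needs.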
Release : 11.5
Component : Workload Automation System Agent
A leaked file descriptor is not closed at the OS level after process exit when a script is executed by the System Agent.
Suggested workaround:
1. Turn off the following parameters in agentparm.txt:
oscomponent.lookupcommand=False
oscomponent.cmdprefix.force=False
This prevents processes from hanging after their spool files are deleted.
2. Increase the open-files ulimit for "user", either at the OS level or at the job level.
3. Redirect the spool output to /dev/null within the script that the job command runs. This could serve as a long-term solution, since the process will then no longer hold deleted spool files open.
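Step 3 could be sketched as a small wrapper inside the job script. This is an illustration under stated assumptions, not the vendor's implementation: the function name run_detached is hypothetical, and closing descriptors 3 to 9 is an assumption based on the lsof output above, where the spool file is held on descriptor 4 rather than on stdout or stderr. The vendor note itself only mentions redirecting the output to /dev/null.

```shell
#!/bin/sh
# Sketch only: start a command so that it holds no reference to the agent
# spool file. run_detached is a hypothetical helper name.
run_detached() {
    (
        # Assumption: the spool file may be inherited on a descriptor above
        # 2 (it was fd 4 in the lsof output), so close 3-9 in this subshell.
        # Closing a descriptor that is not open is harmless.
        exec 3>&- 4>&- 5>&- 6>&- 7>&- 8>&- 9>&-
        # Point stdin, stdout and stderr at /dev/null, as the workaround
        # suggests, then run the command passed as arguments.
        "$@" </dev/null >/dev/null 2>&1
    )
}

# Example call (the real java start line from the job script would go here):
#   run_detached /etc/alternatives/jre_1.8.0/bin/java -jar merit2-7.5.0-SNAPSHOT.jar
```

Because the descriptors are closed in a subshell, the calling script keeps its own descriptors untouched. Application output is discarded by this wrapper, so the application needs its own log files. For step 2, `ulimit -Sn` and `ulimit -Hn` print the current soft and hard open-files limits of the shell they run in.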