A scan profile that discovers Active Directory (AD) accounts on an hourly schedule stopped running last week. What can be done about it?
The discovery scan job in question hung while waiting for a lock, specific to this job, that was unavailable. Other jobs, whether other scan jobs or account update and verify jobs, were not affected. By the time the logs were collected, they had already rolled over, so the root cause could not be determined.
There is no published solution. So that the root cause can be investigated, collect the system logs (logs.bin) from the Configuration > Diagnostics > Diagnostic Logs > Download page: set the "Past Days" parameter to a value that guarantees inclusion of log files from the last day the job ran successfully, then click the Download button to the right of the "Download System Diagnostics" label. In a cluster environment, do this on all primary site nodes. The first node in the primary site is the most likely one to launch scheduled jobs, but the job leader role can move to any other primary site node if there are temporary communication or synchronization problems while the cluster is turned on. Open a case with PAM Support and attach the logs.
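As a rough aid for choosing the "Past Days" value, the arithmetic can be sketched as below. This is only an illustration: the function name, the example dates, and the one-day safety margin are assumptions, and the article does not state exactly how the "Past Days" field counts days, so rounding up is the safer choice.

```python
from datetime import date


def past_days_needed(last_success: date, today: date, margin: int = 1) -> int:
    """Return a "Past Days" value that covers the last successful run.

    Counts both the day of the last successful run and today, then adds a
    safety margin. The margin is an assumption, because the exact counting
    rule of the "Past Days" field is not documented in this article.
    """
    if last_success > today:
        raise ValueError("last_success must not be later than today")
    return (today - last_success).days + 1 + margin


# Hypothetical dates, for illustration only.
print(past_days_needed(date(2024, 3, 3), date(2024, 3, 12)))
```

Because logs can roll over, as happened in this case, it is better to download them as soon as the problem is noticed than to rely on a larger "Past Days" value later.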
As a workaround, create a new discovery job with a different name but otherwise the same properties as the hung one, and verify that it runs and completes successfully. If no problem is observed over multiple runs, delete the old job. If the old job is left in place, both jobs may run in parallel and potentially interfere with each other after any activity that restarts the Tomcat service on the leader node, for example when a hotfix that requires a restart is applied, when the node reboots for any reason, or when the cluster is turned off and on again.