One of our ISPs (internet service providers) had an outage at around 9:00 AM. The ISP monitor we have set up alerted us, but the monitor for the site that is configured only on that ISP did not alert us until around 3:30 PM. In the ASM monitor logs we found several "(-98) no more monitoring stations available to perform the check" and "(6004) Cannot connect to destination" errors. This monitor is configured to use several stations across the USA, so it should not run short of stations to run checks on. We also observed checks where the monitor timed out with "(-97) internal error".
ASM SAAS
The issue is that if the initial page load fails because the connection to the monitored server takes longer to establish than the monitor timeout, the check fails with the error "(-97) com.github.kklisura.cdt.services.exceptions.ChromeDevToolsInvocationException: Timeout expired while waiting for server response."
This is an internal error that does not trigger any alerts and in most cases is not displayed in ASM logs.
The error "(-98) no more monitoring stations available to perform the check" means that all of the available monitoring stations returned an internal error and the check won't be retried until the next period.
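The behavior described above can be modeled with a short Python sketch. This is an illustration only, not ASM's actual implementation: the names `CheckResult`, `run_check_cycle`, and `probe` are hypothetical, and only the codes -97, -98, and 6004 come from the logs quoted in this article. It assumes that a station returning the internal error (-97) is skipped silently and that (-98) is produced once every station has been skipped.

```python
from dataclasses import dataclass
from typing import Callable, List

INTERNAL_ERROR = -97     # "(-97) internal error", as quoted in the logs
NO_STATIONS_LEFT = -98   # "(-98) no more monitoring stations available..."


@dataclass
class CheckResult:
    station: str  # hypothetical station name
    code: int     # result code returned by that station


def run_check_cycle(stations: List[str],
                    probe: Callable[[str], CheckResult]) -> CheckResult:
    """Try each station in turn for one check period (assumed model)."""
    for station in stations:
        result = probe(station)
        # Assumption: an internal error raises no alert and simply moves
        # the check on to the next station.
        if result.code != INTERNAL_ERROR:
            return result
    # Every station returned an internal error: nothing is retried until
    # the next period, and no alert has been raised.
    return CheckResult(station="", code=NO_STATIONS_LEFT)
```

Under this model, an outage in which every station times out internally ends each period with (-98) and no alert, which would be consistent with the delayed alerting seen here. Only a station that returns a real error such as (6004) produces an alertable result.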
ASM DEV has created a fix so that in this specific case an unconfirmed error "(1043) Pageload timed out" will be reported instead of "(-97) internal error".
This fix has been deployed to the ASM Production Environment.
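The effect of the fix can be sketched as a change in how the failure is classified. This is a hypothetical model, not the code ASM DEV shipped: the function name `classify_failure`, the `fix_enabled` flag, and the idea of matching on the exception message are all assumptions made for illustration. Only the codes and the message text are taken from this article.

```python
INTERNAL_ERROR = -97        # internal error: no alert is triggered
PAGELOAD_TIMED_OUT = 1043   # "(1043) Pageload timed out", an unconfirmed error

# Message text quoted in the (-97) error above.
TIMEOUT_MESSAGE = "Timeout expired while waiting for server response"


def classify_failure(exception_message: str, fix_enabled: bool) -> int:
    """Map a failed initial page load to a result code (assumed model)."""
    if fix_enabled and TIMEOUT_MESSAGE in exception_message:
        # After the fix: reported as a regular unconfirmed error instead of
        # being hidden as an internal error.
        return PAGELOAD_TIMED_OUT
    # Before the fix (or for any other internal failure): internal error.
    return INTERNAL_ERROR
```

For example, `classify_failure("Timeout expired while waiting for server response.", fix_enabled=True)` returns 1043 in this model, while the same call with `fix_enabled=False` returns -97.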