Windows OS
- Application crashes via the ntevl probe
Linux/Unix OS
dirscan probe:
- Can be used to monitor for the presence of cores/core dump files if core dumps are enabled
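As a rough illustration of what this kind of check looks for, the following is a minimal Python sketch that scans a directory for core dump files. It is not the dirscan probe or its configuration. The file name patterns are assumptions, since the actual core file names depend on the system's core dump settings, and the function name is hypothetical.

```python
import fnmatch
import os

# Assumed name patterns; real core file names depend on the system's
# core dump configuration (for example, the kernel core_pattern on Linux).
CORE_PATTERNS = ("core", "core.*", "*.core")


def find_core_files(directory, patterns=CORE_PATTERNS):
    """Return sorted paths of regular files in `directory` whose names
    match one of the core-file patterns (case-sensitive)."""
    matches = []
    for entry in os.scandir(directory):
        if entry.is_file() and any(
            fnmatch.fnmatchcase(entry.name, p) for p in patterns
        ):
            matches.append(entry.path)
    return sorted(matches)
```

A non-empty result for the monitored directory corresponds to the condition that would raise an alarm.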
Operator Console
Then, QoS can be gathered via the jvm_monitor probe or a third-party app such as VisualVM:
https://visualvm.github.io/
http://docs.oracle.com/javase/6/docs/technotes/guides/visualvm/jmx_connections.html
UIM Gateways - spectrumgtw probe
Use the processes probe and monitor java.exe using the associated command line for spectrumgtw
Use logmon to monitor the spectrumgtw.log for "exception" and other errors
spectrumgtw probe sync process
Here is the recommended way to monitor the UIM<->Spectrum synchronization process. This approach assumes that the UIM-Spectrum integration is configured correctly, that all required versions are compatible, and that the sync was previously working as expected.
Steps:
1. Set spectrumgtw loglevel to 5 and logsize to 100000 for each logfile.
The number of individual spectrumgtw logs written depends on the following setting in the log4j.xml file, which is located in this directory:
...\Program Files (x86)\Nimsoft\probes\gateway\spectrumgtw
<param name="MaxBackupIndex" value="5"/>
2. Deactivate, then activate, the spectrumgtw probe.
3. Configure logmon watchers for the following strings in the spectrumgtw.log:
- error
- Exception
- Failed
- lock
- OutOfMemoryError
- Got NOT OK response
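To show how these watcher strings behave against log lines, here is a minimal Python sketch. It is not logmon itself. It assumes plain case-insensitive substring matching, which may differ from how a given logmon watcher is configured, and the function name and sample lines are hypothetical.

```python
# Watcher strings taken from the list above.
WATCH_STRINGS = [
    "error",
    "Exception",
    "Failed",
    "lock",
    "OutOfMemoryError",
    "Got NOT OK response",
]


def match_watchers(line, watchers=WATCH_STRINGS):
    """Return the watcher strings found in a log line.

    Assumption: case-insensitive substring matching.
    """
    lowered = line.lower()
    return [w for w in watchers if w.lower() in lowered]


# Hypothetical sample lines, for illustration only.
print(match_watchers("java.lang.OutOfMemoryError: Java heap space"))
print(match_watchers("Got NOT OK response from server"))
```

Note that a short substring such as "lock" also matches words like "block" or "clock", so a watcher of this kind may need a narrower pattern to avoid noisy alarms.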
General probe failures
To generate an alarm when a probe turns red, use logmon to monitor the probe logfile for errors, failures, and "Max. restarts" messages. Do this for the probe log, the controller log, or both.
When a probe fails, the most common errors that may occur include:
Controller: Probe '<probe_name>' FAILED to start (command = <probe_name>.exe) error = (5) Access is denied.
In the nas, an alarm message filter could be used to take an action on the alarm, e.g.,
/.*FAILED to start.*/
or an error such as:
Controller: Max. restarts reached for probe '<probe_name>' (command = <startup java>)
/.*Max. restarts.*/
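The two filters can be checked against the sample messages above before they are put into a nas rule. The sketch below uses Python's re module as an approximation. It assumes the nas filter behaves like a standard regular expression for these simple patterns, and the function name is hypothetical.

```python
import re

# Patterns from the text, without the surrounding slashes.
FAILED_TO_START = re.compile(r".*FAILED to start.*")
MAX_RESTARTS = re.compile(r".*Max. restarts.*")


def classify_alarm(message):
    """Return a label for the first filter that matches, or None."""
    if FAILED_TO_START.match(message):
        return "failed_to_start"
    if MAX_RESTARTS.match(message):
        return "max_restarts"
    return None


# Sample messages copied from the text, with placeholders left as-is.
msg1 = ("Controller: Probe '<probe_name>' FAILED to start "
        "(command = <probe_name>.exe) error = (5) Access is denied.")
msg2 = ("Controller: Max. restarts reached for probe '<probe_name>' "
        "(command = <startup java>)")
print(classify_alarm(msg1))
print(classify_alarm(msg2))
```

In the second pattern the unescaped dot after "Max" matches any single character, including the literal period, so the filter still matches as intended.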
vmware probe connection monitoring
The vmware probe may have issues pulling data from Virtual Center, e.g., vCenter is unable to read the configuration or discover any information from a host even though that ESXi host is reported as connected to vCenter.
A Warning alarm is generated whenever timeouts occur while collecting data from vCenter. The alarm indicates that something might be wrong in vCenter and prompts the vCenter admin to verify it.
Using the logmon probe:
In logmon, monitor the vmware.log for the log entry *VMWare API is unavailable* and send an alarm. In addition, use a nas Auto Operator rule with a message filter such as /.*VMWare API is unavailable.*/ to send an email notification.
[Connection tester - 0, vmware] (12) login failed, VMWare API is unavailable: com.vmware.vim25.InvalidLogin: null at com.nimsoft.probe.application.vmware.sdk.VmwareEnvAdaptor.login(VmwareEnvAdaptor.java:273)
or
"vNNNxxxxx is not responding (reason: Unexpected fatal error in data collection. Collection will not resume until probe is restarted. See log for details.)"
Another option for monitoring connectivity-related errors is to use a CLI/script, as explained here:
VMware PowerCLI Blog - Back to Basics: Connecting to vCenter or a vSphere Host
Using the net_connect probe:
Monitor vCenter reachability via ping.
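For a quick manual check outside the probe, a reachability test can be sketched in Python. This is a different technique from the ping check above: it opens a TCP connection instead of sending ICMP, so it needs no elevated privileges. The default port 443 and the function name are assumptions for illustration.

```python
import socket


def is_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within
    `timeout` seconds, otherwise False.

    Assumption: the target accepts TCP connections on `port`.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, is_reachable("vcenter.example.com") would test a hypothetical vCenter host name. A False result only shows that the TCP connection failed, not why it failed.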
Login attempts monitoring:
Monitor user login attempts in IM (Infrastructure Manager) - UIM (broadcom.com)
Monitor user login attempts in Operator Console (OC) - UIM (broadcom.com)