ASM OPMS - Documentation on the popstat metrics for monitoring

Can we get some documentation on what each of the metrics means for ASM OPMS monitoring?

Data Metrics Explained

Job Acceptance Metrics

| Counter | Description |
|---|---|
| accepted | Jobs successfully accepted for synchronous execution |
| accepted_async | Jobs successfully accepted for asynchronous execution with callbacks |
| reused | Jobs that reused recent cached results (within max-age) |
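
As a rough illustration of how the acceptance counters can be read together, the sketch below computes the share of jobs that were served from cached results. It assumes the counters are available as a plain dictionary of integers and that `reused` is counted separately from `accepted` and `accepted_async`. Neither assumption is confirmed here, and `reuse_ratio` is a hypothetical helper, not part of ASM OPMS.

```python
def reuse_ratio(counters):
    """Return the fraction of handled jobs that reused a cached result.

    Assumes (not confirmed by the source) that `reused` is counted
    separately from `accepted` and `accepted_async`.
    """
    accepted = counters.get("accepted", 0)
    accepted_async = counters.get("accepted_async", 0)
    reused = counters.get("reused", 0)
    total = accepted + accepted_async + reused
    if total == 0:
        return 0.0  # no jobs seen yet, avoid division by zero
    return reused / total


# Hypothetical sample values, for illustration only.
sample = {"accepted": 70, "accepted_async": 20, "reused": 10}
print(reuse_ratio(sample))  # 0.1
```

A persistently low ratio may simply mean that most requests fall outside the max-age window; it is not necessarily a problem.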

Error Metrics

| Counter | Description |
|---|---|
| bad_request | Malformed requests (400 errors), such as missing parameters or invalid JSON |
| agent_missing | Requested agent is not available (403 errors) |
| monitor_missing | Monitor definition not found in Redis (404 errors) |
| monitor_stale | Monitor ETag mismatch; the client needs to resend (412 errors) |
| monitor_unreadable | Stored monitor is corrupted or unreadable (500 errors) |
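
One way to use the error counters in a dashboard is to separate client-side problems (4xx) from server-side problems (5xx). The sketch below does this using the status codes listed for each error counter. The dictionary input and the helper name `split_client_server` are assumptions made for illustration.

```python
# HTTP status codes as listed for each error counter.
ERROR_COUNTER_STATUS = {
    "bad_request": 400,
    "agent_missing": 403,
    "monitor_missing": 404,
    "monitor_stale": 412,
    "monitor_unreadable": 500,
}


def split_client_server(counters):
    """Return (client_errors, server_errors) summed from the error counters."""
    client = 0
    server = 0
    for name, status in ERROR_COUNTER_STATUS.items():
        count = counters.get(name, 0)  # missing counters are treated as zero
        if 400 <= status < 500:
            client += count
        else:
            server += count
    return client, server


# Hypothetical sample values, for illustration only.
sample = {"bad_request": 3, "monitor_stale": 2, "monitor_unreadable": 1}
print(split_client_server(sample))  # (5, 1)
```

A rising client-side total usually points at the caller (bad payloads, stale ETags), while a rising server-side total points at the stored monitor data.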

Queue Rejection Metrics

| Counter | Description |
|---|---|
| reject_result_q | Job rejected because the result queue is full (503 errors) |
| reject_agent_q | Job rejected because the agent queue is full (503 errors) |
| reject_timeout | Job expired before completion (500 errors) |
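
Because `reject_result_q` and `reject_agent_q` both indicate a full queue, they are natural candidates for an alert. The following is a minimal sketch, assuming the counters arrive as a dictionary and that any value above a chosen threshold is worth flagging. The function name `queues_under_pressure` and the threshold are hypothetical.

```python
def queues_under_pressure(counters, threshold=0):
    """Return the queue rejection counters whose value exceeds `threshold`."""
    queue_counters = ("reject_result_q", "reject_agent_q")
    return [name for name in queue_counters if counters.get(name, 0) > threshold]


# Hypothetical sample values, for illustration only.
sample = {"reject_result_q": 0, "reject_agent_q": 12, "reject_timeout": 4}
print(queues_under_pressure(sample))  # ['reject_agent_q']
```

`reject_timeout` is left out of the check on purpose, since its description refers to job expiry rather than a full queue.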

Timeout Metrics

| Counter | Description |
|---|---|
| timeout_refused | Agent refused the job because the giveup time had already passed |
| timeout_abandoned | API abandoned the job during the grace period (the checker did not pick it up in time) |
| timeout_agent | Agent did not respond within the monitor timeout (504 errors) |

Monitor Management Metrics

| Counter | Description |
|---|---|
| monitor_read | Monitor definitions retrieved (GET operations) |
| monitor_updated | Monitor definitions updated (PUT operations) |
| monitor_deleted | Monitor definitions deleted (DELETE operations) |
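
If the counters are cumulative (an assumption; the source does not say whether they reset between reads), the useful figure for monitoring is usually the change between two snapshots. The sketch below shows one way to compute it. `counter_deltas` is a hypothetical helper, and treating a drop in value as a counter reset is also an assumption.

```python
def counter_deltas(previous, current):
    """Return the per-counter change between two snapshots.

    Assumes counters are cumulative. A negative difference is treated
    as a reset (assumed behaviour), in which case the current value
    is reported as the delta.
    """
    deltas = {}
    for name, value in current.items():
        diff = value - previous.get(name, 0)
        deltas[name] = diff if diff >= 0 else value
    return deltas


# Hypothetical snapshots, for illustration only.
before = {"monitor_read": 100, "monitor_updated": 5}
after = {"monitor_read": 130, "monitor_updated": 2, "monitor_deleted": 1}
print(counter_deltas(before, after))
# {'monitor_read': 30, 'monitor_updated': 2, 'monitor_deleted': 1}
```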