Smarts SAM: Notifications and trap processing are delayed

Article ID: 315786

Updated On:

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:
Smarts SAM notifications are delayed
Smarts SAM trap processing is delayed
Smarts SAM notifications are not received in a timely manner
Smarts SAM trap processing takes a long time

Environment

VMware Smart Assurance - SMARTS

Cause

This issue is typically caused by high network latency, which increases the time it takes Smarts SAM to report notifications and process traps. The following sections of this Cause statement explain why this is so.

Smarts SAM notification messaging latency and maximum throughput
When Smarts SAM receives a notification from an underlying server, it sends a message back to that server to request additional information. When that information is returned, SAM may find that it needs more. This exchange can happen as many as 3 times per notification. Assuming the network latency applies in both directions, each exchange incurs it twice, so the latency is encountered up to 6 times. For example, with a network latency of 39 ms, Smarts SAM incurs a minimum delay of 0.234 seconds (39 ms * 6) from latency alone to process a single notification. Even if no other factors are taken into consideration, that limits SAM to a maximum of just over 4 notifications/second (1 / 0.234 is about 4.27) for this particular data path.
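
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not part of Smarts SAM; the function names and the default of 3 exchanges per notification are assumptions taken from the worst case described in this section.

```python
# Illustrative sketch only: these helpers are hypothetical and are not part
# of any Smarts SAM API. They reproduce the latency arithmetic in the text.

def min_delay_seconds(one_way_latency_ms: float, exchanges: int = 3) -> float:
    """Minimum latency-only delay for one notification, in seconds.

    Each exchange crosses the network twice (request and response), so the
    latency is incurred 2 * exchanges times.
    """
    traversals = 2 * exchanges
    return one_way_latency_ms * traversals / 1000.0


def max_throughput_per_second(one_way_latency_ms: float, exchanges: int = 3) -> float:
    """Upper bound on notifications/second for a single data path."""
    return 1.0 / min_delay_seconds(one_way_latency_ms, exchanges)


if __name__ == "__main__":
    latency_ms = 39  # example value from this article
    print(f"delay per notification: {min_delay_seconds(latency_ms):.3f} s")
    print(f"maximum throughput: {max_throughput_per_second(latency_ms):.2f} notifications/s")
```

Substituting a latency measured in your own environment for the 39 ms example gives a rough upper bound for that data path; actual throughput will be lower for the reasons described in the next section.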

Factors limiting actual SAM notification throughput
Actual throughput on a system with a network latency of 39 ms will probably be much lower than 4 notifications/second. This is because a SAM server often has to perform other tasks, such as calculating impacts, receiving notifications through other data paths, executing scripts, scheduling escalations, synchronizing topology, and saving to the database. In addition, a SAM server may not be connected directly to the underlying server. If there are other servers in between, such as aggregating SAM servers, there will be even more network traffic and latency to contend with.

Impact of server queueing on notification delays
A notification throughput rate of less than 4/second will probably be acceptable, and may not even be noticeable, as long as notifications are coming in at a rate lower than this throughput rate. However, as soon as the rate of incoming notifications exceeds the throughput rate, SAM and the underlying server or servers start placing items in queues. If notifications are coming in faster than 4/second in this scenario, the queue will grow. The longer the queue grows, the longer it will take for a notification to reach SAM. Consider that if SAM is handling only 4 notifications/second, a queue with only 10,000 items in it (which is not usually considered particularly large) will take 2,500 seconds, or just over 41 minutes, to be completely processed.
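
The queueing effect can be estimated with a short Python sketch. It is a simplified model that assumes constant arrival and processing rates; the function names and the arrival rate of 5 notifications/second are hypothetical values chosen for illustration and do not come from Smarts SAM.

```python
# Illustrative sketch only: a constant-rate model of queue growth and drain
# time. Function names and the 5/second arrival rate are hypothetical.

def drain_time_seconds(queue_length: int, throughput_per_second: float) -> float:
    """Time to process an existing queue, assuming no new arrivals."""
    return queue_length / throughput_per_second


def backlog_after(seconds: float, arrival_rate: float, throughput: float) -> float:
    """Queue length after `seconds` of sustained load, starting from empty.

    The queue grows only while arrivals exceed throughput.
    """
    return max(0.0, (arrival_rate - throughput) * seconds)


if __name__ == "__main__":
    # Example from this article: 10,000 queued items at 4 notifications/second.
    drain = drain_time_seconds(10_000, 4)
    print(f"drain time: {drain:.0f} s ({drain / 60:.1f} minutes)")

    # Hypothetical load: 5 notifications/second arriving for one hour.
    print(f"backlog after 1 hour: {backlog_after(3600, 5, 4):.0f} items")
```

Even a small sustained excess of arrivals over throughput builds a backlog that takes a long time to clear, which is why reducing latency on the data path has such a large effect on notification delay.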

Resolution

To mitigate this issue, VMware recommends co-locating the Smarts Trap Adapter and the OI domains used for trap processing and critical notification processing in the same data center, with as little network latency between them as possible.