AWS probe discovers accounts, then after a few minutes starts sending alarms that it cannot contact the account, and EC2 instances are not being discovered

Article ID: 195841

Updated On:

Products

NIMSOFT PROBES DX Infrastructure Management

Issue/Introduction

Our AWS probe automation adds the accounts to the AWS probe and then marks them active. The accounts come up, are discovered, and seem to work properly. Then, after the first or second polling cycle, we start to get errors on the accounts. When I look at the erroring accounts in the raw config, their configuration is no longer complete, as if something is overwriting part of it. From that point on, the probe either goes red and won't start, or stays green and only sends alarms that it can't contact the account. Also, EC2 instances are discovered on some accounts but not all, and we do not understand why, since the probe should be discovering all of them. We have over 40 production accounts that we can't monitor at this time because the probe keeps going down as described above.

Cause

- Scalability limitation: with scripted configuration, no more than 5 AWS accounts per aws probe instance are recommended

Environment

Release : 20.1

Component : UIM - AWS 5.41

Resolution

According to Development, configuring the probe through any kind of script is neither tested nor recommended. As we saw during the WebEx session, the probe CFG becomes corrupted at some point when scripts are used to configure the probe.

During the same WebEx session, manual configuration worked as expected. There are no issues on the probe side, so customers who use any kind of scripted configuration will have to manage it on their own.

If the aws probe is configured manually, it should work as expected. In that case, based on our experience with other customers, we do not see any specific limitations.

Hardware resources can still be a limiting factor and should be sized according to the number of profiles you want to monitor. Adjusting the Java memory parameters, such as Xmx and Xms, might help to some extent, as might adding virtual processors.

MCS and probe configuration packages are the currently supported means of bulk configuration change for deployments. Services needs to be engaged for any custom scripting solution for probe configuration (for example, via the API), as this remains outside the scope of support.

That said, when using a scripted approach to AWS configuration, we recommend no more than 5 AWS accounts per aws probe instance. The aws probe keeps a copy of the CFG keys and values in memory in addition to the CFG file, so the CFG values may be overwritten while the probe is busy with its core functionality during regular poll cycles. You can try deactivating the aws probe, updating the values through the API, and then activating the probe. This ensures that the probe picks up the latest values from the CFG on startup and runs without any unexpected overwriting or corruption.
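
For teams that keep a scripted workflow on their own responsibility, the two recommendations above (at most 5 accounts per probe instance, and updating the CFG only while the probe is deactivated) can be sketched in Python as follows. This is a minimal illustration, not a supported tool: the function names `plan_probe_instances` and `update_while_inactive` are hypothetical, and the three callables passed in stand for whatever mechanism your automation already uses to deactivate the probe, write the values, and activate the probe. No UIM API call is shown or assumed here.

```python
from typing import Callable, Dict, Iterable, List


def plan_probe_instances(accounts: Iterable[str], max_per_probe: int = 5) -> List[List[str]]:
    """Split account names into groups of at most max_per_probe.

    Each group is meant for one aws probe instance (hypothetical helper).
    """
    if max_per_probe < 1:
        raise ValueError("max_per_probe must be at least 1")
    names = list(accounts)
    return [names[i:i + max_per_probe] for i in range(0, len(names), max_per_probe)]


def update_while_inactive(
    deactivate: Callable[[], None],
    apply_changes: Callable[[Dict[str, str]], None],
    activate: Callable[[], None],
    changes: Dict[str, str],
) -> None:
    """Deactivate the probe, write the CFG changes, then activate it again.

    The callables are placeholders for your own automation (assumption);
    this function only enforces the order of the three steps.
    """
    deactivate()
    try:
        apply_changes(changes)
    finally:
        # Activate even if the write raised, so the probe is not left down.
        activate()
```

With 42 accounts, `plan_probe_instances` returns nine groups (eight of 5 and one of 2). The `try`/`finally` is a design choice: it brings the probe back even when the write step fails. If you would rather leave the probe deactivated after a failed write so the CFG can be inspected first, move the `activate()` call after the `try` block instead.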