After upgrading to Aria Operations 8.18 HF5, Aria Operations graphs for metrics collected by Telegraf agents show gaps in data collection.
Aria Operations 8.18 HF 5 and later
The issue is caused by a limit on the metric buffer size in Aria Operations.
Currently, the metric buffer limit is "dynamic" in the sense that it changes based on the plugins used on the server:
Linux:
  os: 100
  mysql: 600
  oracledb: 600
  postgresql: 400
Windows:
  os: 100
  mysql: 600
  mssql: 1000
  msexchange: 400
  msiis: 500
  oracledb: 600
To determine which plugins are running in the environment, run the following commands on the Telegraf endpoint server:
1. sudo /opt/vmware/ucp/ucp-minion/bin/ucp-minion.sh --config /opt/vmware/ucp/salt-minion/etc/salt/grains --action xtract_config --dest_dir=/tmp/telegraf-bkp
2. /opt/vmware/ucp/ucp-telegraf/usr/bin/telegraf --config /tmp/telegraf-bkp/telegraf.conf --test
In certain environments, a specific plugin may generate more metrics than the dynamically set buffer limit in the configuration allows.
In this case, the output of the above commands showed that the Telegraf server was running an execution script that generated more metrics than the dynamically set buffer limit.
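To see which plugin produces the most metrics, the output of the --test run can be summarized per measurement. The following is a minimal sketch, not part of the product: the function name count_metrics_by_measurement is hypothetical, and it assumes that each metric in the --test output is printed on its own line starting with "> " followed by line protocol, with the measurement name ending at the first comma or space (escaped commas or spaces in measurement names are not handled).

```shell
# Hypothetical helper: summarize `telegraf --test` output read from stdin.
# Assumption: every metric line starts with "> " and counts as one metric.
count_metrics_by_measurement() {
  awk '/^> / {
         name = substr($0, 3)          # drop the leading "> "
         sub(/[, ].*$/, "", name)      # keep only the measurement name
         count[name]++
       }
       END { for (m in count) print count[m], m }' | sort -rn
}

# Example usage on the Telegraf endpoint server:
# /opt/vmware/ucp/ucp-telegraf/usr/bin/telegraf --config /tmp/telegraf-bkp/telegraf.conf --test 2>/dev/null \
#   | count_metrics_by_measurement
```

The measurements at the top of the list are the ones to compare against the per-plugin limits listed above.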
To resolve the issue:
Connect over SSH (for example, with PuTTY) to the cloud proxy that collects the data from the Telegraf agent and edit:
ucp/ucp-config-scripts/salt/pillar/metric_buffer_limit.sls
Current metric_buffer_limit:
default: 300
max_size: 4000
Change the "default" value to: 10000 (10K)
Change the "max_size" value to: 10000 (10K)
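The edit can also be scripted so that a backup of the original file is kept. The following is a sketch under stated assumptions: the function name set_buffer_limit is hypothetical, the file path is passed in by the caller, and the file is assumed to contain "default:" and "max_size:" keys each followed by a number on the same line, as shown above.

```shell
# Hypothetical helper: set "default" and "max_size" to the same value.
# Usage: set_buffer_limit <path-to-metric_buffer_limit.sls> <new-limit>
set_buffer_limit() {
  file="$1"
  limit="$2"
  # Accept only a plain positive integer as the new limit.
  case "$limit" in
    ''|*[!0-9]*) echo "limit must be a number" >&2; return 1 ;;
  esac
  # Keep a copy of the original file so the change can be reverted.
  cp -p "$file" "$file.bak" || return 1
  # Replace only the numeric values; indentation and other lines are untouched.
  sed -E \
    -e "s/^([[:space:]]*default:[[:space:]]*)[0-9]+/\1${limit}/" \
    -e "s/^([[:space:]]*max_size:[[:space:]]*)[0-9]+/\1${limit}/" \
    "$file.bak" > "$file"
}

# Example usage on the cloud proxy (path as given above):
# set_buffer_limit ucp/ucp-config-scripts/salt/pillar/metric_buffer_limit.sls 10000
```

To revert, copy the .bak file back over the edited file.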
Modifying the metric_buffer_limit.sls file unblocks the configuration so that more agents can be deployed with the updated buffer limit, but it is not a permanent solution.
Updates to metric_buffer_limit.sls will not persist across an upgrade of Aria Operations.
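Because the change does not survive an upgrade, it may help to check the file afterwards. The following is a minimal sketch with the same assumptions about the file layout as shown above; the function name check_buffer_limit is hypothetical.

```shell
# Hypothetical helper: verify that "default" and "max_size" have the expected value.
# Usage: check_buffer_limit <path-to-metric_buffer_limit.sls> <expected-limit>
check_buffer_limit() {
  file="$1"
  expected="$2"
  for key in default max_size; do
    # Read the first numeric value that follows the key.
    value="$(sed -n -E "s/^[[:space:]]*${key}:[[:space:]]*([0-9]+).*$/\1/p" "$file" | head -n 1)"
    if [ "$value" != "$expected" ]; then
      echo "${key} is '${value}', expected ${expected}" >&2
      return 1
    fi
  done
  echo "metric buffer limits are set to ${expected}"
}

# Example usage after an upgrade (path as given above):
# check_buffer_limit ucp/ucp-config-scripts/salt/pillar/metric_buffer_limit.sls 10000
```

A non-zero exit status indicates that the values were reset and the edit needs to be reapplied.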