We received reports from our users that they are unable to view hourly or daily resolution metrics on interfaces.
As-polled data is available, but the rollups do not appear to have run since the issue began.
A node in the Vertica cluster went offline at about the time the issue started. The timing is suspicious, but the two events might be unrelated.
DX NetOps Performance Management
A node has restarted, resulting in a stuck communication thread with the Data Repository.
Confirming this cause requires analysis of a stack trace by Broadcom Engineering.
To collect:
1) On the Data Aggregator (the active one in a Fault Tolerant DA setup), run:
kill -3 $(ps -ef | awk '/[o]rg.apache.karaf.main.Main/ {print $2}')
2) Collect a CA Remote Engineer from the same Data Aggregator.
Open a support issue for analysis.
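The kill -3 step above can be wrapped in a small guard so that the signal is sent only when exactly one matching process is found. This is a minimal sketch, not a Broadcom-supplied script: the function names da_pid_from_ps and request_thread_dump are hypothetical, and it assumes the Data Aggregator JVM appears in ps -ef with org.apache.karaf.main.Main on its command line, as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style output on stdin and print the PID
# (second column) of any line whose command contains the Karaf main class.
# The [o] bracket keeps the pattern from matching the awk process itself.
da_pid_from_ps() {
    awk '/[o]rg.apache.karaf.main.Main/ {print $2}'
}

# Hypothetical helper: send SIGQUIT (kill -3) only when exactly one PID matches.
# SIGQUIT asks a JVM to print a thread dump; it does not stop the process.
request_thread_dump() {
    pids=$(ps -ef | da_pid_from_ps)
    count=$(printf '%s\n' "$pids" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "Expected exactly one Data Aggregator process, found $count" >&2
        return 1
    fi
    kill -3 "$pids"
}
```

Source the file on the Data Aggregator and call request_thread_dump. If it reports zero or several processes, check the ps -ef output by hand before sending any signal.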
Restart the Data Aggregator. This creates new connections to the Data Repository, and rollups will start again.
Data from the period during which the rollups were not running will not be rolled up.
It will take some time to see the newly rolled up data as the rollup processes run.