The Data Collector is showing "Failed" and its Polling Status is "Not Connected", although Overall Status shows "Connected".
Restarting all the Data Collector services does not help.
Despite this, data should still be collected and appear in reports. To confirm, run a report and check that there are no gaps since the issue started.
Release : 21.2
Component : NetOps Data Collector
The Data Collector and Data Aggregator times are not in sync.
For example, in this case the times were out of sync by more than 30 seconds: the DA was ~35 seconds ahead of the DC.
Run the "date" command on each system and make sure the times on the Data Collector(s) and the Data Aggregator are in sync with each other.
The Data Collector(s) and Data Aggregator times need to be fully in sync (or within 1 second), or the Polling Status will show as Not Connected.
To resolve, correct the time on the affected system(s) and restart the services on the Data Collector(s). Polling Status will then change to "Collecting Data".
To show the time on the Data Aggregator and all Data Collector(s) together, run a command like the following on one of the systems.
The "date +%s" command prints a Unix epoch timestamp instead of a date string, which avoids time zone differences. The timestamps should all be the same or within 1 second of each other.
# for x in host1 host2 host3; do ssh -q $x 'hostname;date +%s'; done
It should return something like:
host1.xxxxx.yyyyy.net
1651593154
host2.xxxxx.yyyyy.net
1651593154
host3.xxxxx.yyyyy.net
1651593154
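With more than a few hosts, comparing the timestamps by eye gets tedious. The following is a minimal sketch, not part of the product, that reduces a list of epoch timestamps to a single drift value. The function name max_drift is hypothetical, and host1, host2 and host3 are the same placeholder host names used above.

```shell
# max_drift (hypothetical helper): read one epoch timestamp per line on
# stdin and print the difference between the largest and smallest value,
# in seconds. Empty lines are ignored.
max_drift() {
    awk '
        NF == 0 { next }                        # skip blank lines
        !seen   { min = $1; max = $1; seen = 1 }
        $1 < min { min = $1 }
        $1 > max { max = $1 }
        END { if (seen) print max - min }
    '
}

# Example usage (placeholder host names; prints only the timestamp per host):
#   for x in host1 host2 host3; do ssh -q "$x" 'date +%s'; done | max_drift
```

A result of 0 or 1 suggests the systems are in sync. Because the hosts are queried one after another, the time each ssh connection takes is included in the result, so a slightly larger value may reflect connection latency rather than real clock drift; a value around 35, as in the example case above, points to a genuine time difference.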