Issues with collecting and sending metrics to the Wavefront proxy from a Telegraf agent. Metrics sent from the Telegraf agent through a proxy do not appear when queried in the Wavefront user interface (UI).
If only a small number of metrics are missing from a particular plugin, check that the plugin configuration does not exclude those metrics. The log output may provide more detailed guidance. For examples, refer to the input and output plugin documentation in the applicable Telegraf GitHub repository: Telegraf on GitHub
For example, the vSphere input plugin configuration shown below, taken from telegraf.conf (typically located at /etc/telegraf/telegraf.conf), intentionally excludes several virtual machine and host metrics:
# Read metrics from one or many vCenters
[[inputs.vsphere]]
vm_metric_exclude = ["cpu.idle.summation","cpu.readiness.average","cpu.ready.summation","cpu.run.summation"]
host_metric_exclude = ["cpu.idle.summation","cpu.readiness.average","cpu.ready.summation","cpu.wait.summation"]
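To check quickly whether a specific metric is matched by one of these exclude lists, a short script can help. The following is a minimal Python sketch and is not part of Telegraf. The function name `excluded_by` is hypothetical, the parser only handles single-line string arrays like the ones above, and `fnmatch` is used as an approximation of Telegraf's glob matching.

```python
import fnmatch
import json
import re

# Matches single-line settings such as: vm_metric_exclude = ["a", "b"]
# Assumption: the array sits on one line and uses double-quoted strings.
EXCLUDE_RE = re.compile(r'^\s*(\w+_metric_exclude)\s*=\s*(\[.*\])\s*$')


def excluded_by(conf_text, metric):
    """Return the names of the *_metric_exclude settings that match metric."""
    hits = []
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out settings
        m = EXCLUDE_RE.match(line)
        if not m:
            continue
        patterns = json.loads(m.group(2))
        if any(fnmatch.fnmatchcase(metric, p) for p in patterns):
            hits.append(m.group(1))
    return hits


# Sample copied from the configuration excerpt above.
sample = '''
[[inputs.vsphere]]
vm_metric_exclude = ["cpu.idle.summation","cpu.readiness.average","cpu.ready.summation","cpu.run.summation"]
host_metric_exclude =["cpu.idle.summation","cpu.readiness.average","cpu.ready.summation","cpu.wait.summation"]
'''
print(excluded_by(sample, "cpu.run.summation"))
```

An empty result means no exclude list in the text matched the metric, so the cause of the missing data probably lies elsewhere.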
By default, the Linux agent does not log to /var/log/telegraf.log but to syslog (/var/log/syslog, or the equivalent for your OS distribution). To send Telegraf log output to its own file instead, uncomment the logfile line in telegraf.conf, as shown below. You can also enable debug mode to increase verbosity.
## Name of the file to be logged to when using the "file" log target. If set to
## the empty string then logs are written to stderr.
logfile = "/var/log/telegraf/telegraf.log"
## Run telegraf in debug mode
debug = false
## Run telegraf in quiet mode
quiet = false
If the plugins appear to load as expected but the metrics still do not appear, capture all points that Telegraf sends to a file on the local drive to verify that the data is being collected and is in the expected format. To do so, add the following to telegraf.conf:
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
In this example, the file "/tmp/metrics.out" captures all the Telegraf data being sent to the proxy. Use it to confirm that the Telegraf input plugins are gathering metric data with the appropriate tags before it is sent.
Not every error in the logs is actionable, and not every plugin is required. For example, the error message below is not related to the proxy. If you do not want to see this error and do not need the outputs.influxdb plugin, disable it by commenting out (#) the corresponding outputs.influxdb lines in telegraf.conf and restarting the Telegraf agent.
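To review the captured file, it can help to count how many points each measurement produced. The sketch below assumes the file output uses Telegraf's default InfluxDB line protocol serializer. The function name `measurement_counts` and the sample points are hypothetical, and only the measurement name is parsed (tags and fields are ignored).

```python
from collections import Counter


def measurement_counts(lines):
    """Count points per measurement name in influx line protocol text."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The measurement name ends at the first unescaped comma or space.
        name = []
        i = 0
        while i < len(line):
            ch = line[i]
            if ch == "\\" and i + 1 < len(line):
                name.append(line[i + 1])  # keep the escaped character
                i += 2
                continue
            if ch in ", ":
                break
            name.append(ch)
            i += 1
        counts["".join(name)] += 1
    return counts


# Hypothetical sample points, for illustration only.
sample = [
    "cpu,cpu=cpu0,host=web01 usage_idle=98.2 1700000000000000000",
    "cpu,cpu=cpu1,host=web01 usage_idle=97.5 1700000000000000000",
    "mem,host=web01 used_percent=41.3 1700000000000000000",
]
print(measurement_counts(sample))
```

To run it against the real capture, pass an open file object, for example `measurement_counts(open("/tmp/metrics.out"))`. A measurement that is missing from the counts was never gathered by the input plugin, which points to the plugin configuration and not the proxy.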
<date> hostname telegraf[22421]: <date> E! [outputs.influxdb] When writing to [http://localhost:8086]: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused
<date> hostname telegraf[22421]: <date> E! [agent] Error writing to outputs.influxdb: could not write any address.
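When the log is long, grouping error lines by the output plugin they mention makes it easier to separate proxy-related errors from unrelated ones. This is a minimal sketch that assumes error lines carry the `E!` marker and name the plugin as `outputs.<name>`, as in the messages above. The function name `errors_by_output` is hypothetical.

```python
import re
from collections import Counter

# An error line has "E!" followed, somewhere later, by "outputs.<name>".
ERROR_RE = re.compile(r'E! .*?\b(outputs\.\w+)')


def errors_by_output(lines):
    """Count error lines per output plugin mentioned in the message."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Sample lines copied from the log excerpt above.
sample = [
    '<date> hostname telegraf[22421]: <date> E! [outputs.influxdb] When writing to [http://localhost:8086]: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused',
    '<date> hostname telegraf[22421]: <date> E! [agent] Error writing to outputs.influxdb: could not write any address.',
]
print(errors_by_output(sample))
```

If the counts only name plugins other than outputs.wavefront, the errors are not the reason metrics are missing from the proxy.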
Error writing to outputs.wavefront: Wavefront sending error: unable to connect to Wavefront proxy
<date> hostname telegraf[22421]: <date> E! [agent] Error writing to outputs.wavefront: Wavefront sending error: unable to connect to Wavefront proxy at address: <wavefrontIPAddress>:2878, err: "dial tcp 10.10.100.100:2878: i/o timeout"
<date> hostname telegraf[22421]: <date> E! [outputs.influxdb] When writing to [http://localhost:8086]: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused
Confirm that the hostname or IP address configured for the proxy in telegraf.conf is correct, then troubleshoot connectivity between the Telegraf agent and the Wavefront proxy (nslookup, ping, telnet, packet capture, etc.).
Intermittent broken pipe messages logged by Telegraf are typically caused by a firewall closing the socket connection. For example:
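Where telnet is not installed on the agent host, the same TCP reachability test can be done with a few lines of Python. This is a sketch only: the function name `can_connect` is hypothetical, and the host name in the usage comment is a placeholder for the proxy address configured in telegraf.conf.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Usage (placeholder address, replace with your proxy host):
#   can_connect("wavefront-proxy.example.com", 2878)
```

A result of False from the agent host, combined with True when run on the proxy machine itself against 127.0.0.1, suggests that a firewall or routing problem sits between the two.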
telegraf[xxx]: <date> E! [agent] Error writing to outputs.wavefront: Wavefront sending error: write tcp <telegrafIP>:36948-><proxyIP>:2878: write: broken pipe
telegraf[xxx]: <date> I! connected to Wavefront proxy at address: <proxyIP>:2878
In these cases, it may be necessary to configure the wavefront output plugin to use HTTP instead of a socket connection. See "Telegraf connection errors when using load balancer with VMware Aria Operations for Applications proxy" for more details.
If there are no errors in the Telegraf logs, make sure the data captured by the Telegraf plugin is in a format the proxy accepts. Recent versions of the proxy log any problems with the quality or format of the metric data they receive.
Check /var/log/wavefront/wavefront.log on the proxy machine for errors. For example, a message like the following may be recorded:
<date> INFO [AbstractReportableEntityHandler:reject] [2878] blocked input: [WF-300 Cannot parse metric: ""Update" source="rules-service"
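To pull such rejections out of a busy proxy log, a small filter can list the listening port, the error code, and the rejected input. The sketch below assumes every rejection follows the layout of the sample line above (`[<port>] blocked input: [<code> <detail>`). The function name `blocked_inputs` is hypothetical.

```python
import re

# Assumed layout, based on the sample line above:
#   ... [<port>] blocked input: [<code> <detail>
BLOCKED_RE = re.compile(r'\[(\d+)\] blocked input: \[(WF-\d+)\s+(.*)')


def blocked_inputs(lines):
    """Yield (port, code, detail) for each 'blocked input' line."""
    for line in lines:
        m = BLOCKED_RE.search(line)
        if m:
            yield int(m.group(1)), m.group(2), m.group(3).rstrip()


# Sample copied from the log excerpt above.
sample_line = '<date> INFO [AbstractReportableEntityHandler:reject] [2878] blocked input: [WF-300 Cannot parse metric: ""Update" source="rules-service"'
for port, code, detail in blocked_inputs([sample_line]):
    print(port, code, detail)
```

For the real log, iterate over `open("/var/log/wavefront/wavefront.log")`. Comparing the rejected detail with the points captured in the Telegraf file output helps identify which plugin produced the malformed data.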
If further assistance is needed, please open a ticket through Support @ Broadcom