Proxy backlog (points) has been accumulating

Article ID: 370770

Updated On:

Products

Wavefront by VMware Aria Operations for Applications Observability DX OpenExplore

Issue/Introduction

An alert fires indicating that a proxy has a backlog; however, on reviewing the Tanzu Observability Service and Proxy Data dashboard, you find:

  • Ingestion from Proxy is successful
  • Metrics Points Per Second (PPS) is flowing with no indications of buildup on other charts.
  • Not receiving "No Data" alerts from the system.

Backlog means the proxy is queuing metric points in its spool files, which can happen for various reasons. If the spool files are read incorrectly, the Proxy Backlog Size chart will indicate a points backlog where there is none.

Environment

  • Impacted Versions - Wavefront Proxy v12.x
  • Fixed Version - Wavefront Proxy 13.0 [GA Release]
  • Issue regression reported on Wavefront Proxy version 13.1 and newer.

Resolution

Confirm whether the proxy(s) in question actually have data backlogged by reviewing:

  • Dashboard Charts
  • Proxy Logs
  • Proxy Spool Files

 


Dashboard and Charts - Various charts on the dashboard below track PPS flowing from the proxies to the Wavefront back-end.

Dashboard: Tanzu Observability Service and Proxy Data 
Section: Proxies: Overview
Charts:  Proxy Backlog Size, Queuing Reasons

  1. Proxy Backlog Size chart shows a "Points" backlog with little to no change.
  2. Proxy Backlog Size chart does not show a backlog of "Tasks".
    1. Confirm that the Proxy Backlog Size chart shows Tasks = 0.
  3. Queuing Reasons chart does not show Pushback, Retries, or Memory pressure.
    1. Pushback on points, retries, and memory pressure are 0, or are changing in value while the Proxy Backlog Size is not.

The above findings indicate a false reading of backlog. Continue below for additional investigation.

 


Proxy Logs - Review the proxy log, /var/log/wavefront/wavefront.log, for entries that indicate problems:

  • Connection or retransmit errors.
  • Errors in sending or blocked Metrics.
  • Errors in sending or blocked Histograms.

Review the proxy logs for changing tasks and points values. If the values are changing and there are no errors about being unable to transmit or retransmit, the proxy is processing data.

<epoch time> INFO [QueueController:printQueueStats] [2878.central] points backlog status: 14 tasks, 2487020 points
<epoch time> INFO [QueueController:printQueueStats] [2878.central] points backlog status: 31 tasks, 2526310 points
<epoch time> INFO [QueueController:printQueueStats] [2878.central] points backlog status: 26 tasks, 2589618 points
<epoch time> INFO [QueueController:printQueueStats] [2878.central] points backlog status: 39 tasks, 2546892 points

 

With no errors in the proxy logs and the number of tasks and points changing, the proxy is able to communicate with the Wavefront back-end and deliver its data. Continue below for additional investigation.
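To make the trend easier to read, the backlog status entries can be pulled out of the log. The following is a minimal sketch, assuming the log lines match the sample format shown above; the function names backlog_trend and backlog_is_changing are hypothetical helpers, not part of the proxy.

```shell
# Hypothetical helper: print "<tasks> <points>" for each backlog status line.
# Assumes lines end like the samples above:
#   ... points backlog status: 14 tasks, 2487020 points
backlog_trend() {
  log_file="${1:-/var/log/wavefront/wavefront.log}"
  sed -n 's/.*points backlog status: \([0-9][0-9]*\) tasks, \([0-9][0-9]*\) points.*/\1 \2/p' "$log_file"
}

# Hypothetical helper: succeeds (exit 0) when the points value differs
# between at least two entries, i.e. the backlog figure is not static.
backlog_is_changing() {
  backlog_trend "$1" | awk '!seen[$2]++ { n++ } END { exit !(n > 1) }'
}
```

For example, `backlog_is_changing /var/log/wavefront/wavefront.log && echo "values are changing"` prints the message only when the logged points values vary.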


Proxy Spool Files - Review the size of the spool files in /var/spool/wavefront-proxy/ to confirm whether they are empty.

Empty buffer (spool) file sizes for each type of proxy deployment:

  • 4096 file size on a standalone proxy.
  • 80k file size on a container proxy (pod).

Spool files on a Standard Proxy 

  • Log in to the Proxy Server and run the following command to show spool file sizes: du -ah /var/spool/wavefront-proxy/

If the spool file(s) are 4096 in size, no data is stored within them.

If some or all spool files are larger than 4096, the file(s) contain data. Next Action - Investigate to discover other issues that explain why the backlog is not clearing.
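On a proxy with many spool files, listing only the files above the empty size can be quicker than reading the full du output. The following is a minimal sketch, assuming the 4096 figure is a byte count of the file size; the function name spool_files_with_data is a hypothetical helper.

```shell
# Hypothetical helper: list spool files whose size exceeds a byte threshold.
# Assumption: the "4096" empty-file size quoted above is in bytes.
spool_files_with_data() {
  spool_dir="${1:-/var/spool/wavefront-proxy/}"
  empty_size_bytes="${2:-4096}"
  # -size +Nc matches files strictly larger than N bytes.
  find "$spool_dir" -type f -size +"${empty_size_bytes}c"
}
```

Running `spool_files_with_data` with no arguments checks the default spool directory. No output suggests that every file is at or below the empty size.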

 

Spool files in a Container Proxy.

  • Identify proxy pod:     kubectl get pods -n observability-system
  • Log into the proxy:     kubectl exec -it <wavefront-proxyPODNAME> -n observability-system -- /bin/bash
  • Run inside the pod:    du -h /wavefront-proxy/

If all spool files are 80k in size, no data is stored within them.

If some or all spool files are larger than 80k, the file(s) contain data. Next Action - Investigate to discover other issues that explain why the backlog is not clearing.
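The du output from the pod can also be filtered so that only entries above the empty size remain. The following is a minimal sketch, assuming the 80k figure corresponds to 80 KiB as reported by du -k; the function name entries_over_kib is a hypothetical helper that runs on the workstation, not inside the pod.

```shell
# Hypothetical helper: read `du -ak` style lines ("<KiB><TAB><path>") on stdin
# and print only the entries larger than a threshold in KiB.
# Assumption: the "80k" empty size quoted above means 80 KiB in du -k output.
entries_over_kib() {
  awk -v limit="${1:-80}" '$1 + 0 > limit + 0 { print }'
}

# Example (pod name is a placeholder, as in the steps above):
# kubectl exec <wavefront-proxyPODNAME> -n observability-system -- du -ak /wavefront-proxy/ | entries_over_kib 80
```

Note that du -ak also reports directory totals, so directory lines will normally appear in the result; only the file entries are relevant to this check.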

 

Spool files when Persistent Volume Storage is used for the Containerized Proxy.

Check the yaml of the proxy pod to see where the PV is mounted

  • Describe the PV to see where the PV is mounted:        kubectl describe pv <PV_NAME>
  • Navigate to mount point and run Command:                  du -h /var/spool/wavefront-proxy/

If all spool files are 80k in size, no data is stored within them.

If some or all spool files are larger than 80k, the file(s) contain data. Next Action - Investigate to discover other issues that explain why the backlog is not clearing.

 

Conclusion

The spool files hold the actual data until it can be successfully transmitted to the Wavefront back-end. If the files are empty, the backlog shown in the Wavefront charts is a false reading.

Reboot the Proxy to confirm whether this clears the backlog.

For further assistance, please contact Broadcom Support.

Additional Information

See Proxy Troubleshooting Section: Manage the Proxy Queue