Dynatrace Golang injection feature could cause apps and PAS components to crash

Article ID: 297777


Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Symptoms:
The Diego Database bbs.stdout.log or cf events reports that Golang applications exit with status code 141:
{"timestamp":"1549056677.836944342","source":"bbs","message":"bbs.request.crash-actual-lrp.complete","log_level":1,"data":{"crash_reason":"APP/PROC/WEB: Exited with status 141","instance_key":{"instance_guid":"GUID","cell_id":"GUID"},"key":{"process_guid":"8d55fa46-cdcd-432f-b0a2-9cb01b47f107-00be8e94-51bd-4d45-b599-ca6dc3c2516b","index":0,"domain":"cf-apps"},"session":"29748187.1"}}
The following processes exit throughout the platform and restart with no errors in their logs:
  • adapter
  • auctioneer
  • bbs
  • cc_uploader
  • doppler
  • gorouter
  • locket
  • loggregator_trafficcontroller
  • metron_agent
  • rep
  • route_emitter
  • route_registrar
  • routing-api
  • scheduler
  • silk-daemon
  • tps_watcher
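
To scan a large bbs.stdout.log for the first symptom, the crash entries can be filtered programmatically. The following is a minimal Go sketch, assuming each log line is a single JSON object shaped like the bbs entry shown above. The type bbsLogLine and the function isStatus141Crash are hypothetical names introduced for illustration and are not part of any Pivotal or Dynatrace tooling.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// bbsLogLine models only the fields of a bbs log entry that this sketch
// needs. All other fields in the entry are ignored when decoding.
type bbsLogLine struct {
	Message string `json:"message"`
	Data    struct {
		CrashReason string `json:"crash_reason"`
	} `json:"data"`
}

// isStatus141Crash reports whether line is a crash-actual-lrp entry whose
// crash reason ends with "Exited with status 141". It returns an error if
// the line is not valid JSON.
func isStatus141Crash(line string) (bool, error) {
	var entry bbsLogLine
	if err := json.Unmarshal([]byte(line), &entry); err != nil {
		return false, err
	}
	isCrash := strings.HasPrefix(entry.Message, "bbs.request.crash-actual-lrp")
	is141 := strings.HasSuffix(entry.Data.CrashReason, "Exited with status 141")
	return isCrash && is141, nil
}

func main() {
	// The sample entry from the Symptoms section of this article.
	sample := `{"timestamp":"1549056677.836944342","source":"bbs","message":"bbs.request.crash-actual-lrp.complete","log_level":1,"data":{"crash_reason":"APP/PROC/WEB: Exited with status 141","instance_key":{"instance_guid":"GUID","cell_id":"GUID"},"key":{"process_guid":"8d55fa46-cdcd-432f-b0a2-9cb01b47f107-00be8e94-51bd-4d45-b599-ca6dc3c2516b","index":0,"domain":"cf-apps"},"session":"29748187.1"}}`

	match, err := isStatus141Crash(sample)
	if err != nil {
		fmt.Println("could not parse line:", err)
		return
	}
	fmt.Println("status 141 crash:", match)
}
```

In practice the function would be called once per line while reading the log file, for example with bufio.Scanner.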

Environment


Cause

Dynatrace uses LD_PRELOAD to inject an agent into the Golang process. This agent is responsible for collecting telemetry about the injected Golang process, and it needs to open a TCP connection to the Dynatrace ActiveGate server.

The problem surfaces when the Golang process receives a SIGPIPE signal while writing to the TCP socket associated with the Dynatrace ActiveGate server. By default, the Golang process exits when SIGPIPE is received unless the signal is caught and handled. The reported exit status 141 is 128 plus the SIGPIPE signal number (13).

You can identify which processes are communicating with the ActiveGate server using the command below.

Note: Replace "ACTIVEGATE HOST OR IP" with the hostname or IP of the ActiveGate server.
router/ed383a22-7578-40c0-8a00-bf317b925504:~# lsof -n | egrep "<ACTIVEGATE HOST OR IP>:9999" | awk '{print $1}' | uniq
oneagento
oneagentl
metron
gorouter

Resolution

If you experience these symptoms, Pivotal advises contacting Dynatrace support for assistance in implementing the workaround, which includes disabling Deep Monitoring for Pivotal Cloud Foundry (PCF) applications and PAS Runtime components.


This Dynatrace OneAgent bug impacts versions 1.157.x and later. Dynatrace currently plans to backport a fix to version 1.159.x.