Route Emitter Condition Causes Routes to Not be Emitted

Article ID: 298143


Products

VMware Tanzu Application Service for VMs

Issue/Introduction

A rare bug in the route-emitter service sometimes prevents it from connecting to the nats service. When this happens, application instances on the affected diego_cell cannot register their routes with the gorouter. As a result, requests sent to applications on that diego_cell return a 404 error because the gorouter does not recognize the routes. A topology diagram and more information on the Tanzu Application Service (TAS) routing architecture can be found in the TAS documentation.

If this bug is present, the route-emitter logs first show the following error:
{"timestamp":"2024-03-28T13:25:07.647293475Z","level":"error","source":"route-emitter","message":"route-emitter.route-broadcast-scheduler.received-invalid-external-service-start","data":{"error":"unexpected end of JSON input","name":"router","payload":"","session":"2"}}
After that, the following two log lines repeat continuously:
{"timestamp":"2024-03-29T12:39:01.646358374Z","level":"info","source":"route-emitter","message":"route-emitter.route-broadcast-scheduler.retrying","data":{"name":"router","session":"2"}}
{"timestamp":"2024-03-29T12:39:01.646498627Z","level":"info","source":"route-emitter","message":"route-emitter.route-broadcast-scheduler.greeting-external-service","data":{"name":"router","session":"2"}}
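The log signature above can be checked for programmatically. The following is a minimal sketch, not an official diagnostic tool: it scans route-emitter log lines for the invalid-start error followed by repeated retrying messages. The function names and the min_retries threshold are hypothetical and chosen for illustration only; the message strings are copied from the log lines shown above.

```python
import json

# Message names copied from the log lines shown in this article.
INVALID_START = (
    "route-emitter.route-broadcast-scheduler."
    "received-invalid-external-service-start"
)
RETRYING = "route-emitter.route-broadcast-scheduler.retrying"


def count_signature(lines):
    """Return (invalid_start_count, retries_seen_after_first_invalid_start)."""
    invalid_starts = 0
    retries = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        if message == INVALID_START:
            invalid_starts += 1
        elif message == RETRYING and invalid_starts > 0:
            retries += 1
    return invalid_starts, retries


def looks_affected(lines, min_retries=3):
    """Heuristic: the error appeared and was followed by repeated retries.

    min_retries is an arbitrary illustrative threshold, not a documented value.
    """
    invalid_starts, retries = count_signature(lines)
    return invalid_starts > 0 and retries >= min_retries
```

To use it, open the route-emitter log file from the affected diego_cell and pass the file object to looks_affected. The log location varies by deployment, so confirm the path on the cell before relying on it.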


Environment

Product Version: 3.0

Resolution

This issue is fixed in diego-release 2.99.0, which is included in the following versions of the TAS and IST tiles:

  • TAS 6.0.4
  • TAS 5.0.14
  • TAS 4.0.24
  • TAS 2.13.40
  • TAS 2.11.58
  • IST 6.0.4
  • IST 5.0.14
  • IST 4.0.24
  • IST 2.13.37
  • IST 2.11.52

The following workaround is available until upgrade is possible:

  • Restart the route-emitter service on the impacted diego_cell.
diego_cell/660729bc-6c0a-4bee-91f8-cf1aedd6a71a:~# monit restart route_emitter
  • Once the route_emitter process has restarted, it should be able to establish a fresh connection to nats and register the routes properly.

    Metrics that are helpful to monitor for this condition:
    HTTPRouteNATSMessagesEmitted or RoutesRegistered
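One way to watch these metrics is to alert when a value stops increasing for a given cell. The sketch below is illustrative only. It assumes the metric is exposed as a cumulative counter and that a route-emitter that cannot reach nats stops incrementing it; neither assumption is stated in this article. The function name, the sample format, and the window length are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def counter_stalled(
    samples: Sequence[Tuple[float, float]], window_seconds: float
) -> bool:
    """Return True if a counter did not change over the most recent window.

    samples holds (unix_timestamp, counter_value) pairs sorted by time,
    oldest first. This input format is an assumption for illustration.
    """
    if len(samples) < 2:
        return False  # not enough data to judge

    latest_ts, latest_value = samples[-1]
    cutoff = latest_ts - window_seconds

    # Use the newest sample at or before the cutoff as the baseline.
    baseline: Optional[float] = None
    for ts, value in samples:
        if ts <= cutoff:
            baseline = value
        else:
            break

    if baseline is None:
        return False  # history does not cover the whole window

    # Equality rather than <= so that a counter reset (for example after a
    # process restart) is not reported as a stall.
    return latest_value == baseline
```

For example, with samples of HTTPRouteNATSMessagesEmitted collected once a minute for one diego_cell, counter_stalled(samples, 300) would flag a cell whose value has not moved in five minutes. Choose the window based on how often routes are normally emitted in the environment.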