Queries for apps deployed in Tanzu Application Service might fail during upgrades or cell recreation, when container networking is provided by NSX-T

Article ID: 322570

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
When a TAS application is queried during an upgrade, or during any other event in which Diego cells are evacuated, the CF gorouter responds with status code 499. This happens even if all instances of the application are running and there is no network connectivity issue between the gorouter and the Diego cells.
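To confirm the symptom, it can help to tally status codes in the gorouter access log over the evacuation window. The following is a minimal sketch, not a supported tool. It assumes each log line contains the quoted request line immediately followed by the status code (for example `"GET / HTTP/1.1" 499`). The function name `tally_status_codes` and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: the status code is the three-digit field right after the quoted request line.
STATUS_RE = re.compile(r'"[A-Z]+ \S+ HTTP/[\d.]+" (\d{3})\b')

def tally_status_codes(lines):
    """Count HTTP status codes found in access-log-style lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:  # lines that do not match the assumed format are skipped
            counts[int(match.group(1))] += 1
    return counts

# Hypothetical sample lines for illustration, not real gorouter output.
sample = [
    'app.example.com - [2021-01-01T00:00:00Z] "GET / HTTP/1.1" 200 0 12',
    'app.example.com - [2021-01-01T00:00:01Z] "GET / HTTP/1.1" 499 0 0',
    'app.example.com - [2021-01-01T00:00:02Z] "GET /health HTTP/1.1" 499 0 0',
]
counts = tally_status_codes(sample)
print(counts[499], "of", sum(counts.values()), "requests returned 499")
```

A burst of 499 entries that lines up with cell evacuation, while the application instances report as running, is consistent with the behavior described here.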

Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

During cell evacuation, application instances are supposed to remain available until they shut down or are destroyed by the rep job. Until this happens, the gorouter entry for a given instance is still active, and requests can be routed to it.

However, with NSX-T, one of the following might happen:
- The Open vSwitch drain script might remove connectivity for instances before the corresponding containers are terminated.
- NCP might remove the logical port for an instance being evacuated as soon as the "new" application instance with the same index is created.

Resolution

This issue is resolved in NCP 3.0.2.5 and NCP 3.1.2.4.
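A quick way to compare a deployed NCP version string against the fixed releases is sketched below. It is an illustration only. It assumes the fix is tracked per release line (3.0 and 3.1) and returns `None` for any other line, because this article does not say which other releases contain the fix. The names `FIXED_IN`, `parse_version`, and `has_fix` are hypothetical.

```python
# Fixed releases named in this article, keyed by (major, minor) release line.
FIXED_IN = {(3, 0): (3, 0, 2, 5), (3, 1): (3, 1, 2, 4)}

def parse_version(text):
    """Turn a dotted version string such as '3.1.2.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def has_fix(version_text):
    """Return True or False for the 3.0 and 3.1 lines, None for any other line."""
    version = parse_version(version_text)
    fixed = FIXED_IN.get(version[:2])
    if fixed is None:
        return None  # release line not covered by this article
    # Pad to four components so that '3.1.2' compares as 3.1.2.0.
    padded = version + (0,) * (4 - len(version))
    return padded >= fixed

for candidate in ("3.0.2.4", "3.0.2.5", "3.1.0", "3.1.2.4"):
    print(candidate, has_fix(candidate))
```

The per-line lookup matters because a plain numeric comparison would treat 3.1.0 as newer than 3.0.2.5, even though it predates the 3.1.2.4 fix.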

Workaround:
There is no workaround available.
Increasing the number of instances for an application and reducing max_in_flight to 1 might mitigate the issue, but it can still occur.