[TAS+NSX-T] Duplicate T1 gateways on T0 router in NSX-T causes application container network errors



Article ID: 297966


Updated On:

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Symptoms

Intermittently, some application containers experience network errors. Specifically, we have seen the following symptoms.


Symptom 1

The application crashes during staging or startup with the following CredHub interpolate error:
2019-12-26T21:23:17.02-0500 [APP/PROC/WEB/1] ERR Unable to interpolate credhub refs: Unable to interpolate credhub references: Post https://credhub.service.cf.internal:8844/api/v1/interpolate: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Symptom 2

VMware Tanzu Application Service (TAS) for VMs Gorouter instances report "i/o timeout" errors when connecting to backend containers, and consequently clients get HTTP 502 errors when making requests to application containers.
gorouter.stdout.log:
{"log_level":3,"timestamp":1574145601.6562595,"message":"backend-endpoint-failed","source":"vcap.gorouter","data":{"route-endpoint":{"ApplicationId":"ad5a084a-30b0-427d-b062-a09904eb3342","Addr":"192.168.130.20:61001","Tags":{"component":"route-emitter"},"RouteServiceUrl":""},"error":"dial tcp 192.168.130.20:61001: i/o timeout","attempt":1,"vcap_request_id":"8f7ce4be-fb2f-4187-5ba8-aae5e007e15b"}}

Cause

We have only observed this type of error in NSX-T environments running 2.4.x. The problem is related to a known bug fixed in NSX-T 2.4.3. Below is the description from the NSX-T 2.4.3 release notes.
  • Fixed Issue 2448254 - Intermittent network connectivity loss to VMs in multi-tenant environments with overlapping IP subnets.
The problem manifests itself when you have multiple VMware Tanzu Application Service (TAS) for VMs environments in the same NSX-T environment and two orgs, one from each TAS environment, use the same IP subnet for application containers. As an example, the following diagram shows a problematic topology. The blue T1 router uses subnet 10.255.18.0/24 and its default gateway is 10.255.18.1. The red T1 router uses the same subnet 10.255.18.0/24 and the same default gateway 10.255.18.1. Because of the duplicated gateway IPs, the blue T1 router may fail to populate its ARP entries into an ESXi host H when the red T1 router has already done so earlier. As a consequence, an app container from org-blue will have connectivity errors once deployed or restarted in a Diego cell running on host H, because the distributed v-switch on this host fails to forward packets to/from this container when the required ARP entries are missing.
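The overlapping-subnet condition described above can be checked mechanically before it causes trouble. The following is a minimal sketch, assuming you have collected the downlink subnet of each T1 router from the NSX-T Manager UI; the router names and CIDR values here are illustrative:

```python
from ipaddress import ip_network

# Hypothetical downlink subnets collected from the NSX-T Manager UI,
# one entry per T1 router (values are illustrative).
t1_downlinks = {
    "T1-blue": "10.255.18.1/24",
    "T1-red": "10.255.18.1/24",
    "T1-green": "10.255.93.1/24",
}

def find_overlaps(downlinks):
    """Return pairs of T1 routers whose container subnets overlap."""
    # strict=False lets us pass the gateway address with its prefix length.
    nets = [(name, ip_network(cidr, strict=False)) for name, cidr in downlinks.items()]
    overlaps = []
    for i, (name_a, net_a) in enumerate(nets):
        for name_b, net_b in nets[i + 1:]:
            if net_a.overlaps(net_b):
                overlaps.append((name_a, name_b))
    return overlaps

print(find_overlaps(t1_downlinks))  # → [('T1-blue', 'T1-red')]
```

Any pair reported here is a candidate for the duplicate-gateway conflict described above.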



This issue is reported in Support cases 217240 and 228260, both with a TAS+NSX-T environment. VMware Enterprise PKS also utilizes NSX-T for container networking and allocates T1 routers for Kubernetes namespaces. Therefore, it is possible to hit this bug in a VMware Enterprise PKS environment.

Environment

Product Version: 2.5

Resolution

Fix:

Upgrade NSX-T to 2.4.3 (Fixed Issue 2448254) or 2.5.1 (Fixed Issue 2442933).

Workaround:

Note: This workaround should be applied only when upgrading is not possible in the short term.

Identify the ESXi host and NSX-T T1 router in question

1. Identify the Diego cell which hosts the app container.

  • If the container crashed or was restarted in a new cell, you can still find the GUID of the original Diego cell from the app log.
   2019-12-26T21:23:17.02-0500 [APP/PROC/WEB/1] ERR Unable to interpolate credhub refs: Unable to interpolate credhub references: Post https://credhub.service.cf.internal:8844/api/v1/interpolate: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
   2019-12-26T21:23:17.02-0500 [CELL/SSHD/1] OUT Exit status 0
   2019-12-26T21:23:22.29-0500 [CELL/1] OUT Cell 4c969dfe-cb81-4e13-90ce-bfa7071c57a1 stopping instance 5771f24f-d674-4fe3-72d2-3736
  • If the app containers have not crashed or restarted, for example as in Symptom 2 mentioned earlier, you can find the app GUID and container IP from Gorouter logs.
access.log:
<app-route> - [2019-11-19T06:39:56.655+0000] "GET /boxcare/api/initial-admin-job-codes HTTP/1.1" 502 0 67 "https://edepot.portn0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36" "100.64.0.5:34362" "192.168.130.20:61001" x_forwarded_for:"10x_forwarded_proto:"https" vcap_request_id:"8f7ce4be-fb2f-4187-5ba8-aae5e007e15b" response_time:15.002142646 app_id:"ad5a084a-30b0-427d-b062-a09904eb3342" app_index:"1" anid:"5256615e503bddfb" x_b3_parentspanid:"-" b3:"5256615e503bddfb-5256615e503bddfb"
gorouter.stdout.log:
{"log_level":3,"timestamp":1574145601.6562595,"message":"backend-endpoint-failed","source":"vcap.gorouter","data":{"route-endpoint":{"ApplicationId":"ad5a084a-30b0-427d-b062-a09904eb3342","Addr":"192.168.130.20:61001","Tags":{"component":"route-emitter"},"RouteServiceUrl":""},"error":"dial tcp 192.168.130.20:61001: i/o timeout","attempt":1,"vcap_request_id":"8f7ce4be-fb2f-4187-5ba8-aae5e007e15b"}}
Then run the following commands to get the mapping between cell IP and container IP.
bosh ssh diego_cell/0    # ssh to an arbitrary cell
cfdot actual-lrp-groups | jq '. | select(.instance.process_guid | startswith("app_GUID"))' | egrep '"(address|instance_address)'
If the subcommand actual-lrp-groups is deprecated in your TAS version, you can run the following command instead:
cfdot actual-lrps | jq '. | select(.process_guid | startswith("app_GUID"))'
Example output with cell IPs highlighted:
$ cfdot actual-lrps | jq '. | select(.process_guid | startswith("ad5a084a-30b0-427d-b062-a09904eb3342"))' | egrep '"(address|instance_address)'
  "address": "10.193.71.40",
  "instance_address": "192.168.130.20",
  "address": "10.193.71.41",
  "instance_address": "192.168.130.22",
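The jq filter above can also be expressed as a short script, which is convenient if you have already captured the LRP records to a file. This is a minimal sketch; the records below are illustrative and merely mirror the example output (real process GUIDs carry an extra version suffix after the app GUID):

```python
import json

# Illustrative actual-lrp records, as emitted by `cfdot actual-lrps`
# (one JSON object per line); trimmed to the fields used here.
raw_records = """
{"process_guid": "ad5a084a-30b0-427d-b062-a09904eb3342-x", "address": "10.193.71.40", "instance_address": "192.168.130.20"}
{"process_guid": "ad5a084a-30b0-427d-b062-a09904eb3342-x", "address": "10.193.71.41", "instance_address": "192.168.130.22"}
{"process_guid": "deadbeef-0000-0000-0000-000000000000", "address": "10.193.71.42", "instance_address": "192.168.130.30"}
"""

def cell_to_container(records, app_guid):
    """Map Diego cell IP -> container IP for the LRPs of one app."""
    pairs = []
    for line in records.strip().splitlines():
        lrp = json.loads(line)
        if lrp["process_guid"].startswith(app_guid):
            pairs.append((lrp["address"], lrp["instance_address"]))
    return pairs

print(cell_to_container(raw_records, "ad5a084a-30b0-427d-b062-a09904eb3342"))
# → [('10.193.71.40', '192.168.130.20'), ('10.193.71.41', '192.168.130.22')]
```

Each pair is (cell IP, container IP); the cell IPs feed into step 2 below.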
2. Get the VM-CID based on the Diego cell IP or GUID obtained in step 1.
bosh vms | grep "<cell_IP/GUID>"
Search for the VM-CID in vCenter to identify the ESXi host.

3. Get the org name of the app. Search for the org name in the NSX-T Manager UI to identify the corresponding T1 router.

Review the logical switch which is attached to the downlink port of the T1 router.



Identify the VNI (Virtual Network Identifier) of the logical switch.

4. Log in to a terminal on the ESXi host identified in step 2. Get the name of the Virtual Distributed Switch:
[root@gtdc-az2esx-25:~] net-vdr -C -l
 
Host locale Id:             00000000-0000-0000-0000-000000000000
 
Connection Information:
-----------------------
 
DvsName           VdrPort           NumLifs  DRvmac
-------           -------           -------  -------
NVDS-Overlay      vdrPort           344      02:50:56:56:44:52
    Teaming Policy: Default Teaming
    Uplink   : uplink-1(67108866): 00:50:56:fc:33:9a(Non-team member)
    Uplink   : uplink-2(67108868): 00:50:56:ea:37:32(Non-team member)
Use the VNI obtained in step 3 to verify whether the T1 router's ARP entries are missing. Recall the example diagram of the problematic NSX-T topology introduced in the Cause section: the blue T1 router (downlink VNI 73903) has no ARP entries.
[root@gtdc-az2esx-25:~] net-vdl2 -M ip -s NVDS-Overlay -n 73903
IP entry count: 0
[root@gtdc-az2esx-25:~] net-vdl2 -M arp -s NVDS-Overlay -n 73903
Legend: [V:Valid], [U:in Use],
Legend: [N:Unknown - Not known by control plane],
Legend: [S:Seen - learnt or extended during the last ageing period],
Legend: [A:Aged - not updated in during the last ageing period]
ARP Entry Count:        0
The blue T1 router's ARP entries are missing because this host's local v-switch has already been populated with a PROTECTED ARP entry associated with the red T1 router (downlink VNI 73796).
[root@gtdc-az2esx-25:~] net-vdl2 -M ip -s NVDS-Overlay -n 73796
IP entry count: 7
       ...
       IP:             10.255.18.1
       MAC:            02:50:56:56:44:52
       Flags:          1(PROTECTED)
       vxlanID:        73796
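Comparing the IP tables of the two suspect VNIs makes the conflict easy to spot: the duplicated gateway IP appears only under whichever T1 router won the race. The following parsing sketch assumes you have saved the `net-vdl2 -M ip` output for each VNI to text; the sample captures are illustrative and mirror the output above:

```python
import re

# Illustrative `net-vdl2 -M ip` output captured per VNI.
outputs = {
    73903: "IP entry count: 0\n",
    73796: (
        "IP entry count: 7\n"
        "       IP:             10.255.18.1\n"
        "       MAC:            02:50:56:56:44:52\n"
        "       Flags:          1(PROTECTED)\n"
        "       vxlanID:        73796\n"
    ),
}

def vnis_claiming(ip, captures):
    """Return the VNIs whose IP table contains the given gateway IP."""
    return [vni for vni, text in captures.items()
            if re.search(rf"IP:\s+{re.escape(ip)}\b", text)]

# The duplicated gateway 10.255.18.1 is present only under the red VNI;
# the blue VNI (73903) has no entries at all.
print(vnis_claiming("10.255.18.1", outputs))  # → [73796]
```

If the gateway IP shows up under one VNI while the other VNI's table is empty, you are looking at the conflict described in the Cause section.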


Proceed to correct the gateway of the identified subnet for one of the TAS foundations

For each pair of affected T1 routers (e.g., the blue and red T1 routers in our example), perform the workaround as follows:

1. In the NSX-T Manager UI, navigate to: Advanced Networking &amp; Security > Routers > select the checkbox of the T1 router > Configuration > Router Ports

2. Select the checkbox of the downlink router port (e.g., the blue T1 router downlink port has subnet 10.255.18.0/24). Click EDIT.

3. In the pop-up window, change the gateway IP to 10.255.18.254.
Note: Make sure to choose a free IP, not yet occupied by any container or by any other T1 router gateway.
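A quick sanity check for the replacement gateway can be scripted. This is a minimal sketch, assuming you have a list of IPs already in use on the subnet (container IPs plus the duplicated gateway); the values are illustrative:

```python
from ipaddress import ip_address, ip_network

subnet = ip_network("10.255.18.0/24")
# Illustrative: IPs already in use on this subnet (containers plus the
# duplicated gateway shared with the other T1 router).
in_use = {ip_address("10.255.18.1"), ip_address("10.255.18.10"), ip_address("10.255.18.11")}

def is_valid_new_gateway(candidate, net, used):
    """A replacement gateway must sit inside the subnet, must not be the
    network or broadcast address, and must not already be in use."""
    ip = ip_address(candidate)
    return (ip in net
            and ip != net.network_address
            and ip != net.broadcast_address
            and ip not in used)

print(is_valid_new_gateway("10.255.18.254", subnet, in_use))  # → True
print(is_valid_new_gateway("10.255.18.1", subnet, in_use))    # → False (duplicated gateway)
```

Choosing from the top of the range (e.g., .254) makes a collision with DHCP-style container allocation, which typically grows from the bottom, less likely.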

4. Click SAVE. This will populate ARP entries associated to this new gateway IP down to each ESXi host.

5. Restart all applications under the T1 router to restore connectivity to these apps. After restarting, apps will pick up the new gateway IP.

The following two sequences are both valid options:
  • Change the blue T1 gateway to 10.255.18.254 and the red T1 gateway to 10.255.18.253, then restart all apps under both T1 routers.
  • Change the blue T1 gateway to 10.255.18.254 and restart all apps under the blue T1 router; then change the red T1 gateway to 10.255.18.253 and change it back to 10.255.18.1. Because the blue T1 router no longer uses 10.255.18.1, the re-populated ARP entries for the red T1 gateway no longer conflict, so the apps under the red T1 do not need to be restarted.

Note: Please contact NSX-T support if the above workaround does not help.