In a multi-tenant environment (multiple T1s) with the following conditions:
Topology Information:
- Bosh IP: 172.16.80.2
- Bosh IP 172.16.80.2 is SNAT’ed at the T0 to 192.168.80.2
- K8s Master VM IP: 172.16.90.2
- K8s Master VM IP 172.16.90.2 is SNAT’ed at the T0 to 192.168.90.2
- There’s a DNAT rule configured at the T0 to translate 192.168.80.2 to 172.16.80.2 for North-South communication
Note:
This DNAT rule is configured for illustration purposes in this use case. The issue occurs even if the DNAT rule is configured for a completely different workflow.
East-West traffic between workloads behind different T1s is impacted when they communicate over their private (that is, non-NATed) IPs. In the above example, communication is impacted when 172.16.80.2 tries to communicate with 172.16.90.2.
Note:
Starting in version 2.4.2, the SNAT rule takes effect between the T0 and the T1, so the traffic is SNATed twice: once on egress toward the destination, and again when the traffic returns from the destination. As a result, the workload drops the traffic. The following is the packet walk for the above example:
Packet Walk:
Request:
Response:
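As a rough illustration of the double SNAT described above, the following Python sketch models the addresses from the topology. It is a simplified model under stated assumptions, not the actual NSX-T datapath: the `Packet` class and the `t0_nat` and `packet_walk` helpers are hypothetical names, and the sketch assumes the T0 applies the DNAT rule to the destination and the SNAT rules to the source of every packet that crosses it, in both directions.

```python
from dataclasses import dataclass

# Addresses taken from the topology above.
SNAT_RULES = {
    "172.16.80.2": "192.168.80.2",  # Bosh
    "172.16.90.2": "192.168.90.2",  # K8s Master VM
}
DNAT_RULES = {"192.168.80.2": "172.16.80.2"}


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def t0_nat(pkt: Packet) -> Packet:
    """Hypothetical model of the T0: DNAT the destination, SNAT the source.

    Assumption: both rule types are evaluated on every packet crossing the T0.
    """
    dst = DNAT_RULES.get(pkt.dst, pkt.dst)
    src = SNAT_RULES.get(pkt.src, pkt.src)
    return Packet(src=src, dst=dst)


def packet_walk(client: str, server: str):
    """Return (request seen by server, response seen by client, accepted?)."""
    request_at_server = t0_nat(Packet(src=client, dst=server))
    # The server replies to whatever source address it saw on the request.
    response = Packet(src=server, dst=request_at_server.src)
    response_at_client = t0_nat(response)
    # The client only accepts a reply that matches its original flow.
    accepted = (
        response_at_client.src == server and response_at_client.dst == client
    )
    return request_at_server, response_at_client, accepted


if __name__ == "__main__":
    req, resp, ok = packet_walk("172.16.80.2", "172.16.90.2")
    print("request at server: ", req)
    print("response at client:", resp)
    print("accepted by client:", ok)
```

In this model the client sent its request to 172.16.90.2 but sees the reply arrive from a translated source address, so the reply does not match the original flow and is dropped, which is consistent with the behavior described above.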
This issue is resolved in VMware NSX-T Data Center 2.5.
Workaround:
If there are no services (such as NAT or Firewall) on the T1 SR, it is safe to detach the T1 from the edge cluster. This is a two-step process, as illustrated below:
If you are not able to perform this workaround or have any additional questions, file a support request with VMware Support and quote this Knowledge Base article ID (71363) in the problem description.