Note: There are multiple scenarios in which a remote BOSH agent cannot communicate with NATS.
For example: an IaaS network configuration issue, a duplicate IP on the network, or a corrupted stemcell build.
This article aims to resolve the errors that occur in a vSphere with NSX-T (Hybrid Topology) environment that has incorrect or missing NAT rules on the T0 router, specifically when no SNAT rule is set up to translate non-routable IPs (the PKS cluster) to routable IPs (the PKS Management Plane, which includes the BOSH, Ops Manager, PKS API and DB VMs, etc.).
In a Hybrid NSX-T topology, you will see errors similar to those in the Issue section above when source NAT fails to translate non-routable IPs (usually the subnet where you deploy your TKGI/PKS cluster) to routable IPs (usually the subnet where your Management Plane is deployed: Ops Manager, BOSH, PKS/TKGI, etc.).
To resolve this issue, verify the SNAT rules configured on the T0 router, then correct any incorrect rule or add the rule if it is missing.
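The verification step can be sketched as a small offline check. This is only an illustration: it assumes the T0 NAT rules have already been exported into a list of dictionaries, and the field names used here (action, source_network, translated_network, enabled) are hypothetical and may not match what your NSX-T version returns.

```python
import ipaddress


def find_covering_snat_rules(rules, cluster_cidr):
    """Return the enabled SNAT rules whose source network contains cluster_cidr.

    `rules` is a list of dicts with hypothetical keys: action,
    source_network, translated_network, enabled.
    """
    cluster = ipaddress.ip_network(cluster_cidr)
    matches = []
    for rule in rules:
        # Skip anything that is not an active SNAT rule.
        if rule.get("action") != "SNAT" or not rule.get("enabled", True):
            continue
        source = ipaddress.ip_network(rule["source_network"])
        # The rule is usable only if it covers the whole cluster subnet.
        if cluster.subnet_of(source):
            matches.append(rule)
    return matches


if __name__ == "__main__":
    # Made-up rule set: the only SNAT rule covers the wrong subnet.
    exported_rules = [
        {"action": "SNAT", "source_network": "172.30.0.0/24",
         "translated_network": "10.40.14.40", "enabled": True},
    ]
    if not find_covering_snat_rules(exported_rules, "172.31.0.0/24"):
        print("No enabled SNAT rule covers the cluster subnet")
```

An empty result points to a missing, disabled, or mis-scoped rule, which is the condition this article describes.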
The following is an example of how to set up the SNAT rule on the T0 router:
- PKS Management Plane CIDR is 10.40.14.0/24 (routable IPs)
- CIDR used for a TKGI/PKS cluster is 172.31.0.0/24 (non-routable IPs)
The SNAT rule that allows the non-routable IPs to communicate with the routable IPs translates the cluster source CIDR 172.31.0.0/24 to 10.40.14.40,
where
10.40.14.40 is a routable IP that can access the PKS Management Plane VMs.
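To make the effect of the example concrete, the following sketch models what the rule does to a packet's source address. It is a simplified illustration and not NSX-T code: the CIDRs and the translated IP come from the example above, while the function name, the sample host addresses, and the choice to apply the translation only to traffic destined for the Management Plane CIDR are assumptions.

```python
import ipaddress

# Values taken from the example above.
CLUSTER_CIDR = ipaddress.ip_network("172.31.0.0/24")   # non-routable
MGMT_CIDR = ipaddress.ip_network("10.40.14.0/24")      # routable
TRANSLATED_IP = ipaddress.ip_address("10.40.14.40")


def apply_snat(src_ip, dst_ip):
    """Return the source address the destination sees after SNAT.

    Assumption: the rule matches only cluster traffic destined for the
    Management Plane subnet; all other traffic is left untouched.
    """
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src in CLUSTER_CIDR and dst in MGMT_CIDR:
        return TRANSLATED_IP
    return src


if __name__ == "__main__":
    # Hypothetical cluster VM talking to a hypothetical Management Plane VM.
    print(apply_snat("172.31.0.15", "10.40.14.10"))
```

With the rule in place, the Management Plane VM sees the routable address 10.40.14.40 as the source and can send its reply back through the T0 router. Without it, the reply is addressed to a non-routable 172.31.0.x address and never reaches the cluster VM.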