The customer's Tanzu platform runs on vSphere with NSX-T. Apply Changes on the BOSH Director failed with the following error message:
Deploying:
Creating instance 'bosh/0':
Waiting until instance is ready:
Post "https://vcap:<redacted>@##.##.##.##:6868/agent": dial tcp ##.##.##.##:6868: i/o timeout
Exit code 1
When we looked up the BOSH Director VM in the vSphere Console, we found that no IP address had been assigned to it.
Tanzu Platform running on vSphere with NSX 9.0.0
To troubleshoot this kind of issue:
1. Search for the BOSH Director IP address in the vSphere Console to verify whether it is already in use by another VM. If no other VM is using this IP address, an address conflict can be ruled out.
2. To narrow down the issue, change the network adapter of the affected BOSH Director VM to a non-NSX port group or distributed switch.
3. If the BOSH Director VM is assigned an IP address after the network adapter is changed, the issue lies in the NSX networking layer. Advise the customer to investigate further on the network side.
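After step 2 or 3, it can help to confirm from a machine on the same network whether the agent port from the error message (6868) is reachable again. The following is a minimal sketch using only the Python standard library; the function name `agent_port_reachable` and the default timeout are illustrative assumptions, not part of any BOSH tooling.

```python
import socket
import sys


def agent_port_reachable(host: str, port: int = 6868, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only tests TCP reachability, which is the step
    that fails with "dial tcp ...:6868: i/o timeout" in the error above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


if __name__ == "__main__":
    # Usage: python check_agent.py <bosh-director-ip>
    if len(sys.argv) != 2:
        sys.exit("usage: check_agent.py <bosh-director-ip>")
    ok = agent_port_reachable(sys.argv[1])
    print("agent port reachable" if ok else "agent port NOT reachable")
```

A result of "NOT reachable" while the VM is attached to the NSX segment, followed by "reachable" after moving it to a non-NSX port group, is consistent with the networking diagnosis in step 3.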
In this customer's case, the BOSH Director did not get an IP address after a reboot, and its interface remained down only when it was attached to an NSX segment. The cause was identified as a JDK bug that brought down the link between the ESXi host and the control plane; this caused the port to be blocked and left the interface down.
This issue may be related to a known NSX bug: "NSX is Impacted by JDK-8330017: ForkJoinPool Stops Executing Tasks Due to ctl Field Release Count (RC) Overflow".
For environments running affected versions (VMware NSX 4.2.0.x, 4.2.1.0, 4.2.1.1, 4.2.1.2, 4.2.1.3, and 9.0.0.0), implement a preventative monthly rolling reboot schedule as a workaround.
This issue is resolved in VMware NSX 4.2.1.4, 4.2.2.0, 9.0.1.0 and above, available at Broadcom downloads.
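The affected and fixed version lists above can be encoded in a short check. This is a rough sketch only: the function name `is_affected` is hypothetical, it assumes the version is given as a dotted numeric string such as "4.2.1.3" (missing trailing components are treated as 0, and anything past the fourth component is ignored), and it covers only the versions named in this article.

```python
def is_affected(version: str) -> bool:
    """Return True if `version` is one of the NSX versions listed as affected.

    Affected per this article: 4.2.0.x, 4.2.1.0 through 4.2.1.3, and 9.0.0.0.
    Fixed per this article: 4.2.1.4, 4.2.2.0, 9.0.1.0 and above.
    """
    # Keep the first four numeric components and pad short versions with zeros.
    parts = tuple(int(p) for p in version.split(".")[:4])
    parts = parts + (0,) * (4 - len(parts))

    if parts[:3] == (4, 2, 0):
        return True              # any 4.2.0.x build
    if parts[:3] == (4, 2, 1):
        return parts[3] <= 3     # 4.2.1.0 through 4.2.1.3
    return parts == (9, 0, 0, 0)


if __name__ == "__main__":
    for v in ("4.2.0.1", "4.2.1.3", "4.2.1.4", "4.2.2.0", "9.0.0", "9.0.1.0"):
        print(v, "affected" if is_affected(v) else "not listed as affected")
```

A result of "not listed as affected" only means the version is outside the list in this article; it is not a statement about versions the article does not mention.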