Symptoms:
- The NSX upgrade to 4.1.0 or later fails on the VCF side during the NSX Manager postcheck.
- The upgrade completes successfully on the NSX side.
- Retrying the NSX upgrade from the VCF side fails in the manager cluster precheck with the error 'Check for the version sync status of UC on all the MP nodes'.
The LCM logs report errors similar to the following:
2024-01-09T05:52:16.564+0000 ERROR [vcf_lcm,0000000000000000,0000,upgradeId=########-####-####-####-########7c82,resourceType=NSX_T_PARALLEL_CLUSTER,resourceId=vcf-nsx01.vmware.local:_ParallelClusterUpgradeElement,bundleElementId=########-####-####-####-########6931] [c.v.e.s.l.p.i.n.s.NsxtUpgradeStageRunner,Upgrade-6] performPostcheckForComponent failed for MP with exception {}
com.vmware.vapi.client.exception.ConnectionException: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
2024-01-09T05:52:16.565+0000 ERROR [vcf_lcm,0000000000000000,0000,upgradeId=########-####-####-####-########7c82,resourceType=NSX_T_PARALLEL_CLUSTER,resourceId=han-m01-nsx01.north.vib.corp:_ParallelClusterUpgradeElement,bundleElementId=########-####-####-####-########6931] [c.v.e.s.l.p.i.nsxt.NsxtUpgradeUtil,Upgrade-6] Handling NSX Exception
com.vmware.vapi.client.exception.ConnectionException: Read timed out
com.vmware.vapi.client.exception.ConnectionException: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
2024-01-09T05:52:16.565+0000 ERROR [vcf_lcm,0000000000000000,0000,upgradeId=########-####-####-####-########7c82,resourceType=NSX_T_PARALLEL_CLUSTER,resourceId=han-m01-nsx01.north.vib.corp:_ParallelClusterUpgradeElement,bundleElementId=########-####-####-####-########6931] [c.v.e.s.l.p.i.n.s.NsxtManagerClusterUpgradeStageRunner,Upgrade-6] NSX Manager failed in Upgrade Coordinator postchecks NSXT_MANAGER_CLUSTER_UPGRADE_FAILED_POSTCHECK
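To check a large LCM log for this signature, a short script can help. The following is a minimal sketch, not a supported tool: it assumes the log text has already been read into a string, and the function name `find_postcheck_timeout` is hypothetical. It uses only the two strings visible in the excerpt above and reports a postcheck failure only when a read timeout appears earlier in the same text.

```python
import re

# Signature strings copied from the log excerpt above.
POSTCHECK_FAILURE = "NSXT_MANAGER_CLUSTER_UPGRADE_FAILED_POSTCHECK"
READ_TIMEOUT = "java.net.SocketTimeoutException: Read timed out"

# Timestamp at the start of a log entry, e.g. 2024-01-09T05:52:16.564+0000
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4} ")


def find_postcheck_timeout(log_text):
    """Return timestamps of postcheck-failure entries preceded by a read timeout.

    Hypothetical helper: the pairing rule (timeout seen earlier in the text)
    is an assumption made for illustration, not documented LCM behavior.
    """
    hits = []
    seen_timeout = False
    for line in log_text.splitlines():
        if READ_TIMEOUT in line:
            seen_timeout = True
        if POSTCHECK_FAILURE in line and seen_timeout:
            match = ENTRY_START.match(line)
            hits.append(match.group(0).strip() if match else "unknown")
    return hits
```

Pass the function the contents of the LCM log file from your SDDC Manager. An empty list means the signature was not found in that text, which does not by itself rule out the issue.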
Starting with NSX 4.1.0, the NSX Manager precheck is blocked on the NSX side once the NSX upgrade has completed. If the upgrade fails the postcheck on the VCF side but has completed on the NSX side, retrying the upgrade from the VCF side still triggers the NSX Manager precheck on the NSX side. That precheck returns the UC version sync issue, which causes the upgrade to fail on the VCF side.
VMware is aware of this issue and is working towards a fix in a future release.