After a Transport Node reboot or a Transport Node configuration change, the NSX Transport Node configuration state remains stuck at "100% Configuration complete".
Logs: /var/log/proton/nsxapi.log
2024-09-04T13:32:29.839Z ERROR L2HostConfigTaskExecutor4 TransportNodeAsyncServiceImpl 5162 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP600" level="ERROR" subcomp="manager"] Failed to update TransportNodeState to TransportNodeState [transportNodeId=TransportNode/###################, hostId= ###################, fabricNodeType=HOST_NODE, transportNodeRevision=8, hostConfigStatus=SUCCESS, failureCode=0, failureMessage=null, errorParams=null, appInitSucceed=false, logicalSwitchFullSyncSucceed=true, transportZoneIds=[TransportZoneEndpoint/ ###################], hostSwitchStates=[HostSwitchState [switchId = ###################, switchName = ###################, teamingPolicyType = LOADBALANCE_SRCID, staticIpPoolId = IpPool/ ###################, defaultGateway = ###################, netmask = ###################, ipv6DefaultGateway = , ipv6PrefixLength = , staticIpv6PoolId = null, pnics = [Pnic [deviceName=Uplink 1, uplinkName=Uplink-1], Pnic [deviceName=Uplink 2, uplinkName=Uplink-2]], endpoints = [Endpoint [deviceName=vmk10, ipAddress= ###################label=109569, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0], Endpoint [deviceName=vmk11, ipAddress= ###################, label=27649, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0]], transportZones = [ ###################], ipAssignmentType = STATIC_IP_POOL, ipv6AssignmentType = NONE, overlayEncap = GENEVE, vdrMac = ###################, cpuConfigurations = [], hostSwitchType = VDS, portgroupTransportZoneId = TransportZone/ ###################, transportZoneEndpoints =[TransportZoneEndpoint [transportZoneId=TransportZone/ ###################, transportZoneProfileIds [TransportZoneProfile [profileId= ###################, type=BFD]]], vifCollectionStatus = VIF_COLLECTION_STATUS_INVALID]], oldHostSwitchStates = [], isAutoRectifying = false, currentHostConfigRpcTimeout = 150000, 
tnType=NSX_CONFIGURED, hostConfigId= ###################, mmState=DISABLED, isNsxaDown=false, restoreState=null, vmkMigrationTnRevision=-1, tnSuccessRealizeTime=1718045706569, tnLastSuccessRealizeTime=1723829589568, tnLastSuccessRealizeRevision=8, tnHostSwitchesMigrationDetails=null, revision=2375]. ErrorCode: common-services:600
com.vmware.nsx.management.common.exceptions.ObjectNotFoundException: null
at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.getPoolIfExists(IPAMServiceImpl.java:1137) ~[?:?]
at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.release(IPAMServiceImpl.java:756) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.IPAMUtils.releaseIpThrow(IPAMUtils.java:170) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseVtepResources(VtepPopulatorUtil.java:731) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseExistingVteResources(VtepPopulatorUtil.java:703) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:645) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:630) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateInTransaction(TransportNodeAsyncServiceImpl.java:2620) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateWithRetry(TransportNodeAsyncServiceImpl.java:2677) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnState(TransportNodeAsyncServiceImpl.java:2599) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.stagePersistingTnState(TransportNodeAsyncServiceImpl.java:2490) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.syncTransportNode(TransportNodeAsyncServiceImpl.java:3162) ~[?:?]
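To check whether a given nsxapi.log contains this signature, the log can be scanned for the error message followed shortly by the exception and the IPAM stack frame shown above. The following is a minimal sketch, not a supported tool: the function name and the lookahead window of 20 lines are assumptions chosen for illustration, and the three marker strings are copied from the log excerpt above.

```python
from typing import Iterable, List

# Marker strings copied from the log excerpt in this article.
FAILURE_MARKER = "Failed to update TransportNodeState"
EXCEPTION_MARKER = "ObjectNotFoundException"
FRAME_MARKER = "IPAMServiceImpl.getPoolIfExists"

# Assumption: the exception and stack frame appear within this many
# lines after the error message. Adjust if the log layout differs.
DEFAULT_LOOKAHEAD = 20


def find_stuck_tn_signatures(
    lines: Iterable[str], lookahead: int = DEFAULT_LOOKAHEAD
) -> List[int]:
    """Return 1-based line numbers of error messages that match the signature.

    A match is an error line containing FAILURE_MARKER that is followed,
    within `lookahead` lines, by both EXCEPTION_MARKER and FRAME_MARKER.
    """
    buffered = list(lines)
    hits: List[int] = []
    for index, line in enumerate(buffered):
        if FAILURE_MARKER not in line:
            continue
        window = buffered[index + 1 : index + 1 + lookahead]
        has_exception = any(EXCEPTION_MARKER in entry for entry in window)
        has_frame = any(FRAME_MARKER in entry for entry in window)
        if has_exception and has_frame:
            hits.append(index + 1)
    return hits
```

The function accepts any iterable of lines, so an open file handle for /var/log/proton/nsxapi.log (or a copy taken from a support bundle) can be passed in directly. An empty result only means this particular signature was not found in that file; it does not rule out other causes of the same symptom.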
VMware NSX-T Data Center
VMware NSX
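This issue occurs when an IP pool is deleted while it is still in use by a Transport Node, and the Transport Node is later reconfigured with a different IP pool. The system then tries to release the IP address from the old, deleted IP pool. Because that pool no longer exists in the environment, the release fails with the ObjectNotFoundException shown above, and the configuration state stays stuck at "100% Configuration complete".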
This issue is resolved in VMware NSX 4.1.2.0, available at Broadcom downloads.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
If you believe you have encountered this issue and are unable to upgrade, open a support case with Broadcom Support and refer to this KB article.
For more information, see Creating and managing Broadcom support cases.
While the Transport Node will reject new configuration updates, the existing datapath remains unaffected.