After a Transport Node reboot or a Transport Node configuration change, the NSX Transport Node configuration state remains stuck at "100% Configuration complete".
The following error is logged in /var/log/proton/nsxapi.log:
2024-09-04T13:32:29.839Z ERROR L2HostConfigTaskExecutor4 TransportNodeAsyncServiceImpl 5162 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP600" level="ERROR" subcomp="manager"] Failed to update TransportNodeState to TransportNodeState [transportNodeId=TransportNode/###################, hostId= ###################, fabricNodeType=HOST_NODE, transportNodeRevision=8, hostConfigStatus=SUCCESS, failureCode=0, failureMessage=null, errorParams=null, appInitSucceed=false, logicalSwitchFullSyncSucceed=true, transportZoneIds=[TransportZoneEndpoint/ ###################], hostSwitchStates=[HostSwitchState [switchId = ###################, switchName = ###################, teamingPolicyType = LOADBALANCE_SRCID, staticIpPoolId = IpPool/ ###################, defaultGateway = ###################, netmask = ###################, ipv6DefaultGateway = , ipv6PrefixLength = , staticIpv6PoolId = null, pnics = [Pnic [deviceName=Uplink 1, uplinkName=Uplink-1], Pnic [deviceName=Uplink 2, uplinkName=Uplink-2]], endpoints = [Endpoint [deviceName=vmk10, ipAddress= ###################label=109569, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0], Endpoint [deviceName=vmk11, ipAddress= ###################, label=27649, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0]], transportZones = [ ###################], ipAssignmentType = STATIC_IP_POOL, ipv6AssignmentType = NONE, overlayEncap = GENEVE, vdrMac = ###################, cpuConfigurations = [], hostSwitchType = VDS, portgroupTransportZoneId = TransportZone/ ###################, transportZoneEndpoints =[TransportZoneEndpoint [transportZoneId=TransportZone/ ###################, transportZoneProfileIds [TransportZoneProfile [profileId= ###################, type=BFD]]], vifCollectionStatus = VIF_COLLECTION_STATUS_INVALID]], oldHostSwitchStates = [], isAutoRectifying = false, currentHostConfigRpcTimeout = 150000, 
tnType=NSX_CONFIGURED, hostConfigId= ###################, mmState=DISABLED, isNsxaDown=false, restoreState=null, vmkMigrationTnRevision=-1, tnSuccessRealizeTime=1718045706569, tnLastSuccessRealizeTime=1723829589568, tnLastSuccessRealizeRevision=8, tnHostSwitchesMigrationDetails=null, revision=2375]. ErrorCode: common-services:600
com.vmware.nsx.management.common.exceptions.ObjectNotFoundException: null
at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.getPoolIfExists(IPAMServiceImpl.java:1137) ~[?:?]
at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.release(IPAMServiceImpl.java:756) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.IPAMUtils.releaseIpThrow(IPAMUtils.java:170) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseVtepResources(VtepPopulatorUtil.java:731) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseExistingVteResources(VtepPopulatorUtil.java:703) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:645) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:630) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateInTransaction(TransportNodeAsyncServiceImpl.java:2620) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateWithRetry(TransportNodeAsyncServiceImpl.java:2677) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnState(TransportNodeAsyncServiceImpl.java:2599) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.stagePersistingTnState(TransportNodeAsyncServiceImpl.java:2490) ~[?:?]
at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.syncTransportNode(TransportNodeAsyncServiceImpl.java:3162) ~[?:?]
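The log signature above can be checked for with a small script. The following is a minimal sketch, not an official tool: it assumes a copy of nsxapi.log is readable, that the stack trace follows the error entry within a few lines (the LOOKAHEAD_LINES value is an arbitrary assumption), and the function names are hypothetical. The search strings are taken directly from the excerpt above.

```python
import re
from typing import Iterable, List

# Strings copied from the log excerpt in this article.
STATE_UPDATE_ERROR = "Failed to update TransportNodeState"
EXCEPTION_MARKER = "ObjectNotFoundException"
IPAM_FRAME = "IPAMServiceImpl.getPoolIfExists"

# Assumption: the exception and IPAM frame appear within this many
# lines after the error entry. Adjust if the log layout differs.
LOOKAHEAD_LINES = 10

# Extracts the ID that follows "transportNodeId=TransportNode/".
TN_ID_PATTERN = re.compile(r"transportNodeId=TransportNode/([^,\s]+)")


def find_stuck_transport_nodes(lines: Iterable[str]) -> List[str]:
    """Return Transport Node IDs whose state update failed during IP release."""
    buffered = list(lines)
    matches = []
    for index, line in enumerate(buffered):
        if STATE_UPDATE_ERROR not in line:
            continue
        # Inspect only the lines that immediately follow the error entry.
        window = buffered[index + 1 : index + 1 + LOOKAHEAD_LINES]
        has_exception = any(EXCEPTION_MARKER in entry for entry in window)
        has_ipam_frame = any(IPAM_FRAME in entry for entry in window)
        if has_exception and has_ipam_frame:
            found = TN_ID_PATTERN.search(line)
            matches.append(found.group(1) if found else "unknown")
    return matches


def scan_log_file(path: str) -> List[str]:
    """Hypothetical helper: scan one log file and return matching node IDs."""
    with open(path, errors="replace") as handle:
        return find_stuck_transport_nodes(handle)
```

Calling scan_log_file("/var/log/proton/nsxapi.log") returns one entry per matching error, so a Transport Node that hits the error repeatedly appears more than once. A match only indicates that the log signature is present; it does not confirm the cause on its own.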
VMware NSX-T Data Center
VMware NSX
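This issue occurs when an IP pool is deleted while it is still in use by a Transport Node (TN), and the TN is later reconfigured with a different IP pool. During the reconfiguration, the system tries to release the IP address back to the old, deleted IP pool. Because that pool no longer exists in the environment, the release fails and the configuration state remains at "100% Configuration complete".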
This issue is fixed in VMware NSX 4.1.2.0.
If an upgrade is not feasible at the moment and a workaround is needed, open a Broadcom support case and reference this KB article number.
No configuration changes can be made to the affected Transport Node. The datapath continues to work.