NSX Transport node configuration state stuck at "100% Configuration complete"

Article ID: 381363

Products

VMware NSX

Issue/Introduction

After a transport node reboot or a transport node configuration change, the NSX transport node configuration state remains stuck at "100% Configuration complete".

The following error is logged in /var/log/proton/nsxapi.log on the NSX Manager:

2024-09-04T13:32:29.839Z ERROR L2HostConfigTaskExecutor4 TransportNodeAsyncServiceImpl 5162 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP600" level="ERROR" subcomp="manager"] Failed to update TransportNodeState to TransportNodeState [transportNodeId=TransportNode/###################, hostId= ###################, fabricNodeType=HOST_NODE, transportNodeRevision=8, hostConfigStatus=SUCCESS, failureCode=0, failureMessage=null, errorParams=null, appInitSucceed=false, logicalSwitchFullSyncSucceed=true, transportZoneIds=[TransportZoneEndpoint/ ###################], hostSwitchStates=[HostSwitchState [switchId = ###################, switchName =  ###################, teamingPolicyType = LOADBALANCE_SRCID, staticIpPoolId = IpPool/ ###################, defaultGateway =  ###################, netmask =  ###################, ipv6DefaultGateway = , ipv6PrefixLength = , staticIpv6PoolId = null, pnics = [Pnic [deviceName=Uplink 1, uplinkName=Uplink-1], Pnic [deviceName=Uplink 2, uplinkName=Uplink-2]], endpoints = [Endpoint [deviceName=vmk10, ipAddress= ###################label=109569, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0], Endpoint [deviceName=vmk11, ipAddress= ###################, label=27649, macAddress= ###################, macAutoAllocated=false, ipv6Address=, linkLocalIpv6Address=, tepGroupLabel=0]], transportZones = [ ###################], ipAssignmentType = STATIC_IP_POOL, ipv6AssignmentType = NONE, overlayEncap = GENEVE, vdrMac =  ###################, cpuConfigurations = [], hostSwitchType = VDS, portgroupTransportZoneId = TransportZone/ ###################, transportZoneEndpoints =[TransportZoneEndpoint [transportZoneId=TransportZone/ ###################, transportZoneProfileIds [TransportZoneProfile [profileId= ###################, type=BFD]]], vifCollectionStatus = VIF_COLLECTION_STATUS_INVALID]], oldHostSwitchStates = [], isAutoRectifying = false, currentHostConfigRpcTimeout = 150000, 
tnType=NSX_CONFIGURED, hostConfigId= ###################, mmState=DISABLED, isNsxaDown=false, restoreState=null, vmkMigrationTnRevision=-1, tnSuccessRealizeTime=1718045706569, tnLastSuccessRealizeTime=1723829589568, tnLastSuccessRealizeRevision=8, tnHostSwitchesMigrationDetails=null, revision=2375]. ErrorCode: common-services:600
com.vmware.nsx.management.common.exceptions.ObjectNotFoundException: null
        at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.getPoolIfExists(IPAMServiceImpl.java:1137) ~[?:?]
        at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.release(IPAMServiceImpl.java:756) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.IPAMUtils.releaseIpThrow(IPAMUtils.java:170) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseVtepResources(VtepPopulatorUtil.java:731) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseExistingVteResources(VtepPopulatorUtil.java:703) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:645) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulatorUtil.releaseAllResources(VtepPopulatorUtil.java:630) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateInTransaction(TransportNodeAsyncServiceImpl.java:2620) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnStateWithRetry(TransportNodeAsyncServiceImpl.java:2677) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.releaseUnusedVtepResourcesAndSaveTnState(TransportNodeAsyncServiceImpl.java:2599) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.stagePersistingTnState(TransportNodeAsyncServiceImpl.java:2490) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.syncTransportNode(TransportNodeAsyncServiceImpl.java:3162) ~[?:?]
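
To check whether a copy of nsxapi.log contains this signature, a small script along the following lines may help. This is a minimal sketch, not a supported tool: the three search strings are taken from the log excerpt above, while the function name, the 10-line search window, and the timestamp pattern are assumptions made for illustration.

```python
import re

# Search strings copied from the log excerpt in this article.
STATE_UPDATE_FAILURE = "Failed to update TransportNodeState"
EXCEPTION_NAME = "ObjectNotFoundException"
POOL_LOOKUP_FRAME = "IPAMServiceImpl.getPoolIfExists"

# Assumption: every log entry starts with an ISO-8601 style timestamp.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def find_stale_pool_failures(lines, window=10):
    """Return timestamps of state-update failures that are followed,
    within `window` lines, by the exception and the IP pool lookup frame."""
    lines = list(lines)
    hits = []
    for index, line in enumerate(lines):
        if STATE_UPDATE_FAILURE not in line:
            continue
        following = lines[index + 1 : index + 1 + window]
        has_exception = any(EXCEPTION_NAME in item for item in following)
        has_pool_frame = any(POOL_LOOKUP_FRAME in item for item in following)
        if has_exception and has_pool_frame:
            match = TIMESTAMP.match(line)
            hits.append(match.group(0) if match else "unknown")
    return hits
```

The function accepts any iterable of lines, so it can be fed an open file handle for a log copied off the NSX Manager. A non-empty result only suggests that the signature is present; it does not confirm the root cause on its own.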

 

Environment

VMware NSX-T Data Center
VMware NSX

Cause

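This issue occurs when an IP pool is deleted while it is still in use by a transport node, and the transport node is later reconfigured with a different IP pool. During the reconfiguration, the system tries to release the previously allocated IP addresses back to the old, deleted IP pool. Because that pool no longer exists in the environment, the release fails with an ObjectNotFoundException, the transport node state cannot be updated, and the node remains at "100% Configuration complete".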
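
The failure mode can be illustrated with a toy model. This is not NSX code: the class, function, and field names below are hypothetical, and the "release first, then save state" ordering is only inferred from the method names in the stack trace (for example releaseUnusedVtepResourcesAndSaveTnState).

```python
class PoolNotFound(Exception):
    """Hypothetical stand-in for the ObjectNotFoundException in the trace."""


class ToyIpam:
    """Toy IP address manager: maps a pool ID to its allocated addresses."""

    def __init__(self):
        self.pools = {}

    def release(self, pool_id, address):
        # Releasing into a pool that no longer exists is an error.
        if pool_id not in self.pools:
            raise PoolNotFound(pool_id)
        self.pools[pool_id].discard(address)


def save_node_state(ipam, node, new_state):
    """Release addresses held under the old state, then store the new state."""
    for pool_id, address in node["state"]["teps"]:
        ipam.release(pool_id, address)  # raises if the old pool was deleted
    node["state"] = new_state  # not reached when a release fails


ipam = ToyIpam()
ipam.pools["pool-new"] = {"10.0.1.5"}  # "pool-old" has already been deleted
node = {"state": {"teps": [("pool-old", "10.0.0.5")], "status": "in progress"}}
new_state = {"teps": [("pool-new", "10.0.1.5")], "status": "success"}

try:
    save_node_state(ipam, node, new_state)
except PoolNotFound:
    pass  # the stored state is never replaced, so the node looks stuck
```

In this sketch the node keeps reporting its old state even though the new configuration itself was valid, which mirrors the symptom described in this article.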

Resolution

This issue is fixed in VMware NSX 4.1.2.0.

If an upgrade is not feasible at the moment and a workaround is needed, open a Broadcom support case and reference this KB article number.

Additional Information

No configuration changes can be made to the affected transport node while it is in this state. The datapath continues to work.