"Failure Domain down" error in newly deployed Edge VM

Article ID: 408159

Products

VMware NSX

Issue/Introduction

A newly deployed Edge VM reports its controller status as disconnected, even though its connections to all three NSX Managers are up.

edge-x> get controllers
Thu Aug 21 2025 UTC 06:23:06.908
Controller IP   Port   SSL       Status         Is Physical Master   Session State   Controller FQDN   Failure Reason
#.#.#.#         1235   enabled   disconnected   true                 down            NA                OTHER_ERROR
#.#.#.#         1235   enabled   not used       false                null            NA                NA
#.#.#.#         1235   enabled   not used       false                null            NA                NA

edge-x> get managers
Thu Aug 21 2025 UTC 06:23:01.609
- #.#.#.#     Connected (NSX-RPC) *
- #.#.#.#     Connected (NSX-RPC)
- #.#.#.#     Connected (NSX-RPC)

Deleting and redeploying the Edge does not resolve the issue.

Environment

VMware NSX

Cause

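The issue is caused by IP pool exhaustion: the IP pool that supplies tunnel endpoint (TEP) addresses has no free IPs, so NSX Manager cannot allocate a TEP IP to the new Edge.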

Resolution

Expand the IP pool CIDR, or delete unused Edges to release their IP addresses back to the pool.
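
When planning how far to expand the pool, it can help to do the address arithmetic first. The following is a minimal sketch, not an NSX tool: the function names, the example addresses (taken from the 192.0.2.0/24 documentation range) and the allocated/needed counts are all hypothetical, and it only counts addresses in an inclusive range without modelling any addresses NSX may reserve.

```python
import ipaddress


def range_size(start: str, end: str) -> int:
    """Return the number of addresses in an inclusive range start..end."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    if last < first:
        raise ValueError("range end precedes range start")
    return int(last) - int(first) + 1


def free_after_deploy(start: str, end: str, allocated: int, needed: int) -> int:
    """Addresses left over after the new allocations (negative = shortfall).

    'allocated' and 'needed' are values you supply yourself, for example
    the number of IPs already in use and the number of TEPs on the new Edge.
    """
    return range_size(start, end) - allocated - needed


# Hypothetical pool: 11 addresses, all in use, and the new Edge needs 2 more.
print(free_after_deploy("192.0.2.10", "192.0.2.20", allocated=11, needed=2))
# Same pool after extending the range end to .40.
print(free_after_deploy("192.0.2.10", "192.0.2.40", allocated=11, needed=2))
```

A negative result means the range is still too small for the planned deployment.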

Additional Information

The error is logged in the NSX Manager NSXAPI log (/var/log/proton/nsxapi.log):

2025-08-21T05:27:20.166Z ERROR L2HostConfigTaskExecutor4 IPAMServiceImpl 81229 POOL-MGMT [nsx@6876 comp="nsx-manager" errorCode="MP5109" level="ERROR" subcomp="manager"] Insufficient Resource Exception : no free IP found to allocate from the pool IpPool/####-####-####-####.
2025-08-21T05:27:20.166Z ERROR L2HostConfigTaskExecutor4 IPAMUtils 81229 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP8212" level="ERROR" subcomp="manager"] Failed to get a valid IP from IpPool IpPool/####-####-####-#### with cidr null, {}.
com.vmware.nsx.management.common.exceptions.InsufficientResourcesException: null
        at com.vmware.nsx.management.idas.ipam.service.IPAMServiceImpl.allocate(IPAMServiceImpl.java:105) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.IPAMUtils.ufoAllocateIpFromPool(IPAMUtils.java:63) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulator.allocateResourcesToNewVteps(VtepPopulator.java:436) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulator.populateVtepsForHostSwitch(VtepPopulator.java:264) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulator.populateVtepsForHostSwitch(VtepPopulator.java:226) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.VtepPopulator.populateVtepForUpdatedHostSwitches(VtepPopulator.java:119) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.stageSendingHostConfigMsg(TransportNodeAsyncServiceImpl.java:1248) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncServiceImpl.syncTransportNode(TransportNodeAsyncServiceImpl.java:3216) ~[?:?]
        at com.vmware.nsx.management.switching.sync.host.TransportNodeAsyncRequestHandler$TransportNodeConfigTask.run(TransportNodeAsyncRequestHandler.java:720) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
        at java.lang.Thread.run(Unknown Source) ~[?:?]
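
To confirm which pool is exhausted, the log can be scanned for the MP5109 entries shown above. This is a minimal sketch that assumes the message format matches the excerpt in this article; the function name and the regular expression are illustrative and not part of NSX.

```python
import re
from collections import Counter

# Matches the MP5109 "no free IP" message and captures the pool identifier.
# The pattern is derived from the log excerpt above and is an assumption
# about the format, not a documented NSX log schema.
POOL_EXHAUSTED = re.compile(
    r'errorCode="MP5109".*no free IP found to allocate from the pool (\S+)'
)


def exhausted_pools(lines):
    """Count MP5109 pool-exhaustion entries per IP pool identifier."""
    counts = Counter()
    for line in lines:
        match = POOL_EXHAUSTED.search(line)
        if match:
            # Drop the sentence-ending period that follows the pool ID.
            counts[match.group(1).rstrip(".")] += 1
    return counts
```

On an NSX Manager it could be used as `exhausted_pools(open("/var/log/proton/nsxapi.log", errors="replace"))`; the returned counter maps each pool identifier to the number of failed allocation attempts found in the file.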