VMware NSX-T manager prechecks fail: Failed to execute Check for sufficient free space on /tmp, /image and /config partition


Article ID: 376982


Products

VMware NSX

Issue/Introduction

  • During an NSX-T upgrade, the prechecks fail for the managers with the following errors:


Failed to execute Check for sufficient free space on /tmp partition.
Failed to execute Check for sufficient free space on /image partition.
Failed to execute Check for sufficient free space on /config partition.

  • Running df -h as root on the NSX-T managers shows no space issues; each partition exceeds the precheck threshold:


/tmp has more than 18 MB free
/image has more than 4299 MB free
/config has more than 14796 MB free
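The comparison the precheck performs can be approximated locally with a short shell sketch. This is not an official NSX utility, only an illustration; the MB thresholds are the ones quoted in the precheck warnings below.

```shell
#!/bin/sh
# Illustrative sketch only (not an NSX tool): compare a partition's
# free space, in MB, against the threshold named in the precheck warning.
check_free_mb() {
    mount_point="$1"    # e.g. /tmp, /image, /config
    required_mb="$2"    # threshold from the precheck description
    # df -Pm reports sizes in 1 MB blocks; field 4 of line 2 is "Available"
    free_mb=$(df -Pm "$mount_point" | awk 'NR==2 {print $4}')
    if [ "$free_mb" -ge "$required_mb" ]; then
        echo "OK: $mount_point has ${free_mb} MB free (threshold ${required_mb} MB)"
    else
        echo "FAIL: $mount_point has ${free_mb} MB free (threshold ${required_mb} MB)"
    fi
}

# Thresholds quoted in the precheck warnings; skip any partition
# that does not exist on the machine running the sketch.
for spec in "/tmp 18" "/image 4299" "/config 14796"; do
    set -- $spec
    if [ -d "$1" ]; then
        check_free_mb "$1" "$2"
    fi
done
```

If the partitions really are short of space, freeing space resolves the precheck; the rest of this article covers the case where space is plentiful yet the check still fails.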

  • In the log /var/log/upgrade-coordinator/upgrade-coordinator.log on the NSX-T manager running the upgrade service (the Orchestrator node), the following WARN entries are seen:


WARN pool-48-thread-1 UpgradeServiceImpl 2692813 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="upgrade-coordinator"] [PUC] Pre-upgrade check InspectionTaskInfo[id=mpFreeSpaceCheck-tmp,name=Check for sufficient free space on /tmp partition,description=This check will result in warning if free space in /tmp partition on the node is less than 18 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>] failed with result BasicInspectionTaskResult{status=FAILURE, taskInfo=InspectionTaskInfo[id=mpFreeSpaceCheck-tmp,name=Check for sufficient free space on /tmp partition,description=This check will result in warning if free space in /tmp partition on the node is less than 18 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>], failureMessages=null, failures=[{"moduleName":"upgrade-coordinator","errorCode":30956,"errorMessage":"Failed to execute Check for sufficient free space on /tmp partition. "}]}

WARN pool-48-thread-3 UpgradeServiceImpl 2692813 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="upgrade-coordinator"] [PUC] Pre-upgrade check InspectionTaskInfo[id=mpFreeSpaceCheck-image,name=Check for sufficient free space on /image partition,description=This check will result in warning if free space in /image partition on the node is less than 4299 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>] failed with result BasicInspectionTaskResult{status=FAILURE, taskInfo=InspectionTaskInfo[id=mpFreeSpaceCheck-image,name=Check for sufficient free space on /image partition,description=This check will result in warning if free space in /image partition on the node is less than 4299 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>], failureMessages=null, failures=[{"moduleName":"upgrade-coordinator","errorCode":30956,"errorMessage":"Failed to execute Check for sufficient free space on /image partition. "}]}

WARN pool-48-thread-3 UpgradeServiceImpl 2692813 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="upgrade-coordinator"] [PUC] Pre-upgrade check InspectionTaskInfo[id=mpFreeSpaceCheck-config,name=Check for sufficient free space on /config partition,description=This check will result in warning if free space in /config partition on the node is less than 14796 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>] failed with result BasicInspectionTaskResult{status=FAILURE, taskInfo=InspectionTaskInfo[id=mpFreeSpaceCheck-config,name=Check for sufficient free space on /config partition,description=This check will result in warning if free space in /config partition on the node is less than 14796 MB.,componentType=MP,needsAcknowledgement=false,acknowledgement=false,needsResolution=false,resolution=false,resolutionError=<null>], failureMessages=null, failures=[{"moduleName":"upgrade-coordinator","errorCode":30956,"errorMessage":"Failed to execute Check for sufficient free space on /config partition. "}]}

  • In the NSX-T manager log /var/log/upgrade-coordinator/upgrade-coordinator.log, the following ERROR is seen:


ERROR pool-48-thread-1 UcRestClient 2692813 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30014" level="ERROR" subcomp="upgrade-coordinator"] Error during GET rest request /nsxapi/api/v1/cluster/nodes/########-09f2-53ba-61a4-############/status?source=realtime , trial 3 , err com.vmware.nsx.management.upgrade.rpcframework.UcRestRpcException: [UC] Error in rest call. url= //nsxapi/api/v1/cluster/nodes/########-09f2-53ba-61a4-############/status?source=realtime , method= GET , response= {
  "httpStatus" : "SERVICE_UNAVAILABLE",
  "error_code" : 202102,
  "module_name" : "Platform Management",
  "error_message" : "Some error has occurred."
} , error= 503 : "{<EOL>  "httpStatus" : "SERVICE_UNAVAILABLE",<EOL>  "error_code" : 202102,<EOL>  "module_name" : "Platform Management",<EOL>  "error_message" : "Some error has occurred."<EOL>}" .

  • In the NSX-T manager log /var/log/proton/nsxapi.log, the following entries are seen:


INFO RpcManagerRequestCleanupTimer RpcManager 792599 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Rpc response not received for application HostNodeStatusVertical request com.vmware.nsx.management.agg.messaging.AggService$ClientDataRequestMsg from client ########-fc99-4301-9639-############ with correlation id ########-7889-417e-8951--############ in 5000 msec.

  • In the log /var/log/syslog, the following entries are seen, showing components waiting on APH stub creation:


NSX 67000 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67234" level="INFO"] [HostNodeStatusVertical] WaitForInfightStubCreateRequest
NSX 67000 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67234" level="INFO"] [HostNodeStatusVertical] Waiting for (2) stub creation request(s) to finish.
NSX 67000 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67234" level="INFO"] [HostNodeStatusVertical] stub-creation pending for APH (########-514f-409f-8766-############)
NSX 67000 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67234" level="INFO"] [HostNodeStatusVertical] stub-creation pending for APH (########-6b4b-4a8e-8384-############)
NSX 67000 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67234" level="INFO"] [HostNodeStatusVertical] performUpdateStubMap: Re-scheduling self on provider thread

NSX 67517 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67588" level="INFO"] [FileMonitor] WaitForInfightStubCreateRequest
NSX 67517 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67588" level="INFO"] [FileMonitor] Waiting for (2) stub creation request(s) to finish.
NSX 67517 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67588" level="INFO"] [FileMonitor] stub-creation pending for APH (########-514f-409f-8766-############)
NSX 67517 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67588" level="INFO"] [FileMonitor] stub-creation pending for APH (########-6b4b-4a8e-8384-############)
NSX 67517 - [nsx@6876 comp="nsx-manager" subcomp="mpa-client" tid="67588" level="INFO"] [FileMonitor] performUpdateStubMap: Re-scheduling self on provider thread

 

Note: To find the Orchestrator node, run get service install-upgrade as the admin user; the IP address in the 'running on' section of the output shows which manager is the Orchestrator node.

Cause

At some point in the past, the APH (Appliance Proxy Hub) RPC (Remote Procedure Call) channel was closed and restarted, and APH stub creation now fails to these nodes. As a result, the precheck GET API call /nsxapi/api/v1/cluster/nodes/########-09f2-53ba-61a4-############/status?source=realtime fails. This GET API returns the status of the nodes for the prechecks, including the disk space and usage results.
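The failing status call can also be checked by hand against the public API: on a healthy cluster it returns the node status (including per-partition disk usage), while in this failure state it returns the 503 shown in the logs above. A minimal sketch follows; the manager address and node UUID are hypothetical placeholders, not values from this article.

```shell
#!/bin/sh
# Sketch only: MANAGER and NODE_ID below are hypothetical placeholders.
MANAGER="nsx-manager.example.com"
NODE_ID="<node-uuid>"    # obtain real UUIDs from GET /api/v1/cluster/nodes

URL="https://${MANAGER}/api/v1/cluster/nodes/${NODE_ID}/status?source=realtime"
echo "GET ${URL}"

# Uncomment to query a live manager (prompts for the admin password):
# curl -k -u admin "${URL}"
```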

Resolution

This issue is resolved in VMware NSX 4.2, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

 

Workaround:

On all 3 managers, carry out the following:
  1. As admin user, check the cluster is up and stable:
    get cluster status
  2. Then run the following 2 commands as root user:
    1. service nsx-host-node-status-reporter restart
    2. service file-monitor restart
  3. Verify the services above restarted successfully, as root user:
    1. service nsx-host-node-status-reporter status
    2. service file-monitor status
  4. Ensure the cluster is stable after the services above have restarted, as admin user:
    get cluster status

Then repeat steps 1 through 4 on the remaining two managers, one at a time.
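For reference, steps 2 and 3 can be sketched as a small script. The service names are exactly those from the workaround; everything else (the dry-run guard, the helper names) is illustrative. Run as root on one manager at a time, and leave DRY_RUN=true until you actually intend to restart the services.

```shell
#!/bin/sh
# Sketch of workaround steps 2-3. Run as root on ONE manager at a time,
# checking "get cluster status" (admin CLI) before and after.
DRY_RUN=true    # set to false when running on an actual NSX manager

run() {
    if [ "$DRY_RUN" = true ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_stub_services() {
    # Service names as given in the workaround steps
    for svc in nsx-host-node-status-reporter file-monitor; do
        run service "$svc" restart
        run service "$svc" status
    done
}

restart_stub_services
```

Restarting these two services forces them to re-establish their APH stubs, after which the realtime node status API (and therefore the prechecks) can succeed again.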