Upgrading NSX Manager (Local or Global) VMs to 4.2.x fails to exit maintenance mode on the Monitoring cluster service.

Article ID: 382149

Products

VMware NSX

Issue/Introduction

  • The phonehome-coordinator (Monitoring) service fails to start because of an out-of-memory error, and the upgrade cannot continue.
  • In the appliance CLI, logged in as admin, running get cluster status shows Group Type: MONITORING with a status of DOWN.

  • This issue can also be observed during a new NSX Manager deployment via VCF Cloud Builder with the Small form factor.
  • Log lines similar to the below are encountered in /var/log/phonehome-coordinator/phonehome-coordinator-tomcat-wrapper.log:
| java.lang.OutOfMemoryError: Java heap space
| The JVM has run out of memory.  Requesting thread dump.
| Dumping JVM state.
| Dumping heap to /image/core/phc_oom.hprof
  • A heap dump file is present at /image/core/phc_oom.hprof

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.

  • The following error may appear in the NSX UI:
    • "Unexpected error while upgrading upgrade unit. Failed to exit node <UUID> from maintenance mode. Please retry the operation."
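
To confirm the log signature listed above, a short script can scan a copy of the wrapper log for the out-of-memory lines. This is a minimal sketch, not a supported tool: the log path and the signature strings are taken from this article, while the function names `find_oom_lines` and `scan_wrapper_log` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Log path and signature strings come from this article; adjust them if
# your environment differs (for example, when reading a support bundle copy).
WRAPPER_LOG = Path(
    "/var/log/phonehome-coordinator/phonehome-coordinator-tomcat-wrapper.log"
)
OOM_SIGNATURES = (
    "java.lang.OutOfMemoryError: Java heap space",
    "The JVM has run out of memory",
    "Dumping heap to /image/core/phc_oom.hprof",
)


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines containing an OOM signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(signature in line for signature in OOM_SIGNATURES):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_wrapper_log(path=WRAPPER_LOG):
    """Scan the wrapper log (hypothetical helper) and return matching lines."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling `scan_wrapper_log()` with no arguments reads the default path; passing a different path lets the same check run against a log file extracted elsewhere. An empty result only means these specific strings were not found, not that the service is healthy.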

Environment

VMware NSX

Cause

This issue is caused by a race condition. The phonehome-coordinator service runs out of memory during initialization, crashes, and then fails to start.

Resolution

  • If the NSX Manager VMs were deployed with the small form factor (which is not supported in production environments), the workaround is to power down one NSX Manager VM at a time, increase its resources to the medium form factor or higher, and power the VM back on. Repeat until all three NSX Managers have been resized. See NSX Manager VM and Host Transport Node System Requirements.
  • If the NSX Manager VMs are deployed with the medium form factor or higher and this issue is encountered, reboot the impacted NSX Manager VM to resolve the "out of memory" error.

This issue is fixed in NSX 4.2.1.1 and later.
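
The decision described in this section can be sketched as a small helper. This is an illustration under stated assumptions, not an official tool: the fixed release (4.2.1.1) and the two workarounds come from this article, while the function names, the form factor labels ("small", "medium", "large"), and the returned strings are hypothetical. The version is assumed to be a dot-separated string of integers.

```python
# Fixed release as stated in this article.
FIXED_IN = (4, 2, 1, 1)


def parse_version(text):
    """Turn a dotted version string such as '4.2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """True if the version is 4.2.1.1 or later (tuple comparison)."""
    return parse_version(version_text) >= FIXED_IN


def recommended_action(form_factor, version_text):
    """Map form factor and version to the action described in this article.

    The form factor labels are assumptions made for this sketch.
    """
    if has_fix(version_text):
        return "none: this release includes the fix"
    size = form_factor.strip().lower()
    if size == "small":
        return "resize: move each NSX Manager VM to medium or higher, one at a time"
    if size in {"medium", "large"}:
        return "reboot: restart the impacted NSX Manager VM"
    raise ValueError(f"unrecognized form factor: {form_factor!r}")
```

For example, `recommended_action("small", "4.2.0.1")` returns the resize recommendation, while any version at or above 4.2.1.1 returns the "none" result regardless of form factor.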

Additional Information

For additional information, see Troubleshooting NSX Manager Upgrade Failures.