NSX upgrade from 9.0.x to 9.1.0 fails with message "Upgrade has failed. Appliance OS is of a new version."

Article ID: 440244

Products

VMware NSX

Issue/Introduction

  • Upgrade from NSX 9.0.x to 9.1.0 fails with one or more manager nodes still at the old version.
  • The detailed error message is similar to:

    ------ Output of last step start ------
        Status:
    wait_for_proton: resp_status: 500, body: None
    
        Stdout: Removed /config/.resume_upgrade flag
    Starting start_manager upgrade script
    Creating /.eula/.eula_accepted.txt.
    Migrated /os_bak/.eula/.upgrade_eula_accepted.txt to /.eula/.eula_accepted.txt successfully.
    Setting file permissions for eula
    Removing eula.txt
    Proton service did not start within the allotted time.
    
        Stderr: /image/VMware-NSX-unified-appliance-9.1.0.0.25318227/nsx-manager/common/settings.py:11: SyntaxWarning: invalid escape sequence '\S'
      NSX_ISSUE_RE = "version: (\S+).node-type: ([^\n]+).build-type: (\S+)"
    
        Troubleshooting: Upgrade has failed. Appliance OS is of a new version. If the UI is available, please retry upgrade from the UI. If the problem persist, please contact GSS to retry the upgrade.

     

  • The NSX UI is inaccessible on the manager node that is partially upgraded.
  • Messages similar to the following are seen in the /var/log/li-syslog file on the partially upgraded NSX manager node:

    <TIMESTAMP> ERROR nonconfig-corfu - -  org.corfudb.infrastructure.NettyServerRouter Error in handling inbound message#012io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General OpenSslEngine problem#012#011at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:500)#012#011at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)#012#011at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)#012#011at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)#012#011at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)#012Caused by: javax.net.ssl.SSLHandshakeException: General OpenSslEngine problem#012#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.handshakeException(ReferenceCountedOpenSslEngine.java:1939)#012#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.wrap(ReferenceCountedOpenSslEngine.java:862)#012#011at java.base/javax.net.ssl.SSLEngine.wrap(Unknown Source)#012#011at io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:1148)#012#011at io.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:992)#012Caused by: java.security.cert.CertificateException: Unable to construct a valid chain#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.validateChain(ProvX509TrustManager.java:321)#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkTrusted(ProvX509TrustManager.java:276)#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkClientTrusted(ProvX509TrustManager.java:157)#012#011at org.bouncycastle.jsse.provider.ExportX509TrustManager_7.checkClientTrusted(ExportX509TrustManager_7.java:31)#012#011at org.corfudb.security.tls.ReloadableTrustManager.checkClientTrusted(ReloadableTrustManager.java:41)#012#011Suppressed: 
io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslHandshakeException: error:0A000086:SSL routines::certificate verify failed#012#011#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.newSSLExceptionForError(ReferenceCountedOpenSslEngine.java:1401)#012#011#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.needWrapAgain(ReferenceCountedOpenSslEngine.java:1389)#012#011#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:1418)#012#011#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1349)#012#011#011at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1450)#012Caused by: java.security.cert.CertPathValidatorException: Certificate doesn't support 'clientAuth' ExtendedKeyUsage#012#011at org.bouncycastle.jsse.provider.ProvAlgorithmChecker.checkEndEntity(ProvAlgorithmChecker.java:230)#012#011at org.bouncycastle.jsse.provider.ProvAlgorithmChecker.checkCertPathExtras(ProvAlgorithmChecker.java:182)#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.validateChain(ProvX509TrustManager.java:309)#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkTrusted(ProvX509TrustManager.java:276)#012#011at org.bouncycastle.jsse.provider.ProvX509TrustManager.checkClientTrusted(ProvX509TrustManager.java:157)
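The `#012` and `#011` sequences in the excerpt above appear to be syslog-style octal escapes for line feed and tab, which makes the stack trace hard to read. The following is a minimal sketch, not an official NSX tool, that looks for the `clientAuth` signature in log lines and prints matching entries with the escapes expanded. The function names are hypothetical, and the sketch assumes each log entry sits on a single line.

```python
import re

# Signature string copied from the log excerpt above.
CLIENT_AUTH_MARKER = "Certificate doesn't support 'clientAuth' ExtendedKeyUsage"

# Assumption: control characters were written as '#' plus three octal digits
# (#012 = line feed, #011 = tab).
_OCTAL_ESCAPE = re.compile(r"#([0-3][0-7]{2})")


def unescape_syslog(line):
    """Turn #012 / #011 style escapes back into real control characters."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)


def find_client_auth_failures(lines):
    """Yield (line_number, readable_text) for entries containing the marker."""
    for number, line in enumerate(lines, start=1):
        if CLIENT_AUTH_MARKER in line:
            yield number, unescape_syslog(line.rstrip("\n"))
```

On the partially upgraded manager node, the generator could be fed an open handle to /var/log/li-syslog, for example `find_client_auth_failures(open("/var/log/li-syslog", errors="replace"))`, and each result printed.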

Environment

VMware NSX 9.1

Cause

  • The 9.1 version of NSX requires that the CBM_Corfu certificate be configured for both server and client authentication.
  • One or more manager nodes have a CBM_Corfu certificate that does not support client authentication.
  • After a node whose CBM_Corfu certificate does not support client authentication is upgraded to 9.1, the manager nodes cannot communicate properly with each other for as long as un-upgraded nodes remain in the cluster.
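The requirement above can be checked against the Extended Key Usage extension of a certificate. The following is a rough sketch, not an NSX-provided check. It assumes the certificate has been exported to a PEM file and dumped with `openssl x509 -in <file> -noout -text`, and it parses that text output. The function names are hypothetical, and how the CBM_Corfu certificate is exported from NSX is not covered here.

```python
import re

# Names that `openssl x509 -text` prints for the serverAuth and
# clientAuth Extended Key Usage values.
SERVER_AUTH = "TLS Web Server Authentication"
CLIENT_AUTH = "TLS Web Client Authentication"


def extended_key_usages(openssl_text):
    """Return the Extended Key Usage names found in `openssl x509 -text` output."""
    match = re.search(
        r"X509v3 Extended Key Usage:[^\n]*\n\s*([^\n]+)", openssl_text
    )
    if match is None:
        # Extension not present. This sketch treats that as "no usages listed";
        # whether NSX accepts such a certificate is not stated in this article.
        return set()
    return {usage.strip() for usage in match.group(1).split(",")}


def supports_server_and_client_auth(openssl_text):
    """True only if both serverAuth and clientAuth are listed."""
    usages = extended_key_usages(openssl_text)
    return SERVER_AUTH in usages and CLIENT_AUTH in usages
```

A certificate for which `supports_server_and_client_auth` returns False matches the condition described in this section and would need to be replaced before upgrading.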

Resolution

The NSX manager cluster must be restored from backup to recover from this issue. See "Backup and Restore During Upgrade" and "Restore a Backup" for more details.

Once the cluster is restored, all CBM_Corfu certificates must be configured to support both server and client authentication. See "Replace NSX Certificates from NSX Manager" for more details.