NSX UI inaccessible post CBM cert replacement

Article ID: 368578

Products

VMware NSX

Issue/Introduction

  • CBM certificates were replaced in response to NSX alarms indicating that certificates have expired or are about to expire.
  • Certificates were replaced on multiple NSX Manager nodes in quick succession while the NSX Manager cluster status was degraded.
    • For example, the certificates on NSX Manager A are replaced. As expected, this restarts services and the cluster is degraded for a period. During this period, the certificates on NSX Manager B and/or C are also replaced.
  • If the CARR script is used, the CARR Script Validation Report at the end of its output shows errors like the following.
    Example:
    +-------------------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                             CARR Script Validation Report                                                             |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    | Certificate Checks      | Validation Results                                           | Probable Fix                                                 |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    ...

    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    | CBM_CORFU               | ERROR  : <IP address>    : Certificate in database does not  | Certificate : 'CBM_CORFU' on disk(keystore) will be replaced |
    |                         | match with keystore of corfu server <IP address>             | by certificate from datastore.                               |
    |                         | ERROR  : <IP address>    : Certificate in database does not  | Certificate : 'CBM_CORFU' on disk(keystore) will be replaced |
    |                         | match with keystore of corfu server <IP address>             | by certificate from datastore.                               |
    |                         | ERROR  : <IP address>    : Certificate in database does not  | Certificate : 'CBM_CORFU' on disk(keystore) will be replaced |
    |                         | match with keystore of corfu server <IP address>             | by certificate from datastore.                               |
    |                         |                                                              |                                                              |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+

    +-------------------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                             CARR Script Validation Report                                                             |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    | Certificate Checks      | Validation Results                                           | Probable Fix                                                 |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    ...
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+
    | CBM_CORFU               | ERROR  : <IP address>    : cert CBM_CORFU in keystore of     | Certificate with alias <UUID>                                |
    |                         | <IP address> does not match with truststore cert of          | of node <IP address> will be replaced with keystore          |
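    |                         | <IP address>                                                 | '<IP address>' certificate                                   |
    +-------------------------+--------------------------------------------------------------+--------------------------------------------------------------+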
  • NSX Manager logs show that the certificates were applied successfully, with the POST request for the CBM certificates returning 200:

/var/log/proxy/reverse-proxy.log

[TIMESTAMP] <IP> <IP> "POST" "/api/v1/trust-management/certificates/<UUID>?action=apply_certificate&service_type=CBM_[CERT_TYPE]&node_id=<UUID>" "HTTP/1.1" 200 - 0 0 1738 842 "<IP>"  "<UUID>" "<FQDN>" "127.###.###.###:7440"
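
To confirm which CBM certificates were applied and with what HTTP status, the matching requests can be filtered out of the reverse-proxy log. The following is a minimal sketch, assuming real log lines follow the layout of the excerpt above with actual values in place of the placeholders. The function name `find_cbm_applies` and the regular expression are illustrative and are not part of NSX.

```python
import re

# Matches the apply_certificate request shown in the reverse-proxy.log excerpt.
# Assumption: service types look like CBM_<NAME> in upper case, e.g. CBM_CORFU.
APPLY_RE = re.compile(
    r'"POST"\s+"/api/v1/trust-management/certificates/(?P<cert_id>[^?"]+)'
    r'\?action=apply_certificate&service_type=(?P<service_type>CBM_[A-Z_]+)'
    r'&node_id=(?P<node_id>[^"&]+)"\s+"HTTP/[\d.]+"\s+(?P<status>\d{3})'
)


def find_cbm_applies(lines):
    """Return one dict per CBM apply_certificate request found in the lines."""
    results = []
    for line in lines:
        match = APPLY_RE.search(line)
        if match:
            results.append(match.groupdict())
    return results


# Example usage on an NSX Manager node (path taken from the article):
#
#   with open("/var/log/proxy/reverse-proxy.log", errors="replace") as log:
#       for hit in find_cbm_applies(log):
#           print(hit["service_type"], hit["node_id"], hit["status"])
```

Several requests for different nodes within a short time window would be consistent with the quick-succession replacement described earlier.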

  • NSX Manager logs show a bad certificate error:

/var/log/proton/nsxapi.log 

[TIMESTAMP]  WARN netty-1 NettyClientRouter 493572 userEventTriggered: unhandled event SslHandshakeCompletionEvent(javax.net.ssl.SSLHandshakeException: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate)io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate

Other log files might also contain the error, depending on the affected certificate.
ccp -                     /var/log/cloudnet/nsx-ccp.log
CBM -                     /var/log/cbm/cbm.log
corfu -                   /var/log/corfu/corfu.9000.log
messaging-manager -       /var/log/messaging-manager/messaging-manager.log
mp/proton -               /var/log/proton/nsxapi.log
site-manager -            /var/log/site-manager/sm.log
ar -                      /var/log/async-replicator/ar.log
cm-inventory -            /var/log/cm-inventory/cm-inventory.log
idps-reporting -          /var/log/idps-reporting/idps.log
monitoring -              /var/log/phonehome-coordinator/phonehome-coordinator.log 
upgrade-coordinator -     /var/log/upgrade-coordinator/upgrade-coordinator.log
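
Checking each of these files by hand is tedious, so the search can be scripted. The following is a minimal sketch that counts lines containing the error text from the nsxapi.log excerpt in each listed log file. The helper name `count_bad_cert_errors` is hypothetical, and the exact error wording is assumed to be the same across services.

```python
from pathlib import Path

# Log paths copied from the list above.
LOG_FILES = [
    "/var/log/cloudnet/nsx-ccp.log",
    "/var/log/cbm/cbm.log",
    "/var/log/corfu/corfu.9000.log",
    "/var/log/messaging-manager/messaging-manager.log",
    "/var/log/proton/nsxapi.log",
    "/var/log/site-manager/sm.log",
    "/var/log/async-replicator/ar.log",
    "/var/log/cm-inventory/cm-inventory.log",
    "/var/log/idps-reporting/idps.log",
    "/var/log/phonehome-coordinator/phonehome-coordinator.log",
    "/var/log/upgrade-coordinator/upgrade-coordinator.log",
]

# Error text copied from the nsxapi.log excerpt above (assumed identical elsewhere).
ERROR_MARKER = "sslv3 alert bad certificate"


def count_bad_cert_errors(paths, marker=ERROR_MARKER):
    """Map each existing log file to the number of lines containing the marker."""
    counts = {}
    for path in map(Path, paths):
        if not path.is_file():
            continue  # log not present on this node
        with path.open(errors="replace") as log:
            counts[str(path)] = sum(1 for line in log if marker in line)
    return counts


# Example usage on an NSX Manager node:
#
#   for path, hits in count_bad_cert_errors(LOG_FILES).items():
#       print(f"{hits:6d}  {path}")
```

Files that do not exist on the node are skipped, so the same list can be used on any NSX Manager.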

  • The cluster might stay in a degraded state.
  • The NSX UI is inaccessible.
    • The NSX UI might become inaccessible a few days after the certificates are replaced.

Environment

VMware NSX 4.1.x
VMware NSX 4.2.0.x

Cause

A race condition exists between the CBM certificate replacement task and the periodic sync task. As a result, the Corfu DB public trust store is not updated, and services fail to connect to Corfu DB.
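
Because the symptom is associated with replacing certificates on further nodes while the cluster is still degraded, one way to avoid that overlap is to check the cluster status before moving on to the next node. The following is a sketch of such a check and not a documented workaround. It assumes the status has already been fetched as JSON from `GET /api/v1/cluster/status`, and that the response carries `mgmt_cluster_status.status` and `control_cluster_status.status` fields with the value `STABLE` when healthy. Verify these names against the API reference for your NSX version. The function name `cluster_is_stable` is hypothetical.

```python
def cluster_is_stable(cluster_status):
    """Return True only if management and control clusters both report STABLE.

    cluster_status is the parsed JSON body of the cluster status response.
    Field names and the STABLE value are assumptions; see the lead-in text.
    """
    mgmt = cluster_status.get("mgmt_cluster_status", {}).get("status")
    control = cluster_status.get("control_cluster_status", {}).get("status")
    return mgmt == "STABLE" and control == "STABLE"


# Example usage: only continue with the next node once this returns True.
#
#   if not cluster_is_stable(status_json):
#       print("Cluster not stable yet; do not replace certificates on the next node.")
```

Missing fields are treated as not stable, so an unexpected response shape makes the check fail rather than pass.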

Resolution

This issue is resolved in VMware NSX 4.2.1 available at Broadcom Downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

If this issue is encountered, run CARR script version 1.19 or later.
If the issue persists, contact VMware NSX GS by opening a service request and referencing this KB article. See Creating and managing Broadcom support cases.