HTTPS service down on NSX manager

Article ID: 393040

Updated On:

Products

VMware NSX

Issue/Introduction

  • One of the NSX Managers shows its HTTPS status as DOWN, and the HTTPS group status is DEGRADED, when running the command get cluster status.
  • The following output is returned when checking the cluster status from the CLI:
    get cluster status

Group Type: HTTPS
Group Status: DEGRADED

Members:
    UUID                                    FQDN              IP             STATUS
    ########-####-####-####-#########       Manager1          ##.###.##.#    DOWN
    ########-####-####-####-#########       Manager2          ##.###.##.#    UP
    ########-####-####-####-#########       Manager3          ##.###.##.#    UP

  • The NSX Manager logs show the following entries:
    /var/log/syslog:
    <timestamps> <NSX-Manager-FQDN> NSX 27660 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="ERROR" errorCode="('CLI110',)"] Error getting node services, status: HTTPStatus.INTERNAL_SERVER_ERROR, json: None, defaulting to ['node-mgmt']
    /var/log/proton/nsxapi.log (corfu-9000.log contains similar entries):
    <timestamps>  WARN ShardingServiceThread AbstractView 4278 layoutHelper: System seems unavailable
    org.corfudb.runtime.exceptions.NetworkException: Router stopped [endpoint=#.#.#.#:9000]
            at org.corfudb.runtime.clients.NettyClientRouter.stop(NettyClientRouter.java:403) ~[?:?]
            at org.corfudb.runtime.NodeRouterPool.shutdown(NodeRouterPool.java:56) ~[?:?]
            at org.corfudb.runtime.CorfuRuntime.stop(CorfuRuntime.java:984) ~[?:?]
            at org.corfudb.runtime.CorfuRuntime.shutdown(CorfuRuntime.java:964) ~[?:?]
    
    <timestamps> ERROR PolicyDashboardInitializer AddressSpaceView 4168 write: Got exception during replication protocol write with token: TokenResponse(respType=NORMAL, conflictKey=[0], conflictStream=00000000-0000-0000-0000-000000000000, token=Token(epoch=106, sequence=2817820876), backpointerMap={b####-###-####-####-e#######=2817812366}, streamTails={})
    org.corfudb.runtime.exceptions.WrongEpochException: Wrong epoch. [expected=107]
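
The symptoms above can be checked with a short script. The following is a minimal sketch, assuming the get cluster status output has been saved to a text file and that the log files are readable from a root shell. The function names down_https_members and count_https_down_signatures are hypothetical helpers, not part of NSX.

```shell
#!/bin/sh
# Hypothetical helper: print the FQDN column of HTTPS group members whose
# STATUS is DOWN. Reads saved "get cluster status" output on stdin.
down_https_members() {
    awk '
        # Track whether the current group is the HTTPS group.
        /^Group Type:/ { in_https = ($3 == "HTTPS") }
        # Member rows have four columns: UUID, FQDN, IP, STATUS.
        in_https && NF == 4 && $4 == "DOWN" { print $2 }
    '
}

# Hypothetical helper: count the lines in a log file ($1) that match the
# error signatures quoted in this article.
count_https_down_signatures() {
    # -c prints the number of matching lines; -E enables alternation.
    grep -c -E 'Error getting node services|System seems unavailable|Router stopped|WrongEpochException' "$1"
}
```

For example, down_https_members < cluster_status.txt (a hypothetical file name) prints the affected manager, and count_https_down_signatures /var/log/proton/nsxapi.log prints the number of matching lines. A non-zero count alone does not confirm this issue; it only shows that the log contains entries like those listed above.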

Environment

NSX-T Data Center
VMware NSX

Cause

This issue may occur if the /usr/local directory is renamed or if any files within this directory are moved or altered.
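
To check for the renamed-directory case, a sketch such as the following can be run from a root shell. The function name check_local_dir is a hypothetical helper, and the parent-directory argument exists only so the check can be tried against a scratch directory. It detects a missing or renamed directory only; it does not detect files inside /usr/local that were moved or altered.

```shell
#!/bin/sh
# Hypothetical helper: report whether <parent>/local exists. If it does not,
# list sibling directories whose names start with "local" (e.g. local.bak).
# $1: parent directory (defaults to /usr).
check_local_dir() {
    parent="${1:-/usr}"
    if [ -d "$parent/local" ]; then
        echo "present: $parent/local"
        return 0
    fi
    echo "missing: $parent/local"
    # If nothing matches, the glob stays literal and the -d test skips it.
    for d in "$parent"/local*; do
        [ -d "$d" ] && echo "candidate: $d"
    done
    return 1
}
```

Running check_local_dir with no argument inspects /usr. Any "candidate" line is only a hint that the directory may have been renamed; confirm the name before moving anything.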

Resolution

WORKAROUND: 

  • SSH to the NSX Manager whose HTTPS service is down.
  • Restore the directory name: mv /usr/local.bak /usr/local   (In this example the directory was renamed to local.bak. If it was renamed to something else, substitute that name for local.bak.)
  • Restart the proxy service. The NSX Management Proxy acts as a reverse proxy and handles incoming HTTP/HTTPS traffic.
    /etc/init.d/proxy stop
    /etc/init.d/proxy start
  • Check the cluster status: su admin -c "get cluster status"
  • Confirm that the cluster is stable (the affected manager should show UP under the HTTPS group).
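
The rename and proxy restart steps above can be sketched as a dry-run script. This is an illustration only: the function names run and restore_usr_local and the APPLY variable are hypothetical. By default the commands are printed rather than executed, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default, execute it when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: restore the renamed directory and restart the proxy.
# $1: current name of the renamed directory, e.g. /usr/local.bak
restore_usr_local() {
    renamed="$1"
    run mv "$renamed" /usr/local
    run /etc/init.d/proxy stop
    run /etc/init.d/proxy start
}
```

For example, restore_usr_local /usr/local.bak prints the three commands. Setting APPLY=1 in a root shell on the affected manager would execute them. Check the cluster status afterwards as described in the steps above.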