HTTPS service down on NSX manager

Products

VMware NSX

Issue/Introduction

One of the NSX Managers shows an HTTPS status of DOWN, and the HTTPS group status is DEGRADED, when running the command get cluster status.
The following output is returned when checking the cluster status from the CLI:
get cluster status

Group Type: HTTPS
Group Status: DEGRADED

Members:
UUID FQDN IP STATUS
########-####-####-####-######### Manager1 ##.###.##.# DOWN
########-####-####-####-######### Manager2 ##.###.##.# UP
########-####-####-####-######### Manager3 ##.###.##.# UP
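
When the output is long, the affected node can be picked out of a saved copy of it. The following is a minimal sketch, not a VMware-provided tool: the function name https_down_members is hypothetical, and it assumes the output keeps the layout shown above (a "Group Type: HTTPS" line followed by rows of UUID, FQDN, IP and STATUS).

```shell
#!/bin/sh
# Hypothetical helper: print "FQDN IP" for every member of the HTTPS group
# that is reported as DOWN. Reads saved `get cluster status` output on stdin.
# Assumes member rows have exactly four whitespace-separated fields.
https_down_members() {
    awk '
        # Track which group section we are in.
        /^[ \t]*Group Type:/ { in_https = ($3 == "HTTPS") }
        # Inside the HTTPS section, report rows whose last field is DOWN.
        in_https && NF == 4 && $4 == "DOWN" { print $2, $3 }
    '
}

# Example (the file name is an assumption):
#   https_down_members < cluster_status.txt
```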

The NSX Manager logs show entries similar to the following:


/var/log/syslog:

<timestamps> <NSX-Manager-FQDN> NSX 27660 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="ERROR" errorCode="('CLI110',)"] Error getting node services, status: HTTPStatus.INTERNAL_SERVER_ERROR, json: None, defaulting to ['node-mgmt']

/var/log/proton/nsxapi.log (corfu-9000.log contains similar entries):

<timestamps>  WARN ShardingServiceThread AbstractView 4278 layoutHelper: System seems unavailable
org.corfudb.runtime.exceptions.NetworkException: Router stopped [endpoint=#.#.#.#:9000]
        at org.corfudb.runtime.clients.NettyClientRouter.stop(NettyClientRouter.java:403) ~[?:?]
        at org.corfudb.runtime.NodeRouterPool.shutdown(NodeRouterPool.java:56) ~[?:?]
        at org.corfudb.runtime.CorfuRuntime.stop(CorfuRuntime.java:984) ~[?:?]
        at org.corfudb.runtime.CorfuRuntime.shutdown(CorfuRuntime.java:964) ~[?:?]

<timestamps> ERROR PolicyDashboardInitializer AddressSpaceView 4168 write: Got exception during replication protocol write with token: TokenResponse(respType=NORMAL, conflictKey=[0], conflictStream=00000000-0000-0000-0000-000000000000, token=Token(epoch=106, sequence=2817820876), backpointerMap={b####-###-####-####-e#######=2817812366}, streamTails={})
org.corfudb.runtime.exceptions.WrongEpochException: Wrong epoch. [expected=107]
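
To check quickly whether a node's logs contain these entries, the message strings quoted above can be counted with grep. This is only a sketch: the function name log_signature_hits is hypothetical, and the pattern list is taken from the excerpts in this article rather than from an exhaustive list of related errors.

```shell
#!/bin/sh
# Hypothetical helper: for each readable log file given as an argument,
# print the file name and the number of lines matching any of the
# message strings quoted in this article.
log_signature_hits() {
    pattern='Error getting node services|System seems unavailable|NetworkException: Router stopped|WrongEpochException'
    for f in "$@"; do
        [ -r "$f" ] || continue
        # grep -c prints 0 and exits non-zero when nothing matches.
        n=$(grep -c -E "$pattern" "$f" || true)
        printf '%s %s\n' "$f" "$n"
    done
}

# Example, run as root on the affected manager:
#   log_signature_hits /var/log/syslog /var/log/proton/nsxapi.log
```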

Environment

NSX-T Data Center
VMware NSX

Cause

This issue may occur if the /usr/local directory is renamed or if any files within this directory are moved or altered.
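
A quick way to test for the renamed-directory case is to check that the directory exists and, if it does not, to list sibling directories with a similar name. The sketch below uses a hypothetical function name, check_dir_present, and it only detects a missing or renamed directory. It cannot tell whether files inside the directory were moved or altered.

```shell
#!/bin/sh
# Hypothetical helper: report whether the expected directory exists
# (defaults to /usr/local). If it is missing, list directories whose
# names start with the expected path, such as /usr/local.bak.
check_dir_present() {
    dir=${1:-/usr/local}
    if [ -d "$dir" ]; then
        echo "OK: $dir exists"
    else
        echo "MISSING: $dir"
        for cand in "$dir"?*; do
            [ -d "$cand" ] && echo "candidate: $cand"
        done
        return 1
    fi
}

# Example:
#   check_dir_present /usr/local
```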

Resolution

WORKAROUND:

SSH to the NSX Manager on which the HTTPS service is down.
Restore the original directory name: mv /usr/local.bak /usr/local (In this example the directory was renamed to local.bak. If it was renamed to something else, substitute that name for local.bak.)
Restart the proxy service. The NSX Management Proxy acts as a reverse proxy and handles incoming HTTP/HTTPS traffic.
/etc/init.d/proxy stop
/etc/init.d/proxy start
Check the cluster status: su admin -c "get cluster status"
Confirm that the cluster is stable.
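
The rename step above can be wrapped in a small guard so that an existing /usr/local is never overwritten by mistake. This is a minimal sketch under stated assumptions, not part of the documented workaround: the function name restore_renamed_dir is hypothetical, and the source path must be replaced with the actual renamed directory.

```shell
#!/bin/sh
# Hypothetical helper: move a renamed directory back to its expected
# location (defaults to /usr/local). Refuses to act if the source is
# not a directory or if the destination already exists.
restore_renamed_dir() {
    src=$1
    dst=${2:-/usr/local}
    if [ ! -d "$src" ]; then
        echo "refusing: $src is not a directory" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing: $dst already exists" >&2
        return 1
    fi
    mv -- "$src" "$dst"
}

# Example, run as root on the affected manager, followed by the proxy
# restart and cluster status check described above:
#   restore_renamed_dir /usr/local.bak /usr/local
```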