NSX-T UI inaccessible and CBM core dumps are filling up /image/core

Article ID: 318331

Products

VMware NSX

Issue/Introduction

  • The NSX-T UI becomes unavailable.
  • The NSX-T UI may show "Some appliance components are not functioning properly. Component health: POLICY:UNKNOWN, MANAGER:UNKNOWN, SEARCH:UNKNOWN, NO."
  • In vSphere, it is not possible to change port groups on, vMotion, or power on NSX-backed VMs.
  • The Corfu compactor throws an OutOfMemoryError (OOM) while checkpointing the LoginLogoutEvent table, as seen in /var/log/corfu/corfu-compactor-audit.log:

2021-10-01T10:30:16.418Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Serial checkpoint for map nsx-manager LoginLogoutEvent 7103, id 3e6e9124-c492-####-####-########d3d.
2021-10-01T10:30:16.418Z INFO main MultiCheckpointWriter - appendCheckpoints: appending checkpoints for 1 maps
2021-10-01T10:30:16.438Z INFO main CheckpointWriter - appendCheckpoint: Started checkpoint for 3e6e9124-c492-####-####-########d3d at snapshot Token(epoch=3968, sequence=2173643076)
2021-10-01T10:30:23.993Z INFO main ObjectSerializer - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfudb"] Identifier deserialization cache stat. Total=1811683, hits=40000
.
.
.
2021-10-01T10:38:16.093Z INFO main ObjectSerializer - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfudb"] Identifier deserialization cache stat. Total=11891683, hits=5080000
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /image/core/java_pid3360.hprof ...
Heap dump file created [2718867198 bytes in 13.856 secs]
Aborting due to java.lang.OutOfMemoryError: Java heap space
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (debug.cpp:308), pid=3360, tid=0x000074996e1ed700
# fatal error: OutOfMemory encountered: Java heap space
#
# JRE version: Java(TM) SE Runtime Environment (8.0_301-b09) (build 1.8.0_301-b09)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.301-b09 mixed mode linux-amd64 compressed oops)
# Core dump written. Default location: //core or core.3360
#
# An error report file with more information is saved as:
# //hs_err_pid3360.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

  • By the time the compactor throws the OutOfMemoryError, the LoginLogoutEvent table already holds a very large number of entries. Successive checkpoints show both the entry count and the checkpoint size growing, which indicates that no cleanup task is running:

2021-10-01T00:19:21.809Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3e6e9124-c492-####-####-########d3d, entries(3797270), cpSize(2354490943) bytes at snapshot Token(epoch=3968, sequence=2171963962) in 1177056 ms
.
.
.
2021-10-01T07:10:02.856Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3e6e9124-c492-####-####-########d3d, entries(4761324), cpSize(2950812316) bytes at snapshot Token(epoch=3968, sequence=2172928084) in 5079209 ms <= last successful run
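
To confirm the same growth pattern in another environment, the compactor log can be scanned for the two signatures shown above. The following is a minimal Python sketch, not a supported tool. It assumes the log has been copied to a workstation with Python 3 and that its lines follow the format of the excerpts in this article. The names parse_checkpoints, count_heap_oom and describe_growth are illustrative and do not belong to any NSX utility.

```python
import re
from typing import Dict, Iterable, List, NamedTuple

# Matches the "completed checkpoint" lines quoted in the excerpt above.
CHECKPOINT_RE = re.compile(
    r"^(?P<ts>\S+) .*appendCheckpoint: completed checkpoint for "
    r"(?P<table_id>[^,\s]+), entries\((?P<entries>\d+)\), "
    r"cpSize\((?P<cp_size>\d+)\) bytes .* in (?P<ms>\d+) ms"
)


class Checkpoint(NamedTuple):
    timestamp: str
    table_id: str
    entries: int
    cp_size_bytes: int
    duration_ms: int


def parse_checkpoints(lines: Iterable[str]) -> List[Checkpoint]:
    """Return one Checkpoint per 'completed checkpoint' line, in log order."""
    found = []
    for line in lines:
        m = CHECKPOINT_RE.search(line)
        if m:
            found.append(Checkpoint(m["ts"], m["table_id"], int(m["entries"]),
                                    int(m["cp_size"]), int(m["ms"])))
    return found


def count_heap_oom(lines: Iterable[str]) -> int:
    """Count heap OOM events: lines that begin with the error itself."""
    marker = "java.lang.OutOfMemoryError: Java heap space"
    return sum(1 for line in lines if line.startswith(marker))


def describe_growth(checkpoints: List[Checkpoint]) -> List[str]:
    """Compare each checkpoint with the previous one for the same table ID."""
    last: Dict[str, Checkpoint] = {}
    report = []
    for cp in checkpoints:
        prev = last.get(cp.table_id)
        if prev is not None:
            report.append(
                f"{cp.timestamp} {cp.table_id}: "
                f"entries {prev.entries} -> {cp.entries} "
                f"({cp.entries - prev.entries:+d}), "
                f"cpSize {prev.cp_size_bytes} -> {cp.cp_size_bytes} bytes"
            )
        last[cp.table_id] = cp
    return report
```

One way to use it is to read the copied log with open(path, errors="replace"), keep the lines in a list, and pass that list to each function. An entry count that rises from one checkpoint to the next for the same table ID, followed by one or more heap OOM events, matches the symptoms described in this article.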

 

Environment

VMware NSX-T Data Center 3.x

Cause

No cleanup task runs for the LoginLogoutEvent table, so the table keeps growing. The compactor eventually runs out of Java heap space while checkpointing the table, and each failure writes a heap dump (/image/core/*.hprof). These dumps fill up /image/core.
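
To gauge how much of /image/core the heap dumps occupy, a read-only inventory such as the sketch below may help. It rests on assumptions: Python 3 is available wherever it is run, the directory is readable, and the function names list_heap_dumps and summarize are illustrative only. It deletes nothing. The supported remediation remains the one given under Resolution and Workaround.

```python
import shutil
from pathlib import Path
from typing import List, Tuple


def list_heap_dumps(core_dir: str) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each *.hprof file, largest first."""
    dumps = [(p.name, p.stat().st_size)
             for p in Path(core_dir).glob("*.hprof") if p.is_file()]
    return sorted(dumps, key=lambda item: item[1], reverse=True)


def summarize(core_dir: str) -> str:
    """Summarize heap dump count, their total size, and filesystem usage."""
    dumps = list_heap_dumps(core_dir)
    total_bytes = sum(size for _, size in dumps)
    usage = shutil.disk_usage(core_dir)  # filesystem that holds core_dir
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return (f"{len(dumps)} heap dump(s), {total_bytes} bytes in total; "
            f"filesystem {used_pct:.1f}% used, {usage.free} bytes free")
```

Calling summarize("/image/core") returns a one-line summary, and list_heap_dumps("/image/core") shows which dumps are the largest.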

Resolution

This issue is resolved in VMware NSX-T Data Center 3.1.3.3, available from Support Documents and Downloads (broadcom.com).

Workaround:
To work around this issue, contact Broadcom Support and note this Article ID in the problem description.