Unable to run prechecks on Edges when the check is triggered from SDDC.
Error: An unexpected exception occurred during NSX-T Pre-upgrade check.
When prechecks are run from the NSX UI, they complete only for hosts and never start for Edges and Managers.
VMware NSX.
In a large-scale setup, during the NSX-T Edge pre-check the Upgrade Coordinator (UC) loads all compute collections and matches them against the compute on which each Edge is deployed. This workflow runs in parallel, loading all Edge clusters at the same time. As a result, the UC runs out of memory (OOM) and the pre-check fails.
This issue is fixed in VMware NSX releases 3.2.4 and 4.2.0.
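As a rough aid for deciding whether a given NSX Manager already carries the fix, the release comparison can be sketched as below. This is only an illustration: the function names are hypothetical, and it assumes that later maintenance releases in the 3.2.x line and later 4.x releases also include the fix, which the article does not state explicitly.

```python
# First fixed release per major line, taken from the resolution statement.
FIRST_FIXED = {3: (3, 2, 4), 4: (4, 2, 0)}


def parse_version(text):
    """Turn a dotted version such as '4.1.2' or '3.2.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """Return True if the version is at or above the first fixed release.

    Assumption: releases after the first fixed one in the same major line
    keep the fix; major lines newer than those listed are treated as fixed.
    """
    version = parse_version(version_text)
    first_fixed = FIRST_FIXED.get(version[0])
    if first_fixed is None:
        return version[0] > max(FIRST_FIXED)
    # Compare only major.minor.maintenance; extra build fields are ignored.
    return version[:3] >= first_fixed
```

For example, `has_fix("4.1.2")` evaluates to `False` under these assumptions, while `has_fix("3.2.4")` evaluates to `True`.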
After triggering the upgrade, if the Edge pre-check fails, the following symptoms are observed:
Log files to validate from NSX Manager:
# /image/core shows the UC heap dump generated on OOM:
-rw------- 1 uuc uuc 112M Sep 28 10:42 uc_oom.hprof.gz
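The presence of the heap dump can also be checked with a small script. The following is a minimal sketch, assuming the dump keeps the `uc_oom.hprof` name prefix shown in the listing above; the function name is hypothetical.

```python
from pathlib import Path


def find_uc_heap_dumps(core_dir="/image/core"):
    """Return UC OOM heap dump files found in core_dir, sorted by name.

    Matches both the raw dump (uc_oom.hprof) and the gzipped one
    (uc_oom.hprof.gz). Returns an empty list if the directory is missing.
    """
    directory = Path(core_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("uc_oom.hprof*") if p.is_file())
```

A non-empty result from `find_uc_heap_dumps()` on the NSX Manager suggests that the UC has hit an OOM condition at least once; compare the file timestamp with the time of the failed pre-check.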
# UC running out of memory during execution of group pre-checks for NSX-T Edge:
2023-09-28T10:42:50.902Z ERROR pool-56-thread-1 UpgradeServiceImpl 15923 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30972" level="ERROR" subcomp="upgrade-coordinator"] Error while running pre-upgrade checks
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space <<< OOM error is evident here
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_342]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_342]
at com.vmware.nsx.management.upgrade.service.impl.UpgradeServiceImpl.awaitExecutionOfChecks(UpgradeServiceImpl.java:1231) ~[libuc-core.jar:?]
at com.vmware.nsx.management.upgrade.service.impl.UpgradeServiceImpl.doExecuteUniversalPreUpgradeChecks(UpgradeServiceImpl.java:1189) ~[libuc-core.jar:?]
at com.vmware.nsx.management.upgrade.service.impl.UpgradeServiceImpl.lambda$executeUniversalPreUpgradeChecks$17(UpgradeServiceImpl.java:1145) ~[libuc-core.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_342]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_342]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_342]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_342]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_342]
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332) ~[?:1.8.0_342]
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) ~[?:1.8.0_342]
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) ~[?:1.8.0_342]
at java.lang.StringBuilder.append(StringBuilder.java:141) ~[?:1.8.0_342]
at com.vmware.nsx.management.upgrade.rpcframework.LoggingRestTemplate$LoggingInputStream.read(LoggingRestTemplate.java:219) ~[libuc-core.jar:?]
at java.io.FilterInputStream.read(FilterInputStream.java:133) ~[?:1.8.0_342]
at java.io.PushbackInputStream.read(PushbackInputStream.java:186) ~[?:1.8.0_342]
at java.io.FilterInputStream.read(FilterInputStream.java:133) ~[?:1.8.0_342]
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._loadMore(UTF8StreamJsonParser.java:257) ~[jackson-core-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._loadMoreGuaranteed(UTF8StreamJsonParser.java:2491) ~[jackson-core-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2574) ~[jackson-core-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2523) ~[jackson-core-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getTextCharacters(UTF8StreamJsonParser.java:486) ~[jackson-core-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.util.TokenBuffer._copyBufferValue(TokenBuffer.java:1220) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.util.TokenBuffer.copyCurrentStructure(TokenBuffer.java:1158) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:117) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithType(BeanDeserializerBase.java:1292) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:218) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:26) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:392) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4730) ~[jackson-databind-2.14.0.jar:2.14.0]
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3730) ~[jackson-databind-2.14.0.jar:2.14.0]
at org.springframework.http.converter.json.AbstractJackson2HttpMessageConverter.readJavaType(AbstractJackson2HttpMessageConverter.java:380) ~[spring-web-5.3.20.jar:5.3.20]
at org.springframework.http.converter.json.AbstractJackson2HttpMessageConverter.read(AbstractJackson2HttpMessageConverter.java:343) ~[spring-web-5.3.20.jar:5.3.20]
at org.springframework.web.client.HttpMessageConverterExtractor.extractData(HttpMessageConverterExtractor.java:105) ~[spring-web-5.3.20.jar:5.3.20]
at org.springframework.web.client.RestTemplate$ResponseEntityResponseExtractor.extractData(RestTemplate.java:1037) ~[spring-web-5.3.20.jar:5.3.20]
at org.springframework.web.client.RestTemplate$ResponseEntityResponseExtractor.extractData(RestTemplate.java:1020) ~[spring-web-5.3.20.jar:5.3.20]
at com.vmware.nsx.management.upgrade.rpcframework.LoggingRestTemplate$LoggingResponseExtractor.extractData(LoggingRestTemplate.java:128) ~[libuc-core.jar:?]
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:778) ~[spring-web-5.3.20.jar:5.3.20]
# /var/log/upgrade-coordinator/upgrade-coordinator-tomcat-wrapper.log indicating the JVM running out of memory during Edge pre-check execution:
INFO | jvm 3 | 2023/09/28 10:42:26 | #
INFO | jvm 3 | 2023/09/28 10:42:26 | # java.lang.OutOfMemoryError: Java heap space
STATUS | wrapper | 2023/09/28 10:42:26 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2023/09/28 10:42:26 | Dumping JVM state.
STATUS | wrapper | 2023/09/28 10:42:26 | The JVM has run out of memory. Restart JVM (Ignoring, already restarting).
INFO | jvm 3 | 2023/09/28 10:42:26 | # -XX:OnOutOfMemoryError="gzip -f /image/core/uc_oom.hprof"
INFO | jvm 3 | 2023/09/28 10:42:26 | # Executing /bin/sh -c "gzip -f /image/core/uc_oom.hprof"...
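To locate these entries in a long wrapper log, the two OOM markers seen in the excerpt above can be searched for programmatically. This is a minimal sketch; the function name is hypothetical and the patterns are taken only from the sample lines shown here, so other OOM variants would not be matched.

```python
import re

# Markers copied from the wrapper log excerpt in this article.
OOM_PATTERNS = [
    re.compile(r"java\.lang\.OutOfMemoryError: Java heap space"),
    re.compile(r"The JVM has run out of memory"),
]


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines containing an OOM marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in OOM_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so an open file handle for `/var/log/upgrade-coordinator/upgrade-coordinator-tomcat-wrapper.log` can be passed directly.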
# desired_state_manajor.json will show a high count of Edge clusters and compute collections, as below:
"/nsxapi/api/v1/edge-clusters": {
"result_count": 45,<<<<<<<<<<<<
"results": [
{
{
"/cm-inventory/api/v1/fabric/compute-collections": {
"cursor": "############-b818-####-acd3-0ff9########:resgroup-126338dDE2MC11dnAwM##############",
"result_count": 2451,<<<<<<<<<<<<
"results": [
{
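The two counts highlighted above can be pulled out of the file with a short script. This is a sketch under an assumption: that the complete file is a single JSON object keyed by API path, as the truncated excerpt suggests. The function name is hypothetical.

```python
import json

EDGE_CLUSTERS = "/nsxapi/api/v1/edge-clusters"
COMPUTE_COLLECTIONS = "/cm-inventory/api/v1/fabric/compute-collections"


def result_counts(state):
    """Map each API path in the parsed file to its result_count, if present."""
    return {
        path: body["result_count"]
        for path, body in state.items()
        if isinstance(body, dict) and "result_count" in body
    }


# Small inline sample mirroring the excerpt; a real file would be loaded
# with json.load() instead.
sample = json.loads(
    '{"%s": {"result_count": 45, "results": []},'
    ' "%s": {"result_count": 2451, "results": []}}'
    % (EDGE_CLUSTERS, COMPUTE_COLLECTIONS)
)
counts = result_counts(sample)
print(counts[EDGE_CLUSTERS], counts[COMPUTE_COLLECTIONS])
```

The article does not give a threshold at which the OOM occurs, so the counts are best read alongside the other symptoms rather than on their own.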