LS remain in "In progress" state and K8s pods take a long time to reach Running state

Article ID: 306217

Products

VMware NSX

Issue/Introduction

In a churn environment, pods start taking a long time to reach the Running state.
The NSX-T version is 3.0.x.
Many LS remain in the "In progress" state for a long time.
The following error messages appear in the /var/log/nsxapi* logs:

2020-09-10T03:51:51.268Z WARN RealizationServiceMaintenanceExecutor-0 VersionLockedObject - SyncObjectUnsafe[CorfuTable[c3e0]@26373221+-1] to 26370094 failed org.corfudb.runtime.exceptions.NoRollbackException: Can't roll back due to put@26373221 but need 26370094 so can't undo
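
To check quickly whether a node is logging this warning, the log files can be searched for the exception class name. The following is a minimal sketch, not an official diagnostic: the function name count_norollback is hypothetical, and the only string it relies on is the exception name shown in the message above.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention the Corfu
# NoRollbackException across one or more log files.
count_norollback() {
    # cat concatenates all files so that grep prints a single total.
    # -c prints the number of matching lines, -F treats the pattern as a fixed string.
    # "|| true" keeps the exit status at 0 when there are no matches.
    cat -- "$@" 2>/dev/null \
        | grep -cF 'org.corfudb.runtime.exceptions.NoRollbackException' || true
}
```

Example usage on a node: count_norollback /var/log/nsxapi*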

Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x

Resolution

This issue is fixed in NSX-T 3.1.0 and later.

Workaround:

On a UA node, change to the directory that contains the JAR file:

cd /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib

Take a backup of the file nsx-realization-1.0.jar.
Copy nsx-realization-1.0.jar to another Linux machine.
On a Linux machine where the "jar" utility is installed, extract the nsxrealization-config.properties file from nsx-realization-1.0.jar:

jar xf nsx-realization-1.0.jar META-INF/spring/nsxrealization-config.properties

Edit the extracted file to update the property:

vi META-INF/spring/nsxrealization-config.properties

Change the value of the property realization.realizationstate.maintenance.apiBatchSize to 100 and save the file.
Update the JAR file with the modified properties file:

jar uf nsx-realization-1.0.jar META-INF/spring/nsxrealization-config.properties
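
The manual vi edit in the steps above can also be done non-interactively, between the "jar xf" and "jar uf" commands. The following is a sketch under one assumption: the property is stored as a plain key=value line in the properties file, which this article does not show. The function name set_api_batch_size is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set the apiBatchSize property in an extracted
# properties file. Assumes the file stores the property as "key=value".
set_api_batch_size() {
    file=$1
    size=$2
    tmp="$file.tmp"
    # Dots in the key are escaped so that they match literally.
    # Everything after "=" on the matching line is replaced by the new value.
    sed 's/^\(realization\.realizationstate\.maintenance\.apiBatchSize\)[[:space:]]*=.*/\1='"$size"'/' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Example usage: set_api_batch_size META-INF/spring/nsxrealization-config.properties 100

Afterwards, confirm the new value with grep apiBatchSize on the same file before updating the JAR.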

Copy the modified JAR file to /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib on all 3 UA nodes:

scp nsx-realization-1.0.jar root@<UA-IP>:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/
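
Because the same scp command is repeated for each UA node, it can be wrapped in a small loop. This is a sketch only: the function name copy_jar_to_nodes and the SCP override variable are hypothetical, and the node addresses must be supplied by the operator.

```shell
#!/bin/sh
# Hypothetical helper: copy the modified JAR to each UA node given
# as an argument. Stops at the first failed copy.
copy_jar_to_nodes() {
    jar=$1
    shift
    dest=/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib
    for node in "$@"; do
        # SCP can be set to another command (for example "echo") for a dry run.
        ${SCP:-scp} "$jar" "root@$node:$dest/" || return 1
    done
}
```

Example usage: copy_jar_to_nodes nsx-realization-1.0.jar <UA-IP-1> <UA-IP-2> <UA-IP-3>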

On all 3 UA nodes, check that the file was copied properly to the directory:

cd /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/

ls -la | grep nsx-realization-1.0.jar
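
The ls check confirms that the file is present and shows its size and timestamp, but not that its contents match the modified JAR. As an optional extra check, two copies of the file can be compared by checksum. This sketch assumes the sha256sum utility is available; the function name same_checksum is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two files have the same SHA-256 checksum.
# Returns 2 if either file is not readable.
same_checksum() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}
```

Alternatively, run sha256sum nsx-realization-1.0.jar on the Linux machine and on each UA node and compare the printed values.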

Restart the proton service on all UA nodes, one node at a time:

systemctl restart proton

Wait for the cluster to become stable. Check the cluster status using nsxcli:

get cluster status
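
When restarting the nodes one at a time, the "wait until stable" step can be expressed as a generic polling loop. The sketch below is deliberately generic: the function name wait_until is hypothetical, and the command that decides whether the cluster is stable is left to the operator, because this article does not show the output of "get cluster status".

```shell
#!/bin/sh
# Hypothetical helper: run a check command repeatedly until it succeeds.
# Arguments: maximum attempts, delay in seconds, then the check command.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Succeed as soon as the check command exits with status 0.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    # The check never succeeded within the allowed attempts.
    return 1
}
```

Example usage, where cluster_is_stable stands for an operator-supplied check: wait_until 30 10 cluster_is_stable. Restart the next node only after the check succeeds.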
