/config partition on NSX Manager nodes may grow to 100% post upgrading to NSX-T 3.0.0

VMware NSX-T Data Center 3.x
This issue is resolved in VMware NSX-T Data Center 3.0.1, available at Broadcom downloads.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
Workaround:
Option #1 (before the upgrade):

1. Call the following API to check for existing AppDiscovery sessions (note that this API is not available starting with NSX-T Data Center 3.0.0):

    GET /api/v1/app-discovery/sessions

    {
      "results" : [ {
        "status" : "FINISHED",
        "reclassification" : "NOT_REQUIRED",
        "start_timestamp" : 1541181098384,
        "end_timestamp" : 1541181148659,
        "id" : "f36e3055-####-####-####-4547e8c38ce0",
        "_protection" : "NOT_PROTECTED"
      } ],
      "result_count" : 1,
      "sort_by" : "start_timestamp",
      "sort_ascending" : false
    }

2. If result_count in the response is greater than 0, proceed with the remaining steps. Otherwise, continue to upgrade to NSX-T 3.0.0 using the normal upgrade procedure.

3. Run the preUpgradeCleanup.py script to clean up all AppDiscovery sessions in the database. The script requires three arguments, as shown below. When run, it fetches all AppDiscovery sessions and deletes the entries. Here is an example of how to run the script:
python preUpgradeCleanup.py --endpoint-ip <nsxmgr-ip> --user-name admin --password <adminpasswd>
Output printed when no sessions are found:

    Fetching AppDiscovery sessions
    /Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
      InsecureRequestWarning,
    Found 0 AppDiscovery sessions.
    Success!
Output printed when some sessions are found:

    Fetching AppDiscovery sessions
    /Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
      InsecureRequestWarning,
    Found 1 AppDiscovery sessions.
    Deleting AppDiscovery Session 600cbf06-####-####-####-c57c84da5c4a
    /Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
      InsecureRequestWarning,
    Deleted AppDiscovery Session 600cbf06-####-####-####-c57c84da5c4a Succesfully
    Success!
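The result_count check that decides whether the cleanup script is needed can be sketched in a few lines of Python. This is a minimal illustration, assuming the response body of GET /api/v1/app-discovery/sessions has already been saved as text; the function name cleanup_needed and the sample variable are hypothetical and are not part of preUpgradeCleanup.py.

```python
import json


def cleanup_needed(response_text: str) -> bool:
    """Return True when the sessions response reports at least one session."""
    body = json.loads(response_text)
    # result_count is the field the procedure says to inspect.
    return int(body.get("result_count", 0)) > 0


# Response shaped like the sample shown in the API step (IDs are masked in the KB).
sample = """
{"results": [{"status": "FINISHED", "reclassification": "NOT_REQUIRED",
              "start_timestamp": 1541181098384, "end_timestamp": 1541181148659,
              "id": "f36e3055-####-####-####-4547e8c38ce0",
              "_protection": "NOT_PROTECTED"}],
 "result_count": 1, "sort_by": "start_timestamp", "sort_ascending": false}
"""

if __name__ == "__main__":
    if cleanup_needed(sample):
        print("Sessions found: run preUpgradeCleanup.py before upgrading")
    else:
        print("No sessions: continue with the normal upgrade procedure")
```

A missing result_count field is treated as zero here, which is a design assumption of the sketch rather than documented API behavior.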
If Option #1 was not exercised prior to the upgrade, run these steps after the upgrade on any one node of the NSX-T Manager cluster. Check /var/log/corfu/corfu-compactor-audit.log to see whether compaction is failing due to an AppProfileInstance deserialization error.
    grep -a "Trim completed" /var/log/corfu/corfu-compactor-audit.log

A "No binding for AppProfileInstance" error indicates the known issue with the AppProfileInstance table not being cleared during upgrade:

    zgrep -a "No binding for type: AppProfileInstance" /var/log/corfu/corfu-compactor-audit*
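For environments where the log check is scripted, the zgrep search above can be approximated in Python. This is a rough sketch, not a replacement for the commands in this article: the function names are hypothetical, and it assumes that rotated audit logs are gzip-compressed and end in .gz.

```python
import glob
import gzip

ERROR_MARKER = "No binding for type: AppProfileInstance"


def count_binding_errors(lines):
    """Count lines that contain the AppProfileInstance deserialization error."""
    return sum(1 for line in lines if ERROR_MARKER in line)


def read_log_lines(path):
    """Yield text lines from a plain or gzip-compressed log file."""
    # Assumption: rotated logs are gzip files with a .gz suffix.
    opener = gzip.open if path.endswith(".gz") else open
    # errors="replace" keeps undecodable bytes from aborting the scan,
    # similar in spirit to grep -a.
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from handle


def scan_audit_logs(pattern="/var/log/corfu/corfu-compactor-audit*"):
    """Return {path: error_count} for every log file matching the pattern."""
    return {path: count_binding_errors(read_log_lines(path))
            for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    for path, hits in scan_audit_logs().items():
        print(f"{path}: {hits} matching line(s)")
```

A non-zero count for any file suggests the compactor has hit the error described above.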
Run df -h /config. If usage is above 85%, do not proceed further and contact Broadcom Support.

Run the following query, changing <node-ip> to the IP address of the node you are currently logged in to. This query will also fail with the same "No binding found for AppProfileInstance" serialization exception. This indicates that there are entries in this table, but the browser cannot display them because the AppProfileInstance class was deleted in NSX-T 3.0.0.

    java -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-browser-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuBrowserMain -hostname <node-ip> -port 9000 printTable -tableName 'nsx-manager AppProfileInstance f405'
Note: The above command may throw an error after copy/paste. If it does, retype the quotes.

Extract jarFiles.zip (attached to this KB), which outputs app-discovery-1.0.jar and context-common-1.0.jar. Copy the two JAR files into /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/ on all three MP nodes.

Run the printTable query again:

    java -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-browser-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuBrowserMain -hostname <node-ip> -port 9000 printTable -tableName 'nsx-manager AppProfileInstance f405'
Remove the entries from the AppProfileInstance table:

    java -Xmx640m -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-editor-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuEditorMain -hostname <node-ip> -port 9000 removeEntries -tableName 'nsx-manager AppProfileInstance f405' -cleanUp
Look for "Trim completed" messages in corfu-compactor-audit.log and check the most recent timestamp:

    grep -a "Trim completed" /var/log/corfu/corfu-compactor-audit.log

Example output:

    2020-04-27T23:39:59.765Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65083185)).
    2020-04-27T23:55:00.560Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65110210)).
    2020-04-28T00:09:59.421Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65137146)).
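Picking out the most recent "Trim completed" timestamp can also be done programmatically. The sketch below is illustrative only: it assumes every relevant line starts with an ISO-8601 UTC timestamp in the format shown in the audit log output in this article, and latest_trim is a hypothetical helper name.

```python
from datetime import datetime, timezone

TRIM_MARKER = "Trim completed"


def latest_trim(lines):
    """Return the newest 'Trim completed' timestamp as an aware datetime, or None."""
    newest = None
    for line in lines:
        if TRIM_MARKER not in line:
            continue
        stamp = line.split(" ", 1)[0]  # e.g. 2020-04-27T23:39:59.765Z
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        except ValueError:
            continue  # skip lines that do not begin with a timestamp
        when = when.replace(tzinfo=timezone.utc)
        if newest is None or when > newest:
            newest = when
    return newest


# Abbreviated copies of the audit log lines shown in this article.
SAMPLE = [
    '2020-04-27T23:39:59.765Z INFO main FrameworkCorfuCompactor - - Trim completed, elapsed(0s)',
    '2020-04-27T23:55:00.560Z INFO main FrameworkCorfuCompactor - - Trim completed, elapsed(0s)',
    '2020-04-28T00:09:59.421Z INFO main FrameworkCorfuCompactor - - Trim completed, elapsed(0s)',
]

if __name__ == "__main__":
    print("Most recent trim:", latest_trim(SAMPLE))
```

Comparing the returned timestamp with the time the cleanup was run gives a quick indication of whether compaction has succeeded since then.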
Make sure there are no more "No binding" errors in corfu-compactor-audit.log after you copied the two JAR files:

    grep -a "No binding for type: AppProfileInstance" /var/log/corfu/corfu-compactor-audit.log

Run the df -h /config command to verify that /config usage is less than 10%.

Remove the two JAR files that were copied earlier:

    rm /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/app-discovery-1.0.jar
    rm /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/context-common-1.0.jar
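The two /config usage thresholds used in this procedure (stop above 85% before the cleanup, expect less than 10% afterwards) can be checked with a short Python sketch. The helper names are hypothetical. Note that shutil.disk_usage reports used space against total space, so the figure may differ slightly from the Use% column printed by df, which accounts for reserved blocks and rounds differently.

```python
import shutil

STOP_ABOVE_PERCENT = 85.0    # from this article: above this, stop and contact support
EXPECT_BELOW_PERCENT = 10.0  # from this article: expected usage after the cleanup


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def safe_to_proceed(percent: float) -> bool:
    """True unless usage is above the 85% stop threshold."""
    return percent <= STOP_ABOVE_PERCENT


def cleanup_effective(percent: float) -> bool:
    """True when usage has dropped below the 10% post-cleanup target."""
    return percent < EXPECT_BELOW_PERCENT


if __name__ == "__main__":
    path = "/config"
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        print(f"{path} is not available; run this on an NSX Manager node")
    else:
        pct = percent_used(usage.total, usage.used)
        print(f"{path}: {pct:.1f}% used")
        print("safe to proceed:", safe_to_proceed(pct))
        print("below post-cleanup target:", cleanup_effective(pct))
```

Treat the result as a convenience check and confirm with df -h /config before acting on it.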
If this article did not help resolve your issue, you can review the following article for further reference: Troubleshooting disk space related issues on NSX Nodes