This issue is resolved in VMware NSX-T Data Center 3.0.1, available at VMware Downloads.
1) Run the following API call to check whether any AppDiscovery sessions have been collected.
$ GET /api/v1/app-discovery/sessions (Note: this API is not available starting with NSX-T Data Center 3.0.0.)
{
"results" : [ {
"status" : "FINISHED",
"reclassification" : "NOT_REQUIRED",
"start_timestamp" : 1541181098384,
"end_timestamp" : 1541181148659,
"id" : "f36e3055-6d04-4150-99f4-4547e8c38ce0",
"_protection" : "NOT_PROTECTED"
} ],
"result_count" : 1,
"sort_by" : "start_timestamp",
"sort_ascending" : false
}
If result_count in the response is greater than 0, proceed with the remaining steps. Otherwise, continue the upgrade to NSX-T 3.0.0 using the normal upgrade procedure.
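The decision in step 1 can be sketched in Python as follows. This is only an illustration, not part of the KB attachments: the function name needs_cleanup is hypothetical, and the sketch assumes the response body has already been retrieved and contains the result_count field shown in the sample above.

```python
import json


def needs_cleanup(response_body: str) -> bool:
    """Return True if the sessions response reports at least one session."""
    data = json.loads(response_body)
    # result_count is the field shown in the sample response;
    # a missing field is treated as zero sessions (an assumption).
    return int(data.get("result_count", 0)) > 0


# Trimmed copy of the sample response from step 1.
sample = (
    '{"results": [{"status": "FINISHED", '
    '"id": "f36e3055-6d04-4150-99f4-4547e8c38ce0"}], "result_count": 1}'
)
print(needs_cleanup(sample))  # prints True
```

A True result means the cleanup in step 2 is needed before the upgrade; False means the normal upgrade procedure applies.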
2) The pre-upgrade script (attached to this KB) MUST be run before upgrading to the NSX-T 3.0.0 release. Run the attached preUpgradeCleanup.py script to clean up all AppDiscovery sessions in the database. The script takes the three arguments listed below; when run, it fetches all AppDiscovery sessions and deletes them.
endpoint-ip: IP address of the NSX Manager
user-name: Admin user of the NSX Manager (optional; default is admin)
password: Password of that user
Here is an example of how to run the script:
$ python preUpgradeCleanup.py --endpoint-ip <nsxmgr-ip> --user-name admin --password <adminpasswd>
Output printed when no sessions are found:
Fetching AppDiscovery sessions
/Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
Found 0 AppDiscovery sessions.
Success!
Output printed when one or more sessions are found:
Fetching AppDiscovery sessions
/Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
Found 1 AppDiscovery sessions.
Deleting AppDiscovery Session 600cbf06-e661-4c77-9e86-c57c84da5c4a
/Library/Python/2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.92.166.59'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
Deleted AppDiscovery Session 600cbf06-e661-4c77-9e86-c57c84da5c4a Succesfully
Success!
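For readers who want to see how the three documented arguments fit together, here is a minimal sketch of the argument handling described in step 2. It is not the attached preUpgradeCleanup.py; it only mirrors the documented flags and the documented default for --user-name. Treating --endpoint-ip and --password as mandatory is an assumption, and 192.0.2.10 is a placeholder address.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the three documented script arguments."""
    parser = argparse.ArgumentParser(
        description="Illustration of the documented arguments (not the KB script)"
    )
    parser.add_argument("--endpoint-ip", required=True,
                        help="IP address of the NSX Manager")
    parser.add_argument("--user-name", default="admin",
                        help="Admin user of the NSX Manager (optional)")
    parser.add_argument("--password", required=True,
                        help="Password of that user")
    return parser


# Omitting --user-name falls back to the documented default, admin.
args = build_parser().parse_args(
    ["--endpoint-ip", "192.0.2.10", "--password", "example"]
)
print(args.endpoint_ip, args.user_name)  # prints 192.0.2.10 admin
```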
3) After completing the above steps, proceed with the regular upgrade to the NSX-T 3.0.0 release. If you completed the above steps successfully before the upgrade, you do not have to run Option #2 after the upgrade is completed.
If Option #1 was not exercised before the upgrade, run the following steps after the upgrade on any one node of the NSX-T Manager cluster. Check /var/log/corfu/corfu-compactor-audit.log to see whether compaction is failing due to an AppProfileInstance deserialization error.
1) Run the command below and check whether the last three "Trim completed" messages all have the same sequence number.
$ grep -a "Trim completed" /var/log/corfu/corfu-compactor-audit.log
Note: If the command returns an error after copy/paste, retype the quotation marks.
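The comparison in this step can also be done programmatically. The sketch below is an assumption-laden illustration: the function names are hypothetical, the line format is inferred from the "Trim completed" excerpt shown later in this article, and the "stalled" sample lines are made up for the demonstration.

```python
import re

# Matches the sequence number inside token(Token(epoch=..., sequence=...)).
SEQ_RE = re.compile(r"Trim completed.*sequence=(\d+)")


def trim_sequences(lines):
    """Extract the sequence number from every 'Trim completed' line."""
    seqs = []
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            seqs.append(int(match.group(1)))
    return seqs


def compaction_stalled(lines, window=3):
    """True if the last `window` trims all report the same sequence number."""
    recent = trim_sequences(lines)[-window:]
    return len(recent) == window and len(set(recent)) == 1


# Hypothetical log lines in which the sequence number never advances.
stalled = [
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=100)).",
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=100)).",
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=100)).",
]
print(compaction_stalled(stalled))  # prints True
```

On a manager node, the lines would come from /var/log/corfu/corfu-compactor-audit.log, opened with errors="replace" so that any binary bytes in the log do not stop the read.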
2) Check whether the logs contain "No binding for type: AppProfileInstance" messages. These indicate the known issue in which the AppProfileInstance table is not cleared during the upgrade.
$ zgrep -a "No binding for type: AppProfileInstance" /var/log/corfu/corfu-compactor-audit*
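If you prefer a count over raw grep output, the same search can be sketched in Python. This is an illustration only: the function name count_matches is hypothetical, and the sketch assumes that rotated logs ending in .gz are gzip-compressed and that all other matching files are plain text.

```python
import glob
import gzip

NEEDLE = b"No binding for type: AppProfileInstance"


def count_matches(pattern):
    """Count lines containing NEEDLE across plain and gzip-compressed logs."""
    total = 0
    for path in sorted(glob.glob(pattern)):
        # Assumption: compressed rotations carry a .gz suffix.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rb") as handle:
            total += sum(1 for line in handle if NEEDLE in line)
    return total


# Returns 0 when no file matches the pattern or no line contains the message.
print(count_matches("/var/log/corfu/corfu-compactor-audit*"))
```

Reading in binary mode plays the same role as the -a flag in the command above: binary bytes in the log do not interrupt the search.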
3) Run df -h /config. If usage is above 85%, do not proceed further; engage VMware Support via a Support Request.
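The 85% gate in this step can be expressed as a small check. The sketch below is an illustration with hypothetical function names; it uses Python's shutil.disk_usage, whose used/total ratio can differ slightly from the Use% column that df prints, so treat values close to the limit with caution.

```python
import shutil


def usage_percent(path):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def safe_to_proceed(percent_used, limit=85.0):
    """True if usage is not above the limit stated in this step."""
    return percent_used <= limit


print(safe_to_proceed(91.0))  # prints False
print(safe_to_proceed(40.0))  # prints True
```

On a manager node, the value to test would be usage_percent("/config").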
4) Query the database on any node of the cluster to confirm that there are entries in the AppProfileInstance table. Change <node-ip> to the IP address of the node that you are currently logged in to. This query is also expected to fail with the same "No binding" serialization exception for the AppProfileInstance table. The failure indicates that the table contains entries, but the browser cannot display them because the AppProfileInstance class was removed in NSX-T 3.0.0.
$ java -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-browser-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuBrowserMain -hostname <node-ip> -port 9000 printTable -tableName 'nsx-manager AppProfileInstance f405'
Note: If the command returns an error after copy/paste, retype the quotation marks.
5) Unzip jarFiles.zip (attached to this KB), which contains app-discovery-1.0.jar and context-common-1.0.jar. Copy the two JAR files into /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/ on all three MP nodes.
6) Run the next steps on any one MP node of the cluster.
Query the database to see the contents of the AppProfileInstance table. Now that the missing class files are on the classpath, all the data in this table should be visible.
$ java -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-browser-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuBrowserMain -hostname <node-ip> -port 9000 printTable -tableName 'nsx-manager AppProfileInstance f405'
7) Delete all the entries in the AppProfileInstance table (change <node-ip> to the IP address of the node you are logged in to).
$ java -Xmx640m -Dlog4j.configurationFile=/opt/vmware/corfu-tools/corfu-browser-log4j2.xml -cp "/opt/vmware/corfu-tools/corfu-editor-1.0-jar-with-dependencies.jar:/opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/*" com.vmware.nsx.management.tools.corfu.CorfuEditorMain -hostname <node-ip> -port 9000 removeEntries -tableName 'nsx-manager AppProfileInstance f405' -cleanUp
8) Run the printTable query from step 6 again to make sure that the AppProfileInstance table is empty.
9) Wait for 3 compaction cycles to ensure that the data was actually trimmed. To verify that the above steps succeeded, look at the most recent "Trim completed" messages in corfu-compactor-audit.log.
The sequence number at the end of each log line should be higher than the one on the previous line.
$ grep -a "Trim completed" /var/log/corfu/corfu-compactor-audit.log
2020-04-27T23:39:59.765Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65083185)).
2020-04-27T23:55:00.560Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65110210)).
2020-04-28T00:09:59.421Z INFO main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Trim completed, elapsed(0s), appliance(nsx-manager), token(Token(epoch=423, sequence=65137146)).
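As a worked check of the excerpt above, the following sketch extracts the sequence numbers and confirms that each one is higher than the last. The function name and the shortened log lines are for illustration only; the sequence values are the ones from the three lines above.

```python
import re


def strictly_increasing(values):
    """True if every value is greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Shortened copies of the three log lines above (token part only).
excerpt = [
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=65083185)).",
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=65110210)).",
    "Trim completed, elapsed(0s), token(Token(epoch=423, sequence=65137146)).",
]
sequences = [int(re.search(r"sequence=(\d+)", line).group(1)) for line in excerpt]
print(sequences)                       # prints [65083185, 65110210, 65137146]
print(strictly_increasing(sequences))  # prints True
```

A repeated or decreasing value would make the check return False, which matches the failure pattern described at the start of this procedure.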
10) You should no longer see these exceptions in corfu-compactor-audit.log after copying the two JAR files in step 5 above.
$ grep -a "No binding for type: AppProfileInstance" /var/log/corfu/corfu-compactor-audit.log
11) Run df -h /config to verify that /config usage is below 10%.
12) Remove the copied JAR files from all nodes.
$ rm /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/app-discovery-1.0.jar
$ rm /opt/vmware/proton-tomcat/webapps/nsxapi/WEB-INF/lib/context-common-1.0.jar