In NSX-T 3.2.4, if there are more than 100 EdgeTransportNodes in the system while upgrading NSX Manager, the upgrade can fail

Article ID: 368723

Products

VMware NSX

Issue/Introduction

  • On the NSX Manager, entries similar to the following appear in /var/log/upgrade-coordinator/upgrade-coordinator.log:

2024-05-07T11:06:23.534Z ERROR http-nio-127.0.0.1-7442-exec-1 UcRestClient 69275 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30014" level="ERROR" subcomp="upgrade-coordinator"] Error during GET rest request nsxapi/api/v1/transport-nodes/?node_types=EdgeNode&page_size=100&cursor=E$0104/infra/sites/default/enforcement-points/default/edge-transport-node/<edge_node_id>== , trial 2 , err com.vmware.nsx.management.upgrade.rpcframework.UcRestRpcException: [UC] Error in rest call. url= /nsxapi/api/v1/transport-nodes/?node_types=EdgeNode&page_size=100&cursor=E$0104/infra/sites/default/enforcement-points/default/edge-transport-node/<edge_node_id>== , method= GET , response= {
  "httpStatus" : "BAD_REQUEST",
  "error_code" : 2057,
  "module_name" : "internal-framework",
  "error_message" : "E$0104/infra/sites/default/enforcement-points/default/edge-transport-node/<edge_node_id>== is not a valid cursor."
} , error= 400 : "{<EOL>  "httpStatus" : "BAD_REQUEST",<EOL>  "error_code" : 2057,<EOL>  "module_name" : "internal-framework",<EOL>  "error_message" : "E$0104/infra/sites/default/enforcement-points/default/edge-transport-node/<edge_node_id>== is not a valid cursor."<EOL>}" .
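To confirm that a manager node is hitting this issue, the log can be searched for the invalid-cursor message shown above. The following is a minimal sketch; the function name has_cursor_error is hypothetical, and the default log path is the one given in this article.

```shell
#!/bin/sh
# Sketch only: has_cursor_error is a hypothetical helper name, not an NSX tool.
# It succeeds (exit status 0) when the given log contains the
# "is not a valid cursor" message quoted in this article.
has_cursor_error() {
    log_file="${1:-/var/log/upgrade-coordinator/upgrade-coordinator.log}"
    grep -q 'is not a valid cursor' "$log_file"
}

# Example use on the NSX Manager:
#   if has_cursor_error; then echo "invalid cursor error found"; fi
```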

Environment

VMware NSX-T Data Center
VMware NSX

Cause

Due to issues in the transport-nodes pagination code, the Upgrade Coordinator cannot retrieve the full list of edges when the list spans more than one page of 100 entries. This can cause the Management Plane upgrade to fail.

Resolution

This is a known issue impacting NSX. There is currently no resolution; see the workaround under Additional Information.

Additional Information

Workaround  

There are 2 scenarios: 

Scenario 1 

  • This scenario applies if you already see the error shown in the Issue/Introduction section above and the upgrade to NSX-T 3.2.4 has failed.
  • Execute the "Workaround steps" below on the two upgraded manager nodes first, then proceed with the upgrade of the third manager node.
  • After the third manager node has been upgraded, apply the "Workaround steps" to that node as well.

Scenario 2 

  • If you have not yet started the upgrade to NSX-T 3.2.4 and have more than 100 EdgeTransportNodes, there is no workaround that prevents the issue from occurring.
  • Proceed with the upgrade, and follow the steps in 'Scenario 1' after the upgrade fails.
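To judge in advance whether an environment is affected, the number of EdgeTransportNodes can be compared against the 100-entry page size. The sketch below makes several assumptions: the helper names are hypothetical, the manager address is a placeholder, and the total is assumed to be reported in the result_count field of the response from the transport-nodes call that appears in the log above.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Reads a JSON list response on stdin and prints the value of its
# "result_count" field (assumed to hold the total number of entries).
edge_count_from_json() {
    sed -n 's/.*"result_count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when the given count is above the 100-entry page size
# described in this article.
exceeds_page_size() {
    [ "$1" -gt 100 ]
}

# Example use (manager address and credentials are placeholders):
#   count=$(curl -sk -u admin \
#       "https://<nsx-manager>/api/v1/transport-nodes?node_types=EdgeNode" \
#       | edge_count_from_json)
#   if exceeds_page_size "$count"; then echo "affected: $count edge nodes"; fi
```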

 

Workaround steps (to be executed on NSX Manager): 

  1. Stop the Upgrade Coordinator service using /etc/init.d/upgrade-coordinator stop
  2. Change the values of the following properties in the file /opt/vmware/upgrade-coordinator-tomcat/webapps/upgrade-coordinator/WEB-INF/classes/config.properties
    From
    upgrade.edge.service.edgenodefetch.pagesize=100
    upgrade.edge.service.edgenodefetch.useCustomPageSize=false

    To
    upgrade.edge.service.edgenodefetch.pagesize=800
    upgrade.edge.service.edgenodefetch.useCustomPageSize=true
  3. Start the Upgrade Coordinator service using /etc/init.d/upgrade-coordinator start
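
The workaround steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not a supported script: the function name apply_pagesize_workaround is hypothetical, both properties are assumed to already exist in the file (one per line, as shown under "From"), and sed is assumed to accept the -i.bak form for in-place editing with a backup.

```shell
#!/bin/sh
# Sketch only: apply_pagesize_workaround is a hypothetical helper name.
CONFIG=/opt/vmware/upgrade-coordinator-tomcat/webapps/upgrade-coordinator/WEB-INF/classes/config.properties

apply_pagesize_workaround() {
    config_file="${1:-$CONFIG}"
    # Rewrite the two properties in place; the original file is kept
    # next to it with a .bak suffix.
    sed -i.bak \
        -e 's/^upgrade\.edge\.service\.edgenodefetch\.pagesize=.*/upgrade.edge.service.edgenodefetch.pagesize=800/' \
        -e 's/^upgrade\.edge\.service\.edgenodefetch\.useCustomPageSize=.*/upgrade.edge.service.edgenodefetch.useCustomPageSize=true/' \
        "$config_file"
    # Print the resulting values so they can be checked by eye.
    grep '^upgrade\.edge\.service\.edgenodefetch\.' "$config_file"
}

# On the NSX Manager, in the order given in the steps above:
#   /etc/init.d/upgrade-coordinator stop
#   apply_pagesize_workaround
#   /etc/init.d/upgrade-coordinator start
```

The helper only edits the file; the service is stopped and started separately so that each step can be verified before moving on.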