NSX Edge upgrade fails as appliance configuration is invalid

Article ID: 325071

Products

VMware NSX Data Center for vSphere

Issue/Introduction


  • An upgrade is available for the affected NSX Edge.
  • The upgrade fails because the appliance configuration is invalid, for example because it references an invalid datastore ID.
For example, in vsm.log:
2018-07-24 12:57:31.442 GMT-00:00 ERROR TaskFrameworkExecutor-27 VcOperationsUtils:2270 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Edge VM 'edge-##-jobdata-8635-0' deployment/installOvf failed for edge Id 'edge-##'
com.vmware.vshield.vsm.inventory.vcoperations.OvfManagerInternalErrorException: nested exception is (vmodl.fault.ManagedObjectNotFound) {
   faultCause = null,
   faultMessage = null,
   obj = ManagedObjectReference: type = Datastore, value = datastore-##, serverGuid = aabaf72b-####-####-####-############
}
[...]
2018-07-24 12:57:31.510 GMT-00:00  INFO TaskFrameworkExecutor-27 VcOperationsUtils:1845 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Phase=Publishing edgeId=edge-## jobId=jobdata-8635 taskType=configPublishTask timeSinceStart=3016 ms jobStatus=RUNNING message=Failed to deploy NSX Edge appliance edge-#-jobdata-8635-0. User configured placement params are not valid, attempt to install using that failed. About to install using live params. Db params are resourcePool=resgroup-#, datastore=datastore-##, vmFolder =null, Vm haIndex=0.
2018-07-24 12:57:31.510 GMT-00:00 ERROR TaskFrameworkExecutor-27 VcOperationsUtils:1733 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Job 'jobdata-8635' - Db Network parameters are not valid and Live placement related network mapping is not specified for edge 'edge-##' haIndex '0', so considering this operation as failed. OldVmMoId='vm-#'
 
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
  • Attempts to modify the NSX Edge configuration to correct the invalid object ID fail, because the Edge must be upgraded before any configuration change is accepted.
For example, in the UI: "Appliance has to be upgraded before performing any configuration change."
Via REST API call:
<?xml version="1.0" encoding="UTF-8"?>
<error>
  <errorCode>10220</errorCode>
  <details>Appliance has to be upgraded before performing any configuration change.</details>
  <moduleName>vShield Edge</moduleName>
</error>
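
Both symptoms above can be checked from a script. The following is a minimal Python sketch, not an official tool: the function names are hypothetical, and the regular expression assumes the log line has the same shape as the vsm.log excerpt shown above. It recognizes the errorCode 10220 response and extracts the type and ID of any managed object reported as not found.

```python
import re
import xml.etree.ElementTree as ET

# Error code taken from the sample REST API response in this article.
UPGRADE_REQUIRED_CODE = "10220"

# Assumed to match lines shaped like the vsm.log excerpt, e.g.
# "obj = ManagedObjectReference: type = Datastore, value = datastore-##, ..."
MISSING_OBJECT_RE = re.compile(
    r"ManagedObjectReference:\s*type\s*=\s*(\w+),\s*value\s*=\s*([\w#-]+)"
)


def is_upgrade_required_error(response_body):
    """Return True if the XML error body carries errorCode 10220."""
    root = ET.fromstring(response_body.strip())
    return root.tag == "error" and root.findtext("errorCode") == UPGRADE_REQUIRED_CODE


def missing_objects(log_text):
    """Return (type, id) pairs for managed objects named in the log text."""
    return MISSING_OBJECT_RE.findall(log_text)
```

The IDs returned by `missing_objects` can then be compared with the `datastoreId` values configured on the Edge appliances.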


Cause

The invalid appliance configuration prevents the upgrade.
At the same time, the pending upgrade prevents any modification of the configuration, so the invalid setting cannot be corrected first.

Resolution

Apply the workaround described below.

Workaround:
Undeploying the Edge appliances allows the upgrade to proceed.
The appliances can be redeployed after the upgrade.

Follow the steps below, using REST API calls:
Note: Because the appliances are undeployed, expect network disruption during the operation.

  1. REST API call: GET /api/4.0/edges/edge-#/appliances
  2. Modify <deployAppliances> from true to false:

<deployAppliances>false</deployAppliances>

  3. REST API call with the above modified body: PUT /api/4.0/edges/edge-#/appliances

The content should be similar to:

<?xml version="1.0" encoding="UTF-8"?>
<appliances>
  <appliance>
    <highAvailabilityIndex>0</highAvailabilityIndex>
    <vcUuid>50068fa9-####-####-####-############</vcUuid>
    <vmId>vm-#</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>resgroup-#</resourcePoolId>
    <resourcePoolName>PrimaryResourcePool</resourcePoolName>
    <datastoreId>datastore-#</datastoreId>
    <datastoreName>datastore-#</datastoreName>
    <hostId>host-#</hostId>
    <hostName>esxi#.mydomain</hostName>
    <vmHostname>nsx-edge#-#</vmHostname>
    <vmName>nsx-edge#-#</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-#</edgeId>
    <configuredResourcePool>
      <id>resgroup-#</id>
      <name>PrimaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-#</id>
      <isValid>false</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-#</id>
      <name>esxi#.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <appliance>
    <highAvailabilityIndex>1</highAvailabilityIndex>
    <vcUuid>5006d672-fd98-####-####-########0db</vcUuid>
    <vmId>vm-#</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>domain-#</resourcePoolId>
    <resourcePoolName>SecondaryResourcePool</resourcePoolName>
    <datastoreId>datastore-#</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-#</hostId>
    <hostName>esxi#.mydomain</hostName>
    <vmHostname>nsx-edge#-#</vmHostname>
    <vmName>nsx-edge#-#</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-#</edgeId>
    <configuredResourcePool>
      <id>domain-#</id>
      <name>SecondaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-#</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-#</id>
      <name>esxi#.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <deployAppliances>false</deployAppliances>
</appliances>
  4. Trigger the upgrade.
  5. Repeat steps 1 to 3, this time setting deployAppliances to true and specifying a valid datastoreId for both appliances.

The content should be similar to:

<?xml version="1.0" encoding="UTF-8"?>
<appliances>
  <appliance>
    <highAvailabilityIndex>0</highAvailabilityIndex>
    <vcUuid>50068fa9-2d61-####-####-########e9d</vcUuid>
    <vmId>vm-#</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>resgroup-#</resourcePoolId>
    <resourcePoolName>PrimaryResourcePool</resourcePoolName>
    <datastoreId>datastore-#</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-#</hostId>
    <hostName>esxi#.mydomain</hostName>
    <vmHostname>nsx-edge#-#</vmHostname>
    <vmName>nsx-edge#-#</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-#</edgeId>
    <configuredResourcePool>
      <id>resgroup-#</id>
      <name>PrimaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-#</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-#</id>
      <name>esxi#.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <appliance>
    <highAvailabilityIndex>1</highAvailabilityIndex>
    <vcUuid>5006d672-fd98-####-####-########0db</vcUuid>
    <vmId>vm-#</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>domain-#</resourcePoolId>
    <resourcePoolName>SecondaryResourcePool</resourcePoolName>
    <datastoreId>datastore-#</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-#</hostId>
    <hostName>esxi#.mydomain</hostName>
    <vmHostname>nsx-edge#-#</vmHostname>
    <vmName>nsx-edge#-#</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-#</edgeId>
    <configuredResourcePool>
      <id>domain-#</id>
      <name>SecondaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-#</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-#</id>
      <name>esxi#.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <deployAppliances>true</deployAppliances>
</appliances>
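
The body edits in the workaround can also be scripted rather than done by hand. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the input is the XML returned by GET /api/4.0/edges/edge-#/appliances in the shape shown above, and the HTTP calls themselves are left out because the client and authentication depend on the environment. One function flips `<deployAppliances>`, the other lists placement entries whose `<isValid>` is false so they can be corrected before redeploying.

```python
import xml.etree.ElementTree as ET

# Placement elements that carry an <isValid> flag in the sample bodies above.
PLACEMENT_ELEMENTS = ("configuredResourcePool", "configuredDataStore", "configuredHost")


def set_deploy_appliances(appliances_xml, deploy):
    """Return the <appliances> body with <deployAppliances> set to true/false."""
    root = ET.fromstring(appliances_xml.strip())
    flag = root.find("deployAppliances")
    if flag is None:
        # The sample bodies always contain this element; refuse anything else.
        raise ValueError("no <deployAppliances> element in the appliances body")
    flag.text = "true" if deploy else "false"
    # Note: the returned string has no XML declaration.
    return ET.tostring(root, encoding="unicode")


def invalid_placements(appliances_xml):
    """Return (highAvailabilityIndex, element name, id) for isValid=false entries."""
    root = ET.fromstring(appliances_xml.strip())
    problems = []
    for appliance in root.findall("appliance"):
        ha_index = appliance.findtext("highAvailabilityIndex")
        for name in PLACEMENT_ELEMENTS:
            node = appliance.find(name)
            if node is not None and node.findtext("isValid") == "false":
                problems.append((ha_index, name, node.findtext("id")))
    return problems
```

For the undeploy step, the output of `set_deploy_appliances(body, False)` would be sent as the PUT body. Before redeploying, `invalid_placements(body)` should return an empty list once the datastoreId values have been corrected.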



Additional Information

Impact/Risks:
This situation does not impact the data plane.
The NSX Edge continues to run normally.