NSX Edge upgrade fails as appliance configuration is invalid

Article ID: 325071

Products

VMware NSX Networking

Issue/Introduction

This article describes how to recover when an NSX Edge upgrade is blocked by an invalid appliance configuration.

Symptoms:
  • An upgrade is available on the affected NSX Edge.
  • The upgrade fails because of an invalid appliance configuration, such as an invalid datastore ID.
For example, in vsm.log:
2018-07-24 12:57:31.442 GMT-00:00 ERROR TaskFrameworkExecutor-27 VcOperationsUtils:2270 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Edge VM 'edge-29-jobdata-8635-0' deployment/installOvf failed for edge Id 'edge-29'
com.vmware.vshield.vsm.inventory.vcoperations.OvfManagerInternalErrorException: nested exception is (vmodl.fault.ManagedObjectNotFound) {
   faultCause = null,
   faultMessage = null,
   obj = ManagedObjectReference: type = Datastore, value = datastore-62, serverGuid = aabaf72b-8ab0-4ab5-b13f-a016baf52fd7
}
[...]
2018-07-24 12:57:31.510 GMT-00:00  INFO TaskFrameworkExecutor-27 VcOperationsUtils:1845 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Phase=Publishing edgeId=edge-29 jobId=jobdata-8635 taskType=configPublishTask timeSinceStart=3016 ms jobStatus=RUNNING message=Failed to deploy NSX Edge appliance edge-29-jobdata-8635-0. User configured placement params are not valid, attempt to install using that failed. About to install using live params. Db params are resourcePool=resgroup-35, datastore=datastore-62, vmFolder =null, Vm haIndex=0.
2018-07-24 12:57:31.510 GMT-00:00 ERROR TaskFrameworkExecutor-27 VcOperationsUtils:1733 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] Job 'jobdata-8635' - Db Network parameters are not valid and Live placement related network mapping is not specified for edge 'edge-29' haIndex '0', so considering this operation as failed. OldVmMoId='vm-246'

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
  • Modifying the NSX Edge configuration to correct the invalid object ID fails, because the Edge must be upgraded before any configuration change.
For example, in the UI: "Appliance has to be upgraded before performing any configuration change."
Via REST API call:
<?xml version="1.0" encoding="UTF-8"?>
<error>
  <errorCode>10220</errorCode>
  <details>Appliance has to be upgraded before performing any configuration change.</details>
  <moduleName>vShield Edge</moduleName>
</error>
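
When scripting against the API, it can help to recognise this specific response before retrying a configuration change. The following is a minimal sketch, assuming the error body has the same shape as the example above; the function name is hypothetical and not part of any NSX tooling.

```python
import xml.etree.ElementTree as ET

# Error code taken from the example response above.
UPGRADE_REQUIRED_CODE = "10220"


def is_upgrade_required_error(body: str) -> bool:
    """Return True if the body is an <error> document carrying errorCode 10220."""
    try:
        # Parse from bytes so the encoding declaration in the prolog is honoured.
        root = ET.fromstring(body.encode("utf-8"))
    except ET.ParseError:
        # Not XML at all, so it cannot be the error we are looking for.
        return False
    if root.tag != "error":
        return False
    return (root.findtext("errorCode") or "").strip() == UPGRADE_REQUIRED_CODE
```

A script could call this on the body of a failed configuration request and, when it returns True, stop retrying and move to the workaround.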


Cause

The invalid appliance configuration prevents the upgrade.
The pending upgrade, in turn, prevents any modification of the configuration, so each condition blocks the fix for the other.

Resolution

This situation does not require a code change.
Apply the workaround described below.

Workaround:
Undeploying the Edge appliances allows the upgrade to proceed.
Appliances can be redeployed post-upgrade.
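
The workaround is driven through the appliances endpoint used in the steps that follow. As a rough sketch, the requests could be built as shown here. It assumes HTTP basic authentication and an `application/xml` content type, neither of which is stated in this article. The manager hostname, credentials, and function name are placeholders.

```python
import base64
import urllib.request


def appliances_request(manager, edge_id, user, password, body=None):
    """Build (without sending) a request for /api/4.0/edges/<edge_id>/appliances.

    With no body this is a GET; with an XML body it becomes a PUT.
    """
    url = f"https://{manager}/api/4.0/edges/{edge_id}/appliances"
    # Assumption: the manager accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}", "Accept": "application/xml"}
    data = None
    method = "GET"
    if body is not None:
        data = body.encode("utf-8")
        headers["Content-Type"] = "application/xml"
        method = "PUT"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

The returned object can be passed to `urllib.request.urlopen`. Certificate trust for the manager is left to the caller.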

Follow these steps, using REST API calls:
Note: Because the appliances will be undeployed, expect network disruption during the operation.
  1. REST API call: GET /api/4.0/edges/edge-2/appliances
  2. Modify <deployAppliances> from true to false:
<deployAppliances>false</deployAppliances>
  3. REST API call with the above modified body: PUT /api/4.0/edges/edge-2/appliances
The content should be similar to:
<?xml version="1.0" encoding="UTF-8"?>
<appliances>
  <appliance>
    <highAvailabilityIndex>0</highAvailabilityIndex>
    <vcUuid>50068fa9-2d61-395d-a11d-1aee8cf90e9d</vcUuid>
    <vmId>vm-651</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>resgroup-346</resourcePoolId>
    <resourcePoolName>PrimaryResourcePool</resourcePoolName>
    <datastoreId>datastore-250</datastoreId>
    <datastoreName>datastore-250</datastoreName>
    <hostId>host-232</hostId>
    <hostName>esxi01.mydomain</hostName>
    <vmHostname>nsx-edge01-0</vmHostname>
    <vmName>nsx-edge01-0</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-2</edgeId>
    <configuredResourcePool>
      <id>resgroup-346</id>
      <name>PrimaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-250</id>
      <isValid>false</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-232</id>
      <name>esxi01.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <appliance>
    <highAvailabilityIndex>1</highAvailabilityIndex>
    <vcUuid>5006d672-fd98-abc4-6383-0be1a977b0db</vcUuid>
    <vmId>vm-745</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>domain-c212</resourcePoolId>
    <resourcePoolName>SecondaryResourcePool</resourcePoolName>
    <datastoreId>datastore-233</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-234</hostId>
    <hostName>esxi02.mydomain</hostName>
    <vmHostname>nsx-edge01-1</vmHostname>
    <vmName>nsx-edge01-1</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-2</edgeId>
    <configuredResourcePool>
      <id>domain-c212</id>
      <name>SecondaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-233</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-234</id>
      <name>esxi02.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <deployAppliances>false</deployAppliances>
</appliances>
  4. Trigger the upgrade.
  5. Repeat steps 1 to 3, this time setting deployAppliances to true and providing a valid datastoreId for both appliances.
The content should be similar to:
<?xml version="1.0" encoding="UTF-8"?>
<appliances>
  <appliance>
    <highAvailabilityIndex>0</highAvailabilityIndex>
    <vcUuid>50068fa9-2d61-395d-a11d-1aee8cf90e9d</vcUuid>
    <vmId>vm-651</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>resgroup-346</resourcePoolId>
    <resourcePoolName>PrimaryResourcePool</resourcePoolName>
    <datastoreId>datastore-233</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-232</hostId>
    <hostName>esxi01.mydomain</hostName>
    <vmHostname>nsx-edge01-0</vmHostname>
    <vmName>nsx-edge01-0</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-2</edgeId>
    <configuredResourcePool>
      <id>resgroup-346</id>
      <name>PrimaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-233</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-232</id>
      <name>esxi01.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <appliance>
    <highAvailabilityIndex>1</highAvailabilityIndex>
    <vcUuid>5006d672-fd98-abc4-6383-0be1a977b0db</vcUuid>
    <vmId>vm-745</vmId>
    <haAdminState>up</haAdminState>
    <resourcePoolId>domain-c212</resourcePoolId>
    <resourcePoolName>SecondaryResourcePool</resourcePoolName>
    <datastoreId>datastore-233</datastoreId>
    <datastoreName>PrimaryDatastore</datastoreName>
    <hostId>host-234</hostId>
    <hostName>esxi02.mydomain</hostName>
    <vmHostname>nsx-edge01-1</vmHostname>
    <vmName>nsx-edge01-1</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>0</reservation>
    </memoryReservation>
    <edgeId>edge-2</edgeId>
    <configuredResourcePool>
      <id>domain-c212</id>
      <name>SecondaryResourcePool</name>
      <isValid>true</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-233</id>
      <name>PrimaryDatastore</name>
      <isValid>true</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-234</id>
      <name>esxi02.mydomain</name>
      <isValid>true</isValid>
    </configuredHost>
  </appliance>
  <deployAppliances>true</deployAppliances>
</appliances>
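
The body edits in the steps above can also be scripted rather than done by hand. The sketch below is illustrative only: it assumes the GET response has the same element layout as the examples in this article, and all three function names are hypothetical. It updates only `<datastoreId>`. Whether the name and `<configuredDataStore>` elements must also match is not stated here, so they are left untouched.

```python
import xml.etree.ElementTree as ET


def _parse(appliances_xml: str) -> ET.Element:
    # Parse from bytes so an encoding declaration in the prolog is accepted.
    return ET.fromstring(appliances_xml.encode("utf-8"))


def invalid_datastores(appliances_xml: str) -> list:
    """List (highAvailabilityIndex, datastore id) pairs flagged isValid=false."""
    found = []
    for appliance in _parse(appliances_xml).findall("appliance"):
        store = appliance.find("configuredDataStore")
        if store is not None and store.findtext("isValid", "").strip() == "false":
            found.append(
                (appliance.findtext("highAvailabilityIndex"), store.findtext("id"))
            )
    return found


def set_deploy_appliances(appliances_xml: str, deploy: bool) -> str:
    """Return the body with <deployAppliances> set to true or false."""
    root = _parse(appliances_xml)
    flag = root.find("deployAppliances")
    if flag is None:
        flag = ET.SubElement(root, "deployAppliances")
    flag.text = "true" if deploy else "false"
    # The result has no XML declaration; prepend one if your client needs it.
    return ET.tostring(root, encoding="unicode")


def set_datastore_id(appliances_xml: str, ha_index: str, datastore_id: str) -> str:
    """Return the body with <datastoreId> replaced for one appliance."""
    root = _parse(appliances_xml)
    for appliance in root.findall("appliance"):
        if appliance.findtext("highAvailabilityIndex") == ha_index:
            node = appliance.find("datastoreId")
            if node is None:
                node = ET.SubElement(appliance, "datastoreId")
            node.text = datastore_id
    return ET.tostring(root, encoding="unicode")
```

For the first pass (steps 1 to 3), `set_deploy_appliances(body, False)` produces the undeploy body. For the second pass (step 5), `invalid_datastores` shows which appliance needs a new datastore before `set_datastore_id` and `set_deploy_appliances(body, True)` are applied.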


Additional Information

Impact/Risks:
This situation does not impact the data plane.
The NSX Edge continues to run normally.