SDDC Initiated upgrade fails on hosts utilizing AMD Pensando DPUs

Article ID: 391443

Updated On:

Products

VMware SDDC Manager

Issue/Introduction

The ESXi upgrade completes successfully, but the host loses network connectivity after the post-upgrade reboot.


The following message is observed in lcm-debug.log:

id":{"value":"com.vmware.vcIntegrity.lifecycle.HostScan.SolutionComponentRemoval"},"originator":{},"time":{"value":"2025-03-17T19:40:44.394Z"},"message":{"fields":{"args":{"list":[{"value":"Pensando(Pensando(4.2.1.3.0-8.0.24533885))"},{"value":"com.vmware.nsxt"}]},"default_message":{"value":"Components Pensando(Pensando(4.2.1.3.0-8.0.24533885)) in Solution com.vmware.nsxt are removed in the image. They will be removed from the host during remediation."},"localized":{"value":{"value":"Components Pensando(Pensando(4.2.1.3.0-8.0.24533885)) in Solution com.vmware.nsxt are removed in the image. They will be removed from the host during remediation."}},"id":{"value":"com.vmware.vcIntegrity.lifecycle.HostScan.SolutionComponentRemoval"},"params":{}},"name":"com.vmware.vapi.std.localizable_message"},"type":{"value":{"value":"WARNING"}},
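To check whether a log contains this warning, a search along the following lines may help. This is a minimal sketch: the function name is hypothetical, and the log path in the usage comment is an assumption, so pass whatever location lcm-debug.log has in your environment.

```shell
#!/bin/sh
# Hypothetical helper: count the log lines that carry the
# SolutionComponentRemoval warning for the Pensando component.
# $1 = path to lcm-debug.log.
count_pensando_removals() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the helper usable under "set -e".
    grep -c 'HostScan\.SolutionComponentRemoval.*Pensando' "$1" || true
}

# Usage (the path is an assumption, adjust as needed):
#   count_pensando_removals /var/log/vmware/vcf/lcm/lcm-debug.log
```

A count greater than zero means the image scan reported that the Pensando component would be removed during remediation.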


Environment

VMware Cloud Foundation 5.x

Cause

This is a known limitation in SDDC Manager: the required NSX Pensando component is incorrectly removed from the vLCM image.

Resolution

Recovering a host from a failed upgrade attempt:

1. Connect to the host's DCUI and attempt to roll back the ESXi bootbank to the prior version.

     KB: Reverting to a previous version of ESXi 6.x, 7.x or 8.x

2. If the rollback does not restore the host's networking:

  • Remove the failed host from SDDC Manager.
  • Reinstall the current version of ESXi on the host.
  • Commission the host into SDDC Manager.
  • Add the host back into the impacted cluster.

Preemptive Workaround:

1. SSH into the vCenter that manages the hosts with the DPUs.
 
2. List the cluster software solutions with the dcli command. (For dcli documentation, see the article "Using DCLI with a Credential Store File".)
 
To find the cluster ID, log in to the vCenter UI, navigate to the cluster, and check the URL in your browser; the cluster ID has the format "domain-c<number>".
 
dcli com vmware esx settings clusters software solutions list --cluster domain-c##
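If you copy the browser URL, the cluster ID can be pulled out of it with a short filter. This is only a sketch: the function name is hypothetical and the URL in the usage comment is an assumed shape, so it relies solely on the "domain-c<number>" pattern described above.

```shell
#!/bin/sh
# Hypothetical helper: print the first "domain-c<number>" token found in a URL.
# $1 = URL copied from the browser while the cluster is selected.
extract_cluster_id() {
    printf '%s\n' "$1" | grep -o 'domain-c[0-9][0-9]*' | head -n 1
}

# Usage (example URL is an assumption about the vCenter UI URL layout):
#   extract_cluster_id 'https://vcenter.example.com/ui/app/cluster;nav=h/urn:vmomi:ClusterComputeResource:domain-c8:<uuid>/summary'
```

The function prints nothing when the URL contains no cluster ID, which makes an empty result easy to test for before calling dcli.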

3. Check whether the NSX Pensando component is missing from the components section. In the affected state, only nsx-lcp-bundle is listed.

Example:

dcli com vmware esx settings clusters software solutions list --cluster domain-c8
 
com.vmware.nsxt:
   components:
      - component: nsx-lcp-bundle
   details:
      components:
         - component: nsx-lcp-bundle
           display_version:
           vendor: VMware
           display_name: NSX LCP Bundle
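The check can also be scripted by filtering the dcli output for the component name. This is a minimal sketch with a hypothetical function name; it assumes dcli can run non-interactively (for example with a credential store) and that the output lists components as "component: <name>", as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: read "solutions list" output on stdin and report
# whether an nsx-pensando component entry is present.
check_pensando_component() {
    if grep -q 'component:[[:space:]]*nsx-pensando'; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage:
#   dcli com vmware esx settings clusters software solutions list --cluster domain-c8 | check_pensando_component
```

A result of "missing" corresponds to the affected state that the remaining steps address.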
 
 
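4. From the vCenter, run a curl command against the NSX Manager associated with the cluster. In the output, find the entry whose compute_collection_id ends with the cluster ID (for example, domain-c8) and record its id value. Repeat for each affected cluster.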
 
  Example:

curl -k -X GET https://nsxt-fqdn-or-ip/api/v1/transport-node-collections -u admin
 
{
  "results" : [ {
    "compute_collection_id" : "########-####-####-####-##############:domain-c9",
    "transport_node_profile_id" : "########-####-####-####-##############",
    "resource_type" : "TransportNodeCollection",
    "id" : "########-####-####-####-##############",
    "display_name" : "########-####-####-####-##############",
    "tags" : [ {
      "scope" : "Created by",
      "tag" : "VCF"
    } ],
    "_system_owned" : false,
    "_protection" : "NOT_PROTECTED",
    "_create_time" : 1741122283820,
    "_create_user" : "admin",
    "_last_modified_time" : 1741122283820,
    "_last_modified_user" : "admin",
    "_revision" : 0
  }, {
    "compute_collection_id" : "########-####-####-####-##############:domain-c8",
    "transport_node_profile_id" : "########-####-####-####-##############",
    "resource_type" : "TransportNodeCollection",
    "id" : "########-####-####-####-##############",
    "display_name" : "########-####-####-####-##############",
    "tags" : [ {
      "scope" : "Created by",
      "tag" : "VCF"
    } ],
    "_system_owned" : false,
    "_protection" : "NOT_PROTECTED",
    "_create_time" : 1721235908452,
    "_create_user" : "admin",
    "_last_modified_time" : 1721235908452,
    "_last_modified_user" : "admin",
    "_revision" : 0
  } ],
  "result_count" : 2
}
 
 
5. Sync the nsx-pensando component to the cluster by triggering a profile realization. 
 
curl -k -X POST "https://nsxt-fqdn-or-ip/api/v1/transport-node-collections/<id-from-step-4>?action=retry_profile_realization" -u admin
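The lookup in step 4 and the request in step 5 can be tied together with jq, which this article already uses elsewhere. The following is a sketch under stated assumptions: both function names are hypothetical, and the filter assumes the response has the shape shown in step 4, where compute_collection_id ends with ":<cluster ID>".

```shell
#!/bin/sh
# Hypothetical helper: read the transport-node-collections JSON on stdin and
# print the "id" of the entry whose compute_collection_id ends with ":<cluster>".
# $1 = cluster ID, for example domain-c8.
collection_id_for_cluster() {
    jq -r --arg cluster "$1" \
        '.results[] | select(.compute_collection_id | endswith(":" + $cluster)) | .id'
}

# Hypothetical helper: build the retry_profile_realization URL used in step 5.
# $1 = NSX Manager FQDN or IP, $2 = transport node collection id.
retry_realization_url() {
    printf 'https://%s/api/v1/transport-node-collections/%s?action=retry_profile_realization\n' "$1" "$2"
}

# Usage:
#   id=$(curl -k -s -X GET https://nsxt-fqdn-or-ip/api/v1/transport-node-collections -u admin | collection_id_for_cluster domain-c8)
#   curl -k -X POST "$(retry_realization_url nsxt-fqdn-or-ip "$id")" -u admin
```

Matching on the ":" separator plus the full cluster ID avoids confusing domain-c8 with IDs such as domain-c18 or domain-c80. Quoting the URL keeps the shell from treating the "?" as a glob character.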
 
6. Verify the NSX Pensando component is present on the cluster(s) by running the dcli command again.
 
dcli com vmware esx settings clusters software solutions list --cluster clusterID
 
Correct result:
 
com.vmware.nsxt:
   components:
      - component: nsx-lcp-bundle
      - component: nsx-pensando
   details:
      components:
         - component: nsx-lcp-bundle
           display_version:
           vendor: VMware
           display_name: NSX LCP Bundle
         - component: nsx-pensando
           display_version:
           vendor: VMware
           display_name: Pensando
      display_version: 4.2.1.3.0-24533885
      display_name: com.vmware.nsxt
 
 
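7. Trigger the upgrade directly from the vCenter UI.

8. After the upgrade, request an SDDC Manager API token and post a version sync for the ESXI resource type. The commands below call localhost, so run them on the SDDC Manager appliance: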
 
 
 
TOKEN=$(curl -X POST -H "Content-Type: application/json" -d '{"username": "<ssoUsername>","password": "<ssoPassword>"}' http://localhost/v1/tokens | jq -r '.accessToken')
 
curl -X POST -H 'Content-type: application/json' -H 'Accept: application/json' -H "Authorization: Bearer $TOKEN" http://localhost/v1/resources/version-syncs -d '{"resourceType":"ESXI"}'
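If the token request fails, jq -r prints the literal string "null" and the second command is sent with "Bearer null". A small guard can stop the sequence early. This is a sketch only: the function name is hypothetical, and it assumes the token response carries an accessToken field, as the command above implies.

```shell
#!/bin/sh
# Hypothetical helper: read the /v1/tokens response on stdin and print the
# access token. "// empty" suppresses output when the field is null or absent,
# and jq -e then exits non-zero because no result was produced.
read_access_token() {
    jq -er '.accessToken // empty'
}

# Usage:
#   TOKEN=$(curl -s -X POST -H "Content-Type: application/json" \
#       -d '{"username": "<ssoUsername>","password": "<ssoPassword>"}' \
#       http://localhost/v1/tokens | read_access_token) \
#       || { echo "token request failed" >&2; exit 1; }
```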