ESXi 8.0 Update 2 upgrade and Firmware update order for vSphere Distributed Service Engine hosts configured with Nvidia BlueField-2 (BF2) DPUs

Article ID: 313260


Products

VMware vSphere ESXi

Issue/Introduction

This article provides guidance for upgrading ESXi to 8.0 Update 2 (or later) and updating Nvidia BlueField-2 DPU firmware on vSphere Distributed Service Engine hosts. These guidelines apply only to Nvidia BF2 DPU based hosts that are being upgraded to 8.0 Update 2 (or later) from ESXi 8.0U1x or earlier 8.0 releases (8.0GA, 8.0a, 8.0b, 8.0c, 8.0U1, 8.0U1a, 8.0U1c, etc.).

Note: These instructions do not apply to hosts where a fresh install of vSphere 8.0U2 is being performed.


Symptoms:

On vSphere Distributed Service Engine hosts configured with an Nvidia BF2 DPU running a BF2 NIC firmware version greater than 24.33.1246 (e.g., 24.36.7506), if the upgrade to 8.0 Update 2 fails, the ESXi rollback can also fail and the host configuration cannot be recovered.


Environment

VMware vSphere ESXi 8.0.2

Cause

In vSphere 8.0U2, the networking driver for the Nvidia BF2 DPU has been enhanced to improve the reliability of the communication channel between ESXi on the x86 host and the DPU. This change requires a specific firmware version and driver compatibility. In the event of an upgrade failure, ESXi on x86 is expected to roll back to the previous ESXi version first, after which the rollback of "ESXi on DPU" is initiated and completed. The rollback implementation relies on the communication channel being available to update the boot bank of "ESXi on DPU" (that is, to roll back ESXi on the DPU).

When the BF2 DPU is running a firmware version greater than 24.33.1246 (e.g., 24.36.7506), the rollback mechanism rolls ESXi on x86 back to the prior ESXi version (e.g., 8.0U1) while "ESXi on DPU" is still running 8.0U2 (yet to be rolled back). In this state, the communication channel uplink of the 8.0U1 ESXi on x86 cannot establish communication with the 8.0U2 "ESXi on DPU", because of limitations of the newer firmware (> 24.33.1246) when operating with two different driver versions. This communication channel failure causes the rollback to fail.

Resolution

For an ESXi 8.0 Update 2 upgrade on vSphere Distributed Service Engine hosts configured with Nvidia BF2 DPUs, it is recommended to perform the ESXi upgrade while the Nvidia BF2 DPU is running NIC firmware version less than or equal to 24.33.1246 and ARM (UEFI & ATF) firmware version 18.2.0.12580.
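
The firmware condition above can be expressed as a small pre-check. The following Python is a minimal sketch, not a VMware or Nvidia tool: the function names are hypothetical, and it assumes the firmware versions have already been read from the DPU by whatever means is available and are passed in as dotted strings. It only illustrates that the NIC firmware comparison should be numeric per field rather than a plain string comparison.

```python
# Hypothetical helper names; the version values are the ones quoted in this article.
MAX_NIC_FW_FOR_UPGRADE = "24.33.1246"
PRE_UPGRADE_ARM_FW = "18.2.0.12580"


def parse_version(text):
    """Turn a dotted version such as '24.33.1246' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def nic_fw_safe_for_upgrade(nic_fw):
    """True if the NIC firmware is less than or equal to 24.33.1246."""
    return parse_version(nic_fw) <= parse_version(MAX_NIC_FW_FOR_UPGRADE)


def arm_fw_is_pre_upgrade(arm_fw):
    """True if the ARM (UEFI & ATF) firmware is still the pre-upgrade version.

    The ARM versions quoted in this article (18.2.0.12580 before the upgrade,
    4.0.2.12722 after) do not sort numerically in release order, so this
    sketch checks equality instead of ordering.
    """
    return arm_fw.strip() == PRE_UPGRADE_ARM_FW


def ready_for_esxi_upgrade(nic_fw, arm_fw):
    """Combine both checks for a host about to be upgraded to 8.0 Update 2."""
    return nic_fw_safe_for_upgrade(nic_fw) and arm_fw_is_pre_upgrade(arm_fw)
```

For example, `ready_for_esxi_upgrade("24.33.1246", "18.2.0.12580")` is true, while a host whose NIC firmware was already updated to 24.36.7506 fails the check.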

Follow the ESXi and DPU firmware upgrade order below, within the same maintenance window:

  1. Place the vSphere Distributed Service Engine host into maintenance mode.
  2. Make sure the BlueField-2 DPU NIC and ARM firmware have not been updated and are still at the versions applicable to 8.0U1x and older releases, e.g., for 8.0U1: BF2 NIC FW 24.33.1246 and BF2 ARM FW 18.2.0.12580.
  3. Update ESXi to 8.0 Update 2.
  4. After a successful ESXi upgrade, update the DPU firmware in the following order:
    1. Update the BF2 DPU ARM (UEFI & ATF) firmware to the version required for 8.0U2, e.g., BF2 ARM FW version 4.0.2.12722.
    2. Update the BF2 DPU NIC firmware to the version required for 8.0U2, e.g., BF2 NIC FW version 24.36.7506.
  5. Take the vSphere Distributed Service Engine host out of maintenance mode.
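
The ordering above can be sketched as a function that, given the current state of a host, returns the next step to perform. This is an illustrative Python sketch only: `HostState`, `next_step` and the step strings are hypothetical names, the state is assumed to be gathered by the administrator, and the "stop" outcome for a host whose NIC firmware was already updated is an assumption of this sketch rather than guidance from this article.

```python
from dataclasses import dataclass

# Example versions quoted in this article.
PRE_NIC_FW_MAX = (24, 33, 1246)
TARGET_ARM_FW = "4.0.2.12722"
TARGET_NIC_FW = "24.36.7506"


@dataclass
class HostState:
    in_maintenance_mode: bool
    esxi_at_80u2: bool  # True once the ESXi upgrade has succeeded
    arm_fw: str         # BF2 ARM (UEFI & ATF) firmware version
    nic_fw: str         # BF2 NIC firmware version


def _ver(text):
    """Parse a dotted version string into a tuple of ints."""
    return tuple(int(p) for p in text.split("."))


def next_step(s):
    """Return the next action in the documented order (hypothetical labels)."""
    fw_done = s.arm_fw == TARGET_ARM_FW and s.nic_fw == TARGET_NIC_FW
    if s.esxi_at_80u2 and fw_done:
        return "exit maintenance mode" if s.in_maintenance_mode else "done"
    if not s.in_maintenance_mode:
        return "enter maintenance mode"
    if not s.esxi_at_80u2:
        if _ver(s.nic_fw) > PRE_NIC_FW_MAX:
            # Assumption: the sketch only flags this state; rollback may fail.
            return "stop: NIC firmware is newer than 24.33.1246"
        return "upgrade ESXi to 8.0 Update 2"
    if s.arm_fw != TARGET_ARM_FW:
        return "update ARM firmware"  # ARM firmware goes before NIC firmware
    return "update NIC firmware"
```

Calling `next_step` repeatedly as the host progresses yields the steps in the same order as the list: maintenance mode, ESXi upgrade, ARM firmware, NIC firmware, exit maintenance mode.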

Note: For a fresh install of ESXi 8.0U2 (or newer) on a host with a DPU, the BF2 DPU firmware must be updated before the ESXi install:
  i. Update BF2 ARM FW to 4.0.2.12722.
  ii. Update BF2 NIC FW to 24.36.7506, then perform the ESXi install.