vSphere Bitfusion Support Guide

Article ID: 322304

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Bitfusion has now reached End of Availability. It will be supported until May 5th, 2025. More information is available here:

End of Availability (EOA) for vSphere Bitfusion (91138) (vmware.com)

VMware vSphere Bitfusion is fully supported by VMware. Bitfusion works with the hardware and software stack of AI/ML environments and applications. In bullet form, this stack might look like the following:

  • Application (often Python)
  • Framework (examples include TensorFlow and PyTorch)
  • AI/ML and other libraries (examples include cuDNN, cuBLAS, CUDA runtime)
  • CUDA driver (libcuda.so)
  • Hardware driver (the NVIDIA driver, nvidia.ko, which on Linux provides an interface called Resource Manager)
  • OS
  • GPU (example: NVIDIA T4)

Bitfusion has a client-server architecture. This page discusses the Bitfusion lifecycle policy for the Bitfusion server and client, and the supported configurations in the AI/ML software and hardware stack. It also gives guidance to help a customer determine whether a problem should be raised with the H/W accelerator vendor or with VMware.

Note: Bitfusion is compatible with CUDA. NVIDIA does not provide a certification suite for CUDA compatibility, but VMware tests Bitfusion with the CUDA samples. Failures tend to occur only in areas that are not used by ML applications, e.g., Graphics Interop. A failure outside those areas may indicate a Bitfusion bug, for which VMware can work on a fix.

Resolution

Bitfusion Lifecycle Policy

VMware will offer 2 years of support for the Bitfusion server from the general availability of a new Release, as per the Support and Subscription Terms and Conditions. Starting with Bitfusion 2.0, every numbered release will be a new Release of Bitfusion, represented by a change to the x.y version (e.g., Bitfusion 2.0>>2.5). In general, newer versions of the Bitfusion server support earlier versions of the Bitfusion client. If the Bitfusion server and client have the same version, VMware will offer 2 years of support for the configuration from the general availability of that release. If the Bitfusion client is an earlier version than the server, VMware will offer support based on the Bitfusion Compatibility and Interoperability guide.
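
As an illustration of the 2-year window, the following is a minimal Python sketch, not an official VMware tool. The function names are hypothetical, the dates in the usage note are placeholders rather than real Bitfusion release dates, and treating the last day of the window as still supported is an assumption. The governing terms remain the Support and Subscription Terms and Conditions.

```python
from datetime import date

SUPPORT_YEARS = 2  # support window stated in the lifecycle policy


def support_end(ga_date: date, years: int = SUPPORT_YEARS) -> date:
    """Return the calendar date `years` after general availability (GA)."""
    try:
        return ga_date.replace(year=ga_date.year + years)
    except ValueError:
        # GA fell on Feb 29 and the target year is not a leap year;
        # fall back to Feb 28 (an assumption, the policy does not cover this).
        return ga_date.replace(year=ga_date.year + years, day=28)


def is_within_lifecycle(ga_date: date, today: date) -> bool:
    """True if `today` falls inside the window, end date included (assumed)."""
    return ga_date <= today <= support_end(ga_date)
```

For example, with a placeholder GA date of `date(2020, 7, 1)`, `support_end` returns `date(2022, 7, 1)`, and `is_within_lifecycle(date(2020, 7, 1), date(2022, 7, 2))` returns `False`.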
 

Supported Configurations in the AI/ML Software and Hardware Stack

VMware will offer support for configurations that are described in the Bitfusion Compatibility and Interoperability guide under the conditions of:

  1. The Bitfusion server is within its 2-year lifecycle.
  2. Each component, e.g., the GPU or the CUDA driver, is within its own lifecycle.


Quick Problem Location

When a customer runs into a problem, the customer usually needs to decide whether it is caused by the H/W accelerator, by Bitfusion, or by the interoperability of components in the stack. The steps below help locate the problem quickly:

  1. Check the Bitfusion Compatibility and Interoperability guide to make sure the configuration is supported.
  2. Test the workload in the same environment but without Bitfusion, i.e., run the workload directly on the H/W accelerator without invoking Bitfusion. If the problem still occurs, it is most likely a problem for which you can get support from your H/W accelerator vendor, e.g., NVIDIA. If the problem disappears, it is a problem for which you can get support from VMware for Bitfusion.
  3. To get support from VMware, check the Bitfusion Lifecycle Policy discussed above to make sure the Bitfusion server, client, and other components are within their lifecycles. If they are not, update to newer versions.
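
The three steps above can be read as a small decision procedure. The Python sketch below is only an illustration of that reading: the `NextStep` names and the `triage` function are hypothetical, and the three boolean inputs are answers the customer has to determine by hand from the guide, the test run, and the lifecycle policy.

```python
from enum import Enum


class NextStep(Enum):
    # Hypothetical labels for the outcomes described in the steps above.
    USE_SUPPORTED_CONFIGURATION = "configuration is not in the compatibility guide"
    CONTACT_ACCELERATOR_VENDOR = "problem reproduces without Bitfusion"
    UPDATE_VERSIONS = "a component is outside its lifecycle"
    CONTACT_VMWARE = "problem only occurs with Bitfusion"


def triage(config_supported: bool,
           fails_without_bitfusion: bool,
           within_lifecycle: bool) -> NextStep:
    """Map the answers from steps 1 to 3 to a suggested next step."""
    # Step 1: the configuration must be listed in the compatibility guide.
    if not config_supported:
        return NextStep.USE_SUPPORTED_CONFIGURATION
    # Step 2: a failure without Bitfusion points to the accelerator vendor.
    if fails_without_bitfusion:
        return NextStep.CONTACT_ACCELERATOR_VENDOR
    # Step 3: VMware support requires components within their lifecycles.
    if not within_lifecycle:
        return NextStep.UPDATE_VERSIONS
    return NextStep.CONTACT_VMWARE
```

For instance, `triage(True, False, True)` returns `NextStep.CONTACT_VMWARE`, while `triage(True, True, True)` returns `NextStep.CONTACT_ACCELERATOR_VENDOR`.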