In a vSAN 8.0, 8.0U1, or 8.0U2 environment, applications may encounter data inconsistencies or checksum failures when utilizing AVX-512 CPU instructions to process workload data.
Applications that support AVX-512 and may experience these issues include, but are not limited to, the following based on available evidence:
VMware vSAN OSA and ESA on 8.0, 8.0U1 and 8.0U2
Depending on hardware compatibility, applications can use AVX-512 CPU instructions on Intel or AMD processors to speed up checksum calculations. The issue stems from how specific AVX-512 instructions interact with the underlying storage subsystem.
When an application with AVX-512 enabled interacts with vSAN's AVX2 instructions, checksum results may be incorrect. If AVX-512 is unavailable or explicitly disabled in the application's configuration, the application can instead use AVX2 instructions, which do not experience this issue.
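To judge whether an application inside a Linux guest could select AVX-512 at all, it can help to check which instruction-set flags the virtual CPU exposes. The following is a minimal sketch, assuming a Linux x86 guest where `/proc/cpuinfo` lists feature flags such as `avx2` and `avx512f`. The function names `simd_flags` and `report` are illustrative and not part of any VMware or vendor tool.

```python
from pathlib import Path


def simd_flags(cpuinfo_text: str) -> set[str]:
    """Collect AVX2 and AVX-512 feature flags from /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux, each logical CPU has a "flags" line listing its features.
        if sep and key.strip() == "flags":
            found.update(
                f for f in value.split() if f == "avx2" or f.startswith("avx512")
            )
    return found


def report(cpuinfo_text: str) -> str:
    """Summarize whether AVX-512, only AVX2, or neither is exposed."""
    flags = simd_flags(cpuinfo_text)
    if any(f.startswith("avx512") for f in flags):
        return "AVX-512 exposed"
    if "avx2" in flags:
        return "AVX2 only"
    return "neither"


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # only present on Linux
        print(report(cpuinfo.read_text()))
```

A result of "AVX-512 exposed" only means the instructions are available to the guest. Whether a given application actually uses them depends on that application's own configuration.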
This issue is fixed in ESXi 8.0U2c, build 23825572.
VMware strongly recommends that all customers running vSAN 8.0, 8.0U1, or 8.0U2 upgrade to vSAN 8.0U2c or later.
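As a rough aid for auditing hosts, the build number stated above can be compared against a host's version banner. The sketch below assumes the banner has the form `VMware ESXi <version> build-<number>` (the format typically printed by `vmware -v` on an ESXi host). The function names and sample build numbers are illustrative. Treat the result as a heuristic and confirm against the release notes.

```python
import re

FIXED_BUILD = 23825572  # ESXi 8.0U2c build number stated in this article


def parse_esxi_version(banner: str) -> tuple[str, int]:
    """Return (version, build) from a banner like 'VMware ESXi 8.0.2 build-NNNN'."""
    match = re.search(r"ESXi\s+(\d+(?:\.\d+)*)\s+build-(\d+)", banner)
    if match is None:
        raise ValueError(f"unrecognized version banner: {banner!r}")
    return match.group(1), int(match.group(2))


def needs_upgrade(banner: str) -> bool:
    """True if the host is on the 8.0 line with a build older than the fix.

    Assumption: only 8.0.x hosts are in scope, and any 8.0.x build number
    below FIXED_BUILD predates the fix.
    """
    version, build = parse_esxi_version(banner)
    return version.startswith("8.0") and build < FIXED_BUILD


if __name__ == "__main__":
    # Illustrative banners; substitute the output collected from your hosts.
    for sample in (
        "VMware ESXi 8.0.2 build-22380479",
        "VMware ESXi 8.0.2 build-23825572",
    ):
        print(sample, "->", "upgrade needed" if needs_upgrade(sample) else "ok")
```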
Workaround
If an upgrade is not immediately possible, VMware recommends working around this issue by disabling the AVX-512 instruction set at the application level in potentially impacted applications.
For example, the following workaround applies to IBM Db2:
Disable the use of AVX-512 CPU instructions in the Db2 application. Db2 then falls back to AVX2 instructions, which do not create the conditions for the issue to manifest.
Run the following commands:
Note: VMware by Broadcom is not the provider of the potentially impacted software and therefore cannot provide instructions for disabling AVX-512 in all cases. If you are unable to locate a guide to disabling the AVX-512 instruction set, contact your application vendor and reference this Knowledge Base article.