Applications using AVX-512 instructions in a vSAN environment may report data consistency errors
Article ID: 367589
Updated On:
Products
VMware vSAN
Issue/Introduction
If an application in a vSAN environment uses AVX-512 instructions, it may produce incorrect results, or report data consistency errors in data calculated from those results.
For example, an IBM Db2 application reports checksum errors when running on a vSAN 8.x cluster with AVX-512-compatible CPU hardware.
This article provides information regarding a potential risk of application failure.
Environment
vSAN 8.x
Cause
This issue occurs due to an interaction between the advanced CPU instruction set used by applications and the underlying storage subsystem.
Databases such as Db2 will, by default, use AVX-512 CPU instructions to accelerate their checksum calculations if the underlying virtual and physical hardware supports them. This interacts with the use of AVX2 instructions in vSAN, causing incorrect checksums to be generated. If AVX-512 is not available, or is explicitly disabled in the database configuration, Db2 uses AVX2 instead, which does not exhibit the issue.
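To judge whether a given virtual machine is exposed to this condition, it can help to check whether the guest sees AVX-512 at all. The following is a minimal sketch, assuming a Linux guest that lists CPU feature flags in /proc/cpuinfo; the function name has_avx512 is hypothetical and not part of any VMware or IBM tooling.

```shell
#!/bin/sh
# Succeed (exit status 0) if the CPU flags in the given cpuinfo file include
# any AVX-512 feature (avx512f, avx512bw, ...).
# Assumption: Linux guest; defaults to /proc/cpuinfo when no file is given.
has_avx512() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # Look only at "flags" lines and match a flag that starts with "avx512".
    grep -qE '^flags[[:space:]]*:.*[[:space:]]avx512' "$cpuinfo_file"
}
```

Inside the guest, `has_avx512 && echo "AVX-512 is visible to this guest"` prints the message only when at least one AVX-512 flag is present. A guest without these flags would be expected to use the AVX2 path described above.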
Resolution
This issue is fixed in ESXi 8.0U2c, build 23825572.
VMware by Broadcom strongly recommends that all customers using vSAN 8.x upgrade to vSphere ESXi 8.0U2c.
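As a rough first-pass check of whether a host already runs the fixed build, the build number reported by the host can be compared with 23825572. The sketch below assumes a version string in the form printed by `vmware -v` on the ESXi shell (for example "VMware ESXi 8.0.2 build-23825572"), and assumes that within the 8.x line a higher build number means a later release. The function names are hypothetical, and the release notes remain the authoritative source.

```shell
#!/bin/sh
# Build number named in this article as containing the fix (ESXi 8.0U2c).
FIXED_BUILD=23825572

# Extract the numeric build from a string such as
# "VMware ESXi 8.0.2 build-23825572".
extract_build() {
    printf '%s\n' "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# Succeed only if the string reports ESXi 8.x at or above FIXED_BUILD.
# Assumption: build numbers are only comparable within the same release line,
# so anything that is not ESXi 8.x is reported as "not confirmed".
is_fixed_8x_build() {
    case "$1" in
        *"ESXi 8."*) ;;
        *) return 1 ;;
    esac
    build=$(extract_build "$1")
    [ -n "$build" ] && [ "$build" -ge "$FIXED_BUILD" ]
}
```

On a host, `is_fixed_8x_build "$(vmware -v)" && echo "build includes the fix"` prints the message only when the check passes.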
Additional Information
Workaround
The workaround below applies only to Db2 applications.
Disable the use of AVX-512 CPU instructions in the Db2 application. This causes Db2 to fall back to AVX2 instructions, which do not create the conditions for the issue to manifest.
Run the following commands:
db2set DB2_CPU_FEATURE_DISABLE=AVX512
db2stop
db2start
db2set -all
## Verify that the output includes DB2_CPU_FEATURE_DISABLE=AVX512
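The verification step can also be scripted rather than checked by eye. The following is a minimal sketch, assuming that `db2set -all` prints one setting per line, optionally preceded by a scope tag such as "[i]"; the function name avx512_disabled_in_registry is hypothetical.

```shell
#!/bin/sh
# Read `db2set -all` output on stdin and succeed only if the workaround
# variable is present with the value AVX512.
# Assumption: one setting per line, e.g. "[i] DB2_CPU_FEATURE_DISABLE=AVX512".
avx512_disabled_in_registry() {
    grep -qE '(^|[[:space:]])DB2_CPU_FEATURE_DISABLE=AVX512[[:space:]]*$'
}
```

Run as the Db2 instance owner, `db2set -all | avx512_disabled_in_registry && echo "AVX-512 is disabled for Db2"` prints the message only when the setting is in place.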