Using the same MAC address for vmnic0 across multiple ESXi hosts in the same cluster will lead to VMFS and vSAN (VMFSD) metadata corruption

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
The symptoms are numerous and varied because the underlying filesystems are being corrupted. The most commonly reported ones are:

- VMs crashing or becoming inaccessible/orphaned
- VM deployments failing
- ESXi hosts failing with a purple diagnostic screen (PSOD)
- ESXi host log entries similar to the following in /var/log/vmkernel.log:

HBX: 2622: Mismatch between in-memory and on-disk HB at offset: 0x3ec000 on vol '64f92da4-########-####-##########98'.
HBX: 2625: 'a42df964-####-####-####-##########98': HB at offset 4112384 - In-memory version:
[HB state abcdef02 offset 4112384 gen 63355 stampUS 6275570981326 uuid 649ef214-########-####-##########98 jrnl <FB 0> drv 24.82 lockImpl 4 ip 10.88.136.17]
HBX: 2626: 'a42df964-####-####-####-##########98': HB at offset 4112384 - On-disk version:
[HB state abcdef02 offset 4112384 gen 63357 stampUS 6275592283591 uuid 649ef217-########-####-##########98 jrnl <FB 58720267> drv 24.82 lockImpl 4 ip 10.88.136.21]
HBX: 2627: 'a42df964-####-####-####-##########98': HB slot (offset: 0x3ec000) was freed/reacquired by another host on vol '64f92da4-########-####-##########98'.
FS3: 623: Failed to retrieve device name, and it might be unreliable
WARNING: FS3: 629: VMFS volume a42df964-####-####-####-##########98/64f92da4-########-####-##########98 on a42df964-####-####-####-##########98 has been detected corrupted
FS3: 632: While filing a PR, please report the names of all hosts that attach to this LUN, tests that were running on them,
FS3: 637: and upload the dump by `objtool open -u a42df964-####-####-####-##########98; voma -m vmfsd -f dump -d /dev/vsan/a42df964-####-####-####-##########98 -D X`
FS3: 658: where X is the dump file name on a DIFFERENT volume
FS3: 661: Head extent name not resolved: Inappropriate ioctl for device
FS3: 471: FS3HB 2882400002 4112384 63355 6275570981326 649ef214-########-####-##########98
FS3: 476: 0 24 82 4 0 00000000-00000000-0000-000000000000
FS3: 486: 49 48 46 56 56 46 49 51 54 46 49 55 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
FS3: 488: 0 0 0 0 0
WARNING: HBX: 2696: Failed to cleanup VMFS heartbeat on volume64f92da4-########-####-##########98: Failure
WARNING: Vol3: 4336: Error closing the volume: . Eviction fails: Failure

Environment

VMware vSphere ESXi 6.7
VMware ESXi 6.7.x
VMware vSphere ESXi 8.0
VMware vSphere ESXi 7.0

Cause

An ESXi host generates a UUID at each boot for use with VMFS and VMFSD (vSAN) operations. Here is an example: 649ef214-########-####-##########98

The UUID is 32 hexadecimal characters with the following layout: (Date/Time in Unix Time)-(CPU TimeStamp Counter)-(Randomly Generated Number)-(MAC Address of vmnic0)
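To make the layout concrete, here is a minimal Python sketch that splits such a UUID into its four fields. It assumes the 8-8-4-12 grouping seen in the log excerpt and that the fields appear in the order listed above. The function name `parse_host_uuid` and the sample UUID are hypothetical (the UUIDs in this article are masked), and the script is only an illustration, not a VMware tool.

```python
import re

# 8-8-4-12 hexadecimal grouping, as seen in the masked UUIDs in this article.
UUID_RE = re.compile(r"^([0-9a-f]{8})-([0-9a-f]{8})-([0-9a-f]{4})-([0-9a-f]{12})$")


def parse_host_uuid(uuid):
    """Split a host UUID into the four fields described in this article.

    Assumption: the fields are printed in the order
    (Unix time)-(CPU timestamp counter)-(random number)-(vmnic0 MAC).
    """
    m = UUID_RE.match(uuid.strip().lower())
    if m is None:
        raise ValueError(f"not an 8-8-4-12 hexadecimal UUID: {uuid!r}")
    time_hex, tsc_hex, rand_hex, mac_hex = m.groups()
    return {
        "boot_epoch": int(time_hex, 16),  # seconds since the Unix epoch
        "tsc": tsc_hex,
        "random": rand_hex,
        # Re-insert colons so the last field reads like a MAC address.
        "vmnic0_mac": ":".join(mac_hex[i:i + 2] for i in range(0, 12, 2)),
    }


# Hypothetical, fully specified UUID used only for illustration.
fields = parse_host_uuid("649ef214-0a1b2c3d-4e5f-001122334498")
print(fields["vmnic0_mac"])  # 00:11:22:33:44:98
```

Two hosts whose UUIDs yield the same `vmnic0_mac` value are the problem case this article describes.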

For VMFS/VMFSD operations, this UUID is written to the metadata. When an ESXi host checks that metadata (heartbeats, journals, file locks, etc.), it validates against the last field of the UUID, which is the MAC address of vmnic0, to determine who the owner is.

When more than one ESXi host in a cluster uses the same MAC address for vmnic0, those hosts start to grab each other's locks and replay each other's filesystem journals. This is what causes the data corruption.

In the log example shared in the Symptoms section, we can see that two hosts (identified by IPs 10.88.136.17 and 10.88.136.21) are using the same MAC address in the UUID that they use for metadata operations:

[HB state abcdef02 offset 4112384 gen 63355 stampUS 6275570981326 uuid 649ef214-########-####-##########98 jrnl <FB 0> drv 24.82 lockImpl 4 ip 10.88.136.17]
[HB state abcdef02 offset 4112384 gen 63357 stampUS 6275592283591 uuid 649ef217-########-####-##########98 jrnl <FB 58720267> drv 24.82 lockImpl 4 ip 10.88.136.21]
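The comparison above can be automated. The following Python sketch scans heartbeat records in the bracketed format shown in the log excerpt and reports any final UUID field that appears with more than one host IP. The regular expression is an assumption based only on the two sample records; other ESXi builds may print the record differently. The helper names and the unmasked UUID values in the sample are hypothetical.

```python
import re
from collections import defaultdict

# Assumed record format, taken from the two heartbeat lines in this article.
HB_RE = re.compile(r"\[HB state \S+ .*?uuid (?P<uuid>\S+) .*?ip (?P<ip>[\d.]+)\]")


def mac_field(uuid):
    """Return the last dash-separated field of the UUID (the vmnic0 MAC)."""
    return uuid.rsplit("-", 1)[-1].lower()


def hosts_sharing_mac_field(lines):
    """Map each MAC field seen with more than one host IP to those IPs."""
    owners = defaultdict(set)
    for line in lines:
        m = HB_RE.search(line)
        if m:
            owners[mac_field(m["uuid"])].add(m["ip"])
    return {mac: sorted(ips) for mac, ips in owners.items() if len(ips) > 1}


# Hypothetical records: same structure as the excerpt, made-up unmasked UUIDs.
sample = [
    "[HB state abcdef02 offset 4112384 gen 63355 stampUS 6275570981326 "
    "uuid 649ef214-0a1b2c3d-4e5f-001122334498 jrnl <FB 0> drv 24.82 "
    "lockImpl 4 ip 10.88.136.17]",
    "[HB state abcdef02 offset 4112384 gen 63357 stampUS 6275592283591 "
    "uuid 649ef217-1b2c3d4e-5f6a-001122334498 jrnl <FB 58720267> drv 24.82 "
    "lockImpl 4 ip 10.88.136.21]",
]
print(hosts_sharing_mac_field(sample))
```

Any entry in the returned dictionary points to hosts that should be checked for a duplicated vmnic0 MAC address.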

Resolution

While MAC addresses are supposed to be unique, there are ways to override the MAC address given to a physical card. Typically, this is observed with blade server systems, as some of them allow you to override the hardware MAC address (HPE Blade systems, Cisco UCS, etc.). We are now seeing this with non-blade systems as well, through centralized server management software (Dell iDRAC is one example) where a server's profile can be cloned and applied to one or more physical hosts.

You should never have duplicate MAC addresses in the same environment. Beyond being a best practice, this becomes an even larger problem when vmnic0 on one ESXi host has the same MAC address as vmnic0 on other ESXi hosts in the same cluster, because this MAC address, which is supposed to be unique, is effectively used for VMFS/VMFSD metadata operations. ESXi hosts with the same MAC address for vmnic0 will try to replay other ESXi hosts' journals and will believe that they own file locks, VMFS heartbeats, or journals that actually belong to other hosts. This is an unsupported configuration and will lead to filesystem corruption.

The only resolution to this problem is to ensure the physical NICs in ESXi hosts are using unique MAC addresses. Refrain from cloning and applying server profiles from other physical servers at the server/blade level to avoid a situation where MAC addresses from other servers are cloned and used in the same environment.
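As a way to audit an environment for this condition, the Python sketch below takes an inventory of vmnic0 MAC addresses per host and reports any address used by more than one host. How the inventory is collected is left open; one option is to record the MAC address column for vmnic0 from `esxcli network nic list` on each host. The host names, MAC values, and function names here are hypothetical.

```python
import re
from collections import defaultdict


def normalize_mac(mac):
    """Normalize a MAC address to lowercase, colon-separated form."""
    digits = re.sub(r"[:.\-]", "", mac.strip().lower())
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_duplicate_vmnic0_macs(inventory):
    """Return {mac: [hosts]} for every vmnic0 MAC used by more than one host.

    `inventory` maps a host name to that host's vmnic0 MAC address.
    """
    hosts_by_mac = defaultdict(list)
    for host, mac in inventory.items():
        hosts_by_mac[normalize_mac(mac)].append(host)
    return {
        mac: sorted(hosts)
        for mac, hosts in hosts_by_mac.items()
        if len(hosts) > 1
    }


# Hypothetical inventory; esx01 and esx02 share a MAC written in two styles.
inventory = {
    "esx01": "00:11:22:33:44:98",
    "esx02": "00-11-22-33-44-98",
    "esx03": "00:11:22:33:44:99",
}
print(find_duplicate_vmnic0_macs(inventory))
```

An empty result means no two hosts in the inventory share a vmnic0 MAC address. Any host listed in the result needs its MAC address corrected at the server or blade level.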