One NFS datastore becomes inaccessible after ESXi upgrade to 7.0.3.

Article ID: 397522

Products

VMware vSphere ESXi

Issue/Introduction

  • After upgrading ESXi to 7.0.3, one NFS datastore is reported as inaccessible after every host reboot.
  • The datastores in use are NFS v4.1; no NFS v3 datastore is mounted on the impacted host.
  • The datastore that is inaccessible on the freshly rebooted host remains accessible on other hosts that were not rebooted.
  • If the impacted datastore is unmounted from the host and other NFS v4.1 datastores are still mounted, a different NFS v4.1 datastore is impacted in the same way after the next reboot.
  • LACP/LAG is configured in the network.
  • During boot, /var/run/log/vmkernel.log may contain entries similar to the following:

[timestamp] cpu9:2097545)Activating Jumpstart plugin restore-nfs-volumes.
[timestamp] cpu9:2097482)NFS41: NFS41_VSIMountSet:405: Mount server: 1#2.###.##.##1,1#2.###.##.##2,1#2.###.##.##3, port: 2049, path: /oDataStore01, label: oDatastore01, security: 1 user: , options: <none>
[timestamp] cpu9:2097482)StorageApdHandler: 966: APD Handle  Created with lock[StorageApd-0x43173ec03970]
[timestamp] cpu9:2097482)NFS41: NFS41_ConnectionLookup:805: Created new connection for address tcp 1#2.###.##.##1.8.1
[timestamp] cpu11:2097528)Jumpstart plugin vmkapps activated.
[timestamp] cpu9:2097482)NFS41: NFS41_ConnectionLookup:805: Created new connection for address tcp 1#2.###.##.##2.8.1
[timestamp] cpu9:2097482)NFS41: NFS41_ConnectionLookup:805: Created new connection for address tcp 1#2.###.##.##3.8.1
[timestamp] cpu18:2097760)LVM: 17593: One or more LVM devices have been discovered.
[timestamp] cpu1:2097835)vswitch: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Received event UPLINK LINK STATUS, LAG /-512880901, link UNKNOWN, uplink vmnic1/0x8#####a, link UP
[timestamp] cpu1:2097835)vswitch: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Received event LAG STATE, LAG lag1/-512880901, link UP, uplink /0x0, link UNKNOWN
[timestamp] cpu1:2097835)vswitch: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Received event UPLINK LINK STATUS, LAG /-512880901, link UNKNOWN, uplink vmnic0/0x8#####8, link UP
[timestamp] cpu1:2097966)WARNING: SunRPC: 3920: fail all pending calls for client 0x431003401370 IP 1#2.###.##.##1.8.1 (socket disconnected)
[timestamp] cpu23:2097967)WARNING: SunRPC: 3920: fail all pending calls for client 0x4310034018c0 IP 1#2.###.##.##2.8.1 (socket disconnected)
[timestamp] cpu2:2097968)WARNING: SunRPC: 3920: fail all pending calls for client 0x431003401e10 IP 1#2.###.##.##3.8.1 (socket disconnected)
[timestamp] cpu1:2097966)WARNING: SunRPC: 3920: fail all pending calls for client 0x431003401370 IP 1#2.###.##.##1.8.1 (socket disconnected)
[timestamp] cpu23:2097967)WARNING: SunRPC: 3920: fail all pending calls for client 0x4310034018c0 IP 1#2.###.##.##2.8.1 (socket disconnected)
[timestamp] cpu9:2097482)WARNING: NFS41: NFS41FSWaitForCluster:3769: Failed to wait for the cluster to be located: Timeout
[timestamp] cpu9:2097482)WARNING: NFS41: NFS41_FSMount:4888: NFS41FSDoMount failed: Timeout
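The sequence in the log excerpt above (a mount request, SunRPC "socket disconnected" warnings, then a mount timeout) can be searched for in a copy of vmkernel.log. The following is a minimal Python sketch, not a VMware tool. The function name find_failed_boot_mounts and the regular expressions are assumptions derived only from the sample entries shown in this article; log formats may differ between builds.

```python
import re

# Patterns modeled on the sample entries in this article (assumed, not an official format).
MOUNT_RE = re.compile(
    r"NFS41_VSIMountSet:\d+: Mount server: .*?, label: (?P<label>[^,]+),"
)
DISCONNECT_RE = re.compile(
    r"SunRPC: \d+: fail all pending calls for client .*\(socket disconnected\)"
)
TIMEOUT_RE = re.compile(r"NFS41_FSMount:\d+: NFS41FSDoMount failed: Timeout")


def find_failed_boot_mounts(lines):
    """Return (label, disconnect_count) for each mount that ended in a timeout.

    A mount is tracked from its NFS41_VSIMountSet entry until the next
    NFS41FSDoMount timeout; SunRPC disconnect warnings in between are counted.
    """
    failures = []
    current_label = None
    disconnects = 0
    for line in lines:
        mount = MOUNT_RE.search(line)
        if mount:
            # A new mount request starts a new tracking window.
            current_label = mount.group("label")
            disconnects = 0
            continue
        if current_label is None:
            continue
        if DISCONNECT_RE.search(line):
            disconnects += 1
        elif TIMEOUT_RE.search(line):
            failures.append((current_label, disconnects))
            current_label = None
    return failures


# Example usage on a copy of the log file:
# with open("vmkernel.log", errors="replace") as fh:
#     print(find_failed_boot_mounts(fh))
```

A non-empty result with a disconnect count above zero matches the symptom described here, but it does not by itself prove that LAG timing is the cause.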

Environment

VMware vSphere ESXi 7.0.3.

VMware vSphere ESXi 8.0.

VMware vSphere ESXi 8.0.1.

VMware vSphere ESXi 8.0.2.

Cause

With a LAG configuration, the network can take longer to become available during boot. Because the NFS v4.1 datastore mount may be attempted before the LAG connection is up, the mount can fail during ESXi boot.

The mount is retried, but in ESXi 7.0.3 the retry also takes place at an early stage of the boot process, so the first datastore to be mounted may be especially impacted.
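The timing relationship described above can be illustrated with a toy model. This is only a sketch: it does not reproduce ESXi's actual mount or retry logic, and every number in it (attempt times, link-up time) is hypothetical. It shows why a mount whose attempts all happen before the LAG is up fails, while a mount with a later attempt succeeds.

```python
def first_successful_attempt(attempt_times, link_up_time):
    """Return the first attempt time at or after link_up_time, or None.

    Toy model: an attempt succeeds only if the network is already up
    when the attempt is made.
    """
    for t in sorted(attempt_times):
        if t >= link_up_time:
            return t
    return None


# Hypothetical timings, in seconds since boot started.
LINK_UP = 20.0

# First datastore: initial attempt and retry both occur early in boot.
early = first_successful_attempt([5.0, 10.0], LINK_UP)

# Another datastore: one attempt happens after the LAG is up.
late = first_successful_attempt([5.0, 30.0], LINK_UP)

print(early, late)  # prints: None 30.0
```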

Resolution

The issue is fixed in ESXi 8.0.3.
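When checking several hosts, the versions named in this article can be encoded in a small helper. The sketch below is an illustration only: the names LISTED_AFFECTED, FIXED_IN, and classify are hypothetical, and the function knows nothing beyond the versions listed under Environment and Resolution. Any other version is reported as "not listed" rather than guessed.

```python
# Versions taken from the Environment and Resolution sections of this article.
LISTED_AFFECTED = {(7, 0, 3), (8, 0, 0), (8, 0, 1), (8, 0, 2)}
FIXED_IN = (8, 0, 3)


def parse_version(text):
    """Turn a string such as '8.0' or '7.0.3' into a 3-part integer tuple."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad '8.0' to (8, 0, 0)
    return tuple(parts[:3])


def classify(version_text):
    """Classify a version string using only what this article states."""
    version = parse_version(version_text)
    if version in LISTED_AFFECTED:
        return "affected"
    if version == FIXED_IN:
        return "fixed"
    return "not listed"


for v in ("7.0.3", "8.0", "8.0.3", "6.7"):
    print(v, classify(v))
```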

 

Workarounds: