NFSv4.1 client I/O operations fail with an 'Input/Output error'.
The SEQUENCE operation returns NFS4ERR_RETRY_UNCACHED_REP.
This can be confirmed by searching for NFS4ERR_RETRY_UNCACHED_REP in the NFS server logs.
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[none] [svc_11] 1332 :nfs_rpc_decode_request :DISP :0x7f7788000a80 fd 21 context 0x7f7758000a80
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[none] [svc_11] 192 :get_gsh_client :HT CACHE :client_mgr cache hit slot 0
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[::ffff:10.170.110.60] [svc_11] 839 :nfs_rpc_process_request :DISP :Request from ::ffff:10.x.x.60 for Program 100003, Version 4, Function 1 has xid=2334580089
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[::ffff:10.170.110.60] [svc_11] 687 :nfs4_Compound :NFS4 :COMPOUND: There are 4 operations, res = 0x7f7758000ff0, tag = NO TAG
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[::ffff:10.170.110.60] [svc_11] 801 :nfs4_Compound :NFS4 :Request 0: opcode 53 is OP_SEQUENCE
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[::ffff:10.170.110.60] [svc_11] 81 :nfs4_op_sequence :SESSIONS :SEQUENCE session=0x7f7750001620
2020-03-29T07:00:47Z : epoch 5e803ebe : h10-170-115-245 : ganesha.nfsd-33[::ffff:10.170.110.60] [svc_11] 1112 :nfs4_Compound :NFS4 :End status = NFS4ERR_RETRY_UNCACHED_REP lastindex = 0 <-- ERROR for SEQUENCE op
Note: The preceding log excerpts are only examples. Dates, times, host names, and addresses vary depending on your environment.
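The log search described above can be sketched in a few lines of Python. This is only an illustrative helper, not a VMware tool: the function names are hypothetical, and the location of the NFS server log in your environment is an assumption you must supply.

```python
from typing import Iterable, List

# Error name as it appears in the "End status" line of the server log.
ERROR_NAME = "NFS4ERR_RETRY_UNCACHED_REP"


def find_uncached_retry_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that mentions NFS4ERR_RETRY_UNCACHED_REP."""
    return [line.rstrip("\n") for line in lines if ERROR_NAME in line]


def scan_log_file(path: str) -> List[str]:
    """Scan one log file; 'path' is whatever log location applies to you."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_uncached_retry_lines(log_file)
```

Calling scan_log_file() with the path to a collected NFS server log returns the matching lines; an empty list suggests this article does not apply to the failure being investigated.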
VMware vSAN 7.x
This issue occurs because of an NFS client issue in which the client does not increment the sequence number for subsequent operations.
RFC 5661 (NFSv4.1) states that the client should not reuse a sequence number for a new request, regardless of whether the cache flag was set in the previous request. When the client sends a request that reuses the previous sequence ID, the NFS server in vSAN File Services returns NFS4ERR_RETRY_UNCACHED_REP, which the client translates to an Input/Output error.
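The sequence check described above can be illustrated with a simplified model of a single session slot. This is a sketch of the rule in RFC 5661, not the actual vSAN File Services or NFS-Ganesha code; the Slot class and handle_sequence function are hypothetical names, and 32-bit sequence wraparound is deliberately ignored.

```python
from dataclasses import dataclass
from typing import Optional

NFS4_OK = "NFS4_OK"
NFS4ERR_RETRY_UNCACHED_REP = "NFS4ERR_RETRY_UNCACHED_REP"
NFS4ERR_SEQ_MISORDERED = "NFS4ERR_SEQ_MISORDERED"


@dataclass
class Slot:
    """Simplified per-slot state kept by the server (hypothetical model)."""
    last_seqid: int = 0                 # last sequence ID accepted on this slot
    cached_reply: Optional[str] = None  # reply kept only if caching was requested


def handle_sequence(slot: Slot, seqid: int, cachethis: bool) -> str:
    """Return the status a server would give for a SEQUENCE op on this slot."""
    if seqid == slot.last_seqid + 1:
        # New request: accept it and cache the reply only when asked to.
        slot.last_seqid = seqid
        slot.cached_reply = NFS4_OK if cachethis else None
        return NFS4_OK
    if seqid == slot.last_seqid:
        # Same sequence ID again: treated as a retry of the previous request.
        if slot.cached_reply is not None:
            return slot.cached_reply
        return NFS4ERR_RETRY_UNCACHED_REP
    # Any other sequence ID is out of order.
    return NFS4ERR_SEQ_MISORDERED
```

In this model, a client that sends sequence ID 1 without requesting caching and then sends sequence ID 1 again receives NFS4ERR_RETRY_UNCACHED_REP on the second request, which matches the behavior seen in the log excerpt.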
To resolve this issue, retry the failed operation from the client.
If the issue persists after the retry, remount the NFS share on the client.
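For applications that access the share programmatically, the retry guidance above could be wrapped in a small helper. This is a minimal sketch under stated assumptions: retry_on_eio is a hypothetical name, the attempt count and delay are arbitrary defaults, and remounting the share remains a manual step if every attempt fails.

```python
import errno
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_eio(operation: Callable[[], T],
                 attempts: int = 3,
                 delay_seconds: float = 1.0) -> T:
    """Run 'operation', retrying only when it fails with EIO."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            # Re-raise anything that is not an Input/Output error, and
            # re-raise EIO once the last attempt has been used up.
            if exc.errno != errno.EIO or attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

For example, retry_on_eio(lambda: open(path, "rb").read()) retries a read up to three times, where path is a file on the mounted share. If the final attempt still raises EIO, remount the share as described above.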