NFS datastores experience intermittent APD while retrying synchronous write operations
/var/run/log/vmkernel.log:
2024-07-31T05:04:10.727Z cpu0:2098751)NFS: 6284: Status:No connection. Retrying synchronous write I/O 1 of 25 times
2024-07-31T05:04:10.727Z cpu34:2100050 opID=dc6bca1b)World: 12077: VC opID lxxewe5x-1457912-auto-v8xl-h5:70078357-a5-02-8f-c2d6 maps to vmkernel opID dc6bca1b
2024-07-31T05:04:10.727Z cpu34:2100050 opID=dc6bca1b)SunRPC: 3291: Synchronous RPC cancel for client 0x430eddc01540 IP ##.##.##.##.## proc 1 xid 0xa9446181 attempt 1 of 3
2024-07-31T05:04:10.727Z cpu12:13085698)NFS: 6284: Status:No connection. Retrying synchronous write I/O 1 of 25 times
2024-07-31T05:04:10.727Z cpu54:2098211)StorageApdHandler: 1191: APD start for 0x431acd202b50 [b773ed96-8a62270e]
2024-07-31T05:04:10.727Z cpu11:2097542)StorageApdHandler: 408: APD start event for 0x431acd202b50 [b773ed96-8a62270e]
2024-07-31T05:04:10.727Z cpu11:2097542)StorageApdHandlerEv: 110: Device or filesystem with identifier [b773ed96-8a62270e] has entered the All Paths Down state.
2024-07-31T05:04:29.439Z cpu35:2098211)StorageApdHandler: 1304: APD exit for 0x431acd202b50 [b773ed96-8a62270e]
2024-07-31T05:04:29.439Z cpu11:2097542)StorageApdHandler: 501: APD exit event for 0x431acd202b50 [b773ed96-8a62270e, 0]
2024-07-31T05:04:29.439Z cpu11:2097542)StorageApdHandlerEv: 117: Device or filesystem with identifier [b773ed96-8a62270e] has exited the All Paths Down state.
/var/run/log/vobd.log:
2024-07-31T05:04:10.727Z: [APDCorrelator] 11051063865764us: [vob.storage.apd.start] Device or filesystem with identifier [b773ed96-8a62270e] has entered the All Paths Down state.
2024-07-31T05:04:10.727Z: [APDCorrelator] 11051208970053us: [esx.problem.storage.apd.start] Device or filesystem with identifier [b773ed96-8a62270e] has entered the All Paths Down state.
2024-07-31T05:04:29.439Z: [APDCorrelator] 11051082577551us: [vob.storage.apd.exit] Device or filesystem with identifier [b773ed96-8a62270e] has exited the All Paths Down state.
2024-07-31T05:04:29.439Z: [APDCorrelator] 11051227682174us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [b773ed96-8a62270e] has exited the All Paths Down state.
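To gauge how long each APD condition lasted, the entered/exited events can be paired by identifier. The following is a minimal Python sketch, not a VMware tool. It assumes the StorageApdHandlerEv line format shown in the vmkernel.log excerpt above. The names `apd_windows` and `parse_ts` are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Matches the StorageApdHandlerEv lines as they appear in the vmkernel.log
# excerpt above (assumed format; other ESXi builds may word this differently).
APD_EVENT = re.compile(
    r"^(?P<ts>\S+Z) .*StorageApdHandlerEv: \d+: Device or filesystem with "
    r"identifier \[(?P<ident>[^\]]+)\] has (?P<kind>entered|exited) "
    r"the All Paths Down state"
)


def parse_ts(ts):
    # vmkernel.log timestamps look like 2024-07-31T05:04:10.727Z (UTC).
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def apd_windows(lines):
    """Return (identifier, start, end, seconds) for each completed APD window."""
    open_windows = {}  # identifier -> time the APD state was entered
    windows = []
    for line in lines:
        m = APD_EVENT.match(line)
        if not m:
            continue
        when = parse_ts(m["ts"])
        ident = m["ident"]
        if m["kind"] == "entered":
            open_windows[ident] = when
        elif ident in open_windows:
            start = open_windows.pop(ident)
            windows.append((ident, start, when, (when - start).total_seconds()))
    return windows


# The two event lines from the excerpt above.
sample = [
    "2024-07-31T05:04:10.727Z cpu11:2097542)StorageApdHandlerEv: 110: "
    "Device or filesystem with identifier [b773ed96-8a62270e] has entered "
    "the All Paths Down state.",
    "2024-07-31T05:04:29.439Z cpu11:2097542)StorageApdHandlerEv: 117: "
    "Device or filesystem with identifier [b773ed96-8a62270e] has exited "
    "the All Paths Down state.",
]

for ident, start, end, seconds in apd_windows(sample):
    print(f"{ident}: APD lasted {seconds:.3f} s")
```

For the excerpt above, this reports an APD window of roughly 18.7 seconds for identifier b773ed96-8a62270e. To scan a full log, pass the lines of a copy of vmkernel.log to `apd_windows` instead of `sample`.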
VMware vSphere ESXi 6.x
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x
The "Synchronous RPC cancel for client" message in an NFS context typically indicates a communication problem between the NFS client (the ESXi host) and the NFS server.
It often occurs when the client abruptly cancels a request or the server encounters an unexpected error during the RPC call.
Here, the accompanying "Status:No connection" message shows that the ESXi host is unable to establish a connection to the NFS server for a write operation.
1. Verify network connectivity between the ESXi host and the NFS server using ping (vmkping on the ESXi host) and other network diagnostic tools.
2. Check for network congestion that might be causing packet loss or delays.
3. Ensure that the necessary ports (typically port 111 for RPC and port 2049 for NFS) are open in the firewalls on both the ESXi host and the NFS server.
4. Check the NFS server configuration for any issues like incorrect export paths, permissions, or quotas.
5. Monitor the NFS server for any errors or instability that could be affecting RPC calls.
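As a rough aid for the port check in step 3, the sketch below tests whether TCP connections to ports 111 and 2049 succeed. It is a minimal Python example, not a VMware utility. The function names `tcp_port_open` and `check_nfs_ports` are hypothetical. It only tests reachability from the machine it runs on, so a result from a workstation does not prove that the ESXi host's VMkernel interface can reach the server.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nfs_ports(host, ports=(111, 2049)):
    """Map each port (default: 111 for RPC, 2049 for NFS) to its reachability."""
    return {port: tcp_port_open(host, port) for port in ports}
```

For example, `check_nfs_ports("nfs-server.example.com")` (a placeholder host name) returns a dictionary such as `{111: True, 2049: False}`. A `False` entry points to a firewall rule or a service that is not listening on that port.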
RPC (Remote Procedure Call): A protocol that allows a program on one computer to execute a procedure on another computer.
Synchronous RPC: A type of RPC where the client waits for a response from the server before proceeding.