A virtual machine running on a datastore backed by NVMe-TCP may experience a performance issue while taking a snapshot, because a long snapshot stun time can occur.
Here are sample vmware.log entries for a VM with 2 VMDKs:
XXX In(05) vcpu-0 - Checkpoint_Unstun: vm stopped for 2687873 us
XXX In(05) vcpu-0 - CPT: vm was stunned for 2806449 us
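To check a vmware.log for similar entries, a small script along the following lines may help. This is a minimal sketch, not a supported tool: the function name `stun_times` and the 1-second reporting threshold are assumptions chosen for illustration, and the pattern matches only the two message formats shown in the sample above.

```python
import re

# Matches the two stun-related messages shown in the sample log above.
STUN_RE = re.compile(
    r"(Checkpoint_Unstun: vm stopped for|CPT: vm was stunned for) (\d+) us"
)

def stun_times(lines, threshold_s=1.0):
    """Return (message, seconds) for each stun entry at or above threshold_s.

    threshold_s is an illustrative cut-off, not a documented limit.
    """
    hits = []
    for line in lines:
        match = STUN_RE.search(line)
        if match:
            seconds = int(match.group(2)) / 1_000_000  # microseconds to seconds
            if seconds >= threshold_s:
                hits.append((match.group(1), seconds))
    return hits

sample = [
    "XXX In(05) vcpu-0 - Checkpoint_Unstun: vm stopped for 2687873 us",
    "XXX In(05) vcpu-0 - CPT: vm was stunned for 2806449 us",
]

for message, seconds in stun_times(sample):
    print(f"{message} {seconds:.3f} s")
```

With the sample entries above, both lines are reported, since each stun lasted more than 2.6 seconds. To scan a real log, pass an open file object instead of the `sample` list.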
Storage is connected via the NVMe-TCP protocol.
The NVMe-TCP driver sends the 2 PDUs of the FUSED compare command and the FUSED write command to the TCP layer back to back, as 2 packets. Due to the way NVMe-oF targets process the FUSED commands, the TCP layer does not send the PDU of the FUSED write command until it receives the ACK for the FUSED compare command, which can take up to 40 milliseconds. This delay of up to 40 milliseconds causes the performance issue when taking snapshots.
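To get a feel for how this per-command delay can add up, the following sketch computes a worst-case bound. It is purely illustrative: the function name `worst_case_added_delay_ms` and the count of 50 FUSED command pairs are hypothetical, and this article does not state how many FUSED pairs are issued during a snapshot or how much of the stun time they account for.

```python
def worst_case_added_delay_ms(fused_pairs, ack_wait_ms=40):
    """Upper bound on added latency if every FUSED pair waits the full ACK delay.

    ack_wait_ms defaults to the 40 ms maximum described above; real waits
    may be shorter, so this is a ceiling rather than a prediction.
    """
    return fused_pairs * ack_wait_ms

# Hypothetical workload: 50 FUSED compare/write pairs issued one after another.
print(worst_case_added_delay_ms(50), "ms")
```

Under these assumptions, even a few dozen serialized FUSED pairs could add up to seconds of delay, which is the same order of magnitude as the stun times in the sample log.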
Note: The latency issue is fixed in the following version by always sending the 2 PDUs of the FUSED compare command and the FUSED write command in one TCP packet.
ESXi 9.0 is not affected.