Latency issues observed for vSAN DB VMs.

Article ID: 426346

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • Archive database logs are being transferred between two database servers located in different vCenter Servers and clusters.

  • The replication process is taking an unusually long time to complete.

  • More than 1000 files are queued and waiting to be applied due to high latency.

Validation Steps:

The network proactive test shows bandwidth issues between ESXi hosts.

  Path: Cluster > Monitor tab > vSAN > Proactive Tests > Network Performance Test

 

Environment

VMware vSAN 8.x

Cause

  • Network instability between ESXi hosts resulted in packet loss and bandwidth degradation. This increased latency, which directly impacted database VM performance and delayed archive log replication between the database servers.

Cause validation 

  • Packet loss observed consistently between multiple hosts
  • Continuous ping test command (using the vSAN network adapter): vmkping -I vmk2 ##### -d -s 8972 -c 1000 -i 0.01

    ESXi host 3 to 4

      8980 bytes from ######: icmp_seq=993 ttl=64 time=0.092 ms
      8980 bytes from #####: icmp_seq=994 ttl=64 time=0.090 ms
      8980 bytes from ######: icmp_seq=996 ttl=64 time=0.126 ms

      --- ###### ping statistics ---
      1000 packets transmitted, 886 packets received, 11.4% packet loss
      round-trip min/avg/max = 0.081/0.101/0.474 ms

    ESXi host 2 to 4

      8980 bytes from ######: icmp_seq=993 ttl=64 time=0.098 ms
      8980 bytes from ######: icmp_seq=994 ttl=64 time=0.104 ms

      --- ###### ping statistics ---
      1000 packets transmitted, 876 packets received, 11.4% packet loss
      round-trip min/avg/max = 0.081/0.109/3.959 ms
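When the ping test is repeated across many host pairs, it can help to extract the loss figure from each run automatically. The following is a minimal sketch, assuming the vmkping output has been captured as text and that the summary line has the format shown above. The function names and the threshold parameter are hypothetical and are not part of any VMware tool.

```python
import re

# Matches the summary line at the end of a ping run, e.g.
# "1000 packets transmitted, 886 packets received, 11.4% packet loss"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, ([\d.]+)% packet loss"
)


def parse_ping_summary(output: str) -> dict:
    """Extract sent/received counts and recompute the loss percentage."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent = int(match.group(1))
    received = int(match.group(2))
    # Recompute from the counts instead of trusting the printed percentage.
    loss_pct = round(100.0 * (sent - received) / sent, 1) if sent else 0.0
    return {"sent": sent, "received": received, "loss_pct": loss_pct}


def exceeds_loss_threshold(output: str, max_loss_pct: float = 0.0) -> bool:
    """Return True when the measured loss is above the allowed maximum.

    max_loss_pct is an illustrative default (hypothetical), not a
    vendor-defined limit.
    """
    return parse_ping_summary(output)["loss_pct"] > max_loss_pct


if __name__ == "__main__":
    sample = (
        "--- ###### ping statistics ---\n"
        "1000 packets transmitted, 886 packets received, 11.4% packet loss\n"
        "round-trip min/avg/max = 0.081/0.101/0.474 ms\n"
    )
    print(parse_ping_summary(sample))
    print(exceeds_loss_threshold(sample))
```

Recomputing the percentage from the transmitted and received counts gives a cross-check on the printed value. For the two runs above, 114 and 124 lost packets out of 1000 correspond to 11.4% and 12.4%.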

 

Resolution

 

  • Engage the network team to perform a detailed investigation of the physical network infrastructure.

  • Verify the following on upstream switches and interfaces:

    • CRC and physical layer errors

    • Packet drops and error counters

    • Link health and speed/duplex consistency

    • MTU configuration consistency across ESXi hosts and network devices

  • Resolve any identified network issues to restore stable connectivity and reduce latency for vSAN database workloads.
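For the MTU consistency check, note that the 8972-byte payload used in the ping test, combined with the don't-fragment flag (-d), corresponds to a 9000-byte MTU: the payload plus the 8-byte ICMP header and a 20-byte IPv4 header (without options) adds up to 9000. The 8980 bytes reported per reply is the payload plus the ICMP header. A small sketch of that arithmetic, with hypothetical function names, for choosing the -s value for a given MTU:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request/reply header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("MTU is smaller than the IPv4 and ICMP headers")
    return payload


def mtu_for_payload(payload: int) -> int:
    """IPv4 packet size produced by a ping with the given payload size."""
    return payload + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: use -s {max_ping_payload(mtu)} with -d")
```

If a ping with the payload computed for the configured MTU fails while a smaller payload succeeds, some device in the path is likely configured with a smaller MTU.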