Frequent "Path redundancy to storage device degraded" events reported on vCenter

Article ID: 391302

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Frequent "Path redundancy to storage device degraded" events are reported in vCenter.
  • The ESXi host may become unresponsive.
  • Virtual machines may hang and fail to complete operations such as power on or power off.

Validation Steps:

  • In the /var/run/log/vobd.log file, entries similar to the following are seen:

YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182941357us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T19:L0 is down. Affected datastores: Unknown.
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182942941us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T19:L1 is down. Affected datastores: Unknown.
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182944535us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T19:L2 is down. Affected datastores: Unknown.
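To quickly tally these events per path, the degraded-path lines can be filtered with standard shell tools. The sketch below runs against embedded sample lines; on an ESXi host, replace the here-document with a grep over /var/run/log/vobd.log as noted in the comment.

```shell
# Tally "path redundancy degraded" events per path.
# The sample entries below are embedded for illustration; on an ESXi host,
# replace the here-document with:
#   grep 'esx.problem.storage.redundancy.degraded' /var/run/log/vobd.log
grep -o 'vmhba[0-9]*:C[0-9]*:T[0-9]*:L[0-9]*' <<'EOF' | sort | uniq -c
[esx.problem.storage.redundancy.degraded] Path vmhba6:C0:T19:L0 is down.
[esx.problem.storage.redundancy.degraded] Path vmhba6:C0:T19:L1 is down.
[esx.problem.storage.redundancy.degraded] Path vmhba6:C0:T19:L2 is down.
EOF
```

Many affected paths sharing one vmhba points at a single flapping adapter rather than an array-side issue.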

 

  • VM power-off requests get stuck and remain queued, as seen in /var/run/log/hostd.log:
2025-06-06T10:05:29.568Z Db(167) Hostd[2102696]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########-########-####-############/VM/VM.vmx opID=########-########-auto-#####-h5:########-32-ff-#### sid=######## user=vpxuser:VSPHERE.LOCAL\Administrator] PowerOff request queued
2025-06-06T10:05:29.570Z In(166) Hostd[2102688]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########-########-####-############/VM/VM.vmx] Upgrade is required for virtual machine, version: 15
2025-06-06T10:05:36.903Z Wa(164) Hostd[2102684]: [Originator@6876 sub=IoTracker] In thread 2102680, fopen("/vmfs/volumes/########-########-####-############/VM/VM.vmx") took over 392 sec.
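The IoTracker warning above reports how long the file operation stalled. A hedged sketch for extracting those stall durations, run here against an embedded sample line; on an ESXi host, grep /var/run/log/hostd.log instead as noted in the comment.

```shell
# Extract IoTracker slow-I/O warnings and the reported stall duration.
# The sample line mirrors the hostd.log entry above; on an ESXi host use:
#   grep 'sub=IoTracker' /var/run/log/hostd.log
sed -n 's/.*sub=IoTracker.*took over \([0-9]*\) sec.*/stalled \1 seconds/p' <<'EOF'
2025-06-06T10:05:36.903Z Wa(164) Hostd[2102684]: [Originator@6876 sub=IoTracker] In thread 2102680, fopen("/vmfs/volumes/####/VM/VM.vmx") took over 392 sec.
EOF
# prints: stalled 392 seconds
```

Repeated multi-minute stalls on .vmx file operations explain why queued power operations never complete.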

Environment

VMware vSphere ESXi 6.x
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

  • The link status of the affected HBA (vmhba) is flapping, causing the paths to drop intermittently.

Cause validation:

  • Verify the paths configured for the affected datastore using the command: esxcfg-mpath -b -d naa.################################

esxcfg-mpath -b -d naa.################################
vmhba4:C0:T1:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba4:C0:T2:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba4:C0:T19:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba4:C0:T0:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba6:C0:T2:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba6:C0:T0:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba6:C0:T1:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##
vmhba6:C0:T19:L0 LUN:0 state:active fc Adapter: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##  Target: WWNN: ##:##:##.##:##:##:##:## WWPN: ##:##:##.##:##:##:##:##

  • In the above example, the datastore is configured with 8 paths: 4 paths use vmhba4 and the other 4 use vmhba6.
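Counting paths per adapter makes this split easier to see on devices with many paths. A sketch using awk against sample path lines; on an ESXi host, pipe the real esxcfg-mpath output through the same filter.

```shell
# Count paths per HBA from esxcfg-mpath -b -d <device> output.
# Sample lines stand in for real output; on an ESXi host use:
#   esxcfg-mpath -b -d naa.<id> | awk -F: '/^vmhba/ {n[$1]++} END {for (h in n) print h, n[h]}'
awk -F: '/^vmhba/ {n[$1]++} END {for (h in n) print h, n[h]}' <<'EOF' | sort
vmhba4:C0:T0:L0 LUN:0 state:active fc
vmhba4:C0:T1:L0 LUN:0 state:active fc
vmhba4:C0:T2:L0 LUN:0 state:active fc
vmhba4:C0:T19:L0 LUN:0 state:active fc
vmhba6:C0:T0:L0 LUN:0 state:active fc
vmhba6:C0:T1:L0 LUN:0 state:active fc
vmhba6:C0:T2:L0 LUN:0 state:active fc
vmhba6:C0:T19:L0 LUN:0 state:active fc
EOF
# prints: vmhba4 4
#         vmhba6 4
```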

  • Verify whether the affected paths all belong to a specific HBA (vmhba).

    /var/run/log/vobd.log

    a) Logs indicating paths going down:
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6183001020us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba6:C0:T19:L0 changed state from on (device ID: eui.################################ )
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182941357us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T19:L0 is down. Affected datastores: Unknown.
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6183001026us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba6:C0:T0:L0 changed state from on (device ID: eui.################################ )
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182941809us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T0:L0 is down. Affected datastores: Unknown.
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6183001029us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba6:C0:T1:L0 changed state from on (device ID: eui.################################ )
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182942181us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T1:L0 is down. Affected datastores: Unknown.
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6183001031us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba6:C0:T2:L0 changed state from on (device ID: eui.################################ )
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6182942542us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device eui.################################ degraded. Path vmhba6:C0:T2:L0 is down. Affected datastores: Unknown.

    b) Logs indicating paths being restored:
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6216943405us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device eui.################################ (Datastores: Unknown) restored. Path vmhba6:C0:T19:L0 is active again.

    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6216943934us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device eui.################################ (Datastores: Unknown) restored. Path vmhba6:C0:T0:L0 is active again.
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6216944373us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device eui.################################ (Datastores: Unknown) restored. Path vmhba6:C0:T1:L0 is active again.
    YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098149]:  [scsiCorrelator] 6216944804us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device eui.################################ (Datastores: Unknown) restored. Path vmhba6:C0:T2:L0 is active again.

    In the above log example, the affected paths are only from vmhba6.

  • The /var/run/log/vmkernel.log file indicates that the vmhba6 status is flapping.

    YYYY-MM-DDTHH:MM.SSSZ In(182) vmkernel: cpu93:2101535)qlnativefc: vmhba6(99:0.1): qlnativefcAsyncEvent:847:NVMe report link down
    YYYY-MM-DDTHH:MM.SSSZ In(182) vmkernel: cpu93:2107156)qlnativefc: vmhba6(99:0.1): qlnativefcAsyncEvent:760:NVMe report link up.

  • Verify the link status events of the affected HBA using the command: esxcli storage san fc events get

    esxcli storage san fc events get

    FC Event Log
    ------------
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
    YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP

    The above output confirms that the vmhba6 link status is flapping, resulting in the paths going down intermittently.
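To quantify the flapping, the LINK DOWN events can be counted per adapter. A sketch against sample FC event log lines; on an ESXi host, pipe the real esxcli storage san fc events get output through the same filter as noted in the comment.

```shell
# Count LINK DOWN events per adapter in the FC event log.
# Sample lines mirror the output above; on an ESXi host use:
#   esxcli storage san fc events get | grep 'LINK DOWN' | grep -o 'vmhba[0-9]*' | sort | uniq -c
grep 'LINK DOWN' <<'EOF' | grep -o 'vmhba[0-9]*' | sort | uniq -c
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK DOWN
YYYY-MM-DD HH:MM:SS.ms [vmhba6] LINK UP
EOF
```

A high count concentrated on one adapter, with no corresponding events on its peer, confirms the fault is local to that HBA, its cabling, or its switch port.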

Resolution

As a workaround, you may reboot the ESXi host to clear the queued operations and restore hostd responsiveness.

Engage the switch and hardware vendors to validate the health of the affected adapters, cabling, and switch ports.

Additional Information