VeloCloud Edge showing "Hardware event" and "Corrected error" in the message logs

Article ID: 385570

Updated On:

Products

VMware VeloCloud SD-WAN

Issue/Introduction

An HA pair may fail over repeatedly as a result of this error.

The message logs show mcelog hardware events that report a corrected error:

./var/log/messages:2024-12-22T12:18:53.047 INFO   kern   kernel:[53356.224312] mce: [Hardware Error]: Machine check events logged
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Hardware event. This is not a software error.
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCE 0
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: CPU 5 BANK 1 TSC 6ae04e474166
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: ADDR 79b1af080
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: TIME 1734869931 Sun Dec 22 12:18:51 2024
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCG status:
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCi status:
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Corrected error
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Error enabled
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCi_ADDR register valid
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Threshold based error status: green
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCA: corrected filtering (some unreported errors in same region)
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Generic CACHE Level-2 Generic Error
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: STATUS 942000400004110a MCGSTATUS 0
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MCGCAP c09 APICID 14 SOCKETID 0
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: MICROCODE 34
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: CPUID Vendor Intel Family 6 Model 95
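
The STATUS value in the record above can be cross-checked against the text lines mcelog printed. The following is a minimal sketch, assuming the architectural flag layout of the IA32_MCi_STATUS register described in the Intel Software Developer's Manual (bits 63 to 57). The function name `decode_mci_status` and the `FLAGS` table are illustrative and not part of any VeloCloud or mcelog tooling.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed from the Intel SDM,
# machine-check architecture chapter). Names here are illustrative.
FLAGS = {
    "VAL": 63,    # register contains a valid error record
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error (0 means the error was corrected)
    "EN": 60,     # error reporting enabled
    "MISCV": 59,  # MCi_MISC register valid
    "ADDRV": 58,  # MCi_ADDR register valid
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status_hex):
    """Return the flag bits and the 16-bit MCA error code of a STATUS value."""
    value = int(status_hex, 16)
    decoded = {name: bool((value >> bit) & 1) for name, bit in FLAGS.items()}
    decoded["MCA_CODE"] = value & 0xFFFF  # low 16 bits: MCA error code
    return decoded


# STATUS value taken from the log excerpt above.
decoded = decode_mci_status("942000400004110a")
for name, flag in decoded.items():
    print(name, hex(flag) if name == "MCA_CODE" else flag)
```

For this record, UC is clear while EN and ADDRV are set, which matches the "Corrected error", "Error enabled" and "MCi_ADDR register valid" lines in the log.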

The MCE entries are marked "Corrected error". An isolated corrected error is normally harmless, but because these errors are reported continuously and across different addresses, the device needs to be RMA'ed.

2024-12-22_19-11-15$ grep -r "Hardware event" | grep 2024-12
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.165:2024-12-01T17:51:25.067 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-07T20:44:13.995 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-08T23:29:56.671 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-16T21:53:41.090 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-17T06:16:56.312 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-22T12:18:51.932 ERR    user   mcelog: Hardware event. This is not a software error.
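
To judge how often the error recurs and whether the reported addresses differ, the grep output can be summarized with a short script. This is a rough sketch, assuming the line layout shown in the excerpts above (`path:timestamp LEVEL facility mcelog: message`). The function name `summarize_mce` is hypothetical, and treating two lines with the same timestamp as one event (for example a record that appears both in the live log and in a rotated copy) is an assumption.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: ADDR 79b1af080
LINE_RE = re.compile(
    r"^(?:(?P<path>[^:\s]+):)?"
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+)\s+\S+\s+\S+\s+mcelog: (?P<msg>.*)$"
)


def summarize_mce(lines):
    """Return (sorted event timestamps, {address: sorted timestamps})."""
    events = set()             # distinct "Hardware event" timestamps
    addrs = defaultdict(set)   # reported ADDR value -> timestamps
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, msg = m["ts"], m["msg"]
        if msg.startswith("Hardware event"):
            events.add(ts)
        elif msg.startswith("ADDR "):
            addrs[msg.split()[1]].add(ts)
    return sorted(events), {a: sorted(t) for a, t in addrs.items()}


# Sample input copied from the log excerpts in this article.
SAMPLE = """\
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: Hardware event. This is not a software error.
./var/log/messages:2024-12-22T12:18:51.932 ERR    user   mcelog: ADDR 79b1af080
./velocloud/log/messages.165:2024-12-01T17:51:25.067 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-07T20:44:13.995 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-08T23:29:56.671 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-16T21:53:41.090 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-17T06:16:56.312 ERR    user   mcelog: Hardware event. This is not a software error.
./velocloud/log/messages.166:2024-12-22T12:18:51.932 ERR    user   mcelog: Hardware event. This is not a software error.
""".splitlines()

events, addresses = summarize_mce(SAMPLE)
print(len(events), "distinct hardware events")
print(len(addresses), "distinct addresses:", sorted(addresses))
```

Run against the full set of log files rather than this sample, a growing list of distinct addresses over several weeks would support the conclusion that the fault is in the hardware.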

Resolution

This is a hardware fault. An RMA is required for the device.