ESXi PSOD (purple screen of death) #PF Exception 14 in world tq tcpip4 tcp_timer_keep

Article ID: 380520

Products

VMware vSphere ESXi

Issue/Introduction

  • ESXi 8.0.x hosts may encounter a Purple Screen of Death (PSOD) in the tq:tcpip4 world, with tcp_timer_keep in the backtrace.

    /var/run/log/LogEFI.log

YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]: #PF Exception 14 in world 2098398:tq:tcpip4 IP 0x420021b84d25 addr 0x134
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]: PTEs:0x806a8bf023;0x806a8fb023;0x806a8dc023;0x0;
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]:
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]: Module(s) involved in panic: [tcpip4 Built on: MM DD YYYY]
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)cr0=0x8001003d cr2=0x134 cr3=0x60d000 cr4=0x14216c
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)FMS=06/6a/6 uCode=0xd0003e7
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)frame=0x4539a429bde0 ip=0x420021b84d25 err=0x2 rflags=0x10206
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)rax=0x0 rbx=0x41ffd440b860 rcx=0x0
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)rdx=0xf rbp=0x134 rsi=0x43162da120d0
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)rdi=0x134 r8=0x1 r9=0xffffffffffffffff
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)r10=0x0 r11=0xffffffffffffffff r12=0x43162dfd6f40
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)r13=0x43162da32aa0 r14=0x4 r15=0x134
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]: *PCPU14:2098398/tq:tcpip4
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI[2098777]: PCPU 0: SVSVUVUVVVUSVSSSSSVSSSUVVUSSUVVVUVUVUVVVUVUVVVVVVVVVVVUVVVVUVSVV
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)Code start: 0x420021a00000 VMK uptime: HH:MM:SS:##.###
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429bea8:[0x420021b84d25]SPTryLockWork@vmkernel#nover+0x15 stack: 0x42002320955a
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429beb0:[0x4200231989e8]rw_try_wlock@(tcpip4)#<None>+0x39 stack: 0x4539a429bec0
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429bec0:[0x420023209559]tcp_timer_keep@(tcpip4)#<None>+0xba stack: 0xffff
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429bf10:[0x42002319185b]callout_timer@(tcpip4)#<None>+0x1a0 stack: 0x43162dc8cbd8
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429bf60:[0x420021a3a956]VmkTimerQueueWorldFunc@vmkernel#nover+0x38f stack: 0xffffffffffffffff
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429bfe0:[0x4200220d67b2]CpuSched_StartWorld@vmkernel#nover+0xbf stack: 0x0
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)0x4539a429c000:[0x420021b44c6f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
YYYY-MM-DDTHH:MM:SS.#### In(14) LogEFI: cpu14:2098398)base fs=0x0 gs=0x420043800000 Kgs=0x0
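
The signature above can be checked mechanically. The following is a minimal Python sketch, not a VMware tool. It assumes log lines formatted like the excerpt above, and the names analyze, decode_pf_error, SIGNATURE and SAMPLE are hypothetical. It also decodes the err field using the standard x86 page-fault error-code bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Function names from the backtrace in this article, innermost frame first.
SIGNATURE = ("SPTryLockWork", "rw_try_wlock", "tcp_timer_keep", "callout_timer")

PF_RE = re.compile(
    r"#PF Exception 14 in world (\d+):(\S+) IP (0x[0-9a-f]+) addr (0x[0-9a-f]+)"
)
FRAME_RE = re.compile(r"\[0x[0-9a-f]+\](\w+)@")
ERR_RE = re.compile(r"\berr=(0x[0-9a-f]+)")


def decode_pf_error(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(err & 0x1),  # 0 means the page was not present
        "write_access": bool(err & 0x2),  # 1 means the access was a write
        "user_mode": bool(err & 0x4),     # 0 means supervisor (kernel) mode
    }


def analyze(log_text):
    """Extract the fault details and compare the backtrace with SIGNATURE."""
    result = {"world": None, "fault_addr": None, "error": None, "frames": []}
    for line in log_text.splitlines():
        m = PF_RE.search(line)
        if m:
            result["world"] = m.group(2)
            result["fault_addr"] = int(m.group(4), 16)
        m = ERR_RE.search(line)
        if m:
            result["error"] = decode_pf_error(int(m.group(1), 16))
        m = FRAME_RE.search(line)
        if m:
            result["frames"].append(m.group(1))
    result["matches"] = (
        result["world"] is not None
        and tuple(result["frames"][:4]) == SIGNATURE
    )
    return result


# Abbreviated copy of lines from the excerpt above (timestamps omitted).
SAMPLE = """\
LogEFI[2098777]: #PF Exception 14 in world 2098398:tq:tcpip4 IP 0x420021b84d25 addr 0x134
LogEFI: cpu14:2098398)frame=0x4539a429bde0 ip=0x420021b84d25 err=0x2 rflags=0x10206
LogEFI: cpu14:2098398)0x4539a429bea8:[0x420021b84d25]SPTryLockWork@vmkernel#nover+0x15 stack: 0x42002320955a
LogEFI: cpu14:2098398)0x4539a429beb0:[0x4200231989e8]rw_try_wlock@(tcpip4)#<None>+0x39 stack: 0x4539a429bec0
LogEFI: cpu14:2098398)0x4539a429bec0:[0x420023209559]tcp_timer_keep@(tcpip4)#<None>+0xba stack: 0xffff
LogEFI: cpu14:2098398)0x4539a429bf10:[0x42002319185b]callout_timer@(tcpip4)#<None>+0x1a0 stack: 0x43162dc8cbd8
"""

if __name__ == "__main__":
    print(analyze(SAMPLE))
```

For the excerpt in this article, the sketch reports a match, and err=0x2 decodes to a kernel-mode write to a page that is not present, at the address reported in cr2 (0x134).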

Environment

vSphere ESXi 8.0.x

Cause

A rare race condition between the TCP control path and the keepalive timer during a disconnect operation can cause the ESXi host to crash with a PSOD.
The issue occurs under specific network conditions, when the timer and the disconnect logic overlap.
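
The general shape of such an overlap can be illustrated with a small, deterministic Python sketch. This is a generic check-then-use race, not the tcpip4 implementation and not the vendor fix. Every name in it (Socket, Connection, disconnect, keepalive_unsafe, keepalive_safe) is hypothetical, and the hook arguments exist only to simulate the moment at which the disconnect path runs.

```python
import threading


class Connection:
    """Hypothetical per-connection state; locked_by stands in for a lock field."""

    def __init__(self):
        self.locked_by = None


class Socket:
    def __init__(self):
        self.conn = Connection()
        # Guard shared by the timer and the control path in the safe variant.
        self.guard = threading.Lock()


def disconnect(sock):
    """Control path: detach the connection state."""
    with sock.guard:
        sock.conn = None


def keepalive_unsafe(sock, between_check_and_use=lambda: None):
    """Check-then-use with no synchronization between the two steps."""
    if sock.conn is None:
        return "skipped"
    between_check_and_use()        # the disconnect path may run in this window
    sock.conn.locked_by = "timer"  # fails if the state was detached meanwhile
    return "ran"


def keepalive_safe(sock, before_lock=lambda: None):
    """Validate the state while holding the guard that disconnect() also takes."""
    before_lock()                  # a disconnect may still run before the guard
    with sock.guard:
        if sock.conn is None:
            return "skipped"
        sock.conn.locked_by = "timer"
        return "ran"


if __name__ == "__main__":
    s = Socket()
    try:
        keepalive_unsafe(s, lambda: disconnect(s))
    except AttributeError as exc:
        print("unsafe timer failed:", exc)

    s = Socket()
    print("safe timer:", keepalive_safe(s, lambda: disconnect(s)))
```

In the unsafe variant the timer acts on state that the disconnect path has already detached, which in Python surfaces as an AttributeError. In the safe variant the check and the use happen under one guard, so the timer sees the detached state and returns without touching it.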

Resolution

This issue is partially fixed in version 8.0U3e.

The upcoming release 8.0.3P07 rewrites the timer code to fully resolve this defect.

Additional Information