Service Engines may crash while sending a large DNS over TCP response (around 16 KB) to the client, when an acknowledgement is received in a lossy network environment.
The crash stack trace will include the following functions in its initial frames:
#0 panic
#2 tcp_output_full
#3 tcp_output
Sample StackTrace(s):
#0 panic (fmt=fmt@entry=0x557639c4ebb3 "%s: beyond sb")
#1 0x00005576398539dd in sbsndptr (sb=sb@entry=0x55763eb57c14, off=off@entry=12330, len=len@entry=1370, moff=moff@entry=0x7ffdee72a02c) at
#2 0x00005576397a0ca1 in tcp_output_full (tp=0x55763e843ba0) at
#3 0x00005576397a2a9d in tcp_output (tp=0x55763e843ba0) at
#4 0x0000557639770646 in sosend_generic (so=<optimized out>, addr=<optimized out>, top=<optimized out>, control=0x0, flags=<optimized out>) at
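The signature above can be checked mechanically against a saved stack trace. The following is a minimal sketch, assuming the trace is plain text with one `#N` frame per line as in the sample; the helper names (`frame_functions`, `matches_signature`) are hypothetical and are not part of any Avi tooling.

```python
import re

# Function names and panic text taken from the signature listed in this article.
SIGNATURE_FRAMES = ("panic", "tcp_output_full", "tcp_output")
PANIC_MARKER = "beyond sb"

# Matches "#0 panic (...)" as well as "#1 0x0000... in sbsndptr (...)".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)")


def frame_functions(trace_text):
    """Return the function name of every '#N ...' frame line, in order."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def matches_signature(trace_text):
    """True if the trace has the panic text and all signature frames."""
    names = frame_functions(trace_text)
    has_frames = all(func in names for func in SIGNATURE_FRAMES)
    return PANIC_MARKER in trace_text and has_frames


# Sample lines copied from the stack trace in this article.
SAMPLE = """\
#0 panic (fmt=fmt@entry=0x557639c4ebb3 "%s: beyond sb")
#1 0x00005576398539dd in sbsndptr (sb=sb@entry=0x55763eb57c14, off=off@entry=12330, len=len@entry=1370, moff=moff@entry=0x7ffdee72a02c) at
#2 0x00005576397a0ca1 in tcp_output_full (tp=0x55763e843ba0) at
#3 0x00005576397a2a9d in tcp_output (tp=0x55763e843ba0) at
"""

if __name__ == "__main__":
    print(matches_signature(SAMPLE))
```

A match only indicates that the trace resembles this issue; confirm against the affected versions and conditions before drawing conclusions.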
To investigate further, you can review the latest stack traces from the Controller or SE using one of the following methods:
CLI:
Log in to the Controller via SSH and run the following command. Replace <se_dp.timestamp> with the name of the actual se_dp stack trace file.
root@<Controller ip>:# cat /opt/avi/archive/stack_traces/<se_dp.timestamp>.stack_trace
UI:
Navigate to Administration > Support > Crash Reports and expand the latest crash file.
Affects Version(s):
22.1.x
30.1.x
30.2.1, 30.2.2, 30.2.3
31.1.1
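As a quick aid, the list above can be encoded as a lookup. This is a minimal sketch, assuming the version is supplied as a plain dotted string such as `30.2.3` with no patch suffix; the function name `is_listed_as_affected` is hypothetical, and the sketch only reflects the versions listed in this article.

```python
# Release trains listed as affected in full ("22.1.x", "30.1.x").
AFFECTED_TRAINS = {"22.1", "30.1"}

# Individual releases listed as affected.
AFFECTED_RELEASES = {"30.2.1", "30.2.2", "30.2.3", "31.1.1"}


def is_listed_as_affected(version):
    """Return True if a dotted version string appears in the affected list."""
    parts = version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts):
        raise ValueError("expected a dotted numeric version, got %r" % version)
    if ".".join(parts[:2]) in AFFECTED_TRAINS:
        return True
    return ".".join(parts[:3]) in AFFECTED_RELEASES


if __name__ == "__main__":
    for candidate in ("22.1.6", "30.2.3", "30.2.4", "31.1.1", "31.1.2"):
        print(candidate, is_listed_as_affected(candidate))
```

A version that is not in the list is simply reported as not listed; the sketch makes no claim about releases this article does not mention.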
This issue is caused when the SE receives DNS over TCP data and the client only partially ACKs it, leading the SE to clear its send buffer without adjusting for a suspected last-mile packet drop.
For the crash to occur, all of the following conditions must be met:
1. DNS over TCP
2. A very large response from a backend server (in the observed crash, the response was at least 16 KB)
3. The TCP connection between the client and the SE is in the congestion recovery state. This means the network has dropped packets and the connection is currently recovering from those drops.
Note: This problem does not apply to non-DNS L4 or L7 Virtual Services.
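The failure mode can be illustrated with a toy model. This is a sketch only and is not the SE data path: it assumes the "beyond sb" panic corresponds to a bounds check of the form offset + length <= bytes held in the send buffer, which the panic text and the `off` and `len` arguments in frame #1 suggest but this article does not confirm. The class `ToySendBuffer` and the 4096-byte acknowledgement are hypothetical.

```python
class ToySendBuffer:
    """Toy stand-in for a TCP send buffer (illustration only)."""

    def __init__(self, payload_len):
        # Bytes still held for transmission or retransmission.
        self.available = payload_len

    def drop_acked(self, nbytes):
        """Release bytes that the peer has acknowledged."""
        self.available -= min(nbytes, self.available)

    def send_slice(self, off, length):
        """Return the (start, end) range to transmit, with a bounds check."""
        if off + length > self.available:
            # Assumed analogue of the "beyond sb" panic.
            raise RuntimeError("beyond sb")
        return (off, off + length)


def demo():
    buf = ToySendBuffer(16 * 1024)        # ~16 KB response, per condition 2
    off, length = 12330, 1370             # values from frame #1 of the sample
    before = buf.send_slice(off, length)  # fits while all bytes are buffered

    acked = 4096                          # hypothetical partial ACK size
    buf.drop_acked(acked)

    try:
        buf.send_slice(off, length)       # stale offset, not adjusted
        stale_failed = False
    except RuntimeError:
        stale_failed = True

    # Shifting the offset by the acknowledged bytes keeps it in bounds.
    adjusted = buf.send_slice(off - acked, length)
    return before, stale_failed, adjusted


if __name__ == "__main__":
    print(demo())
```

In this model the request fails only when the buffer shrinks while the offset stays the same, which mirrors the description of the send buffer being cleared without a matching adjustment.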
Please upgrade the system to the fix version.
AV-185290: Defensive fix to prevent SE from crashing on DNS over TCP when payload is large in a lossy network environment
Fix Version(s): 30.2.4, 31.1.2 & 31.2.1
Workaround(s):
Change the application profile of the DNS Virtual Service to System-L4-Application.
Caveat: With this workaround, you will lose DNS-related logs.