Customer is reporting alerts not seen before for BacklogRejectsByPort.
Please clarify what is happening and what may need to be adjusted to alleviate these alerts.
If I understand the message correctly, the limit of 20 conns (connections) is set too low for the resource called <stc-id>. I'm just not clear on what I need to look at to adjust this accurately, or whether an adjustment is even needed.
Customer wants to know why this occurred and if they need to be concerned about it.
Alert Description
ASMON (TCPIP-TCP(1414)): BacklogRejectsByPort High, 124 conns > 20 conns
Alert History
Created at .................. THU 24-AUG-2023 20.31.44
Last Updated at ............. THU 24-AUG-2023 20.31.44
Number of occurrences ....... 1
Last occurred at ............ THU 24-AUG-2023 20.31.44
Alert Identification
Severity ............ 3 (Medium)
System Name.......... NETMSTR
System Identifier.... NETMSTR
Application ......... TCP/IP Services
Alert Class ......... ASPMHIGH
Class Description ... Address Space Monitor Upper Limit Status
Alert Description
ASMON (TCPIP-TCP(1414)): BacklogRejectsByPort High, 124 conns > 20 conns
Resource ............ <stc-id>
Alert Text
Counter attribute BacklogRejectsByPort measures Connections rejected
due to backlog exceeded.
Alert Explanation
A High Value data sample was returned for the BacklogRejectsByPort attribute
of resource ASMON (TCPIP-TCP(1414)) <stc-id>.
The latest sample value or rate is 124 conns.
This alert is raised when a sample value or rate exceeds the specified
high limit of 20 conns.
Release : 12.2
As the alert shows, BacklogRejectsByPort corresponds to the attribute TCPCONR, the count of connections rejected because the backlog was exceeded.
Backlog in this case refers to the IBM TCP PROFILE parameter SOMAXCONN, which specifies the maximum length of the connection request queue for a listening port.
Connection requests are arriving at port 1414 faster than they can be accepted, so once the queue overflows the excess requests are rejected. That overflow indicates a problem, and the alert was defined in the application to catch it.
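If it helps to picture the mechanism, the following is a minimal sketch in ordinary Python on a workstation (not z/OS, and not the customer's configuration) of what "backlog exceeded" means: a listener that never accepts fills its pending-connection queue, and further connection attempts fail. The port number and backlog value are arbitrary demo values.

    import socket
    import threading
    import time

    BACKLOG = 5     # demo queue depth, analogous to the limit set by SOMAXCONN
    PORT = 50414    # arbitrary local port for the demo, not the customer's 1414

    def slow_listener():
        # A listener that never calls accept(), so its pending-connection queue fills up.
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", PORT))
        srv.listen(BACKLOG)
        time.sleep(30)
        srv.close()

    threading.Thread(target=slow_listener, daemon=True).start()
    time.sleep(0.5)  # give the listener time to bind

    completed, rejected = 0, 0
    for _ in range(BACKLOG * 3):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.settimeout(1)
        try:
            c.connect(("127.0.0.1", PORT))
            completed += 1
        except OSError:
            rejected += 1   # refused or timed out once the backlog is exceeded
        finally:
            c.close()

    print(f"completed: {completed}, rejected: {rejected}")

Note that the effective queue depth for a listener is generally limited by the stack's SOMAXCONN value as well as the backlog the application requests on listen(). A value such as SOMAXCONN 1024 in the TCP/IP profile is purely illustrative here and is not a recommendation; the first step is to understand why connections are not being accepted quickly enough.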
Either there is a flood of connection attempts within a very short time (for example, a port scan or a runaway application), or there is a problem with port 1414 not completing connections, whether that lies on the TCP/IP side or the application side.
Questions to help determine cause:
1) Did the alert clear and is everything working again, or does the problem persist? (A small connection-probe sketch follows this list.)
2) Was this the only alert, or were other ports/applications impacted as well?
3) Was a new application implemented that may be problematic?
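For question 1, a small client-side probe can confirm whether connections to port 1414 now complete promptly. This is a hedged sketch only; the host name below is a placeholder, not the customer's system.

    import socket
    import time

    HOST = "mvs-host.example.com"   # placeholder for the z/OS stack's host name or IP
    PORT = 1414                     # port from the alert

    for attempt in range(1, 6):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(3)
        start = time.monotonic()
        try:
            s.connect((HOST, PORT))
            print(f"attempt {attempt}: connected in {time.monotonic() - start:.3f}s")
        except OSError as exc:
            print(f"attempt {attempt}: failed ({exc})")
        finally:
            s.close()
        time.sleep(1)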