Syslog Log Sink not forwarding all messages under load

Article ID: 191826

Updated On:

Products

CA API Gateway
API SECURITY
CA API Gateway Precision API Monitoring Module for API Gateway (Layer 7)
CA API Gateway Enterprise Service Manager (Layer 7)
STARTER PACK-7
CA Microgateway

Issue/Introduction

The Syslog Log Sink is used to forward all log messages to a remote syslog server.
When a large number of messages is triggered in a short period of time (about 1000 in ~8 seconds), only about 90% of them arrive at the destination syslog server.
The log sink uses TCP/SSL (because the message size exceeds the UDP limit), so UDP is ruled out as a possible reason for the dropped messages.
Is there some kind of undocumented throttling in the syslog component of the audit sink that prevents syslog messages from being sent if they pile up too quickly?

Cause

Check the ssg log or /var/log/messages for entries like the following:

Jan 14 03:21:26 rsyslogd-2177: imuxsock begins to drop messages from pid 22992 due to rate-limiting
Jan 14 03:21:28 rsyslogd-2177: imuxsock lost 115 messages from pid 22992 due to rate-limiting
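
As a rough way to gauge how many messages were dropped, the "lost N messages" entries can be totalled with a small shell function. This is a minimal sketch that assumes the entries follow the wording shown above; the function name count_imuxsock_lost is made up for illustration and is not part of rsyslog or the Gateway.

```shell
# Hypothetical helper: sum the "lost N messages" counts that imuxsock
# reports in a log file. Assumes the message wording shown in this article.
# Usage: count_imuxsock_lost /var/log/messages
count_imuxsock_lost() {
    awk '
        # Locate the phrase anywhere in the line, then take its third word (N).
        match($0, /imuxsock lost [0-9]+ messages/) {
            split(substr($0, RSTART, RLENGTH), parts, " ")
            total += parts[3]
        }
        # Print 0 rather than an empty line when nothing matched.
        END { print total + 0 }
    ' "$1"
}
```

A non-zero total from `count_imuxsock_lost /var/log/messages` suggests the local rsyslog rate limit was hit during the period covered by the log.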

Environment

Release : 9.4

Component : API GATEWAY

Resolution

These messages indicate that a rate limit is restricting the number of messages rsyslog accepts. The operating system is reporting that too many messages are being written at the same time, so it will not accept all of them and drops some.

To prevent this imuxsock rate limiting, change the rsyslog configuration as follows:


1. Edit /etc/rsyslog.conf.

2. Add the following parameters under the "$ModLoad imuxsock # needs to be done just once" section:

$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0
$IMUXSockRateLimitBurst 0
$IMUXSockRateLimitInterval 0
$IMUXSockRateLimitSeverity 7

3. Restart rsyslog:

# service rsyslog restart 
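
Before restarting, it can help to confirm that the interval directives really are present and set to 0. The sketch below assumes one directive per line in the legacy "$Name value" format used above. The function names has_directive and check_ratelimit_disabled are made up for illustration, and directive names are matched case-insensitively on the assumption that rsyslog does the same.

```shell
# Hypothetical helper: succeed if the config file sets a directive to a value.
# $1 = config file, $2 = directive name without the leading "$", $3 = expected value
has_directive() {
    awk -v name="$2" -v want="$3" '
        BEGIN { target = "$" tolower(name); found = 0 }
        /^[ \t]*#/ { next }                              # skip commented-out lines
        tolower($1) == target && $2 == want { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Hypothetical helper: check that both rate-limit intervals are set to 0.
# $1 = path to the rsyslog configuration file
check_ratelimit_disabled() {
    for d in SystemLogRateLimitInterval IMUXSockRateLimitInterval; do
        has_directive "$1" "$d" 0 || { echo "missing or non-zero: \$$d"; return 1; }
    done
    echo "rate limiting disabled"
}
```

For example, `check_ratelimit_disabled /etc/rsyslog.conf` prints "rate limiting disabled" only when both interval directives are uncommented and set to 0.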

More details about the parameters mentioned above: 

$SystemLogRateLimitInterval [number] 
$SystemLogRateLimitBurst [number] 

SystemLogRateLimitInterval sets the length of the time window, in seconds, over which messages are counted for rate limiting.
By default this is set to 5 seconds.

SystemLogRateLimitBurst defines the number of messages that must occur within the SystemLogRateLimitInterval window to trigger rate limiting. The default is 200 messages.
The configuration above overrides these defaults; setting the interval to 0 turns rate limiting off.


$IMUXSockRateLimitBurst [number] - equivalent to RateLimit.Burst; specifies the rate-limiting burst as a number of messages. Default is 200.

$IMUXSockRateLimitSeverity [numerical severity] - equivalent to RateLimit.Severity; specifies the severity of messages that shall be rate-limited.

Root Cause: These messages mean that a process sent more than 200 messages to rsyslog within 5 seconds. At that point, rsyslog drops messages if rate limiting is enabled.

NOTICE: Rate limiting is a safeguard that prevents logs from filling the /var partition. Exercise care if you disable it, as the log partition might fill up. It is often better to investigate which process is flooding the logs and resolve that issue. Often, an application is set to a "debug" log level, which causes very verbose logging. If that level of logging is not needed, lower the logging level.
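
To find out which process is flooding the logs, the pid reported in the rate-limiting entries can be used as a starting point. The following is a minimal sketch that totals lost messages per pid, assuming the entry wording shown in the Cause section; the function name lost_by_pid is made up for illustration.

```shell
# Hypothetical helper: print "pid lost_total" for every pid named in imuxsock
# "lost N messages from pid P" entries, largest total first.
# Usage: lost_by_pid /var/log/messages
lost_by_pid() {
    awk '
        match($0, /imuxsock lost [0-9]+ messages from pid [0-9]+/) {
            # Words of the matched phrase: 3 = message count, 7 = pid.
            split(substr($0, RSTART, RLENGTH), f, " ")
            lost[f[7]] += f[3]
        }
        END { for (p in lost) print p, lost[p] }
    ' "$1" | sort -k2,2nr
}
```

While the process is still running, `ps -p 22992 -o comm=` (substituting a pid from the output) shows the command name behind that pid.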