Transaction timeout errors in BrightmailLog.log when managing remote SMG scanners

Article ID: 160911

Updated On:

Products

Messaging Gateway

Issue/Introduction

When managing some remote Symantec Messaging Gateway (SMG) scanners, some scheduled tasks or audit log queries fail and leave a timeout error in BrightmailLog.log:

May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentEvent] WARN - Connection timed out : 10.11.12.13 
May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentHelper] ERROR - An agent error has occurred in the following code path: 
java.lang.Exception 
May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentHelper] ERROR - --- Host Name: 10.11.12.13 
May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentHelper] ERROR - --- Agent Port Number: 41002 
May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentHelper] ERROR - The response object is null. 
May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [ScriptHelper] ERROR - com.symantec.smg.controlcenter.BrightmailException: The Agent running on 10.11.12.13 is temporarily unreachable. Please check the specified host. ; nested exception is: 
java.net.SocketException: Connection timed out 
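
To see which scanners are affected and how often, the log can be scanned for the AgentEvent timeout warning. The following is a minimal sketch only: it assumes the line layout shown in the excerpt above, and the function name count_timeouts and the sample list are hypothetical, not part of SMG.

```python
import re
from collections import Counter

# Line layout assumed from the excerpt above:
# "<Mon DD YYYY HH:MM:SS> [thread] [component] LEVEL - message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<thread>[^\]]+)\] \[(?P<component>[^\]]+)\] "
    r"(?P<level>[A-Z]+) - (?P<message>.*)$"
)
# The AgentEvent warning names the scanner host after the colon.
TIMEOUT_RE = re.compile(r"Connection timed out : (?P<host>\S+)")


def count_timeouts(lines):
    """Count AgentEvent 'Connection timed out' warnings per scanner host."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Skip stack-trace lines and entries from other components.
        if not m or m.group("component") != "AgentEvent":
            continue
        t = TIMEOUT_RE.search(m.group("message"))
        if t:
            counts[t.group("host")] += 1
    return counts


# Sample lines copied from the excerpt above.
sample = [
    "May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentEvent] WARN - Connection timed out : 10.11.12.13 ",
    "May 13 2014 08:47:31 [BrightmailScheduler_Worker-26] [AgentHelper] ERROR - --- Agent Port Number: 41002 ",
    "java.net.SocketException: Connection timed out ",
]
print(count_timeouts(sample))
```

To run this against a real log, pass an open file object for BrightmailLog.log instead of the sample list. The location of the log file on the Control Center is not stated here and must be supplied by the reader.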

Environment

Messaging Gateway
 

Cause

The SMG Control Center keeps long-lived TCP connections open to the SMG Scanners, and some firewalls or other network devices drop or reset connections that go idle for a significant period of time. When this occurs, the next transaction assigned to the dropped or reset connection fails with either a timeout or a network error, depending on how the intervening network hardware handled the idle connection.

Resolution

This issue is addressed in Messaging Gateway 10.6.3. Long-lived TCP connections between the Control Center and Scanner hosts now have TCP keepalive set on the connection.
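
For readers unfamiliar with TCP keepalive, the sketch below shows the general technique on an ordinary client socket. It illustrates the socket option only and is not the SMG implementation. The function name open_keepalive_connection and the probe timings are assumptions chosen for the example, and the per-connection timing options are applied only where the platform exposes them.

```python
import socket


def open_keepalive_connection(host, port, timeout=10.0):
    """Open a TCP connection with keepalive probes enabled (illustrative)."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # SO_KEEPALIVE asks the operating system to send periodic probes on an
    # otherwise idle connection, so intermediate devices keep seeing traffic
    # and a dead peer is eventually detected.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Optional per-connection tuning. These constants are not available on
    # every platform, so each one is set only if the socket module has it.
    # The values are example assumptions, not SMG settings.
    tuning = {
        "TCP_KEEPIDLE": 300,   # seconds of idle time before the first probe
        "TCP_KEEPINTVL": 60,   # seconds between probes
        "TCP_KEEPCNT": 5,      # failed probes before the connection is dropped
    }
    for name, value in tuning.items():
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)
    return sock
```

The idle time before the first probe needs to be shorter than the idle timeout enforced by the firewall or other network device. Otherwise the device may discard the connection before any probe is sent.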

Workaround

Ensure that network hardware does not drop idle TCP connections to port 41002 for at least 60 minutes. This reduces how often a connection stays idle long enough to be timed out.

This workaround applies to releases earlier than Messaging Gateway 10.6.3, the release in which the issue is addressed in the software.