SGOS Unresponsive after "commands delete-logs"

Article ID: 260279

Updated On:

Products

ASG-S500

Issue/Introduction

The customer's appliance became completely unresponsive on a particular day, starting at 12:54 PM UTC, after the script below was run. The script took 7 minutes to execute because the access log was reporting "Log manager: stopped due to log full".

access-log
edit log ArcSight
commands delete-logs
exit
edit log XXXXXXX
commands delete-logs
exit
exit

The appliance is not expected to become completely unresponsive, with CPU utilization rising to 80% and memory utilization rising to 60%, when an access log is deleted.

Environment

Release : 7.3.11.3

Cause

The sysinfo log for the affected appliance does not show the reported high CPU utilization.

Further checks show the events below, all linked to the "ArcSight" access log.

SGARS: ProxySG Diagnostic

Time    Count/s    Message
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.
xxxx xxx   xx xxxx  xx:xx:xx    1    Access Log (ArcSight): Log uploading failed.  Remote filename: xxxxxxxxxxxxxxxxxxx size: 0 KB.

Trigger: Issues with access log uploads  

If the ProxySG is configured to upload access logs to a server and the upload fails, the result can be high CPU utilization in Misc. The logs indicate that the failed access log uploads caused the sudden rise in CPU utilization.
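To gauge how many upload failures each access log produced, the event log can be exported to a text file and scanned offline. The following is a minimal Python sketch, not a vendor tool. It assumes the exported lines contain the same message text as the events shown above; the function name and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches the message text shown in the events above. The timestamp and
# count columns that precede it are ignored.
FAILED_UPLOAD = re.compile(r"Access Log \(([^)]+)\): Log uploading failed")


def count_failed_uploads(lines):
    """Return a Counter mapping access log name to number of failed uploads."""
    counts = Counter()
    for line in lines:
        match = FAILED_UPLOAD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, masked in the same way as the events above.
sample = [
    "xxxx xxx xx xxxx xx:xx:xx 1 Access Log (ArcSight): Log uploading failed.  Remote filename: xxxx size: 0 KB.",
    "xxxx xxx xx xxxx xx:xx:xx 1 Access Log (ArcSight): Log uploading failed.  Remote filename: xxxx size: 0 KB.",
    "xxxx xxx xx xxxx xx:xx:xx 1 Some unrelated event",
]

for name, total in count_failed_uploads(sample).most_common():
    print(f"{name}: {total} failed uploads")  # prints "ArcSight: 2 failed uploads"
```

To scan a real export, pass an open file object instead of the sample list. A log name with a high failure count is the one whose upload configuration is worth checking first.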

Resolution

The CPU spike is not expected to recur unless the access log upload failures recur. If it does happen again, check the event logs on the Edge SWG (ProxySG) for access log upload issues. Also use the test upload button in the access log configuration to confirm whether the upload works. If a problem is found, verify the access log upload configuration on the ProxySG.
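If the test upload fails and the upload client is FTP, it can help to confirm from another host that the upload server accepts the same credentials and directory. The sketch below uses Python's standard ftplib module and is only an illustration under those assumptions. The function name, host, credentials, and directory are hypothetical placeholders, and the check does not replace the test upload on the appliance itself.

```python
import ftplib


def check_ftp_upload_target(host, user, password, directory="/",
                            timeout=10.0, ftp_factory=ftplib.FTP):
    """Try to log in and enter the upload directory; return (ok, detail).

    ftp_factory exists only so the check can be exercised without a
    real server; by default it is the standard ftplib.FTP class.
    """
    try:
        # Passing a host makes ftplib.FTP connect immediately.
        with ftp_factory(host, timeout=timeout) as ftp:
            ftp.login(user, password)
            ftp.cwd(directory)
        return True, "login and directory change succeeded"
    except ftplib.all_errors as exc:
        # all_errors covers FTP protocol errors, OSError, and EOFError.
        return False, f"{type(exc).__name__}: {exc}"
```

A call such as `check_ftp_upload_target("upload.example.com", "loguser", "secret", "/logs")` (all values hypothetical) returns a tuple whose second element names the failure, for example a refused login or a missing directory, which narrows down what to correct in the upload client settings.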

Ref. doc.: https://knowledge.broadcom.com/external/article?legacyId=TECH242540 

For the recommended access log upload client settings, please refer to the guidance in the docs below: 

Editing Upload Clients

Editing the FTP Client

 

Note: This was not a bug. The high CPU utilization was caused by the numerous failed access log uploads, which consumed so much CPU that too little remained for processing other requests.

Also, according to ArcSight community resources, ESM resources, high event throughput in ESM, or poorly configured content in ArcSight can impact CPU usage. You may want to explore this as well.

Ref.: https://community.microfocus.com/cyberres/arcsight/f/arcsight-discussions/339065/esm-high-cpu-utilization 

Setting up ArcSight to avoid possible triggers for high CPU utilization, and keeping to the recommended access log upload client settings on the ProxySG, should prevent this kind of high CPU utilization.

When the logs fill up while the log upload failures shown above are occurring, deleting the logs on the proxy is a recommended workaround.

If this recurs consistently, an R&D engagement may be required, along with the relevant log data.