Script to clean WSS SyncAPI corrupted files

Article ID: 244453

Updated On:

Products

Cloud Secure Web Gateway - Cloud SWG

Issue/Introduction

The WSS Splunk app shows several corrupted files under /opt/splunk/etc/apps/TA-SymantecWebSecurityService/bin.

These files remain under the bin folder and are not automatically removed.

Environment

Web Security Service

SyncAPI

Cause

The corrupt files can be caused by a break in the connection at any point along the path (client, intermediate devices, or server).

The WSS Splunk Technology Add-on (TA) has built-in recovery to get the data on the next invocation.

When there is a break in the connection, the WSS Splunk TA reuses the stored token on the next invocation, which signals the WSS SyncAPI to return data from a specific point in time.

Both the WSS Splunk TA and the WSS SyncAPI are designed to handle this situation.

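The following excerpt from scwss-poll.log shows one invocation that received a corrupted archive (11:03:11), followed by a successful retry on the next invocation (11:07:57):
$SPLUNK_HOME/etc/var/log/scwss/scwss-poll.log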

2022-06-10 11:02:32,689 INFO 140363035563840 - SWSS: Starting data collection...
2022-06-10 11:02:32,689 INFO 140363035563840 - Invoking API Request at 2022-06-10 11:02:32
2022-06-10 11:02:37,893 INFO 140363035563840 - Response received with status code 200
2022-06-10 11:03:11,632 INFO 140363035563840 - File name is cloud_archive_220610150235_stash_ta_scwss_logs.zip, size is 27.741273880004883 megabytes
2022-06-10 11:03:11,633 INFO 140363035563840 - File cloud_archive_220610150235_stash_ta_scwss_logs.zip downloaded from the API in 38 seconds
2022-06-10 11:03:11,639 ERROR 140363035563840 - SWSS: Received corrupted archive from WSS, will retry on the next invocation.
2022-06-10 11:03:11,639 INFO 140363035563840 - Time taken to run the script is 38 seconds

2022-06-10 11:07:57,603 INFO 139736779204416 - Script starting invocation at 2022-06-10 11:07:57
2022-06-10 11:07:57,622 INFO 139736779204416 - SWSS: Starting data collection...
2022-06-10 11:07:57,623 INFO 139736779204416 - Invoking API Request at 2022-06-10 11:07:57
2022-06-10 11:08:03,110 INFO 139736779204416 - Response received with status code 200
2022-06-10 11:09:06,115 INFO 139736779204416 - File name is cloud_archive_220610150800_stash_ta_scwss_logs.zip, size is 52.350932121276855 megabytes
2022-06-10 11:09:06,116 INFO 139736779204416 - File cloud_archive_220610150800_stash_ta_scwss_logs.zip downloaded from the API in 68 seconds
2022-06-10 11:09:06,120 INFO 139736779204416 - File given to batch input reader at 2022-06-10 11:09:06
2022-06-10 11:09:06,120 INFO 139736779204416 - SWSS: Completed data collection
2022-06-10 11:09:06,120 INFO 139736779204416 - Time taken to run the script is 68 seconds
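
To gauge how often this happens, the poll log can be searched for the ERROR message shown in the excerpt above. The following is a minimal sketch; the function name count_corrupted is hypothetical and not part of the TA, and the match string is copied from the excerpt, so it may differ in other TA versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of the TA): count "corrupted archive"
# errors in a scwss-poll.log file passed as the first argument.
count_corrupted() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when the count is 0, so "|| true" keeps the exit status clean.
    grep -c 'Received corrupted archive from WSS' "$1" || true
}
```

For example, count_corrupted "$SPLUNK_HOME/etc/var/log/scwss/scwss-poll.log" prints the number of invocations that logged a corrupted archive.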

Resolution

To remove these files, it is recommended to set up a scheduled cron job that deletes them.

The following script deletes files under the TA bin folder that were last modified more than 7 days ago. The cron entry that follows runs it daily at midnight (00:00).

Create the script

Make sure to save the file as delete_wss_ta_old_files.sh

#!/bin/sh

find /opt/splunk/etc/apps/TA-SymantecWebSecurityService/bin -type f -mtime +7 -exec rm {} +
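
The find command above matches every regular file under the bin folder that is older than 7 days, so it may be worth previewing what would be removed before scheduling the job. The sketch below is a hypothetical helper, not part of the TA: it runs the same tests with -print in place of -exec rm. The optional name pattern is an assumption based on the archive names in the log excerpt (cloud_archive_*.zip); confirm the actual names of the leftover files before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: list, without deleting, the regular files under DIR
# that were last modified more than DAYS days ago.
# Usage: list_old_files DIR [DAYS] [NAME_PATTERN]
list_old_files() {
    dir=$1
    days=${2:-7}        # default matches the 7-day threshold in the script
    pattern=${3:-*}     # default matches every file name
    # Same tests as the cleanup script, with -print instead of -exec rm.
    find "$dir" -type f -name "$pattern" -mtime +"$days" -print
}
```

For example, list_old_files /opt/splunk/etc/apps/TA-SymantecWebSecurityService/bin 7 'cloud_archive_*.zip' lists only old archives whose names match that pattern.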

Add the cron job

crontab -e

Add the following line to the very end of the file:

0 0 * * * /bin/sh /path_to_your_script/delete_wss_ta_old_files.sh

Note: The new cron job will be executed every day at midnight.
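
Where editing the crontab interactively is not practical, the entry can also be added from a script. The following is a minimal sketch of one way to do that without creating duplicates; add_cron_entry is a hypothetical helper, not part of the TA or of cron.

```shell
#!/bin/sh
# Hypothetical helper: read an existing crontab on stdin and write it to
# stdout, appending ENTRY only if that exact line is not already present.
# Usage: add_cron_entry ENTRY
add_cron_entry() {
    entry=$1
    existing=$(cat)
    # Re-emit the existing entries unchanged (nothing to emit if empty).
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole-line match, -q: no output.
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}
```

With the cron line shown above stored in a variable such as CRON_LINE, the crontab can be updated with: crontab -l 2>/dev/null | add_cron_entry "$CRON_LINE" | crontab -. Running crontab -l afterwards confirms that the entry is present.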