Not able to view Summary pages in the console, or the Meta partition is over 84% used

Article ID: 168333

Products

Security Analytics

Issue/Introduction

How to manually delete old index files on Security Analytics (Solera)?

On all Security Analytics appliances, the meta data (index files) is stored in the /var/lib/solera/meta partition.  An automatic garbage collection process starts cleaning up once the partition reaches 83.3% usage and deletes meta data until usage falls back to 80%.  From the customer side, usage of the /var/lib/solera/meta partition therefore fluctuates between 80% and 83%.  This is completely normal.  However, if usage rises above 84% or 85%, there is a good chance that something is corrupt, and the partition will eventually fill up, causing indexing and GUI issues.

Cause

Some of the errors you may see in /var/log/messages are:

Compressed index creation failed /dev/shm/var/lib/solera/meta1/flows/aggregate_ssl_validity_not_after_hooks/1894/17542960-17542969.sflt : No space left on device (28)

This message repeats continuously in the log.

You can confirm that the partition is full using the df -h command.  Use% should never exceed 84% for /var/lib/solera/metaX.

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda4   4.9G  2.1G  2.6G    45%  /
/dev/sda6   2.0G  240M  1.6G    13%  /gui
/dev/sda5   4.9G  138M  4.5G     3%  /ds
/dev/sda2    68G   11G   54G    16%  /var
/dev/sda1   1.5G   12M  1.4G     1%  /boot
tmpfs        63G  661M   63G     2%  /dev/shm
/dev/sdc1    21T   21T     0   100%  /pfs
/dev/sdb1   9.6T  9.6T     0   100%  /var/lib/solera/meta1
/dev/sdb2   9.6T  9.6T     0   100%  /var/lib/solera/meta2
/dev/sda3   2.7T  4.1G  2.5T     1%  /home
gaugefs      21T   21T     0   100%  /etc/solera/flows
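
The check above can be automated. The following is a minimal sketch, not a product tool: check_meta_usage is a hypothetical helper name, and the 84% default is taken from the threshold described in this article. It reads df output on standard input and reports every /var/lib/solera/meta* mount whose Use% is above the limit.

```shell
#!/bin/bash
# Sketch only: check_meta_usage is a hypothetical helper, not part of Security Analytics.
# Reads `df -P` output on stdin and prints each /var/lib/solera/meta* mount
# whose Use% is above the threshold (default 84, the value given in this article).
# Returns 1 if at least one mount is over the threshold, 0 otherwise.
check_meta_usage() {
    local threshold="${1:-84}"
    awk -v limit="$threshold" '
        $NF ~ /^\/var\/lib\/solera\/meta/ {
            use = $(NF-1)            # Use% column, e.g. "100%"
            sub(/%/, "", use)        # strip the percent sign
            if (use + 0 > limit) { print $NF, use "%"; found = 1 }
        }
        END { exit (found ? 1 : 0) }
    '
}

# Example invocation on an appliance (not executed here):
#   df -P | check_meta_usage 84 || echo "meta partition above threshold"
```

With the df output shown above, this would list both meta1 and meta2 at 100%.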

Resolution

To recover from this, you first need to manually delete index files that are no longer needed.  However, you do not want to delete index files for packet data that is still physically on disk.  To determine your current data window, log in to the web GUI and, on the Capture Summary page, look at the graph and identify the white, light pink, and dark pink areas.  White indicates that packet data is available.  Light pink indicates that meta data (index data) is available, but the packet data has already been overwritten.  Dark pink indicates that neither packet data nor meta data is available.

If the GUI cannot be reached (which is also a symptom of this issue), use the dscapture --status command to see the current packet data window.  In the example below, there is live packet data between Wed, Dec 10 and Mon, Jan 12.

eth2 start/end 00000000->00005FB3 time start/end 54889AB7/54B4237F
start: Wed Dec 10 12:10:47 2014 end: Mon Jan 12 12:41:51 2015
   slots-24500 elements-2111703813 size-1414426088008 bytes
   active slot 24499/0xffff881b3a200000 packets-54311 ringbufs-255
   total_bytes-26924732 metadata-3041416
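
In this example the "time start/end" values appear to be hexadecimal Unix timestamps: 54889AB7 converts to 1418238647, which is the same epoch value shown for slot 0 in the space table journal output further down. A small sketch of the conversion follows; hex_to_epoch is a hypothetical helper name, and treating these fields as epoch seconds is an assumption based on that match.

```shell
#!/bin/bash
# Sketch only: hex_to_epoch is a hypothetical helper.
# Assumption: the dscapture "time start/end" fields are hex Unix epoch seconds.
hex_to_epoch() {
    # printf accepts C-style hex constants for %d
    printf '%d\n' "0x$1"
}

# Example (GNU date, prints in the local time zone; not executed here):
#   date -d "@$(hex_to_epoch 54889AB7)"
```

Running hex_to_epoch 54889AB7 prints 1418238647, and hex_to_epoch 54B4237F prints 1421091711.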


If you want to see the beginning of the light pink area or the oldest index data on the appliance from the command line, use this command:

walk_space_table_journal | head -4

The output of this command will look similar to the following:

# walk_space_table_journal | head -4
Space table journal name:    /var/lib/solera/meta1/space_table_journal_v3
Entries:            24516
---- Space table journal contents ----
Slot 0 start Wed Dec 10 12:10:47 2014 (1418238647) end Wed Dec 10 12:39:18 2014 (1418240358) iface 3
 

It is recommended to start with a cutoff older than the oldest meta data.  If that does not clear out enough space, move the cutoff forward and delete newer files.  You should never have to delete index files for which you still have live packet data.
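
To turn the start of the packet data window into a day count that can serve as a cutoff, the age of a timestamp can be computed in whole days. This is a sketch under assumptions: days_since is a hypothetical helper, and it relies on the file modification times of the index files tracking the time of the data they index.

```shell
#!/bin/bash
# Sketch only: days_since is a hypothetical helper.
# Prints the number of whole days between an epoch timestamp and "now".
# The optional second argument overrides "now" (useful for testing).
days_since() {
    local then="$1"
    local now="${2:-$(date +%s)}"
    echo $(( (now - then) / 86400 ))
}

# Example with the packet window start from the dscapture output (not executed here):
#   days_since 1418238647
```

Because find's -mtime +NN ignores fractional days and matches only files at least NN+1 full days old, using this value as NN keeps the match strictly older than the start of the packet data window.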

First you must change directory to the /var/lib/solera/meta directory:

cd /var/lib/solera/meta

If you have multiple /var/lib/solera/meta directories (e.g. meta1, meta2), you will need to cd to each of those directories and run the same command.

IMPORTANT:  Confirm that you are in the /var/lib/solera/meta/ directory structure before running the command.  It searches all subdirectories recursively and deletes every file that was last modified more than NN days ago.  If you are in the wrong directory, you can cause irreversible damage to the server, resulting in a complete loss of data and requiring a reinstall from scratch.  Be aware that the command can take a very long time to complete, depending on how much data is being deleted.
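
One way to make the directory check explicit is a small guard that refuses to continue outside the meta tree. This is a sketch only; in_meta_tree is a hypothetical helper, and the accepted paths are assumed from the directory names used in this article (meta, meta1, meta2, and so on).

```shell
#!/bin/bash
# Sketch only: in_meta_tree is a hypothetical helper.
# Succeeds if the given directory (default: current directory) is
# /var/lib/solera/meta, a numbered variant such as meta1, or anything below them.
in_meta_tree() {
    case "${1:-$PWD}" in
        /var/lib/solera/meta|/var/lib/solera/meta/*) return 0 ;;
        /var/lib/solera/meta[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example guard before any deletion (not executed here):
#   in_meta_tree || { echo "Refusing to run in $PWD" >&2; exit 1; }
```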

The command to run is as follows:

find . -mtime +NN -type f -exec rm {} \;

Here NN is the age in days beyond which meta data is deleted.  For example, to delete all meta data older than 30 days from the current time, run this command:

find . -mtime +30 -type f -exec rm {} \;
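
Before deleting anything, it can help to see how many files a given cutoff would match. The sketch below uses the same find tests as the command above but only prints and counts the matches; count_candidates is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: count_candidates is a hypothetical helper.
# Counts the regular files under a directory that `find -mtime +NN` would match.
# Nothing is deleted. File names containing newlines would be miscounted.
count_candidates() {
    local dir="$1"
    local days="$2"
    find "$dir" -mtime "+$days" -type f -print | wc -l | tr -d ' '
}

# Example from inside the meta directory (not executed here):
#   count_candidates . 30
```

If the count looks too large or too small for the intended window, adjust NN before running the delete command.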

Once you have deleted enough data to bring usage of /var/lib/solera/meta below 70%, reboot the server and then check functionality.

 

If you are still not able to access the GUI after the reboot, run these steps:

1.  scotus stop
2.  mount /var/lib/solera   (skip this step if it is already mounted)
3.  cd /var/lib/solera/meta
4.  mv space_table_journal_vX space_table_journal_vX.backup   (where X is the version number; if there is not enough space for the backup, delete the file instead)
5.  rm -rf /var/lib/redis/dump.rdb
6.  redis-cli -p `grep "^port" /etc/redis-ui.conf | cut -d' ' -f2` flushall  (only required if running 7.1.6 or higher)
7.  reboot
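
The backtick expression in step 6 reads the Redis port from /etc/redis-ui.conf and passes it to redis-cli. As a sketch, the same pipeline can be wrapped in a function to check which port it resolves to before flushing; redis_ui_port is a hypothetical helper, and it assumes the configuration line has the form "port <number>" with a single space, which is what the cut command in step 6 implies.

```shell
#!/bin/bash
# Sketch only: redis_ui_port is a hypothetical helper that mirrors the
# grep | cut pipeline from step 6.
# Assumption: the config contains a line of the form "port <number>".
redis_ui_port() {
    local conf="${1:-/etc/redis-ui.conf}"
    grep "^port" "$conf" | cut -d' ' -f2
}

# Example on an appliance (not executed here):
#   echo "redis-ui port: $(redis_ui_port)"
```

If the function prints nothing, the port line was not found and the redis-cli command in step 6 would not receive a valid port.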