Avi Logs & Events Loading Slow

Article ID: 371300

Products

VMware Avi Load Balancer

Issue/Introduction

In 22.1.3 and later versions, significant changes were made to event processing on Avi. One of the major changes is Events API 3.0, which consolidates events for many processes and uses better file name formatting for easier log rotation and file management, helping to reduce disk usage.

In prior versions, each process generated its own event file, named by timestamp and pid. When certain processes (such as portal) restarted frequently, a huge number of very small event files were generated. As a result, the log manager takes longer to process the state of all the files, which causes log query timeouts when loading events/logs.

The events created across the system are stored as event log files in the /var/lib/avi/logs/ALL-EVENTS folder, with file names in a format such as log_event_adf_ALL-EVENTS_XXX_portal_<pid>.XXX. In 22.1.3-2p3 and later, the new file name format looks like log_event_adf_ALL-EVENTS_XXX_portal_idx1.XXX. This new naming strategy applies to the 4 major processes that are frequently restarted: ‘portal’, ‘analyticsportal’, ‘maintenanceportal’, ‘systemportal’.
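The naming difference can be illustrated with a small shell helper. This is only a sketch: the function name classify_event_file is hypothetical, and it assumes the old and new formats can be told apart purely by the presence of "portal_idx" in the file name, as the examples above suggest.

```shell
# Classify an event file name as new-format (idx-based), old-format
# (pid-based), or unrelated to the four portal processes.
# Assumption: any name containing "portal_idx" is new-format; any other
# name containing "portal" is treated as old-format.
classify_event_file() {
    case "$1" in
        *portal_idx*) echo "new" ;;
        *portal*)     echo "old" ;;
        *)            echo "other" ;;
    esac
}
```

Because "analyticsportal", "maintenanceportal" and "systemportal" all contain the string "portal", the same two patterns cover all four process names.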

 

Some of the symptoms are:

  1. Log API timeouts caused by the huge number of event files on the controller.
  2. The log manager's memory usage exceeds its limit.
  3. High disk usage, because logs are not rotated out while old invalid event files (EVENTS_XXX_portal_<pid>.XXX format) are present.
  4. High inode usage on the filesystem due to the large number of small event files created.
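To check whether a controller shows these symptoms, the number of old-format files can be counted before changing anything. The following is a minimal sketch, not an official diagnostic: the function names are hypothetical, and the find pattern is the same one used in the dry run under Resolution.

```shell
# Count old-format (pid-based) portal event files under a directory.
# Uses the same name patterns as the article's dry-run command.
count_old_portal_files() {
    find "$1" -type f -name '*portal*' -not -name '*portal_idx*' | wc -l | tr -d ' '
}

# Count all regular files under a directory (a rough proxy for the
# inodes consumed by event files).
count_all_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example usage on a controller node (read-only, nothing is deleted):
#   count_old_portal_files /var/lib/avi/logs/ALL-EVENTS
#   count_all_files /var/lib/avi/logs/ALL-EVENTS
#   df -i /var/lib/avi/logs                # inode usage of the filesystem
#   du -sh /var/lib/avi/logs/ALL-EVENTS    # disk usage of the events folder
```

A large old-format count relative to the total suggests the cleanup described under Resolution is worth running.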

Environment

Controllers running a version prior to 22.1.3, or upgraded from a version prior to 22.1.3, may carry over event files in the old format.

Resolution

All event files with names portal_<pid>, analyticsportal_<pid>, maintenanceportal_<pid>, systemportal_<pid> need to be deleted from the /var/lib/avi/logs/ALL-EVENTS/controller-xxx folder on all controller nodes.

Steps to follow:

  1. Run systemctl stop rsync.service on all controller nodes.
  2. Remove all the portal-<process_id> and portal-<process_id>-config files (leaving the portal_idx<int> files in place) in /var/lib/avi/logs/ALL-EVENTS/controller-xxx on all controller nodes.

    Dry run: find /path/to/folder -type f -name '*portal*' -not -name '*portal_idx*'

    Delete: find /path/to/folder -type f -name '*portal*' -not -name '*portal_idx*' -delete

  3. Repeat step 2 for "analyticsportal", "maintenanceportal", "systemportal".
  4. Run systemctl start rsync.service on all controller nodes.
  5. Run systemctl restart avi-indexer.service and systemctl restart avi-logmgr.service on all nodes.
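The file removal in step 2 can be wrapped in a helper that defaults to a dry run, so nothing is deleted by accident. This is a sketch under stated assumptions rather than a supported tool: the function name clean_old_portal_events and its "delete" argument are hypothetical, while the find patterns are taken from the commands above. Because the '*portal*' pattern matches any file name containing "portal", a single pass also matches the analyticsportal, maintenanceportal and systemportal files.

```shell
# List (default) or delete (second argument "delete") old-format portal
# event files under the given directory. Files containing "portal_idx"
# in their name are always kept.
clean_old_portal_events() {
    dir="$1"
    mode="${2:-dry-run}"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ "$mode" = "delete" ]; then
        find "$dir" -type f -name '*portal*' -not -name '*portal_idx*' -delete
    else
        find "$dir" -type f -name '*portal*' -not -name '*portal_idx*'
    fi
}

# Example sequence on each controller node (systemctl lines are from the
# article; replace /path/to/folder with the controller-xxx events folder):
#   systemctl stop rsync.service
#   clean_old_portal_events /path/to/folder            # dry run, review output
#   clean_old_portal_events /path/to/folder delete
#   systemctl start rsync.service
#   systemctl restart avi-indexer.service
#   systemctl restart avi-logmgr.service
```

Reviewing the dry-run output before passing "delete" is the main safeguard here, since the deletion cannot be undone.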