NSX for vSphere 6.4.x postgres database queuing too many events and alarms causing high NSX Manager CPU utilization and filling of ‘/common’ partition

Article ID: 317554

Products

VMware NSX

Issue/Introduction

  • NSX Manager reports ‘no space left’ on the ‘/common’ partition
  • NSX Manager shows high CPU utilization
  • SSH access to NSX Manager fails, and the postgres service is shown as stopped in the NSX Manager UI

ESX Host vmkernel.log messages
2018-MM-DDT19:20:36.453Z cpu2:2157823)pfp_insert_ruleid: Error Inserting rule Curr 1019, new 1019
2018-MM-DDT19:20:36.453Z cpu2:2157823)pfp_insert_ruleid: Error Inserting rule Curr 1019, new 1019
 
NSX Manager vsm.log messages
2018-MM-DD 09:16:48.775 UTC INFO SimpleAsyncTaskExecutor-1 EdgeUtils:451 - - [nsxv@6876 comp="nsx-manager" subcomp="manager"] populateSystemEvent parameters : sourceName edge-20, morefIdOfObjectOnVc datacenter-2, moduleName vShield Edge LoadBalancer, eventCode EDGE_LOAD_BALANCER_BACKEND_SERVER_DOWN, severity Informational, messageParams [BOPUK_HTTP_Pool, any, 10.176.28.136] eventMetaData {edgeId=edge-20, edgeVmName=VMNAME, ipAddress=XX.YY.ZZ.MM, hostId=host-391789, edgeVmVcUUId=########-####-####-####-########0fb3, edgeVmId=vm-401525, poolName=BOPUK_HTTP_Pool}
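
To see which source and event code account for most of the entries, lines like the vsm.log message above can be tallied. The following is a minimal sketch, assuming the log has been copied off the NSX Manager and that each entry follows the populateSystemEvent format shown above. The names EVENT_RE and tally_events are hypothetical and are not part of any NSX tooling.

```python
import re
from collections import Counter

# Matches the sourceName and eventCode fields as they appear in the
# sample vsm.log entry; other log formats are not handled.
EVENT_RE = re.compile(
    r"populateSystemEvent parameters : sourceName (?P<source>\S+),"
    r".*?eventCode (?P<code>\w+),"
)


def tally_events(lines):
    """Count (sourceName, eventCode) pairs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match.group("source"), match.group("code"))] += 1
    return counts
```

Passing an open file handle for a local copy of vsm.log to tally_events and calling most_common(10) on the result lists the noisiest sources first.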

Cause

Events and alarms generated by cluster hosts or by deployed Edge services in the ESG, such as VPN and Load Balancer, are stored in the NSX Manager postgres database. Continuous, frequent generation of events and alarms from hosts and ESGs overwhelms this database. A purge task runs in NSX Manager at regular intervals to remove old entries from the postgres database, but it cannot keep up with the number of events and alarms being generated. As a result, the database grows until the ‘/common’ partition is filled to 100% capacity.
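
The imbalance between event generation and purging can be illustrated with simple arithmetic. The sketch below uses made-up rates, not measured or default NSX Manager values, and the function name backlog_after is hypothetical. It only shows that the stored row count keeps growing whenever more rows arrive per purge interval than the purge removes.

```python
def backlog_after(intervals, arrivals_per_interval, purged_per_interval, start=0):
    """Rows left in the table after a number of purge intervals.

    All rates are illustrative assumptions, not NSX Manager defaults.
    """
    rows = start
    for _ in range(intervals):
        rows += arrivals_per_interval              # new events and alarms stored
        rows = max(0, rows - purged_per_interval)  # purge removes what it can
    return rows
```

With 1000 arrivals and 400 purged rows per interval, the backlog grows by 600 rows every interval. With 300 arrivals and the same purge capacity, the table stays empty.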

Resolution

This issue is resolved in VMware NSX for vSphere 6.4.5.

Attachments

purge_systemevents_alarms_auditLogs.sql