2023-04-28T18:33:15.944295+00:00 <Edge-name> kernel - - - [13094823.723467] audit: audit_backlog=1187323 > audit_backlog_limit=8192
2023-04-28T18:33:15.944297+00:00 <Edge-name> kernel - - - [13094823.723469] audit: audit_lost=17229471 audit_rate_limit=0 audit_backlog_limit=8192
2023-04-28T18:33:15.944298+00:00 <Edge-name> kernel - - - [13094823.723470] audit: backlog limit exceeded
/var/log/core/
core.java
core.opsAgent
core.perl
core.python3
/var/log/core/
core.datapathd
core.lb-dispatcher
The kernel contains a defect in how it queues audit backlog messages. Because of this defect, the backlog queue can grow past its configured maximum (audit_backlog_limit): new entries are added to the queue but none are ever removed. Eventually the Manager/Edge runs completely out of memory, and multiple processes encounter memory allocation errors and crash.
This issue is fixed in VMware NSX-T 3.2.3 and later.
This issue is fixed in VMware NSX-T 4.1.2 and later.
sed -i '/GRUB_CMDLINE_LINUX/s/audit=1//' /etc/default/grub
update-grub
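Before editing /etc/default/grub in place, it can help to preview what the sed expression above will do. The sketch below runs the same expression without -i, so the file is left untouched and only the resulting GRUB_CMDLINE_LINUX lines are printed. The function name preview_audit_removal is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show the GRUB_CMDLINE_LINUX lines as they would
# look after the workaround, without modifying the file.
# $1 is the grub defaults file (defaults to /etc/default/grub).
preview_audit_removal() {
    src=${1:-/etc/default/grub}
    # Same expression as the workaround, but without -i, so sed writes
    # the result to stdout instead of changing the file.
    sed '/GRUB_CMDLINE_LINUX/s/audit=1//' "$src" | grep 'GRUB_CMDLINE_LINUX'
}
```

Calling `preview_audit_removal` with no argument previews the live file. If the printed lines no longer contain audit=1, the in-place command will have the intended effect. The new kernel command line is only used from the next boot onward.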
Multiple Manager/Edge processes are repeatedly crashing and generating cores, causing dataplane impact.