NSX controller syslog file does not rotate and consumes all space in /var/log

Article ID: 318598

Products

VMware NSX

Issue/Introduction

Symptoms:
  • The NSX Controller version is 6.3.6 or 6.4.1.
  • The /var/log partition on the NSX Controller is full, as shown in the Disk Usage section of the show status output:
 # show status
Disk Usage:
Filesystem     1K-blocks    Used Available Use% Mounted on
devtmpfs         2009152       0   2009152   0% /dev
/dev/sda3        3997376 1175992   2595288  32% /
/dev/sda1         999320   41908    888600   5% /boot
/dev/sda7        5029504  418308   4332668   9% /image
/dev/sda4        3997376  149212   3622068   4% /var/cloudnet/data
/dev/sda5        5029504 5013120         0 100% /var/log
/dev/sda6        1998672    3116   1874316   1% /config

 
  • You can collect the NSX Controller log bundle, but some log files are missing from it (for example, /var/log/syslog.1).
  • Memory usage on the NSX Controllers may increase, leading to control cluster instability.
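
To spot the full partition quickly across several controllers, the Disk Usage table from show status can be checked with a small script. The following is a minimal sketch, not a VMware tool: the helper name full_partitions, the 90% threshold, and the shortened sample text are assumptions made for illustration, and the parsing assumes the six-column layout shown above.

```python
# Hypothetical helper: flags mount points at or above a usage threshold
# in df-style text such as the Disk Usage section of "show status".

SAMPLE_STATUS = """\
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda3        3997376 1175992   2595288  32% /
/dev/sda5        5029504 5013120         0 100% /var/log
/dev/sda6        1998672    3116   1874316   1% /config
"""


def full_partitions(disk_usage: str, threshold_pct: int = 90) -> list:
    """Return (mount_point, use_pct) pairs at or above threshold_pct."""
    flagged = []
    for line in disk_usage.splitlines():
        fields = line.split()
        # Data rows have exactly six columns; the header splits into seven
        # because "Mounted on" contains a space, so it is skipped here.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        try:
            use_pct = int(fields[4][:-1])
        except ValueError:
            continue
        if use_pct >= threshold_pct:
            flagged.append((fields[5], use_pct))
    return flagged


if __name__ == "__main__":
    for mount, pct in full_partitions(SAMPLE_STATUS):
        print(f"{mount} is {pct}% full")
```

The threshold is a parameter so that a lower value can be used to catch a /var/log partition that is filling up before it reaches 100%.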


Environment

VMware NSX for vSphere 6.4.x
VMware NSX for vSphere 6.3.x

Cause

Because of Photon OS rsyslog changes in NSX Controller 6.4.1 and 6.3.6, rsyslogd keeps writing all of its output to the same file even after that file has been rotated. As a result, /var/log/syslog stays empty while /var/log/syslog.1 can grow until the partition is full.
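
The general mechanism can be reproduced outside NSX. The sketch below is a generic POSIX illustration, not the rsyslog or controller code: it assumes rotation is done by renaming the file, and the function name simulate_rotation and the temporary directory are hypothetical. A writer that keeps its file handle open continues to write to the renamed file, so the new syslog stays empty unless the writer reopens it.

```python
import os
import tempfile


def simulate_rotation(reopen_after_rotate: bool) -> dict:
    """Write, rotate by rename, write again; return the resulting file sizes."""
    with tempfile.TemporaryDirectory() as log_dir:
        live = os.path.join(log_dir, "syslog")
        rotated = os.path.join(log_dir, "syslog.1")

        writer = open(live, "a")
        writer.write("before rotation\n")
        writer.flush()

        # Rotation by rename (an assumption for this demo): the open handle
        # follows the file itself, not its path, so it now refers to syslog.1.
        os.rename(live, rotated)
        open(live, "a").close()  # create the fresh, empty syslog

        if reopen_after_rotate:
            # What a writer is expected to do once it learns of the rotation.
            writer.close()
            writer = open(live, "a")

        writer.write("after rotation\n")
        writer.flush()
        writer.close()

        return {
            "syslog": os.path.getsize(live),
            "syslog.1": os.path.getsize(rotated),
        }


if __name__ == "__main__":
    print("no reopen:", simulate_rotation(reopen_after_rotate=False))
    print("reopen:   ", simulate_rotation(reopen_after_rotate=True))
```

Without the reopen step, every byte lands in syslog.1 and syslog remains at zero bytes, which matches the symptom described in this article. The demo relies on POSIX rename semantics and is intended for Linux.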
When the /var/log partition on an NSX Controller is full, the controller's memory usage can increase to the point where it cannot respond to network requests in a timely manner.
As a result, the controller may be unable to handle requests, including those needed to stay in quorum. If two or more controllers leave the quorum, the controller cluster goes down and the transport nodes receive no updates until the controller cluster is back in quorum.

Resolution

This issue is resolved in NSX 6.3.7 and NSX 6.4.2.
If you were previously impacted while running on NSX 6.3.6 or 6.4.1, contact Broadcom support and mention this article.

Workaround:
If you cannot upgrade, contact Broadcom support and mention this article so we can apply a workaround.