NSX Controller disk 100% full

Article ID: 339171


Updated On:

Products

VMware NSX

Issue/Introduction

The disk on one or all of the controller nodes in an NSX cluster fills up, and the affected nodes enter a disconnected state.

Environment

VMware NSX for vSphere 6.4.x
VMware NSX for vSphere 6.3.x

Cause

The crond job that restarts the rsyslog service after log rotation uses the wrong parameter: "reload" instead of "restart".

Resolution

This issue is resolved in VMware NSX Data Center for vSphere 6.4.2 and 6.3.7.

Workaround:

1. Edit the /etc/logrotate.d/rsyslog file and replace "reload" with "restart" in both postrotate scripts, as marked:

 

/var/log/syslog
{
    rotate 5
    size 100M
    missingok
    notifempty
    compress
    postrotate
        reload rsyslog >/dev/null 2>&1 || true   <--- REPLACE RELOAD WITH RESTART
    endscript
}

/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/mail.log
/var/log/daemon.log
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
    rotate 4
    size weekly
    missingok
    notifempty
    compress
    sharedscripts
    postrotate
        reload rsyslog >/dev/null 2>&1 || true   <--- REPLACE RELOAD WITH RESTART
    endscript
}
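The edit in step 1 can also be scripted. The following is a minimal sketch, not part of the original procedure, assuming a sed that supports in-place editing with -i (GNU sed or similar) is available on the controller. The function name fix_rsyslog_postrotate and the default backup path are hypothetical. The backup is written outside /etc/logrotate.d because logrotate may try to parse stray copies left in that directory.

```shell
#!/bin/sh
# Hypothetical helper: change "reload rsyslog" to "restart rsyslog" in a
# logrotate configuration file, after saving a copy of the original.
#   $1  configuration file (assumed default: /etc/logrotate.d/rsyslog)
#   $2  backup location    (assumed default: /tmp/rsyslog.logrotate.orig)
fix_rsyslog_postrotate() {
    conf="${1:-/etc/logrotate.d/rsyslog}"
    backup="${2:-/tmp/rsyslog.logrotate.orig}"

    # Keep the original outside the logrotate include directory.
    cp -p "$conf" "$backup" || return 1

    # Only lines that start (after optional whitespace) with
    # "reload rsyslog" are changed; indentation is preserved.
    sed -i 's/^\([[:space:]]*\)reload rsyslog/\1restart rsyslog/' "$conf" || return 1

    # Print the resulting postrotate commands for a visual check.
    grep -n 'restart rsyslog' "$conf"
}
```

Calling fix_rsyslog_postrotate with no arguments edits /etc/logrotate.d/rsyslog; review the printed lines to confirm that both postrotate scripts now call restart.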

 

2. Run this command to list the files under /var/log in ascending order of size; the biggest file is the last line of the output:

 

$ find /var/log -type f -exec ls -l {} \;|sort -k 5n|awk '{size=$5;var[1024**3]="Gb"; var[1024**2]="Mb";var[1024]="Kb"; for (x=1024**3; x>=1024; x/=1024) {if (size >=x){printf "%6.2f %s\t%s\n", size/x,var[x],$9;break}}}'

 

3. Then delete that file:

 

$ rm /var/log/filename

 

4. If df -h still shows /var/log at 100%, compare that figure with the output of du -h /var/log. If df reports more used space than du, the rsyslog process still holds the deleted file open, so its disk space has not been released.

 

5. Get the PID of the rsyslog process and the command line that was used to start it:

 

$ ps -aux | grep rsyslog

 

6. List the files held open by the rsyslog process to verify that it still holds the deleted file:

 

$ ls -l /proc/<PID>/fd | grep deleted

 

or

 

$ /usr/sbin/lsof | grep deleted

 

7. Kill the process:

 

$ kill -9 <PID>

 

8. Start the process again by running the command line from the output of step 5, for example:

 

$ /usr/sbin/rsyslogd -n
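
The check in step 6 can be wrapped in a small function. This is a minimal sketch, assuming a Linux /proc filesystem and permission to read /proc/<PID>/fd; the function name deleted_open_files is hypothetical and is not part of the original procedure. It prints every file descriptor of the given process whose target has been deleted.

```shell
#!/bin/sh
# Hypothetical helper: list deleted files that a process still holds open.
#   $1  PID of the process to inspect (for example, the rsyslog PID from step 5)
deleted_open_files() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip entries that
        # cannot be read (closed in the meantime, or no permission).
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            # The kernel appends " (deleted)" to the link target of an
            # open file that has been unlinked.
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

For example, deleted_open_files <PID> with the PID from step 5 prints nothing when the process holds no deleted files. Any line it prints names a file whose space is released only after the process is stopped, as in steps 7 and 8.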