Filesystem                     Size  Used Avail Use% Mounted on
tmpfs                          9.5G  1.3M  9.5G   1% /run
/dev/sda3                       11G  4.2G  5.6G  43% /
tmpfs                           48G  4.5M   48G   1% /dev/shm
tmpfs                          5.0M     0  5.0M   0% /run/lock
/dev/mapper/nsx-repository      31G   12G   19G  39% /repository
/dev/mapper/nsx-tmp            9.6G  169M  9.0G   2% /tmp
/dev/mapper/nsx-secondary       98G  5.8G   88G   7% /nonconfig
/dev/mapper/nsx-var+dump        20G   24K   19G   1% /var/dump
/dev/mapper/nsx-var+log         37G   32G  2.9G  92% /var/log
/dev/sda1                      942M  7.2M  870M   1% /boot
/dev/mapper/nsx-config__bak     29G  3.0G   25G  11% /config_bak
/dev/mapper/nsx-config          29G  1.4G   27G   6% /config
/dev/mapper/nsx-image           62G   20G   40G  33% /image
tmpfs                          9.5G  8.0K  9.5G   1% /run/user/1007
tmpfs                          9.5G  8.0K  9.5G   1% /run/user/0
Example: /var/log/proton# ls -lrt nsxapi*
-rw-r----- 1 uproton uproton 262145339 Feb 8 06:49 nsxapi.60.log
-rw-r----- 1 uproton uproton 262151464 Feb 8 07:04 nsxapi.59.log
-rw-r----- 1 uproton uproton 262144771 Feb 8 07:24 nsxapi.58.log
-rw-r----- 1 uproton uproton 262144117 Feb 8 07:38 nsxapi.57.log
-rw-r----- 1 uproton uproton 262144176 Feb 8 08:21 nsxapi.56.log
-rw-r----- 1 uproton uproton 262144614 Feb 8 08:38 nsxapi.55.log
-rw-r----- 1 uproton uproton 262144068 Feb 8 09:32 nsxapi.54.log
-rw-r----- 1 uproton uproton 262144355 Feb 8 09:48 nsxapi.53.log
-rw-r----- 1 uproton uproton 262144273 Feb 8 10:09 nsxapi.52.log
-rw-r----- 1 uproton uproton 262144149 Feb 8 10:18 nsxapi.51.log
-rw-r----- 1 uproton uproton 262144435 Feb 8 10:45 nsxapi.50.log
-rw-r----- 1 uproton uproton 262145156 Feb 8 11:32 nsxapi.49.log
-rw-r----- 1 uproton uproton 262144092 Feb 8 11:46 nsxapi.48.log
-rw-r----- 1 uproton uproton 262144137 Feb 8 11:57 nsxapi.47.log
-rw-r----- 1 uproton uproton 262144072 Feb 8 12:04 nsxapi.46.log
-rw-r----- 1 uproton uproton 262144142 Feb 8 12:09 nsxapi.45.log
-rw-r----- 1 uproton uproton 262144835 Feb 8 12:18 nsxapi.44.log
-rw-r----- 1 uproton uproton 262144298 Feb 8 12:33 nsxapi.43.log
The alarm manager_health.manager_disk_usage_high or manager_health.manager_disk_usage_very_high is raised, indicating that the log partition has exceeded its capacity threshold.
Example: CRITICAL NSX 3133 [nsx@4413 comp="nsx-manager" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="manager_health" eventType="manager_disk_usage_very_high" eventSev="critical" eventState="On" entId="########" logger="nsx_monitoring.clientlibrary.event_source"] At the time this alarm was raised, the disk usage for the Manager node disk partition /var/log reached 90% which is at or above the very high threshold value of 90%.
VMware NSX
This issue is caused by manually decompressing rolled NSX log files (for example, .gz archives) directly within the /var/log directory of the NSX Manager.
When these files are manually unzipped, the resulting '.log' files are no longer recognized by the automated rotation and re-compression routines. Consequently, the uncompressed files remain on the file system indefinitely and accumulate until the /var/log partition reaches capacity, which may lead to management plane instability. Manually modifying the log structure in this way is considered an unsupported administrative action.
To resolve this issue on a live setup, follow the steps below:
/var/log/proton/.<log_name>.<number>.log.
Example: nsxapi.1.log, nsxapi.2.log, through nsxapi.20.log.
Caution: Do not delete the active log file currently being written to (e.g., nsxapi.log). Only remove the uncompressed historical logs.
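The cleanup described above can be sketched as a small shell helper. This is an illustration only, not a vendor-supplied procedure: the function names list_uncompressed_rotated and remove_uncompressed_rotated are hypothetical, and the sketch assumes that uncompressed historical logs always follow the <log_name>.<number>.log pattern. It defaults to a dry run so the list of files can be reviewed before anything is removed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of NSX.

# Print uncompressed rotated logs (<log_name>.<number>.log) in a directory.
# The active log (e.g. nsxapi.log) has no numeric suffix and is skipped,
# as are compressed archives such as *.log.gz.
list_uncompressed_rotated() {
    lur_dir="$1"
    for lur_file in "$lur_dir"/*.log; do
        [ -f "$lur_file" ] || continue
        lur_name=$(basename "$lur_file")
        if printf '%s\n' "$lur_name" | grep -Eq '^.+\.[0-9]+\.log$'; then
            printf '%s\n' "$lur_file"
        fi
    done
}

# Remove the files found above. Without the second argument "delete",
# only report what would be removed (dry run).
remove_uncompressed_rotated() {
    rur_dir="$1"
    rur_mode="${2:-dry-run}"
    list_uncompressed_rotated "$rur_dir" | while IFS= read -r rur_file; do
        if [ "$rur_mode" = "delete" ]; then
            rm -f -- "$rur_file"
        else
            echo "would remove: $rur_file"
        fi
    done
}
```

For example, `remove_uncompressed_rotated /var/log/proton` prints the candidates, and `remove_uncompressed_rotated /var/log/proton delete` removes them once the list has been checked.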
Prevention: Do not unzip any log files on the manager under /var/log/. If analysis is needed, copy the log files off the node and unzip them elsewhere.
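As a rough sketch of the "unzip elsewhere" advice, the helper below decompresses a copied archive into a separate analysis directory and refuses to write anywhere under /var/log. It is meant to run on the machine the logs were copied to. The function name extract_log_copy and the example paths are hypothetical, and how the archive is copied off the node depends on the transfer method your environment permits.

```shell
#!/bin/sh
# Sketch only: extract_log_copy is a hypothetical helper, not an NSX tool.

# Decompress a copied .gz log into dest_dir, leaving the archive untouched.
# Prints the path of the decompressed file on success.
extract_log_copy() {
    elc_src="$1"
    elc_dest="$2"
    case "$elc_dest" in
        /var/log|/var/log/*)
            echo "refusing to extract under /var/log" >&2
            return 1
            ;;
    esac
    mkdir -p "$elc_dest" || return 1
    elc_out="$elc_dest/$(basename "$elc_src" .gz)"
    # gzip -dc writes the decompressed data to stdout and keeps the source file.
    gzip -dc -- "$elc_src" > "$elc_out" || return 1
    printf '%s\n' "$elc_out"
}
```

For example, `extract_log_copy ./nsxapi.1.log.gz ./analysis` (an illustrative file name) writes ./analysis/nsxapi.1.log and leaves the archive in place.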
Monitoring: Use existing alarms manager_health.manager_disk_usage_high and manager_health.manager_disk_usage_very_high for /var/log.
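Alongside the alarms, a manual spot check of the partition can be sketched as follows. This is an illustration, not a replacement for the built-in alarms. The function names usage_percent and classify_usage are hypothetical. The 90% very-high value comes from the alarm message shown earlier; the 75% high value is an assumption and should be replaced with the threshold configured in your environment.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; thresholds are parameters.

# Print the Use% of the filesystem holding the given path, without the % sign.
# df -P keeps each filesystem on one line even when device names are long.
usage_percent() {
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Classify a percentage against the high and very-high thresholds.
# Defaults: high=75 (assumption), very_high=90 (from the alarm message).
classify_usage() {
    cu_pct="$1"
    cu_high="${2:-75}"
    cu_very_high="${3:-90}"
    if [ "$cu_pct" -ge "$cu_very_high" ]; then
        echo "very_high"
    elif [ "$cu_pct" -ge "$cu_high" ]; then
        echo "high"
    else
        echo "ok"
    fi
}
```

For example, `classify_usage "$(usage_percent /var/log)"` reports which band the partition currently falls into.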