NSX-T Manager alert '/' disk partition has reached 80% which is at or above the high threshold value of 80%


Article ID: 322585


Updated On:

Products

VMware NSX

Issue/Introduction

  • NSX Federated or non-Federated environment.
  • In the NSX-T Manager UI (Global and Local), the following alarm is generated:
    • Event Type: Manager Disk Usage High
    • Error: "The disk usage for the Manager node disk partition / has reached 80% which is at or above the high threshold value of 80%."
  • If you log in to the NSX Manager appliance as root and check the disk usage using df -h, the root partition usage is at 80% or above.
# df -h
Filesystem                   Size  Used Avail Use% Mounted on
udev                          24G     0   24G   0% /dev
tmpfs                        4.8G  7.4M  4.8G   1% /run
/dev/sda2                     11G  8.6G  1.2G  80% /
tmpfs                         24G   54M   24G   1% /dev/shm
tmpfs                        5.0M     0  5.0M   0% /run/lock
tmpfs                         24G     0   24G   0% /sys/fs/cgroup
/dev/sda1                    944M  9.4M  870M   2% /boot
/dev/mapper/nsx-config__bak   29G   72M   28G   1% /config
/dev/mapper/nsx-var+log       27G   11G   16G  41% /var/log
/dev/mapper/nsx-var+dump     9.4G   37M  8.8G   1% /var/dump
/dev/mapper/nsx-repository    31G   15G   15G  50% /repository
/dev/mapper/nsx-tmp          3.7G   17M  3.5G   1% /tmp
/dev/mapper/nsx-image         42G   11G   29G  28% /image
tmpfs                        4.8G     0  4.8G   0% /run/user/1007
tmpfs                        4.8G     0  4.8G   0% /run/user/0
  • In the Global or Local Manager appliance syslog (/var/log/syslog), entries similar to the following can be seen:
2022-09-11T05:01:37.786Z NSX 17871 MONITORING [nsx@6876 alarmId="d30a7034-efcc-43bc-9684-15c4ea82829c" alarmState="OPEN" comp="global-manager" entId="00000000-0000-0009-0000-000000000010" eventFeatureName="manager_health" eventSev="MEDIUM" eventState="On" eventType="manager_disk_usage_high" level="WARNING" nodeId="bcf43e42-22e5-443c-5f52-23b5259f4b98" subcomp="monitoring"] The disk usage for the Manager node disk partition / has reached 80% which is at or above the high threshold value of 80%.
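
To check whether the alarm is still being reported, the syslog can be searched for the corresponding event type. This is a minimal sketch, assuming the entries have not yet rotated out of /var/log/syslog (rotated files such as syslog.1 may also contain matches):

# grep manager_disk_usage_high /var/log/syslog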
 

To find the directories consuming the most space under /opt/vmware, run the command below. In this case, the largest consumer is /opt/vmware/nsx-jar-repository:

# du -h --max-depth=1 /opt/vmware | sort -hr
4.9G /opt/vmware
848M /opt/vmware/nsx-jar-repository
531M /opt/vmware/gm-tomcat
436M /opt/vmware/proton-tomcat
326M /opt/vmware/upgrade-coordinator-tomcat
307M /opt/vmware/cross-cloud-upgrade-coordinator-tomcat
301M /opt/vmware/migration-coordinator-tomcat
[...]
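
If /opt/vmware is not the main consumer on a given appliance, the same kind of check can be run against the root filesystem as a whole. This is a general sketch, not specific to any NSX version; the -x (--one-file-system) option keeps du on the root partition so that separately mounted filesystems such as /var/log and /repository are not counted:

# du -h --max-depth=1 -x / 2>/dev/null | sort -hr | head -20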



Environment

VMware NSX-T Data Center

Cause

Prior to VMware NSX-T Data Center 3.2.3, the disk space usage of the "/" partition is by default close to the 80% threshold.

Resolution

This issue is resolved in VMware NSX-T Data Center 3.2.3 and VMware NSX 4.1.0.

Workaround:

  • Before proceeding with the workaround steps, ensure a backup has been taken and the backup passphrase is known.
  • Deleting files from the filesystem is a non-reversible action. If this procedure is performed on a system where it is not intended, it may be necessary to restore from backup to reverse the change.


For Managers (non-Federated):

Note: The deletion of the *.jar files is safe on non-Federated Local Managers.

  1. Log in as root on the NSX Manager VM.
  2. Free up space by deleting the jar files using the command: rm -f /opt/vmware/nsx-jar-repository/jars/*.jar
  3. Repeat steps 1 and 2 on each Local NSX Manager VM.
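
After the jar files have been deleted on a node, the root partition usage can be re-checked to confirm it has dropped back below the threshold. This is an optional verification sketch, not a required step:

# du -sh /opt/vmware/nsx-jar-repository/jars
# df -h /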


For Global Managers (Federation):

Note: In Federation environments, the Global Manager appliance directory /nonconfig does not have a mapping to a partition and therefore uses the root partition.

  1. Delete .jar files under /opt/vmware/nsx-jar-repository/jars/ from each Global Manager node.
    Log in as root and run: rm -f /opt/vmware/nsx-jar-repository/jars/*.jar
  2. Then, on each Global Manager, run the following commands:
    • service search stop
    • rm -r /nonconfig/search/
    • mkdir -p /config/search
    • chown -R elasticsearch:elasticsearch /config/search/
    • ln -s /config/search /nonconfig/search
    • service search start
  3. After running through the above steps on all 3 Global Manager nodes, switch to the admin user and trigger a full re-indexing on each node: start search resync all
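
As an optional verification on each Global Manager node (a sketch only, not part of the documented procedure), confirm that /nonconfig/search is now a symlink into the /config partition and that the root partition usage has dropped:

# ls -ld /nonconfig/search /config/search
# df -h /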