NSX-T Global Manager alert '/' disk partition has reached 80% which is at or above the high threshold value of 80%


Article ID: 322585



Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • NSX Federation environment.
  • In the NSX-T Global Manager UI, under the Global Manager site's alarms, the following alarm is generated:
    • Event Type: Manager Disk Usage High
    • Error: "The disk usage for the Manager node disk partition / has reached 80% which is at or above the high threshold value of 80%."
  • When logged in to the Global Manager appliance as root, df -h shows the root partition (/) at 80% usage or higher:
# df -h
Filesystem                   Size  Used Avail Use% Mounted on
udev                          24G     0   24G   0% /dev
tmpfs                        4.8G  7.4M  4.8G   1% /run
/dev/sda2                     11G  8.6G  1.2G  80% /
tmpfs                         24G   54M   24G   1% /dev/shm
tmpfs                        5.0M     0  5.0M   0% /run/lock
tmpfs                         24G     0   24G   0% /sys/fs/cgroup
/dev/sda1                    944M  9.4M  870M   2% /boot
/dev/mapper/nsx-config__bak   29G   72M   28G   1% /config
/dev/mapper/nsx-var+log       27G   11G   16G  41% /var/log
/dev/mapper/nsx-var+dump     9.4G   37M  8.8G   1% /var/dump
/dev/mapper/nsx-repository    31G   15G   15G  50% /repository
/dev/mapper/nsx-tmp          3.7G   17M  3.5G   1% /tmp
/dev/mapper/nsx-image         42G   11G   29G  28% /image
tmpfs                        4.8G     0  4.8G   0% /run/user/1007
tmpfs                        4.8G     0  4.8G   0% /run/user/0
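
The same condition can be checked from a script by reading the Use% column for /. The following is a minimal sketch, not a product command: pct_for_mount is a hypothetical helper name, and it assumes the single-line-per-filesystem output of df -P.

```shell
#!/bin/sh
# pct_for_mount is a hypothetical helper (not an NSX command).
# It reads `df -P` output on stdin and prints the Use% value, without
# the percent sign, for the mount point given as the first argument.
pct_for_mount() {
    awk -v m="$1" '$NF == m { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Example, using the 80% threshold from the alarm text:
#   pct=$(df -P | pct_for_mount /)
#   [ "$pct" -ge 80 ] && echo "/ is at ${pct}%, at or above the threshold"
```

The helper matches on the last column, so / is not confused with /boot or the other mount points.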
  • The Global Manager appliance syslog, /var/log/syslog, contains entries similar to the following:
2022-09-11T05:01:37.786Z NSX 17871 MONITORING [nsx@6876 alarmId="d30a7034-efcc-43bc-9684-15c4ea82829c" alarmState="OPEN" comp="global-manager" entId="00000000-0000-0009-0000-000000000010" eventFeatureName="manager_health" eventSev="MEDIUM" eventState="On" eventType="manager_disk_usage_high" level="WARNING" nodeId="bcf43e42-22e5-443c-5f52-23b5259f4b98" subcomp="monitoring"] The disk usage for the Manager node disk partition / has reached 80% which is at or above the high threshold value of 80%.
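
To see how often the alarm has been logged, the eventType field from the entry above can be counted with grep. This is a minimal sketch: count_disk_alarms is a hypothetical helper name, and it assumes the log line format shown above.

```shell
#!/bin/sh
# count_disk_alarms is a hypothetical helper (not an NSX command).
# It prints how many lines in the given log file carry the
# manager_disk_usage_high event type shown in the entry above.
count_disk_alarms() {
    grep -c 'eventType="manager_disk_usage_high"' "$1"
}

# Example:
#   count_disk_alarms /var/log/syslog
```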
 

To find the directories consuming the most space under /opt/vmware, run the command below. In this example, the largest directory is /opt/vmware/nsx-jar-repository.

# du -h --max-depth=1 /opt/vmware | sort -hr
4.9G /opt/vmware
848M /opt/vmware/nsx-jar-repository
531M /opt/vmware/gm-tomcat
436M /opt/vmware/proton-tomcat
326M /opt/vmware/upgrade-coordinator-tomcat
307M /opt/vmware/cross-cloud-upgrade-coordinator-tomcat
301M /opt/vmware/migration-coordinator-tomcat
[...]
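
Because several directories in the df output above (for example /var/log and /repository) are separate filesystems, a scan that starts at / can be limited to the root filesystem with the -x option of GNU du. The following is a minimal sketch: top_dirs is a hypothetical helper name, and it assumes GNU du.

```shell
#!/bin/sh
# top_dirs is a hypothetical helper (not an NSX command).
# It lists the largest first-level directories under $1 in KiB, largest
# first, without crossing into other mounted filesystems (-x).
# $2 is the number of lines to print (default 10).
top_dirs() {
    du -xk --max-depth=1 "$1" 2>/dev/null | sort -nr | head -n "${2:-10}"
}

# Example:
#   top_dirs / 10
```

The first line of the output is the total for the starting directory itself, followed by its subdirectories in descending order of size.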


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

In Federation environments, the /nonconfig directory on the Global Manager appliance is not mapped to its own partition and therefore consumes space on the root partition.

Resolution

This issue is resolved in NSX 3.2.3 and 4.1.0.

Workaround:
  • Before proceeding with workaround steps, ensure a backup has been taken and the backup passphrase is known.
  • The following procedure is only intended for use in a Federation environment.
  • Do not follow this procedure on a Local Manager in a non-Federation environment.
  • Deleting files from the filesystem is irreversible. If this procedure is performed on a system it is not intended for, a restore from backup may be necessary to reverse the change.

For Global Managers:
Delete .jar files under /opt/vmware/nsx-jar-repository/jars/ from each Global Manager node.

Log in as root and run:
rm -f /opt/vmware/nsx-jar-repository/jars/*.jar
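
Since the deletion cannot be undone, it may be useful to see what the command above would remove before running it. The following is a read-only sketch: jar_summary is a hypothetical helper name, and it assumes GNU find and du.

```shell
#!/bin/sh
# jar_summary is a hypothetical helper (not an NSX command).
# It prints how many *.jar files sit directly in directory $1 and their
# combined disk usage in KiB. It only reads; it does not delete anything.
jar_summary() {
    count=$(find "$1" -maxdepth 1 -type f -name '*.jar' | awk 'END { print NR }')
    kib=$(find "$1" -maxdepth 1 -type f -name '*.jar' -exec du -ck {} + |
        awk '$2 == "total" { s += $1 } END { print s + 0 }')
    echo "$count files, $kib KiB"
}

# Example:
#   jar_summary /opt/vmware/nsx-jar-repository/jars
```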

Then, as root, run:
service search stop
rm -r /nonconfig/search/
mkdir -p /config/search
chown -R elasticsearch:elasticsearch /config/search/
ln -s /config/search /nonconfig/search
service search start
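
The steps above should leave /nonconfig/search as a symbolic link to /config/search. The following is a minimal sketch for confirming that on each node: check_search_link is a hypothetical helper name, and it assumes readlink -f from GNU coreutils.

```shell
#!/bin/sh
# check_search_link is a hypothetical helper (not an NSX command).
# It succeeds when $1 is a symbolic link that resolves to the same
# location as $2, which is the state the steps above aim for.
check_search_link() {
    if [ ! -L "$1" ]; then
        echo "$1 is not a symbolic link"
        return 1
    fi
    if [ "$(readlink -f "$1")" != "$(readlink -f "$2")" ]; then
        echo "$1 does not resolve to $2"
        return 1
    fi
    echo "ok"
}

# Example:
#   check_search_link /nonconfig/search /config/search
```

Ownership of the target can be inspected separately, for example with stat -c '%U:%G' /config/search (GNU stat), which should report elasticsearch:elasticsearch after the chown step.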


After completing the above steps on all three Global Manager nodes, switch to the admin user and trigger a full re-indexing on each node:
start search resync all
 

For non-Global Manager environments:

The jar files can be deleted using the command: 

rm -f /opt/vmware/nsx-jar-repository/jars/*.jar

This helps to reduce space usage on the Managers.
There is no need to change the search partition mounting, as that issue is not present on non-Global Managers.