NSX Intelligence appliance running out of disk space
Article ID: 315189
Products
VMware NSX
Issue/Introduction
Symptoms:
The NSX Intelligence appliance is running out of disk space.
The NSX Intelligence appliance /data partition becomes full.
The NSX Intelligence feature is not functioning and the User Interface (UI) is not responding.
Environment
VMware NSX-T Data Center 2.5.x
VMware NSX-T Data Center
Cause
This issue occurs because historical/inactive Spark worker directories under the /data/spark/worker parent directory are not removed, and because the /data/spark/worker/driver-* directories contain Spark's internal stdout log file, which does not rotate and keeps growing even when the appliance is not processing any new data.
Over a period of several weeks to several months, this causes the appliance to run out of disk space, and NSX Intelligence eventually becomes non-functional.
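To confirm that this is the cause on a given appliance, the usage of the /data partition and of the Spark worker directories can be checked from the appliance shell. The commands below are standard Linux utilities shown only as an illustration; the size threshold in the find command is an arbitrary example value:
df -h /data
du -sh /data/spark/worker/* | sort -h | tail
find /data/spark/worker -name stdout -size +100M -exec ls -lh {} \;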
Resolution
This is a known issue affecting VMware NSX-T Data Center 2.5.x.
Workaround:
If the issue has already occurred:
1. Delete all data from the /data/spark/worker/ directory by running this command:
rm -rf /data/spark/worker/*
2. Delete all data from the /data/spark/flowCorrelator/checkpoints directory by running this command:
rm -rf /data/spark/flowCorrelator/checkpoints/*
3. Delete all data from the /data/spark/local/ directory by running this command:
rm -rf /data/spark/local/*
4. Reboot the NSX Intelligence appliance.
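After the appliance has rebooted, the reclaimed space can be verified with the standard df utility, for example:
df -h /data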
To prevent this issue from occurring:
1. Edit this file on the NSX Intelligence appliance:
/opt/apache-spark_2.x.x/conf/spark-defaults.conf
2. Append the following three configuration options to the end of the value of the SPARK_WORKER_OPTS environment variable and save the file. (Do not forget the hyphen before the "D" at the beginning of each option.)
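The three options themselves are not reproduced in this article text. As an illustration only, Apache Spark's standalone worker supports automatic cleanup of old worker directories through Java system properties passed in SPARK_WORKER_OPTS: spark.worker.cleanup.enabled turns the periodic cleanup on, spark.worker.cleanup.interval sets how often (in seconds) it runs, and spark.worker.cleanup.appDataTtl sets how long (in seconds) inactive directories are kept. The sketch below shows the format of such options with example values; the exact properties and values required for NSX Intelligence may differ:
-Dspark.worker.cleanup.enabled=true
-Dspark.worker.cleanup.interval=1800
-Dspark.worker.cleanup.appDataTtl=604800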