Disk usage on the Bare Metal Edge /image partition spikes while a support bundle is being generated

Article ID: 345923

Updated On:

Products

VMware NSX

Issue/Introduction

  • This article helps resolve high disk usage on the /image partition of a Bare Metal Edge while a support bundle is being generated.


Symptoms:
  • Disk utilization on the Edge spikes during support bundle collection.
  • High disk usage alarms are raised when utilization rises above the 90% threshold.
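
To confirm the symptom from a shell, the usage of the partition can be compared against the 90% threshold. The following is a minimal sketch, assuming a POSIX shell with `df` and `awk` available on the node; the function names `usage_percent` and `is_above_threshold` are hypothetical helpers for illustration and are not part of NSX.

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage (digits only) of the
# filesystem that holds the given path.
usage_percent() {
    # df -P prints one line per filesystem; column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: return 0 when usage is at or above the threshold,
# 1 when it is below, and 2 when the usage could not be read.
is_above_threshold() {
    used=$(usage_percent "$1")
    [ -n "$used" ] || return 2
    [ "$used" -ge "$2" ]
}
```

On an Edge node, a call such as `is_above_threshold /image 90 && echo "/image is at or above 90%"` would report the condition that raises the alarm.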

Relevant log location:

/var/log/
2022-xx-xxTxx:xx:xx.xxxZ nsx-edge NSX 2430 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="edge_disk_usage_very_high" eventSev="critical" eventState="On"] The disk usage for the Edge node disk partition /image has reached 100% which is at or above the very high threshold value of 90%.

Environment

VMware NSX-T Data Center

Cause

  • The /image partition holds several files and logs. For example, it houses the nsx file-store, which temporarily stores the support bundle and the capture files.
  • During an upgrade, the /image partition is also used to temporarily store the unpacked upgrade bundle files, which could be many GB in size.
  • It is therefore expected that usage of the /image partition increases while a support bundle is being generated. An alarm means that either extra files in the nsx file-store are using up space, or the temporary bundle files are large. The var/dir is where the temporary bundle files are stored.
  • Prior to version 3.2.2, the log file /var/log/lb/access.log can grow very large. This file is meant to be rotated, but log rotation does not take effect for it, which results in high disk usage every time a support bundle is generated.
  • The logrotate configuration for the LB (/etc/logrotate.d/nsx-edge-lb) rotates "/var/log/lb/*/*/*.log". This pattern does not cover access.log in the "/var/log/lb/" directory.
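
The pattern mismatch in the last point can be reproduced outside the Edge. The sketch below builds a throwaway directory tree and expands a pattern of the same shape, assuming a POSIX shell with `mktemp`; the directory names `a` and `b`, the file `nested.log`, and the function `demo_glob` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo: show which files a pattern shaped like lb/*/*/*.log matches.
demo_glob() {
    demo_root=$(mktemp -d) || return 1
    mkdir -p "$demo_root/lb/a/b"
    : > "$demo_root/lb/access.log"       # sits directly under lb/, like access.log
    : > "$demo_root/lb/a/b/nested.log"   # sits two directories below lb/
    # Expand the pattern relative to the temporary root and print the matches.
    (cd "$demo_root" && printf '%s\n' lb/*/*/*.log)
    rm -rf "$demo_root"
}

demo_glob
```

Only `lb/a/b/nested.log` is printed. The file directly under `lb/` is not matched because the pattern requires two intermediate directories, which is why access.log is never rotated by that configuration.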

Resolution

This issue is fixed in NSX-T version 3.2.2 and later, where /var/log/lb/access.log is rotated daily and whenever its size reaches 10M.

Workaround:
  • If access.log continues to grow as described in this article, clear it with the following command:
                    echo > /var/log/lb/access.log
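
The effect of this redirection can be tried on a scratch file first. The following is a minimal sketch, assuming a POSIX shell; `size_bytes` and `clear_log` are hypothetical helper names, and the sketch should be pointed at a test file rather than the live log until its behavior is understood.

```shell
#!/bin/sh
# Hypothetical helper: print the size of a file in bytes.
size_bytes() {
    # Arithmetic expansion strips the padding some wc implementations add.
    echo $(( $(wc -c < "$1") ))
}

# Hypothetical helper: empty a file in place with the same redirection as
# the workaround, then report the size before and after.
clear_log() {
    before=$(size_bytes "$1")
    echo > "$1"    # truncates the file, then writes a single newline
    after=$(size_bytes "$1")
    echo "before=${before} after=${after}"
}
```

The file is emptied in place rather than deleted, so its path and inode stay the same; one byte remains because `echo` writes a newline.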

Additional Information

Impact/Risks:
  • Disk I/O read and write operations are impacted.
  • The system may perform poorly when disk utilization exceeds the threshold.