High Disk Utilization alert on NSX-T Manager appliance and /tmp is at 100%

Products

VMware NSX

Issue/Introduction

A High Disk Utilization alarm is triggered on the NSX Manager appliance.
A high number of API calls are made to the NSX Manager, for example when importing a large number of Distributed Firewall (DFW) rules into NSX.
The issue is observed only when the POST /api/v1/administration/audit-logs API is invoked periodically against the NSX Manager. This can be confirmed in the /var/log/nvpapi/api_access.log file on the NSX Manager.
The /tmp directory reaches 100% utilization, and the /var/log directory reaches 90% capacity.
The /var/log utilization returns to normal after a few minutes.
The df -h command reports /tmp as 100% full.
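
The check against api_access.log described above can be sketched as a small shell helper. This is a minimal illustration, not a supported tool: the function name count_audit_log_calls is hypothetical, and it assumes each request is logged on one line containing the literal text "POST /api/v1/administration/audit-logs"; the exact log line format may differ between NSX versions.

```shell
# Hypothetical helper: count audit-log collection requests in an API access log.
# Assumption: each request appears on a single line that contains the literal
# string "POST /api/v1/administration/audit-logs".
count_audit_log_calls() {
    local log_file="${1:-/var/log/nvpapi/api_access.log}"
    # -F: match a fixed string, -c: print the number of matching lines.
    # grep exits non-zero when nothing matches, so ignore that status.
    grep -F -c 'POST /api/v1/administration/audit-logs' "$log_file" || true
}

# Usage on the NSX Manager as root:
#   count_audit_log_calls
#   df -h /tmp /var/log
```

A count that keeps growing between runs suggests a client is still polling the API periodically.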

Upon running du -sh /tmp/* on the NSX Manager node, a large file named nsx_audit_########-####-####-####-###########.log is identified as consuming significant disk space.

root@nsxmgr:~# du -sh /tmp/*
4.0K /tmp/appliance_form_factor
68K /tmp/hsperfdata_corfu
36K /tmp/hsperfdata_nsx
36K /tmp/hsperfdata_nsx-cbm
36K /tmp/hsperfdata_nsx-idps
36K /tmp/hsperfdata_nsx-messaging
68K /tmp/hsperfdata_nsx-replicator
36K /tmp/hsperfdata_nsx-search
4.0K /tmp/hsperfdata_root
36K /tmp/hsperfdata_ucminv
36K /tmp/hsperfdata_uphc
36K /tmp/hsperfdata_uproton
36K /tmp/hsperfdata_uproxy
36K /tmp/hsperfdata_uuc
4.0K /tmp/livetrace
4.0K /tmp/node_stat.txt
3.7G /tmp/nsx_audit_########-####-####-####-###########.log
4.0K /tmp/pktcap
4.0K /tmp/snap-private-tmp
8.0K /tmp/systemd-private-#############################-ntp.service-FTGyUm
8.0K /tmp/systemd-private-#############################-systemd-logind.service-zB1cQj
8.0K /tmp/systemd-private-#############################-systemd-resolved.service-62GMxn
8.0K /tmp/systemd-private-#############################-systemd-timedated.service-xPWlDo
0 /tmp/tmpwxalkxn##########################################################
4.0K /tmp/upgrade_time_estimation.json
4.0K /tmp/upgrade_troubleshooting8rx288ux
4.0K /tmp/upgradecjl2qk12
4.0K /tmp/vmware-root_1239-4248614932
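
To narrow the du output to the relevant files only, a sketch along the following lines may help. The function name list_audit_temp_files is hypothetical, and the nsx_audit_*.log pattern is inferred from the file name shown in the output above.

```shell
# Hypothetical helper: show the size of audit temp files directly under a directory.
# Assumption: the files follow the nsx_audit_<uuid>.log naming seen in the du output.
list_audit_temp_files() {
    local dir="${1:-/tmp}"
    # -maxdepth 1 keeps the search to the top level of the directory.
    find "$dir" -maxdepth 1 -type f -name 'nsx_audit_*.log' -exec du -h {} +
}

# Usage on the NSX Manager as root:
#   list_audit_temp_files
```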

Environment

VMware NSX

Cause

Repeated POST /api/v1/administration/audit-logs calls created temporary files named nsx_audit_########-####-####-####-###########.log in the /tmp directory on the NSX Manager node. This led to /tmp filling up.

Resolution

This is expected behavior, and the NSX engineering team is looking into ways to improve it so that users don’t experience /tmp filling up.

To resolve the issue and clear the /tmp directory, follow one of the workarounds listed below.

Workaround-1:
Truncate the log file to zero bytes with the command below to reduce /tmp disk utilization. Replace the placeholder with the actual file name before running the command:
truncate -s 0 /tmp/<nsx_audit_########-####-####-####-###########.log>
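
If several such files are present, the truncate step could be wrapped in a loop. The following is a sketch under assumptions rather than a documented procedure: the function name truncate_audit_temp_files is hypothetical, and it assumes that every file matching nsx_audit_*.log in /tmp is safe to empty, which should be confirmed before use (for example, that no audit-log collection is still in progress).

```shell
# Hypothetical helper: truncate every audit temp file in a directory to zero bytes.
# Assumption: all files matching nsx_audit_*.log are leftover temp files.
truncate_audit_temp_files() {
    local dir="${1:-/tmp}"
    local f
    for f in "$dir"/nsx_audit_*.log; do
        # If the glob matched nothing, the literal pattern is not a file; skip it.
        [ -f "$f" ] || continue
        truncate -s 0 "$f"
        echo "truncated: $f"
    done
}

# Usage on the NSX Manager as root:
#   truncate_audit_temp_files
#   df -h /tmp
```

Truncating rather than deleting keeps the file in place, which matches the command given in this workaround.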

Workaround-2:
Reboot the NSX Manager node to clear the /tmp directory and reduce disk utilization.

Additional Information

Note: The POST /api/v1/administration/audit-logs API is executed on an NSX Manager node to display audit logs from all manager nodes in the management plane cluster.
Reference: Collect audit logs from registered manager nodes.

If the steps in this KB do not resolve the issue, open a support case with Broadcom Support and select NSX as the product.

Handling Log Bundles for offline review with Broadcom support.