Disk write of NSX Edge VMs periodically spikes on the hour
Article ID: 322871
Products
VMware NSX
Issue/Introduction
Symptoms:
The vCenter performance chart for Edge VMs shows disk writes spiking periodically, on the hour.
Other VMs might suffer degraded storage performance if many Edge VMs reside on the same physical storage.
Many 8MB files are generated on the hour in /var/log/journal/<machine-id>.
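The last symptom can be checked with a small script. The following is a minimal sketch, not a supported tool: it assumes Python 3 is available wherever the journal directory can be read, that journal files end in .journal, and that a file's modification time is close to the time it was rotated. The function name journal_files_per_hour is made up for this illustration.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path

EIGHT_MIB = 8 * 1024 * 1024  # assumption: the "8MB" in the symptom means 8 MiB


def journal_files_per_hour(journal_dir, min_size=EIGHT_MIB):
    """Count journal files of at least min_size bytes, grouped by mtime hour.

    journal_dir is the /var/log/journal/<machine-id> directory (or a copy).
    Returns a Counter keyed by "YYYY-MM-DD HH:00" in local time.
    """
    counts = Counter()
    for path in Path(journal_dir).glob("*.journal"):
        st = path.stat()
        if st.st_size < min_size:
            continue  # skip files smaller than the size named in the symptom
        hour = datetime.fromtimestamp(st.st_mtime).strftime("%Y-%m-%d %H:00")
        counts[hour] += 1
    return counts
```

Printing sorted(journal_files_per_hour(path).items()) gives one row per hour. On an affected Edge, a large and roughly constant count in every hour would match the symptom described above.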
Environment
VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
Cause
Edge appliances run an integrity checker on the hour. It executes find / -print0 | xargs -0 to check the integrity of many files on the appliance. Since 3.1.0, auditd logs execve system calls, and these logs are stored in the journal.
The integrity checker passes a very large number of arguments through xargs, and journald treats every argument in an execve log entry as a field name. The field hash table of a journal file therefore grows rapidly past its threshold, and the file is rotated immediately. Each journal file is at least 8MB. Because the files rotate so quickly, many 8MB journal files are generated in quick succession, which triggers a large amount of disk write on the hour.
Manager VMs are not affected because auditd on Manager VMs does not log execve system calls.
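To see why a single execve record can carry so many arguments, the sketch below models how xargs packs its input into command lines. It is a simplified illustration, not the xargs implementation: the 128 KiB size limit is an assumed default that varies by system, and the function name batch_arguments and the example paths are hypothetical.

```python
def batch_arguments(paths, max_bytes=128 * 1024):
    """Greedily pack paths into batches that fit within max_bytes.

    Each batch stands in for the argument list of one execve call made by
    xargs. max_bytes is an assumed command-line size limit, not a value
    taken from the Edge appliance.
    """
    batches, current, used = [], [], 0
    for p in paths:
        cost = len(p.encode()) + 1  # argument bytes plus a NUL terminator
        if current and used + cost > max_bytes:
            batches.append(current)  # this batch is full, start a new one
            current, used = [], 0
        current.append(p)
        used += cost
    if current:
        batches.append(current)
    return batches


# Hypothetical input: 10,000 paths of 23 characters each.
example = [f"/opt/example/file{i:06d}" for i in range(10000)]
print([len(b) for b in batch_arguments(example)])  # prints [5461, 4539]
```

Under these assumptions, each execve call receives thousands of arguments, and each argument in the resulting audit record becomes a separate field from journald's point of view.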
Resolution
This is a known issue affecting NSX-T 3.1.0 - 3.1.2.1.
One of them eliminates the periodic storage spike.
Additional Information
Impact/Risks: Edge VMs trigger a large amount of disk write on the hour, all at the same time. This might degrade datastore performance if many Edge VMs reside on the same physical storage, and other VMs on that storage might suffer from the degraded performance.
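A rough lower bound on the hourly write burst follows from the 8MB minimum journal file size. The sketch below is only an estimate under stated assumptions: 8MB is taken as 8 MiB, the function name min_hourly_write_bytes is made up, and the file and Edge counts in the example are hypothetical rather than measured.

```python
JOURNAL_FILE_MIN_BYTES = 8 * 1024 * 1024  # assumption: "8MB" means 8 MiB


def min_hourly_write_bytes(files_per_edge, edge_count):
    """Lower bound on bytes written per hourly burst across all Edge VMs.

    files_per_edge: journal files created per Edge VM in one burst.
    edge_count: Edge VMs sharing the same physical storage.
    """
    return files_per_edge * edge_count * JOURNAL_FILE_MIN_BYTES


# Hypothetical numbers: 50 files per Edge, 20 Edge VMs on one datastore.
total = min_hourly_write_bytes(50, 20)
print(total / 1024**3)  # prints 7.8125 (GiB)
```

Because all Edge VMs run the checker at the same time, this volume lands on the shared storage within the same short window, which is why the impact grows with the number of Edge VMs.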