Service NodeManager is running but not healthy
Service FlinkContainer is not running
Running df -h shows that the /var/log partition is more than 90% full.
VMware Aria Operations for Networks 6.11.0
The log rotation mechanism fails to rotate the logs for nginx, syslog, warn, and others.
After a log is rotated, the mechanism triggers a reload of the respective service so that the service reopens its log file and continues writing to the correct file. This reload is broken, so the log files grow very large and push /var/log above 90% usage, which causes the health issues for the NodeManager and FlinkContainer services.
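To see which files account for the growth described above, a sketch along the following lines may help. It is illustrative only: largest_files is a hypothetical helper name, the sketch assumes GNU find (for -printf), and reading everything under /var/log may require root.

```shell
# List the N largest regular files under a directory, biggest first.
# Output format: "<size-in-bytes> <path>", one file per line.
# Assumes GNU find (for -printf); largest_files is a hypothetical helper name.
largest_files() (
    dir=$1
    count=${2:-5}
    find "$dir" -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -n "$count"
)

# Example: largest_files /var/log 5
```

On an affected node, files such as warn, syslog, or auth.log would be expected near the top of the output.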
This issue is fixed in release 6.12.1 (or later), which is available for download.
If you cannot upgrade at this time, a patch (6.11.0.P2.1699450092.patch.bundle) is available for Aria Operations for Networks 6.11 that fixes the log rotation issue.
Before upgrading or applying the patch, you must resolve the disk space issue as described in the Workaround below:
Workaround
The following workaround must be executed on all affected Aria Operations for Networks nodes (Platforms and Collectors) before applying the patch or upgrading to version 6.12.
1. Open a PuTTY/SSH session to each affected Aria Operations for Networks node (Platforms and Collectors).
2. Log in with the username support.
3. Run the following command to switch to the ubuntu user:
ub
4. Run the following commands on each affected node.
cd /var/log
To check the size of files such as warn, syslog.1, and auth.log.1:
ls -lrth
5. Look at the last 3 to 4 files in the listing (the most recently modified), for example:
-rw-r----- 1 syslog adm 2.5G Jul 19 16:19 warn
-rw-r----- 1 syslog adm 4.6G Jul 19 16:19 syslog
-rw-r----- 1 syslog adm 3.9G Jul 19 16:19 auth.log
6. Run the following commands to clear the logs manually (each command truncates the file to zero bytes):
sudo dd if=/dev/null of=/var/log/secure
sudo dd if=/dev/null of=/var/log/warn
sudo dd if=/dev/null of=/var/log/syslog.1
sudo dd if=/dev/null of=/var/log/auth.log.1
7. Clean up the warn, syslog.1, and auth.log.1 files on the affected nodes by running the following commands:
sudo rm -rf warn
sudo rm -rf syslog.1
sudo rm -rf auth.log.1
8. Delete access.log and access.log.1 from /var/log/nginx on all the nodes by running the following commands:
sudo su
cd nginx/
ls -lrth
sudo rm -rf access.log
sudo rm -rf access.log.1
9. Restart the syslog and nginx services on all the nodes by running the following commands:
systemctl restart syslog.service
systemctl restart nginx.service
10. After completing the steps above, run the following command to check the usage of /var/log:
df -h
In a cluster setup, run the following command:
./run_all.sh df -h
The usage of /var/log should now be below 60%.
11. Once the usage of /var/log is below 60%, you must either upgrade to version 6.12.1 or later (recommended) or apply the P2 patch to version 6.11 as per the instructions below.
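The 60% check from the workaround can also be scripted. The following is a minimal sketch, not part of the product: parse_pct, pct_below, and check_usage are hypothetical helper names, and the sketch assumes GNU df (for --output).

```shell
# Extract the numeric percentage from `df --output=pcent` output read on stdin.
parse_pct() {
    tail -n 1 | tr -dc '0-9'
}

# Exit 0 only when the first argument (a percentage) is below the second (the limit).
pct_below() {
    [ -n "$1" ] && [ "$1" -lt "$2" ]
}

# Report whether the filesystem holding the given path is below the limit.
# Assumes GNU df (for --output); all three function names are hypothetical.
check_usage() (
    pct=$(df --output=pcent "$1" | parse_pct)
    if pct_below "$pct" "$2"; then
        echo "$1 is at ${pct}%, below ${2}%"
    else
        echo "$1 is at ${pct}%, not below ${2}%"
        exit 1
    fi
)

# Example: check_usage /var/log 60
```

check_usage returns a non-zero exit status while the partition is still at or above the limit, so it can gate the upgrade or patch step in a script.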
Applying the Aria Operations 6.11 P2 Patch
The 6.11 P2 patch can now be applied to Aria Operations for Networks 6.11 using the steps below. If you are upgrading Aria Operations for Networks to version 6.12.0 or 6.12.1, you can skip these instructions.
Download the Aria Operations for Networks Version 6.11 P2 patch from the attachment section of this Knowledge Base Article.
File Name: VMware-AriaOpNetworks.6.11.0.P2.1699450092.patch.bundle
File Size: 668.2 MiB
Checksum Values:
MD5SUM: CBE80EB278AD7A351ED84D1B0B7EC933
SHA1SUM: 1AD8B3352E1F6606B7E5EF7435E16A2DE23197C8
SHA-256: 9444734D2CB24AA5D78E107A676F30D81FFFA533CCEDA9759D817BB29F146781
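Before uploading the bundle, the download can be compared against the SHA-256 value listed above. The following is a minimal sketch, assuming the sha256sum utility from GNU coreutils is available on the machine holding the file; verify_sha256 is a hypothetical helper name.

```shell
# Compare a file's SHA-256 digest with an expected value, ignoring letter case.
# Exits 0 on a match, non-zero on a mismatch or if the file cannot be read.
# verify_sha256 is a hypothetical helper name; sha256sum comes from GNU coreutils.
verify_sha256() (
    file=$1
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
)

# Example, using the file name and SHA-256 value listed above:
# verify_sha256 VMware-AriaOpNetworks.6.11.0.P2.1699450092.patch.bundle \
#     9444734D2CB24AA5D78E107A676F30D81FFFA533CCEDA9759D817BB29F146781 \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```

The expected value is lowercased before the comparison because sha256sum prints lowercase hex while the article lists the checksum in uppercase.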
Important Note:
If the 6.11.0-P1 patch is already applied, the steps below are not needed. If P2 is the first patch applied on top of the 6.11 GA version, run the following commands before applying the P2 patch:
ub
sudo su
mkdir -p /usr/local/lib/python3.6/dist-packages/cli
ln -sf /usr/local/lib/python3.8/dist-packages/cli/tool_manager_runner.py /usr/local/lib/python3.6/dist-packages/cli/tool_manager_runner.py
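To confirm that the symbolic link from the commands above is in place, a check like the following sketch could be run afterwards. link_points_to is a hypothetical helper name; the paths in the example comment are the ones used in the commands above.

```shell
# Exit 0 only if the first argument is a symbolic link whose stored target
# equals the second argument. link_points_to is a hypothetical helper name.
link_points_to() (
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
)

# Example, using the paths from the commands above:
# link_points_to /usr/local/lib/python3.6/dist-packages/cli/tool_manager_runner.py \
#     /usr/local/lib/python3.8/dist-packages/cli/tool_manager_runner.py \
#     && echo "link OK" || echo "link missing or wrong"
```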
3. Upload the patch bundle from the Aria Operations for Networks GUI.
Procedure to apply patch bundle via Aria Operations for Networks GUI: