root@<VIDM-FQDN> [ ~ ]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.9G 0 7.9G 0% /dev
tmpfs 7.9G 12K 7.9G 1% /dev/shm
tmpfs 7.9G 808K 7.9G 1% /run
tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup
/dev/sda4 17G 16G 0 100% /
tmpfs 7.9G 184K 7.9G 1% /tmp
/dev/sda2 119M 26M 87M 23% /boot
/dev/mapper/db_vg-db 20G 502M 19G 3% /db
/dev/mapper/tomcat_vg-horizon 20G 2.8G 16G 15% /opt/vmware/horizon
tmpfs 1.6G 0 1.6G 0% /run/user/1001
tmpfs 1.6G 0 1.6G 0% /run/user/0
Directory sync (Identity & Access Management > Directories > Sync now) is failing with the following error: "Failed to save config to disk". This applies to VMware Identity Manager 3.3.x.
A large increase in disk usage is usually seen under /var/log, in particular in the /var/log/messages* files.
The vIDM appliance root (/) file system is full, for example usage exceeds 90%.
You will see errors such as the following in the df output:
/dev/sda4 or /dev/sda2 mounted on / at 100% usage
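As a quick check, the root usage can be read directly from df (a minimal sketch; the 90% threshold below is illustrative, matching the symptom description above):

```shell
# Read the usage percentage of the root (/) filesystem from df
usage=$(df --output=pcent / | tail -n 1 | tr -dc '0-9')
echo "root usage: ${usage}%"
# Warn when usage crosses the level described above (threshold is illustrative)
if [ "$usage" -ge 90 ]; then
  echo "WARNING: root filesystem nearly full"
fi
```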
Before proceeding any further, take a snapshot of all appliances in the vIDM cluster in vCenter (non-memory, quiesced).
Note: If extending the disk space per Option 3, delete the snapshots and clone the appliances instead.
Option 1: Rotated log files still opened by rsyslogd
Rotated log files are renamed and unlinked from the filesystem, but are sometimes still held open by rsyslogd, so their disk blocks are not released.
This can be observed with the command:
lsof +L1 | grep delete | grep "rsyslogd"
E.g.:
rsyslogd 512314 root 15w REG 8,4 97697061 0 196612 /var/log/messages-1770865801 (deleted)
rsyslogd 512314 root 97w REG 8,4 97697061 0 196612 /var/log/messages-1770865801 (deleted)
These files can sometimes take up enough space to trigger 100% usage of the / partition.
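This behaviour can be reproduced on any Linux host with a scratch file (a minimal sketch, not specific to vIDM): a file that is unlinked while a descriptor is still open keeps consuming disk blocks until that descriptor is closed, which is exactly why restarting rsyslog frees the space.

```shell
tmp=$(mktemp)
exec 3>"$tmp"                      # hold the file open, as rsyslogd holds its logs
dd if=/dev/zero of="$tmp" bs=1M count=5 2>/dev/null
rm "$tmp"                          # unlink: the name is gone, the blocks are not
fdinfo=$(ls -l "/proc/$$/fd/3")    # the descriptor now points at "... (deleted)"
echo "$fdinfo"
exec 3>&-                          # closing the descriptor finally frees the space
```

Closing the descriptor (here via `exec 3>&-`, on the appliance via the rsyslog restart) is what actually returns the blocks to the filesystem.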
Monitor the usage of the root partition with:
df -B M | grep -iE "filesystem|sda4"
To immediately clear up space, restart rsyslog with:
systemctl restart rsyslog
Often this will close the stale file descriptors and release the disk blocks, clearing up a significant amount of disk space.
Verify the freed space with:
df -B M | grep -iE "filesystem|sda4"
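To quantify how much space the restart recovered, the available space can be captured before and after and the difference printed (a sketch; the restart itself is shown as a comment so the snippet is safe to run anywhere):

```shell
# Available MB on the root (/) filesystem
avail_mb() { df -BM --output=avail / | tail -n 1 | tr -dc '0-9'; }
before=$(avail_mb)
# On the appliance you would run here: systemctl restart rsyslog
after=$(avail_mb)
echo "freed: $((after - before)) MB"
```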
If this hasn't cleared enough disk space, proceed to one of the other options listed below.
Option 2: Clearing journal files
cd /var/log/audit
ls -lh
truncate -s 0 audit.log
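Note that truncating with `truncate -s 0` frees the space immediately even if a process still has audit.log open, whereas deleting the file would not (see Option 1). A minimal demonstration on a scratch file:

```shell
log=$(mktemp)
dd if=/dev/zero of="$log" bs=1M count=5 2>/dev/null   # simulate a grown log file
ls -lh "$log"                     # ~5 MB before truncation
truncate -s 0 "$log"              # empty the file in place, keeping the same inode
ls -lh "$log"                     # 0 bytes; the blocks are released immediately
```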
Change the rotation interval in /etc/logrotate.conf from weekly to daily, then run the command: logrotate /etc/logrotate.conf
Check the hzniptables file, which is present under /etc/cron.d. It can carry the following entry to clear /var/log/messages every minute:
*/1 * * * * cat /dev/null >/var/log/messages
Add the following properties to /usr/local/horizon/conf/runtime-config.properties on each node so that old analytics data is deleted:
analytics.deleteOldData=true
analytics.maxQueryDays=90
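Setting these idempotently (so re-running does not duplicate lines) can be sketched as follows; the `set_prop` helper and the temp file are illustrative, and on the appliance the target would be the runtime-config.properties path above:

```shell
# Idempotently set key=value lines in a properties file.
set_prop() {
  file=$1; key=$2; value=$3
  if grep -q "^${key}=" "$file"; then
    # Key already present: replace its value in place
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    # Key missing: append it
    echo "${key}=${value}" >> "$file"
  fi
}
conf=$(mktemp)   # stand-in for runtime-config.properties
set_prop "$conf" analytics.deleteOldData true
set_prop "$conf" analytics.maxQueryDays 90
cat "$conf"
```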
After confirming that /usr/local/horizon/scripts/enableRSyslog.hzn status shows no syslog server configured, edit /etc/rsyslog.conf with vi and remove all the input methods (copy a backup before editing). Then verify rsyslog:
ps aux | grep rsyslog
systemctl status rsyslog
systemctl restart rsyslog
Option 3: Resize / partition of the appliances
Resize the /dev/sda4 disk to 20 GB on all appliances in the cluster by following the steps detailed in this KB: How to Increase vIDM appliance disk space (broadcom.com)
Option 4: Investigate disk usage
cat /etc/logrotate.conf
cat /etc/cron.d/hzniptables
Note: Check the result of both commands again later, to compare and determine which directory has consumed the most space.
To list the 20 largest entries on the root filesystem, excluding /opt/vmware/horizon and /db:
du -ah / --exclude=/opt/vmware/horizon --exclude=/db 2>/dev/null | sort -rh | head -n 20
/var folder:
du -ah /var 2>/dev/null | sort -rh | head -n 20
Note: Observe whether the Partition Utilization metric dips or keeps incrementing; check the output of the above commands again over the next couple of days.
Check /etc/systemd/journald.conf for the SystemMaxUse property (100M by default).
Check du -sh /var/cache/ and confirm it is in the normal MB range.
Run lsof | grep '(deleted)' to find deleted files that are still held open.
View /etc/rsyslog.conf and remove the input methods. Make this change on a single node first to monitor whether it resolves the issue, and apply the same change to the other nodes after a few days.
Retrieve the database password with cat /usr/local/horizon/conf/db.pwd, then connect and export the CacheEntry table:
psql -U postgres saas
copy (select * from "CacheEntry") to '/tmp/CacheEntry.csv' with csv;
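Since the point of the export is to see whether the CacheEntry table keeps growing, two exports taken days apart can be compared by row count (the demo files below stand in for the real CSVs):

```shell
# Demo files stand in for CacheEntry CSVs exported from psql on different days
day1=$(mktemp); day2=$(mktemp)
printf 'row\n' > "$day1"
printf 'row\nrow\nrow\n' > "$day2"
old=$(wc -l < "$day1"); new=$(wc -l < "$day2")
echo "CacheEntry grew by $((new - old)) rows"   # prints "CacheEntry grew by 2 rows"
```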
Note: Run the same command again after 2-3 days and compare the results.
To list the 100 largest entries on the filesystem:
du -ah / 2>/dev/null | sort -rh | head -n 100
find ./ -type f -size +100M | less
find ./ -type d -exec sh -c 'echo -n "{}: " && find "{}" -type f | wc -l' \; | awk '$2 > 100' | sort -k2,2nr |less
If the required space is still not being reclaimed, follow the KB below to check the GC logs: Uncompressed gc.logs are causing the root (/) partition to run out of space in VMware Identity Manager.
Note 1: The nodes can be under higher load due to misconfiguration of the connectors; distribute the load across all the nodes and check for improvement in disk utilization.
Note 2: Reboot the nodes so that the JVM and other processes release their temporary files.
Note 3: Validate the KB (HW-134096 - VMware Identity Manager Connector may fail to communicate due to config-state.json corruption (broadcom.com)) to restore the configuration files if they are corrupt or empty.