VMware Cloud Foundation SDDC-Manager nfs-mount 100% Full

Article ID: 318268

Products

VMware Cloud Foundation

Issue/Introduction

This article describes how to resolve an SDDC Manager nfs-mount that is 100% full.


Symptoms:

  • The NFS repository on SDDC Manager is full or nearly full. When you run prechecks, they report that free space is below the required amount, or
  • An LCM bundle repository or datastore disk-usage alert is raised, for example: only 25 GB free out of 502 GB. The alert may prompt a request to expand the NFS share on SDDC Manager.
  • The NFS repository can also fill up because of previously scheduled backups.
  • You can verify this by running du -shc * and then clearing the old backups.
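
The same check can be scripted. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it reports apparent file sizes, which can differ slightly from the block counts that du prints.

```python
import os
import shutil

def usage_percent(path):
    """Return the percentage of the filesystem containing `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total

def dir_sizes(root):
    """Return {entry_name: bytes} for each top-level entry under root, largest first."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # Walk the subtree and add up regular file sizes, skipping symlinks.
            for dirpath, _dirs, files in os.walk(entry.path):
                for name in files:
                    file_path = os.path.join(dirpath, name)
                    if not os.path.islink(file_path):
                        total += os.path.getsize(file_path)
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))

# Example call on SDDC Manager (path taken from the prompt shown in this article):
#   print(usage_percent("/nfs/vmware/vcf/nfs-mount"))
#   print(dir_sizes("/nfs/vmware/vcf/nfs-mount"))
```

Listing the largest entries first makes it easier to confirm whether the backup directory is the one consuming the space.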



Environment

VMware Cloud Foundation 3.9.x

Cause

  • According to the retention policy properties, a cron job should delete NSX backup files older than 2 weeks, and no more than 5 LCM backup files should be kept.
  • This can be checked by looking at the following file:
    • root@sddcmgr-1 [ /nfs/vmware/vcf/nfs-mount/backup/scripts ]# cat nsx_backup_config.properties
      # All properties are mandatory.
      # Property BACKUP.RETENTION.HOURS is used for retention policy. All backup files past
      # 'N' hours will be retained and rest will be deleted in current day.
      # Constraints : Min 2
      BACKUP.RETENTION.HOURS=6

      # Property BACKUP.RETENTION.DAYS is used for retention policy. A single latest backup file
      # for a day for 'N' number of days will be retained and rest will be deleted in that day.
      # Example, if there are hourly backups configured and say there are backups for each hour,
      # files will be evo-nsx-******-23_0-****.backupproperties,
      # evo-nsx-******-22_0-****.backupproperties, evo-nsx-******-21_0-****.backupproperties,....
      # In above file list latest would be at 23:00, so file evo-nsx-******-23_0-****
      # will be retained.
      # Constraints : Min 1
      BACKUP.RETENTION.DAYS=7

      # Property BACKUP.RETENTION.WEEKS is used for retention policy. A single latest backup
      # file for a week for 'N' number of weeks will be retained and rest will be deleted in
      # that week.
      # Example, if there are hourly backups configured and say there are backups for each hour,
      # files on the last day of the week ie on Sunday will be evo-nsx-******-23_0-Sun***.backupproperties,
      # evo-nsx-******-22_0-Sun***.backupproperties, evo-nsx-******-21_0-Sun***.backupproperties,....
      # In above file list latest would be at 23:00, so file evo-nsx-******-23_0-Sun***
      # If there is no file available on Sunday then previous day will be checked until
      # start day of the week.
      BACKUP.RETENTION.WEEKS=2
      
      # Enable log LEVEL
      # Allowed levels INFO and DEBUG
      BACKUP.LOGGER.LEVEL=INFO
  • In this case, the backups are consuming most of the space in the directory.
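
To check the retention settings without reading the file by eye, a small parser is enough. This is a sketch under stated assumptions: parse_properties and check_retention are hypothetical helpers, and the only rules enforced are the ones written in the file's own comments (all properties mandatory, minimum 2 for HOURS, minimum 1 for DAYS, log level INFO or DEBUG).

```python
def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

def check_retention(props):
    """Return a list of problems found against the constraints in the file's comments."""
    problems = []
    # None means the file states no minimum for that property.
    minimums = {
        "BACKUP.RETENTION.HOURS": 2,
        "BACKUP.RETENTION.DAYS": 1,
        "BACKUP.RETENTION.WEEKS": None,
    }
    for key, minimum in minimums.items():
        raw = props.get(key)
        if raw is None or not raw.isdigit():
            problems.append(f"{key} is missing or not an integer")
        elif minimum is not None and int(raw) < minimum:
            problems.append(f"{key}={raw} is below the minimum of {minimum}")
    if props.get("BACKUP.LOGGER.LEVEL") not in ("INFO", "DEBUG"):
        problems.append("BACKUP.LOGGER.LEVEL must be INFO or DEBUG")
    return problems

# Example call, using the file name shown in this article:
#   with open("nsx_backup_config.properties") as fh:
#       print(check_retention(parse_properties(fh.read())))
```

An empty list means the values satisfy the documented constraints, which points the investigation at the cleanup script rather than at the configuration.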

The cron job that cleans up the NSX backup files is failing because of an issue in the script nsxbackupcleaner.py. As a result, old backups accumulate and fill the NFS share, and they must be cleaned up.
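
To make the BACKUP.RETENTION.DAYS rule concrete, the sketch below keeps the single newest backup for each calendar day and marks the rest for deletion. It is an illustration only, not the code in nsxbackupcleaner.py: latest_per_day is a hypothetical name, the input is a list of (name, timestamp) pairs rather than real files, and the 'N' days window is left out for brevity.

```python
from collections import defaultdict
from datetime import datetime

def latest_per_day(files):
    """Split (name, datetime) pairs into (keep, delete) lists of names.

    The newest file of each calendar day is kept; all others are marked
    for deletion.
    """
    by_day = defaultdict(list)
    for name, stamp in files:
        by_day[stamp.date()].append((stamp, name))
    keep, delete = [], []
    for day in sorted(by_day):
        ordered = sorted(by_day[day], reverse=True)  # newest first
        keep.append(ordered[0][1])
        delete.extend(name for _stamp, name in ordered[1:])
    return keep, delete

backups = [
    ("nsx-21", datetime(2020, 1, 1, 21, 0)),
    ("nsx-23", datetime(2020, 1, 1, 23, 0)),
    ("nsx-22", datetime(2020, 1, 1, 22, 0)),
]
keep, delete = latest_per_day(backups)
print(keep)    # ['nsx-23']
print(delete)  # ['nsx-22', 'nsx-21']
```

When the cleanup step fails, nothing lands in the delete list, so every hourly backup stays on the share.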

Resolution

  • Make a copy of the file nsxbackupcleaner.py before making the necessary changes.
  • cat the file nsxbackupcleaner.py and copy the contents into Notepad++ so that you can see the line numbers, then edit the file:
    • root@sddcmgr-1 [ /nfs/vmware/vcf/nfs-mount/backup/scripts ]# vi nsxbackupcleaner.py
  • Three lines need to be changed:
    • Line #142 -->> backup_files_for_day.sort(key=lambda x: os.path.getctime(x), reverse=True)
      • Change to -->>
        • backup_files_for_day = sorted(backup_files_for_day, key=lambda x: os.path.getctime(x), reverse=True)
    • Line #190 -->> backupfilesForDay.sort(key=lambda x: os.path.getctime(x), reverse=True)
      • Change to -->>
        • backupfilesForDay = sorted(backupfilesForDay, key=lambda x: os.path.getctime(x), reverse=True)
    • Line #260 -->> files.sort(key=lambda x: os.path.getctime(x), reverse=True)
      • Change to -->>
        • files = sorted(files, key=lambda x: os.path.getctime(x), reverse=True)
  • Once the file is updated, rerun the cleanup script to remove the old backup files:
    • root@sddcmgr-1 [ /nfs/vmware/vcf/nfs-mount/backup/scripts ]# ./nsx_backup_cleanup.sh
  • Run df -h again to confirm that space was freed.
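
The article does not state why the in-place .sort() calls fail. One possible explanation, offered here only as an assumption, is that under Python 3 the variables hold a non-list iterable (for example a filter object), which has no .sort() method. sorted() accepts any iterable and returns a new list, so the replacement works either way. The sketch below shows the difference; newest_first and the sample file names are hypothetical.

```python
import os

def newest_first(paths):
    """Return a new list of paths ordered by os.path.getctime, newest first.

    sorted() accepts any iterable, so this works whether `paths` is a list,
    a filter object, or a generator. list.sort() exists only on lists.
    """
    return sorted(paths, key=os.path.getctime, reverse=True)

# In Python 3, filter() returns a lazy iterator rather than a list.
names = ["evo-nsx-1.backupproperties", "readme.txt"]
lazy = filter(lambda n: n.endswith(".backupproperties"), names)
print(hasattr(names, "sort"))  # True
print(hasattr(lazy, "sort"))   # False
```

Passing key=os.path.getctime directly is equivalent to the lambda used in the script.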



Additional Information

Impact/Risks:

  • Before making any change, take a full backup of the system you are working on in case of failure.
  • Prechecks may report errors.
  • Data may not be displayed, or SDDC Manager may not respond.



Attachments

nsxbackupcleaner