Automated space cleanup utility (space_utility) consumes high memory on vCenter Server 8.0 U3h

Article ID: 422390

Products

VMware vCenter Server

Issue/Introduction

  • Multiple services on vCenter Server may hit out-of-memory errors and stop responding. Service-specific logs (for example, the Topology service log) show out-of-memory errors such as the following:

    /var/log/vmware/topologysvc/topology-svcs.log

    java.lang.OutOfMemoryError: GC overhead limit exceeded

    Note: This is only a sample service log file; the issue can affect other services as well.
  • The process "/usr/lib/applmgmt/support/space_utility" shows high memory usage when checked with the "top" command:

    PID   USER      PR  NI    VIRT    RES    %CPU  %MEM     TIME+ S COMMAND
    #### root       20  0    307.7m  293.6m  0.0   1.8    0:04.29 S python3 /usr/lib/applmgmt/support/space_utility --report-file-path /var/log/vmware/ --report-file-nam+
    #### root       20  0    348.0m  276.3m  0.0   1.7    0:04.52 S python3 /usr/lib/applmgmt/support/space_utility --report-file-path /var/log/vmware/ --report-file-nam+
    #### root       20  0    307.8m  264.6m  0.0   1.7    0:04.19 S python3 /usr/lib/applmgmt/support/space_utility --report-file-path /var/log/vmware/ --report-file-nam+
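
The combined footprint of the space_utility processes can also be totalled from the command line. The following is a minimal sketch, not part of the vendor guidance: the function name sum_space_utility_rss is hypothetical, and it assumes "ps" is used instead of "top" because its output is easier to script.

```shell
# Sum the resident memory (RSS, in KiB) of every process whose command line
# mentions the cleanup utility. Expects "rss command..." lines on stdin.
# sum_space_utility_rss is a hypothetical helper name used for illustration.
sum_space_utility_rss() {
    # The [s] bracket keeps this awk process, whose own command line
    # contains the pattern text, from matching itself in live ps output.
    awk '/[s]pace_utility/ { total += $1; count++ }
         END { printf "%d processes, %d KiB resident\n", count, total }'
}

# Usage on the appliance:
#   ps -eo rss=,args= | sum_space_utility_rss
```

A total that keeps growing from one day to the next would be consistent with hung processes accumulating after each midnight run.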

Environment

  • vCenter Server 8.0 U3h
  • vCenter 9.0.1

Cause

  • vCenter Server 8.0 Update 3h includes an automated space cleanup utility that is scheduled to run every day at midnight to clean up old logs.
  • In rare cases, a race condition causes this scheduled cleanup task to conflict with a file-based backup and restore (FBBR) job that is configured with an NFS mount point. The space_utility process then traverses the mounted NFS share and uses more memory than usual. If FBBR unmounts the NFS share while space_utility is still traversing it, the process may hang because of known NFS limitations, which results in a memory leak. If this happens repeatedly, the accumulated leaked memory may cause performance degradation, out-of-memory service errors, or both.
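
To see whether an NFS share is mounted at a given moment (for example, while a backup is running), the kernel mount table can be inspected. This is a minimal sketch under stated assumptions: list_nfs_mounts is a hypothetical helper name, and the article does not say where FBBR mounts its share, so the function lists every NFS mount rather than a specific path.

```shell
# Print the mount point of every NFS or NFSv4 file system.
# Reads a mount table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# list_nfs_mounts is a hypothetical helper name used for illustration.
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Usage on the appliance:
#   list_nfs_mounts
```

No output means no NFS share is currently mounted, so a cleanup run at that moment would have no share to traverse.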

Resolution

This is a known issue in vCenter Server 8.0 U3h and 9.0.1. Broadcom Engineering is working on a fix for a future patch release.

Workaround

Terminate the running space_utility processes and disable the automated execution of the utility by following the steps below:

  1. Log in to vCenter Server via SSH.
  2. If the session starts in the Appliance shell, switch to the bash shell using the 'shell' command.

    Connected to service

        * List APIs: "help api list"
        * List Plugins: "help pi list"
        * Launch BASH: "shell"

    Command> shell
    Shell access is granted to root
    root@<hostname> [ ~ ]#

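  3. Identify the process IDs (PIDs) of the running space_utility processes. The brackets in the pattern keep the grep command itself out of the results.

    ps aux | grep '[s]pace_utility'

  4. Terminate the processes whose PIDs were noted in Step 3.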

    kill -9 <pid or space separated pids>

  5. Disable the daily execution of the script by deleting the cron job.

    rm -f /etc/cron.d/space_util.cron

  6. (Optional) Restart the services if other services are impacted by the high memory usage.

    service-control --stop --all
    service-control --start --all

    Note: This step is optional. Services need to be restarted only if vCenter Server functionality is impacted by the high memory usage of the "space_utility" process.
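
Steps 3 to 5 can be previewed before anything is changed. The following is a minimal sketch, not an official script: plan_space_utility_cleanup is a hypothetical helper name, and it only prints the commands from Steps 4 and 5 so they can be reviewed and then run by hand.

```shell
# Build the commands for Steps 4 and 5 from `ps aux` output on stdin.
# Nothing is executed; the commands are only printed for review.
# plan_space_utility_cleanup is a hypothetical helper name.
plan_space_utility_cleanup() {
    # Column 2 of `ps aux` is the PID. The [s] bracket keeps this awk
    # process from matching its own command line.
    pids=$(awk '/[s]pace_utility/ { printf "%s ", $2 }')
    if [ -n "$pids" ]; then
        # Strip the trailing space left by the awk printf.
        echo "kill -9 ${pids% }"
    fi
    echo "rm -f /etc/cron.d/space_util.cron"
}

# Usage on the appliance:
#   ps aux | plan_space_utility_cleanup
```

If no space_utility process is running, only the rm command is printed, which matches the case where the cron job still needs to be disabled but nothing has to be terminated.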