File '/storage/log/vmware/vmware-updatemgr/updatemgr-vmon.log.stderr' is very large causing vCenter services to not start

Article ID: 381212

Products

VMware vCenter Server

Issue/Introduction

  • Several services, including vmware-vpxd, are in a stopped state because /storage/log is more than 95% full. Service status can be checked using:
    • service-control --status --all
  • The vCenter user interface shows "no healthy upstream".
  • VAMI shows a critical storage error such as "File system /storage/log has run out of storage space. Increase the size of disk /storage/log."
  • The /storage/log partition is reported as full or nearly full in the output of:
    • df -h
  • Entries similar to the following may appear in the file /storage/log/vmware/vmware-updatemgr/updatemgr-vmon.log.stderr:
    • Starting service process with pid: ####.
      INFO:ComponentScanner:Effective components: ['Intel-NVMe-Vol-Mgmt-Dev-Plugin_2.7.2173-2vmw.803.0.0.24022510', 'VMware-SDHCI-Driver_1.0.3-3vmw.803.0.0.24022510', 'Broadcom-bcm-mpi3_8.8.1.0.0.0-1vmw.803.0.0.24022510', 'Pensando-ionic-en_20.0.0-56vmw.803.0.0.24022510', 'ESXi_8.0.3-0.0.24022510', 'Mellanox-nmlx5_4.23.6.2-7vmw.803.0.0.24022510', 'Micron-mtip32xx-native_3.9.8-1vmw.803.0.0.24022510', 'MRVL-E3-Ethernet-iSCSI-FCoE_3.0.230.0-1OEM.700.1.0.15843807', 'Broadcom-lsi-mr3_7.728.02.00-1vmw.803.0.0.24022510', 'VMware-NVMeoF-RDMA_1.0.3.9-1vmw.803.0.0.24022510', 'VMware-Host-Client_2.18.0-23593406', 'VMware-iser_1.1.0.2-1vmw.803.0.0.24022510', 'Intel-icen_1.12.5.0-1OEM.800.1.0.20613240', 'VMware-nvme-pcie-plugin_1.0.0-1vmw.803.0.0.24022510', 'VMware-vmkusb_0.1-22vmw.803.0.0.24022510', 'dell-osname-idrac-component_8.0.0-A02', 'dell-fac-dcui-component_8.0.0-A05', 'Broadcom-lsi-msgpt2_20.00.06.00-4vmw.803.0.0.24022510', 'VMware-oem-dell-plugin_1.1.0-2vmw.803.0.0.24022510', 'Intel-irdman_1.4.4.0-1OEM.800.1.0.20143090', 'VMware-dwi2c_0.1-7vmw.803.0.0.24022510', 'VMware-cndi-drivers_1.2.10.0-1vmw.803.0.0.24022510', 'Broadcom-lsi-msgpt3_17.00.13.00-3vmw.803.0.0.24022510', 'VMware-nvmxnet3_2.0.0.31-12vmw
      ...
      'VMware-dwi2c_0.1-7vmw.803.0.0.24022510', 'esxio-update_8.0.3-0.0.24022510']
      INFO:vmware.esximage.ImageManager.SoftwareSpecMgr:Image validation result: {'info': [{'id': 'com.vmware.vcIntegrity.lifecycle.Validate.Success', 'message': {'id': 'com.vmware.vcIntegrity.lifecycle.Validate.Success', 'default_message': 'The image is valid.', 'args': []}, 'resolution': None, 'time': 'YYYY:DD:MM:SSZ'}], 'warnings': [], 'errors': []}
      DEBUG:ImageServiceLogger:Heart beat at YYYY:DD:MM:SS for command: --taskid ############################# --threadid 703139 software --validate Log Count 3 MP count 80139 Gap 0:00:10.847077
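
The partition check in the symptoms above can be sketched as a small script. This is only an illustration: the function names parse_df_pct and is_above_threshold and the LOG_MOUNT variable are hypothetical, and the 95 figure is simply the threshold quoted in the first symptom.

```shell
#!/bin/sh
# Extract the "Capacity" column (digits only) from `df -P` output on stdin.
# `df -P` prints a header line followed by one line per filesystem.
parse_df_pct() {
    awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the filesystem holding path $1 is more than $2 percent full.
# (Hypothetical helper; fails quietly if the path does not exist.)
is_above_threshold() {
    pct=$(df -P "$1" | parse_df_pct)
    [ -n "$pct" ] && [ "$pct" -gt "$2" ]
}

# LOG_MOUNT is an assumed override; it defaults to the partition named above.
if is_above_threshold "${LOG_MOUNT:-/storage/log}" 95 2>/dev/null; then
    echo "log partition is more than 95% full"
fi
```

On a system where /storage/log does not exist, the script prints nothing.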

Environment

vCenter Server 8.0 Update 3

Cause

The outage occurs when the /storage/log partition reaches 100% capacity due to the unchecked growth of a specific Update Manager log file (updatemgr-vmon.log.stderr), preventing necessary vCenter services from initializing.
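
To confirm that this file is what fills the partition, the largest files under a directory can be listed. The sketch below is one possible way to do that; the function name largest_files is hypothetical, and it reports apparent file size in bytes rather than allocated disk blocks.

```shell
#!/bin/sh
# Print the N largest regular files under a directory as "<bytes><TAB><path>",
# largest first. (Hypothetical helper, not part of any VMware tooling.)
largest_files() {
    dir="$1"
    count="$2"
    find "$dir" -type f -exec sh -c '
        for f; do
            # wc -c reports the byte count of each file
            printf "%s\t%s\n" "$(wc -c < "$f" | tr -d " ")" "$f"
        done' sh {} + | sort -rn | head -n "$count"
}

# Example (run on the appliance): largest_files /storage/log/vmware 10
```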

Resolution

To work around this issue, edit the following file: /etc/vmware/vmware-vmon/svcCfgfiles/updatemgr.json

Please follow the steps below:

  • Take a snapshot of the vCenter VM.
  • Open an SSH session to the vCenter Server as root.
  • Go to the location:
    • cd /etc/vmware/vmware-vmon/svcCfgfiles/
  • Remove the following line from updatemgr.json using the vi editor:
    • "StreamRedirectFile" : "%VMWARE_LOG_DIR%/vmware/vmware-updatemgr/updatemgr-vmon.log",
  • Restart the update manager service:
    • service-control --restart vmware-updatemgr
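
The file edit in the steps above could also be scripted instead of done interactively in vi. The following is a minimal sketch under stated assumptions: the function name remove_stream_redirect and the backup location are hypothetical, and it assumes the "StreamRedirectFile" entry sits on a single line, as shown in the step above.

```shell
#!/bin/sh
# Copy the config file to a backup path, then rewrite the original without
# any line that defines the "StreamRedirectFile" key.
# $1 = path to updatemgr.json, $2 = backup path (hypothetical, caller's choice)
remove_stream_redirect() {
    cfg="$1"
    backup="$2"
    cp -p "$cfg" "$backup" || return 1
    # Writing back into the existing file keeps its owner and permissions.
    sed '/"StreamRedirectFile"[[:space:]]*:/d' "$backup" > "$cfg"
}

# Example (run as root on the appliance, after taking the snapshot):
# remove_stream_redirect /etc/vmware/vmware-vmon/svcCfgfiles/updatemgr.json /root/updatemgr.json.bak
```

The backup is written outside the service configuration directory so that no extra file is left next to updatemgr.json; restoring it is a plain copy back to the original path.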

Additional Information

  • In some cases, the "vmware-stsd" service may be down, which causes the updatemgr service to fail on restart.
  • The output below shows that "sts" is one of the services that vmware-updatemgr depends on:
    • service-control --list-dependencies vmware-updatemgr
      lookupsvc
      sts
      vpxd
      vpxd-svcs
  • To resolve this, start the service:
    • service-control --start vmware-stsd
  • Once the service has started successfully, restart the update manager service:
    • service-control --restart vmware-updatemgr
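
The two commands above can be chained so that the restart only runs when the dependency started. This is a sketch only: the function name start_then_restart and the SVC_CTL override variable are hypothetical, and it relies on service-control returning a non-zero exit status when the start fails, which is an assumption.

```shell
#!/bin/sh
# SVC_CTL is an assumed override for the CLI; it defaults to service-control.
SVC_CTL="${SVC_CTL:-service-control}"

# Start dependency $1, then restart service $2 only if the start succeeded.
start_then_restart() {
    dep="$1"
    svc="$2"
    if "$SVC_CTL" --start "$dep"; then
        "$SVC_CTL" --restart "$svc"
    else
        echo "could not start $dep; not restarting $svc" >&2
        return 1
    fi
}

# Example (run as root on the appliance):
# start_then_restart vmware-stsd vmware-updatemgr
```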