vSAN service logs on vCenter grow rapidly to large size and do not rotate as expected.

Article ID: 326831

Updated On:

Products

VMware vSAN

Issue/Introduction

This article documents a known issue, provides a workaround, and identifies the vCenter Server versions in which the issue is resolved.

Symptoms:
The following vSAN service logs grow rapidly and fill the /storage/log directory on vCenter. As a result, vCenter and its services crash, and the services cannot be started or restarted.
Log files: vmware-vsan-health-runtime.log, vmware-vsan-health-service.log, vmware-vsan-health-summary-result.log, and vsanvcmgmtd.log

No recent compressed log files in .gz or .tgz format have been created.
One or more log files are extremely large, for example 10 GB or more.

Example (your file will have a different size and date stamp, and possibly a different vSAN log file name):
In /storage/log/vmware/vsan-health/ directory:
14663550211 Dec 14 19:19 vmware-vsan-health-service.log
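
To check for these symptoms from the vCenter shell, something along the following lines may help. This is a minimal sketch: the function names list_large_logs and list_recent_archives, the 1 GiB size threshold, and the 7-day window are illustrative assumptions, not values defined by this article.

```shell
# Illustrative helpers; the names, size threshold, and time window are assumptions.

# List regular files under a directory that exceed a given size.
# $1 = directory, $2 = size in find(1) syntax (default +1G, more than 1 GiB)
list_large_logs() {
    find "$1" -type f -size "${2:-+1G}" -exec ls -l {} +
}

# List compressed log archives (.gz or .tgz) modified within the last N days.
# $1 = directory, $2 = number of days (default 7)
# Empty output suggests that log rotation has stopped.
list_recent_archives() {
    find "$1" -type f \( -name '*.gz' -o -name '*.tgz' \) -mtime "-${2:-7}"
}

# Example usage on the appliance:
#   df -h /storage/log
#   list_large_logs /storage/log/vmware/vsan-health
#   list_recent_archives /storage/log/vmware/vsan-health
```

A very large file from the first function, combined with empty output from the second, is consistent with the symptoms described above.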

  • When you try to start the vsan-health service, the command fails with the following error:

    Scenario 1:
    # service-control --start vmware-vsan-health
    Operation not cancellable. Please wait for it to finish...
    Performing start operation on service vsan-health...
    Error executing start on service vsan-health. Details {
        "componentKey": null,
        "problemId": null,
        "detail": [
            {
                "translatable": "An error occurred while starting service '%(0)s'",
                "localized": "An error occurred while starting service 'vsan-health'",
                "id": "install.ciscommon.service.failstart",
                "args": [
                    "vsan-health"
                ]
            }
        ],
        "resolution": null
    }
    Service-control failed. Error: {
        "componentKey": null,
        "problemId": null,
        "detail": [
            {
                "translatable": "An error occurred while starting service '%(0)s'",
                "localized": "An error occurred while starting service 'vsan-health'",
                "id": "install.ciscommon.service.failstart",
                "args": [
                    "vsan-health"
                ]
            }
        ],
        "resolution": null
    }

 

  • The log file /var/log/vmware/vsan-health/vmware-vsan-health-runtime.log.stderr contains the following output:
    # cat /var/log/vmware/vsan-health/vmware-vsan-health-runtime.log.stderr
    Starting service process with pid: 11532.
    Traceback (most recent call last):
      File "/usr/lib/vmware-vpx/vsan-health/VsanMgmtServer.py", line 377, in <module>
        SetupLogging(gCmdOptions, logdir)
      File "/usr/lib/vmware-vpx/vsan-health/VsanMgmtServer.py", line 194, in SetupLogging
        Logger.InitLogging(logDir)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Logger.py", line 232, in InitLogging
        SetupLoggers(loggerConfFile, logDir)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Logger.py", line 210, in SetupLoggers
        configInFile = UpdateFormatterHandlerPath(f)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Logger.py", line 168, in UpdateFormatterHandlerPath
        config = json.load(configFile)
      File "/usr/lib/python3.5/json/__init__.py", line 268, in load
        parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
      File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)

 

  • The /etc/vmware-vsan-health/logger.conf file may be empty (0 bytes):
    # ls -la /etc/vmware-vsan-health/logger.conf
    -rwxr--r--  1 vsan-health users        0 Jul 17 22:04 logger.conf


    Even if the file is not empty, proceed with the workaround.
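
The traceback above shows the service loading logger.conf with json.load, so one way to inspect the file is sketched below. The function name check_logger_conf and its output labels are hypothetical. A result of "valid-json" only means the file parses; it does not confirm that the file is in the new format, so the workaround still applies.

```shell
# Hypothetical helper: report the state of a logger configuration file.
# $1 = path to the file (defaults to the path named in this article)
check_logger_conf() {
    local conf="${1:-/etc/vmware-vsan-health/logger.conf}"
    if [ ! -e "$conf" ]; then
        echo "missing"
    elif [ ! -s "$conf" ]; then
        # -s is true only for files larger than 0 bytes
        echo "empty"
    elif ! python3 -m json.tool "$conf" >/dev/null 2>&1; then
        # json.tool is part of the Python standard library; non-zero exit means parse failure
        echo "invalid"
    else
        echo "valid-json"
    fi
}

# Example usage on the appliance:
#   check_logger_conf
```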

    Scenario 2:
    The following errors may appear if a certificate was recently changed on vCenter:


    root@vcenter [ /var/log/vmware/vsan-health ]# cat vmware-vsan-health-runtime.log.stderr
    Starting service process with pid: 26934.
    Traceback (most recent call last):
      File "/usr/lib/python3.5/logging/config.py", line 558, in configure
        handler = self.configure_handler(handlers[name])
      File "/usr/lib/python3.5/logging/config.py", line 731, in configure_handler
        result = factory(**kwargs)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Handlers.py", line 62, in __init__
        self.startLoggingTime = self.updateStartLoggingTime()
      File "/usr/lib/vmware-vpx/vsan-health/logger/Handlers.py", line 137, in updateStartLoggingTime
        '%Y-%m-%dT%H:%M:%S')
      File "/usr/lib/python3.5/_strptime.py", line 510, in _strptime_datetime
        tt, fraction = _strptime(data_string, format)
      File "/usr/lib/python3.5/_strptime.py", line 343, in _strptime
        (data_string, format))
    ValueError: time data '\n' does not match format '%Y-%m-%dT%H:%M:%S'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/lib/vmware-vpx/vsan-health/VsanMgmtServer.py", line 351, in <module>
        SetupLogging(gCmdOptions, logdir)
      File "/usr/lib/vmware-vpx/vsan-health/VsanMgmtServer.py", line 182, in SetupLogging
        Logger.InitLogging(logDir)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Logger.py", line 295, in InitLogging
        SetupLoggers(loggerConfFile, logDir)
      File "/usr/lib/vmware-vpx/vsan-health/logger/Logger.py", line 278, in SetupLoggers
        VsanLogDictConfigurator(config).configure()
      File "/usr/lib/python3.5/logging/config.py", line 566, in configure
        '%r: %s' % (name, e))
    ValueError: Unable to configure handler 'vsanHealthFile': time data '\n' does not match format '%Y-%m-%dT%H:%M:%S'
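
To tell the two scenarios apart quickly, the stderr log can be searched for the final error line of each traceback. The sketch below assumes those two strings are sufficient to distinguish the cases; the function name classify_vsan_health_failure and its output labels are illustrative.

```shell
# Illustrative helper: report which scenario from this article a stderr log matches.
# $1 = path to the stderr log (defaults to the path named in this article)
classify_vsan_health_failure() {
    local log="${1:-/var/log/vmware/vsan-health/vmware-vsan-health-runtime.log.stderr}"
    # -F treats the pattern as a fixed string, -q suppresses output
    if grep -qF 'json.decoder.JSONDecodeError' "$log"; then
        echo "scenario1"
    elif grep -qF "Unable to configure handler 'vsanHealthFile'" "$log"; then
        echo "scenario2"
    else
        echo "unknown"
    fi
}

# Example usage on the appliance:
#   classify_vsan_health_failure
```

The same workaround applies to both scenarios, so this check is mainly useful for confirming that the failure matches this article.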

Environment

VMware vSAN 6.7.x
VMware vSAN 7.0.x

Cause

In certain circumstances, the /etc/vmware-vsan-health/logger.conf configuration file is not updated to the new format used by the vSAN health service on vCenter after an update. As a result, log rotation does not work as expected.

Resolution


This issue is resolved in vCenter Server 6.7 U3j and later, and in 7.0 U1 and later. In these versions, an older configuration file, if present, is updated to the new format.

Workaround:
As a workaround, follow these steps:

1. Remove the oversized log files under /var/log/vmware/vsan-health.
2. Remove /etc/vmware-vsan-health/logger.conf.
3. Restart the vsan-health service: vmon-cli -r vsan-health
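
The three steps above could be scripted roughly as follows. This is a sketch, not a supported tool: the function name apply_vsan_log_workaround, the 1 GiB cutoff used to decide which files count as oversized, and the RESTART_CMD override are assumptions introduced for illustration. Only the paths and the vmon-cli command come from this article.

```shell
# Sketch of the workaround. The function name, size cutoff, and RESTART_CMD are assumptions.
# $1 = log directory, $2 = logger configuration file, $3 = size cutoff in find(1) syntax
apply_vsan_log_workaround() {
    local log_dir="${1:-/var/log/vmware/vsan-health}"
    local conf="${2:-/etc/vmware-vsan-health/logger.conf}"
    local min_size="${3:-+1G}"

    # Step 1: print and delete files larger than the cutoff.
    find "$log_dir" -type f -size "$min_size" -print -delete

    # Step 2: remove the configuration file (-f: no error if it is already gone).
    rm -f "$conf"

    # Step 3: restart the service. Set RESTART_CMD to substitute another command for a dry run.
    ${RESTART_CMD:-vmon-cli -r vsan-health}
}

# Example usage on the appliance (deletes files, so review the paths first):
#   apply_vsan_log_workaround
```

Running the find command from step 1 without -delete first shows which files would be removed.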

Additional Information

Impact/Risks:
This issue can cause vCenter services to crash and fail to restart.