"statsmonitor service" service not starting after VCSA reboot

Article ID: 318752

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • The statsmonitor service does not start after a reboot of the VCSA
  • No graphs are displayed for CPU, Memory, and Database
  • Starting statsmonitor manually works fine
  • In StatsMonitor.log, located in the directory /var/log/vmware/applmgmt, you see entries similar to:
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12D51700] [Originator@6876 sub=ThreadPool] Entering worker thread loop
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12D10700] [Originator@6876 sub=ThreadPool] Thread enlisted
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12D10700] [Originator@6876 sub=ThreadPool] Entering IO thread loop
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=ThreadPool] Thread pool fair initial threads spawned. IO: 2, Min workers: 4, Max workers: 13, Reservation ratio: 9
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=ThreadPool] Thread enlisted
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=Default] Syscommand enabled: true
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12CCF700] [Originator@6876 sub=ThreadPool] Thread enlisted
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12CCF700] [Originator@6876 sub=ThreadPool] Entering IO thread loop
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12C8E700] [Originator@6876 sub=ThreadPool] Thread enlisted
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12C8E700] [Originator@6876 sub=ThreadPool] Entering fair thread loop
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=Default] ReaperManager Initialized
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Setting up signal handlers
YYYY-MM-DDTHH:MM:SS error StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Failed to register handler for signal: 0
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Initializing
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=LinuxStatsProvider(738847377312)] Registered 162 stats (sources:114, derivatives:48)
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Found 12 file systems and 7 dirs in config for monitoring.
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=LinuxStorageStatsProvider(738847702096)] Registered 3 stats
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] SqliteStorageEngine using SQLite version: 3.17.0
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Received signal 15
YYYY-MM-DDTHH:MM:SS info StatsMonitor[7F2A12E15780] [Originator@6876 sub=StatsMonitor] Shutting down
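
To check quickly whether a log shows this pattern, the following sketch counts the lines in which StatsMonitor reports receiving signal 15 (SIGTERM). The function name count_sigterm_shutdowns is hypothetical and not part of the appliance; the log path in the usage comment is the one given above.

```shell
# Hypothetical helper: count log lines where StatsMonitor received signal 15.
# Usage: count_sigterm_shutdowns /var/log/vmware/applmgmt/StatsMonitor.log
count_sigterm_shutdowns() {
    # -F matches the text literally; -c prints the number of matching lines.
    # grep exits non-zero when the count is 0, so "|| true" keeps the helper
    # usable in scripts that run under "set -e".
    grep -c -F 'sub=StatsMonitor] Received signal 15' "$1" || true
}
```

A non-zero count only indicates that the service was asked to terminate; it should be read together with the other symptoms listed above.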

 

Environment

VMware vCenter Server Appliance 6.5.x
VMware vCenter Server Appliance 6.7.x

Cause

This issue occurs when the StatsMonitor service exceeds its startup timeout. The database health check that runs at startup must complete within a given time, and occasionally I/O is too slow for the service to start within that limit.

Resolution

To resolve the issue, increase the start timeout for the applmgmt service so that it starts only after the statsmonitor service, allowing for its delay, has started successfully.

To do this, follow the steps below:

1) Go to the following directory:

   cd /etc/vmware/vmware-vmon/svcCfgfiles/
   
2) Back up the applmgmt and statsmonitor JSON files:

   mkdir -p /root/backup
   cp /etc/vmware/vmware-vmon/svcCfgfiles/applmgmt.json /root/backup/
   cp /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json /root/backup/
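
Before editing, it may be worth confirming that the backups are byte-for-byte copies of the originals. A minimal sketch, assuming cmp is available on the appliance; the function name verify_backup is hypothetical:

```shell
# Hypothetical helper: confirm each backup matches its source file.
# Usage: verify_backup /etc/vmware/vmware-vmon/svcCfgfiles /root/backup
verify_backup() {
    src_dir=$1
    dst_dir=$2
    for f in applmgmt.json statsmonitor.json; do
        # cmp -s is silent and returns 0 only when both files are identical.
        if cmp -s "$src_dir/$f" "$dst_dir/$f"; then
            echo "OK: $f"
        else
            echo "MISMATCH: $f"
            return 1
        fi
    done
}
```

The function stops at the first mismatch and returns a non-zero status, so it can gate the following steps in a script.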
   
3) Change the permissions on the existing JSON files so that they can be edited:

   chmod 700 applmgmt.json
   chmod 700 statsmonitor.json

4) Manually increase the StartTimeout value in the applmgmt.json file to 600 or 1200 seconds (using the vi editor). The example below uses 600 seconds.
   The default timeout in this file is 60 seconds.

   /etc/vmware/vmware-vmon/svcCfgfiles/applmgmt.json:
   

   {
    "Name" : "applmgmt",
    "PreStartCommand" : "/usr/lib/applmgmt/support/scripts/prestart-applmgmt.sh",
    "StartCommand" : "/usr/lib/applmgmt/applmgmt.launcher",
    "ApiHealthCommand": "/usr/bin/python",
    "ApiHealthCommandArgs": [
        "/usr/lib/applmgmt/applmgmt_vmonhealth.py"
    ],
    "DependsOn" : ["statsmonitor"],
    "StartTimeout" : 600,
    "StopTimeout" : 20,
    "StartupType" : "AUTOMATIC",
    "DumpLiveCoreOnApiHealthFail" : false,
    "StreamRedirectFile": "%VMWARE_LOG_DIR%/vmware/applmgmt/applmgmt_vmonsvc",
    "RecoveryActionProfiles" :
      {
        "DEFAULT" :
        {
            "CRASH" : ["RESTART_SERVICE", "RESTART_SERVICE", "NO_ACTION"],
            "HEALTHFAIL" : ["RESTART_SERVICE", "RESTART_SERVICE", "NO_ACTION"]
        },
        "FAILOVER" :
        {
            "CRASH" : ["RESTART_SERVICE", "RESTART_SERVICE", "NO_ACTION"],
            "HEALTHFAIL" : ["RESTART_SERVICE", "RESTART_SERVICE", "NO_ACTION"]
        }
      }
   }
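
As an alternative to editing the file in vi, the existing value could be changed with sed. This is only a sketch: the function name set_start_timeout is hypothetical, and it assumes GNU sed (for the -i option) and a file that already contains a numeric "StartTimeout" entry, as in the example above.

```shell
# Hypothetical helper: replace the numeric StartTimeout value in a vMon JSON file.
# Usage: set_start_timeout /etc/vmware/vmware-vmon/svcCfgfiles/applmgmt.json 600
set_start_timeout() {
    file=$1
    seconds=$2
    # Keep the key and separator (group 1) and swap only the number.
    sed -i 's/\("StartTimeout"[[:space:]]*:[[:space:]]*\)[0-9][0-9]*/\1'"$seconds"'/' "$file"
    # Print the resulting line so the change can be checked by eye.
    grep '"StartTimeout"' "$file"
}
```

Only the number is rewritten, so the surrounding whitespace and the trailing comma stay as they were and the file remains valid JSON.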


 
5) Modify the statsmonitor service configuration for vMon to set a higher startup timeout:

   sed -i '/StartTimeout/d' /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json
   sed -i '/ApiHealthFile/a "StartTimeout": 600,' /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json

   This adds "StartTimeout": 600 to statsmonitor.json.
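
Because the first sed command deletes any existing StartTimeout line and the second inserts a new one only after a line containing ApiHealthFile, it is worth confirming that the file ends up with exactly one StartTimeout entry. A minimal sketch; has_single_start_timeout is a hypothetical name:

```shell
# Hypothetical helper: succeed only if the file has exactly one StartTimeout line.
# Usage: has_single_start_timeout /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json
has_single_start_timeout() {
    # grep -c counts matching lines; "|| true" tolerates a count of zero.
    count=$(grep -c 'StartTimeout' "$1" || true)
    [ "$count" -eq 1 ]
}
```

If the check fails, the backup from step 2 can be compared with the edited file to see what changed.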
   
6) Stop and start the statsmonitor service explicitly:

   /usr/lib/vmware-vmon/vmon-cli -k statsmonitor
   /usr/lib/vmware-vmon/vmon-cli -i statsmonitor
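
After the restart, the service state can be checked before rebooting. The sketch below filters status text for a started state. It assumes that the status output of vmon-cli (for example from vmon-cli -s statsmonitor) contains a line of the form "RunState: STARTED"; that format is an assumption here, and reports_started is a hypothetical name.

```shell
# Hypothetical helper: read service status text on stdin and succeed only if
# it reports a started state. The "RunState: STARTED" text is an assumption
# about the status output format, not something stated in this article.
reports_started() {
    grep -q 'RunState: STARTED'
}
```

Under those assumptions it could be used as: /usr/lib/vmware-vmon/vmon-cli -s statsmonitor | reports_started && echo "statsmonitor is started"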
   
7) Reboot the vCenter VM.
 
Note: After the reboot, all services should be up and running within 5 to 10 minutes.
After this, logging in to the VAMI page with root credentials should succeed.
Then restore the original permissions on the JSON files:

chmod 444 /etc/vmware/vmware-vmon/svcCfgfiles/applmgmt.json
chmod 444 /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json
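
To confirm that the permissions were restored, the mode of each file can be printed. A minimal sketch, assuming GNU coreutils stat (the -c option) is available on the appliance; file_mode is a hypothetical name:

```shell
# Hypothetical helper: print the octal permission bits of a file.
# Assumes GNU coreutils stat, which provides the -c format option.
# Usage: file_mode /etc/vmware/vmware-vmon/svcCfgfiles/applmgmt.json
file_mode() {
    stat -c '%a' "$1"
}
```

Both JSON files should report 444 after the chmod commands above.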