Some Services on the SDDC Manager Controller Virtual Machine do not start automatically after a reboot

Article ID: 316893

Products

VMware Cloud Foundation

Issue/Introduction

Symptoms:
  • VMware Cloud Foundation was recently updated to 2.3.1 or 2.3.2.
  • The SDDC Manager Controller virtual machine was rebooted after being upgraded.
  • The SDDC Manager web interface might not be reachable.
  • DNS resolution for hosts and virtual machines in the VMware Cloud Foundation environment might fail.
  • The Lifecycle Management section of the SDDC Manager web interface might be blank or display an error.
  • Running the command systemctl status scs.service in an SSH or console session on the SDDC Manager Controller VM shows that the System Controller Service is not started or enabled:

    * scs.service - VCF System Controller Service
       Loaded: loaded (/etc/systemd/system/scs.service; disabled; vendor preset: enabled)
       Active: inactive (dead)


Environment

VMware Cloud Foundation 2.3.x

Cause

The System Controller Service monitors other critical services and restarts them if they fail. If the service is not enabled, it does not start automatically the next time the SDDC Manager Controller VM boots. Without the System Controller Service, the DNS service is not started, and the Lifecycle Management service may not be restarted if it fails to start during the boot process.

Resolution

This is a known issue affecting VMware Cloud Foundation 2.3.x. There is currently no resolution.

Workaround:
Complete the following steps to work around the issue:
  1. Log in to the SDDC Manager Controller virtual machine as root using SSH or a console session.
  2. Run the following command:

    systemctl enable scs.service

     
  3. If the System Controller Service is not running, reboot the SDDC Manager Controller VM.
  4. (Optional) Confirm that the System Controller service is enabled and running:

    systemctl status scs.service
     
    Note: You will see output similar to the following:
     
    * scs.service - VCF System Controller Service
       Loaded: loaded (/etc/systemd/system/scs.service; enabled; vendor preset: enabled)
       Active: active (running) since Thu 2018-05-10 21:51:13 UTC; 20h ago
     Main PID: 630 (java)
       CGroup: /system.slice/scs.service
               |-  630 /usr/java/jre-vmware/bin/java -Ddaemon.pidfile=/opt/vmware/scs/scsd.pid -cp /opt/vmware/scs/jars/scsd-2.6.1-RELEASE-jar-with-dependencies...
               |-13649 /usr/bin/python /opt/vmware/scs/scripts/SCSHelper -n cassandra -o diag -l /opt/vmware/scs/logs/scsdiag
               |-13650 /bin/bash /opt/vmware/scs/scripts/cassandra-diag.sh
               |-13655 /bin/sh /opt/vmware/cassandra/apache-cassandra-2.2.4/bin/nodetool status
               `-13705 /usr/java/jre-vmware/bin/java -javaagent:/opt/vmware/cassandra/apache-cassandra-2.2.4/bin/../lib/jamm-0.3.0.jar -cp /opt/vmware/cassandra...

    May 11 17:51:07 sddc-manager-controller java[630]: success=True
    May 11 17:51:07 sddc-manager-controller java[630]: op=status
    May 11 17:51:07 sddc-manager-controller sudo[13629]:     root : TTY=unknown ; PWD=/opt/vmware/scs/logs ; USER=root ; COMMAND=/opt/vmware/scs/scripts...s -n pug
    May 11 17:51:07 sddc-manager-controller sudo[13629]: pam_unix(sudo:session): session opened for user root by (uid=0)
    May 11 17:51:07 sddc-manager-controller java[630]: status=0
    May 11 17:51:07 sddc-manager-controller java[630]: running=True
    May 11 17:51:07 sddc-manager-controller java[630]: returncode=0
    May 11 17:51:07 sddc-manager-controller java[630]: name=lcm
    May 11 17:51:07 sddc-manager-controller java[630]: success=True
    May 11 17:51:07 sddc-manager-controller java[630]: op=status
    Hint: Some lines were ellipsized, use -l to show in full.

     
  5. (Optional) Confirm that the Lifecycle Management service is enabled and running:

    systemctl status lcm.service
     
    Note: You will see output similar to the following:
     
    * lcm.service - LCM app
       Loaded: loaded (/etc/systemd/system/lcm.service; enabled; vendor preset: enabled)
       Active: active (running) since Thu 2018-05-10 21:51:45 UTC; 19h ago
     Main PID: 2415 (java)
       CGroup: /system.slice/lcm.service
               `-2415 /usr/java/jre-vmware/bin/java -Xmx3072m -XX:MaxPermSize=512m -Dspring.profiles.active=evo -Djava.io.tmpdir=/home/vrack/lcm/tmp -classpath ...

    May 10 21:51:39 sddc-manager-controller systemd[1]: Starting LCM app...
    May 10 21:51:39 sddc-manager-controller systemd[1]: lcm.service: PID file /home/vrack/lcm/logs/lcm.pid not readable (yet?) after start: No such file...irectory
    May 10 21:51:45 sddc-manager-controller systemd[1]: lcm.service: Supervising process 2415 which is not our child. We'll most likely not notice when it exits.
    May 10 21:51:45 sddc-manager-controller systemd[1]: Started LCM app.
    Hint: Some lines were ellipsized, use -l to show in full.

     
  6. (Optional) Confirm that the DNS service is enabled and running:

    systemctl status unbound.service
     
    Note: You will see output similar to the following:
     
    * unbound.service - Unbound recursive Domain Name Server
       Loaded: loaded (/etc/systemd/system/unbound.service; enabled; vendor preset: enabled)
       Active: active (running) since Thu 2018-05-10 21:51:13 UTC; 20h ago
     Main PID: 732 (unbound)
       CGroup: /system.slice/unbound.service
               `-732 /usr/sbin/unbound -d

    May 11 17:51:52 sddc-manager-controller unbound[732]: [732:0] info: receive_udp on interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:52 sddc-manager-controller unbound[732]: [732:0] info: send_udp over interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:52 sddc-manager-controller unbound[732]: [732:0] info: receive_udp on interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:52 sddc-manager-controller unbound[732]: [732:0] info: send_udp over interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:53 sddc-manager-controller unbound[732]: [732:0] info: receive_udp on interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:53 sddc-manager-controller unbound[732]: [732:0] info: send_udp over interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:55 sddc-manager-controller unbound[732]: [732:0] info: receive_udp on interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:55 sddc-manager-controller unbound[732]: [732:0] info: send_udp over interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:57 sddc-manager-controller unbound[732]: [732:0] info: receive_udp on interface: 2 172.30.0.14 172.30.0.14
    May 11 17:51:57 sddc-manager-controller unbound[732]: [732:0] info: send_udp over interface: 2 172.30.0.14 172.30.0.14
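
The three optional checks above can be combined into one pass. The following is a minimal sketch, not part of the original workaround: the function name check_units and the SYSTEMCTL override variable are illustrative assumptions. It relies only on the standard systemctl is-enabled and systemctl is-active subcommands.

```shell
#!/bin/sh
# Sketch: report whether each unit is enabled at boot and currently active.
# SYSTEMCTL is a hypothetical override hook (handy for dry runs); it defaults
# to the real systemctl binary.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# check_units UNIT...
# Prints one summary line per unit and returns 1 if any unit is not both
# enabled and active, 0 otherwise.
check_units() {
    rc=0
    for unit in "$@"; do
        # Both subcommands exit non-zero for disabled or inactive units,
        # so their exit status is deliberately ignored here.
        enabled=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null) || true
        active=$("$SYSTEMCTL" is-active "$unit" 2>/dev/null) || true
        echo "$unit: enabled=${enabled:-unknown} active=${active:-unknown}"
        if [ "$enabled" != "enabled" ] || [ "$active" != "active" ]; then
            rc=1
        fi
    done
    return "$rc"
}
```

On the SDDC Manager Controller VM, it could be called as check_units scs.service lcm.service unbound.service. A non-zero return value suggests that at least one of the services still needs attention.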

     


Additional Information

To determine whether your environment will encounter this issue, review the output of the command systemctl status scs.service. If the service is listed as disabled in the Loaded line, the service will not be started automatically on the next boot. The following example shows a System Controller Service that is currently running but disabled, and that will therefore not start on the next boot of the SDDC Manager Controller VM:

* scs.service - VCF System Controller Service
   Loaded: loaded (/etc/systemd/system/scs.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-05-09 19:24:44 UTC; 1 day 1h ago
 Main PID: 6407 (java)
    Tasks: 27
   CGroup: /system.slice/scs.service
           `-6407 /usr/java/jre-vmware/bin/java -Ddaemon.pidfile=/opt/vmware/scs/scsd.pid -cp /opt/vmware/scs/jars/scsd-2.6...

May 10 21:22:47 sddc-manager-controller java[6407]: success=True
May 10 21:22:47 sddc-manager-controller java[6407]: op=status
May 10 21:22:47 sddc-manager-controller sudo[5893]:     root : TTY=unknown ; PWD=/opt/vmware/scs/logs ; USER=root ; CO...n pug
May 10 21:22:47 sddc-manager-controller sudo[5893]: pam_unix(sudo:session): session opened for user root by (uid=0)
May 10 21:22:47 sddc-manager-controller java[6407]: status=0
May 10 21:22:47 sddc-manager-controller java[6407]: running=True
May 10 21:22:47 sddc-manager-controller java[6407]: returncode=0
May 10 21:22:47 sddc-manager-controller java[6407]: name=lcm
May 10 21:22:47 sddc-manager-controller java[6407]: success=True
May 10 21:22:47 sddc-manager-controller java[6407]: op=status
Hint: Some lines were ellipsized, use -l to show in full.
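
The check described above can also be scripted. The following is a minimal sketch, assuming status output in the format shown in this article: the helper name loaded_state is hypothetical. It extracts the enablement state from the Loaded line of systemctl status output read from standard input.

```shell
#!/bin/sh
# Sketch: print the unit-file state (for example "enabled" or "disabled")
# found on the "Loaded:" line of `systemctl status` output on standard input.
# Prints nothing if no matching "Loaded:" line is present.
loaded_state() {
    # The state is the second semicolon-separated field on the Loaded line,
    # e.g. "Loaded: loaded (/path/unit.service; disabled; vendor preset: ...)".
    sed -n 's/^ *Loaded: [^;]*; *\([^;)]*\)[;)].*$/\1/p' | head -n 1
}

# Example (run on the SDDC Manager Controller VM):
#   systemctl status scs.service | loaded_state
```

If parsing the status text is not required, systemctl is-enabled scs.service reports the same state directly and exits with a non-zero status when the service is disabled.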