[vCenter] vmware-sps service does not start
search cancel

[vCenter] vmware-sps service does not start

book

Article ID: 323292

calendar_today

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Assist with starting SPS service


Symptoms:
  • Customer can't create VM's on vSAN Datastores
  • vmware-sps service is stopped and does not start manually

 

  • In the /var/log/vmware/vmware-sps/sps.log:

[main] WARN opId=sps-Main-xxxxxx-xx com.vmware.vim.storage.common.serviceclient.vpxd.impl.VpxdClientImpl - loginByToken request timedout, cancelling the task scheduled XX.XX.XX.XX [main] ERROR opId=sps-Main-xxxxxx-xx com.vmware.vim.storage.common.task.retry.CallableRetryDecorator - Caught exception - com.vmware.vim.storage.common.serviceclient.vpxd.VpxdException: Error while doing login to VPXD service

  • Command "service-control --start vmware-sps" fails after some time.
  • sps.log shows:

Caused by: java.util.concurrent.TimeoutException  at com.vmware.vim.vmomi.core.impl.BlockingFuture.get(BlockingFuture.java:102) ~[vlsi-core.jar:?]   at com.vmware.vim.storage.common.serviceclient.vpxd.impl.VpxdClientImpl.loginByToken(VpxdClientImpl.java:169) ~[storage-commons-1.0.jar:?]  ... 208 more

[main] INFO opId=sps-Main-xxxxxx-xxx com.vmware.vim.storage.common.util.OperationIdUtil - OperationID present in invoker thread, adding suffix and re-using it - sps-Main-xxxxxx-xxx-xxxxxx-xxx.

 [main] INFO opId=sps-Main-xxxxxx-xxx com.vmware.vim.storage.common.util.OperationIdUtil - OperationID present in invoker thread, adding suffix and re-using it - sps-Main-xxxxxx-xxx-xxxxxx-xxx.

 [main] INFO opId=sps-Main-xxxxxx-xxx com.vmware.vim.storage.common.util.OperationIdUtil - OperationID present in invoker thread, adding suffix and re-using it - sps-Main-xxxxxx-xxx-xxxxxx-xxx.

[pool-3-thread-11] INFO opId=sps-Main-xxxxxx-xxx com.vmware.vim.storage.common.task.CustomThreadPoolExecutor - [VLSI-client] Active thread count is: 11, Core Pool size is: 20, Queue size: 0, Time spent waiting in queue: 1 millis

[jaeger.RemoteReporter-QueueProcessor] WARN opId=sps-Main-xxxxxx-xxx io.jaegertracing.internal.reporters

.RemoteReporter - FlushCommand execution failed! Repeated errors of this command will not be logged.

io.jaegertracing.internal.exceptions.SenderException: Failed to flush spans.

       at io.jaegertracing.thrift.internal.senders.ThriftSender.flush(ThriftSender.java:116) ~[jaeger-thrift-1.8.0.jar:1.8.0]   at io.jaegertracing.internal.reporters.RemoteReporter$FlushCommand.execute(RemoteReporter.java:158) ~[jaeger-core-1.8.0.jar:1.8.0]  at io.jaegertracing.internal.reporters.RemoteReporter$QueueProcessor.run(RemoteReporter.java:179) [jaeger-core-1.8.0.jar:1.8.0] at java.lang.Thread.run(Thread.java:750) [?:1.8.0_362]

Caused by: io.jaegertracing.internal.exceptions.SenderException: Could not send 2 spans  at io.jaegertracing.thrift.internal.senders.UdpSender.send(UdpSender.java:86) ~[jaeger-thrift-1.8.0.jar:1.8.0]  at io.jaegertracing.thrift.internal.senders.ThriftSender.flush(ThriftSender.java:114) ~[jaeger-thrift-1.8.0.jar:1.8.0]  ... 3 more
Caused by
...org.apache.thrift.transport.TTransportException: Cannot flush closed transport at io.jaegertracing.thrift.internal.reporters.protocols.ThriftUdpTransport.flush(ThriftUdpTransport.java:151) ~[jaeger-thrift-1.8.0.jar:1.8.0]  at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[libthrift-0.14.1.jar:0.14.1] at org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) ~[libthrift-0.14.1.jar:0.14.1]  at io.jaegertracing.agent.thrift.Agent$Client.send_emitBatch(Agent.java:70) ~[jaeger-thrift-1.8.0.jar:1.8.0]   at io.jaegertracing.thrift.internal.senders.ThriftSender.flush(ThriftSender.java:114) ~[jaeger-thrift-.8.0.jar:1.8.0] ... 3 more
 

  • In /var/log/vmware/vpxd.log you see:
    • grep -i "sps-Main-240307-157-240309-910" vpxd-126.log 
      info vpxd[05694] [Originator@6876 sub=vpxLro opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [VpxLRO] -- BEGIN lro-5678 -- ServiceInstance -- vim.ServiceInstance.retrieveContent -- ########-####-####-####-########8814 info vpxd[05694] [Originator@6876 sub=vpxLro opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [VpxLRO] -- FINISH lro-5678 info vpxd[05686] [Originator@6876 sub=vpxLro opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [VpxLRO] -- BEGIN lro-5679 -- ServiceInstance -- vim.ServiceInstance.retrieveInternalContent -- xxxxxx-xxx-xxxxxx-xxx-xx info vpxd[05686] [Originator@6876 sub=vpxLro opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [VpxLRO] -- FINISH lro-5679 info vpxd[05623] [Originator@6876 sub=vpxLro opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [VpxLRO] -- BEGIN lro-5680 -- SessionManager -- vim.SessionManager.loginByToken -- xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx info vpxd[05623] [Originator@6876 sub=UserDirectorySso opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] GetUserInfoInternal(VSPHERE.LOCAL\sps-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, false) res: VSPHERE.LOCAL\sps-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx info vpxd[05623] [Originator@6876 sub=AuthorizeManager opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] [Auth]: User VSPHERE.LOCAL\sps-xxxxxx-xxx-xxxxxx-xxx-xxxxxxxxxxxx warning vpxd[05623] [Originator@6876 sub=AuthorizeManager opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] Refresh function is not configured.User data can't be added to scheduler.User name: VSPHERE.LOCAL\sps-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx warning vpxd[05623] [Originator@6876 sub=Vmomi opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] VMOMI activation LRO failed; <<xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, <TCP 'xxx.x.x.x : 8085'>, <TCP 'xxx.x.x.x : 47688'>>, SessionManager, vim.SessionManager.loginByToken, <vim.version.v8_0_1_0, internal, 8.0.1.0>, {stm: {<io_obj p:0x00007f3b8400c1b0, h:91, <TCP '127.0.0.1 : 8085'>, <TCP 'xxx.x.x.x : 47688'>>, id: 105, state(in/out): 3/1}, session: <xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, <TCP 'xxx.x.x.x : 8085'>, <TCP 'xxx.x.x.x : 47688'>>, req: {POST, /sdk}}>, N3Vim5Fault12NoPermission9ExceptionE(Fault cause: vim.fault.NoPermission info vpxd[05623] [Originator@6876 sub=vpxLro opID=sps-Main-240307-157-240309-910-9f] [VpxLRO] -- FINISH lro-5680 error vpxd[05623] [Originator@6876 sub=Default opID=sps-Main-240307-157-240309-910-9f] [VpxLRO] -- ERROR lro-5680 -- xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -- SessionManager -- vim.SessionManager.loginByToken: :vim.fault.NoPermission ---- warning vpxd[05623] [Originator@6876 sub=Vmomi opID=sps-Main-xxxxxx-xxx-xxxxxx-xxx-xx] VMOMI activation L RO failed; <<xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, <TCP 'xxx.x.x.x : 8085'>, <TCP 'xxx.x.x.x : 47688'>>, SessionManager, vim.Sessi onManager.loginByToken, <vim.version.v8_0_1_0, internal, 8.0.1.0>, {stm: {<io_obj p:0x00007f3b8400c1b0, h:91, <TCP 'xxx.x.x.x : 80 85'>, <TCP 'xxx.x.x.x : 47688'>>, id: 105, state(in/out): 3/1}, session: <xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, <TCP 'xxx.x.x.x : 8085'>, <TCP 'xxx.x.x.x : 47688'>>, req: {POST, /sdk}}>, N3Vim5Fault12NoPermission9ExceptionE(Fault cause: vim.fault.NoPermission --> ) --> [context]zKq7AVECAQAAAKD8SAEYdnB4ZAAA0cdUbGlidm1hY29yZS5zbwAAYQpGAFEGRwB1zUyBhWkPAXZweGQAgcyHGgGBW464AYERp7gBgeuyuAGBTxu4AYGE7 rcBgps0RQFsaWJ2aW0tdHlwZXMuc28AgQjVSwIDrRMdbGlidm1vbWkuc28AgZSZLQKBGqVKAoFitUoCgdvdSQKBn4tKAgA5mjsAEuo7AOK2UgSHfwBsaWJwdGhyZWFkLnN vLjAABS82D2xpYmMuc28uNgA=[/context] info vpxd[05623] [Originator@6876 sub=vpxLro opID=sps-Main-240307-157-240309-910-9f] [VpxLRO] -- FINISH l ro-5680 error vpxd[05623] [Originator@6876 sub=Default opID=sps-Main-240307-157-240309-910-9f] [VpxLRO] -- ERROR lro-5680 -- xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -- SessionManager -- vim.SessionManager.loginByToken: :vim.fault.NoPermission --> Result: --> (vim.fault.NoPermission) { --> faultCause = (vmodl.MethodFault) null, --> faultMessage = <unset>, --> object = 'vim.Folder:########-####-####-####-########4e95:group-d1', --> privilegeId = "System.View", --> missingPrivileges = (vim.fault.NoPermission.EntityPrivileges) [ --> (vim.fault.NoPermission.EntityPrivileges) { --> entity = 'vim.Folder:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:group-d1', --> privilegeIds = (string) [ --> "System.View" --> ] --> } --> ] --> msg = "" --> } --> Args: --> --> Arg locale: --> "en_US"

 

Environment

VMware vCenter Server 8.0
VMware vCenter Server 8.0.1
VMware vCenter Server 8.0.2

Cause

Missing permission for the vpxd-svcs solution user in VCSA.
From above  vpxd.log we see It is looking for VSPHERE.LOCAL\sps-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Resolution

  1. Locate the User that has a missing permission in the vpxd.log
example :VSPHERE.LOCAL\sps-<########-####-####-####-########2065>
  1. Go to the vCenter web UI client and login as [email protected] 
  2. Click on the Root object of the VCSA   
  3. Go to permissions tab
  4. Click Add.
  5. search for the string of the user that the VC is looking for example sps->xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx>
  6. Check  the  "Propagate to children" option and select Administrator role.
  7. Save the change.
  8. Restart the sps service with command vmon-cli -r sps
Note: If you get and error from the GUI please follow the workaround of this KB

Workaround:
The workaround understood is to re-add the Service account to the Administrators group. You should be able to find out the SPS service account as highlighted above in vpxd.log. (you may filter by the keyword GetUserInfoInternal) Then Run the below command as mentioned:
 
  1. Take offline snapshots of all the vCenters in ELM mode
  2. Take SSH for VC in question with root privileges and run below command : /usr/lib/vmware-vmafd/bin/dir-cli group modify --name Administrators --add sps-xx-xx-xx-xx-xx
sample: /usr/lib/vmware-vmafd/bin/dir-cli group modify --name Administrators --add  sps-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  1. Restart the sps service with command vmon-cli -r sps


Additional Information

SPS service start failing with VpxdException: Error while doing login to VPXD service

[vCenter] vmware-sps service does not start

Impact/Risks:

High since the customer's environment will be impacted since sps controls the vMotion and the creation of VM's.